Antibodies are critical for biological and medical research. Yet since early 2015, the antibody industry has been described as being in the midst of a reproducibility crisis. One of the main issues is the improper characterization of primary antibodies. The health research industry is saturated with discrepant validation data on the sensitivity, specificity and reactivity of different antibodies raised against the same immunizing agents (antigens). Aside from testing with a small number of assays, such as DNA sequencing, many vendors and manufacturers lack a clear-cut quality assurance process for validating their antibody-based products, such as commercial assay antibodies and therapeutics.
These inconsistent validation data can cost researchers time and money and, in some extreme cases, their projects. In a 2015 Nature article, Andrew Bradbury and Andreas Plückthun estimated that in the United States alone, $800 million is spent on antibodies, and that half of this sum goes to improperly validated antibodies, which lead to lengthy troubleshooting and delays in important research. One of the most prominent examples to date is that of a research group that hoped to develop a diagnostic assay for a novel ovarian cancer prognostic marker, CUZD1.
They ordered a kit containing an antibody against their marker to validate their assay, then spent two years of research work, thousands of patient samples and $500,000 on additional experiments, only to realize that the kit’s anti-CUZD1 antibody was actually anti-CA125, an antibody that targets another ovarian cancer prognostic marker and was already a product on the diagnostic market. This situation could have been prevented if the researchers had safeguarded their work by testing the antibody against the right antigen before conducting experiments, by asking vendors how they routinely verify the identity of their antibody products and/or by using recombinant antibodies.
Standard commercial antibodies are typically purified from animals after an immune response to an antigen. The genomic DNA of the antibody-producing cells of these animals undergoes many rearrangements and other combinatorial events that result in many different antibody clones (a polyclonal pool). The antibodies in this pool are then separated and tested using traditional molecular biology methods to produce monoclonal antibodies (mAbs) destined for commercial use.
In contrast, recombinant antibodies are antibodies whose known, unique sequence has been incorporated into a piece of DNA that can be used by bacterial or mammalian cells to continuously output the same antibody in a reproducible manner. Recombinant antibody technology exploits nature’s proofreading abilities and biological processes to generate reliable antibodies.
Conventional commercial antibody production is, by its nature, prone to batch-to-batch variability. This variability often stems from changes in or the death of antibody-producing animals, the loss of hybridoma (antibody-producing) cell lines and/or artifacts in antibody function that arise from long-term propagation of those cell lines. The accidental loss of a hybridoma cell line is especially detrimental because no genetic material may remain from which to recover the coding sequence for continued antibody use.
However, if a small sample of the antibody remains, protein sequencing based on mass spectrometry (MS) can “resuscitate” the antibody by providing its primary amino acid sequence, which can then be used to derive a DNA sequence and regenerate the antibody for storage and future use.
MS distinguishes proteins or protein fragments (peptides) based on their ratio of mass (m) to charge (z). Proteins are digested with enzymes known as proteases, and the resulting digests are run through a mass spectrometer. Each spectral peak or signal observed in the MS instrument corresponds to the m/z ratio of a given peptide. A research group at the Mayo Clinic tracks unique m/z values to accurately and rapidly identify the type of antibody (isotyping) used in clinical laboratories. The group frequently and successfully employs its mass spectrometers as tools for inspecting mAbs prior to assay development.
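To make the m/z idea concrete, the short Python sketch below computes the monoisotopic mass of a peptide and the m/z values it would show at a few charge states. It is only an illustration: the example peptide "PEPTIDE" and the function names are hypothetical, the residue masses are standard monoisotopic values, and the calculation assumes an unmodified peptide ionized by added protons.

```python
# Standard monoisotopic residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # one water is added for the free N- and C-termini
PROTON = 1.007276   # mass of each charge-carrying proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge):
    """m/z of the peptide after gaining `charge` protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "PEPTIDE"  # hypothetical example peptide
    print(f"neutral mass: {peptide_mass(peptide):.4f} Da")
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mz(peptide, z):.4f}")
```

The same peptide appears at a different m/z for each charge state, which is why a spectrum is read as m/z values rather than as masses.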
MS researchers also use tandem MS, which involves multiple stages of mass analysis within one instrument and yields diverse m/z spectra that can be analyzed to deduce the sequence of proteins. Traditional tandem MS methods commonly make use of only one protease and do not produce many overlapping peptides. Our group recently published a study showing that digesting a protein with multiple proteases prior to tandem MS is beneficial for protein sequencing; using multiple proteases in MS experiments allowed us to piece together 100 percent of the sequence of a protein with greater statistical certainty than using only one protease.
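The benefit of overlapping peptides can be sketched with a simulated digestion in Python. This is a toy illustration, not our group's sequencing pipeline: the sequence is a hypothetical antibody-like fragment, the function names are made up for this example, and the cleavage rules are simplified textbook versions (trypsin cuts after K or R, chymotrypsin after F, Y or W, and neither cuts before P). Peptide positions are known here only because the digest is simulated; in a real experiment the overlaps are inferred from the spectra.

```python
def digest(sequence, cleave_after, blocked_by="P"):
    """Simulated digest: cut after residues in `cleave_after`
    unless the following residue is in `blocked_by`.
    Returns a list of (start_index, peptide) tuples."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in cleave_after and nxt and nxt not in blocked_by:
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    peptides.append((start, sequence[start:]))
    return peptides


def bridged_cuts(cut_peptides, bridging_peptides):
    """Cut sites from one digest that lie strictly inside
    a peptide produced by the other digest."""
    cuts = [start for start, _ in cut_peptides if start > 0]
    return [c for c in cuts
            if any(s < c < s + len(p) for s, p in bridging_peptides)]


# Hypothetical antibody-like fragment, used only for illustration.
SEQ = "EVQLVESGGGLVKPGGSLRLSCAASGFTFSSYAMSWVRQAPGK"

trypsin = digest(SEQ, "KR")        # simplified trypsin rule
chymotrypsin = digest(SEQ, "FYW")  # simplified chymotrypsin rule

if __name__ == "__main__":
    print("trypsin:     ", [p for _, p in trypsin])
    print("chymotrypsin:", [p for _, p in chymotrypsin])
    print("tryptic cuts bridged by chymotryptic peptides:",
          bridged_cuts(trypsin, chymotrypsin))
```

With one protease, nothing in the data says in which order the peptides were joined. A second protease cuts at different sites, so its peptides span the first protease's cut sites and tie the pieces together.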
Knowing the full sequence of a protein with certainty is important for researchers because they can use the protein sequence to produce a recombinant antibody that will ensure their assays are reproducible.
In addition to protein sequencing, MS methodologies can also provide data on protein modifications such as glycosylation (the addition of sugar groups to a protein) that are unique to the target antibody and important for antibody structure, stability and function. MS can be used to pinpoint the exact location of sugar groups that tend to aggregate in the antigen-binding region of antibodies. The exact location of sugar groups can be particularly difficult to assess by means other than MS.
Though the protein code is present in the genome, there are no known DNA sequences that can predict the specific sugar group that will attach to a protein, and in some instances it is difficult to know the protein region to which the sugar group will be added. Thus, commonly used assays such as DNA sequencing might not be able to fully characterize an antibody; the scientific and medical community would also benefit from MS data on protein modifications, such as the sugar groups important for antibody activity.
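As a small illustration of what sequence alone can and cannot say, the Python sketch below scans a protein sequence for the widely used N-linked glycosylation motif N-X-S/T, where X is any residue except proline. The function name and the short example peptide are assumptions made for this sketch. A match only marks a candidate site; it does not show whether a sugar group is actually attached there or which sugar group it is, and that is the information MS measurements add.

```python
import re

# N-X-S/T motif, X != P. The lookahead lets overlapping motifs be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of N, motif) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


if __name__ == "__main__":
    fragment = "EEQYNSTYR"  # short illustrative peptide
    print(find_sequons(fragment))
```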
Although MS studies have made full use of DNA sequencing information to generate databases for identifying antibodies, our group has also been able to show that protein sequencing is just as accurate as DNA sequencing and can fill in gaps when DNA sequencing data are unavailable. Furthermore, in combination with other biophysical techniques, MS can provide additional information to characterize antibody structure, binding strength and other properties.
Together, these features make MS-based protein sequencing a staple for the future standard validation of commercial antibodies and therapeutics. Protein sequencing remains an essential tool for combating inherent antibody woes for the betterment of medical research.
The authors would like to acknowledge editing and discussions with Mingjie Xie, Clayton Moore, and Drs. Bin Ma, Thierry Le Bihan, Zac McDonald, and Mohamed Shaad Hasim.