An analysis of all US Food and Drug Administration (FDA) approvals for protein-based assays through 2008 reveals 109 unique protein targets in plasma or serum, as well as 62 additional tests for peptides, protein posttranslational modifications, protein complexes, autoantibodies against endogenous proteins, and blood cell proteins. A further 96 unique protein targets are assayed in plasma by laboratory-developed tests available for clinical use in the US, yielding a total of 205 proteins that include products of approximately 211 genes (excluding immunoglobulins). These tests provide quantitative measurements for approximately 1% of the human protein gene products, defining a practical clinical plasma proteome. The rate of introduction of new protein analytes has remained essentially flat over the past 15 years, averaging 1.5 new proteins per year (median of 1 per year). This rate falls far short of that needed to support projected medical needs and indicates serious deficiencies in the protein biomarker pipeline, from which no proteomics-discovered analytes have yet emerged.
Proteins are molecular machines responsible for performing most catalytic and structural, as well as many signaling, functions of living organisms. Measurements of proteins thus offer abundant opportunities to detect and characterize molecular malfunctions related to disease and its progression as manifested in the individual patient. Given that blood and its liquid component (plasma or, after clotting, serum) are the overwhelmingly predominant clinical specimens available for routine molecular analysis, molecules present in blood have the widest diagnostic potential. Among these, proteins frequently have the greatest clinical significance. Proteins provide a broad picture of patient phenotype (current status), rather than indicating unchanging probabilistic risks that can be inferred from sequence analysis of genomic DNA (e.g., from white blood cells). Analysis of plasma mRNA (1), although capable of detecting a range of fetal and other phenotypic abnormalities, has not yet achieved widespread use in diagnostics. Likewise circulating tissue cells, useful in cell-based assays, have been observed in specific cancers (2) but have yet to be exploited in routine diagnostic medicine. Small molecule metabolites are abundant in plasma, but, being the products rather than the mechanisms of life processes, reveal a limited range of enzymatic and filtration defects. Thus among the classes of molecular analytes, proteins provide a unique combination of practical accessibility and broad clinical significance.
The clinical chemistry of proteins has already achieved fundamental importance in medicine. Existing protein tests provide a spectrum of clinical information, including definitive diagnosis of acute events (e.g., cardiac troponin released into blood after a myocardial infarction), prediction of disease risk [C-reactive protein (CRP) increases in coronary disease], and detection of disease recurrence (thyroglobulin in metastatic thyroid cancer after thyroid removal). These successes have raised hopes for new and improved clinical diagnostic tests for many disease indications and, as a result, have focused substantial attention on the discovery of novel protein biomarkers. Recent refinements of global protein detection methods, referred to as proteomics, hold out the promise of rapid progress in this area, a promise as yet unfulfilled in the form of any widely used clinical test. Considerable discussion in the research community has identified several factors limiting progress in identifying new clinical biomarkers, including the lack of an effective technology platform to verify candidate markers in large sample sets (3), the difficulty of securing access to well-designed clinical sample sets without significant bias (4), the absence of an organized biomarker development pipeline (5)(6), and finally, the absence of anything approaching a useful theory of biomarkers. This last point exposes the totally empirical nature of biomarker research and suggests a sobering comparison with the pharmaceutical industry, in which progress is slowing despite research funding at more than 100 times the level of protein diagnostics.
In confronting such difficulties, it is instructive to survey past experience in the relevant field (clinical chemistry). One would like to know how many protein tests there are already, at what rate these have been discovered, what aspects of the protein are significant (e.g., concentration, chemical modification), what tests cost in practice, and so on. In this review, I have attempted to assemble publicly available data that shed some light on these questions, focusing particularly on tests that have been granted US Food and Drug Administration (FDA) clearance or approval (taken here as a proxy for demonstrated clinical value). The resulting lists of tests and associated protein accessions also provide a useful summary of well-characterized specific protein assays available for use in benchmarking biomarker research.
Sources and Methodology
On January 2, 2009, I downloaded a database file in text format (clia_detail.txt) of 42 452 diagnostic tests cleared or approved in the US under the Clinical Laboratory Improvement Amendments (CLIA) regulations from the FDA website (www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Databases/ucm142437.htm). These data were collected by the Centers for Disease Control and Prevention (CDC) between 1993 and 2000 and by the FDA since February 2000. Tests cleared or approved before July 26, 1993, are listed as of that date (i.e., there is no information on the timing of test approvals before 1993). The file was parsed into an Excel spreadsheet yielding 885 unique analyte listings (including multiple names for a few analytes). I selected the list of unique “protein” assay targets by inspection of analyte names, using criteria aimed at selecting the gene products actually being measured (summary in Table 1; detail in Supplementary Table 1, which accompanies the online version of this article at www.clinchem.org/content/vol56/issue2). A second list of additional polypeptide analytes (summary in Table 2; detail in online Supplementary Table 2) consists of related molecules that do not meet the unique protein criteria. These fall into several groups. Peptides were distinguished from proteins by an arbitrary mass cutoff of 5 kDa. Posttranslational modifications (PTMs) comprise modified forms of a primary protein gene product already included in the unique protein list. Likewise, where a protein analyte is also tested as part of a complex with other proteins, the complexes are classified separately. Autoantibodies to human proteins, while obviously being proteins themselves, are treated as a separate category. Proteins in blood cells (measured in whole blood, but not in serum or plasma) are also treated separately. This classification, while helpful in establishing the protein analyte count, yields some anomalies.
Brain natriuretic peptide (BNP) is classified as a peptide, whereas N-terminal pro-BNP (NT-proBNP) is a protein, and the date of introduction as a protein analyte is taken as the earlier of these 2. Proteins whose functional unit is a complex of several separate gene products not measured separately (e.g., complement C1 or ferritin) are treated as 1 test analyte instead of as complexes, but all component gene products are counted. Several tests, including some widely used enzyme tests of longstanding usefulness, are difficult to associate with a single specific gene product, and some classic tests [e.g., α-hydroxybutyrate dehydrogenase (α-HBDH)] appear to measure an enzyme activity of a protein with another name [α-HBDH seems to be an activity of a lactate dehydrogenase (LDH) isoenzyme (7)] and are thus excluded.
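The classification rules above can be sketched in Python. This is an illustration only: the selection reported here was made by inspection, and the `Analyte` record, its field names, the order in which the rules are applied, and the approximate masses in the example are all assumptions introduced for the sketch.

```python
from dataclasses import dataclass

PEPTIDE_CUTOFF_KDA = 5.0  # arbitrary mass cutoff stated in the text


@dataclass
class Analyte:
    """Hypothetical record; every field name is an assumption for illustration."""
    name: str
    mass_kda: float
    is_autoantibody: bool = False   # antibody against an endogenous human protein
    blood_cell_only: bool = False   # measured in whole blood, not serum or plasma
    is_modified_form: bool = False  # PTM of a protein already on the unique list
    is_complex: bool = False        # complex of a protein that is also assayed alone


def classify(a: Analyte) -> str:
    """Assign one category; the rule order is an assumption, not from the source."""
    if a.is_autoantibody:
        return "autoantibody"
    if a.blood_cell_only:
        return "blood cell protein"
    if a.is_modified_form:
        return "PTM"
    if a.is_complex:
        return "complex"
    if a.mass_kda < PEPTIDE_CUTOFF_KDA:
        return "peptide"
    return "unique protein"


# Approximate masses, used only to show the BNP / NT-proBNP anomaly.
bnp = Analyte("BNP", mass_kda=3.5)
nt_probnp = Analyte("NT-proBNP", mass_kda=8.5)
print(classify(bnp), "|", classify(nt_probnp))
```

With these inputs the two forms of the same gene product land in different categories, which is the anomaly noted in the text.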
I obtained the earliest FDA approval date for each analyte and the total number of test versions approved by tabulation of records in the database file. I obtained the number of tests offered by vendors of major in vitro diagnostic (IVD) instrument platforms by selecting and tabulating versions whose “Vendor” field included 1 of the following: Abbott, Bayer, Beckman, Becton, Hitachi, Dade, Diagnostic Products, Olympus, Ortho, Roche, or Siemens.
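The tabulation just described can be sketched as follows. The layout of clia_detail.txt is not given here, so the sketch assumes the records have already been parsed into `(analyte, vendor, approval_date)` tuples; that layout, the function names, the case-insensitive substring match, and the example rows are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

# Vendor names listed in the text for major IVD instrument platforms.
MAJOR_VENDORS = ("Abbott", "Bayer", "Beckman", "Becton", "Hitachi", "Dade",
                 "Diagnostic Products", "Olympus", "Ortho", "Roche", "Siemens")


def is_major_vendor(vendor_field: str) -> bool:
    """True if the vendor field includes any listed name (case-insensitive)."""
    field = vendor_field.lower()
    return any(name.lower() in field for name in MAJOR_VENDORS)


def tabulate(records):
    """Summarize (analyte, vendor, approval_date) tuples per analyte."""
    summary = defaultdict(
        lambda: {"earliest": None, "entries": 0, "major_vendor_entries": 0})
    for analyte, vendor, approved in records:
        row = summary[analyte]
        row["entries"] += 1
        if is_major_vendor(vendor):
            row["major_vendor_entries"] += 1
        if row["earliest"] is None or approved < row["earliest"]:
            row["earliest"] = approved
    return dict(summary)


# Invented rows, for illustration only; they are not taken from the database.
example_records = [
    ("Analyte X", "Roche Diagnostics", date(1998, 3, 1)),
    ("Analyte X", "Hypothetical Labs Inc.", date(1995, 6, 15)),
    ("Analyte Y", "Abbott Laboratories", date(2002, 1, 10)),
]
summary = tabulate(example_records)
print(summary["Analyte X"])
```

Plain substring matching can produce false positives on unrelated vendor names that happen to contain one of the listed strings, so a real tabulation would need a manual check of the matches.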
I collected additional data for analytes listed in Supplementary Table 1 by a manual Web-based search procedure. Sequence accessions from the SwissProt database (us.expasy.org/sprot/, accessed in August 2009) were added except where protein identification was unacceptably ambiguous (as indicated by ? or ∼ in the supplementary table). I obtained Current Procedural Terminology (CPT) codes primarily from the Quest and ARUP websites, and reimbursement rates, by searching the 2008 Clinical Diagnostic Laboratory Fee Schedule (www.cms.hhs.gov/ClinicalLabFeesched/02_clinlab.asp). I obtained molecular weights and normal concentration values from the list of Hortin et al. (8) or from SwissProt (molecular weight). Five classes of protein analytes were assigned by inspection: proteins that act in plasma (at least 1 function in circulation), immunoglobulins, receptor ligands (e.g., hormones), tissue leakage (e.g., cardiac troponins), and aberrant secretions (e.g., cancer markers).
Unique protein analytes (as defined for FDA-cleared or -approved tests above) for which plasma- or serum-based tests are conducted by reference laboratories were tabulated by inspection of tables created on August 8, 2009, from the test menus on public websites of Quest (Nichols Institute), ARUP, and Mayo Medical Laboratories, which were then merged into a single list maintaining a count of laboratories offering each unique test.
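A minimal sketch of the merge step, assuming each laboratory's menu has already been reduced to a set of unique protein analyte names. The function name, the laboratory labels, and the menu contents are placeholders, not data from the survey.

```python
from collections import Counter


def merge_menus(menus):
    """Merge per-laboratory menus into one count of laboratories per analyte.

    menus: mapping of laboratory name -> iterable of analyte names.
    """
    counts = Counter()
    for analytes in menus.values():
        counts.update(set(analytes))  # set() counts each lab once per analyte
    return counts


# Placeholder menus, for illustration only.
menus = {
    "Lab A": {"albumin", "ferritin"},
    "Lab B": {"albumin"},
    "Lab C": {"albumin", "ferritin", "cystatin C"},
}
lab_counts = merge_menus(menus)
offered_by_all = sorted(a for a, n in lab_counts.items() if n == len(menus))
print(lab_counts, offered_by_all)
```

Keeping the per-analyte count rather than a plain union is what allows statements such as "offered by all 3" to be read directly from the merged list.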
Similarly, I manually searched the Directory of Rare Analyses (DORA) (9), last published in 2007 by J.M. Hicks and D.S. Young, for protein-based tests and assembled a spreadsheet (see online Supplementary Table 3). DORA contained tests for 65 proteins and 32 peptides not found in the FDA-approved list.
Given the numbers of tests surveyed and the selection of protein analytes by inspection, there is significant potential for error or inexactness in the assembled data. Any such errors are the responsibility of the author, who would be grateful for material corrections.
FDA-Cleared or -Approved Tests for Proteins in Plasma or Serum
The primary objective of this analysis was to investigate the number and character of protein analytes currently measured by IVD tests, beginning with those that have been cleared or approved by the FDA. For this purpose, I used a rather restrictive definition of protein, resulting in selection of 109 unique protein analytes (Table 1, online Supplementary Table 1). These proteins comprise approximately 114 gene products, if we exclude the 7 immunoglobulin component chains (5 heavy-chain and 2 light-chain types made by recombination of approximately 26 partial genes).
A large majority of the protein analytes (88 of 109, approximately 80%) are typically measured by immunoassay (see online Supplementary Table 1), with the remainder accounted for by enzyme assay (19 of 109, 1 of which uses an antibody) and functional coagulation assay (2 of 109). Costs for the FDA-approved assays, using average 2008 Medicaid reimbursement rate as a proxy, vary by more than 10-fold, from $9 for albumin to $122 for her-2/neu protein, with an average cost of $29 (median $25). Average immunoassay cost across these analytes is $31, whereas average enzyme assay cost is $16, reflecting the age and overall simplicity of enzyme assays. There appears to be little if any relationship between test cost and the protein’s normal concentration or the date of the test’s approval.
The largest subset of the 109 protein analytes (45%) carry out a known function in plasma other than antigen binding, and a further 6% are immunoglobulins (Fig. 1). Hence a total of 51% of assays measure proteins that can be considered to carry out their normal functions in plasma, adopting a previous classification (10). Tissue leakage products (25%) and receptor ligands (hormones, etc.; 18%) constitute the other major classes, with the remainder comprising aberrant secretions (mainly tumor markers; 6%).
Pace of Introduction of New Tests
Of the 109 unique FDA-approved protein tests as defined here, 87 (80%) were introduced before 1993 and 22 (20%) in the last 15 years (Fig. 2). The average rate of FDA approval of new protein-based tests over the last 15 years is thus approximately 1.5 new tests per year (median of 1 per year), a rate that appears to have been essentially constant over this period. Although an earlier analysis through 2002 (10) suggested a substantial decline, the present reanalysis extending through the end of 2008 indicates only a slight downward trend. The current rate of introduction does seem to be lower than before 1993, given that most pre-1993 tests were probably introduced in the prior 15-year period (1977–1993, the heyday of immunoassay development with monoclonal antibodies): 87 tests introduced between 1977 and 1993 would yield an average rate of 5.8 new tests per year.
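The two rates quoted above follow from simple division, shown here as a short Python check. The variable names are illustrative, and the earlier-period figure rests on the stated supposition that all 87 pre-1993 tests appeared within a single 15-year window.

```python
tests_last_15_years = 22   # introduced in the last 15 years
tests_before_1993 = 87     # listed as of July 26, 1993
window_years = 15

recent_rate = tests_last_15_years / window_years   # about 1.47 per year
earlier_rate = tests_before_1993 / window_years    # 5.8 per year under the supposition

print(f"recent: {recent_rate:.1f} per year, earlier: {earlier_rate:.1f} per year")
print(f"ratio: {earlier_rate / recent_rate:.1f}-fold")
```

On these assumptions the earlier period would have been roughly 4 times as productive as the most recent 15 years.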
Approximately 90% of all protein tests ever formally cleared or approved are still in active use: 98 of the 109 tests are offered by at least 1 of the 3 reference laboratories surveyed, and 85 (78%) are offered by all 3. A somewhat smaller proportion (80 tests, 73% of the total) have been offered on major vendor instrument platforms as defined here, and 77 tests are available in both settings. Most of the 11 tests not offered by any of the 3 reference laboratories are probably obsolete (10 were introduced before 1993, including 5 enzymes); only myeloperoxidase was introduced recently (2005).
Individual protein analytes appear multiple times in the database, each the result of a CLIA approval and categorization (here called an “entry”). Repeat entries can indicate that a new assay has been developed (e.g., by a new vendor), that an assay has been improved or changed category, that an existing assay has been implemented on a new instrument platform, or even that the vendor has changed its name. Hence the absolute count of assay entries is not directly interpretable in terms of the number of distinct assays that have been made to detect the analyte, but is used here as a proxy for the amount of development activity around a given protein analyte. In aggregate, the 109 protein tests have been entered a total of 7603 times, with 2949 of these entries submitted by major instrument platform vendors.
CRP has been entered most frequently (378 entries), perhaps because of its dual use (low-sensitivity test for inflammation and high-sensitivity for cardiovascular risk). Nine other tests have been entered >200 times, including 6 enzymes [amylase, aspartate aminotransferase (AST), alanine aminotransferase (ALT), γ-glutamyltransferase (GGT), lipase, and creatine kinase-MB (CK-MB), here considered a protein complex] and 3 high-/medium-abundance plasma proteins (albumin, ferritin, and fibrinogen). A further 21 tests have been entered between 101 and 200 times, 37 tests between 11 and 100 times, 28 tests 2–10 times, and 13 tests just once. Among tests implemented >100 times, only 1 was introduced more recently than 1993 [cardiac troponin I (TnI), 1995], whereas 7 of the 13 single-entry tests arrived in that period.
Judging by entry frequency, the most successful tests introduced in the last 15 years (each with at least 10 implementations) are cardiac TnI (1995; 103 implementations), lipoprotein(a) [Lp(a)] (1997; 62), cancer antigen 15-3 (CA15-3) (1998; 36), pancreatic amylase (2002; 36), cystatin C (2001; 22), NT-proBNP (2002; 18, plus BNP alone 2000; 44), and soluble transferrin receptor (sTfR 1997; 18). Immunoassays predominate in recent tests: only 1 of these 7 involves an enzyme (pancreatic amylase), and even in that case, the test usually involves an antibody to achieve isoenzyme specificity.
Under CLIA regulations, tests are classified in 3 categories as to complexity: high, moderate, and waived. Of the 109 protein analytes, 95 have a high-complexity version (presumably introduced first), and of these, 54 have a moderate-complexity version of the same date (i.e., 1993). In 17 cases, a moderate version was introduced in the last 15 years after a preexisting high version, and in 12 cases a moderate test was introduced without a high-complexity precursor. A total of 9 analytes have waived versions (considered simple and accurate to use), in each case approved in the last 15 years and following >100 approved high- and moderate-complexity versions. The waived analytes include 5 enzymes (GGT, AST, ALT, amylase, and alkaline phosphatase), 3 hormones [thyroid-stimulating hormone (TSH), luteinizing hormone (LH), follicle-stimulating hormone (FSH)], and albumin.
Other FDA-Cleared or -Approved Tests
A further 62 tests involving human proteins have been cleared or approved by the FDA (online Supplementary Table 2, Fig. 3). Autoantibodies against human proteins formed the largest subset (20 tests; numerous allergen tests were excluded because they involve IgE against nonhuman protein antigens), some of which are aimed at detecting autoimmune disease (e.g., antiacetylcholine receptor antibodies) and some at detecting autoantibodies known to interfere with immunoassays of the protein target (e.g., antithyroglobulin antibodies (11)). The second-largest subset involved posttranslational modifications of proteins for which tests exist already in Table 1 (18 tests, of which 6 involve glycosylation and 12 involve proteolytic cleavage and/or cross-linking). These tests demonstrate the clinical potential for PTM-based biomarkers emerging from advanced proteomics studies. Peptides falling beneath the 5-kDa arbitrary cutoff (7 tests), complexes (3 tests), and proteins assayed in blood cells but not in plasma or serum (14 tests) account for the remainder.
Laboratory-Developed Tests for Additional Unique Proteins in Plasma/Serum
A survey of large reference laboratory test menus and the published DORA yielded an additional 96 unique protein tests applied to plasma or serum (see online Supplementary Table 3). A pool of 3 reference laboratories offered tests for 62 unique proteins in addition to 98 of the 109 FDA-approved analytes. DORA listed a further 34 unique protein tests (in addition to 31 also offered by the reference labs), approximately half of which consist of cytokines and growth factors widely measured in cell biology studies. The overall total number of unique proteins measured is 205 (Fig. 4).
Clinical Plasma Proteome
These assays, taken together, provide a reasonable estimate of the scope of the current clinical plasma proteome. FDA-cleared or -approved assays available for 109 human proteins cover approximately 0.5% (116) of the roughly 20 500 protein-coding human genes (12). Assays for an additional 96 proteins are offered as laboratory-developed tests (LDTs) by major reference laboratories or referred to in DORA, yielding a total of 205 unique protein analytes in plasma or serum (approximately 1% test coverage of the proteome). If we were to include related assays targeting peptides, posttranslational modifications, protein complexes, etc., the grand total would be significantly larger: 62 such additional assays were found among those approved by the FDA, but LDTs in these categories were not tabulated. Despite some ambiguities regarding the specific isoforms and subunit compositions of proteins being measured, as well as the recognized specificity limitations of enzyme and immunoassays (13), overall this assay portfolio compares favorably with current proteomics discovery methods in terms of sensitivity, precision, and site-to-site reproducibility. Its size and widespread clinical utilization demonstrate beyond any doubt the value of protein-based diagnostics.
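The coverage figures above can be reproduced with a few lines of Python. The function and variable names are illustrative, and treating each of the 205 analytes as a single gene product is a simplification made only for this check.

```python
PROTEIN_CODING_GENES = 20_500   # rough gene count cited in the text
fda_gene_products = 116         # gene products covered by FDA-cleared or -approved assays
total_unique_proteins = 205     # 109 FDA-approved analytes plus 96 LDT analytes


def coverage_percent(n, total=PROTEIN_CODING_GENES):
    """Percentage of protein-coding genes represented by n gene products."""
    return 100.0 * n / total


print(f"FDA-approved: {coverage_percent(fda_gene_products):.2f}%")
print(f"All clinical tests: {coverage_percent(total_unique_proteins):.2f}%")
```

The first figure works out to slightly more than the quoted 0.5%, and the second to the quoted 1%.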
Implications for the Future of Protein Diagnostics
Unfortunately, the diagnostic protein portfolio is growing much too slowly to meet important medical needs, which include tests for Alzheimer disease, chronic obstructive pulmonary disease (COPD), any imminent cardiovascular event, occlusive vs hemorrhagic stroke, and most forms of cancer. In reality, early detection of almost any serious disease constitutes a significant unmet need. Although the rate of new protein analyte introduction measured over the past 15 years no longer appears to have been falling (as it appeared in an earlier analysis (10)), the rate appears to be very low indeed, averaging 1.5 new protein analytes per year. This rate is far below what must have been achieved in the early phase of immunoassay development, when most of the 87 pre-1993 tests seem to have appeared at a rate closer to 6 per year. Growing biomarker discovery efforts appear to have the potential to kickstart a new wave of diagnostic development. Global methods of proteomics (14)(15)(16) can now detect >1000 proteins in plasma (5 times the number for which we have specific tests) despite being limited so far to ∼μg/L (ng/mL) sensitivity (roughly 1000-fold less sensitive than the best specific immunoassays). However, none of the tests approved by the FDA to date appear to have been discovered as biomarkers using the methods of proteomics, and indeed the development of proteomics does not appear to have yet caused a visible increase in the rate of introduction of new protein analytes.
The approximately 100 LDTs listed here might serve as an alternative source of new FDA-approvable tests if reality conformed to a naive paradigm in which analytes are introduced as LDTs, then progress to FDA approval on a hospital instrument, and finally mature to a point-of-care device (a process described by knowledgeable IVD executives as requiring 10–20 years per stage). However, there seems to be little evidence that current LDTs include new blockbuster tests like BNP, or that many current LDTs are progressing onto major IVD platforms (entrenched LDTs with little regulatory burden could have a lasting competitive advantage). Given the increasing pressure (17) for the FDA to extend the more rigorous regulatory mechanisms of 510(k) and premarket approval (PMA) registration to cover these LDTs (thereby significantly increasing the cost to laboratories offering them, and weeding out poorly validated and perhaps unapprovable analytes), it is in fact possible that the total number of protein tests offered clinically could decline substantially over the next few years.
On the other hand, it may well be darkest before the dawn. Several developments could signal the arrival of positive disruptive change in the protein diagnostics arena. Serious efforts are now underway to demonstrate technology components for, and ultimately assemble, an effective high-throughput pipeline for winnowing the large numbers of poorly-credentialed protein biomarker candidates emerging from proteomics research to reveal those with real (approvable) clinical potential. The lack of such an integrated development pipeline is widely believed to be the largest barrier to increasing the flow of new protein tests. At the same time, the use of mass spectrometry (MS) as analyte detector in protein measurement may solve two of the most vexing issues with immunoassays: specificity (MS detectors can provide absolute structural specificity for unique-sequence tryptic peptides measured as surrogates for specific analyte proteins in plasma digests, allowing robust interference rejection) and multiplexing (MS detectors can measure >100 peptides at very little incremental cost per added analyte). If MS methods are able to achieve immunoassay sensitivity (approaching ng/L concentrations) and precision (18) at comparable cost, then similar MS instrument platforms could find use in both biomarker research and development (where a comprehensive library of specific assays for all human proteins has been proposed (19)) and in the clinical laboratory. Such a shared platform between biomarker development and the clinical laboratory could potentially eliminate the costly and time-consuming redevelopment of immunoassays for IVD use.
In parallel with these technical advances, 2 major improvements in the use of IVD results may further leverage their clinical (and thus economic) value. The ability to use multiplex panels of specific proteins may significantly improve diagnostic performance through use of protein ratios (giving improved internal standardization) and more sophisticated interpretive algorithms. Perhaps more important, measurement of changes within individual patients over time using periodic sampling (true personalized medicine, with a personal baseline replacing the current population-based reference interval) allows detection of smaller, and thus earlier, disease-related changes (20). Both of these approaches aim to increase biological signal in relation to noise to increase the clinical relevance of test results, and both require a significant increase in the total volume of protein tests performed in the clinical laboratory.
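The contrast between a population-based reference interval and a personal baseline can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the function names, the 3-SD decision rule, the reference interval, and the measurement values are invented and are not drawn from the cited work.

```python
from statistics import mean, stdev


def outside_reference_interval(value, low, high):
    """Conventional check against a population-based reference interval."""
    return value < low or value > high


def deviates_from_baseline(value, history, z_threshold=3.0):
    """Flag a value more than z_threshold sample SDs from the patient's own mean.

    The 3-SD rule is an illustrative choice, not a recommended clinical criterion.
    """
    mu, sd = mean(history), stdev(history)
    return abs(value - mu) > z_threshold * sd


# Invented values in arbitrary units: a stable patient whose result then rises.
history = [10.0, 10.4, 9.8, 10.2, 10.1]
new_value = 12.0
population_interval = (5.0, 20.0)

print(outside_reference_interval(new_value, *population_interval))
print(deviates_from_baseline(new_value, history))
```

In this invented case the new result stays inside the population interval yet stands well outside the patient's own narrow range, which is the kind of smaller, earlier change that periodic sampling is meant to reveal.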
Time will tell whether a revolution in protein diagnostics lies ahead. Such a possibility has long been discounted given the conservative structure of the IVD industry, its regulatory load, limited venture investment, and the paucity of attractive new markers to date. In fact, there appears to have been no really disruptive change in protein diagnostics since the introduction of immunoassays almost 50 years ago (21). Even so, business as usual no longer appears to be a viable option in the face of current challenges in healthcare delivery. If the tools are at hand to construct a new paradigm, we are fortunate indeed.
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.
Authors’ Disclosures of Potential Conflicts of Interest: No authors declared any potential conflicts of interest.
Role of Sponsor: The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, or preparation or approval of manuscript.
Acknowledgments: The author is grateful to Elizabeth Mansfield (FDA) for helpful comments on the manuscript and to John Shaw (Siemens Diagnostics) for helpful general comments. This work was supported by grants from the National Cancer Institute (NCI; U24 CA126476 to N.L. Anderson and others, co-principal investigators) as part of the NCI Clinical Proteomic Technologies for Cancer initiative.
Nonstandard abbreviations: CRP, C-reactive protein; FDA, US Food and Drug Administration; CLIA, Clinical Laboratory Improvement Amendments; CDC, Centers for Disease Control and Prevention; PTM, posttranslational modification; BNP, brain natriuretic peptide; NT-proBNP, N-terminal pro-BNP; α-HBDH, α-hydroxybutyrate dehydrogenase; LDH, lactate dehydrogenase; IVD, in vitro diagnostic; CPT, Current Procedural Terminology; DORA, Directory of Rare Analyses; AST, aspartate aminotransferase; ALT, alanine aminotransferase; GGT, γ-glutamyltransferase; CK-MB, creatine kinase-MB; TnI, troponin I; Lp(a), lipoprotein(a); CA15-3, cancer antigen 15-3; sTfR, soluble transferrin receptor; TSH, thyroid-stimulating hormone; LH, luteinizing hormone; FSH, follicle-stimulating hormone; LDT, laboratory-developed test; COPD, chronic obstructive pulmonary disease; PMA, premarket approval; MS, mass spectrometry.
© 2010 The American Association for Clinical Chemistry