Abstract
Background: Cancer has profound effects on gene expression, including a cell’s glycosylation machinery. Thus, tumors produce glycoproteins that carry oligosaccharides with structures that are markedly different from the same protein produced by a normal cell. A single protein can have many glycosylation sites that greatly amplify the signals they generate compared with their protein backbones.
Content: In this article, we survey clinical tests that target carbohydrate modifications for diagnosing and treating cancer. We present the biological relevance of glycosylation to disease progression by highlighting the role these structures play in adhesion, signaling, and metastasis and then address current methodological approaches to biomarker discovery that capitalize on selectively capturing tumor-associated glycoforms to enrich and identify disease-related candidate analytes. Finally, we discuss emerging technologies—multiple reaction monitoring and lectin-antibody arrays—as potential tools for biomarker validation studies in pursuit of clinically useful tests.
Summary: The future of carbohydrate-based biomarker studies has arrived. At all stages, from discovery through verification and deployment into clinics, glycosylation should be considered a primary readout or a way of increasing the sensitivity and specificity of protein-based analyses.
Clinical Tests Based on Cancer-Related Changes in Glycosylation
Aberrant glycosylation has been recognized for >30 years as a hallmark of cancer (1). However, the complex nature of glycan structure and synthesis has constrained the pace of discoveries relating to their biological significance. Recent advances in carbohydrate chemistry, chemical biology, and mass spectrometry (MS) techniques have opened the door to rapid progress in correlating glycan structure and function (2)(3)(4)(5)(6)(7)(8)(9)(10). At the same time, the maturation of proteomics has put cancer biomarker discovery studies at the top of many to-do lists (11). The confluence of these 2 fields has led many investigators to the same conclusion: Exploiting differences in glycosylation between malignant and healthy tissues likely affords excellent opportunities to identify sensitive and specific cancer biomarkers (12)(13)(14)(15)(16)(17)(18)(19)(20). Specifically, as diagrammed in Fig. 1, glycosylation machinery appears to be particularly sensitive to malignant transformation; as a result, the saccharide structures that are added to normal cellular proteins change, resulting in neoglycoforms that can be released from the cell through conventional secretory pathways, or as the result of enhanced proteinase activity. It is possible that a portion of these alternatively glycosylated molecules reaches the bloodstream. As such, they could serve as early sentinels that enable cancer detection.
Fig. 1. Malignant cells release glycoproteins carrying disease-related carbohydrate epitopes into the interstitial space, where they can reach the circulation.
Cancer is associated with major changes in the glycan biosynthetic machinery, including (1) upregulation of fucosyltransferases (FucTs), sialyltransferases (SiaTs), and the mannosyl (α-1,6-)-glycoprotein β-1,6-N-acetyl-glucosaminyltransferase (MGAT5) gene product (GlcNAcT-V), which is involved in the elaboration of highly branched N-linked glycans. Disease-relevant proteinases such as MMPs and ADAM family members are also upregulated. Changes in the expression of glycosyltransferases result in altered glycan assembly, which occurs in the endoplasmic reticulum and Golgi (2). Accordingly, the glycoprotein products of tumor cells carry aberrant carbohydrate structures compared with their normal counterparts. Typical changes include increased levels of fucose (red triangle) and sialic acid (purple diamond), the addition of polylactosamine units [repeating sequences of galactose (yellow circle) and N-acetylglucosamine (blue square)], and higher-ordered branching of N-linked glycans. O-linked glycans are also affected in cancer, typically carrying incomplete or prematurely truncated structures relative to those found on normal cells. After secretion or proteolytic cleavage, glycosylated molecules and/or their cleavage products are released into the interstitial space (3), where they can enter the circulation (4). Because glycoproteins and mucins usually carry many carbohydrate chains, the signals they produce are highly amplified compared with proteins, making them attractive candidates as biomarkers.
Intriguingly, many of the oldest and most widely used clinical cancer biomarker tests detect glycoproteins. These include carcinoembryonic antigen (CEA), commonly used as a marker of colorectal cancer; cancer antigen 125 (CA-125), frequently used to diagnose ovarian cancer; and prostate-specific antigen (PSA) (14)(21)(22). All of these diagnostic markers are glycoproteins, and the glycans that they present change during oncogenesis (18)(23)(24). Capitalizing on alterations in cancer-related protein glycoforms affords the possibility of increasing diagnostic sensitivity and specificity. The separation of serum PSA glycoforms by lectins—proteins that specifically bind glycans according to their structural epitopes—highlights this point. The Maackia amurensis agglutinin lectin can distinguish (P < 0.001) between blood samples from individuals with benign prostatic hypertrophy and prostate cancer patients (18), which standard PSA tests fail to do (25).
Tellingly, several other clinically useful cancer biomarkers directly detect glycan epitopes. CA 19-9 binds to sialyl Lewisa (sLea) [Siaα2-3Galβ1-3(Fucα1-4)GlcNAc] (26), a terminal epitope that imparts unique biological functions (see Biological Significance of Altered Glycosylation in Cancer for details). Although serum CA 19-9 concentrations are the most commonly used biomarker for diagnosing pancreatic cancer, monitoring treatment efficacy, and detecting recurrence, the current test lacks sensitivity and specificity (27). In this context, it is important to consider that CA 19-9 and related epitopes exhibit genetic variation within the population. For example, the secretor FUT2 [fucosyltransferase 2 (secretor status included) (alias, galactoside 2-α-l-fucosyltransferase 2)] and Le FUT3 [fucosyltransferase 3 (galactoside 3(4)-l-fucosyltransferase, Lewis blood group)] genes control the expression of enzymes in the synthetic pathway that governs expression of critical antigenic saccharide residues and, thus, an individual’s presentation of Le blood group structures (28). Accordingly, depending on their genetic makeup, many populations have high background concentrations of these epitopes present as normal components of serum and other secreted glycoproteins (29)(30). Several groups have proposed including analysis of an individual’s secretor and Le blood group status to improve the diagnostic sensitivity and specificity of serum CA 19-9 tests (28)(31). Indeed, 1 study assessed serum CA 19-9 concentrations in 500 healthy individuals and categorized the donors into 6 groups based on FUT2 and FUT3 genotype (28). The authors observed a 6-fold difference in the upper reference limit (the concentration of antigen above which disease is suspected) between the groups with the highest and lowest endogenous CA 19-9 production, suggesting the importance of including this information in the analysis.
The prevalence of glycoprotein and glycan cancer biomarkers can be traced to de novo or altered glycan expression by transformed cells (see Fig. 1 and Biological Significance of Altered Glycosylation in Cancer). Detection of these changes at the cytological (e.g., bladder cancer) and histological levels by lectins and glycoreactive antibodies guides disease diagnoses and enables more accurate prognoses. The lectins Helix pomatia agglutinin (HPA), which detects α-GalNAc ± β1-4Gal, and Ulex europaeus I agglutinin (UEA 1), which recognizes Fucα1-2Galβ, are used to assess breast cancer biopsy specimens (32)(33)(34). HPA is also part of a panel of markers for histological characterization of gastric cancer specimens (35). In both cases, HPA expression is correlated with increased lymph node metastases, and expression of saccharide sequences that react with either HPA or UEA 1 is related to decreased survival (32)(33)(34)(35). Interestingly, many of the most informative tests assess expression of Le antigens. For example, CA 19-9 and other anti-sLea, -Lex, -sLex, and -Ley antibodies are used in the evaluation of biopsy specimens from breast, bladder, colorectal, esophageal and non–small cell lung carcinoma (36)(37)(38)(39)(40)(41)(42)(43). In all instances, Le antigen expression is correlated with increased metastasis, advanced stage of disease, and reduced survival time.
Thus, it is clear that many of the clinical tests currently used to diagnose and manage the treatment of cancer exploit changes in glycosylation that accompany the disease process. In this context, it is interesting to consider the fact that very few biomarker discovery strategies are designed to focus on this class of posttranslational modifications. One reason may be that the complexity and heterogeneity of carbohydrate structures carried by glycoproteins have made it difficult to understand their functions. As discussed in the next section, recent progress has been made in this important area.
Biological Significance of Altered Glycosylation in Cancer
Cancer-related changes in glycosylation reflect interesting and disease-specific alterations in glycan biosynthetic pathways. These include variations in the expression of glycosyltransferases, enzymes that add activated donor monosaccharides in specific stereochemistries to acceptors, i.e., growing carbohydrate chains. Expression of enzymes and their corresponding substrates can be downregulated, as is observed with Core 3 O-carbohydrate structures and laminin-binding glycans in gastric and colorectal tumor biopsies or prostate and breast cancer cell lines, respectively (44)(45). Reduction in these glycoforms appears to be related to increased invasive/metastatic potential of carcinoma cells (46). Alternatively, glycan structures can be upregulated in tumors compared with normal cells. For example, UDP-GlcNAc:N-glycan GlcNAc transferase V (GlcNAcT-V), which is involved in branching of N-linked glycans, is overexpressed in numerous cancers (47); the product of this enzyme, which the Phaseolus vulgaris leukoagglutinating lectin detects, is associated with metastasis and decreased survival time in colorectal cancer (48). The upregulation of cell-surface expression of specific monosaccharides is also observed in cancer; prominent examples include increased sialylation and fucosylation. Regarding sialic acid, the sialyltransferases ST3Gal I, which adds N-acetylneuraminic acid (NeuAc) to O-linked galactose (Gal) moieties, and ST6GalNAc I, which adds sialic acid to the Tn antigen (GalNAcα1-O-Ser/Thr), are both upregulated in breast cancer and are associated with aggressive disease and poor prognosis (49)(50)(51). Another sialyltransferase, ST6Gal I, is also reported to be overexpressed in many malignant tumors (52). Regarding fucose, expression of several relevant biosynthetic pathways correlates with cancer. 
These include the GDP-mannose 4,6-dehydratase, GDP-fucose transporter, and fucosyltransferases FucT-IV and FucT-VII, all of which are upregulated to varying degrees in hepatocellular carcinomas compared with normal tissues [reviewed in (53)]. Similarly, increases in the expression of both FucT-IV and the sialyltransferase ST3Gal II are observed in human colorectal cancer biopsy specimens compared with adjacent noncancerous tissue (54).
Members of the matrix metalloproteinase (MMP) and ADAM (a disintegrin and metalloproteinase) families are another important class of molecules that are upregulated in malignant tissues (55)(56). These degradative enzymes cleave proteins and glycoproteins from the cell surface and the adjacent extracellular matrix (ECM), often altering cellular behavior and stromal signals, respectively. In aggregate, these changes are thought to facilitate the growth and metastasis of tumor cells. For example, “ectodomain shedding” releases and activates plasma membrane-associated molecules such as epidermal growth factor (EGF). These alterations afford several clinical applications; e.g., high MMP expression correlates with adverse outcomes in breast, gastric, pancreatic, and prostate cancers (57)(58). Their glycan-containing products, which have important biological activities, can also be exploited for diagnostic purposes. Thus, tumor-associated enhanced proteinase activity increases the likelihood that molecules carrying these neoglycans will be identified in the surrounding tissue compartment, where they may eventually enter the circulation (Fig. 1 and following sections).
Cancer-related changes in glycan biosynthesis and proteinase activity have enormous biological ramifications, many of which advance tumor formation and disease progression. For instance, some carbohydrate modifications seem to act as “molecular switches” affecting glycan conformation and thus lectin binding. In this context, cancer-related substitutions, including sialylation and core fucosylation, govern myriad functions (e.g., serum clearance and hepatic uptake) (59). Additionally, upregulation of MMPs/ADAMs and presentation of cancer-related glycans generally decrease cell–cell and cell–ECM adhesion and promote migration (56)(60). Regulation of integrin function appears to play a key role in these processes. Integrins are heterodimeric adhesion molecules that mediate cell interactions with ECM substrates such as fibronectin and laminin, and with cell-surface proteins such as vascular cell adhesion molecule (61). As a group, integrins are highly N-glycosylated, with some dimers, e.g., α5β1 and α3β1, possessing a total of 26 potential N-linked glycosylation sites (62).
Numerous aspects of integrin biology—including expression, heterodimerization, ligand binding, and downstream signaling—are modulated by glycosylation (63). A minimal glycan repertoire allows cell-surface expression of these heterodimers; further elaborations to glycan structure refine integrin function. For instance, increased integrin sialylation, mediated by the glycosyltransferase ST6Gal I, leads to integrin activation, increased adhesion to ECM substrates, and augmented migration and invasion of colon cancer cell lines in vitro (52)(64). Similarly, in vivo, ST6Gal I expression by spontaneously derived mouse mammary tumors modulates integrin signaling and tumor differentiation (65). In addition to sialylation, integrins are also modified by the glycosyltransferase GlcNAcT-V and, as a result, carry highly branched structures (47). Spontaneous mammary tumors, which developed in a GlcNAcT-V–deficient mouse model, exhibited decreased growth, metastasis, and signaling downstream of the focal adhesion kinase, evidence that the observed effects are mediated, in part, by loss of integrin-associated N-linked branching (47).
The highly branched GlcNAcT-V glycan products are often further elaborated with the addition of polylactosamine chains, polymeric repeating units of galactose and N-acetylglucosamine (i.e., lactosamines). Galectins, a family of endogenous lectins, bind these epitopes. Galectins can multimerize through homotypic interactions, forming a lattice-type network that incorporates glycosylated ligands and other galectin molecules (66). This architecture stabilizes bound cell-surface molecules, preventing lateral movements and endocytosis. Thus, this phenomenon can affect many processes. GlcNAcT-V regulates tumor motility, in part, through galectin-3 binding to newly presented lactosamine sequences, and the consequences include changes in integrin signaling (67). GlcNAcT-V also modifies N-linked glycans on receptor tyrosine kinases such as EGF receptor (EGFR) and transforming growth factor β receptors I and II (TβRI and II). Upon polylactosaminylation, they become ligands for galectin-3, which leads to lattice formation and receptor entrapment. The lattice helps the cell surface receptors resist removal by constitutive endocytosis, and thus increases receptor signaling (68). EGFR is highly N-glycosylated, whereas TβRI and TβRII carry minimal modifications. Accordingly, cells display a strong GlcNAcT-V–dependent response when ligands bind to the former receptor, and less so when the latter receptors become engaged (68). Interestingly, GlcNAcT-V also plays a role in epithelial to mesenchymal transitions (68), an early stage of oncogenesis that is correlated with metastasis (69). This effect might be mediated, in part, through GlcNAcT-V influences on integrin and growth factor receptor signaling.
Metastasis is a critical event in the progression of tumor development. This multistep process begins when individual primary tumor cells breach the basement membrane and access the local vasculature, where they interact with endothelial cells, platelets, and leukocytes (70)(71). Exit from the circulation is accomplished by extravasation, which requires a series of steps. This shear stress–dependent process begins with rolling on endothelial cells, which leads to tethering and integrin-dependent stable adhesion. It is thought that extravasation occurs at receptive sites primed by circulating tumor-derived factors. Successful metastasis requires coordinated interactions between the newly engrafted tumor cell, the “seed,” and the “soil” at sites of metastasis (70)(71).
The first step of extravasation, rolling of cells along the endothelium, is regulated by glycosylation. Specifically, rolling is mediated by adhesion under shear stress of the selectin family of endogenous lectins to their carbohydrate ligands on cell-surface glycoproteins. The selectin family consists of 3 members that exhibit differential expression patterns. Endothelial cells display E-selectin and P-selectin, platelets express P-selectin, and leukocytes present L-selectin (72)(73). Carbohydrate ligands for selectins are modified Le blood group antigens, i.e., they contain fucosylated and sialylated epitopes. These structures, detectable by antibodies such as CA 19-9 (26), are abundant in malignant tissues, where their expression is strongly correlated with metastasis and poor disease outcome (see Clinical Tests Based on Cancer-Related Changes in Glycosylation above). It is thought that tethering and rolling of tumor cells occur through selectin-mediated interactions between malignant cells and the endothelium (via E- and P-selectin). However, interactions between malignant cells and circulating platelets and leukocytes (via P- and L-selectins, respectively) are also thought to participate in tumor cell extravasation. One model envisions that circulating aggregates of tumor cells, platelets, and/or leukocytes form microvascular occlusions that arrest the cells, providing a possible means of egress from the vessel (70)(73). Thus, although expression patterns vary by virtue of location and carbohydrate selectivity, each selectin family member is poised to influence tumor metastasis.
Evidence is accumulating that the clinical correlation observed between tumor expression of selectin ligands and metastasis (see Clinical Tests Based on Cancer-Related Changes in Glycosylation) is a reflection of a causal relationship (74). Mice lacking L-selectin, P-selectin, or both molecules exhibit greater resistance to the metastasis of colon carcinoma cells (75)(76). E-selectin is also implicated in tumor metastasis. Notably, 1 study followed dissemination of melanoma tumor cells, either presenting or lacking E-selectin ligands, in a mouse model in which E-selectin was overexpressed in the liver. Tumor cell expression of E-selectin ligands was required for metastasis to the liver of transgenic but not wild-type mice (77). Thus, the pattern of E-selectin expression governs metastasis, a finding that has interesting implications for the pathogenesis of human cancers. Additionally, cancer-related protein scaffolds have been identified as selectin-ligand carriers, and thus as participants in the extravasation process. For instance, CEA, the glycoprotein commonly used as a marker for colorectal cancer (Clinical Tests Based on Cancer-Related Changes in Glycosylation above), and CD44, a highly glycosylated cell-surface adhesion protein, can serve as E- and L-selectin ligands (74).
In summary, the examples in this section illustrate the biological importance of glycosylation to tumor development and disease progression. As biomarker discovery becomes more sophisticated, it is clear that using glycobiology to inform experimental design will lead to more sensitive and specific diagnostic tests (78)(79). In this context, cancer-related changes in glycosylation can guide biomarker studies. As discussed in the next section, many laboratories have already begun exploring approaches to exploit the latest technological and biological breakthroughs in these areas.
Current Approaches to Glycan-Based Biomarker Discovery
The challenge of biomarker discovery is to identify the disease-related needle in the haystack of normally expressed proteins. Although several accessible human fluids, including urine and saliva, are available for biomarker studies, the subject of the majority of biomarker discovery approaches to date, and the one with the highest order of complexity, is processed blood (as plasma or serum). In this context, the normal plasma or serum proteome/glycome is the haystack, which creates a myriad of background issues. In this regard, it is noteworthy that 82 of the 100 most abundant plasma proteins are glycoproteins (see Supplemental Table 1, which accompanies the online version of this article at http://www.clinchem.org/content/vol56/issue2) (80). However, the majority of these endogenous molecules carry basic glycan motifs lacking typical cancer-related structures, including fucosylation, sialylation, more complex N-linked branching, and polylactosaminylation. Thus, judicious selection of lectins or antibodies that recognize carbohydrate epitopes for capture of cancer-related glycoforms enables separation of potential biomarker targets. The list of potentially useful lectins and antibodies shown in Table 1 demonstrates the versatility of these tools for identifying structural differences.
Table 1. Specificities of lectins and carbohydrate-reactive antibodies.
Given that affinity selection is the hallmark of workflows that are designed to capture glycan-containing molecules, it is possible to analyze unfractionated samples. However, many investigators commonly include immunoaffinity depletion of the most abundant proteins, which enables deeper coverage of the remaining lower-abundance proteome where biomarkers typically reside. Multiple platforms have been employed for this purpose, including ProteomeLab IgY (Beckman Coulter; www.beckmancoulter.com), Seppro IgY and SuperMix (Sigma-Aldrich; www.sigmaaldrich.com) (81), and the Multiple Affinity Removal System (Agilent Technologies; www.agilent.com) (81)(82)(83).
Current glycan-based biomarker approaches typically involve 1 or more enrichment steps, proteolytic digestion, and liquid chromatography–tandem mass spectrometry (LC-MS/MS) analyses. Enrichment of carbohydrate-containing species is performed at either the glycoprotein or glycopeptide level, with distinct rationales for each strategy. Major advantages of performing affinity selection at the glycoprotein level include the increased likelihood of obtaining sequence information from multiple peptides, which in turn facilitates glycoprotein identification (84). Disadvantages of this approach, which are overcome by selecting at the glycopeptide level, include a higher probability of nonspecific protein–protein interactions, a particular concern since many lectins are hydrophobic molecules (85).
Compared with their O-linked counterparts, N-linked glycopeptides are more commonly analyzed in glycan-based biomarker workflows, as they are amenable to peptide N-glycosidase F (PNGase F) digestion, which hydrolyzes the amide bond of the asparagine residue to which the glycan is attached (86). This reaction results in a 1-Da molecular mass increase in the deglycosylated form of the peptide. Additional certainty is gained by requiring that this mass shift be observed in the context of an N-linked consensus sequence (NXT/S, where X is any residue except proline). Because PNGase F is an asparagine deamidase, it is also possible to enzymatically incorporate oxygen-18 at the site of glycan release (87). The development of equivalent enzymes for the release of O-linked structures, and identification of their sites of attachment, have yet to be achieved. In this regard, it may be possible to engineer a proteinase that can hydrolyze the GalNAc-Ser/Thr linkage.
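The two criteria described above, a mass increase of about 1 Da and its occurrence within an N-X-S/T sequon, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a published pipeline: the function names and the example peptide are hypothetical, and the mass constants are standard monoisotopic values for Asn-to-Asp conversion (with and without oxygen-18) rather than numbers taken from the cited studies.

```python
import re

# Monoisotopic mass shifts in Da (standard values, assumed for illustration).
DEAMIDATION_SHIFT = 0.984016      # Asn -> Asp after PNGase F in ordinary water
DEAMIDATION_18O_SHIFT = 2.988261  # Asn -> Asp with one 18O incorporated

# N-X-S/T sequon where X is not proline. The lookahead consumes no
# characters, so overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(peptide):
    """Return 0-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(peptide.upper())]


def is_candidate_site(peptide, asn_index, observed_shift, tol=0.01, heavy=False):
    """True if the Asn lies in a sequon and the observed mass shift matches
    the expected deamidation shift within `tol` Da (tolerance is arbitrary)."""
    expected = DEAMIDATION_18O_SHIFT if heavy else DEAMIDATION_SHIFT
    in_sequon = asn_index in find_sequons(peptide)
    return in_sequon and abs(observed_shift - expected) <= tol
```

For the hypothetical peptide AANGTKNPSR, only the first asparagine (position 2) qualifies, because the second is followed by proline. Requiring both conditions guards against counting spontaneous, nonenzymatic deamidation at asparagines outside a sequon.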
Lectin- and Antibody-Based Affinity Selection
A variety of protocols have been developed for lectin capture of glycoproteins and glycopeptides. These include using single lectins (82)(84)(88) as well as mixtures, e.g., concanavalin A, wheat germ agglutinin, and Artocarpus integrifolia (jacalin). The latter approach has been dubbed multilectin affinity chromatography (89)(90)(91)(92). Less commonly, carbohydrate-reactive antibodies such as those recognizing Le blood group antigens (Table 1) are used for enrichment (93). Additional lectin-capture steps and dimensions of fractionation (e.g., hydrophilic interaction, ion exchange, reversed-phase chromatography) have been used in series to achieve greater protein and peptide coverage (82)(84)(87)(89)(90)(91)(93). Elution methods include displacement by specific mono- or disaccharide solutions (88)(89)(94) or by high-salt and/or acidic conditions (84)(93)(95). Numerous workflows use commercially available agarose-conjugated lectins in a centrifugal, gravity flow, or low-pressure HPLC format (84)(89)(92)(93). Magnetic bead technology has also been used (95)(96). When desirable, a higher degree of reproducibility and speed is achieved by using lectins conjugated to HPLC-compatible matrices (83)(88), which enable high-pressure/high-flow-rate chromatography.
In general, glycoprotein/glycopeptide enrichment methods identify fewer proteins than in-depth protein analyses. For instance, in 4 studies involving lectin or antibody glycoprotein/glycopeptide capture and reversed-phase LC-MS/MS analyses, 25–150 glycoproteins were observed (84)(89)(93)(94). As expected, the inclusion of additional separation steps often increases the number of proteins identified. Including a third dimension of separation and/or depletion of abundant proteins enabled identification of 50–250 glycoproteins (82)(87)(89)(94). Interestingly, narrow-specificity lectins that recognize disease-related carbohydrate structures capture fewer plasma proteins than do those with broader specificities, making them attractive affinity selectors for isolating cancer-related glycosylated biomarkers. Jung et al. (84) observed that concanavalin A, which has broad reactivity for the mannose cores typically found in N-linked glycans, captured approximately 30% of the plasma proteome. In contrast, a panel of lectins with narrower specificities bound a much smaller fraction. Two fucose-specific lectins, Aleuria aurantia lectin (AAL) and Lens culinaris agglutinin (LCA), bound approximately 4% and 8% of plasma proteins, respectively; Lycopersicon esculentum lectin (LEL), which targets polylactosamine, and HPA (see Clinical Tests Based on Cancer-Related Changes in Glycosylation) selected ≤0.1% of the proteins contained in plasma.
Several studies have applied glycan-based enrichment strategies to cancer biomarker discovery. In general, these studies tend to be small-scale proof-of-principle endeavors. It is noteworthy that, on the whole, they have succeeded in identifying target biomarker candidates for further validation. The published literature suggests that focusing the selection process on cancer-relevant glycans increases the likelihood of identifying potential disease targets. For example, in 1 study, approximately 10% of the proteins that bound to HPA and LEL were increased in abundance by at least 3-fold in samples obtained from breast cancer patients compared with control subjects (84). Comunale et al. (82) used AAL to enhance recovery of fucosylated glycoproteins from pooled serum samples obtained from 3 study groups: patients with liver cirrhosis with or without hepatocellular carcinoma and healthy control subjects. Approximately 50 glycoproteins tracked with the hepatocellular carcinoma diagnosis, and a subset were validated in a cohort of 332 patients using an antibody-lectin sandwich array (discussed in Strategies to Exploit Cancer-Related Glycobiology for Developing Diagnostic Tests). From these studies, fucosylated hemopexin emerged as a potential biomarker with a specificity and sensitivity of 92% (82).
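For readers less familiar with how figures such as a sensitivity and specificity of 92% are derived, the calculation can be sketched as follows. The function name and the counts in the usage note are hypothetical and chosen only to make the arithmetic visible; they are not the patient counts from the cited cohort.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: diseased subjects testing positive   fn: diseased testing negative
    tn: healthy subjects testing negative    fp: healthy testing positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of non-cases correctly excluded
    return sensitivity, specificity
```

With illustrative counts of 92 true positives, 8 false negatives, 92 true negatives, and 8 false positives, both values come to 0.92. The two measures are computed on separate groups (diseased and nondiseased), so neither depends on disease prevalence in the cohort.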
Chemical-Based Methods
Chemical biology presents numerous approaches to enrich glycosylated molecules by exploiting glycan structures and manipulating their biosynthetic pathways (97). Two similar approaches, involving hydrazide or boronic acid chemistry, capitalize on the cis-diols present in monosaccharides. Zhang et al. (98) pioneered the use of hydrazide chemistry for purification by directly coupling glycosylated proteins to a solid support. In this scheme, carbohydrate cis-diols are oxidized into aldehydes by sodium periodate treatment, then covalently coupled through hydrazide chemistry to functionalized beads. Bound glycoproteins are trypsin-digested in situ, and nonglycosylated/unbound peptides are removed by washing. Covalently coupled N-linked glycoproteins are then released by PNGase F treatment and analyzed by LC-MS/MS. Using this approach, the authors identified 57 serum glycoproteins. The same laboratory published a revised version of this method that captures glycopeptides rather than intact glycoproteins (99). This modified procedure markedly improved glycopeptide recovery to >90%. In the aggregate, approximately 300 unique glycosylation sites were observed in serum samples and tissue lysates. A similar method was used by Sun et al. (100), who identified 302 proteins from an ovarian cancer cell line with 91% selectivity for N-linked glycopeptides in the captured fractions. A complementary approach was devised by Sparbier et al. (101), who used boronic acid–functionalized beads to covalently capture (through formation of heterocyclic boronic acid diesters) glycoproteins followed by elution with acid. This method was recently applied to the analysis of serum samples: 113 proteins were identified, of which 74 were known or proposed glycoproteins (102). Other captured molecules included proteins known to associate with glycoproteins, suggesting that this method may also be useful for characterizing complexes that contain glycosylated molecules. 
Building on data derived from these methods, Aebersold’s group (103) developed a searchable online catalog for depositing data regarding N-linked glycosylation sites and glycopeptide sequences (www.UniPep.org). This repository is an important companion to sequence-based algorithms that predict N-linked glycosylation sites, and as such is a valuable resource for plasma- and serum-based studies. Finally, hijacking glycan biosynthetic pathways for the incorporation of monosaccharides bearing bioorthogonal functional handles affords an alternative approach to chemically labeling glycoconjugates (104).
Glycan-Focused Methods
Whereas the above-described biomarker approaches strive to identify disease-related glycoproteins, another method targets the glycans themselves as cancer biomarkers. This approach involves the enzymatic or chemical release of N- and O-linked glycans from biological sources (e.g., cancer cell lines, tumors, or plasma/serum), followed by MS-based characterization and identification of cancer-related glycoforms. Studies using these techniques identified distinct glycan subsets that correlated with malignant versus nontransformed cell lines, or that were preferentially expressed by cancer cell lines derived from specific tissues (105)(106). Li et al. (107) used gel electrophoresis to separate proteins from medium that was conditioned by cancer cell lines or serum from individuals with ovarian cancer. Glycoprotein-containing bands, which were identified by virtue of their reactivity with Pro-Q Emerald 300, were isolated, and the attached carbohydrates were characterized by MS. The results yielded information about glycan structures that were unique to the disease state. Interestingly, investigators used specific serum glycoforms to distinguish individuals with prostate or ovarian cancer from those without disease (108)(109). A study by Kirmiz et al. (110) demonstrated a successful progression of inquiry that began with analysis of glycans released from a panel of breast cancer cell lines. The study next moved into a mouse mammary tumor model, where cancer-related glycans were observed in serum at various time points over a 6-week course of disease progression. Finally, a small number of serum samples were analyzed from healthy control subjects and patients with breast cancer. Remarkably, the molecular ions corresponding to glycans observed in breast cancer cell lines and serum samples of mice with mammary tumors were also present in patient sera.
Strategies to Exploit Cancer-Related Glycobiology for Developing Diagnostic Tests
The use of glycobiology to inform biomarker studies holds clear promise for the rapid identification of disease markers that enable risk prediction as well as diagnosis, prognosis, and monitoring response to therapy. If recent trends prevail, it is highly likely that future biomarker discovery efforts will focus on developing panels of disease sentinels, rather than pursuing single molecules. This change in strategy has been driven by a better understanding of the inherent complexity of human biology, which has in turn highlighted the importance of patient-specific therapies. In this context, approaches that allow the simultaneous analysis of numerous markers, or multiplexing, are highly advantageous.
At present, 2 relatively new technology platforms are particularly suited for the analysis of cancer-related neoglycans. The first is multiple reaction monitoring (MRM), an MS-based approach. MRM achieves high analytical specificity and sensitivity by selecting predetermined precursor molecular ions for collision-induced dissociation and monitoring the appearance of several diagnostic product ions. When identified in combination, the precursor and product ions confirm the presence of the analyte of interest. Importantly, addition of isotopically labeled internal standards allows absolute quantification. Numerous precursor ions can be monitored in 1 experiment, particularly when using scheduled MRMs, which incorporate prior knowledge of peptide elution times into the LC-MS/MS program (111)(112). Recently, a large National Cancer Institute–sponsored interlaboratory study showed that an MRM approach can quantify target proteins in a background of unfractionated human plasma with highly robust and reproducible results, supporting the notion that this technology is suitable for biomarker discovery (113).
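The two ideas in this paragraph, confirmation by several precursor/product ion transitions and absolute quantification against an isotopically labeled standard, can be illustrated with a short calculation. This is a simplified sketch under stated assumptions: the class and function names are hypothetical, the requirement of 3 observed transitions is an arbitrary illustrative threshold, the peak areas are invented, and the labeled standard is assumed to respond identically to the endogenous analyte.

```python
from dataclasses import dataclass


@dataclass
class TransitionPair:
    """Peak areas for one precursor -> product ion transition."""
    name: str
    light_area: float  # endogenous (unlabeled) analyte
    heavy_area: float  # isotopically labeled internal standard


def analyte_confirmed(pairs, min_transitions=3, min_area=0.0):
    """Treat the analyte as present only if enough transitions show signal."""
    observed = [p for p in pairs if p.light_area > min_area]
    return len(observed) >= min_transitions


def quantify(pairs, spiked_standard_conc):
    """Estimate analyte concentration by stable isotope dilution.

    Each transition yields a light/heavy area ratio. Multiplying the mean
    ratio by the known concentration of the spiked standard gives the
    estimate, in the same units as spiked_standard_conc.
    """
    ratios = [p.light_area / p.heavy_area for p in pairs]
    return spiked_standard_conc * sum(ratios) / len(ratios)


# Hypothetical peak areas for three transitions of one target peptide.
pairs = [
    TransitionPair("y7", light_area=500.0, heavy_area=1000.0),
    TransitionPair("y8", light_area=260.0, heavy_area=500.0),
    TransitionPair("y9", light_area=120.0, heavy_area=250.0),
]

if analyte_confirmed(pairs):
    # Standard spiked at 10 ug/L (assumed); the estimate is close to 5 ug/L.
    estimate = quantify(pairs, spiked_standard_conc=10.0)
```

Averaging ratios across transitions is one of several reasonable choices; a validated assay would typically designate a quantifier transition and use the others as qualifiers.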
The immediate challenge is to increase the sensitivity of these assays to reach the depth of coverage that will be required to identify molecules in the range of abundances in which disease-specific biomarkers likely circulate. In this regard, the stable isotope standards and capture by antipeptide (or anticarbohydrate) antibodies (SISCAPA) approach is designed to enhance the sensitivity of MRM assays. First reported by Anderson et al. (114), this technique uses antibodies to capture target analytes, followed by elution in a small volume for simultaneous enrichment and concentration. When coupled to an MRM readout, SISCAPA improves the limit of detection by up to 3 orders of magnitude over the original assay (115). Thus, a SISCAPA-MRM approach allows detection by MS of biomolecules present in blood at or near concentrations typical of disease biomarkers (e.g., ng/L to μg/L) with a sensitivity approaching that of clinical immunoassays (115)(116). Interestingly, Aebersold and colleagues (112) reached a similar level of plasma glycoprotein detection and quantification by using hydrazide-mediated capture of carbohydrate-containing molecules followed by release of N-linked glycopeptides and detection by scheduled MRM.
Second, other technologies are being developed that do not require the use of the sophisticated instruments and computer software programs that mass spectrometry entails. For example, lectin-antibody sandwich arrays can be used at every stage in the biomarker discovery and verification pipeline. These assays differentiate disease-associated glycosylation, offer a high level of sensitivity and specificity, and are easily multiplexed. Similar in design to an ELISA, the sandwich assay, in separate steps, detects the protein portion of the analyte (via antibodies) and the attached glycan (via lectins and carbohydrate-specific antibodies). Chen et al. (117) used this method to identify cancer-related changes in glycosylation on mucin-1 (MUC1) and CEA in sera from individuals with pancreatic cancer. In a subsequent study, sandwich arrays were used to monitor changes in glycosylation of mucins expressed by human pancreatic cancer cell lines in response to inflammatory stimuli (118).
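Because the sandwich format reads out the protein and its glycan separately, one plausible way to summarize the data is to normalize the glycan signal to the protein signal and compare that ratio between sample groups. The sketch below illustrates the arithmetic only. It is not the analysis used in the cited studies; the function names are hypothetical and the signal values are invented placeholders standing in for background-subtracted array intensities.

```python
def glycan_to_protein_ratio(spot):
    """Glycan signal per unit of captured protein for one array spot."""
    return spot["glycan"] / spot["protein"]


def glycoform_fold_change(case, reference):
    """Fold change in the glycan/protein ratio for targets present in both samples."""
    return {
        target: glycan_to_protein_ratio(case[target])
        / glycan_to_protein_ratio(reference[target])
        for target in case
        if target in reference
    }


# Hypothetical signals keyed by captured protein.
# "protein": detection with an antibody against the captured protein.
# "glycan": detection with a lectin or carbohydrate-specific antibody.
control = {
    "MUC1": {"protein": 800.0, "glycan": 200.0},
    "CEA": {"protein": 400.0, "glycan": 100.0},
}
cancer = {
    "MUC1": {"protein": 1000.0, "glycan": 1000.0},
    "CEA": {"protein": 500.0, "glycan": 125.0},
}

changes = glycoform_fold_change(cancer, control)
```

With these placeholder values, the ratio for MUC1 rises 4-fold while the ratio for CEA is unchanged, which shows how normalization separates a change in glycosylation from a change in protein abundance.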
Conclusions: The Sweet Future of Cancer Biomarkers
In a recent editorial, a group of distinguished glycobiologists dubbed this the “Second Golden Age of Glycomics” owing to the increasing evidence that disease-related changes in glycosylation will provide the basis for improved cancer biomarkers (79). As shown in Fig. 1, we think that the products of glycosylation, which are often altered in the setting of cancer, are particularly interesting targets that offer several major advantages over proteins. First, the biology allows for the rational design of discovery efforts. For example, changes in the glycosylation machinery can be identified from microarray data and translated into structural terms. In this regard, the glycosyltransferases upregulated in the cancer cell compared with its normal counterpart point to specific disease-related aberrations in oligosaccharide structures. In turn, these observations provide a compelling rationale for designing antibody- and lectin-based separation strategies for their isolation in the discovery phase of biomarker-type projects. Second, 1 protein can carry many copies of an altered glycan, which may also be added to other scaffolds. Thus, there is an important amplification effect, which could enable the detection of many fewer abnormal cells than would otherwise be possible. Finally, glycosylation acts to shield the peptide backbone from proteolytic degradation. Thus, in theory, glycan-based biomarkers are likely to be more stable in a variety of disease settings than unmodified proteins, which are often more labile.
In this context, it is interesting to note that 109 diagnostic tests currently approved by the US Food and Drug Administration measure protein concentration alone, whereas 18 additional assays include assessment of posttranslational modifications. Of the latter subset, 6 assays detect carbohydrate structures (119), namely Lens culinaris agglutinin-reactive α-fetoprotein, glycated forms of albumin, bone-specific alkaline phosphatase, LDL, CA 19-9, and carbohydrate-deficient transferrin. Notably, while the currently approved clinical tests monitor single biomolecules in isolation, future diagnostics will include multiplexed analyses of numerous proteins with or without their posttranslational modifications or exclusively glycans. In parallel with the shift toward biology-driven discovery efforts, technological improvements in MS platforms and lectin-antibody array methodologies are opening the door to a new generation of approaches for verification and clinical deployment. In total, the confluence of these changes will strongly affect the way cancer is diagnosed and disease progression/remission is monitored in the context of patient-specific therapies.
Acknowledgments
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.
Authors’ Disclosures of Potential Conflicts of Interest: Upon manuscript submission, all authors completed the Disclosures of Potential Conflict of Interest form. Potential conflicts of interest:
Employment or Leadership: F.E. Regnier, Purdue University and Quadraspec; B.W. Gibson, Oregon State University, University of Iowa, and Rockefeller University.
Consultant or Advisory Role: F.E. Regnier, Quadraspec.
Stock Ownership: F.E. Regnier, BG Medicine, Quadraspec, and Predictive Physiology and Medicine.
Honoraria: B.W. Gibson, Oregon State University.
Research Funding: Clinical Proteomic Technologies for Cancer initiative, 5U24CA126477-04. A. Prakobphol, NIH; F.E. Regnier, National Cancer Institute; B.W. Gibson, NIH.
Expert Testimony: F.E. Regnier, Pharmacia versus Genentech on Human Growth Hormone.
Other Remuneration: F.E. Regnier, The German Chemical Society (travel to HPLC 2009 in Dresden, Germany).
Role of Sponsor: The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, or preparation or approval of manuscript.
Footnotes
1 Nonstandard abbreviations: MS, mass spectrometry; CEA, carcinoembryonic antigen; CA-125, cancer antigen 125; PSA, prostate-specific antigen; sLea, sialyl Lewis a; HPA, Helix pomatia agglutinin; UEA 1, Ulex europaeus I; GlcNAcT-V, UDP-GlcNAc:N-glycan GlcNAc transferase V; MMP, matrix metalloproteinase; ADAM, a disintegrin and metalloproteinase; ECM, extracellular matrix; EGF, epidermal growth factor; EGFR, EGF receptor; TβR, transforming growth factor β receptor; LC-MS/MS, liquid chromatography–tandem mass spectrometry; PNGase, peptide N-glycosidase; AAL, Aleuria aurantia lectin; LCA, Lens culinaris agglutinin; LEL, Lycopersicon esculentum lectin; MRM, multiple reaction monitoring; SISCAPA, stable isotope standards and capture by antipeptide (or anticarbohydrate) antibodies; MUC1, mucin-1.
2 Human genes: FUT2, fucosyltransferase 2 (secretor status included) (alias, galactoside 2-α-L-fucosyltransferase 2); FUT3, fucosyltransferase 3 (galactoside 3(4)-L-fucosyltransferase, Lewis blood group); MGAT5, mannosyl (α-1,6-)-glycoprotein β-1,6-N-acetyl-glucosaminyltransferase.
© 2010 The American Association for Clinical Chemistry