Background: High-throughput proteomic methods for disease biomarker discovery in human serum are promising, but concerns exist regarding reproducibility of results and variability introduced by sample handling. This study investigated the influence of different preanalytic handling methods on surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) protein profiles of prefractionated serum. We investigated whether older collections with longer sample transit times yield useful protein profiles, and sought to establish the most feasible collection methods for future clinical proteomic studies.
Methods: To examine the effect of tube type, clotting time, transport/incubation time, temperature, and storage method on protein profiles, we used 6 different handling methods to collect sera from 25 healthy volunteers. We used a high-throughput, prefractionation strategy to generate anion-exchange fractions and examined their protein profiles on CM10, IMAC30-Cu, and H50 arrays by using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry.
Results: Prolonged transport and incubation at room temperature generated low mass peaks, resulting in distinctions among the protocols. The most and least stringent methods gave the lowest overall peak variances, indicating that proteolysis in the latter may have been nearly complete. For samples transported on ice there was little effect of clotting time, storage method, or transit time. Certain proteins (TTR, ApoCI, and transferrin) were unaffected by handling, but others (ITIH4 and hemoglobin β) displayed significant variability.
Conclusions: Changes in preanalytical handling variables affect profiles of serum proteins, including proposed disease biomarkers. Proteomic analysis of samples from serum banks collected using less stringent protocols is applicable if all samples are handled identically.
Analysis of serum proteomes holds great promise for identifying novel cancer markers for screening, diagnosis, and prognosis (1)(2)(3). Most patients find venipuncture tolerable, and it is standard practice to monitor disease progression or response to therapy by collecting serial blood samples. Recently, various proteomics-based approaches coupled with advanced bioinformatics have been used to identify putative disease biomarkers in patient serum and plasma, and several studies have identified biomarkers that improve the positive predictive value of disease detection (4)(5). The cancer antigen CA125, a diagnostic and prognostic marker for ovarian cancer, is increased in only 85% of patients with ovarian cancer and 50% of those with early stage disease (6), and may also be increased in benign disease. Research using new technologies is focused on developing a marker that will outperform CA125 or can be used in combination to increase the performance of the current test.
Several different methods based on mass spectrometry (MS)1 have been applied in the search for cancer biomarkers [reviewed in (1)(7)]. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) MS (8)(9) has been used extensively for serum profiling. In this method, high-throughput mass profiling with laser desorption/ionization MS instrumentation is performed on sample proteins bound selectively to chip surfaces with different chemical properties. Spectral patterns are then compared across samples to find discriminating masses or changes in peak intensities. Initial enthusiasm about these new technologies has been somewhat tempered by questions on the robustness of class discriminating algorithms and method reproducibility (7)(10)(11). Increasing evidence that sample collection and processing can affect protein profiles and the ability to differentiate between disease and control samples has cast further doubt on the validity of some studies (12)(13). Transit time, storage conditions, clotting time, and tube type can all affect serum profiles, irrespective of true biological variation (14)(15)(16)(17). It is likely that such introduced differences are primarily driven by proteolysis, although other variables may contribute, such as agglutination or differential adhesion of serum polypeptides to tube walls. These findings have raised concerns about using samples for case-control studies from older collections, where samples were collected and transported for different times at ambient temperatures. Many of these collections are unique, with samples predating cancer diagnosis. One such collection is the United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS), in which 202 638 postmenopausal women from 13 centers in the UK were randomized to screening vs controls; this serum bank will eventually have 500 000 samples, including serial samples from 50 000 women (see www.ukctocs.org.uk). 
The collection protocol for this trial allows blood samples to stand on the clot for 24–56 h before processing. Thus, if proteomic technologies are used for biomarker discovery from such collections, it is imperative to compare samples collected and processed using these less stringent protocols with those collected in accordance with protocols that involve immediate transport on ice.
We examined the impact of diverse serum handling protocols on protein profiles observed by SELDI-TOF MS. We also sought to determine the least variable and most clinically feasible handling method for prospective serum collections for proteomic studies.
Materials and Methods
sample collection and handling
This study was approved by the local ethics committee and written informed consent was obtained from all donors. Serum samples were collected at Barts and the London NHS Trust on 2 consecutive days from 25 healthy, postmenopausal women randomized to the CA125 arm of UKCTOCS. Protocol 1 (Green; GN) used the existing UKCTOCS protocol (see www.ukctocs.org.uk) with samples collected in Greiner gel tubes, allowed to clot, centrifuged at room temperature (RT), divided into aliquots, and placed in straws that were heat sealed and stored at −80 °C. Time from venipuncture to centrifugation was 30 h for each sample. Additional samples from the same volunteers were collected in Becton Dickinson red-top tubes, allowed to clot at RT for 60 min, subjected to transport/storage on wet ice for 2 h before centrifugation, transferred to straws, and stored at −80 °C (protocol 2a; Yellow; YE) (14). A 3rd protocol used a 5 min clotting time at RT, followed by transport/storage on wet ice for 3 h before centrifugation, transfer to straws, and storage (protocol 2b; Gray; GY). Three variants of protocol 2b were also used: storage in cryovials instead of straws (protocol 2c; Cryovial; CR), transport/storage on wet ice for 6 h instead of 3 h (protocol 2d; Orange; OR), and transport/storage for 3 h at RT instead of on wet ice (protocol 2e; White; WH). Handling protocols are detailed in Table S1 of the Data Supplement that accompanies this article at http://www.clinchem.org/content/vol53/issue4. Although 25 volunteers × 6 protocols would have generated 150 individual serum samples for analysis, only 13 protocol 2c samples were available because of insufficient material.
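For orientation, the six handling protocols described in this section can be collected into a small data structure. The following Python sketch is illustrative only and is not part of the original study: the field names (`clot_min`, `transit_h`, `transit_temp`, `storage`, `n_samples`) are assumptions, the clotting time for protocol 1 is not stated in the text and is left as `None`, and its 30 h value is the stated time from venipuncture to centrifugation. Sample counts of 25 for every protocol other than 2c are inferred from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Protocol:
    code: str                 # two-letter colour code used in the text
    clot_min: Optional[int]   # clotting time at RT in minutes (None = not stated)
    transit_h: int            # transport/storage time before centrifugation, hours
    transit_temp: str         # "RT" or "ice"
    storage: str              # "straw" or "cryovial"
    n_samples: int            # samples available for analysis


# Values transcribed from the sample collection description; names are hypothetical.
PROTOCOLS: Dict[str, Protocol] = {
    "1":  Protocol("GN", None, 30, "RT",  "straw",    25),
    "2a": Protocol("YE", 60,   2,  "ice", "straw",    25),
    "2b": Protocol("GY", 5,    3,  "ice", "straw",    25),
    "2c": Protocol("CR", 5,    3,  "ice", "cryovial", 13),
    "2d": Protocol("OR", 5,    6,  "ice", "straw",    25),
    "2e": Protocol("WH", 5,    3,  "RT",  "straw",    25),
}


def total_samples(protocols: Dict[str, Protocol]) -> int:
    """Total number of serum samples available across all protocols."""
    return sum(p.n_samples for p in protocols.values())
```

Under these assumptions, `total_samples(PROTOCOLS)` gives the number of individual serum samples actually analyzed, which is lower than the nominal 150 because of the reduced protocol 2c set.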
Sample preparation details are provided as Supplemental Data. Briefly, samples were thawed, randomized, and triplicate 25 μL aliquots placed into 96-well plates. After denaturation in urea and dilution, the samples at pH 9 were put in filter plates containing rehydrated QHyperD® F resin (Pall Corporation) and incubated with shaking. Unbound material was removed on a vacuum manifold as fraction 1 (FR1), and proteins were eluted in a step-wise fashion by decreasing pH (FR2, pH 7; FR3, pH 5; FR4, pH 4; and FR5, pH 3) with a final organic solvent elution (FR6). The 6 fractions were applied to CM10 (weak cation-exchange) and IMAC30-Cu (immobilized metal affinity capture) arrays in 96-sample bioprocessors (Ciphergen Biosystems). FR6 samples were also applied to H50 (hydrophobic) arrays. Chip preparation, sample application, and matrix application were performed according to the manufacturer’s instructions. All liquid handling steps were performed on an Aquarius workstation (Tecan).
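The fraction and array combinations described above can be enumerated explicitly. This short Python sketch is an illustration rather than part of the original workflow; the names `FRACTIONS` and `conditions` are hypothetical. It shows how 6 fractions on 2 array types, plus FR6 on the H50 array, give the 13 conditions reported in the Results.

```python
FRACTIONS = ["FR1", "FR2", "FR3", "FR4", "FR5", "FR6"]
COMMON_ARRAYS = ["CM10", "IMAC30-Cu"]


def conditions():
    """List every (fraction, array) combination profiled by SELDI-TOF MS."""
    combos = [(fraction, array) for fraction in FRACTIONS for array in COMMON_ARRAYS]
    # Only the organic-solvent fraction (FR6) was also applied to the hydrophobic array.
    combos.append(("FR6", "H50"))
    return combos
```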
seldi-tof ms data acquisition and processing
Details of SELDI-TOF MS data acquisition and processing are provided as Supplemental Data. Briefly, spectra were acquired on an externally calibrated ProteinChip® System Series 4000 instrument, using 2 laser intensities for acquisition of low (2.5–20 kDa) and high (20–200 kDa) mass range data. Spectra were processed (baseline subtraction, denoising, normalization, spectral alignment, and peak detection) with CiphergenExpress software, as described in Supplemental Data. Peak numbers were recorded for the different handling methods and fraction types. We examined the differences between the handling methods by principal components analysis (PCA), hierarchical clustering, and examination of P values by use of mean peak intensities from triplicate samples. Median variances were also calculated as descriptors of trends for different collection/handling methods.
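To make the PCA step concrete, the following is a minimal pure-Python sketch of averaging triplicate peak intensities and extracting the first principal component by power iteration. It is not the CiphergenExpress implementation, whose settings are described only in the Supplemental Data; the function names, the mean-centering without scaling, and the choice of power iteration are all assumptions made for illustration.

```python
import math
from statistics import mean


def triplicate_means(replicates):
    """Average replicate spectra peak by peak (each replicate is a list of intensities)."""
    return [mean(values) for values in zip(*replicates)]


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of PC1 for a samples x peaks intensity table.

    Power iteration on the mean-centered data. The fixed start vector could,
    in rare cases, be orthogonal to PC1; a production analysis would use a
    full eigendecomposition instead.
    """
    n_peaks = len(rows[0])
    col_means = [mean(col) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, col_means)] for row in rows]

    # Deterministic, non-uniform start vector, normalized to unit length.
    v = [float(j + 1) for j in range(n_peaks)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(iterations):
        # w = X^T (X v), proportional to the covariance matrix applied to v.
        scores = [sum(x * vi for x, vi in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(n_peaks)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    scores = [sum(x * vi for x, vi in zip(row, v)) for row in centered]
    return v, scores
```

In use, each row would hold the triplicate-averaged peak intensities for one sample, and the returned scores could be compared between handling protocols to see whether samples from one protocol separate from the others along PC1.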
Proteins were enriched by liquid chromatography and ultrafiltration followed by SDS-PAGE. Peptides <5 kDa were identified by direct sequencing by tandem mass spectrometry (see below). Proteins >5 kDa were purified by SDS-PAGE and stained using a Colloidal Blue Staining Kit (Invitrogen). Selected bands were excised, and one quarter of each was extracted using 50% formic acid/25% acetonitrile/15% isopropanol/10% water and reanalyzed by SELDI-TOF MS to confirm matching of the stained band with the peak of interest. The remainder was in-gel digested with trypsin and analyzed by tandem MS using a Q-STAR® XL equipped with a PCI-1000 ProteinChip Interface (Ciphergen Biosystems). MS/MS spectra were submitted to the database mining tool Mascot (Version 2.1.2; Matrix Science) and searched against the updated SwissProt or NCBInr databases with the following search parameters: trypsin, allowing up to 2 missed cleavages (or semitrypsin if the trypsin search was not successful); peptide tolerance ±50 ppm; MS/MS tolerance ± 0.3 Da; peptide charge +1. Peak identifications were also confirmed using data from previous publications (3)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31). All are abundant, ubiquitously expressed serum proteins or proteolytic fragments.
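The peptide tolerance above is expressed in parts per million (ppm), whereas the MS/MS tolerance is an absolute mass window in Da. The following Python sketch illustrates the arithmetic behind a ppm tolerance check; it is not Mascot's scoring code, the function names are hypothetical, and the masses in the usage note are invented for illustration.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_ppm(observed_mass, theoretical_mass, tol_ppm=50.0):
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


def within_da(observed_mass, theoretical_mass, tol_da=0.3):
    """True if the observed mass lies within an absolute +/- tol_da window."""
    return abs(observed_mass - theoretical_mass) <= tol_da
```

For a hypothetical 1500 Da peptide, a ±50 ppm window corresponds to roughly ±0.075 Da, so an observed mass of 1500.06 Da would pass the precursor check while 1500.10 Da would not, even though both fall inside a ±0.3 Da window.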
role of the sponsors
Ciphergen Biosystems participated in the study design, donation of reagents, interpretation of results, and writing of this article, but were not involved in donor selection. Approval of the paper by Ciphergen Biosystems before its submission was not required.
Results
Serum samples were processed 6 different ways (see Table S1 in the online Data Supplement) to assess the effects of time and temperature before storage, clotting time, and storage tube type on SELDI-TOF MS profiles. To improve coverage, samples were prefractionated using strong anion-exchange chromatography, and fractions further spotted onto different SELDI arrays (CM10, IMAC30-Cu2+, and H50), generating 13 different conditions for SELDI-TOF MS analysis. All steps were automated using robotics to improve reproducibility and throughput. The number of peaks (average of triplicate samples) in each fraction type was first compared across the different handling methods (Table 1). Based on previous work, and after preliminary analysis, only fractions FR1, FR4, and FR6 were considered, since the others gave either low peak numbers (FR2) or similar patterns of peaks to those in FR4 (FR3 and FR5). For most fractions, the CM10 surface gave more peaks than the IMAC30 surface, with FR1 (anion exchange flow-through) on the CM10 chip surface yielding the most peaks overall. The organic elution (FR6) gave the highest number of peaks on all chip types. The individual collection methods gave similar numbers of peaks in each fraction and on each chip type, although the least stringent protocol 1 (GN) gave an appreciably higher number in the low mass range, while protocols 2b (GY) and 2c (CR) gave the lowest peak numbers (Table 1).
We next conducted PCA to assess how samples and handling methods grouped together. Protocol 1 (GN) was the most distinctive method, with most volunteer samples grouping together and away from samples collected using the other methods (Fig. 1). This was true for all fraction/chip surface combinations, but only in the low (2.5–20 kDa) mass range. A likely explanation for this separation is the extended transport/storage time used in this protocol. In support of this, partial separation was observed between protocols 2b (GY; 3 h on ice) and 2e (WH; 3 h at RT), suggesting that temperature before centrifugation is a major factor influencing spectral patterns (data not shown). Importantly, there were no differences among the protocols, using either mass range, when samples were transported on ice for 3 vs 6 h (GY vs OR), when samples were clotted for 60 min vs 5 min (YE vs GY), or when samples were strawed vs aliquoted into cryovials for storage (GY vs CR) (data not shown). Protocols were also compared based on P values. For this, the preprocessed 160 most frequent peaks were selected and a median intensity value (n = 3) obtained for each study participant and protocol. To test the null hypothesis that there is no difference between protocols, P values were calculated for all 160 peaks using the Wilcoxon sign test. The Pmin (minimum of all 160 P values) was then used to find a measure of agreement between pairs of protocols by calculating the corresponding “conservative” P value according to the formula min(n × Pmin, 1), where n = 160 and “min” denotes the smaller of the 2 numbers n × Pmin and 1; the word “conservative” refers to the fact that the probability of the P value not exceeding epsilon is at most epsilon, for any epsilon between 0 and 1.
To create a summary table to characterize the combination of all 13 fraction/chip types, the smallest P value in the fractions for each protocol pair was taken and adjusted according to the formula using n = 13 (Table 2). Using this approach, protocols 2b (GY), 2c (CR), and 2d (OR) were most similar and protocol 1 (GN) most dissimilar, in agreement with the PCA. However, there was no significant difference between protocols 1 and 2c (P = 0.276), although protocol 2c had the smaller number of 13 samples, making it difficult to draw reliable conclusions. This may also explain why all other protocols showed a strong agreement with protocol 2c. Notably, there was a significant difference between protocols 2a (YE) and 2b (GY), suggesting that clotting time does influence the protein profiles, a finding which was not apparent from the PCA.
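The “conservative” P value described above amounts to a Bonferroni-style adjustment of the smallest P value. The Python sketch below shows the two-level calculation under stated assumptions: the per-peak P values are taken as already computed, the function names are hypothetical, and the summary step is assumed to operate on the per-condition adjusted values, which the text does not state explicitly.

```python
def conservative_p(p_values):
    """Return min(n * Pmin, 1) for a collection of n P values."""
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one P value is required")
    return min(n * min(p_values), 1.0)


def summary_p(per_condition_peak_p_values):
    """Combine several fraction/chip conditions for one protocol pair.

    Each inner list holds the per-peak P values for one condition (n = 160
    in the study); the outer list has one entry per condition (n = 13).
    """
    per_condition = [conservative_p(p) for p in per_condition_peak_p_values]
    return conservative_p(per_condition)
```

With 160 peaks, a minimum per-peak P value must fall below about 0.0003 for the adjusted value to drop under 0.05, which illustrates why the approach is described as conservative.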
We next analyzed peak variances to assess the general stability of the handling methods. Data were calibrated internally based on known peaks, and median intensities and SDs taken for each peak across the 25 study participants. These values were used to calculate a coefficient of variance (SD/median intensity) for each protocol and fraction type. Median variance values were also calculated and compared across protocols to give an overall measure of variability. Protocol 2a (YE) gave the lowest median variances in both mass ranges, followed by protocols 1 (GN) and 2e (WH), suggesting that these were the most stable methods (Fig. 2A and B). This was corroborated by the observation that these methods gave the highest numbers of peaks with a median variance <1.0 (Fig. 2C).
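The variability measure used here (SD divided by median intensity, summarized as a median across peaks) can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, and the sample SD is used because the text does not say whether a sample or population SD was calculated.

```python
from statistics import median, stdev


def peak_variability(intensities):
    """SD / median intensity for one peak across study participants."""
    med = median(intensities)
    if med == 0:
        raise ValueError("median intensity is zero; ratio is undefined")
    return stdev(intensities) / med


def median_variability(peak_table):
    """Median of the per-peak SD/median ratios for one protocol and fraction.

    peak_table maps a peak label (e.g. its mass) to the list of intensities
    observed for that peak across participants.
    """
    return median(peak_variability(values) for values in peak_table.values())


def count_stable_peaks(peak_table, threshold=1.0):
    """Number of peaks whose SD/median ratio is below the threshold."""
    return sum(1 for values in peak_table.values() if peak_variability(values) < threshold)
```

Comparing `median_variability` across protocols for the same fraction and array gives the kind of overall ranking described above, and `count_stable_peaks` corresponds to counting peaks with a ratio below 1.0.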
Selection of peaks for identification was based on altered intensity and variance across protocols, with emphasis on differences between protocols 1 (GN) and 2a (YE), considered the least and most stringent protocols, respectively. Identifications were in agreement with previous studies, where available (Table 3). Examples of two peaks that displayed increased intensity in protocol 1 samples, but not those of protocol 2a, are shown in Fig. 3. Peak 4286 Da (Fig. 3A) was identified as the 4281.78 Da fragment of inter-α-trypsin inhibitor heavy chain H4 (ITIH4) [see (22)]. Two other ITIH4 fragments (3157.58 Da [see (27)] and 3955.48 Da [see (31)]) and their methionine-oxidized forms were also identified, with all displaying increased intensity in protocol 1 samples (Table 3). An 8144 Da peak appeared to be a superposition of two peaks. In most protocols the 8144 Da peak represented an 8141.59 Da form of platelet factor 4 (30), with an alternatively cleaved signal sequence (data not shown). In protocol 1, the up-regulated peak 8126 Da corresponded to a C-terminal-truncated fragment of C3a anaphylatoxin [8126.52 Da; see (3)(23)] (Fig. 3C). A C1 inhibitor C-terminal fragment (4152.87 Da), an albumin N-terminal fragment (3156.59 Da), and peaks corresponding to neutrophil defensins 1, 2, and 3 were also noticeably increased in protocol 1 samples. Other identifications were apolipoprotein C-I (ApoCI; 6330.59 Da) and its SPA adduct, a truncated form of ApoCII (8204.17 Da), albumin dimer (138 kDa), transferrin (79 kDa), and transthyretin (13.9 kDa), which were relatively stable across the different handling methods. In contrast, hemoglobin α (15126.36 Da) and β (15867.28 Da), and fibrinogen α fragments 3262.47 Da and 5904.22 Da and their modified forms, displayed lower intensities in protocols 1 (GN) and 2e (WH) (Table 3).
Several peaks, including those representing the major form of platelet factor 4 [7765.10 Da; see (30)], showed altered intensity in protocol 2b (GY) vs 2a (YE) and 1 (GN), but not the other protocols, suggesting that their final serum concentration is affected by clotting time (highlighted in bold in Table 3).
Discussion
One of the main objectives of this study was to select an optimal protocol for serum collection and handling for a large case-control study, which would be feasible for both clinical collection and protein profiling. Variance analysis showed that protocol 2a (YE) gave the lowest overall variance when all peaks were considered, and is therefore the preferred serum collection/handling method. This method, while clinically feasible, requires rapid transit of samples on ice to the laboratory for processing and freezing, and imposes additional logistic and resource burdens on clinical studies. No obvious effects on overall peak number, median intensity, or variance were apparent between the various protocols (GY, YE, CR, OR) where samples were placed on ice. This indicates that transport times of up to 6 h, and different storage methods (straws vs cryovials), have little effect on serum protein profiles as long as samples are on ice. However, the altered intensity of some peaks (e.g., platelet factor 4 at 7765.18 Da) did show that altering the clotting time has some effects. Perhaps not surprisingly, PCA showed the greatest separation for protocol 1 over other methods. Separation was representative for all 2.5–20 kDa spectra on all arrays, and was not apparent in the 20–200 kDa range, in accordance with increased peak numbers and intensities in the lower mass range. There was minor separation attributable to transport/storage at RT vs ice (GY vs WH). Thus, although strict control on sample handling protocols has been proposed to reduce variability, these data show that sample collection protocols can be more flexible with regard to clotting times and transport; transport for 3 h at RT or up to 6 h on ice can be used in clinical studies with limited impact on variability. The critical issue is that all individual samples must be treated exactly the same.
The stringent protocol with 1 h clotting and 3 h transit/storage on ice (YE) has been shown to give the most reproducible results in a serum profiling analysis performed using automated magnetic bead-based prefractionation and MALDI-TOF MS (14). However, our study showed that despite transit/storage at RT for 30 h, protocol 1 (GN) also gave a relatively low overall variance. This finding is critical, because it establishes that samples collected in older studies with longer transit times at RT can be used in case-control studies for novel biomarkers as long as all samples were handled similarly. Many of these biobanks are unique because they are associated with long follow-up, and contain samples stored many years before disease diagnosis.
The greater number of peaks (often with lower variances and higher peak intensities) in the low mass range for protocol 1 samples suggests that proteolysis in these samples has gone to completion. A similar trend was also apparent in protocol 2e (WH) samples, which were incubated for 3 h at RT before storage. These data are in agreement with a previous SELDI-TOF MS study showing increases in certain peaks with time between venipuncture and sample processing, with some overlap with the peaks identified here (15). In particular, the increased intensities of the ITIH4, C3a, and C1 inhibitor fragments and their modified forms in protocol 1 samples provide evidence that these degradation products are generated as a result of increased proteolysis due to extended transport/storage. Conversely, full-length hemoglobin α and β displayed decreased intensities in protocols 1 (GN) and 2e (WH), suggesting that they may be subject to degradation. Similarly, the decreases in fibrinogen α fragments 3262.47 and 5904.22 Da were consistent with further degradation to smaller undetected forms. It is harder to explain the increased levels of the neutrophil defensins in the protocol 1 samples. These disulfide bond-containing molecules are resistant to proteolysis, so an indirect mechanism must account for their altered intensities. Several other protein forms did not change significantly with collection method, revealing them to be relatively stable serum markers.
It has been suggested that many candidate disease biomarkers identified in SELDI-TOF MS profiling experiments are abundant acute-phase reactants, and are thus secondary effects of the diseased state (7). For example, serum transthyretin (TTR) is a known marker for nutritional status and the inflammatory acute-phase response. In ovarian cancer, TTR was identified as a potential early diagnostic marker, with decreased TTR levels reported in the sera of ovarian cancer patients compared with controls, without differences in its microheterogeneity (20)(32)(33). Notably, TTR and its modified forms were unaffected by the different handling conditions used here, and displayed relatively low variances across this healthy cohort. Thus, it would appear that TTR is relatively stable, making it a more robust disease biomarker. Similarly, SELDI-TOF MS was previously used to detect ApoCI and transferrin as classifiers of ovarian, colorectal, and other cancers (20)(34), and our data show that they are also stable under the conditions tested.
In a recent study, the putative acute-phase protein ITIH4 was shown to be extensively proteolytically processed, and its fragmentation patterns associated with different disease conditions (27)(31). Fragmentation was generally consistent with cleavages by endoproteases, followed by exoproteases, and the observed fragments were reported to change little under different assay conditions or processing procedures. An up-regulated cleavage fragment of ITIH4 was also shown to enable differentiation of patients with ovarian cancer from healthy controls or patients with benign pelvic masses (32). Our data provide evidence that ITIH4 is relatively unstable, with the generation of fragments increasing in serum maintained at RT for prolonged periods. Hemoglobin β has also been identified as a putative ovarian cancer biomarker (3)(20), but appears from our study to be relatively unstable in serum. With this in mind, ITIH4 fragments or hemoglobin β may not make robust disease biomarkers unless strict precautions are taken with sample handling. Future work will involve additional MS/MS-based identification of the unknown peaks that are discriminatory for the different collection methods. This will allow the assessment of their usefulness as potential disease biomarkers where they have been identified in other studies.
Our work establishes that samples from established serum banks that were not collected in accordance with more stringent protocols can be used for proteomic biomarker studies. The key factor is that all samples in the collection should have been handled in a similar manner. Cases and controls should be matched for transport time, and an assessment should be made of when proteolysis in these samples reaches completion. Biomarker discovery using a proteomic approach in such case-control sets, such as UKCTOCS, will involve stable biomarkers rather than labile proteins. For future studies, the key variable during specimen collection will be transport on ice, and it does not seem to matter whether transport times are then 3 or 6 h.
Financial support from The Eve Appeal and Ciphergen Biosystems Inc. is gratefully acknowledged.
Table 1. Peaks were picked in CiphergenExpress software using the criteria outlined in Materials and Methods. Peak numbers by fraction type, molecular weight range, and collection protocol are shown.
Table 2. Wilcoxon sign test P values were calculated for pairs of protocols by using the median intensity values (n = 3) for the preprocessed 160 most frequent peaks. The minimum P value (Pmin) was then used to find a measure of agreement between pairs of protocols by calculating the corresponding “conservative” P value according to the formula min(n × Pmin, 1), where n = 160. To create the summary table shown, to characterize the combination of all 13 fraction/chip types, the smallest P value in the 13 fractions/chip types for each protocol pair was taken and adjusted according to the formula, using n = 13.
Table 3. Internally calibrated mass, peak identification, references, median peak intensity, and median variance (SD/median intensity) across different handling protocols are shown. Median intensity values in bold indicate peaks that change due to altered clotting time. Median variance values shaded gray and in italics are >1.0.
1 Nonstandard abbreviations: MS, mass spectrometry; SELDI-TOF MS, surface-enhanced laser desorption/ionization time-of-flight mass spectrometry; UKCTOCS, United Kingdom Collaborative Trial of Ovarian Cancer Screening; RT, room temperature; IMAC, immobilized metal affinity capture; PCA, principal components analysis; ITIH4, inter-α-trypsin inhibitor heavy chain H4; ApoCI, apolipoprotein C-I; TTR, transthyretin.
© 2007 The American Association for Clinical Chemistry