Background: Quantitative gene expression analysis by real-time PCR is important in several diagnostic areas, such as the detection of minimal residual disease in leukemia and the prognostic assessment of cancer patients. To address quality assurance in this technically challenging area, the European Union (EU) has funded the EQUAL project to develop methodologic external quality assessment (EQA) relevant to diagnostic and research laboratories among the EU member states. We report here the results of the EQUAL-quant program, which assesses standards in the use of TaqMan™ probes, one of the most widely used assays in the implementation of real-time PCR.
Methods: The EQUAL-quant reagent set was developed to assess the technical execution of a standard TaqMan assay, including RNA extraction, reverse transcription, and real-time PCR quantification of target DNA copy number.
Results: The multidisciplinary EQA scheme included 137 participating laboratories from 29 countries. We demonstrated significant differences in performance among laboratories, with 20% of laboratories reporting at least one result lacking in precision and/or accuracy according to the statistical procedures described. No differences in performance were observed for the >10 different testing platforms used by the study participants.
Conclusions: This EQA scheme demonstrated both the requirement and demand for external assessment of technical standards in real-time PCR. The reagent design and the statistical tools developed within this project will provide a benchmark for defining acceptable working standards in this emerging technology.
Molecular genetic techniques are now central to clinical research and diagnostics. To maintain patient confidence in this rapidly expanding field and to provide the highest standard of analysis, strict laboratory quality assurance procedures must be followed. External quality assessment (EQA),¹ also known as laboratory proficiency testing, is a well-established means to monitor and improve the quality of laboratory output (1). Accrediting and scientific bodies such as the IFCC recognize participation in EQA schemes as a key tool for quality assurance in clinical diagnostics (2)(3).
Numerous disease-specific EQA schemes have been designed to address standards in molecular genetics (4)(5)(6)(7)(8)(9)(10); however, with more than 800 genetic tests currently available or in development [data from GeneTest web site (11)], disease-specific EQA is unlikely to meet the increasing demand. In addition, disease-specific EQA schemes are not applicable to research laboratories, which as a consequence have lacked this external oversight. The development of generic EQA schemes relevant to all testing laboratories is a possible solution to this problem.
the EQUAL program
The EQUAL scheme [Full program title: Multinational external quality assay (EQA) programs in clinical molecular diagnostics based on performance and interpretation of PCR assay methods including dissemination and training (http://www.ec-4.org/equal/04_projectpresentation.htm)] arose under the auspices of the European Communities Confederation of Clinical Chemistry and Laboratory Medicine (EC4) from a successful application to the European Union (EU) Sixth Framework Program. A primary aim of the EQUAL program was to develop and implement novel methodologic EQA programs applied to PCR-based technologies used in research and diagnostic laboratories across Europe. As part of this work package, EQA schemes were designed and implemented for general DNA analysis (EQUAL-qual), quantitative PCR (EQUAL-quant), and sequencing (EQUAL-seq). The results of the EQUAL-seq work package have been described elsewhere (12); here we describe the EQUAL-quant project.
Quantitative PCR is currently used in a wide range of clinical and research applications, including the measurement of gene dosage, detection of residual disease in hematologic malignancies, and detection of bacterial and viral infection. Real-time PCR, first described in 1993 (13), is now a widely used procedure that in many areas has replaced end-point analysis. As a result of the widespread use of this technology, commercial reagents specifically designed and optimized for real-time PCR applications are now used extensively to enhance measurement sensitivity, precision, and accuracy.
The in vitro diagnostic medical device (IVD) directive of the European Communities (98/79/EC) requires that manufacturers of in vitro diagnostics assure the traceability of values assigned to their calibrators and control materials through available reference measurement procedures and/or available reference materials of a higher order. Currently, no certified reference materials are available for molecular genetic testing, and consequently, commercial providers of diagnostic devices must consider EQA performed by their client base as an essential means of postmarket monitoring of the analytical validity of their product.
A previous national interlaboratory comparison (14)(15) identified high variability in performance among laboratories using TaqMan probes in conjunction with real-time platforms. The aim of the EQUAL-quant program was to demonstrate the feasibility and scientific validity of methodologic proficiency testing for measuring real-time PCR performance throughout the EU.
Materials and Methods
National and international scientific bodies invited laboratories to participate in the EQUAL-quant scheme by overland mail and web site advertisements. All applications to participate were made via the project web site. On registration, all participants were allocated a unique identification code. A total of 137 laboratories from 29 countries applied to take part in the EQUAL-quant scheme (participating centers are listed in Table 1 of the Data Supplement that accompanies the online version of this article at http://www.clinchem.org/content/vol52/issue8/).
choice of target
The target chosen for the EQUAL-quant scheme was the ABL protooncogene, assayed by use of primer and probe sets developed within the Europe Against Cancer program (16) (primer details are given in Table 2 of the online Data Supplement). Wild-type ABL is a ubiquitously expressed housekeeping gene with expression detectable by real-time PCR.
scheme design and reagent sets
Participants received detailed instructions regarding the actions required to carry out the EQUAL-quant EQA exercise. All participants were provided with the following reagents:
90.0 μL of ABL primers and 5′-FAM/3′-TAMRA–labeled (where FAM is 6-carboxyfluorescein, and TAMRA is 6-carboxytetramethylrhodamine) probe master mixture (25×);
50.0 μL each of 5 ABL plasmid standards (containing 10, 10², 10³, 10⁴, or 10⁵ copies/5 μL);
Test samples T1, T2, and T3: 50.0 μL each of 3 cloned test cDNAs (unknown to the participants, containing 50, 500, or 50 000 copies/5 μL); and
Samples C1 and C2: 1.0 mL each of 2 samples containing cells (K562) suspended in RNAlater™ (Ambion).
Using the reagents provided, participants were asked to:
Construct a calibration curve;
Estimate cDNA copy numbers in the 3 test samples (T1, T2, and T3); and
Carry out RNA extraction, real-time PCR, and cDNA quantification on 2 samples of cells provided (C1 and C2). This action was optional.
Reagents for mRNA extraction and reverse transcription were not provided. The cDNAs (T1, T2, and T3) were intermediate dilutions of the ABL plasmid; their concentrations were measured on a spectrophotometer (absorbance) and confirmed by real-time PCR. The K562 cell line was grown according to American Type Culture Collection conditions. For C1 and C2, a single batch of culture was amplified in bulk to obtain 2 × 10⁹ cells. The cells were then washed with phosphate-buffered saline (137 mM NaCl; 10 mM PO4; 2.7 mM KCl; pH 7.4), resuspended in RNAlater, and divided into aliquots, each tube containing 1 mL of RNAlater + 5 × 10⁶ cells. The reagent sets were designed to be stable at room temperature and were shipped by guaranteed 24-h delivery. Participants were advised to store the reagent sets at −40 °C on arrival and were given 8 weeks to complete the analysis and submit their results. Before the full distribution, 5 pilot reagent sets were manufactured and validated by independent reference laboratories (in 3 countries) to ensure stability and suitability of the contents.
collection of results
Participants were free to define the cycle interval for calculation of the fluorescence baseline and the cycle threshold (Ct) value for calculations. Ct values for the no-template control, the calibrators, and all unknown samples were collected, as were details on the testing platforms used. Participants were required to provide all Ct measurements in triplicate and to 3 decimal places. All results were collected via a password-protected web site accessed with the unique laboratory identification number allocated at registration. Of the 137 laboratories that received the EQUAL-quant reagent set, 24 (18%) did not submit results and did not contact the scheme organizer to explain their nonparticipation. Ten (7%) other laboratories informed the scheme organizer that they would be unable to submit results in the time scale of the survey. Thus, the results from 103 laboratories (75%; laboratories L001–L103) were available for analysis. To protect participant anonymity, the laboratory codes L001–L103 used here are different from those used during the course of the project.
Among the 103 laboratories that provided data, 10 failed to meet the acceptance criteria for consideration. Specifically: 4 laboratories gave insufficient data, 3 laboratories provided Ct values for the no-template control that indicated assay contamination, and 3 laboratories did not provide Ct values for 1 of the 5 standard dilutions. Therefore, data from 93 laboratories were included in the final analysis.
To evaluate the precision and accuracy of laboratory results, we considered all Ct values provided for the standard dilutions, cDNA samples (T1, T2, and T3), and biological samples (C1 and C2). A calibration curve was fitted to the data provided by each laboratory by plotting the mean Ct numbers as a function of the known starting concentration of the standard dilutions. This curve was then used to estimate the unknown starting concentration in the test samples. Only concentration values included within the range of calibration data (interpolation) were taken into consideration for the subsequent analysis; concentration values falling outside the range of standard dilutions (extrapolation) were omitted, because we cannot assume that the linear model on which the calibration is based will hold true outside the range of the standard dilutions.
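The calibration and interpolation steps described above can be illustrated with a minimal sketch. It assumes a straight-line fit of mean Ct against log10 copy number, which the text implies ("linear model") but does not spell out; the function names and all Ct values are hypothetical and are not taken from the study data.

```python
def fit_calibration(log10_copies, mean_cts):
    """Least-squares line: mean Ct = intercept + slope * log10(copies)."""
    n = len(log10_copies)
    x_bar = sum(log10_copies) / n
    y_bar = sum(mean_cts) / n
    sxx = sum((x - x_bar) ** 2 for x in log10_copies)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(log10_copies, mean_cts))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def estimate_copies(ct_replicates, slope, intercept, x_min, x_max):
    """Invert the curve for one sample; None signals extrapolation."""
    mean_ct = sum(ct_replicates) / len(ct_replicates)
    log10_est = (mean_ct - intercept) / slope
    if not (x_min <= log10_est <= x_max):
        return None  # outside the standards: omitted from the analysis
    return 10 ** log10_est


# Hypothetical standards: 10 to 10^5 copies/5 uL, one mean Ct per dilution.
standards_log10 = [1.0, 2.0, 3.0, 4.0, 5.0]
standards_ct = [35.0, 31.6, 28.2, 24.8, 21.4]
slope, intercept = fit_calibration(standards_log10, standards_ct)

# Hypothetical triplicate Ct readings for two unknown samples.
inside = estimate_copies([30.0, 29.9, 29.8], slope, intercept, 1.0, 5.0)
outside = estimate_copies([37.1, 37.0, 36.9], slope, intercept, 1.0, 5.0)
print(slope, intercept, inside, outside)
```

The second sample has a mean Ct above that of the lowest standard, so its estimate would require extrapolation and the sketch returns None instead of a copy number.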
The 95% confidence intervals (95% CIs) of the cDNA concentrations of the unknown samples were the pivotal statistics adopted for analysis. These intervals were computed by means of Fieller’s theorem as described elsewhere (17), and for laboratories providing at least one Ct value per sample are depicted graphically in Figs. 1 and 2 as rectangles. The upper and lower borders of the rectangles correspond to the upper and lower limits, respectively, of each CI.
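The exact computation is given in reference (17) and is not reproduced here. As a hedged illustration only, the sketch below uses the textbook Fieller-type interval for inverse prediction from a straight-line calibration. It assumes that the line is fitted to individual Ct readings on the log10 scale, that the unknown sample is the mean of m replicates with the same per-reading variance, and that the Student t quantile is supplied by the caller. The function name and data are hypothetical.

```python
import math


def fieller_interval(xs, ys, y0_mean, m, t_crit):
    """Confidence limits for x0 in y = a + b*x, given the mean response y0.

    xs, ys  : calibration readings (log10 copies, Ct), one pair per reading
    y0_mean : mean Ct of the unknown sample over m replicates
    t_crit  : two-sided Student t quantile for len(xs) - 2 degrees of freedom
    Returns (lower, upper) on the log10 scale, or None when the slope is
    too poorly determined for the interval to be bounded.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    resid = [y - (y_bar + b * (x - x_bar)) for x, y in zip(xs, ys)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    k = t_crit ** 2 * s2
    # The limits are the roots d = x0 - x_bar of
    #   (D - b*d)^2 = k * (1/m + 1/n + d^2 / Sxx),  with D = y0_mean - y_bar.
    D = y0_mean - y_bar
    qa = b * b - k / sxx
    qb = -2.0 * b * D
    qc = D * D - k * (1.0 / m + 1.0 / n)
    disc = qb * qb - 4.0 * qa * qc
    if qa <= 0 or disc < 0:
        return None
    root = math.sqrt(disc)
    return (x_bar + (-qb - root) / (2 * qa),
            x_bar + (-qb + root) / (2 * qa))


# Hypothetical calibration: 5 standards read in triplicate (15 readings).
levels = [1.0, 2.0, 3.0, 4.0, 5.0]
xs = [x for x in levels for _ in range(3)]
ys = [38.4 - 3.4 * x + off for x in levels for off in (-0.1, 0.0, 0.1)]

# 2.160 is the two-sided 95% t quantile for 13 degrees of freedom,
# taken from standard tables (an input here, not computed).
lo, hi = fieller_interval(xs, ys, y0_mean=29.9, m=3, t_crit=2.160)
print(lo, hi, 10 ** lo, 10 ** hi)
```

Back-transforming the limits with 10 ** x gives an interval in copies/5 μL that is asymmetric around the point estimate, which is expected for a fit on the logarithmic scale.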
To evaluate the precision of the concentration estimates, the 90th centiles of the cumulative distribution of 95% CI lengths were calculated by considering all available 95% CIs for each sample. Intervals with lengths exceeding this threshold are depicted in the graphs as lacking in precision.
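A minimal sketch of this precision rule follows. The laboratory codes and CI lengths are hypothetical, and the centile is computed with the "inclusive" interpolation method of Python's statistics module, which is an assumption: the original analysis was run in SAS, whose percentile definition may differ slightly.

```python
from statistics import quantiles

# Hypothetical 95% CI lengths for one sample, keyed by laboratory code.
ci_lengths = {
    "L001": 0.10, "L002": 0.12, "L003": 0.11, "L004": 0.13, "L005": 0.15,
    "L006": 0.14, "L007": 0.16, "L008": 0.18, "L009": 0.17, "L010": 0.90,
}

# quantiles(..., n=10) returns the 9 decile cut points; the last one
# is the 90th centile of the CI-length distribution.
threshold = quantiles(ci_lengths.values(), n=10, method="inclusive")[8]

# A laboratory is flagged as lacking in precision when its CI length
# strictly exceeds the threshold.
imprecise = sorted(lab for lab, length in ci_lengths.items()
                   if length > threshold)
print(threshold, imprecise)
```

Because the threshold is a centile of the observed lengths, roughly 10% of intervals are flagged for each sample by construction, whatever the absolute spread.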
Ideally, accuracy is assessed by comparison to a known reference value (i.e., the “true” cDNA concentration), but in the present context this value is unknown and must be estimated from the data. For this purpose, a trimmed mean was adopted as a robust estimate of the true value. For each sample, the trimmed mean was calculated from those laboratories for which the concentrations were within the 12.5th to 87.5th centiles of the cumulative distribution of all concentrations; consequently the outlying 25% of concentrations do not contribute to this value.
Sampling theory predicts that the true cDNA concentration of each sample lies between the 2 confidence limits with a probability of 95%. Therefore, a laboratory with a 95% CI that fails to include the trimmed mean is regarded as lacking in accuracy. The trimmed means are indicated as horizontal continuous lines in Figs. 1 and 2.
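The trimmed mean and the accuracy rule can be sketched as follows. The concentrations and intervals are hypothetical, and the centile interpolation method ("inclusive") is an assumption, as is the choice to keep values that fall exactly on a cut point.

```python
from statistics import mean, quantiles


def trimmed_mean(concentrations):
    """Mean of the values lying between the 12.5th and 87.5th centiles."""
    cuts = quantiles(concentrations, n=8, method="inclusive")
    low, high = cuts[0], cuts[-1]  # 12.5th and 87.5th centiles
    kept = [c for c in concentrations if low <= c <= high]
    return mean(kept)


def is_accurate(ci, reference):
    """A laboratory is accurate when its 95% CI covers the reference."""
    lower, upper = ci
    return lower <= reference <= upper


# Hypothetical estimates (copies/5 uL) from 9 laboratories for one sample.
estimates = [40.0, 45.0, 48.0, 50.0, 50.0, 52.0, 55.0, 58.0, 300.0]
reference = trimmed_mean(estimates)

print(reference)
print(is_accurate((44.0, 60.0), reference))    # interval covers the reference
print(is_accurate((250.0, 350.0), reference))  # interval misses the reference
```

In this example the single gross outlier (300) is discarded before averaging, so it does not pull the reference value upward as an ordinary mean would.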
We used the Fisher exact test to assess possible associations between analytical platforms and observed performance and performed all statistical analyses with the SAS System (18).
The minimum, median, and maximum values of the distributions of the cDNA copy numbers in the 5 unknown samples, together with the expected concentrations and the number of laboratories taken into consideration for statistical analysis, are reported in Table 1. The 95% CIs computed for each laboratory on the 5 test samples are shown in Figs. 1 and 2.
cDNA samples T1, T2, and T3
For the unknown cDNA samples (Fig. 1), 74 of 93 laboratories met the performance criteria (i.e., provided both precise and accurate estimates for all 3 unknown samples tested).
For 9 laboratories, the 95% CI length exceeded, in at least 1 sample, the (arbitrary) threshold adopted to identify the imprecise determinations. Overall, inaccuracy increased from the higher (sample T3) to the lower (sample T1) copy number; in fact, 9, 1, and 0 CIs did not cover the trimmed mean for samples T1, T2, and T3, respectively. For all 3 cDNA samples, the trimmed means closely agreed with the expected concentrations (see Table 1); therefore, the trimmed means can be viewed as unbiased estimates of the true concentrations.
cell suspension samples C1 and C2
Samples C1 and C2 were included in the scheme to assess the ability to perform all of the steps necessary to estimate RNA copy number. Because this is not a routine procedure for many laboratories carrying out real-time PCR, analysis of these samples was optional. Specifically, 75 laboratories provided Ct values for sample C1 and 74 for sample C2. In 27 of 75 and 25 of 74 laboratories for samples C1 and C2, respectively, both 95% CI limits fell outside the range of standard dilutions. The 95% CIs for the remaining 48 (sample C1) and 49 (sample C2) laboratories are reported in Fig. 2. The 95% CI length for 4 laboratories exceeded the threshold adopted to identify the imprecise determinations for both samples. The CIs of 20 participants did not cover the trimmed mean for at least 1 biological sample, and for 10 participants, the CIs did not cover the trimmed mean for both samples.
More than 10 different testing platforms were used to carry out this scheme (Table 2), and laboratories that reported at least 1 inaccurate or imprecise result for T1, T2, or T3 were distributed evenly among them. To achieve sufficiently large groups for analysis, we grouped users according to whether they used the ABI platforms (n = 58), LightCycler (Roche; n = 15), or other platforms (n = 20). As reported in Table 2, no associations were observed between performance and platform categories (Fisher exact test, P = 0.71).
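For readers who wish to repeat this kind of comparison, the sketch below implements an exact test for an r × 2 table (the Fisher-Freeman-Halton extension of the Fisher exact test) by full enumeration. The original analysis was run in SAS; this is an independent illustration, not the authors' code. The per-platform counts are an assumption: they are reconstructed by rounding from the group sizes above and the percentages quoted in the discussion, not read from Table 2, so the output is not guaranteed to reproduce the reported P value.

```python
from math import comb


def fisher_exact_rx2(rows):
    """Exact P value for an r x 2 table of (met criteria, did not) counts.

    Sums the probabilities of all tables with the observed margins that
    are no more probable than the observed table.
    """
    row_totals = [a + b for a, b in rows]
    col1 = sum(a for a, _ in rows)
    denom = comb(sum(row_totals), col1)

    # Each table's probability is weight / denom, where the weight is the
    # product of binomial coefficients; integers avoid rounding problems.
    observed = 1
    for total, (a, _) in zip(row_totals, rows):
        observed *= comb(total, a)

    def weights(i, remaining):
        if i == len(row_totals) - 1:
            if 0 <= remaining <= row_totals[i]:
                yield comb(row_totals[i], remaining)
            return
        for a in range(min(row_totals[i], remaining) + 1):
            w = comb(row_totals[i], a)
            for rest in weights(i + 1, remaining - a):
                yield w * rest

    extreme = sum(w for w in weights(0, col1) if w <= observed)
    return extreme / denom


# Reconstructed (assumed) counts: laboratories that did / did not give
# precise and accurate results for T1, T2, and T3, by platform group.
platform_counts = [
    (46, 12),  # ABI platforms, n = 58
    (13, 2),   # LightCycler, n = 15
    (15, 5),   # other platforms, n = 20
]
p_value = fisher_exact_rx2(platform_counts)
print(p_value)
```

Full enumeration is practical here because the table is small; for larger tables a statistics package would be the more sensible choice.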
Unlike disease-specific EQA schemes, EQUAL-quant does not set out to assess the performance of in-house assays. By reviewing performance in a manner independent of local specialties, we are not limiting participation. In addition, by using fully validated reagent sets manufactured to commercial standards and providing detailed written procedures, we have minimized issues of reagent performance and concentrated instead on platform performance and technical implementation of this analytical procedure, essential measures of laboratory performance that currently have little external quality control in the context of this technology.
We have designed and manufactured a reagent set suitable for the external assessment of real-time PCR with TaqMan probes. This EQA scheme has been implemented successfully by a wide variety of laboratories across the EU and beyond, as well as across numerous clinical disciplines within both diagnostic and research settings. This EQA of real-time PCR has been well received by participants, many of whom have had no prior experience with such a scheme. We focused on real-time PCR by evaluating the performance in testing gene expression (samples T1, T2, and T3) or including the pre-PCR steps (samples C1 and C2). With samples C1 and C2, we have been unable to determine which step (RNA extraction, reverse transcription, or real-time PCR) caused the wide variation in the laboratory results.
Continual demonstration of satisfactory performance is a requirement of laboratory accreditation, and we have identified a lack of precision and/or accuracy in the determination of cDNA copy number in a small but significant number of laboratories. Of the 103 laboratories that provided data, 74 gave both precise and accurate results for all cDNA samples, 19 were lacking in precision and/or in accuracy for at least 1 sample, and 10 were not included in the statistical analysis for one of the following reasons: (a) presence of one or more values for the negative samples, indicating contamination of the assay; (b) presence of at least one standard with all Ct values missing; or (c) all Ct values missing for all samples.
In a previous national EQA scheme in real-time PCR, significantly higher variability than described here occurred among participating laboratories (14). Using statistical procedures equivalent to those described above, Casini Raggi et al. (14) found that only 12 of 40 (30%) participating laboratories that provided a full set of results were both precise and accurate for all samples tested, compared with 74 of 93 (80%) in our current analysis. This difference most likely reflects the different approaches to EQA design rather than genuine differences in laboratory performance. For example, in the previous study (14), test cDNA samples were retrieved from pools of total RNA, whereas in the current scheme cloned cDNA was provided to control the starting copy number. In addition, in the current scheme the full range of standard controls was provided ready-made, whereas in the previous exercise (14), a single standard cDNA solution was provided, from which participants were asked to make serial dilutions to construct a calibration curve. This action is likely to have introduced an additional source of variability. We provided ready-made standard dilutions and observed lower variability and better assessment of analytical steps critical to the assay execution.
For samples C1 and C2, it is less valid to define performance on the basis of results lacking in accuracy or precision in the same way as for samples T1, T2, and T3. The reasons for this are as follows: (a) existing participant in-house procedures may not be optimized for cells in the medium provided, and (b) because RNA extraction and cDNA synthesis are highly variable in efficiency, deviation above the mean might reflect more rather than less efficient procedures. Good laboratory practice will take this variability into account; however, it is important that participants be given the opportunity to compare their own procedures against others. For the sake of completeness, we describe the results for 48 and 49 laboratories for sample C1 and C2, respectively. Undoubtedly, copy number estimates consistently below the mean should warrant further consideration. In future EQA schemes, we recommend that a more extensive calibration curve prepared with standard dilutions be considered to minimize invalid results (extrapolation).
Participants must consider the results from samples C1 and C2 and attempt to explain any deviation from the mean. Because C1 and C2 represent independent samplings from a common culture, they cannot be regarded as identical samples. Consequently, a statistical comparison is not appropriate. However, these samples would have been sufficiently similar that the results will give laboratories a picture of the reliability of their pre-PCR procedures. In the light of the results from samples C1 and C2, we suggest that in future EQA exercises participants should be asked to compensate for differences between samples attributable to material losses, PCR inhibition, and differences in reverse transcriptase yields by standardizing target copy number estimates by parallel amplification with 1 or more control genes (19)(20).
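The standardization suggested here can be sketched in a few lines. The text cites references (19)(20) without describing a specific procedure, so the use of the geometric mean across several control genes is an assumption (it is one common choice), and all copy numbers are hypothetical.

```python
from statistics import geometric_mean


def normalized_copies(target_copies, control_copies):
    """Target copy number relative to the geometric mean of the controls."""
    return target_copies / geometric_mean(control_copies)


# Hypothetical estimates for one sample: the target transcript and two
# control genes amplified in parallel from the same cDNA preparation.
ratio = normalized_copies(1200.0, [40000.0, 10000.0])
print(ratio)
```

Because losses during extraction, PCR inhibition, and reverse transcription yield affect target and control transcripts in the same preparation alike, the ratio is less sensitive to those pre-PCR differences than the raw copy number.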
With the data presented here, we can begin to investigate common sources of error in real-time PCR procedures. In regard to accuracy and precision, most problems were encountered with the lowest target cDNA numbers. Inaccurate results for T1 (50 copies/5 μL) were reported by 9 laboratories, and for 7 of these laboratories this was the only inaccurate result. In addition, 8 laboratories obtained imprecise results for 2 or more samples. Pipetting is the most likely source of error, emphasizing the need for well-calibrated and robust pipettes for real-time PCR.
There is now a wide range of platforms available for real-time PCR, a situation reflected by the participation in the EQUAL-quant scheme. More than 10 different platforms were used to carry out this scheme, and laboratories reporting both accurate and precise results were distributed evenly among them. Specifically, 79% of ABI platform users, 87% of LightCycler users, and 75% of the others achieved accurate and precise measurements for all 3 cDNA samples, T1, T2, and T3.
In this report attention has focused on copy number estimation. More detailed analysis of possible sources of errors related to calibration curves will be presented in a future report.
EQA is now regarded as an indispensable aspect of laboratory activity, and we must consider the financial implications of providing a scheme such as this. We estimate the cost of participation in the EQUAL-quant scheme, taking into account scheme design, reagent set manufacture, and analysis of the results, to be approximately €526 per participant. It is hoped that with experience the unit costs of future schemes could be reduced.
In conclusion, real-time PCR has opened new horizons in molecular clinical diagnostics. However, minor variations in sample handling and test execution can lead to large changes in the overall amount of amplified product, which may compromise the appropriate clinical interpretation of results. The availability of reference materials is an unrealistic goal for the vast majority of applications, because different genes have unpredictable expression and the cost of target-specific materials would be prohibitive. Methodologic EQA offers independent assessment of internal quality-control procedures essential to the reliable execution of this technically demanding technology.
We gratefully acknowledge the contribution of all of the partners and participant laboratories (see Table 1 in the online Data Supplement) in the EQUAL project. This program has been supported by the EU Sixth Framework Program (Contract 504842).
1 Laboratory L007 did not provide Ct values for sample T1.
2 Laboratory L010 did not provide Ct values for sample T3.
3 Number of laboratories that provided Ct values for sample C1 within the standard dilutions range.
4 Number of laboratories that provided Ct values for sample C2 within the standard dilutions range.
1 Data are shown only for those laboratories included in the final analysis of samples T1, T2, and T3.
2 Fisher exact test, P = 0.71.
3 Both precise and accurate estimates for all 3 cDNA samples tested.
1 Nonstandard abbreviations: EQA, external quality assessment; EU, European Union; Ct, cycle threshold; and 95% CI, 95% confidence interval.
© 2006 The American Association for Clinical Chemistry