BACKGROUND: Pediatric healthcare is critically dependent on the availability of accurate and precise laboratory biomarkers of pediatric disease, and on the availability of reference intervals to allow appropriate clinical interpretation. The development and growth of children profoundly influence normal circulating concentrations of biochemical markers and thus the respective reference intervals. There are currently substantial gaps in our knowledge of the influences of age, sex, and ethnicity on reference intervals. We report a comprehensive covariate-stratified reference interval database established from a healthy, nonhospitalized, and multiethnic pediatric population.
METHODS: Healthy children and adolescents (n = 2188, newborn to 18 years of age) were recruited from a multiethnic population with informed parental consent and were assessed from completed questionnaires and according to defined exclusion criteria. Whole-blood samples were collected for establishing age- and sex-stratified reference intervals for 40 serum biochemical markers (serum chemistry, enzymes, lipids, proteins) on the Abbott ARCHITECT c8000 analyzer.
RESULTS: Reference intervals were generated according to CLSI C28-A3 statistical guidelines. Caucasian, East Asian, and South Asian participants were evaluated with respect to the influence of ethnicity, and statistically significant differences were observed for 7 specific biomarkers.
CONCLUSIONS: The establishment of a new comprehensive database of pediatric reference intervals is part of the Canadian Laboratory Initiative in Pediatric Reference Intervals (CALIPER). It should assist laboratorians and pediatricians in interpreting test results more accurately and thereby lead to improved diagnosis of childhood diseases and reduced patient risk. The database will also be of global benefit once reference intervals are validated in transference studies with other analytical platforms and local populations, as recommended by the CLSI.
Proper medical assessment and care of children depend on the availability of both accurate laboratory tests and reliable reference intervals to guide test interpretation. Current guidelines define a reference interval as the interval between 2 limiting values within which 95% of the results for apparently healthy individuals would fall, usually between the 0.025 and 0.975 fractiles of the distribution of test results for the reference (healthy) population (1). Although the concept of reference intervals and their application appear straightforward, the process of establishing accurate and reliable pediatric reference intervals is complex. Recent CLSI guidelines (1), which are focused mostly on generating adult reference intervals, acknowledge the challenges in establishing age-specific and sex-specific pediatric reference intervals. Many of the challenges encountered when establishing pediatric reference intervals are related to child development and growth, which can profoundly influence the concentrations of many analytes routinely measured in the clinical diagnostic laboratory. Differences in physical size, organ maturity, body fluid compartments, immune and hormone responsiveness, nutrition, and metabolism are likely to affect normal analyte concentrations in children and youth (2, 3).
Several studies have highlighted the clinical impacts of using inappropriate reference intervals in clinical medicine. One study found that the use of inadequate serum ferritin reference intervals led to a substantial (>15%) underestimation of iron deficiency in low-income children (4). The lack of age-adjusted cutoffs for thyroid-stimulating hormone during neonatal screening for congenital hypothyroidism led to an increase in the frequency of false positives and to excessive follow-up rates (5, 6). Mir et al. (7) analyzed N-terminal B-type natriuretic peptide and observed age-specific sex differences, with children having concentrations up to 260% higher than those of adults. Infants of North African origin have higher immunoreactive trypsinogen values compared with newborns of European ethnic origin (8). One consequence of using reference intervals that do not reflect ethnic differences was the observation of significantly higher rates of false-positive cystic fibrosis screening results for the former group. These results and those of similar studies (9, 10) clearly demonstrate that inadequate pediatric reference intervals that fail to account for differences between age groups, sexes, or ethnic groups can lead to misdiagnosis and misclassification of disease.
Despite this recognized need, pediatric-specific reference intervals remain inadequate or unavailable for many analytes. Many of the reference intervals in current use have been derived from the analysis of a small number of healthy or hospitalized individuals or are focused on a limited age interval with restricted partitions (11–15). Because of the challenges with recruiting study participants, only a small number of analytes have been studied (16–18). Larger national initiatives have begun to work toward establishing new pediatric reference intervals, but the results remain predominantly unpublished (19).
The CALIPER (Canadian Laboratory Initiative in Pediatric Reference Intervals)4 Project is a collaborative study among pediatric centers across Canada that is addressing critical gaps in pediatric reference intervals by determining the influence of key covariates, such as age, sex, and ethnicity, on pediatric reference intervals. This report presents age- and sex-specific reference intervals for 40 biochemical markers (serum chemistry, enzyme, lipid, and protein analytes). This new database clearly demonstrates that child age and sex profoundly influence circulating concentrations of these biomarkers, with considerable variation occurring from analyte to analyte.
Materials and Methods
PARTICIPANT RECRUITMENT AND SAMPLE ACQUISITION
This study was approved by the Institutional Review Board at the Hospital for Sick Children, Toronto, Canada. Healthy children from birth to 18 years of age were recruited to participate in the CALIPER study. Because the goal was to obtain samples from healthy infants and children, the recruitment of study participants took place in the wider community (schools, churches, community centers) in the multiethnic population of the greater Toronto area. Participation in this study required completion of a short questionnaire, written informed consent, and donation of a blood sample. Participants were excluded from this study if they had a history of chronic illness or metabolic disease, an acute illness within the previous month, or use of prescribed medication over the previous 2 weeks. The collected demographic data included diet, exercise status, ethnicity, and body mass index parameters. Samples were collected in serum separator tubes (SST™; BD). All collected blood samples were centrifuged, separated, and aliquoted within 4 h of collection; all serum aliquots were kept frozen at −80 °C until testing. Participant data were screened before entry into the database to ensure that only data from healthy individuals were used in the analysis. All samples analyzed were matched by age, sex, and ethnicity so as to generate equivalent groups for comparison and to produce an ethnically diverse group. The ethnic composition of the study participants was based on the 2006 Canadian census data for the province of Ontario (20). Ethnicity was based on the ethnic background of both parents. The major ethnic groups represented in the study population included Caucasians (Canadians of European ancestry born in Canada or both parents originating from Western European countries), East Asians (Chinese and other East Asian countries), and South Asians (India or Bangladesh).
Of note is that although all samples from participants older than 1 year were collected from healthy children in the community, additional samples from apparently healthy/metabolically stable children were collected from participants younger than 1 year to ensure a sufficiently large sample size. The samples for the group <14 days old were obtained from neonates in the maternity ward of Women's College Hospital in Toronto who had been deemed healthy and were being sent home (i.e., 100% healthy neonates going home from the maternity ward). For samples from individuals older than 14 days and younger than 1 year, we used leftover samples from select outpatient clinics, which included dentistry, fracture, and plastic surgery (93% outpatients; 7% from community children). Samples from groups of participants older than 1 year included no outpatient samples (i.e., all samples from children between 1 and 18 years of age were from healthy community children).
Serum samples from participants with ages from newborn to 18 years were analyzed on the Abbott ARCHITECT c8000 system for 40 biochemical markers (see Table 1 in the Data Supplement that accompanies the online version of this article at http://www.clinchem.org/content/vol58/issue5). The samples were analyzed in batches over a 6-month period. Analytical methods were controlled according to the manufacturer's instructions by preventive maintenance, function checks, calibration, and quality control. All samples tested underwent automated interference analysis for hemolysis, icterus, and turbidity. Table 1 in the online Data Supplement summarizes the analytical parameters of the ARCHITECT assays and calibration/traceability information. The analytical performance of the assays was rigorously controlled, and samples for reference intervals were analyzed only when all analytical parameters were acceptable.
STATISTICAL ANALYSIS AND DETERMINATION OF REFERENCE INTERVALS
Data were analyzed in accordance with CLSI C28-A3 guidelines, as outlined in Fig. 1 (1). Statistical analysis was performed with Excel (Microsoft) and SPSS (IBM) software. In brief, scatter and distribution plots were used to visually inspect the data; outliers were then identified with the Tukey test and removed (21). Age and sex partitions were determined by visually inspecting distribution and scatter plots for overall trends; partition decisions were based on trends observed within the distribution plots and then statistically evaluated with Harris and Boyd's test, which uses the SD and a modified z statistic for 2 groups to determine if each group is sufficiently different statistically to warrant its own grouping (22). When the results of Harris and Boyd's test did not indicate partitioning, data were combined and then reevaluated. The nonparametric rank method was used to calculate the reference interval for partitions with a sample size ≥120 participants. For partitions with a sample size <120, effort was made to analyze additional samples to ensure that each partition had a minimum sample size of 120. All partitions included a sample size ≥120, with the exception of a few analytes in which extensive partitioning was required. For analytes with partitions containing <120 participants, the robust method of Horn and Pesce (23) was used to calculate the reference interval. For each reference interval, 90% confidence intervals were calculated for the end points.
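The workflow described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation (the study used Excel and SPSS). The function names are hypothetical, and the specific formulas are assumptions taken from commonly cited forms rather than from this report: Tukey fences at 1.5 times the interquartile range, rank positions 0.025(n + 1) and 0.975(n + 1) for the nonparametric limits, and a Harris and Boyd critical value of 3 × sqrt((n1 + n2)/240) together with an SD ratio threshold of 1.5.

```python
import math
import statistics


def tukey_filter(values, k=1.5):
    """Drop points outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def rank_quantile(sorted_values, fraction):
    """Value at rank fraction*(n+1), interpolating between neighboring ranks."""
    n = len(sorted_values)
    rank = min(max(fraction * (n + 1), 1), n)  # clamp to valid ranks 1..n
    lower = math.floor(rank)
    upper = min(lower + 1, n)
    weight = rank - lower
    return sorted_values[lower - 1] * (1 - weight) + sorted_values[upper - 1] * weight


def nonparametric_interval(values, coverage=0.95):
    """Central reference interval by the rank method (intended for n >= 120)."""
    tail = (1 - coverage) / 2
    ordered = sorted(values)
    return rank_quantile(ordered, tail), rank_quantile(ordered, 1 - tail)


def harris_boyd_partition(group_a, group_b):
    """Return True if two subgroups differ enough to suggest partitioning.

    Assumed form: z = |mean_a - mean_b| / sqrt(sd_a^2/n_a + sd_b^2/n_b),
    compared with z* = 3 * sqrt((n_a + n_b) / 240); an SD ratio above 1.5
    is also treated as grounds for partitioning.
    """
    n_a, n_b = len(group_a), len(group_b)
    sd_a, sd_b = statistics.stdev(group_a), statistics.stdev(group_b)
    z = abs(statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(
        sd_a**2 / n_a + sd_b**2 / n_b
    )
    z_critical = 3 * math.sqrt((n_a + n_b) / 240)
    sd_ratio = max(sd_a, sd_b) / min(sd_a, sd_b)
    return z > z_critical or sd_ratio > 1.5
```

In such a sketch, each candidate partition would be passed through `tukey_filter` first, neighboring partitions would be compared with `harris_boyd_partition`, and `nonparametric_interval` would be applied only to partitions with at least 120 results. The robust method used for smaller partitions and the 90% confidence intervals for the limits are not shown.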
Differences among the 3 ethnic groups (Caucasians, East Asians, and South Asians) were analyzed by ANOVA for the data of analytes that met the distributional assumptions required for ANOVA. When the overall ANOVA was statistically significant, post hoc pairwise comparisons were conducted after correcting for multiple comparisons with the Bonferroni adjustment (24).
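As a rough illustration of the post hoc step, the Bonferroni adjustment multiplies each pairwise P value by the number of comparisons (3 for 3 ethnic groups) and caps the result at 1. The sketch below is not the authors' code, the helper name is hypothetical, and the raw P values are invented purely for demonstration.

```python
from itertools import combinations


def bonferroni_adjust(p_values):
    """Multiply each raw P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


groups = ["Caucasian", "East Asian", "South Asian"]
pairs = list(combinations(groups, 2))  # 3 pairwise comparisons

# Hypothetical raw P values, for illustration only (not study results).
raw_p = dict(zip(pairs, [0.004, 0.030, 0.200]))

adjusted = bonferroni_adjust(raw_p)
significant = [pair for pair, p in adjusted.items() if p < 0.05]
```

With these made-up inputs, only the first pair remains below 0.05 after adjustment, which shows how a comparison that looks significant on its own (P = 0.030) can lose significance once multiple testing is accounted for.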
Results
Samples from 1072 male and 1116 female participants (newborn to 18 years) were used to calculate age- and sex-specific reference intervals. Age- and sex-specific pediatric reference intervals for 40 biochemical markers (serum chemistry, enzyme, lipid, and protein analytes) are provided in Table 1. The ethnic composition of the male and female participants included in this study is presented in Table 2. Complete reference interval data are also presented in scatter plot format for all 40 assays (see the scatter plots in the online Data Supplement). Supplemental tables expressing the same reference intervals in SI units are also available in the online Data Supplement. As Table 1 shows, all analytes required some amount of partitioning, by age, sex, or both. Interestingly, all analytes required partitioning within the first year of life, and several required additional stratification within the first year. Calcium, which is known to be tightly regulated, required a reference interval for infants <1 year of age and a separate reference interval for ages from 1 year to <19 years (Table 1). Amylase required 3 different age partitions within the first year of life: birth to 14 days, 15 days to 12 weeks, and 13 weeks to <1 year (Fig. 2). This pattern was seen for most of the 40 studied analytes, with 32 analytes requiring partitioning between birth and 14 days of age. Of these analytes, many displayed higher analyte concentrations in the neonatal period that eventually declined after 15 days, whereas other analytes displayed lower initial concentrations that eventually increased after 15 days (Fig. 3).
Considerable age partitioning was required for alkaline phosphatase (ALP), aspartate aminotransferase (AST), creatinine (by both enzymatic and Jaffe methods), HDL cholesterol, IgA, IgG, IgM, lactate dehydrogenase (LDH), phosphate, prealbumin, total CO2, and total protein. Each of these analytes required a minimum of 5 age-specific reference intervals. Within this group of analytes, ALP, creatinine (enzymatic and Jaffe methods), IgG, phosphate, prealbumin, and total CO2 demonstrated a complex pattern of change in analyte concentration over time, whereas other analytes showed steady increases (e.g., uric acid) or decreases (e.g., phosphate) in analyte concentration over time (Fig. 3). The marked changes and fluctuations in children during their growth and development highlight the importance of determining age-specific pediatric reference intervals.
Differences in analyte concentrations over time were explored among 3 ethnic groups: Caucasians, East Asians, and South Asians (major ethnic groups in Canada). The following analytes demonstrated ethnic differences: alanine aminotransferase, amylase, IgG, IgM, magnesium, total protein, and transferrin (see Figs. 4–10 and Table 4 in the online Data Supplement). Caucasians showed significantly lower concentrations of amylase, IgG, and IgM compared with East or South Asians (see Figs. 4–6 in the online Data Supplement). East Asians had significantly lower concentrations of alanine aminotransferase and total protein (see Figs. 7 and 8 in the online Data Supplement). Finally, among the 3 ethnic groups examined, South Asians had the lowest serum magnesium concentrations (see Fig. 9 in the online Data Supplement) and the highest transferrin concentrations (see Fig. 10 in the online Data Supplement).
Discussion
The comprehensive database of pediatric reference intervals reported for the current study fills the long-standing gaps for 40 key biochemical tests used in medical assessment and diagnosis/monitoring of childhood diseases. Although the influences of both age and sex on biochemical markers were clearly apparent for all biomarkers tested, age-related changes in analyte concentrations were observed more commonly than sex-associated differences.
Of the analytes studied, only lipase required no age or sex partitioning; thus, only 1 combined reference interval is reported for this analyte. This finding differs from data reported by Ghoshal and Soldin, who described increases in the upper reference limit for lipase with age (25). The lack of significant changes in lipase across the pediatric age groups in our study may reflect the considerable overall variation in lipase we observed. In addition, our study population was entirely based on healthy community children and differs considerably from that used in the former study.
Creatinine, along with ALP, total CO2, and IgG, showed complex age- and sex-related patterns, which are reflected in the high number of reference interval partitions required from birth to 18 years of age (Table 1). As Fig. 3A shows, the continuous change in analyte concentrations over time for creatinine makes determining age- and sex-specific reference intervals a challenge for this analyte. Analytes showing this pattern of continuous change over time and between sexes may be better served with a continuous reference interval that reflects this dynamic change in analyte concentration, rather than with static age- and sex-related reference intervals (26). Although creatinine reference intervals are not used to assess the glomerular filtration rate directly, use of a continuous reference interval may provide more reliable and accurate assessment when using the Schwartz equation to estimate glomerular filtration rate from serum creatinine concentrations (27). Although the use of continuous reference intervals has some advantages, the disadvantage of this approach is that most laboratory information systems cannot accommodate continuous reference intervals, and clinicians may find them difficult to interpret. Finally, differences between the Jaffe and enzymatic assays for creatinine were observed for reference intervals (Table 1; see Table 3 in the online Data Supplement). Creatinine assays based on the Jaffe method are well recognized to produce falsely increased results, owing to reaction with a variety of interferents with this assay. As expected, the enzymatic creatinine assay yielded lower results, thus supporting its use as the method of choice.
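For orientation, the general form of the Schwartz equation mentioned above is shown below. The symbols are assumptions introduced here rather than notation from this report. The empirical constant k depends on the creatinine method and on the version of the equation; the value 0.413 commonly quoted for the updated bedside form with enzymatic creatinine is offered only as an example, and it is not confirmed here that reference (27) uses that version.

```latex
\[
  \mathrm{eGFR}\ \left(\mathrm{mL/min/1.73\ m^2}\right)
  = \frac{k \times \text{height (cm)}}{\text{serum creatinine (mg/dL)}}
\]
```

Because serum creatinine sits in the denominator, a positive bias in the creatinine assay, such as that described for the Jaffe method, lowers the estimated filtration rate.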
Albumin (bromcresol green and bromcresol purple methods), ALP, apolipoprotein AI, AST, AST (activated, with pyridoxal phosphate), total bilirubin, total CO2, creatinine (enzymatic and Jaffe methods), IgM, iron, lipase, transferrin, HDL cholesterol, and uric acid all required additional sex-stratified reference intervals. Interestingly, the influence of sexual development and growth during puberty is reflected in the fact that these analytes all required additional sex-related partitioning, primarily in the age interval of 14 to 18 years (Table 1). Tanner stage information, which would have allowed Tanner-specific partitioning, was not available for all participants, however.
In the current study, the slightly higher albumin values obtained with the bromcresol green method compared with the bromcresol purple method are reflected in the reference intervals determined. This finding is consistent with other studies that evaluated differences between these 2 methods (28). Previous studies have shown that the presence of δ bilirubin can lead to lower albumin values when the bromcresol purple method is used (29); however, unconjugated bilirubin and conjugated bilirubin do not interfere with either method. Thus, use of either method should be appropriate for pediatric populations.
Another interesting finding was the set of patterns seen in the neonatal period. Several biochemical markers (AST, direct bilirubin, total bilirubin, creatinine, C-reactive protein, γ-glutamyltransferase, IgG, LDH, magnesium, phosphate, rheumatoid factor, uric acid) were initially increased in the neonatal period but then declined quickly after 14 days. Other markers, such as amylase, antistreptolysin O, cholesterol, IgA, IgM, and transferrin, demonstrated the opposite pattern; that is, values were very low in the neonatal period and increased after 14 days. Several analytes showed reference intervals that were wider in the neonatal or infancy period than in older age groups, which may reflect variable organ development within a population or immature homeostatic mechanisms among neonates. Finally, although medical decision levels are more appropriate than reference intervals for some of the analytes (such as cholesterol and triglycerides) in the assessment and monitoring of clinical disorders, the availability of reference intervals in a healthy pediatric population is of epidemiologic interest.
The reference interval database reported for this study is based on a multiethnic population and is thus more representative of the ethnic diversity of the patient populations seen at urban healthcare centers across North America. Our data suggest that ethnicity is not a major covariate for many biochemical markers and that reference intervals from all these groups can be combined. For 7 of the 40 analytes examined, however, there appeared to be statistically significant differences among the 3 major ethnic groups, indicating the need for partitioning of the data by ethnic origin. Because the sample sizes for these groups were too small to determine reliable reference intervals by ethnicity, overall trends were explored in the current study. We are planning further studies to investigate ethnicity-specific differences in pediatric reference intervals for these 7 analytes with a large sample size for each ethnic group.
We calculated central 95% reference intervals for all analytes as recommended by the CLSI, although 99% reference intervals may also be appropriate. Regardless of the decision to use 95% or 99% reference intervals, the number of partitions for a given analyte should remain the same, because the partitions are determined before the reference intervals are calculated. An instance in which the number of partitions may differ between 2 different intervals is after inspection a posteriori. For example, there are 5 partitions for direct bilirubin. One may question the clinical relevance of the sex difference for the age partition of 13 to <19 years [females, 0.10–0.39 mg/dL (1.7–6.7 μmol/L); males, 0.11–0.42 mg/dL (1.9–7.1 μmol/L)]. The 99% reference intervals for the 2 sexes are 0.05–0.46 mg/dL (0.8–7.8 μmol/L) and 0.10–0.43 mg/dL (1.7–7.3 μmol/L), respectively. It is interesting that the upper limit for females exceeds that for males in the 99% reference interval, yet the opposite occurs for the 95% reference interval. In another example, for CO2, one may question the clinical difference between the sexes in the partition from 15 to <19 years (females, 17–26 mmol/L; males, 18–28 mmol/L). One may also notice that both 95% reference intervals are similar to the interval of the partition from 5 to <15 years (17–26 mmol/L). Calculating the 99% reference interval yields similar results: 5 to <15 years, 16–27 mmol/L; females from 15 to <19 years, 16–27 mmol/L; males from 15 to <19 years, 17–29 mmol/L. Although the range of the interval broadens, the male reference interval remains slightly higher than the female reference interval in the age group of 15 to <19 years.
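To make the role of partitions concrete, the following sketch shows one way a laboratory system might look up the applicable partition and flag a result. It encodes only the three total CO2 partitions (95% intervals) quoted in the paragraph above. The table layout and function name are hypothetical, and the 5 to <15 years partition is assumed to apply to both sexes.

```python
# Total CO2, 95% reference intervals quoted in the text (mmol/L).
# Each row: (min_age_years, max_age_years_exclusive, sex or None, lower, upper)
CO2_INTERVALS = [
    (5, 15, None, 17, 26),   # assumed to apply to both sexes
    (15, 19, "F", 17, 26),
    (15, 19, "M", 18, 28),
]


def classify(age_years, sex, value, table=CO2_INTERVALS):
    """Flag a result as low, within, or high for the matching partition."""
    for min_age, max_age, part_sex, lower, upper in table:
        if min_age <= age_years < max_age and part_sex in (None, sex):
            if value < lower:
                return "low"
            if value > upper:
                return "high"
            return "within"
    return "no interval"  # partitions not listed in this sketch
```

For example, a total CO2 of 27 mmol/L in a 16-year-old would be flagged "high" for a female but "within" for a male. This is the kind of borderline difference whose clinical relevance the paragraph above questions.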
Over the last decade, several studies have looked at determining reference intervals in the pediatric population. Some of these studies used samples from hospital clinic patients (30), and others determined reference intervals from small sample sizes (31, 32) or a homogeneous population (33). Although several pediatric reference interval studies have recruited healthy children, many of these studies have focused on only a few analytes (34) or were conducted many years ago on outdated instrumentation (31, 32). In contrast, the reference intervals presented in the present report are based on a large number of samples collected from healthy community children of various ethnic backgrounds.
Although previous studies have used different instrumentation, trends in analyte concentrations over time similar to those described in prior studies were observed in the CALIPER study for some of the analytes. For example, the study by Gomez et al. found patterns in creatinine, total bilirubin, and LDH concentrations that were similar to those of our study when they were plotted over time and that reflected the underlying physiological changes that occur throughout childhood (35). Similarly, the trends in our data for alanine aminotransferase and creatinine showed similarities to the data presented by Lai et al., who studied a population that included 5000 healthy participants from 3 years of age to adult (34). The CALIPER reference intervals obtained for creatinine (enzymatic) in the present study were also closely similar to the age-adjusted reference intervals reported by Ceriotti et al. (26), likely owing to the use of isotope-dilution mass spectroscopy–standardized methods in both studies. A study by Southcott et al. generated reference intervals from samples obtained from a large cohort of healthy community children (36); however, this study was limited to a narrow age interval, 8 to 12 years. The age partitions from the CALIPER study cut through this age interval for most of the analytes and rarely fell into this age grouping (Table 1); however, 2 analytes investigated in the CALIPER study, ALP and AST, did have age partitions that fell within the age interval of 8 to 12 years. Compared with the results of Southcott et al., the reference intervals for these 2 analytes were similar. For example, the CALIPER reference interval for AST (7 to <12 years of age) was 18–36 U/L, compared with the reference intervals of 18–37 U/L (males) and 17–39 U/L (females) determined by Southcott et al. (36). 
Similarly, the CALIPER reference interval for ALP (10 to <13 years of age) was 144–460 U/L, compared with 145–402 U/L (males) and 161–460 U/L (females) in the study of Southcott et al. (36).
Likewise, our results for amylase were similar to those of Clifford et al. for ages between 6 years and 17 years (33). Although Clifford et al. investigated only the age interval of 6 to 17 years, our study extends the data to earlier ages. Our data show that amylase enzyme activities are significantly lower between the ages of 0 and 3 years. This finding is important because amylase is increased in several pathologic conditions that affect infants, such as small bowel injury/pathology, pancreatitis, and cystic fibrosis. Prealbumin values in our study, however, showed a constant increase with age, a finding not consistent with that of the Clifford et al. study. These differences may reflect ethnic differences between the Utah population and the Canadian population, which is more ethnically diverse. This result highlights the importance of investigating an ethnically diverse population.
Reference intervals in the study by Ghoshal and Soldin were determined by transference, i.e., correlating data from one analyzer with data from analyzers for which reference intervals had already been established (25). Reference intervals generated in the study by Ghoshal and Soldin do not compare well with results from the current CALIPER study, and the CALIPER study had fewer age and sex partitions. The differences between these 2 studies may have arisen because Ghoshal and Soldin determined age and sex partitions from a compilation of several studies, each of which contributed a different interval of ages for each analyte.
The data presented for the current CALIPER study clearly show complex patterns in the concentrations of most biochemical markers during child growth and development, as well as sex differences for some of the biomarkers. These results led to more partitioning than we had expected. However, statistically significant differences between ages/sexes do not necessarily imply clinically important differences. Because we chose to follow a consistent statistical approach to determining age and sex partitions, some of our partitions may not be clinically important and may be combined when applying these reference intervals in clinical practice. The complete database used to calculate the reference intervals in Table 1 is available as a Supplemental Database file in the online Data Supplement to allow investigators to reanalyze the data with different approaches. Because the reference intervals established in the current study are method specific for the ARCHITECT c8000 instrument, they are directly applicable only for that platform. This database needs to be validated for other analytical platforms and in local populations by performing transference studies as recommended by CLSI C28. The CALIPER program is currently performing transference studies aimed at validating the reference interval database established with the Abbott ARCHITECT platform for other major platforms, including the Roche Modular, Siemens Vista, Beckman Coulter DxC, and Ortho Vitros 5600 analyzers. Completion of these transference studies will allow a broader application of the reference intervals developed through the CALIPER study and should benefit pediatric centers worldwide. 
The new database may also be of global benefit, and it will likely be used by hospital laboratories in other countries, although intervals should ideally be validated with local populations (considering ethnic, environmental, and lifestyle variation), with samples from healthy children in each population, and in transference studies as recommended by the CLSI.
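A local check of a published interval can be sketched as follows. The acceptance rule used here (test about 20 samples from healthy local individuals and accept the interval if no more than 2 results fall outside it) is the commonly cited small-sample verification approach associated with CLSI C28-A3. It is stated as an assumption, because this report does not spell out the procedure. The function name and the local results are hypothetical; the interval is the CALIPER AST interval for 7 to <12 years (18–36 U/L) quoted earlier.

```python
def verify_interval(local_results, lower, upper, max_outside=2):
    """Count local results outside [lower, upper] and apply the acceptance rule.

    Returns (accepted, number_outside). The default of at most 2 outliers is
    meant for a set of about 20 local samples (assumed rule, see lead-in).
    """
    outside = sum(1 for r in local_results if r < lower or r > upper)
    return outside <= max_outside, outside


# Hypothetical AST results (U/L) from 20 healthy local children aged 7 to <12 years.
local_ast = [22, 25, 19, 31, 28, 35, 24, 27, 30, 21,
             26, 29, 33, 20, 23, 37, 18, 32, 17, 34]

accepted, n_outside = verify_interval(local_ast, lower=18, upper=36)
```

In this made-up set, 2 of the 20 results (17 and 37 U/L) fall outside the interval, so the interval would be accepted under the assumed rule. A laboratory on a different analytical platform would still need a method comparison before applying it.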
We thank all the participants and their families who gave of their time and blood; this study would not have been possible without your willingness to participate. We also thank the CALIPER outreach coordinator, Jennifer Clark, and numerous CALIPER volunteers, whose dedication and countless hours of work made this study possible. The kind contributions of Dr. Lei Fu at Women's College Hospital and Dr. Paul Yip at University Health Network for their assistance with sample collection and sample analysis are also acknowledged.
4 Nonstandard abbreviations: CALIPER, Canadian Laboratory Initiative in Pediatric Reference Intervals; ALP, alkaline phosphatase; AST, aspartate aminotransferase; LDH, lactate dehydrogenase.
(see editorial on page 808)
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.
Authors' Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the Disclosures of Potential Conflict of Interest form. Potential conflicts of interest:
Employment or Leadership: D. Armbruster, Abbott Diagnostics.
Consultant or Advisory Role: None declared.
Stock Ownership: D. Armbruster, Abbott Diagnostics.
Honoraria: None declared.
Research Funding: Canadian Institutes of Health Research and Abbott Diagnostics.
Expert Testimony: None declared.
Role of Sponsor: The funding organizations played a direct role in the design of the study, the choice of enrolled patients, the review and interpretation of data, and the preparation and final approval of the manuscript.
- Received for publication October 23, 2011.
- Accepted for publication February 2, 2012.
- © 2012 The American Association for Clinical Chemistry