BACKGROUND: We analyzed serial data in patients with clinically stable monoclonal gammopathy to determine the total variation of serum M-spikes [measured with serum protein electrophoresis (SPEP)], urine M-spikes [measured with urine protein electrophoresis (UPEP)], and monoclonal serum free light chain (FLC) concentrations measured with immunoassay.
METHODS: Patients were selected on the basis of (a) no treatment during the study interval; (b) no change in diagnosis and <5 g/L change in serum M-spike over the course of observation; (c) performance of all 3 tests (SPEP, UPEP, FLC immunoassay) in at least 3 serial samples that were obtained 9 months to 5 years apart; and (d) serum M-spike ≥10 g/L, urine M-spike ≥200 mg/24 h, or clonal FLC ≥100 mg/L. The total CV was calculated for each method.
RESULTS: Among the cohort of 158 patients, 90 had measurable serum M-spikes, 25 had urine M-spikes, and 52 had measurable serum FLC abnormalities. The CVs were calculated for serial SPEP M-spikes (8.1%), UPEP M-spikes (35.8%), and serum FLC concentrations (28.4%). Combining these CVs and the interassay analytical CVs, we calculated the biological CV for the serum M-spike (7.8%), urine M-spike (35.5%), and serum FLC concentration (27.8%).
CONCLUSIONS: The variations in urine M-spike and serum FLC measurements during patient monitoring are similar and are larger than those for serum M-spikes. In addition, in this group of stable patients, a measurable serum FLC concentration was available twice as often as a measurable urine M-spike.
Plasma cell proliferative disorders are commonly associated with the synthesis and secretion of a monoclonal immunoglobulin. These abnormal proteins may be monitored by a variety of methods. Serum protein electrophoresis (SPEP) and/or urine protein electrophoresis (UPEP) M-spike quantifications can distinguish polyclonal and monoclonal immunoglobulins and are used in multiple myeloma (MM) to detect response to treatment or relapse. In addition, these measures are commonly assessed in patients with monoclonal gammopathy of undetermined significance (MGUS) and smoldering MM (SMM) as indicators of progression. Guidelines for MM disease monitoring recommend the use of the serum M-spike if it is ≥10 g/L (i.e., measurable) and urine M-spike if ≥200 mg/24 h (1). In serum, reductions in the M-spike of at least 25% and 50% are considered minimal and partial responses, respectively (1, 2). The urine M-spike, however, requires at least a 50% and 90% decrease for minimal and partial responses, respectively (1, 2). A complete response requires the absence of a monoclonal protein detected by immunofixation electrophoresis.
Serum immunoglobulin concentrations, often assessed as a quality check for changes in the SPEP M-spike, may be especially useful when the M-spike is >20–30 g/L (3) or if the electrophoretic migration of a monoclonal IgA, or rarely IgM, is obscured within the β fraction. In addition, in the absence of a measurable serum or urine M-spike, the International Myeloma Working Group (IMWG) has recommended that serum free light chain (FLC) concentration be used to monitor disease if the monoclonal (involved) FLC (iFLC) concentration is ≥100 mg/L in the presence of an abnormal FLC κ/λ ratio (rFLC). Analogous to the SPEP M-spike, a 50% decrease has been suggested as a partial response criterion (1, 4).
There is a large body of work regarding within-person variation and the meaning of differences between sequential laboratory results (5, 6). These studies have recognized the variation to be the sum of preanalytic and analytical variability, as well as intraindividual biological variability. Traditionally, these studies have focused on healthy individuals with results within the working range of an assay. Serum immunoglobulins can be quantified in patients with monoclonal immunoglobulins and variations can be compared with reference intervals for serum immunoglobulins, but there are no normal counterparts to serum and urine M-spikes. We have analyzed serial samples in clinically stable patients to assess the total variability (analytical plus biological) of these monitoring tests. Intrinsic to this approach is that the biological variability also may contain disease variability despite our restricting the patient cohort to clinically defined stable patients. We have undertaken these studies to evaluate disease-monitoring recommendations, with particular emphasis on the recommendations for serum FLC.
To identify patients with clinically stable disease, we queried our database for patients who met the following criteria: (a) no treatment during the study interval, (b) no change in clinical diagnosis, (c) <5 g/L change in serum M-spike over the course of the observation compared with the first sample in the series, (d) at least 3 serial samples obtained within 5 years, and (e) availability of all test results in the clinical history. For a patient's results to be analyzed for variation, at least 1 method had to fulfill the definition of a measurable response criterion (3): (1) SPEP M-spike ≥10 g/L, (2) UPEP M-spike ≥200 mg/24 h, or (3) iFLC ≥100 mg/L in the presence of an rFLC that was not within reference intervals. All data were retrospective and obtained from the medical record, and all queries to the database were done under a protocol approved by the Mayo Clinic Institutional Review Board.
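The measurable-disease thresholds above can be expressed as a simple predicate. The following is a minimal sketch only; the function name `measurable_methods` and its parameters are hypothetical, and the abnormal-rFLC status is passed in as a flag because the reference interval itself is not stated in the text.

```python
from typing import List, Optional


def measurable_methods(
    serum_mspike_g_l: Optional[float],
    urine_mspike_mg_24h: Optional[float],
    iflc_mg_l: Optional[float],
    rflc_abnormal: bool,
) -> List[str]:
    """Return the methods that meet the measurable thresholds quoted in the text.

    Hypothetical helper: None means the test result is not available.
    """
    methods = []
    # Serum M-spike is measurable at >= 10 g/L.
    if serum_mspike_g_l is not None and serum_mspike_g_l >= 10.0:
        methods.append("SPEP")
    # Urine M-spike is measurable at >= 200 mg/24 h.
    if urine_mspike_mg_24h is not None and urine_mspike_mg_24h >= 200.0:
        methods.append("UPEP")
    # Involved FLC is measurable at >= 100 mg/L, but only with an abnormal ratio.
    if iflc_mg_l is not None and iflc_mg_l >= 100.0 and rflc_abnormal:
        methods.append("FLC")
    return methods
```

Under this sketch, a patient's series would enter the variation analysis when the returned list is non-empty, which mirrors the requirement that at least 1 method be measurable.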
We measured immunoglobulins by immunonephelometry using a Siemens BNII and Siemens reagent sets. We measured M-spikes using Helena SPIFE SPE reagent sets (Helena Labs) in combination with serum total protein determined with Biuret reagent (Siemens) on an Advia 1200 (Siemens) and urine total protein determined with pyrogallol red (Wako) on a Cobas c501 analyzer (Roche Diagnostics). We measured κ and λ FLC using a Siemens BNII nephelometer and Freelite reagent sets from The Binding Site. The FLC data were analyzed as the iFLC concentration, the uninvolved FLC (uFLC) concentration, and the rFLC.
Visual inspection of the data revealed that a few individuals displayed a slight trend over time. The variation within an individual was therefore defined as the SD about the fitted line for each person. We computed the total CV for each test as the mean square error using ANOVA on the logarithm of the laboratory value. A single fit was done including all individuals, with separate slope and intercept parameters per person. Regression on the log(response) is appropriate for data with constant CV, and the resulting estimate of variance from the model is a direct estimate of the CV of the data on the original scale (7).
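As an illustration of the approach described above, the following minimal sketch fits a separate straight line to the log-transformed values of each patient and pools the residual sum of squares over all patients. It assumes NumPy is available; the function name and the input layout (a list of per-patient time and value arrays) are hypothetical, and this is not the authors' code. The pooled residual SD on the natural-log scale is taken as the approximate CV on the original scale.

```python
import numpy as np


def total_cv_from_series(series):
    """Pooled CV (as a fraction) from per-patient linear fits on log(values).

    series: iterable of (times, values) pairs, one pair per patient.
    Each patient gets an individual slope and intercept; residuals are pooled.
    """
    rss = 0.0  # pooled residual sum of squares on the log scale
    dof = 0    # pooled residual degrees of freedom
    for times, values in series:
        t = np.asarray(times, dtype=float)
        y = np.log(np.asarray(values, dtype=float))
        if t.size < 3:
            continue  # a 2-parameter fit leaves no residual degrees of freedom
        slope, intercept = np.polyfit(t, y, 1)
        resid = y - (slope * t + intercept)
        rss += float(np.sum(resid ** 2))
        dof += t.size - 2
    if dof == 0:
        raise ValueError("need at least one patient with 3 or more samples")
    # Residual SD of log(values) approximates the CV on the original scale.
    return float(np.sqrt(rss / dof))
```

Fitting each patient separately and pooling the residuals gives the same mean square error as a single model with per-person slope and intercept terms, which is why the sketch does not need a dedicated ANOVA routine.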
For the analytical CVs of the methods, we used the clinical laboratory interassay CVs for patient-derived control samples. All samples were collected in a single institution, and we have assumed minimal preanalytic variance.
We selected 158 patients for evaluation: there were a minimum of 3 observations per patient and a maximum of 16 observations (median 4) (Table 1). All patients had monoclonal proteins of the IgG isotype and had a diagnosis of MGUS, SMM, or MM. Although not all assays had results that were measurable according to international consensus criteria, all patients had a serum M-spike and serum FLC quantification. The mean (range) for serum M-spike concentration was 14.3 g/L (1–51 g/L). The rFLC ranged from 0.001 to 1860. Only 25 patients had a urine M-spike; 148 had quantification of serum IgG.
We calculated the CVs from the patient sample sets and graphed the distribution of the CVs in relation to the mean concentration of the sample set (Fig. 1). Current recommendations for “measurable” values are indicated by the dashed lines.
The total CVs for serial measurements of serum M-spike, IgG quantification, urine M-spike, iFLC, uFLC, and rFLC are tabulated by diagnosis in Table 2. There were 90 patients whose serum M-spike was ≥10 g/L; the mean CVs for serum M-spike and IgG quantification were 8.1% and 13.0%, respectively. The larger CV for the IgG quantification is presumably due to the inclusion of monoclonal and polyclonal IgG in the nephelometric quantification. Although SMM and MGUS are both premalignant disorders, SMM patients are monitored at more frequent intervals than those with MGUS—the SMM samples were collected within 15 months, whereas the MGUS samples were collected within 5 years.
There were 25 patients whose urine M-spike was ≥200 mg/24 h; only 1 of these had MGUS. The urine M-spike CV was 35.8% (Table 2). There were 52 patients whose serum iFLC was ≥100 mg/L; the serum iFLC CV was 28.4%. These 2 CVs were 3–4 times larger than the CV of the serum M-spike, which quantifies intact monoclonal immunoglobulin in these patients.
The CVs for all 158 iFLC and 157 uFLC measurements were 34.9% and 45.2%, respectively. The larger CV for the uFLC is presumably due to the lower concentrations of uFLC. The CV of the rFLC (47.5%) is a combination of the iFLC and uFLC and is even larger.
The percentage decreases needed in each assay to achieve 50%, 80%, 90%, or 95% probability of statistical significance between consecutive results in an individual are listed in Table 3. Each of these values indicates the decrease needed at each of the different probabilities to minimize the chances of falsely reporting a positive response in a stable patient.
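The text does not state the formula used to derive these thresholds. One common way to obtain such values is the reference change value, in which the difference between 2 consecutive results has an SD of √2 times the total CV. The sketch below applies that idea on the log scale; it is an assumption for illustration, not necessarily the calculation behind Table 3, and the function name and the two-sided default are hypothetical choices.

```python
import math
from statistics import NormalDist


def required_decrease(total_cv, probability, two_sided=True):
    """Fractional decrease between 2 consecutive results needed for significance.

    total_cv: total CV as a fraction (e.g. 0.081 for 8.1%).
    probability: desired probability level, strictly between 0 and 1.
    Assumption: log-normal results, so the SD of log(value) is about total_cv.
    """
    # Convert the probability level into a standard normal quantile.
    p = 0.5 + probability / 2.0 if two_sided else probability
    z = NormalDist().inv_cdf(p)
    # SD of the difference between 2 independent log-scale results.
    sd_diff = math.sqrt(2.0) * total_cv
    # Back-transform the log-scale drop into a fractional decrease.
    return 1.0 - math.exp(-z * sd_diff)
```

For example, `required_decrease(0.081, 0.95)` returns the decrease implied by this model for the serum M-spike CV at the 95% level. A larger CV or a higher probability level always yields a larger required decrease.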
The total variation inherent in these patient sample sets is a composite of the interassay analytical variation and the long-term biological variation. The analytical variations of the assays are known from the interassay CVs in our clinical laboratory assay validation documentation. The analytical CVs for the SPEP M-spike, UPEP M-spike, and iFLC are 2.1%, 4.5%, and 5.8%, respectively (Table 4). The term biological variation is used here as an aggregate that includes preanalytical variation, disease variation, and within-person biological variation. Using the total average CVs and the interassay analytical CVs, the calculated biological variations are 7.8% (serum M-spike), 35.5% (urine M-spike), and 27.8% (iFLC).
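The reported biological CVs are consistent with the usual assumption that independent variance components add, so that the biological CV is the total CV minus the analytical CV in quadrature. The following minimal sketch, with a hypothetical function name, reproduces the 3 values from the total and analytical CVs quoted in the text.

```python
import math


def biological_cv(total_cv, analytical_cv):
    """Biological CV, assuming total^2 = analytical^2 + biological^2."""
    if analytical_cv > total_cv:
        raise ValueError("analytical CV cannot exceed total CV")
    return math.sqrt(total_cv ** 2 - analytical_cv ** 2)


# Total and analytical CVs (in percent) as quoted in the text.
assays = [
    ("serum M-spike", 8.1, 2.1),
    ("urine M-spike", 35.8, 4.5),
    ("iFLC", 28.4, 5.8),
]
for name, total, analytical in assays:
    print(f"{name}: {biological_cv(total, analytical):.1f}%")
# serum M-spike: 7.8%
# urine M-spike: 35.5%
# iFLC: 27.8%
```

Because the analytical CVs are small relative to the total CVs, the subtraction changes each value by less than 1 percentage point, which is why the variation is described as almost entirely biological.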
The guidelines for monitoring monoclonal gammopathies propose that a 25% decline in serum M-spike defines a minimal response and a 50% decline is considered a partial response; in urine, the corresponding decreases needed are 50% and 90% (1, 2). It has recently been recommended that a partial response in serum FLC concentration be defined like that of the serum M-spike. These guidelines have been proposed to standardize treatment-response criteria. Our data confirm some of these guidelines but suggest that the FLC guidelines should be more stringent and similar to the recommendations for urine M-spike.
During monitoring of stable disease, the serum electrophoretic M-spike has a CV of 8.1%. This variation reflects the interassay analytical CV (2.1%) and the long-term biological CV (7.8%). Although the patient selection criteria for this study required that the M-spike vary <5 g/L, the total and biological CVs are surprisingly low, considering concerns about variations in hydration and hematocrit over these long time frames. The serum M-spike CV is smaller than the IgG quantification CV (8.1% vs 13.0%), presumably owing to the added variability of polyclonal immunoglobulins contained in the IgG quantification. Either measurement, however, should suffice for monitoring patients with large serum M-spikes and suppressed polyclonal immunoglobulins. The 12.3% biological variation of the monoclonal IgG quantification (Table 4) compares to 4.5% for the within-person biological variation of normal IgG (8). This increase in within-person biological variation for monoclonal IgG in our study is presumably due to the 2 additional factors of longer study intervals and the disease variation intrinsic to studying a cohort of diseased patients. Although IgG can be quantified in normal sera, serum and urine M-spikes as well as abnormal rFLC and increased iFLC are found only in the presence of disease, and thus, these quantities could be characterized only in diseased patients.
The urine M-spike has a CV of 35.8%. This variation is approximately 4 times larger than the serum M-spike CV and is due almost entirely to a large biological variation (35.5%) in the 24-h urine M-spike. It should again be emphasized that we used the serum M-spike as part of the criteria for defining stable disease in this patient cohort, and this selection criterion may have biased the relationship of the assay CVs. When we attempted to select patients using urine M-spike and serum FLC criteria, we obtained smaller numbers of patients. The relationships between the 3 sets of CVs, however, remained essentially the same.
The serum iFLC has a CV of 28.4%. The iFLC has been the FLC format recommended for monitoring oligosecretory multiple myeloma (4), whereas the difference between the iFLC and uFLC has been recommended for patients with primary amyloid (9) to normalize the effects of declining renal clearance. Reduced renal clearance results in increased serum FLC concentrations, as indicated by reference interval studies and studies in renal failure patients (10, 11). The rFLC normalizes most of the changes seen with reduced renal function and is recommended for diagnostic studies. The rFLC had originally been thought to normalize biological variation, but it is the most variable of the FLC data formats and is not recommended for disease monitoring.
In comparison to the other serum measurements of M-spike and IgG, the serum iFLC variation is larger than expected, and like that for urine, has almost entirely a biological basis. Because of its low molecular weight and short serum half-life, the FLC concentration responds very quickly to changes in plasma cell synthesis and renal clearance. The short half-life may be the reason for the large CV of FLC measurements compared with intact immunoglobulin measurements. Small variations in synthesis or clearance of intact immunoglobulin get diluted in the large pool of long-lived serum immunoglobulins, whereas these variations for light chains are not smoothed by dilution in a large, long-lived serum pool of FLC. Although the large CV of urine M-spike assays usually has been attributed to the preanalytic variability of 24-h urine collections, this same FLC physiology is probably responsible for most of the large CV of the urine M-spike. The increased variability of the urine M-spike compared with the serum iFLC concentration may reflect the preanalytic variability inherent in a 24-h urine collection.
The percent reductions between samples listed in Table 3 are those needed for the different probabilities listed in the table, and these different reductions reflect the tradeoff between the need for sensitivity of the test for early detection of response and a low false-positive rate. As the percent reduction gets larger and approaches the 95% probability threshold, the rate of false positives in stable patients falls, but the rate of false negatives in unstable patients may increase. The reductions in serum and urine M-spikes needed for the 95% probability threshold in Table 3 are close to the current guidelines requiring a 25% or 50% decrease for a minimal response in serum or urine M-spike, respectively.
The FLC results in Table 3 indicate that the guidelines for serum FLC need to be more stringent and more closely resemble the UPEP guidelines rather than the SPEP guidelines. Several studies have correlated clinical response to changes in serum FLC concentration. Studies in primary amyloid (12) and multiple myeloma (13) indicated that a 50% reduction in FLC concentration correlated with response, and responses up to 90% improved prognostic predictions (14–18). Although some of these studies indicated that a 50% reduction in FLC was significant for predicting response, the biological variation data indicate that a 50% change has only an 80%–90% probability of being a significant change. These data suggest that the criteria for minimal and partial response using FLC should be comparable to those of the UPEP M-spike. Finally, these data show that serum FLC, when used as a monitoring tool, is measurable in more patients than the urine M-spike.
Nonstandard abbreviations:
- SPEP, serum protein electrophoresis;
- UPEP, urine protein electrophoresis;
- MM, multiple myeloma;
- MGUS, monoclonal gammopathy of undetermined significance;
- SMM, smoldering MM;
- IMWG, International Myeloma Working Group;
- FLC, free light chain;
- iFLC, involved (monoclonal) FLC;
- rFLC, ratio of κ to λ FLC;
- uFLC, uninvolved FLC.
(see editorial on page 1635)
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.
Authors' Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the Disclosures of Potential Conflict of Interest form. Potential conflicts of interest:
Employment or Leadership: None declared.
Consultant or Advisory Role: None declared.
Stock Ownership: None declared.
Honoraria: J.A. Katzmann, Binding Site; R.A. Kyle, Binding Site.
Research Funding: None declared.
Expert Testimony: None declared.
Other Remuneration: A. Dispenzieri, Binding Site paid for travel to a meeting.
Role of Sponsor: The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, or preparation or approval of manuscript.
- Received for publication June 26, 2011.
- Accepted for publication August 29, 2011.
- © 2011 The American Association for Clinical Chemistry