## Abstract

Numerous outcome measures can be used to characterize and compare the performance of alternative quality-control (QC) strategies. The performance measure traditionally used in the QC planning process is the probability of rejecting an analytical run when a critical out-of-control error condition exists. Another performance measure that naturally fits within the total allowable error paradigm is the probability that a reported test result contains an analytical error that exceeds the total allowable error specification. In general, the out-of-control error conditions associated with the greatest chance of reporting an unacceptable test result are unrelated to the traditionally defined “critical” error conditions. If the probability of reporting an unacceptable test result is used as the primary performance measure, worst-case QC performance can be determined irrespective of the magnitude of any out-of-control error condition that may exist, thus eliminating the need for the concept of a “critical” out-of-control error.

The traditional approach to the quality-control (QC) planning process involves a series of at least five steps (1)(2).^{1} First, the quality requirement, usually defined in terms of total allowable error (TE_{a}), must be specified. Results that contain analytical errors exceeding TE_{a} are considered to be of unacceptable quality. Second, the accuracy and precision of the assay are evaluated. Third, critical size errors are calculated on the basis of the quality requirement and the assay accuracy and precision. Next, the performance of alternative QC rules is assessed in terms of their probability of rejecting an analytical run when an out-of-control error condition exists (*P*_{ed}) and their probability of falsely rejecting an analytical run that is in control (*P*_{fr}). Finally, control rules and the number of control samples per run (N) are selected to give a low false-rejection rate (*P*_{fr} ≤0.05) and a high error-detection rate (*P*_{ed} ≥0.90) for critical size errors.

Numerous outcome measures can be used to characterize and compare the performance of alternative QC strategies. One performance measure that naturally fits within the total allowable error paradigm is the probability of reporting a test result with an analytical error that exceeds TE_{a} when an out-of-control error condition exists. The primary performance measure used in the traditional QC planning process is the probability of rejecting an analytical run when an out-of-control error condition exists. This paper addresses the question of how the probability of rejecting an analytical run that contains a critical size out-of-control error condition relates to the probability of reporting an unacceptable test result (one with an analytical error that exceeds the total allowable error specification).

## Methods

To demonstrate the relation between performance measures, I evaluated the 1_{ks} rule and the X̄(*c*)/R_{4s} rule. The 1_{ks} rule rejects if any of the control observations in the current analytical run are more than *k* analytical SDs from target (2). The X̄(*c*)/R_{4s} rule rejects if the average difference from target of the control observations in the current analytical run exceeds *c* SEMs or if the range of the control observations exceeds four analytical SDs (3).

The TE_{a} specification is assumed to be 5.0 analytical SDs. Out-of-control error conditions that cause a systematic error (SE) or an increase in analytical imprecision (RE) are evaluated. Analytical imprecision is assumed to be within-run imprecision. Between-run imprecision is not considered (3). The critical systematic error (SE_{c}) is calculated as SE_{c} = TE_{a} − 1.65 and the critical random error (RE_{c}) is calculated as RE_{c} = TE_{a}/1.96 (4). These critical errors are associated with out-of-control error conditions that would have a 5% chance of producing a test result with an analytical error that exceeds TE_{a}.
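The critical-error formulas above can be sketched numerically. This is an illustrative Python/scipy translation (the paper's calculations were performed in Stata), with all quantities expressed in analytical SD units as in the text:

```python
from scipy.stats import norm

def critical_errors(tea):
    """Critical systematic (SE_c) and random (RE_c) errors, in
    analytical SDs, for a total allowable error specification tea."""
    # SE_c: a shift of TEa - 1.65 SDs leaves ~5% of results beyond TEa,
    # because the upper 5% point of the standard normal is ~1.65.
    se_c = tea - norm.ppf(0.95)
    # RE_c: inflating the SD by a factor of TEa/1.96 leaves ~5% of
    # results outside +/- TEa (the two-sided 5% point is ~1.96).
    re_c = tea / norm.ppf(0.975)
    return se_c, re_c

se_c, re_c = critical_errors(5.0)
```

With TE_{a} = 5.0, this reproduces the values used in the text: SE_{c} ≈ 3.35 and RE_{c} ≈ 2.55.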

The probability that a reported test result contains an analytical error that exceeds TE_{a} will depend on the magnitude of any out-of-control error condition that may exist and on the probability (*P*_{ed}) that the out-of-control error condition is detected by the QC rules. The probability that a test result contains an analytical error that exceeds TE_{a} when an out-of-control error condition exists, but before QC testing has occurred, will be denoted *P*_{e}. The probability that a test result has an analytical error that exceeds TE_{a} after QC testing has been performed will be denoted *P*_{qe}. For a given out-of-control error condition, *P*_{qe} is the product of the probability that a result contains an unacceptable error and the probability that the out-of-control error condition isn’t detected, or *P*_{qe} = *P*_{e} (1 − *P*_{ed}).
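The relation *P*_{qe} = *P*_{e}(1 − *P*_{ed}) can be made concrete with a short sketch (Python/scipy assumed here for illustration), using the systematic-error case where *P*_{e} is a normal tail probability:

```python
from scipy.stats import norm

def p_e_systematic(se, tea=5.0):
    """P_e: probability a result's analytical error exceeds +/- TEa
    when a systematic shift of se analytical SDs exists."""
    return norm.sf(tea - se) + norm.cdf(-tea - se)

def p_qe(p_e, p_ed):
    """P_qe: an unacceptable result is reported only if the error
    occurs AND the QC rule fails to reject the run."""
    return p_e * (1.0 - p_ed)
```

At the critical systematic error (SE_{c} = 3.35 when TE_{a} = 5.0), p_e_systematic returns approximately 0.05, consistent with the 5% definition of a critical error.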

All results were obtained by numerical analysis without the use of simulations. The normal distribution function is required to calculate *P*_{e}. Determining *P*_{ed} for the 1_{ks} rule and the X̄(*c*) rule also requires the normal distribution function. Calculating *P*_{ed} for the R_{4s} rule when N = 2 can be accomplished by using the χ^{2} distribution function. When N >2, *P*_{ed} for the R_{4s} rule was calculated by numerical integration (5). Calculations were performed by using the statistical software package Stata (Stata Corp., College Station, TX).
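The χ^{2} route for the R_{4s} rule at N = 2 can be sketched as follows (Python/scipy assumed for illustration; the paper used Stata). For two controls, the squared range scaled by the true SD times √2 follows a χ^{2} distribution with 1 degree of freedom:

```python
from math import sqrt
from scipy.stats import norm, chi2

def p_reject_r4s_n2(re=1.0):
    """P that the range of N = 2 controls exceeds 4 analytical SDs
    when the true SD is inflated by the factor re (re = 1: in control).
    For N = 2, (range / (re * sqrt(2)))**2 is chi-square with 1 df."""
    return chi2.sf((4.0 / (re * sqrt(2.0))) ** 2, df=1)
```

Equivalently, this equals the two-sided normal tail 2·Φ̄(4/(re·√2)); in control the rejection probability is roughly 0.005.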

## Results

Figure 1 displays the traditional QC performance measures for the 1_{2.5s} and X̄(2.32)/R_{4s} QC rules with N = 2. The false-rejection probability for the 1_{2.5s} rule is 0.025 when N = 2. The control limit for the X̄(*c*) rule was determined so that the X̄(*c*)/R_{4s} rule also had a false-rejection rate of 0.025. The appropriate control limit, rounded to the nearest hundredth, is 2.32 SEM, or 2.32/√2 = 1.64 analytical SDs. With TE_{a} = 5.0, SE_{c} is 3.35 (Fig. 1A) and RE_{c} is 2.55 (Fig. 1B). For a SE_{c}, *P*_{ed} = 0.961 for the 1_{2.5s} rule and *P*_{ed} = 0.992 for the X̄(2.32)/R_{4s} rule. For a RE_{c}, *P*_{ed} = 0.547 for the 1_{2.5s} rule and *P*_{ed} = 0.533 for the X̄(2.32)/R_{4s} rule.
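The *P*_{ed} values quoted for the 1_{2.5s} rule can be reproduced directly (an illustrative Python/scipy sketch; the rule rejects if either of the N = 2 controls falls outside ±2.5 SDs):

```python
from scipy.stats import norm

def p_ed_1ks(k, se, n=2):
    """P_ed of the 1_ks rule under a systematic shift of se SDs: the
    run is accepted only if all n controls fall within +/- k SDs."""
    p_accept_one = norm.cdf(k - se) - norm.cdf(-k - se)
    return 1.0 - p_accept_one ** n

p_ed_crit = p_ed_1ks(2.5, 3.35)   # at the critical SE, N = 2
```

This yields *P*_{ed} ≈ 0.961 at SE_{c} = 3.35 and a false-rejection probability (se = 0) of ≈0.025, both matching the figures quoted above.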

Both QC rules have high probabilities for rejecting a SE_{c} and would be considered by traditional criteria to provide acceptable error-detection performance. The X̄(2.32)/R_{4s} rule has a slightly better rejection probability at SE_{c}, but the difference appears small. Both QC rules have relatively low probabilities of detecting a critical increase in analytical imprecision. A QC rule’s ability to detect RE_{c} is commonly observed to be less than its ability to detect SE_{c} (2).

Figure 2A shows that *P*_{e} increases as a function of SE, with the rate of increase depending on the value of TE_{a}. Fig. 2A also displays the acceptance probabilities (1 − *P*_{ed}) for the 1_{2.5s} and X̄(2.32)/R_{4s} rules. Fig. 2B gives the probabilities (*P*_{qe}) that a result contains an analytical error that exceeds TE_{a} after QC testing by each QC rule. These curves are the product of the descending acceptance-probability curves and the ascending *P*_{e} curve. The maximum value is 0.0022, occurring at SE = 3.04, for the 1_{2.5s} rule; it is 0.0007, occurring at SE = 2.66, for the X̄(2.32)/R_{4s} rule. In both cases, the SE condition associated with the highest probability of reporting an unacceptable test result is less than the traditionally defined SE_{c}.
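The worst-case SE for the 1_{2.5s} rule can be located by scanning *P*_{qe} = *P*_{e}(1 − *P*_{ed}) over a grid of SE values (a minimal numpy/scipy sketch, assuming TE_{a} = 5.0 and N = 2 as in the text):

```python
import numpy as np
from scipy.stats import norm

tea, k, n = 5.0, 2.5, 2
se = np.linspace(0.0, 8.0, 8001)
p_e = norm.sf(tea - se) + norm.cdf(-tea - se)            # error exceeds TEa
p_accept = (norm.cdf(k - se) - norm.cdf(-k - se)) ** n   # run not rejected
p_qe = p_e * p_accept
worst_se = se[np.argmax(p_qe)]
worst_p_qe = p_qe.max()
```

The scan reproduces the curve's maximum of ≈0.0022 near SE = 3.04, below the traditionally defined SE_{c} of 3.35.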

Figure 3 shows the same performance measures as Fig. 2, but as a function of RE. The maximum *P*_{qe} is 0.047, occurring at RE = 4.49, for the 1_{2.5s} rule; it is 0.049, occurring at RE = 4.51, for the X̄(2.32)/R_{4s} rule. In this case, the RE condition associated with the highest probability of producing a test result that contains an unacceptable analytical error is substantially greater than the traditionally defined RE_{c}. In general, the out-of-control error conditions that have the greatest chance of producing an unacceptable test result are unrelated to the traditionally defined “critical” error conditions.
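The same scan for random error conditions: when the true SD is inflated by the factor RE, both *P*_{e} and the per-control acceptance probability are two-sided normal probabilities with rescaled limits (illustrative numpy/scipy sketch for the 1_{2.5s} rule, TE_{a} = 5.0, N = 2):

```python
import numpy as np
from scipy.stats import norm

tea, k, n = 5.0, 2.5, 2
re = np.linspace(1.0, 10.0, 9001)
p_e = 2.0 * norm.sf(tea / re)                    # two-sided tail, SD scaled by re
p_accept = (2.0 * norm.cdf(k / re) - 1.0) ** n   # all controls within +/- k
p_qe = p_e * p_accept
worst_re = re[np.argmax(p_qe)]
worst_p_qe = p_qe.max()
```

This reproduces the maximum *P*_{qe} of ≈0.047 near RE = 4.49, well above the traditionally defined RE_{c} of 2.55. (For the combined X̄(*c*)/R_{4s} rule the acceptance term would also include the range criterion, omitted here for brevity.)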

Rather than being specified in terms of the minimum acceptable probability of rejecting an analytical run that contains a critical out-of-control error condition, QC performance might better be specified in terms of the maximum acceptable probability (*P*_{max}) of reporting a test result with an analytical error that exceeds total allowable error specifications. The minimum error detection requirement for every possible error magnitude can then be easily determined by simply rearranging the inequality (1 − *P*_{ed}) *P*_{e} ≤*P*_{max} to give *P*_{ed} ≥1 − *P*_{max}/*P*_{e}. For example, Fig. 4A shows the minimum *P*_{ed} requirement for a QC rule to assure that the probability of reporting a test result with an analytical error >5.0 analytical SDs is never >0.001 for any possible SE condition. Fig. 4B shows the minimum *P*_{ed} requirement to keep the probability of reporting an unacceptable test result <0.01 for any possible RE condition.
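The rearranged inequality translates directly into code (illustrative Python/scipy sketch for the SE case, with TE_{a} = 5.0 and *P*_{max} = 0.001 as in Fig. 4A):

```python
from scipy.stats import norm

def min_p_ed(se, tea=5.0, p_max=0.001):
    """Smallest P_ed at systematic shift se that keeps the probability
    of reporting an unacceptable result at or below p_max, from the
    rearranged inequality P_ed >= 1 - p_max / P_e (floored at zero)."""
    p_e = norm.sf(tea - se) + norm.cdf(-tea - se)
    return max(0.0, 1.0 - p_max / p_e)
```

For small shifts *P*_{e} is already below *P*_{max}, so no error detection is required at all (the requirement is zero); at SE_{c} = 3.35 the requirement is ≈0.98, illustrating why the fixed 0.90 criterion at the critical error only indirectly controls reported quality.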

Figure 5A displays the power function curves for the 1_{2.18s} rule (*P*_{fr} = 0.058) and the X̄(2.49)/R_{4s} rule (*P*_{fr} = 0.017) for two control samples. Both rules meet the quality specification shown in Fig. 4A and were easily found by systematically varying the control limits for the 1_{ks} rule and for the X̄(*c*) rule until the power function curves just exceeded the minimum *P*_{ed} requirement. Fig. 5B shows the probabilities of reporting an unacceptable test result when these rules are used. As required, the maximum value for *P*_{qe} is just <0.001 and occurs when SE = 2.84 for the 1_{2.18s} rule and when SE = 2.74 for the X̄(2.49)/R_{4s} rule.

Similarly, Fig. 6 gives the performance characteristics of the 1_{2.35s} rule (*P*_{fr} = 0.073) and the X̄(1.91)/R_{4s} rule (*P*_{fr} = 0.079) for four control samples. The rules produce virtually identical power function curves (Fig. 6A) that meet the quality specification shown in Fig. 4B, and are associated with a maximum probability of reporting an unacceptable test result that is just <0.01, occurring when RE = 3.10 (Fig. 6B).

## Discussion

The traditional approach to specifying analytical quality goals in the clinical laboratory has been based on a TE_{a} paradigm. Westgard and Barry discuss quality goals in terms of the TE_{a} that can be tolerated in a test result without compromising its medical usefulness (4). Cembrowski and Carey refer to quality goals in terms of the maximum clinically allowable error (E_{a}) in a test result (6). When this paradigm is used to evaluate QC strategies, the primary performance measure of interest should be related to the probability of reporting a test result that contains an analytical error that exceeds the TE_{a} specification. However, the traditional performance measure used in QC evaluation has been the probability of rejecting an analytical run when a critical out-of-control error condition exists. Such a condition is usually defined as one that would result in 1% or 5% of test results containing an analytical error that exceeds the TE_{a} specification (4)(6). The traditional approach to performance evaluation only indirectly relates to the primary performance measure of interest.

Figures 1–3 compare the traditional approach to QC performance measurement based on the probability of rejecting a run to an approach that directly calculates the probability of reporting an unacceptable result. The *P*_{ed} when a SE_{c} or RE_{c} exists has little to do with the error conditions that result in the greatest chance of producing unacceptable results. The probability of reporting an unacceptable result increases as the magnitude of the out-of-control error condition increases, reaches a maximum, and then decreases for larger error conditions. This behavior should not come as a surprise. For very small out-of-control error conditions, the chance that a QC rule will detect the error condition is low, but the probability of reporting an unacceptable result even if the out-of-control error condition isn’t detected will still be low. For very large error conditions, the probability of an unacceptable result is very high if the out-of-control error condition is not detected, but the probability that a QC rule will detect the error condition is also very high, so the ultimate probability of reporting an unacceptable result is again low. The out-of-control error conditions that will be associated with the greatest probability of reporting an unacceptable result will be those whose magnitude is small enough to still be relatively difficult for a QC rule to detect, but large enough so that the probability of producing a result with an unacceptable error is relatively high. The magnitude of the out-of-control error conditions associated with the highest probabilities of reporting test results with unacceptable errors will depend on the TE_{a} specification, the QC rule, and the type of error.

Traditionally, QC rules are selected to have a high probability (≥0.90) for detecting “critical” out-of-control error conditions. When comparing alternative QC rules that all have error-detection probabilities that are ≥0.90 for detecting a critical error, differences between rules seem relatively small. However, these apparently small differences between rules can translate into quite large differences in their probabilities of reporting a laboratory result with an unacceptable analytical error. In Fig. 1A, *P*_{ed} at the traditionally calculated SE_{c} is 0.961 for the 1_{2.5s} rule compared with 0.992 for the X̄(2.32)/R_{4s} rule. However, in Fig. 2B the probability of reporting a result with an unacceptable error is more than three times greater for the 1_{2.5s} rule (0.0022) than for the X̄(2.32)/R_{4s} rule (0.0007). Thus, differences between QC rules that might be dismissed as inconsequential when evaluated by traditional performance measures could be substantial in terms of their probabilities of allowing results to be reported that contain unacceptable analytical error.

If quality goals are specified in terms of a TE_{a} specification and a *P*_{max} of reporting a result that contains an analytical error that exceeds TE_{a}, then the minimum error detection probability required for every magnitude of out-of-control error condition can be easily determined (Fig. 4). While others have discussed in general terms the optimal power function for a QC rule (6)(7), this approach provides a precise definition for an optimal *P*_{ed} criterion.

The quantity *P*_{qe} describes the probability of reporting a test result with an analytical error that exceeds TE_{a} as a function of the magnitude of an out-of-control error condition. If an estimate of the overall rate of reporting unacceptable test results is desired, then the frequency of occurrence of out-of-control error conditions must be taken into account. For example, if a SE condition between 1.0 and 4.0 SD exists less than one in 10 times that QC testing occurs, then for the situation shown in Fig. 5B the overall rate of test results containing unacceptable analytical errors would be considerably less than (0.1)(0.001) = 0.0001.

Westgard and Barry have described the defect rate of an analytical process as the portion of test results having medically important errors (4). However, the quantity that they calculate for defect rate is *f*(1 − *P*_{ed}), where *f* denotes the frequency of occurrence of a medically important out-of-control error condition. This quantity reflects the fraction of analytical runs containing critical out-of-control error conditions that are not rejected, which is not the same as the fraction of test results that have medically important errors.

In summary, when evaluating QC rules, the probability of reporting a test result that contains an unacceptable analytical error should be the performance measure of interest, not the probability of rejecting an analytical run with a SE_{c} or RE_{c}. By using the probability of reporting an unacceptable test result as the primary performance measure, worst-case QC performance can be determined irrespective of the magnitude of any out-of-control error condition that may exist, thus eliminating the need for the concept of a critical out-of-control error.

## Footnotes

Division of Laboratory Medicine, Department of Pathology, Washington University School of Medicine, Box 8118, 660 South Euclid Ave., St. Louis, MO 63110. Fax 314-362-3016; e-mail parvin@wugcrc.wustl.edu

^{1}Nonstandard abbreviations: QC, quality control; P_{e}, probability that a test result has an unacceptable analytical error when an out-of-control error condition exists, but before QC testing has occurred; P_{ed}, probability of rejecting an analytical run when an out-of-control error condition exists; P_{fr}, probability of rejecting an analytical run that is in control; P_{max}, maximum acceptable probability of reporting a test with an analytical error that exceeds the total allowable error specification; P_{qe}, probability that a test result has an unacceptable analytical error after QC testing has occurred; RE, an out-of-control error condition that causes an increase in analytic imprecision (random error) in subsequent test results; RE_{c}, an out-of-control error condition that causes an increase in analytic imprecision that is considered critical; SE, an out-of-control error condition that results in a constant bias (systematic error) in subsequent test results; SE_{c}, an out-of-control error condition that results in a constant bias that is considered critical; TE_{a}, total allowable error specification; test results that contain analytical errors that exceed TE_{a} are considered unacceptable; X̄(c)/R_{4s}, QC rule that rejects if the average difference from target of the control observations in the current analytical run exceeds c SEMs or the range of the control observations exceeds four analytical SDs; and 1_{ks}, QC rule that rejects if any of the control observations in the current analytical run are more than k analytical SDs from target.

- © 1997 The American Association for Clinical Chemistry