Evidence-based medicine (EBM) has been driven by the need to cope with information overload, by cost control, and by a public impatient for the best in diagnostics and treatment. Clinical guidelines, care maps, and outcome measures are quality improvement tools for the appropriateness, efficiency, and effectiveness of health services. Although they are imperfect, their value increases with the quality of the evidence they incorporate. Laboratory professionals must direct more effort to demonstrating the impact of laboratory tests on a greater variety of clinical outcomes. Laboratory and clinical practitioners must be familiar with many of the accessible electronic and paper tools for searching for evidence. Detailed statistical and epidemiologic knowledge is not essential, but critical appraisal skills and a competent understanding of the strengths and weaknesses of systematic review and metaanalysis are necessary. Overemphasis on complexity and failure to recognize time limitations are major barriers to translating EBM into everyday practice. Emphasizing and practicing the role of the laboratory professional as a skilled clinical consultant who is strongly grounded in evidence, together with better integration of laboratory and clinical information and improved laboratory reports, will overcome most barriers. There is a poverty of good primary studies of test evaluations. Institution of more consistent standards for the design and reporting of studies on diagnostic accuracy should improve the situation. If nothing else, systematic reviews have demonstrated the need for more good-quality primary research in laboratory medicine.
There is a clinical logic that drives the pursuit of evidence-based medicine (EBM). In every encounter between an individual and a healthcare provider, there is a problem or need being expressed. This may be large or small, trivial or serious, but it requires an appropriate response. The need is expressed in what is said and in what is not said, in the appearance of the individual, in actions or inactions, in the findings from clinical examination, and in the results of investigations and tests. The individual physician, nurse, osteopath, or herbalist (a few examples from an extensive list of possibilities that definitely does include laboratory professionals) must recognize, interpret, and incorporate this information. The need or problem may fall into the areas broadly represented by the terms diagnosis, prognosis, or therapy. The meeting between patient and physician is direct, whereas the laboratory professional will frequently be indirectly involved. Nevertheless, all providers face the same obligations and challenges if the individual seeking help is to receive the best possible care. The information that expresses the patient’s need(s) must be turned into answerable questions, and the best evidence must be efficiently searched out, critically evaluated, and applied for the benefit of the patient (1). Knowing whether the action taken was beneficial requires evaluation and acceptance of the possibility of improvement, and thus a process of life-long and self-directed learning. The recognition that there are many providers of health services emphasizes that the requirement and responsibility for evidence are widespread. Perhaps the Holy Grail of evidence-based healthcare will be achieved when evidence-based health services are partnered by evidence-driven health management and public policy. The ultimate challenge may be to recognize openly the political uses and misuses of evidence and to extend rules of evidence to government policy.
Although the term EBM was created in Canada at McMaster University by a group led by Dr. Gordon Guyatt (2), there are various claims as to the origin of its practice. Paris in the 19th century has been suggested as the source of its philosophical origins (3), with the 18th century staking a claim when Morgagni in 1769 used the autopsy in the study of disease (4). In 19th century Paris, it was noted that those who received bleeding as part of the treatment for cholera had a much higher death rate than those who were not bled. There is the suggestion of even earlier philosophical origins for the assessment of evidence in research during the reign of the Chinese Emperor Qianlong (1). The method of “kaozheng” (practicing evidential research) was used in the interpretation of ancient Confucian texts.
Regardless of its origins, many factors have come together over the past 30 years to drive the movement to EBM. One factor is that individual physicians, faced with >30 000 biomedical journals published annually and >17 000 new medical books each year (5), have neither the time nor the ability to cope, not even in a specialist area. In 1992, the ∼20 English-language clinical journals dealing with adult internal medicine published >6000 articles with abstracts; every day a physician would have had to read at least 17 articles related to internal medicine alone to try to keep up to date (6). The challenge is even greater today. Systematic reviews of the literature have demonstrated that many studies are grossly inadequate, and thus are potentially misleading, and that >95% of articles in medical journals do not meet the minimal standards of critical appraisal (6).
The second factor is the global phenomenon of increasing healthcare costs, the use of cost-control to limit this, and efforts to make the best use of finite resources. There is an ongoing cost-benefit analysis. The payers are becoming increasingly dissatisfied with authoritative statements and are demanding evidence. Add to this a third factor, a general public who have more education, want the best in diagnostics and therapies, do not want to wait, and also know how to get information from the electronic media. EBM is at the interface of all these forces, because no matter what “spin” is put on a budget limit in healthcare, it represents rationing. This last point emphasizes that although EBM can provide guidance relating to a wide range of clinical situations, it is important to clearly recognize the financial and clinical components and to distinguish between them. This will ensure that the toughest problem in the provision of healthcare, namely, how to weigh the relative merits of clinical and economic efficacy in prioritizing the application of healthcare resources, is not masked as solely a decision on clinical cost-effectiveness. It must be recognized that this prioritizing requires societal, ethical, and political discussion (7).
Tools for Continuous Quality Improvement
It is not practical every time to repeat all of the steps in “the process of systematically finding, appraising, and using contemporaneous research findings as the basis for clinical decisions” (8). If the goal is to make the best clinical decision for any patient, a quality improvement approach will ask whether the care is appropriate, efficient, and effective. These three steps can be represented by clinical practice guidelines, care maps, and outcome measures (9).
Clinical practice guidelines have been defined as “systematically developed statements to assist practitioner and patient decisions about appropriate healthcare for specific clinical circumstances” (10). The goal of a guideline is to arrive at an agreement as to how patients should be treated, potentially providing them with greater consistency of care. As they become more widely known, guidelines may also help patients make more informed choices and even influence public policy by drawing attention to deficiencies (11). The process of developing guidelines is important not only because of the influence this will have on the quality of the evidence that is used, but also on its effective translation or transfer into clinical practice (12). Clinical guidelines are developed and supported by experts, and the goal is that they should have general application. The greater the strength of the evidence that is incorporated, the better the quality of the guidelines.
It would be naive not to recognize the weaknesses of guidelines: poor evidence leads to poor quality, they require considerable time and resources to produce, and they are not easy to update. Cautionary notes have been sounded to recognize explicitly the fact that guidelines might become powerful rationing tools and that transparency and public scrutiny are essential to ensure that economic or political decisions are not disguised as clinical decisions (13). It has been observed that as cost constraints gradually increase, the best option for a population less frequently matches that for individual patients (14). The same authors state that this is not good news for finding a solution to the tension between infinite needs and finite resources (14)(15).
Clinical guidelines are not cookbook medicine that removes clinical freedom, in the manner of the consensus guidelines explored by Plato in the 4th century BC (16). He foresaw legal action as a mechanism of forcing physician compliance with guidelines. It is worthy of note that a study published in the United States on the use of guidelines in legal proceedings found that guidelines played “a relevant or pivotal role” in <7% of malpractice suits (17)(18). This may reflect a realistic assessment of their value, recognizing their role as a quality improvement tool and not a rigid standard of practice.
Guidelines for the most appropriate care of the patient must be fitted into a structured process for delivery of healthcare. Too much or too little service delivered, too late or too early, makes even the most appropriate guidelines inefficient and ineffective. The issues of structure, process, and outcome in the production of healthcare have been lucidly presented by the late Professor Avedis Donabedian (19). The care map sets the optimum steps in the process with standards for each step that should allow comparison with the best practice. Acceptable and unacceptable variances from those standards can be benchmarks that in some cases require immediate review and action, whereas in others, they may become part of a retrospective variance analysis, with both cases leading to actions to improve the process. The clinical laboratory and many other disciplines may have impact at several steps in care paths and should be involved in their development, implementation, and evaluation (20).
The care map or critical pathway document usually is formulated as a grid, and the categories of care are given specific time segments, usually in days (21). Another component of the grid may be readily obtained outcome measures, and these reflect progress on the pathway (21)(22)(23)(24).
Test turnaround time (TAT) is one component of a care path that should focus on a patient’s laboratory test from the time the physician considers the test right through to the action that is taken based on the test result. In this more extensive understanding of TAT, the responsibilities of the laboratory service include, but are not limited to, mechanical or electronic efficiency in moving samples, providing analyses, and transmitting results. The laboratory service can also have valuable input into the initial appropriateness of the test, its interpretation and application, and all the steps in between, thus influencing favorably the TAT as it impacts the patient.
The clinical care map also relies on good-quality evidence and must evaluate whether it is optimizing the process and improving efficiency of care. These clinical pathways can be designed to be very responsive to the local situation and can be defined by diagnosis or by procedure, whichever is most appropriate in the given situation. They are an integral part of the benchmarking that hospitals use, or may be mandated to use, to achieve clinically relevant comparison with sister institutions at the regional or national levels. Bringing enthusiasm to these activities is important and valuable, but it is also useful to be reminded that the mechanics and data gathering applied to a selected group of quality measures can become so demanding on individual effort and institutional resources that no energy is left for other aspects of quality (25).
assessment of outcomes
This is the third step in the continuous quality improvement loop and is a guide to the effectiveness of care. It provides information for further assessment of the initial appropriateness of care and may suggest review and updating of the clinical practice guidelines. At each step, the quality of the evidence used is vitally important. Outcomes measurements and studies have been around for many years in clinical medicine and continue to be both expensive and difficult to do well (26)(27)(28). The focusing of so much attention on outcomes derives from the desire to control spending on healthcare. Outcomes management, a term proposed in 1988 (29), is difficult, but it has been argued that there has been some progress in evaluating the processes and outcomes of medical care (30). Understanding the distinction between efficacy and effectiveness is necessary in evaluating outcome studies. There are many published studies of efficacy, i.e., the ability of a procedure or a test to achieve the desired clinical purpose under controlled circumstances for a controlled group of patients. To be effective, the treatment effect must be demonstrated in a larger population, verifying its value in the situation of usual clinical practice.
An illustrative and straightforward outcome example is a patient recuperating from surgery. If the clinical outcome for a patient recuperating at home with support from a home healthcare program is as good as the outcome when the patient stays in the hospital, then a case is made for the less costly and more comfortable home environment. This outcome makes economic and clinical sense, but it also scores highly as another outcome, improved patient satisfaction. High cesarean section rates have been used as an outcome, generating reexamination of existing practice guidelines with subsequent modification of these guidelines, thus feeding back into the quality improvement loop. The hospital length of stay by diagnosis and hospital days per thousand population are outcomes that focus on resource utilization. Immunization rates for children, rates for Pap smears, complications after various procedures, mortality rates, quality of life, return to work, and functional status of a patient all serve to indicate that measurable outcomes may have a variety of time frames attached to them, i.e., short-term, long-term, and endpoint outcomes.
There have been many large and expensive clinical outcome studies that have looked at major endpoints such as myocardial infarction, mortality, and complication rates. Several very large studies on lowering cholesterol, among them the Scandinavian Simvastatin Survival Study (31), have shown significant reductions in nonfatal heart attack, the need for revascularization, deaths from ischemic heart disease, and overall mortality. Laboratory testing cannot attract funding for similar studies. However, there is an opportunity for much greater involvement in these clinical trials.
In addition, the clinical laboratory, while devoting resources to ensuring an efficient internal testing process, should always try to look at the steps in the context of seeing how they contribute to positive or negative clinical outcomes. For example, in the case of therapeutic drug monitoring, process indicators could be the percentage of acceptable blood collections, the percentage of samples collected on time, TAT, and reporting of critical results. Outcome measures for therapeutic drug monitoring that are patient-related indicators could be the percentages of therapeutic and toxic concentrations of phenytoin (Dilantin), phenobarbital, or carbamazepine. Another patient-related example would be the number of drug assays performed per adjusted patient day. This is not to downgrade the importance of the process variables in monitoring the laboratory quality and performance improvement, because the laboratory must deliver its best-quality results (evidence) into the clinical system. However, it is the clinical outcome measurements that are the visible signs of the added value contributed by laboratory measurement of therapeutic drug concentrations. There are signs that laboratory professionals are giving serious attention to these clinical outcome issues (32). The clinical laboratory can have a very important role in this multidisciplinary activity. It can also be of great assistance in generating impact analyses before or after appointments of new physicians, introduction of new procedures, or increased clinical volume. Such data have increased credibility relating to test utilization and budget impact because they have been produced by the clinical laboratory in collaboration with the clinical team.
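The distinction between process and patient-related outcome indicators can be sketched in code. The therapeutic ranges (in mg/L) and the result values below are illustrative assumptions for the example, not clinical reference values:

```python
# Illustrative sketch: patient-related outcome indicators for therapeutic
# drug monitoring. Ranges (mg/L) and results are hypothetical examples.

THERAPEUTIC_RANGES = {  # drug: (low, high) in mg/L -- assumed for the example
    "phenytoin": (10.0, 20.0),
    "phenobarbital": (15.0, 40.0),
    "carbamazepine": (4.0, 12.0),
}

def classify(drug, concentration):
    """Classify a result as subtherapeutic, therapeutic, or toxic."""
    low, high = THERAPEUTIC_RANGES[drug]
    if concentration < low:
        return "subtherapeutic"
    if concentration <= high:
        return "therapeutic"
    return "toxic"

def outcome_indicators(results):
    """Percentages of results falling in each range category."""
    counts = {"subtherapeutic": 0, "therapeutic": 0, "toxic": 0}
    for drug, conc in results:
        counts[classify(drug, conc)] += 1
    total = len(results)
    return {k: round(100.0 * v / total, 1) for k, v in counts.items()}

# A hypothetical set of phenytoin results for one reporting period
results = [("phenytoin", c) for c in (8.2, 12.5, 15.0, 22.3, 18.9)]
indicators = outcome_indicators(results)
# indicators -> {'subtherapeutic': 20.0, 'therapeutic': 60.0, 'toxic': 20.0}
```

The process indicators (collection timing, TAT, critical-result reporting) would be tallied the same way; the point is that only the range-category percentages say anything about the patient rather than the laboratory workflow.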
Building Blocks of EBM
It has been said that to practice EBM physicians must combine the skills and judgment they have developed through clinical experience with the best available clinical evidence that has been derived from systematic research (33). I stated earlier that arising from the clinical encounter is the need to pose an answerable question(s), search for the best evidence, assess it, and apply it.
efficient searching for quality evidence
Collecting appropriate information resources increases the sensitivity of the search process and ensures that important articles are not excluded. However, this will lead to the inclusion of articles of low methodologic quality. The resources have to be used skillfully to improve the specificity by filtering out poor-quality evidence. Strategies for this have been described (1)(34).
There are evidence-based journals that have set specific methodologic and clinical standards that have been met by the clinical articles they publish. Nothing is perfect, but this will increase the validity of results and hence the quality of the publications. It has been claimed (35) that this produces the most methodologically rigorous 2% of papers that are most useful for clinicians. Examples of these journals are the ACP Journal Club, Evidence-Based Cardiovascular Medicine, and Evidence-Based Medicine. The contents of the ACP Journal Club and Evidence-Based Medicine are combined in a compact disk produced by the American College of Physicians. These evidence-based journals produce structured abstracts in which the evidence is summarized, and a critical commentary accompanies each article to help the reader place its findings in a wider context.
There are databases of published articles, such as MEDLINE and EMBASE. It must be remembered that such databases have limitations: not all medical articles are indexed on MEDLINE, and human error means that many have been misclassified. MEDLINE can be searched with a variety of software products, including Grateful Med and the Internet Grateful Med. The HealthSTAR database provides access to nonclinical information that could relate to health economics, planning, and administration. It also gives access to chapters, meetings, abstracts, and reports. In 1998, the National Library of Medicine announced that there would be free access to MEDLINE through the PubMed interface and to MEDLINE and HealthSTAR through Internet Grateful Med. A very helpful introduction for the nonexpert to the MEDLINE database provides a series of questions to direct the approach to searching the medical literature, giving clear examples of how to use evidence-based quality filters for everyday use and maximally sensitive search strings for research (36). An informative review of interest to laboratory medicine shows how these databases are used to develop a systematic review of published reports relating to inappropriate laboratory utilization (37).
There is an ever-expanding resource of evidence-based materials on the Internet (Table 1⇓ ).
Coping with the avalanche of published articles requires the development of survival skills. When potentially useful evidence is available, whether sought by use of the formal approaches discussed earlier, by reading selected journals, or even serendipitously by glancing through journals, a decision must be made whether any article is valid and whether it has any usefulness in the individual professional practice. This is part of a quality improvement process that involves optimizing the use of limited time, acquiring better-quality information by rejecting lesser-quality information, and, it is hoped, providing improved clinical or laboratory performance if it is applied in practice. The skills of critical reading may be taught as part of formal professional education. Journal clubs have been shown to be effective in teaching residents the necessary skills (38) and have improved the knowledge of medical students (39). The time taken by journal clubs (appraising two papers in ∼1 h) is appropriate for the formal learning environment but is probably unrealistic for the busy practitioner (40). Various guidelines have been developed for the practitioner evaluating an article about therapy or prevention. Asking whether the assignment of patients to treatment was randomized directs attention to the biases that may be present and the direction in which they would influence the results. Was the follow-up complete? How were those who withdrew or dropped out of the study handled? Was the study properly blinded, were the study groups similar when the trial began, and were the groups treated equally, apart from the application of the medication or procedure being tested? Can the results be applied to my practice, and were all of the clinically important outcomes considered? Other fundamental questions ask whether the likely treatment benefits are worth the cost of the intervention and whether the treatment may bring about any potential harm.
There are textbook learning resources (1), and the web site of the NHS Research Development Centre (Table 1⇑ ) provides a set of worksheets for critical appraisals.
Similar lists have been produced to evaluate an article on diagnostic tests; this will be discussed later in the section concerning the way forward for laboratory medicine. There is legitimate criticism and concern that the guidelines and checklists become more complex over time and are difficult to apply for those who have not acquired the appraisal skills as an integral part of their education. The structured abstract used for the past several years in a large number of journals helps to determine the value of the article. A randomized controlled trial in which three research articles for family practice were assessed by two groups of family doctors using the READER (Relevance, Education, Applicability, Discrimination, overall Evaluation) method of critical appraisal provided a more appropriate evaluation of the three articles compared with a free appraisal not using any model (41).
systematic review and metaanalysis of studies
A systematic review and a metaanalysis are not one and the same. A systematic review is a concise summary of the best available evidence from primary studies using explicit, rigorous, and reproducible methods to find, critically review, and then synthesize the evidence (42)(43)(44). A metaanalysis uses statistical methods to combine the results of multiple studies of similar design (45). The production of a high-quality systematic review is time-consuming and expensive. The Cochrane Collaboration is an international initiative that was established in England in 1992 under the leadership of Iain Chalmers (46)(47). For many years Archie Cochrane, who was a physician and epidemiologist, argued that because there would always be limits on the resources for healthcare, there should be critical summaries of all randomized trials to be sure that selected interventions are effective. These systematic reviews must be updated regularly with new evidence (47). To ensure that randomized trials have been well designed, executed, and reported, the Consolidated Standards of Reporting Trials (CONSORT) statement was published in 1996 with a checklist of 21 items that should be included in reports of randomized trials (48)(49). The goal is to eliminate bias, a problem with badly executed randomized trials (50). It is essential to move away from the narrative and selective form of review used by the Nobel prize winner Linus Pauling (51) to assert that vitamin C could help us live longer and have a better quality of life. A systematic review found that although one or two trials had strongly suggestive evidence that vitamin C would prevent the common cold, far more studies that did not show any benefit had been omitted or ignored (42) by the distinguished biochemist.
For the evaluation of a systematic review, five questions (Table 2 ⇓ ) have been suggested (44). Many criticisms have been expressed of systematic reviews, and a recent vigorous defense states that the methodology is not limited to randomized controlled trials but can also be applied to diverse topics, providing a credible evidence base to support policy making (52). Systematic reviews do not always produce a definitive answer and can lead to more primary research because they identify gaps in areas of significant relevance. A systematic review need not necessarily end by converting the data from each accepted study to a common measurement scale and then combining them with the statistical approach called metaanalysis. In some reviews, it is appropriate to present a narrative synthesis of the methods and results; this is a qualitative systematic review. As with any developing science, there are substantial challenges, including how to integrate into systematic reviews heterogeneous evidence that may be direct or indirect and can be derived from different populations with studies of varying design and quality (53).
Metaanalysis is a statistical combination of the data from more than one study after they have been converted into a standard measurement scale. The results usually are presented in graphic form as odds (or risk) ratios with a 95% confidence interval for individual trials, an overall ratio, and a 95% confidence interval for the pooled data from all of the trials (Fig. 1⇓ ). No effect is assigned the relative risk of 1.0. If the confidence interval of the result (a horizontal line) crosses the vertical line of no effect, there may be no significant difference between the interventions, or the sample size may be too small. These ratios provide an estimate of the relative efficacy of an intervention. A risk ratio <1 denotes a reduction in the number of events in the treated group compared with the control group. As each new study is added to the pooled results, the confidence interval should narrow unless substantial heterogeneity exists. A cumulative metaanalysis is consistent with the Bayesian principle that any observation should be considered along with prior knowledge, described as a prior probability, which is derived without including the phenomenon being examined. The addition of the results of the new studies creates a posterior probability (54).
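The pooling described above can be illustrated with a minimal fixed-effect (inverse-variance) metaanalysis of odds ratios. The 2 × 2 trial counts below are invented purely to show the arithmetic; a real analysis would use each study's reported data and would also assess heterogeneity:

```python
import math

# Sketch of a fixed-effect (inverse-variance) metaanalysis of odds ratios,
# the quantity displayed in the typical forest plot described in the text.

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance from a 2x2 table:
    a/b = events/non-events in the treated group,
    c/d = events/non-events in the control group."""
    lor = math.log((a * d) / (b * c))
    var = 1/a + 1/b + 1/c + 1/d  # Woolf's variance estimate
    return lor, var

def pooled_odds_ratio(tables):
    """Inverse-variance pooled OR with a 95% confidence interval."""
    num = den = 0.0
    for table in tables:
        lor, var = log_odds_ratio(*table)
        weight = 1.0 / var           # weight = inverse of the variance
        num += weight * lor
        den += weight
    pooled = num / den
    se = math.sqrt(1.0 / den)        # pooled SE narrows as trials are added
    ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
    return math.exp(pooled), ci

# Invented trial counts (events, non-events) for treated and control arms
trials = [(15, 85, 25, 75), (30, 170, 45, 155), (8, 92, 12, 88)]
or_pooled, (lo, hi) = pooled_odds_ratio(trials)
# If the interval (lo, hi) spans 1.0, the pooled effect is not significant.
```

A random-effects model, which adds a between-study variance term, would widen the interval when the trials are heterogeneous; the narrowing of the interval as studies accumulate is what drives the cumulative metaanalysis described above.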
Although there is guidance in the use of metaanalyses of randomized and nonrandomized studies (55), it has long been recognized that metaanalyses can produce conflicting or misleading conclusions. This bias can arise because studies with negative or indefinite conclusions may have less chance of being published or may be significantly delayed (56)(57)(58)(59). Covert duplicate publication of clinical trials of the antiemetic efficacy of the drug ondansetron, included in a metaanalysis, led to a 23% overestimation of efficacy (60). Because 17% of the published trials were covert duplicates, 28% of patient data were duplicated. Trials that reported greater treatment effect were significantly more likely to be duplicated with no cross-references to the original source (60). Because the discordance of results of statistically combined studies can be attributable to many factors, which can be of greater or lesser importance, a decision algorithm has been proposed to assist the process of identifying and resolving the causes (61).
A recent study found publication bias within a sample of studies from the Cochrane Database of Systematic Reviews, but inclusion of the missing studies changed the conclusions in <10% of the metaanalyses (62). To detect a publication bias before a study is included in pooled analyses, the use of funnel plots has been proposed (63). A funnel plot is a plot of the trials’ effect estimates against the sample sizes, and it is based on the fact that the precision of estimating the treatment effect will increase as the sample size of the studies included in the metaanalysis increases. All studies will show a wide scatter of their results at the bottom of the graph; this will narrow among larger studies. If there is no bias, the plot will resemble a symmetric inverted funnel; if there is bias, it will often be skewed and asymmetric (63). Although prospective registration of all trials before their results are known is being encouraged (64), it has still not been achieved in spite of being proposed 15 years ago (65).
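The geometry of the funnel can be sketched from first principles: if a study's standard error shrinks roughly as 1/√n, the expected 95% scatter band around the true effect narrows correspondingly. The scale factor in the sketch below is an arbitrary illustrative assumption:

```python
import math

# Sketch: the expected shape of a funnel plot. For an effect measure whose
# per-study standard error is proportional to 1/sqrt(n), the 95% scatter
# band narrows as sample size grows, producing the symmetric inverted
# funnel described in the text when no publication bias is present.

def funnel_halfwidth(n, scale=2.0):
    """Half-width of the 95% band around the true effect for sample size n.
    The scale factor is an arbitrary illustrative constant."""
    se = scale / math.sqrt(n)
    return 1.96 * se

sizes = [25, 100, 400, 1600]
band = {n: round(funnel_halfwidth(n), 3) for n in sizes}
# Each fourfold increase in n halves the width of the funnel at that level.
```

With real data, each study's observed effect would be plotted against its sample size (or precision); an asymmetric cloud relative to this band, typically missing small studies on the no-benefit side, is the signature of publication bias described above.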
Barriers to EBM
Translating the results of EBM research into everyday practice is a major challenge. There are substantial gaps between the care that patients receive and the practice the evidence suggests is effective. Given that physicians and other healthcare providers do not go to work to do a bad job and recognizing that limited time is available, activities have to be directed to how people learn and what they need to know. Finding information is important (34); in addition, overcoming barriers to the dissemination and timely application of research findings (66) and valid clinical guidelines (11) is part of the necessary creation of a culture of evidence-based decision making in healthcare (67)(68). The increasing accessibility and user-friendliness of the tools of EBM are major steps forward in improving individual efficiency in making the best use of time.
It is too easy to forget the needs of individual practitioners, most of whom have no need or desire to be knowledgeable in the theory of research methodology; they want to know how to use the results of the research. A detailed understanding of study design or statistics is not required, although it may help. Most practitioners want to focus on high-priority problems, and to do this they need to be able to construct focused and answerable questions. Evidence-based practice guidelines are preferable to working through mountains of original research reports (69). Guidelines need to be relevant, easy to use, widely disseminated, and updated to maintain their relevance. Better implementation of electronic records could incorporate practice protocols as part of the patient record, with automatic reminders and noting of completion (70). Systematic reviews of interventions aimed at improving professional practice have shown how difficult it is to translate research results into everyday practice (71)(72)(73). These have underlined the need to integrate many educational tools and increase the focus on strategies to promote implementation in ways that effectively communicate best practices. Recognizing the magnitude of the challenge and the fact that the efficacy of guidelines in altering clinical practice has been demonstrated in rigorous investigations (74)(75), the August 2000 issue of the official publication of the American College of Chest Physicians (76) devoted an entire supplement to translating guidelines into practice, with specific emphasis on their implementation and physician behavior change.
Evidence-based Laboratory Medicine: Barriers and Bridges
Some of the barriers for the laboratory medicine community are not significantly different from those faced by physicians trying to adapt to an evidence-based world. The demand for evidence in health services is not going to go away. Many clinical and laboratory professionals in all disciplines have felt uncomfortable with evidence-based practice, lack familiarity with the tools, and are uncertain as to how they fit into the complete picture. In many situations, there has been a lack of leadership. Many laboratory professionals have been uncomfortable with the challenge of the clinical consultant role. This has not been made easier by laboratory reorganization, which often has promoted an industrial rather than a clinical model for laboratory medicine (20)(27). There is little doubt that the way laboratory data are presented to clinicians is not user-friendly and does not help them with decision making. There are very large databases in laboratory information systems, but very little effort has been devoted to making them patient- and user-centered so that they readily and quickly provide clinically helpful information. The final barrier is a lack of research funding for laboratory medicine that is in any way comparable to the funding devoted to clinical trials.
Recognizing that laboratory medicine also operates in a healthcare environment of information overload, has already experienced many cost-control strategies, and exists to serve a public that is more knowledgeable and demanding, the bridge must lead to cooperation and working as part of a healthcare team. This means understanding the strengths and weaknesses of clinical guidelines, care maps, and clinical outcomes while recognizing how to develop outcome measures for laboratory testing that are patient indicators and are not limited to laboratory process. To be a clinical consultant in these and other patient-centered activities requires access to the best available evidence for the appropriate use of laboratory tests (77). Mastery of the tools for seeking out and evaluating quality evidence requires that critical appraisal become a way of life. As with physicians, laboratory professionals need not be experts in all of the arcane elements of research methodology or statistical analysis, but there must be a professional comfort level with their principles, application, and interpretation. It is impossible to work as a member of a team of clinicians and to interpret laboratory tests in that environment without understanding the clinical question being asked and the clinical implications of the answer. The focus of interpretation and consultation should be directed toward nonstandard tests; when such consultation is based on evidence and presented in a nonthreatening and collegial manner that demonstrates the role the laboratory can play in the appropriate use of results, it will come to be valued by clinicians. In most cases, value and relevance require that the results and interpretation be rapidly available. This would be further enhanced by greater integration of laboratory data with other clinical, diagnostic, pharmaceutical, and financial information as part of the clinical database repository.
There are also some signs that laboratory professionals are becoming more interested in better ways of presenting test results for use in diagnosis, treatment, and prognosis.
Putting the Evidence in Evidence-based Laboratory Medicine
evaluating the existing information
The reduction or elimination of bias in research is one of the goals of systematic reviews, with the hope of preventing a wrong result attributable to the study design (78). There are guidelines that aid critical appraisal of the literature regarding diagnostic tests and their systematic review (79)(80)(81). Mathematical tools for test evaluation have been described (82), and there are definitions for terms used in the clinical literature, such as pre- and posttest probability, odds ratio, confidence intervals, relative risk reduction, and absolute risk reduction (83)(84)(85)(86)(87). These concepts consistently help to place the value of the laboratory test in the clinical situation. The methodologic quality of studies of test evaluations has been recognized as poor: a 1995 survey (88) found that only 18% of the studies examined met five of the seven methodologic standards applied, although even this was an improvement over previous findings. A review of metaanalyses of diagnostic tests showed that poor design, data collection, and reporting all affect the estimates of diagnostic accuracy, thus providing empirical evidence of bias in studies of diagnostic tests (89).
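To illustrate how some of these terms connect in practice, the short Python sketch below converts a pretest probability into a posttest probability through the positive likelihood ratio, which is itself derived from sensitivity and specificity. The numbers are hypothetical and the function is a minimal illustration of the standard odds form of Bayes' theorem, not part of any cited study.

```python
def posttest_probability(pretest_prob, sensitivity, specificity):
    """Posttest probability of disease after a positive test result,
    computed via the positive likelihood ratio and Bayes' theorem in
    odds form: posttest odds = pretest odds * LR+."""
    lr_positive = sensitivity / (1 - specificity)
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * lr_positive
    return posttest_odds / (1 + posttest_odds)

# Hypothetical example: a 20% pretest probability and a test with 90%
# sensitivity and 85% specificity (LR+ = 6.0).
print(posttest_probability(0.20, 0.90, 0.85))  # approximately 0.60
```

In this example the positive result triples the probability of disease (from 20% to about 60%), which is exactly the kind of shift a clinical consultant can quantify for a requesting physician.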
Guidelines for metaanalyses evaluating diagnostic tests have been given by the Methods Working Group on screening and diagnostic tests in the Cochrane Collaboration (80). A well-conducted search covering the period January 1985 to December 1998 identified 169 studies, 45 of which were selected because they were systematic reviews in the fields of clinical chemistry and hematology (90). Of those 45 studies, 23 met the inclusion criteria, i.e., the authors addressed a clearly described diagnostic or prognostic test and completed an explicit and thorough search of the literature. These 23 systematic reviews were then analyzed using the working group guidelines (80); none met all six guidelines, 11 met three or more, and 12 met fewer than three. This is a very instructive review of the methods and difficulties of performing a systematic review of diagnostic tests, but its major limitation stems from the poor quality of the primary studies. A systematic review and metaanalysis of studies on the use of carbohydrate-deficient transferrin rather than γ-glutamyltransferase in the detection of problem drinkers (91) revealed poor quality in more than one-half of the primary studies and concluded that additional high-quality studies are needed. Another evidence-based approach, examining the sensitivity of cardiac markers in the detection of acute myocardial infarction (92), also found design flaws in many of the primary studies and likewise concluded that additional, larger studies are needed.
improving the quality of new information
Systematic reviews are expensive and time consuming. The depressing conclusion is that very few primary studies meet the high standards necessary for inclusion in such a review and that publication of the weak studies produces unjustifiably optimistic estimates of diagnostic accuracy. It is reasonable to demand that standards be met before publication. In 1997, a checklist was proposed for reporting of studies of diagnostic accuracy (93). The checklist has since been updated with input from statisticians, methodologists, editors, and researchers (94). It serves a dual purpose: first, to guide investigators in the design of their studies, and second, to assist journal reviewers in evaluating submitted reports. Once again the travails of evidence-based laboratory medicine are similar to those faced by clinicians. The need to improve the quality and reporting of randomized controlled trials led to the Consolidated Standards of Reporting Trials (CONSORT) statement (49), and it is hoped that the Standards for Reporting of Diagnostic Accuracy (STARD) statement will lead to complete and accurate reporting of studies of diagnostic accuracy.
To develop the STARD statement, a steering committee, which included the editor of Clinical Chemistry, was established in December 1999. An international, multidisciplinary working group prepared a background document, a scope document, and a matrix of published checklists, and held a consensus meeting in Amsterdam in September 2000. A draft STARD statement and a 25-item checklist were circulated among the working members in November 2000. After the first review is completed, the committee hopes to publish the statement and checklist on a website early in 2001 and then revise them in response to input from those who visit the site. The next goal is to publish the documents and have them presented for peer review at scientific meetings and among journal editors. It is envisaged that this will not be a static statement; there will be regular evaluation and improvement as understanding and knowledge increase.
providing new and valid information through research
There is a need for many prospective studies in the area of diagnostic testing that are large enough and correctly designed to meet critical appraisal standards, such as those incorporated in the STARD proposal. A key problem in many of the primary studies of laboratory tests is the lack of information on how the patients were selected for the study. Studies of diagnostic tests are recognized to differ from the clinical trials model (94), which points to the need for further academic work aimed at fully understanding the design and statistical approaches appropriate to the evaluation of laboratory testing. Even the word “diagnostic” needs clarification because it is used in different ways by different people. There are many expressions for the efficacy of a test, which reflects the need to have and to use such an expression; that so many definitions exist, however, suggests that each has some deficiency in practice. Statistical techniques for evaluating the diagnostic utility of laboratory tests, including information theory, have been described (95). Assuming that the analytic accuracy of a laboratory procedure has been correctly established before it is used in any clinical situation, the discrimination level of the test may have to be set at different points when it is used for diagnosis, prognosis, monitoring, or screening, and these decision points need to be adjusted when the same test is used in different disease states. All of this indicates the necessity of laboratory and clinical collaboration, in partnership with the diagnostics industry and academic centers, to promote quality research that is relevant to the provision of quality care. This requires funding, which is not easy to obtain. However, proposals are more likely to attract funding agencies if they pose good, clearly expressed questions whose answers may contribute to the most effective patient care.
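As a concrete sketch of why the discrimination level must move with the clinical purpose, the Python fragment below computes sensitivity and specificity at a chosen cutoff for entirely hypothetical analyte values: lowering the cutoff favors sensitivity (desirable in screening), whereas raising it favors specificity (desirable in confirmatory diagnosis). The data, units, and function name are illustrative assumptions, not drawn from the cited studies.

```python
def sensitivity_specificity(diseased, healthy, cutoff):
    """Sensitivity and specificity at a decision threshold, for a test
    in which higher values indicate disease."""
    true_positives = sum(value >= cutoff for value in diseased)
    true_negatives = sum(value < cutoff for value in healthy)
    return true_positives / len(diseased), true_negatives / len(healthy)

# Hypothetical analyte values (arbitrary units).
diseased = [5.2, 6.1, 7.4, 8.0]
healthy = [1.1, 2.3, 3.0, 6.5]

# A low screening cutoff catches every case at the cost of specificity;
# a high confirmatory cutoff reverses the trade-off.
print(sensitivity_specificity(diseased, healthy, 4.0))  # (1.0, 0.75)
print(sensitivity_specificity(diseased, healthy, 7.0))  # (0.5, 1.0)
```

The same calculation, repeated over a range of cutoffs, is the basis of the receiver operating characteristic curve used in formal test evaluation.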
Cholesterol was used earlier (31) as an example of a laboratory test whose clinical use has been clearly established by many large clinical trials. Another example is C-reactive protein, which has very low diagnostic specificity but has long been used as an acute-phase marker of injury, infection, and inflammation and may also indicate future cardiovascular disease (96). There are many opportunities for the application and evaluation of laboratory tests in good clinical trials, and even greater opportunities for correlating laboratory procedures with the clinical findings, outcomes, and diagnoses by using the stored samples collected for those studies. One good study used clinical outcomes to provide clearer definitions of risk-benefit for target concentrations of immunosuppressive drugs (97). Another demonstrated how computers can aid in determining the correct dose of some drugs (98), thus enhancing the clinical value of therapeutic drug monitoring.
Systematic reviews have demonstrated the value of plasma homocysteine as a risk factor for coronary and other vascular diseases (99). It is also recognized that clinicians will be faced with many new laboratory tests that are said to predict venous thromboembolism. Following critical appraisal of the available information, criteria have been proposed to help physicians use the information from published studies to estimate the real usefulness of 26 laboratory tests listed as possibly being of value in the diagnosis of thrombophilia (100). There is now no excuse for failing to design a study whose results can be tested against predetermined levels of evidence. There are also evidence-based guidelines for the performance characteristics of laboratory tests used in the diagnosis and monitoring of hepatic injury (101) and for their use in screening, diagnosis, and monitoring (102). All of these studies link laboratory and clinical medicine through analysis that is data driven and critically evaluated.
- © 2001 The American Association for Clinical Chemistry