Can Physicians Identify Inappropriate Nuclear Stress Tests? An Examination of Inter-Rater Reliability for the 2009 Appropriate Use Criteria for Radionuclide Imaging | Journal Scan
Are physicians at various levels of training able to consistently identify nuclear stress tests as appropriate or inappropriate, per the 2009 American College of Cardiology (ACC)/American Heart Association (AHA) Appropriate Use Criteria for radionuclide imaging?
A total of 400 patients referred for nuclear stress testing during early 2006 were categorized as appropriate or inappropriate using the 2009 ACC/AHA Appropriate Use Criteria by two board-certified cardiologists who were not affiliated with the nuclear cardiology laboratory, two first-year cardiology fellows, two internal medicine hospitalists, and two internal medicine interns. The consensus of the two cardiologists was used as the gold standard. All raters attended a 30-minute orientation session and two training sessions, each using 20 separate stress tests that had also been classified by the two cardiologists, followed by in-person feedback. Cohen's kappa (a measure of agreement) and the sensitivity and specificity for detecting inappropriate tests were calculated.
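As an illustration of the statistics named above, the following is a minimal sketch of how Cohen's kappa, sensitivity, and specificity could be computed for one rater against a reference classification. It is not the study's analysis code: the function names and the ten example labels are hypothetical, and the sketch covers only the two-rater, two-category case, whereas the study reported agreement across six raters and included uncertain and unclassifiable categories.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


def sensitivity_specificity(reference, rater, positive="inappropriate"):
    """Sensitivity and specificity of a rater for detecting the positive label."""
    pairs = list(zip(reference, rater))
    tp = sum(ref == positive and got == positive for ref, got in pairs)
    fn = sum(ref == positive and got != positive for ref, got in pairs)
    tn = sum(ref != positive and got != positive for ref, got in pairs)
    fp = sum(ref != positive and got == positive for ref, got in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positive and negative items")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for ten tests (NOT data from the study).
reference = ["appropriate"] * 6 + ["inappropriate"] * 4
rater = (["appropriate"] * 5 + ["inappropriate"]
         + ["inappropriate"] * 3 + ["appropriate"])

kappa = cohens_kappa(reference, rater)
sens, spec = sensitivity_specificity(reference, rater)
print(f"kappa={kappa:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it can be modest even when two raters assign the same label to most tests.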
Of the 400 patients in the study, the most frequent indications for testing were evaluation of ischemia in symptomatic patients after percutaneous coronary intervention or coronary artery bypass grafting (15%); possible acute coronary syndrome without electrocardiography changes, with a low-risk Thrombolysis in Myocardial Infarction score and negative troponins (14%); and preoperative evaluation for vascular surgery without clinical risk factors (12%). The two cardiologists classified 256 of the 400 tests as appropriate (64%); 18% were uncertain, 14% were inappropriate, and 5% could not be classified. Overall, reliability was modest, with a kappa of 0.51 (95% confidence interval, 0.46-0.55) among all six noncardiologists. The proportion of agreement was higher for appropriate indications than for inappropriate indications (83% vs. 52%). Sensitivity for inappropriate testing ranged from 47% to 82%, and specificity ranged from 85% to 97%.
The authors concluded that there is only modest inter-rater reliability in application of the 2009 Appropriate Use Criteria, with substantial variability in ability to identify inappropriate tests.
Increasingly, Appropriate Use Criteria are being used as a means to improve quality and reduce health care costs. Several prior studies have suggested that inappropriate tests have limited diagnostic and prognostic yield. Other studies have attempted to apply Appropriate Use Criteria to administrative data sets and large multi-institutional databases. The present data suggest that such efforts should be viewed with caution, because the reliability of these assignments is limited even with full access to medical records. Importantly, although extensive training was a part of this study, no electronic or other tools were used to assist in application of the Appropriate Use Criteria, although several such tools have been developed. Further, automated methods of assigning appropriateness from administrative records have been used previously and were not evaluated here. Nonetheless, it is clear that improved, easy-to-use tools for application of Appropriate Use Criteria are necessary. Finally, one goal of future revisions of Appropriate Use Criteria may be to improve accessibility to clinicians.
Keywords: Acute Coronary Syndrome, American Heart Association, Consensus, Coronary Artery Bypass, Exercise Test, Health Care Costs, Electrocardiography, Hospitalists, Myocardial Infarction, Percutaneous Coronary Intervention, Physicians, Radioisotopes, Reproducibility of Results