Do Highly-Cited Articles Accurately Reflect a Biomarker's Predictive Value?
Editor's Note: This article is based on Ioannidis JP, Panagiotou OA. Comparison of effect sizes associated with biomarkers reported in highly cited individual articles and in subsequent meta-analyses. JAMA. 2011;305:2200-10.
Numerous biomarkers have been found to predict risk, prognosis, and response to treatment. Studies that show positive effects, particularly large ones, are frequently cited in subsequent work, even though later studies often show a much smaller or even absent predictive value. In their study, Ioannidis and Panagiotou sought to examine whether the effect sizes of biomarkers proposed in highly-cited studies are accurate or overestimated.(1)
To do so, they examined biomarkers that had been evaluated in at least 1 highly-cited study and for which at least 1 meta-analysis had been performed for that same association. They then compared the effect sizes reported in the most highly-cited studies with those observed in the largest studies and in the corresponding meta-analyses. Because there are >100,000 articles published on biomarkers, they limited eligible biomarker articles to those that had received more than 400 citations in the ISI Web of Science and that had been published in 1 of 24 highly-cited biomedical journals. The general and cardiology journals included New England Journal of Medicine, Lancet, JAMA, BMJ, Circulation, and Journal of the American College of Cardiology. The threshold of 400 citations was used to target approximately the top 3% of biomarker studies published in influential journals.
Among the 14,023 articles that initially met criteria, 35 were considered highly-cited articles that had corresponding meta-analyses. Cancer-related (n=14) and cardiovascular-related (n=12) outcomes predominated. The cardiovascular biomarkers included were CRP, hyperhomocysteinemia, hyperinsulinemia, adiponectin levels, and the ACE gene deletion-insertion polymorphism. The highly-cited articles received a median of 645 citations and had a median publication year of 1996. The median sample size for the 35 highly-cited studies was 518, with a median RR for events of 2.50.
The median sample size of the largest studies was 1820. In only 3 cases was the largest study the highly-cited one. Excluding these 3 cases, the median number of citations in the largest studies was only 79 (compared to a median of 645 for the highly-cited articles).
For 30 of the 35 eligible studies (86%), the highly-cited studies had a stronger effect estimate than the largest study. In addition, for 29 of the 35 (83%) highly-cited studies, the corresponding meta-analysis found a smaller effect estimate. Only 15 of the associations were nominally statistically significant based on the largest studies, and of those only 7 had a relative risk point estimate greater than 1.37.
The authors concluded that highly-cited biomarker studies often report larger effect estimates for postulated associations than are reported in subsequent meta-analyses evaluating the same associations.
The last 20 years have witnessed an explosion in biomarkers, not just in cardiovascular diseases, but in many other fields as well (particularly oncology). Despite the abundance of studies, only a few biomarkers have translated into routine clinical use. In cardiology, these would include the troponins, the natriuretic peptides, and perhaps CRP.
In their study, the authors found that the studies of a particular marker that received the greatest number of citations were far more likely to report a stronger effect size than a subsequent study that included more patients, or than later meta-analyses of the same marker. Because the later results for many of the associations reported in the highly-cited publications were not as positive (and in some cases were neutral, indicating no effect), the authors imply that the emphasis on these highly-cited studies is overstated, and may lead to approval, acceptance, and early adoption before the biomarker has been convincingly shown to have incremental value over information that is already known. They stated (correctly) that until such studies are available, emphasis on single studies with highly promising results is frequently premature.
As noted by the authors, the standards for adoption of a new biomarker require that it be validated in prospective studies that demonstrate not only an association, but also significant incremental value over known data.(2) A point not stated by the authors is that it would be even more valuable to show that treatment that affects the biomarker improves outcomes.
Some of the information that the authors present is not surprising. It has been increasingly recognized that small studies have inflated relative risks that are usually not replicated in subsequent studies.(3) Because of this large treatment effect, such studies may receive significant publicity, which is often difficult to overcome, even if multiple later studies (with much larger patient cohorts) that fail to reproduce the treatment effect are subsequently published.
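The inflation of relative risks in small studies can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions, not a reproduction of the analysis by Ioannidis and Panagiotou: the true RR of 1.3, the 10% baseline event risk, the equal group sizes, and the function names are all hypothetical choices made for illustration. The two study sizes (250 and 900 per group) only loosely echo the median sample sizes of 518 and 1820 quoted above. The sketch simulates many studies of one modest true association and compares the median RR across all studies with the median RR among only those that reach nominal statistical significance.

```python
import math
import random
import statistics


def simulate_study(n_per_arm, base_risk, true_rr, rng):
    """Simulate one two-group study; return (estimated RR, significant?) or None."""
    # Each participant has an event with the group's assumed risk.
    exposed = sum(rng.random() < base_risk * true_rr for _ in range(n_per_arm))
    unexposed = sum(rng.random() < base_risk for _ in range(n_per_arm))
    if exposed == 0 or unexposed == 0:
        return None  # RR is undefined without events in both groups
    rr = exposed / unexposed  # equal group sizes, so the denominators cancel
    # Large-sample standard error of log(RR)
    se = math.sqrt(1 / exposed - 1 / n_per_arm + 1 / unexposed - 1 / n_per_arm)
    # "Significant" here means a positive association at the two-sided 0.05 level
    significant = math.log(rr) / se > 1.96
    return rr, significant


def median_rr(n_per_arm, n_studies=1000, base_risk=0.10, true_rr=1.3, seed=0):
    """Return (median RR of all studies, median RR of significant studies)."""
    # All default parameter values are illustrative assumptions.
    rng = random.Random(seed)
    all_rr, sig_rr = [], []
    for _ in range(n_studies):
        result = simulate_study(n_per_arm, base_risk, true_rr, rng)
        if result is None:
            continue
        rr, significant = result
        all_rr.append(rr)
        if significant:
            sig_rr.append(rr)
    sig_median = statistics.median(sig_rr) if sig_rr else None
    return statistics.median(all_rr), sig_median


if __name__ == "__main__":
    for n in (250, 900):
        overall, among_significant = median_rr(n_per_arm=n)
        print(f"n per group = {n}: median RR overall = {overall:.2f}, "
              f"median RR among significant studies = {among_significant:.2f}")
```

Under these assumptions, the median across all simulated studies stays close to the true RR, whereas the studies that happen to cross the significance threshold report noticeably larger estimates, and the gap is wider for the smaller study size. If positive studies are the ones that attract attention, a pattern of this kind would be expected even without any flaw in an individual study.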
There are additional reasons that could explain the potential false-positive and inflated results present in the highly-cited studies, some of which were noted by the authors. Half of the highly-cited studies were published early, with a median publication date of 1996, which may have given them a citation advantage, with more time to accrue citations. However, the exact reasons for citing previous articles are unknown and were not examined in this study. For example, it would be expected that the initial positive study will be referred to in most subsequent studies to give (appropriate) credit to the initial discoverers. Even a study that finds a neutral or negative effect is likely to cite the original study, contributing to citation bias. There is no consensus on what threshold characterizes a highly-cited study, although the criteria the authors selected seem reasonable.
Attempts at validation with subsequent publication are much more likely for initial positive studies than for studies with a negative or smaller effect size. Analogously, the likelihood that a meta-analysis will be conducted is not the same for all biomarkers, which partially explains why the authors could only find corresponding meta-analyses for one-third of the highly-cited studies. Finally, the journals assessed were only a minority of currently available journals (although they are the ones most likely to report valid associations).
It would be interesting to extrapolate the outcomes from their study on biomarkers to treatments in cardiovascular disease. The results would likely be the same: initial enthusiasm is met with later skepticism and a failure to reproduce the results. Such outcomes have led the FDA to often require multiple large studies prior to drug approval. Attempts have been made to address the limitations often found in early studies. These include the use of standardized checklists, which enable editors, reviewers, and readers to more easily assess the studies and detect study weaknesses.(4)
The report by Ioannidis and Panagiotou(1) demonstrates that more extreme, often early associations receive considerable attention and continue to do so, despite the availability of subsequent studies or meta-analyses with more precise estimates. Readers of any biomarker article demonstrating a strong association with outcomes should closely scrutinize the study design and outcomes. Frequently, if it seems "too good to be true," it probably is.
- Ioannidis JP, Panagiotou OA. Comparison of effect sizes associated with biomarkers reported in highly cited individual articles and in subsequent meta-analyses. JAMA. 2011;305:2200-10.
- Hlatky MA, Greenland P, Arnett DK, et al. American Heart Association Expert Panel on Subclinical Atherosclerotic Diseases and Emerging Risk Factors and the Stroke Council. Criteria for evaluation of novel markers of cardiovascular risk: a scientific statement from the American Heart Association. Circulation. 2009;119:2408-16.
- Ioannidis JP. Why most discovered true associations are inflated. Epidemiology. 2008;19:640-8.
- Bossuyt PM, Reitsma JB, Bruns DE, et al. Standards for Reporting of Diagnostic Accuracy. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. BMJ. 2003;326:41-4.
Keywords: Adiponectin, Biological Markers, Checklist, Consensus, Drug Approval, Gene Deletion, Hyperhomocysteinemia, Hyperinsulinism, Natriuretic Peptides, Publications, Prognosis, Risk, Sample Size, Troponin