Bias in Associations Between Emerging Biomarkers and Cardiovascular Disease
Editor’s Note: This article is based on Tzoulaki I, Siontis KC, Evangelou E, Ioannidis JP. JAMA Intern Med. 2013;173:664-671.
Numerous cardiovascular biomarkers have been proposed as potential predictors of cardiovascular risk. However, few have consistently demonstrated clear improvements in predictive discrimination, reclassification, and/or calibration, and their clinical utility remains unclear. Studies evaluating novel markers may be biased toward overestimating their predictive value.
In this study, the authors sought to evaluate whether there is evidence for bias in published studies that could favor statistically significant and/or inflated results. To do so, the authors analyzed meta-analyses that examined any emerging biomarker in relation to cardiovascular disease (CVD) or mortality. When a meta-analysis evaluated more than 1 marker, each marker was considered separately.
Summary effects were estimated, and between-study heterogeneity was calculated with the I² metric. The I² metric ranges from 0% to 100%; values >50% or >75% are considered to represent "large" or "very large" heterogeneity, respectively. In addition, to examine the potential for small-study effects, the authors evaluated whether large studies had less significant results than smaller studies, and whether there were too many studies with statistically significant results compared with what would be expected on the basis of the findings of the largest study in each meta-analysis.
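As a rough illustration of the heterogeneity metric described above, the following minimal sketch computes Cochran's Q and I² from per-study effect estimates and standard errors, using the standard definition I² = max(0, (Q - df)/Q) × 100%. The function names and the fixed-effect (inverse-variance) pooling are assumptions made for illustration; the summary does not state exactly how the authors computed their estimates.

```python
def i_squared(effects, std_errors):
    """Return (Q, I2 in percent) for a set of study effect estimates.

    effects: per-study estimates on an additive scale (e.g. log odds ratios)
    std_errors: the matching standard errors
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching standard errors")
    # Inverse-variance weights (fixed-effect pooling; an assumption of this sketch).
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def heterogeneity_label(i2):
    """Map I2 (percent) to the thresholds quoted in the text (>50%, >75%)."""
    if i2 > 75:
        return "very large"
    if i2 > 50:
        return "large"
    return "not large"
```

For example, two hypothetical studies with log odds ratios of 0.0 and 2.0, each with a standard error of 0.5, give Q = 8 and I² = 87.5%, which falls in the "very large" category.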
Overall, 35 articles corresponding to 56 meta-analyses that examined 42 unique biomarkers were included. The median (range) number of studies included in each meta-analysis was 12 (3-68), and the median number of events was 2,459 (34-12,785).
Of 56 eligible meta-analyses, 49 (88%) had statistically significant results. Large and very large heterogeneity were seen in 26 (46%) and 9 (16%) meta-analyses, respectively. Additionally, evidence for significant small-study effects was seen in 13 (23%) meta-analyses. In 29 meta-analyses (52%), there was a significant excess of studies with statistically significant results. Meta-analyses of fibrinogen, apoB, BNP, and cystatin C showed very large heterogeneity and evidence of small-study effects. The meta-analyses of coronary artery calcium with CVD and troponin with 30-day cardiac death in patients with ACS also showed large heterogeneity (I² = 50%-75%) and evidence of small-study effects.
Only 13 of the statistically significant meta-analyses met the criteria for a well-supported association (defined as having >1000 cases and no evidence of large heterogeneity, small-study effects, or excess significance). Markers identified as meeting these criteria included GFR, albumin to creatinine ratio, non-HDL cholesterol, serum albumin, Chlamydia pneumoniae IgG, HgbA1C, non-fasting insulin, apoB/AI ratio, ESR, and lipoprotein-associated phospholipase.
The authors concluded that selective reporting biases may be common in the evidence on emerging cardiovascular biomarkers. Most of the proposed associations of these biomarkers may be inflated. In addition, in most meta-analyses, too many single studies reported "positive" results compared with what would be expected on the basis of the results of the largest studies. This suggests that small studies with "negative" results remain unpublished, or that their results are distorted during analysis.
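The comparison of observed and expected "positive" studies can be sketched as follows. This is an illustrative approximation, not the authors' code: it assumes the "largest" study is the one with the smallest standard error, takes that study's estimate as the plausible true effect, approximates each study's power with a two-sided z-test, and compares the counts with a one-degree-of-freedom chi-square statistic. All function names are hypothetical.

```python
from math import erfc, sqrt
from statistics import NormalDist

_NORMAL = NormalDist()


def power_two_sided(true_effect, se, alpha=0.05):
    """Approximate power of a two-sided z-test when the true effect is `true_effect`."""
    z_crit = _NORMAL.inv_cdf(1 - alpha / 2)
    shift = true_effect / se
    return _NORMAL.cdf(-z_crit - shift) + _NORMAL.cdf(-z_crit + shift)


def excess_significance(effects, std_errors, alpha=0.05):
    """Return (observed, expected, chi_square, p_value) for one meta-analysis."""
    n = len(effects)
    # Assumption: the "largest" study is the one with the smallest standard error.
    largest = min(range(n), key=lambda i: std_errors[i])
    plausible_effect = effects[largest]

    z_crit = _NORMAL.inv_cdf(1 - alpha / 2)
    # Observed number of nominally significant studies.
    observed = sum(1 for y, se in zip(effects, std_errors) if abs(y / se) > z_crit)
    # Expected number: sum of each study's power to detect the plausible effect.
    expected = sum(power_two_sided(plausible_effect, se, alpha) for se in std_errors)

    diff = observed - expected
    if 0 < expected < n:
        chi_square = diff ** 2 / expected + diff ** 2 / (n - expected)
    else:
        chi_square = 0.0  # degenerate case: no room for a comparison
    p_value = erfc(sqrt(chi_square / 2))  # chi-square survival function, 1 df
    return observed, expected, chi_square, p_value
```

The statistic itself is two-sided, so an excess of significant findings is only suggested when the observed count is larger than the expected count.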
The authors performed an interesting study in which they examined the potential for bias in published meta-analyses of articles estimating the predictive and diagnostic value of biomarkers for assessing cardiac disease. They concluded that selective reporting biases are prevalent in many studies on emerging cardiac biomarkers.
In many ways, these findings are not surprising. There has been an exponential growth in the number of published studies examining the predictive value of new cardiac biomarkers. In each of the last 3 years, >2000 articles on cardiac biomarkers have been published, an average of >5 per day. Despite the plethora of new markers, very few have entered the realm of routine use.
Reasons for this bias hypothesized by the authors include selective publication, in which only positive results are published. Another is selective reporting, in which multiple markers are analyzed, but only the result for the most predictive marker is reported. Such a result may simply be a false positive owing to multiple testing of many biomarkers. A similar problem can occur when only 1 biomarker is analyzed, but multiple different end-points and cut-points are examined, and only the most significant is reported.
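The multiple-testing problem can be made concrete with a small calculation. The sketch below assumes every comparison is an independent test of a true null hypothesis at a significance level of 0.05; real markers, end-points, and cut-points are usually correlated, so the numbers are only indicative. The function names are hypothetical.

```python
def n_comparisons(markers, end_points, cut_points):
    """Number of marker/end-point/cut-point combinations that could be tested."""
    return markers * end_points * cut_points


def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent null tests is 'significant'."""
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_tests
```

Under these assumptions, a single marker examined against 3 end-points at 4 cut-points (12 comparisons) has roughly a 46% chance of producing at least one nominally significant result by chance alone.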
Publication bias may be particularly prevalent in small studies, in which the predictive value is likely to be highly inflated. As reported by Tzoulaki et al¹, subsequent much larger trials frequently fail to confirm the original result, or find that the predictive value is substantially less than in the original article.
There are a number of other explanations for the results found by Tzoulaki et al¹. Many initial biomarker studies are performed in populations at extremes of risk (e.g., patients with MI compared to asymptomatic patients), which can lead to inflated odds ratios and predictive values. When the biomarker is subsequently tested in populations in which there is significant overlap between normal and abnormal (e.g., undifferentiated chest pain patients in the emergency department), the predictive values almost always decrease.
Failure to include other markers (e.g., BNP, CRP) that have also been shown to be predictive in the same population can result in inflated predictive values. Although the authors commented that such "routine" markers are "relatively inexpensive" and should be included, this may not be as easy as it sounds. Biomarker studies are frequently performed as secondary sub-studies of larger trials. When 2-4 additional tests are multiplied across 2,000-4,000 patients, the added expense can be cost-prohibitive. Additionally, appropriate multivariate analyses require 10 end-point events for every variable entered; smaller studies are therefore not sufficiently large to include more than a few of the important variables.
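The two constraints in this paragraph reduce to simple arithmetic. The sketch below applies the rule of thumb of 10 end-points per variable and counts the extra assays a sub-study would need; the function names are hypothetical, and no price per assay is assumed because the text does not give one.

```python
def max_model_variables(n_events, events_per_variable=10):
    """Largest number of variables a multivariate model can support
    under the rule of thumb of 10 end-points per variable."""
    return n_events // events_per_variable


def added_assays(n_patients, extra_markers):
    """Total additional assays when each patient gets `extra_markers` more tests."""
    return n_patients * extra_markers
```

With the smallest event count reported among the included meta-analyses (34), only 3 variables could be entered, whereas the median of 2,459 events would support 245. Adding 2 to 4 markers across 2,000 to 4,000 patients means 4,000 to 16,000 additional assays.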
An important consideration is that although a marker may not be more prognostic or diagnostic than what it is being tested against, this does not mean it lacks value. Other factors, such as cost, ease of measurement, and accuracy, may make it a viable alternative. An obvious example is the set of markers used for diagnosing MI, in which over the past 60 years we have seen successive replacement of less accurate markers, from AST to CK to CK-MB and now to troponin.
A likely additional reason for the variation in odds ratios is the marked heterogeneity of the studies included in the meta-analyses. For example, one meta-analysis estimating the predictive value of troponin² combined studies that included both STEMI and non-ST-elevation ACS, with different assays and time points of sampling. In addition, some studies sampled only at the time of enrollment, whereas others performed serial measurements, with varying periods of follow-up.
In an accompanying editorial, Nissen³ noted that the large heterogeneity found by Tzoulaki et al¹ is due to the "shame of publication bias". This bias is clearly not limited to biomarkers, or even cardiology, as many promising new findings are not substantiated in later studies. A better explanation is likely the effect of "irrational exuberance", with excitement over interesting new findings that are likely to receive substantial press. What is important to recognize is that initial findings in small studies clearly need to be confirmed in much larger, well-conducted trials. This need is evident from how small the majority of the studies included in most of the meta-analyses are. As the authors point out, any biomarker odds ratio >2 is likely an overestimate, and such effects are rarely maintained in subsequent studies.
The expectation that all negative studies will be published, as indicated by Nissen, is unlikely to be met. Smaller studies that have negative results are unlikely to be published unless an initial positive study precedes them. In addition, negative studies must be well conducted and may require substantial numbers of patients, increasing cost.
In conclusion, many biomarker studies are limited by inclusion of small numbers of patients, with even fewer adverse events. As a result, predictive values are likely to be frequently overestimated. In reality, single biomarkers with large effect sizes are probably rare. An important lesson from this study is that odds ratios or relative risks >2 are likely "too good to be true", and would best be considered hypothesis generating, with subsequent confirmation in much larger studies.
- Tzoulaki I, Siontis KC, Evangelou E, Ioannidis JP. Bias in associations of emerging biomarkers with cardiovascular disease. JAMA Intern Med. 2013;173:664-671.
- Ottani F, Galvani M, Nicolini FA, et al. Elevated cardiac troponin levels predict the risk of adverse outcome in patients with acute coronary syndromes. Am Heart J. 2000;140:917-927.
- Nissen SE. Biomarkers in cardiovascular medicine: the shame of publication bias: comment on "Bias in associations of emerging biomarkers with cardiovascular disease". JAMA Intern Med. 2013;173:671-672.