Editor’s Corner: Observations and Inferences
By Alfred A. Bove, MD, PhD, Editor-in-Chief, CardioSource WorldNews

CardioSource WorldNews | A resident came to my office one day with a spreadsheet of data on patients with aortic valve disease. We looked at the worksheet on the screen, and I asked him what question he wanted to answer from the data. He said he didn’t have a question, but that there should be some useful information in there somewhere. As we went through possible hypotheses to test from the data, each depended on important but missing data. So, several months of effort went down the drain because, at the start, no question had been posed, no hypothesis stated, and no method specified for comparing the outcome data to support or reject a hypothesis.

The lesson is clear: research that asks important questions to advance our knowledge must be done in a formal way that allows us to understand the information, test the validity of our hypothesis, and state some meaningful measure of confidence in interpreting the data. We also need to account for the fact that there is variation in the measures we take in any patient population; it is this variation that can mislead us into drawing erroneous conclusions about a treatment.
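As a rough illustration of that point (a minimal simulation I am adding here in Python with NumPy and SciPy; the blood-pressure-like numbers are invented, not drawn from any study), comparing two groups sampled from the very same population will still yield a "significant" difference about 5% of the time, purely by chance:

    # Two groups drawn from the SAME population, compared repeatedly.
    # All numbers are hypothetical; this only illustrates sampling variation.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    trials, false_positives = 1000, 0
    for _ in range(trials):
        a = rng.normal(loc=120, scale=15, size=50)  # group A
        b = rng.normal(loc=120, scale=15, size=50)  # group B, same distribution
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:
            false_positives += 1
    print(f"Differences declared 'significant' by chance alone: {false_positives}/{trials}")
    # Expect roughly 5%, even though no real difference exists.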

To account for that variation, we have developed statistical methods that first calculate the average measure in a population, then quantify the variation within that population, and finally make inferences about whether measurable differences appear when we intervene with a new therapy or compare established ones.
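A minimal sketch of that sequence, again in Python (the values below are invented for illustration only): compute each group’s mean, quantify its spread, and then ask whether the difference between groups is larger than the spread alone would explain.

    # Hypothetical measurements; summary statistics followed by a two-sample t-test.
    import numpy as np
    from scipy import stats

    control   = np.array([128, 134, 122, 140, 131, 126, 138, 129])
    treatment = np.array([121, 118, 127, 115, 124, 119, 122, 126])

    for name, x in (("control", control), ("treatment", treatment)):
        print(f"{name}: mean = {x.mean():.1f}, SD = {x.std(ddof=1):.1f}")

    t_stat, p_value = stats.ttest_ind(control, treatment)  # the inference step
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")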

There are numerous statistical methods for analyzing biomedical data that can be tailored to the type of data being measured; conclusions can then be inferred from those measures. Over the years, we have come to accept a probability value (p value) of 0.05 as the threshold for accepting true differences in the outcomes of our clinical studies. This is formally called rejecting the null hypothesis, and it means that, if there were truly no difference, a result at least as extreme as the one observed would arise by chance less than 5% of the time. In many situations where a calculated p value is greater than 0.05 (e.g., 0.07), the conclusion is rejected, and the study turned down for publication. (Sometimes the p is precisely 0.05, with 95% confidence that the calculation will be followed by swearing.) While we think we have good methods for making inferences about our clinical research, the statistical community is highly critical of our use of the p value as a bright line for drawing firm conclusions about our research. They point out that reliance on a single threshold for p values leaves serious questions about the reproducibility and replicability of scientific conclusions. Indeed, even a trivially small true difference can be made statistically significant (p < 0.05) simply by increasing the sample size until the p value crosses the 0.05 threshold.
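To see why that sample-size criticism matters, here is another hypothetical sketch (all numbers invented): give two groups a true difference far too small to matter clinically, and the p value will still tend to drop below 0.05 once enough patients are enrolled.

    # A clinically trivial true difference (1 unit on a spread of 15) versus sample size.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    for n in (25, 100, 400, 1600, 6400):
        a = rng.normal(loc=120, scale=15, size=n)
        b = rng.normal(loc=121, scale=15, size=n)  # true difference: 1 unit
        _, p = stats.ttest_ind(a, b)
        print(f"n = {n:5d} per group: p = {p:.4f}")
    # p tends to fall as n grows, although the effect never becomes clinically meaningful.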

The American Statistical Association (ASA), in particular, remains highly critical of the way we use statistics and p values in arriving at conclusions about the value and efficacy of therapies studied in clinical trials. They provide six principles for use of statistical analysis and interpretation of statistical results that temper our reliance on p values.1 (Read this month’s cover story to see these principles.) These need to be understood by all investigators and clinicians to avoid misleading conclusions from clinical studies.

The debate is not new; it probably started soon after p values were first calculated by Pierre-Simon Laplace in the 1770s. However, discussions really started to heat up when Ronald Fisher popularized the p value in his 1925 book, “Statistical Methods for Research Workers.” Two years ago, Gelman and Loken wrote of “The Statistical Crisis in Science,”2 and now we have the ASA throwing down the gauntlet and, in effect, saying to researchers, “We need to talk.”

And we should talk—and learn what statisticians are thinking when they propose moving to a post-p < 0.05 world.

References:

  1. Wasserstein RL, Lazar NA. Am Stat. 2016. [Epub ahead of print]
  2. Gelman A, Loken E. American Scientist. 2014;102:460.
Read the full May issue of CardioSource WorldNews at ACC.org/CSWN

Keywords: CardioSource WorldNews, Heart Defects, Congenital, Heart Valve Diseases, Probability, Reproducibility of Results, Research, Sample Size

