Machine Learning Derived Patterns of Echocardiographic Phenotypes on Prediction of Heart Failure

Quick Takes

  • Risk for progression from Stage B heart failure to symptomatic disease is poorly predicted by current definitions of left ventricular (LV) structural heart disease.
  • This study leveraged machine learning (ML) clustering and classification techniques to identify a parsimonious set of echocardiographic variables comprising mitral annular early diastolic velocity (e'), LV volume and LV mass combinations.
  • Three distinct echocardiographic phenotypes were identified that were associated with unique biomarker and vascular function profiles as well as cardiovascular outcomes.

Abnormalities of cardiac structure and function detected on echocardiograms in asymptomatic individuals may predict the risk of subsequent adverse cardiovascular (CV) outcomes. However, the complexity of current echocardiographic metrics, particularly in defining diastolic dysfunction (DD), and the lack of clarity on how to prioritize multiple metrics, limit their predictive capacity, particularly on the individual patient level. Deep phenotyping methods using machine learning (ML) techniques enable the identification of combinations of the most distinct echocardiographic parameters which may not be clinically obvious. Such phenotypic clusters identified by ML can then be used to identify potential phenotypes at highest risk for adverse outcomes and to determine prognostic factors and physiologic profiles related to the development of symptomatic heart disease.

Synopsis of Study
In the recent cross-sectional, population-based study by Kobayashi et al.,1 ML clustering and classification were applied to identify distinct echocardiographic phenotypes in two population-based European study cohorts: STANISLAS, used as the derivation cohort, and Malmö, used as the validation cohort.

The derivation cohort, STANISLAS, is a single-center, familial, longitudinal, population-based cohort in France. A sample of 827 participants (mean age 60±5 years, 48% men) was used for the study after exclusion of children, individuals with a history of heart failure (HF), and those with missing echocardiographic variables. The mean body mass index (BMI) was 27 kg/m², 28.3% had hypertension (HTN), and the mean left ventricular ejection fraction (LVEF) was 66%. Non-invasive imaging variables such as cardiac and vascular function, clinical data, and select cardiac biomarkers were collected.

Cardiac function variables obtained by echocardiography included left atrial (LA) volume, diastolic function, and mitral annular early diastolic velocity (e'), along with the calculated E/A ratio, mean e', and mean E/e' ratio. LV systolic deformation and diastolic dysfunction were also assessed. Vascular analysis included blood pressure (BP), aortic stiffness by pulse wave velocity, intima-media thickness by linear probe and echo tracking system, total arterial compliance index (calculated as stroke volume [SV]/central pulse pressure × body surface area [BSA]), and systemic vascular resistance (calculated as 80 × mean central pressure/cardiac index). Circulating biomarkers associated with DD in heart failure with preserved ejection fraction (HFpEF) were also evaluated in this study. These included markers of inflammation (FABP-4, GDF-15, Gal-3, VCAM1, ICAM1, TGFb1, MCP1, RAGE products, interleukins, TNF-r1, and OPN), extracellular matrix (ECM) remodeling and angiogenesis (MMP, PIIINP, PICP, ST-2, NT-proBNP, BNP, CNP, TIMP, ADM, ANG, VEGF-A), and renal function (BUN and creatinine).
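As an illustration only, the two derived vascular indices described above can be written as small functions. This is a minimal sketch, not the study's code: the function and parameter names are hypothetical, and the compliance formula is read here as SV divided by the product of central pulse pressure and BSA (that is, stroke volume index over pulse pressure), which is an assumption about the grouping in the stated formula.

```python
def total_arterial_compliance_index(stroke_volume_ml, central_pulse_pressure_mmhg, bsa_m2):
    """Total arterial compliance index in mL/mmHg/m^2.

    Assumption: the formula is SV / (central pulse pressure x BSA).
    """
    return stroke_volume_ml / (central_pulse_pressure_mmhg * bsa_m2)


def systemic_vascular_resistance(mean_central_pressure_mmhg, cardiac_index_l_min_m2):
    """Systemic vascular resistance as 80 x mean central pressure / cardiac index.

    The factor 80 converts mmHg·min/L to dyn·s·cm^-5. Because cardiac index is
    used, the result is indexed to BSA.
    """
    return 80.0 * mean_central_pressure_mmhg / cardiac_index_l_min_m2


# Hypothetical input values, chosen only to make the arithmetic easy to follow.
taci = total_arterial_compliance_index(80.0, 40.0, 2.0)  # 80 / (40 * 2)
svr = systemic_vascular_resistance(90.0, 3.0)            # 80 * 90 / 3
```

Note that this form of the resistance formula, as stated in the text, does not subtract right atrial pressure from the mean pressure.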

For validation, the authors used the Malmö cohort, a population-based, longitudinal cohort in Sweden. A total of 1,394 participants without a history of HF (mean age 67±6 years, 70% men) were included, and similar imaging and clinical data were collected.

Clustering analysis using ML methods was performed on echocardiographic data using K-means clustering (used for the primary analysis), latent class modeling, and hierarchical clustering. While the clustering methods generally showed consistent results, K-means clustering had the most consistent performance, hence its use as the primary method. Decision trees were then constructed to identify the variables most relevant to phenotypic clustering and the most relevant threshold for each variable, which together comprised the final algorithm. Clinical variables (i.e., age, sex, BMI) were not incorporated into the final algorithm as they did not alter the decision-making.
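To make the clustering step concrete, the following is a minimal sketch of K-means on three echocardiographic variables, written in plain Python. It is not the authors' pipeline. The data values are synthetic, the column order (mean e', LV volume index, LV mass index) is chosen for illustration, and the deterministic farthest-point initialization is an assumption made so the example is reproducible; the study does not describe its initialization here.

```python
import math


def standardize(rows):
    """Z-score each column so that no single variable dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def farthest_point_init(rows, k):
    """Start from the first row, then repeatedly add the row farthest from all chosen centroids."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        nxt = max(rows, key=lambda r: min(math.dist(r, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(rows, k, max_iterations=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = farthest_point_init(rows, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [min(range(k), key=lambda j: math.dist(row, centroids[j]))
                      for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic rows: (mean e' in cm/s, LV volume index, LV mass index). Not study data.
echo_rows = [
    (12.0, 45.0, 65.0), (11.5, 47.0, 68.0), (12.5, 44.0, 66.0),
    (7.0, 42.0, 70.0), (6.5, 43.0, 72.0), (7.5, 41.0, 69.0),
    (7.2, 65.0, 105.0), (6.8, 68.0, 110.0), (7.4, 66.0, 108.0),
]
labels, centroids = kmeans(standardize(echo_rows), k=3)
```

On this toy data set each block of three rows ends up in its own cluster. The cluster numbers themselves carry no meaning; attaching names such as MN, D, or D/S requires inspecting each cluster's profile afterward.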

Three distinct clusters were identified, and the phenotypes were defined as "mostly normal (MN)", "diastolic (D)", and "diastolic changes with structural remodeling (D/S)". The D and D/S phenotypes were similar in age, body mass index, and prevalence of CV risk factors, vascular dysfunction, and DD. The D phenotype comprised mostly women who had increased levels of inflammatory biomarkers and the most abnormal diastolic function parameters. The D/S phenotype comprised predominantly men who had the highest values of LV mass, volume, and remodeling biomarkers (with examples including cardiac protein troponin-1 and interleukin ST-2, which are implicated in LV remodeling and fibrosis). Both D and D/S demonstrated overall worse prognosis in terms of mortality compared to controls (MN). The results support the impact of inflammation on increasing LV pressure with changes in diastolic function. They further support the impact of endothelial dysfunction and biomarkers of fibrosis on cardiac structural changes in asymptomatic patients.

ML classification by decision tree and random forest methods further identified a parsimonious set of three echo parameters, namely mean e' velocity, LV volume index, and LV mass index (e'VM), as the most important in distinguishing the clusters with high accuracy. The e'VM algorithm was then tested in the external cohort and shown to identify differences in vascular parameters, proteomic profiles, and long-term risk of HF and CV death, with high diagnostic performance that was incremental to that of the ARIC HF risk score. Application of the e'VM algorithm to the Malmö cohort demonstrated the highest NT-proBNP levels, LV mass, and left atrial area in the D/S phenotype compared to the D phenotype. Survival analysis in the Malmö cohort found that, over a median follow-up of 10.3 years, 10.1% met the primary outcome. Both D and D/S phenotypes had increased rates of the primary outcome, even after adjustment for established HF risk scores plus NT-proBNP. Furthermore, increased CV hospitalization and mortality were demonstrated in the D/S phenotype compared to the other groups. Lastly, the prevalence of DD defined by the 2016 American Society of Echocardiography (ASE) guidelines was low across all three phenotypes.
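The way a decision tree arrives at a variable and a threshold can be sketched with a single split chosen by Gini impurity, which is the elementary step that tree-building repeats at each node. The sketch below is illustrative only. The rows and phenotype labels are synthetic, the helper names are hypothetical, and the resulting threshold is a property of this toy data, not a cutoff from the e'VM algorithm.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best single split.

    Candidate thresholds are midpoints between consecutive distinct values.
    Ties are resolved in favor of the first candidate scanned.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [lab for r, lab in zip(rows, labels) if r[f] <= thr]
            right = [lab for r, lab in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, thr, score)
    return best


# Synthetic rows: (mean e' in cm/s, LV volume index, LV mass index). Not study data.
rows = [
    (12.0, 45.0, 65.0), (11.5, 47.0, 68.0), (12.5, 44.0, 66.0),
    (7.0, 42.0, 70.0), (6.5, 43.0, 72.0), (7.5, 41.0, 69.0),
    (7.2, 65.0, 105.0), (6.8, 68.0, 110.0), (7.4, 66.0, 108.0),
]
phenotypes = ["MN"] * 3 + ["D"] * 3 + ["D/S"] * 3  # cluster labels to be explained

feature, threshold, impurity = best_split(rows, phenotypes)
```

Here the first split falls on the e' column and separates the synthetic MN rows from the rest; a full tree would then split the remaining rows again, for example on volume or mass. A random forest repeats this process over many resampled trees and aggregates their votes.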

In a prior study, Ernande et al.2 demonstrated the effectiveness of cluster analysis in identifying echocardiographic phenotypes in patients with diabetes mellitus (DM). Similarly, three clusters were identified, and the two groups whose phenotypes had worse profiles of LV mass index, LVEF, clinical variables, and other parameters had poorer prognosis with regard to LV remodeling and subclinical dysfunction. As such, similar applications of ML cluster analysis may be useful in stratifying asymptomatic patients.

The findings of the present study illustrate how ML-based methods to cluster and classify echo data can improve upon current approaches (e.g., ARIC HF risk score, 2016 ASE guidelines to define DD) to identify patient phenotypes at highest risk for symptomatic HF progression and adverse outcome. Notably, the proposed e'VM algorithm used three simple echo parameters, LV mass, LV volume, and e' velocity, and was able to identify high-risk phenotypic profiles despite normal risk according to conventional assessments. Additionally, the nature of the computerized analysis facilitates its implementation into current electronic medical record (EMR) systems, potentially allowing for the ML-guided identification and categorization of high-risk individuals. This approach using ML cluster and classification analysis on imaging, biomarker, and clinical data may be useful in stratifying asymptomatic patients at higher risk for adverse events.

Limitations of this study include those associated with cross-sectional study design, particularly the inability to infer causality. Furthermore, the e'VM algorithm used only echocardiographic parameters without consideration of other diagnostic variables such as electrocardiograms. Regarding the specific cohorts chosen for this study, there was a lack of ethnic diversity, with unknown generalizability to United States (US) populations, older populations, and those with poorer health. Participants who were missing echocardiographic parameters were excluded from the data set, even though they had more comorbidities. Not all echo parameters were measured in the Malmö cohort and, importantly, metrics such as LV volume had to be derived from linear measurements. Furthermore, the echocardiograms were performed using different machine vendors, during different time periods, and without central analysis. However, extensive within-cohort inter- and intra-observer variability studies were performed and reported.

Despite the practicality of ML use in medicine, its application continues to pose many challenges. One general challenge is selecting accurate and comprehensive variables to predict the disease of interest. In this case, with the use of echocardiography to define phenotypes, the goal is to find an appropriate number of variables (e.g., LV mass index) that adequately describe various aspects of DD in patterns that were not previously considered. Selecting too few variables may lead to a model that inadequately represents the disease, which introduces bias in the formed clusters.3 Choosing too many variables can lead to an overly complex model that is susceptible to overfitting, in which noise in the data is captured and treated as signal, compromising generalizability to a new population. Overfitting can disproportionately affect the balance between bias and variance, which may lead to inaccurate and redundant models.4 Accuracy also depends on the raw quality of the data obtained for analysis as well as the suitability of cohort selection and data point measurements. Notably, missing data or incomplete data sets may compromise algorithm results. These factors underscore the need for algorithms that can manage data of different types and quality (e.g., clinical, imaging, biopsy), which may be especially important in the evaluation of HF.3

In summary, this study demonstrated that an ML-based approach using echocardiographic data to phenotype patients may help identify which asymptomatic patients are at higher risk for adverse events using a parsimonious, simple set of three echo metrics. The study underscores the importance of improving prognostic approaches beyond current guidelines and traditional methods and explores the predictive role of select echocardiographic parameters in combination with serum biomarkers. Future prospective studies are needed to assess the clinical applicability of the e'VM algorithm in other, sicker cohorts and its impact on patient management.


  1. Kobayashi M, Huttin O, Magnusson M, et al. Machine learning-derived echocardiographic phenotypes predict heart failure incidence in asymptomatic individuals. JACC Cardiovasc Imaging 2021;Sep 8:[Epub ahead of print].
  2. Ernande L, Audureau E, Jellis CL, et al. Clinical implications of echocardiographic phenotypes of patients with diabetes mellitus. J Am Coll Cardiol 2017;70:1704-16.
  3. Deo RC. Machine learning in medicine. Circulation 2015;132:1920-30.
  4. Krittanawong C, Johnson KW, Rosenson RS, et al. Deep learning for cardiovascular medicine: a practical primer. Eur Heart J 2019;40:2058-73.

Clinical Topics: Noninvasive Imaging

Keywords: Cross-Sectional Studies, Stroke Volume, Body Mass Index, Carotid Intima-Media Thickness, Creatinine, Growth Differentiation Factor 15, Vascular Endothelial Growth Factor A, Ventricular Remodeling, Follow-Up Studies, Prognosis, Benchmarking, Blood Pressure, Blood Urea Nitrogen, Electronic Health Records, Heart Failure, Latent Class Analysis, Prevalence, Pulse Wave Analysis, Proteomics, Vascular Stiffness, Ventricular Function, Left, Echocardiography, Hypertension, Risk Factors, Vascular Resistance, Electrocardiography, Diabetes Mellitus, Survival Analysis, Hospitalization, Machine Learning, Kidney, Inflammation, Interleukins, Biomarkers, Phenotype, Troponin, Fibrosis, Matrix Metalloproteinases, Decision Trees
