Machine learning insight into the role of imaging and clinical variables for the prediction of obstructive coronary artery disease and revascularization: An exploratory analysis of the CONSERVE study

Lohendran Baskaran*, Xiaohan Ying, Zhuoran Xu, Subhi J. Al’Aref, Benjamin C. Lee, Sang Eun Lee, Ibrahim Danad, Hyung Bok Park, Ravi Bathina, Andrea Baggiano, Virginia Beltrama, Rodrigo Cerci, Eui Young Choi, Jung Hyun Choi, So Yeon Choi, Jason Cole, Joon Hyung Doh, Sang Jin Ha, Ae Young Her, Cezary Kepka, Jang Young Kim, Jin Won Kim, Sang Wook Kim, Woong Kim, Yao Lu, Amit Kumar, Ran Heo, Ji Hyun Lee, Ji Min Sung, Uma Valeti, Daniele Andreini, Gianluca Pontone, Donghee Han, Todd C. Villines, Fay Lin, Hyuk Jae Chang, James K. Min, Leslee J. Shaw

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Background: Machine learning (ML) can extract patterns from data and construct data-driven models. We use ML models to gain insight into the relative importance of variables for predicting obstructive coronary artery disease (CAD) in the Coronary Computed Tomographic Angiography for Selective Cardiac Catheterization (CONSERVE) study, and to compare ML prediction of obstructive CAD with the CAD consortium clinical score (CAD2). We further perform ML analysis to gain insight into the role of imaging and clinical variables for revascularization.

Methods: For prediction of obstructive CAD, the entire invasive coronary angiography (ICA) arm of the study, comprising 719 patients, was used. For revascularization, 1,028 patients were randomized to ICA or coronary computed tomographic angiography (CCTA). Data were randomly split into 80% training and 20% test sets for model building and validation. Models used extreme gradient boosting (XGBoost).

Results: Mean age was 60.6 ± 11.5 years and 64.3% were female. For the prediction of obstructive CAD, the area under the receiver operating characteristic curve (AUC) was significantly higher for ML at 0.779 (95% CI: 0.672–0.886) than for CAD2 at 0.696 (95% CI: 0.594–0.798) (P = 0.01). Body mass index (BMI), age, and angina severity were the most important variables. For revascularization, the model obtained an overall AUC of 0.958 (95% CI: 0.933–0.983). Performance did not differ whether the imaging parameters used were from ICA (AUC 0.947, 95% CI: 0.903–0.990) or CCTA (AUC 0.941, 95% CI: 0.895–0.988) (P = 0.90). The ML model obtained sensitivity and specificity of 89.2% and 92.9%, respectively. Number of vessels with ≥70% stenosis, maximum segment stenosis severity (SSS), and BMI were the most important variables. Exclusion of imaging variables resulted in performance deterioration, with an AUC of 0.705 (95% CI: 0.614–0.795) (P < 0.0001).

Conclusions: For obstructive CAD, the ML model outperformed CAD2. BMI is an important variable, although it is currently not included in most scores. In this ML model, imaging variables were most associated with revascularization. Imaging modality did not influence model performance. Removal of imaging variables reduced model performance.
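
As an illustration of the evaluation quantities named in the abstract (a random 80%/20% split, AUC, sensitivity, and specificity), the following is a minimal sketch in plain Python. It is not the study's code: the function names, the 0.5 decision threshold, the random seed, and the toy labels and scores are all assumptions made for illustration. No model is trained here; the scores stand in for the predicted probabilities that a classifier such as XGBoost would output.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Randomly split rows into (train, test) lists. Hypothetical helper."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    The 0.5 threshold is an assumption, not taken from the study."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (synthetic, for illustration only): 1 = event, 0 = no event.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

toy_auc = auc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)

# Splitting 719 placeholder patient indices 80/20, mirroring the size of the
# ICA arm; the resulting set sizes depend on this sketch's rounding rule.
train_ids, test_ids = train_test_split(range(719))
```

The pairwise definition of AUC is used because it needs no external library and makes the interpretation explicit: an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.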

Original language: English
Article number: e0233791
Journal: PLoS ONE
Volume: 15
Issue number: 6
Publication status: Published - Jun 2020
