Uncovering the Black Box of Coronary Artery Disease Diagnosis: The Significance of Explainability in Predictive Models

LINK: TBA

Abstract

In recent years, Coronary Artery Disease (CAD) prediction and diagnosis have been the subject of many Medical Decision Support Systems (MDSS) that employ Machine Learning (ML) and Deep Learning (DL) algorithms. The common ground of the vast majority of such applications is that they function as black boxes: they reach a conclusion/diagnosis using multiple features as input, yet the user is often oblivious to the prediction process and to the feature weights that lead to the eventual prediction. The main goal driving this work is to provide a degree of explainability for the inner workings of a black-box CAD prediction model. A dataset was used containing biometric and clinical data from 571 patients (21 features in total, 43% ICA-confirmed CAD instances). A prediction model trained on this dataset with the CatBoost algorithm is analyzed in order to highlight its prediction-making process and the significance of each input datum. State-of-the-art explainability techniques are employed to quantify the significance of each feature, and common patterns and differences with respect to the medical literature are then discussed. Moreover, the findings are compared with established CAD risk factors, so as to offer an evaluation of the prediction process from the medical expert's point of view. By depicting how the algorithm weights the information contained in each feature, we shed light on the black-box mechanics of ML prediction models; by analyzing the findings, we assess their validity against the medical literature on the matter.
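The workflow the abstract describes (train a gradient-boosted classifier on tabular clinical data, then rank the contribution of each input feature) can be sketched as follows. This is a minimal illustration, not the study's code: the paper uses CatBoost on real patient data, while this sketch substitutes scikit-learn's GradientBoostingClassifier and model-agnostic permutation importance on synthetic data, so that it is self-contained. All feature names and data below are hypothetical.

```python
# Hypothetical sketch of the explainability workflow described above:
# fit a gradient-boosted classifier, then rank per-feature importance.
# (Stand-in for CatBoost + dedicated attribution tools on real data.)
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_patients, n_features = 571, 21        # dataset dimensions from the paper
X = rng.normal(size=(n_patients, n_features))
# Synthetic label driven by a few features, with a positive rate roughly
# comparable to the paper's 43% CAD prevalence.
logits = X[:, 0] + 0.5 * X[:, 1] - 0.3
y = (logits + rng.normal(scale=0.5, size=n_patients) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Permutation importance: the drop in held-out accuracy when one feature
# is shuffled -- a simple proxy for the per-feature weighting the paper
# extracts from its black-box model.
imp = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
for i in ranking[:5]:
    print(f"feature_{i:02d}: {imp.importances_mean[i]:.3f}")
```

The resulting ranking plays the role of the feature-significance analysis discussed above: it tells the medical expert which inputs the model leaned on, which can then be compared against established CAD risk factors.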