Predictive Modeling on Cardiovascular Health





Journal Title

Journal ISSN

Volume Title



With expenses escalating in American healthcare, leveraging data analytics can help cut costs and improve patient satisfaction. Employing machine learning for predictive modeling can pinpoint high-risk patients, enabling proactive care instead of reactive. In my thesis, I replicated healthcare data analytics studies to showcase the potency of machine learning. I compared methods, algorithms, and models while highlighting ethical issues in big data analytics for healthcare. In the study analysis, Dristas & Trigka (2022) exhibited the most replicable and comprehensive results. Tazin et al. (2021) erred by including synthetic samples in the testing set during the SMOTE process while Dev et al. (2022) used an under-sampling technique, diminishing an already small dataset and risking accuracy issues. Despite the immense potential of machine learning in healthcare, my results revealed execution flaws that highlight the importance of additional research to validate big data analytics in healthcare. Replicating studies is crucial for making well-informed decisions based on reliable evidence. A collaborative effort between data scientists, healthcare professionals, and policymakers is essential to safeguard patient privacy and ensure responsible technology use in healthcare.


LCSH Subject Headings