Classification and predictive models using supervised machine learning: A conceptual review


M A Pienaar
K Naidoo

Abstract





Supervised machine learning models (SMLMs) are a prevalent approach in the medical machine learning literature. These models have considerable potential to improve clinical decision-making through enhanced prediction and classification. In this review, we present an overview of SMLMs and discuss the conceptual domains relevant to machine learning: model development, validation, and model explanation. The discussion is accompanied by clinical examples that illustrate key concepts.





Article Details

Section

Research Articles

How to Cite

Classification and predictive models using supervised machine learning: A conceptual review. (2025). Southern African Journal of Critical Care, 41(1), e2937. https://doi.org/10.7196/SAJCC.2025.v411.2937

