A validation of machine learning models for the identification of critically ill children presenting to the paediatric emergency room of a tertiary hospital in South Africa: A proof of concept
Abstract
Background. Machine learning (ML) refers to computational algorithms designed to learn from patterns in data to provide insights or predictions related to that data.
Objective. Multiple studies report the development of predictive models for triage or identification of critically ill children. In this study, we validate machine learning models developed in South Africa for the identification of critically ill children presenting to a tertiary hospital.
Results. The validation sample comprised 267 patients. The event rate for the study outcome was 0.12. All models demonstrated good discrimination but weak calibration. Artificial neural network 1 (ANN1) had the highest area under the receiver operating characteristic curve (AUROC) with a value of 0.84. ANN2 had the highest area under the precision-recall curve (AUPRC) with a value of 0.65. Decision curve analysis demonstrated that all models were superior to standard strategies of treating all patients or treating no patients at a proposed threshold probability of 10%. Confidence intervals for model performance overlapped considerably. Post hoc model explanations demonstrated that models were logically coherent with clinical knowledge.
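The metrics reported above can be illustrated with a short sketch. This is not the study's code: the outcome labels and predicted probabilities below are synthetic, generated only to mirror the study's sample size (267) and event rate (0.12), and the net-benefit function implements the standard decision curve analysis formula evaluated at the proposed 10% threshold probability.

```python
# Illustrative sketch (not the study's code): discrimination and
# decision-curve metrics of the kind reported in the Results.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)

# Synthetic validation set: 267 patients, ~12% event rate, as in the study.
n = 267
y_true = (rng.random(n) < 0.12).astype(int)
# Hypothetical predicted probabilities, loosely correlated with the outcome.
y_prob = np.clip(0.12 + 0.3 * y_true + rng.normal(0, 0.15, n), 0.001, 0.999)

auroc = roc_auc_score(y_true, y_prob)            # discrimination (ROC)
auprc = average_precision_score(y_true, y_prob)  # precision-recall summary

def net_benefit(y, p, threshold):
    """Net benefit at a threshold probability (decision curve analysis):
    TP/n - FP/n * threshold / (1 - threshold)."""
    flagged = p >= threshold
    tp = np.sum(flagged & (y == 1))
    fp = np.sum(flagged & (y == 0))
    return tp / len(y) - fp / len(y) * threshold / (1 - threshold)

# Compare the model against the two standard strategies at pt = 0.10.
nb_model = net_benefit(y_true, y_prob, 0.10)
nb_treat_all = net_benefit(y_true, np.ones(n), 0.10)  # "treat all" strategy
nb_treat_none = 0.0                                   # "treat none" strategy
```

A model is clinically useful at a given threshold only if its net benefit exceeds both the "treat all" and "treat none" strategies, which is the comparison the decision curve analysis in this study makes.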
Conclusions. Model performance on internal validation was consistent with that reported in the development study. The models were able to discriminate between critically ill and non-critically ill children; however, the superiority of one model over the others could not be demonstrated in this study. Models such as these therefore require further refinement and external validation before implementation in clinical practice. Successful implementation of machine learning in the South African setting will also require the development of regulatory and infrastructural frameworks, in conjunction with the adoption of alternative approaches to electronic data capture, such as the use of mobile devices.
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The SAJCC is published under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) licence. Under this licence, authors agree to make articles available to users, without permission or fees, for any lawful, non-commercial purpose. Users may read, copy, or re-use published content as long as the author and original place of publication are properly cited.
Exceptions to this licence model are allowed for UKRI and for research funded by organisations requiring that research be published open access without embargo, under a CC BY licence. As per the journal's archiving policy, authors are permitted to self-archive the author-accepted manuscript (AAM) in a repository.