A Framework for Integrating Domain Knowledge in Logistic Regression with Application to Hospital Readmission Prediction

Sandro Radovanović, Boris Delibašić, Miloš Jovanović, Milan Vukićević, Milija Suknović

December, 2019

Abstract

Machine learning algorithms are commonly understood to discover and extract knowledge from the data at hand. However, a large amount of knowledge is already available in machine-readable format and is ready for inclusion in machine learning algorithms and models. In this paper, we propose a framework that integrates domain knowledge, in the form of ontologies or hierarchies, into logistic regression using stacked generalization. Specifically, relations from the ontology or hierarchy are used in a stacking manner to obtain higher-level, more abstract concepts, which are then used for prediction. We apply the framework to unplanned 30-day hospital readmission, which is considered one of the major problems in healthcare. The proposed framework yields better results than Ridge, Lasso, and Tree Lasso Logistic Regression. Results suggest that the proposed framework improves AUC by up to 9.5% on pediatric datasets and up to 4% on morbidly obese patients’ datasets, and improves AUPRC by up to 5.7% on pediatric datasets and up to 2.6% on morbidly obese patients’ datasets, on average. This indicates that the inclusion of domain knowledge improves the predictive performance of Logistic Regression.
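The stacking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a two-level hierarchy, hypothetical concept names ("respiratory", "cardiac"), toy binary diagnosis codes, and in-sample concept predictions (stacked generalization would normally use cross-validated predictions). One logistic regression is fitted per parent concept on its child features, and the resulting concept probabilities feed a final logistic regression.

```python
import math
from itertools import product

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logreg(X, y, lr=0.5, epochs=300, l2=0.01):
    """Fit L2-regularised logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (gw[j] / n + l2 * wj) for j, wj in enumerate(w)]
        b -= lr * gb / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def select(X, cols):
    # Keep only the leaf columns that belong to one parent concept.
    return [[row[c] for c in cols] for row in X]

def fit_stacked(X, y, hierarchy):
    """hierarchy maps a parent concept name to its leaf column indices."""
    concept_models, concept_probs = {}, []
    for name, cols in hierarchy.items():
        m = fit_logreg(select(X, cols), y)
        concept_models[name] = (cols, m)
        concept_probs.append(predict_proba(m, select(X, cols)))
    Z = [list(r) for r in zip(*concept_probs)]  # one column per concept
    return concept_models, fit_logreg(Z, y)

def predict_stacked(model, X):
    concept_models, top = model
    probs = [predict_proba(m, select(X, cols))
             for cols, m in concept_models.values()]
    return predict_proba(top, [list(r) for r in zip(*probs)])

# Toy data (assumption): four binary diagnosis codes; the label is 1
# whenever either of the first two codes is present.
X = [list(bits) for bits in product([0, 1], repeat=4)]
y = [1 if row[0] or row[1] else 0 for row in X]
hierarchy = {"respiratory": [0, 1], "cardiac": [2, 3]}  # hypothetical

model = fit_stacked(X, y, hierarchy)
print(predict_stacked(model, [[1, 0, 0, 0], [0, 0, 1, 0]]))
```

In this toy setup the first patient (a code under the informative concept) should receive a higher predicted probability than the second. A deeper ontology would repeat the concept-level step once per level, passing each level's outputs upward.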

Type

Journal article

Publication

In International Conference on INnovations in Intelligent SysTems and Applications

Sandro Radovanović

Assistant Professor at University of Belgrade