Predicting Dropout in Online Learning Environments

Abstract

Online learning environments have become popular in recent years. Because of their high attrition rates, student dropout has become a problem of great importance for course designers and course makers. In this paper, we used lasso and ridge logistic regression to build a dropout prediction model on the Open University database. We investigated how early dropout can be predicted and why dropouts occur. To answer the first question, we built models for eight time frames, ranging from the beginning of the course to mid-term. Because we used two definitions of dropout, each time frame yields two results: the AUC of the prediction model is 0.549 and 0.661 at the beginning of the course and rises to 0.681 and 0.869 at mid-term. By analyzing the logistic regression coefficients, we showed that demographic features of the student and course description features are the most important variables for dropout prediction at the beginning of the course, while student activity gains importance later.
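
The modelling idea in the abstract can be illustrated with a small, dependency-free sketch: fit lasso (L1) and ridge (L2) logistic regression on the features available in each time frame and compare the AUC. Everything here is an assumption made for illustration and not the paper's pipeline: the data are synthetic (not the Open University data), only two hypothetical time frames are used instead of eight, and the feature names, penalty strength, learning rate, and train/test split are arbitrary choices.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, amount):
    """Proximal step for the L1 (lasso) penalty: shrink value toward zero."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_logreg(X, y, penalty, lam=0.01, lr=0.5, epochs=200):
    """Fit penalized logistic regression by (proximal) gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        for j in range(d):
            if penalty == "l2":  # ridge: gradient of (lam / 2) * w^2
                w[j] -= lr * (grad_w[j] / n + lam * w[j])
            else:  # lasso: gradient step, then soft-thresholding
                w[j] = soft_threshold(w[j] - lr * grad_w[j] / n, lr * lam)
    return w, b


def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_toy_data(n=500, seed=0):
    """Synthetic students: columns are [demographic, activity, noise]."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        demo, activity, noise = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        p_dropout = sigmoid(0.3 * demo - 1.5 * activity)  # assumed toy relationship
        X.append([demo, activity, noise])
        y.append(1 if rng.random() < p_dropout else 0)
    return X, y


def run_demo():
    X, y = make_toy_data()
    # Hypothetical time frames: activity only becomes observable later on.
    frames = {"start": [0, 2], "midterm": [0, 1, 2]}
    results = {}
    for frame, cols in frames.items():
        Xf = [[row[c] for c in cols] for row in X]
        X_train, y_train, X_test, y_test = Xf[:300], y[:300], Xf[300:], y[300:]
        for penalty in ("l1", "l2"):
            w, b = fit_logreg(X_train, y_train, penalty)
            scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, r))) for r in X_test]
            results[(frame, penalty)] = auc(y_test, scores)
    return results


results = run_demo()
for key, value in sorted(results.items()):
    print(key, round(value, 3))
```

The hand-written optimizer is there only to make the two penalties visible; in practice a library implementation such as scikit-learn's `LogisticRegression` with `penalty="l1"` or `penalty="l2"` would normally be used. The fitted weights `w` play the role of the coefficients that the abstract inspects to see which features matter in each time frame.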

Publication
In Computer Science and Information Systems
Sandro Radovanović
Assistant Professor at University of Belgrade

My research interests include machine learning, development and design of decision support systems, decision theory, and fairness and justice concepts in algorithmic decision making.