Ensemble Predictive Model for Academic Churn Risk Using Plurality Voting
Keywords:
predictive modeling, data mining, academic churn, churn analysis, student attrition
Abstract
Academic churn analysis involves identifying students who are most likely to discontinue schooling. Although churn is unavoidable, timely detection and early intervention have proven to be effective retention mechanisms. This study aimed to develop a model that predicts the likelihood that a student will churn, giving school administrators the insight needed to initiate activities that prevent student attrition. The study examined academic, demographic, and psychological data of students admitted as freshmen from 2005 to 2010 in two programs (Bachelor of Science in Information Technology [BSIT] and Bachelor of Science in Computer Systems [BSCS]) of the University of San Jose-Recoletos, Cebu City, Philippines. The psychological data, representing the personality traits of students, were gathered through the Manchester Personality Questionnaire (MPQ). The study applied an ensemble method in machine learning to create a predictive model that defines profiles for churners and non-churners. The predictive model was created by bagging three different classification models, namely support vector machine (SVM), random forest (RF), and k-nearest neighbor (k-NN), via plurality voting. The performance of the model was evaluated through 10-fold cross-validation, which yielded an overall accuracy of 78%. The model will be integrated into the college student advising system to notify administrators of students who need intervention at their subsequent enrollment.
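To make the plurality-voting step concrete, the following is a minimal sketch of how the class labels predicted by three base classifiers could be combined for each student. It is an illustration under stated assumptions and not the study's implementation: the function names `plurality_vote` and `ensemble_predict`, the labels "churn" and "stay", the sample predictions, and the tie-breaking rule are all hypothetical. Training of the base models and any resampling of the training data are not shown.

```python
from collections import Counter


def plurality_vote(votes):
    """Return the label that received the most votes.

    Assumption: ties are broken in favour of the earliest-listed voter.
    """
    counts = Counter(votes)
    top = max(counts.values())
    # Scan in voter order so that ties are resolved deterministically.
    for label in votes:
        if counts[label] == top:
            return label


def ensemble_predict(per_model_predictions):
    """Combine per-model prediction lists (one list per model), student by student."""
    return [plurality_vote(row) for row in zip(*per_model_predictions)]


# Hypothetical predictions from three base classifiers for four students.
svm_pred = ["churn", "stay", "stay", "churn"]
rf_pred = ["churn", "churn", "stay", "stay"]
knn_pred = ["stay", "churn", "stay", "stay"]

combined = ensemble_predict([svm_pred, rf_pred, knn_pred])
print(combined)  # ['churn', 'churn', 'stay', 'stay']
```

With three voters and two classes a tie cannot occur, so the tie-breaking rule only matters if more classes or an even number of classifiers are used.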