Using educational data mining techniques to increase the prediction accuracy of student academic performance

Gomathy Ramaswami (School of Natural and Computational Sciences (SNCS), Massey University - Albany Campus, North Shore City, New Zealand)
Teo Susnjak (School of Natural and Computational Sciences (SNCS), Massey University - Albany Campus, North Shore City, New Zealand)
Anuradha Mathrani (School of Natural and Computational Sciences (SNCS), Massey University - Albany Campus, North Shore City, New Zealand)
James Lim (Department of Civil Engineering, University of Auckland, Auckland, New Zealand)
Pablo Garcia (Xorro Solutions, Auckland, New Zealand)

Information and Learning Sciences

ISSN: 2398-5348

Publication date: 8 July 2019

Abstract

Purpose

This paper aims to evaluate educational data mining methods to increase the predictive accuracy of student academic performance in a university course setting. Student engagement data, collected both in real time and during self-paced activities, supported this investigation.

Design/methodology/approach

Classification data mining techniques have been adapted to predict students’ academic performance. Four algorithms, Naïve Bayes, Logistic Regression, k-Nearest Neighbour and Random Forest, were used to generate predictive models. Process mining features have also been integrated to determine their effectiveness in improving the accuracy of predictions.
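
As a rough illustration of one of the four algorithms named above, the following is a minimal sketch of k-Nearest Neighbour classification with leave-one-out evaluation, written in plain Python. It is not the authors' implementation: the feature names (`activity_rate`, `quiz_score`), the toy values, the labels and the choice of k = 3 are all assumptions made for illustration, and a real study would more likely rely on an established machine learning library.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training rows closest to x."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy: predict each row from all the other rows."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k) == y[i]:
            hits += 1
    return hits / len(X)


# Hypothetical toy data, NOT taken from the paper.
# Each row is (activity_rate, quiz_score), both scaled to the range [0, 1].
X = [
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.70), (0.70, 0.85),  # engaged students
    (0.20, 0.30), (0.30, 0.20), (0.10, 0.25), (0.25, 0.10),  # disengaged students
]
y = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]

if __name__ == "__main__":
    print("Prediction for (0.75, 0.80):", knn_predict(X, y, (0.75, 0.80)))
    print("Leave-one-out accuracy:", loo_accuracy(X, y, k=3))
```

The toy data are deliberately well separated, so the sketch only demonstrates the mechanics of training and evaluation; it says nothing about the accuracy figures reported in the study.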

Findings

The results show that combining general features derived from student activities with process mining features yields some improvement in the accuracy of the predictions. Of the four algorithms, Random Forest is found to be significantly more accurate, in the statistical sense, than the other three. The best-performing classifier model is then validated by predicting students’ final-year academic performance for the subsequent year.

Research limitations/implications

The present study was limited to datasets gathered over one semester and for one course. The outcomes would be more promising if the dataset comprised more courses. Moreover, the addition of demographic information could have offered further insight into students’ performance. Future work will address some of these limitations.

Originality/value

The model developed from this research can provide value to institutions in making process- and data-driven predictions on students’ academic performances.

Citation

Ramaswami, G., Susnjak, T., Mathrani, A., Lim, J. and Garcia, P. (2019), "Using educational data mining techniques to increase the prediction accuracy of student academic performance", Information and Learning Sciences, Vol. 120 No. 7/8, pp. 451-467. https://doi.org/10.1108/ILS-03-2019-0017

Publisher

Emerald Publishing Limited

Copyright © 2019, Emerald Publishing Limited