Educational Stream Data Analysis: A Case Study

Casalino, Gabriella;Castellano, Giovanna;Vessio, Gennaro
2020-01-01

Abstract

Virtual Learning Environments (VLEs) are Web-based platforms that provide educational content together with study support tools. Logs recording the interactions between students and VLEs are collected daily, so automatic techniques are needed to manage and analyze such large quantities of data. Students, teachers, managers, and all other stakeholders involved in VLE learning activities can benefit from the insights that machine learning techniques extract from educational data. Traditionally, educational data have been studied as stationary data using conventional machine learning methods. However, educational data are non-stationary by nature and are better treated as data streams. In this paper, we present the results of a classification study in which the random forest algorithm, applied in both batch and adaptive mode, is used to build a model for predicting whether students pass or fail their exams. In addition, a feature importance analysis is carried out to identify the most discriminative attributes for the predictive task. Experiments on the Open University Learning Analytics Dataset (OULAD) show that adaptive random forest reliably produces accurate classification models from evolving educational data.
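The contrast between batch and adaptive learning on non-stationary data can be illustrated with a minimal sketch. It is not the authors' method and does not implement adaptive random forest: the classifier is a hypothetical toy nearest-mean learner, the stream is synthetic with one abrupt concept drift halfway through, and all names (`NearestMeanClassifier`, `make_stream`, `run`) are assumptions made for illustration. The adaptive model is evaluated in test-then-train (prequential) fashion, while the batch model is trained once on an initial window and then frozen.

```python
import random


class NearestMeanClassifier:
    """Toy incremental classifier (illustrative stand-in, not a random forest).

    Keeps one exponentially weighted mean per class and predicts the class
    whose mean is closest to the input value.
    """

    def __init__(self, decay=0.05):
        self.decay = decay
        self.means = {}

    def predict(self, x):
        if not self.means:
            return 0  # arbitrary default before any example has been seen
        return min(self.means, key=lambda label: abs(x - self.means[label]))

    def learn(self, x, y):
        if y not in self.means:
            self.means[y] = x
        else:
            # Exponential forgetting lets the mean follow a changing concept.
            self.means[y] += self.decay * (x - self.means[y])


def make_stream(n=2000, seed=0):
    """Synthetic stream: the rule that maps x to the label flips at n // 2."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.random()
        passed = x > 0.5 if t < n // 2 else x < 0.5
        yield x, int(passed)


def run(n=2000, warmup=200):
    """Return (frozen_accuracy, adaptive_accuracy) measured after the warm-up."""
    frozen = NearestMeanClassifier()    # batch mode: trained on the warm-up only
    adaptive = NearestMeanClassifier()  # stream mode: updated on every example
    hits_frozen = hits_adaptive = tested = 0
    for t, (x, y) in enumerate(make_stream(n)):
        if t >= warmup:
            # Test first, then train, so each example is unseen when scored.
            tested += 1
            hits_frozen += frozen.predict(x) == y
            hits_adaptive += adaptive.predict(x) == y
        else:
            frozen.learn(x, y)
        adaptive.learn(x, y)
    return hits_frozen / tested, hits_adaptive / tested


if __name__ == "__main__":
    frozen_acc, adaptive_acc = run()
    print(f"frozen (batch) accuracy:    {frozen_acc:.3f}")
    print(f"adaptive (stream) accuracy: {adaptive_acc:.3f}")
```

In this toy setting the frozen model keeps applying the rule it learned before the drift, whereas the adaptive model recovers after a short transition. A real study would replace the toy learner with a batch and a streaming random forest implementation.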
2020
978-1-7281-5200-4

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/307761