Educational Stream Data Analysis: A Case Study

Casalino, Gabriella; Castellano, Giovanna; Mannavola, Andrea; Vessio, Gennaro

2020-01-01

Abstract

Virtual Learning Environments (VLEs) are Web-based platforms that provide educational content together with study support tools. Logs recording the interactions between students and VLEs are collected daily, so automatic techniques are needed to manage and analyze such large quantities of data. Students, teachers, managers, and all other stakeholders involved in VLE learning activities can benefit from the insights that machine learning techniques extract from educational data. Traditionally, educational data have been studied as stationary data using conventional machine learning methods. However, educational data are non-stationary by nature and are better treated as data streams. In this paper, we present the results of a classification study in which the random forest algorithm, applied in both batch and adaptive mode, is used to build a model that predicts whether students pass or fail their exams. A feature importance analysis is also carried out to identify the most discriminative attributes for the predictive task. Experiments on the Open University Learning Analytics Dataset (OULAD) show the reliability of adaptive random forest in building accurate classification models from evolving educational data.

Year: 2020
ISBN: 978-1-7281-5200-4
Publication type: 4.1 Contribution in conference proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/307761
