Classification benchmarking of fake account datasets using machine learning models and feature selection strategies

Social network platforms are widely used for social interaction, and as their number of registered users grows, it becomes essential to verify the authenticity of accounts and of the data they generate. Malicious accounts in particular are a major problem that platforms must deal with, which calls for new methodologies and strategies to detect them automatically. To this end, data from social network platforms plays a key role in defining analytical activities devoted to fake account detection. In this proposal, we organize and clean fake account datasets collected from online sources and provide classification results obtained with machine learning models and feature selection strategies. Moreover, we extend the classification results with a newly proposed fake account dataset collected through data crawling. Experimental results obtained by applying several machine learning models and feature selection techniques to the fake account datasets show improved discrimination when feature selection strategies are used. Our proposal aims to support stakeholders, data analysts, and researchers by providing fake account datasets cleaned and organized for analytical activities, together with statistical classification results obtained using machine learning models and feature selection strategies.

Caivano D. (Conceptualization); Cerullo M. (Software); Desiato D. (Methodology); Polese G. (Writing – Review & Editing)

2024-01-01

Year

2024

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/506604
