Classification benchmarking of fake account datasets using machine learning models and feature selection strategies
Caivano D.; Desiato D.
2024-01-01
Abstract
Social network platforms are widely used for social interaction, and as their number of registered users grows, it becomes increasingly important to verify the authenticity of accounts and of the data they generate. Malicious accounts in particular are a problem that social network platforms have to deal with, which calls for new methodologies and strategies to identify them automatically. To this end, data from social network platforms plays a central role in defining analytical activities devoted to fake account discrimination. In this proposal, we organize and clean fake account datasets collected from online sources and provide classification results obtained with machine learning models and feature selection strategies. Moreover, we extend the classification results with a newly proposed fake account dataset collected through data crawling. Experimental results obtained with several machine learning models and feature selection techniques on the fake account datasets reveal improved discrimination when feature selection strategies are applied. Our proposal aims to support stakeholders, data analysts, and researchers by providing fake account datasets that are cleaned and organized for analytical activities, together with statistical classification results obtained using machine learning models and feature selection strategies.