LLMs to Detect Cyber Child Abuse in Textual Conversations
Baldassarre M. T.; Barletta V. S.; Caivano D.; Lippolis A.; Piccinno A.
2025-01-01
Abstract
In contemporary online interactions, identifying inappropriate language and safeguarding minors from harmful communication is a critical challenge. This study explores the use of Large Language Models (LLMs) to analyze text, detecting patterns indicative of age-specific language and the presence of sexual or pornographic references. The LLaMAntino model was fine-tuned on a dataset of synthetically generated sentences designed to replicate real-world scenarios. The fine-tuned model demonstrated enhanced performance compared to its baseline (LLaMAntino 3 ANITA 8B), providing detailed and context-sensitive explanations for its classifications. The results highlight the potential of LLMs to address sensitive linguistic phenomena with precision, offering a foundation for detecting indirect combinations of sexual references in conversations involving minors. Future work can focus on incorporating real conversational data and involving subject-matter experts to refine the model's interpretability and reliability. Additionally, the exploration of advanced architectures and fine-tuning techniques will be considered to further balance model complexity and processing efficiency.


