Profiling rare C-to-U editing events via direct RNA sequencing

Fonzino, Adriano; Mazzacuva, Pietro Luca; Pesole, Graziano; Picardi, Ernesto
2025-01-01

Abstract

In mammals, RNA editing involves the hydrolytic deamination of adenosine (A) to inosine (I) or of cytosine (C) to uracil (U), catalyzed by the ADAR and APOBEC families of enzymes, respectively. Direct RNA (dRNA) sequencing by Oxford Nanopore Technologies (ONT) allows the detection of Us and thus facilitates the identification of edited Cs without reverse transcription and PCR amplification steps. However, dRNA data are noisy, and very rare events such as C-to-U conversions cannot be easily distinguished from background noise or mutation errors. To overcome this issue, we developed a novel machine-learning strategy based on the Isolation Forest (iForest) algorithm to denoise the signal derived from highly informative ONT dRNA data. Here we present a step-by-step protocol illustrating how to use the C-to-U-Classifier package and apply its pretrained iForest models to improve the detection of C-to-U events in mammalian transcriptomes. As an example, we show the whole pipeline in action on data from wild-type (WT) and APOBEC1 knock-out (KO) macrophage cell lines. Additionally, the polishing power of our algorithm is demonstrated on a synthetic in vitro transcribed (IVT) sample in which C-to-U events are absent.
2025
ISBN: 9780443317866

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/575584