The amino acid sequence of gene products is routinely deduced from the nucleotide sequence of the relative cloned cDNA, according to the rules for recognition of start codon (first-AUG rule, optimal sequence context) and the genetic code. From this prediction stem most subsequent types of product analysis, although all standard methods for cDNA cloning are affected by a potential inability to effectively clone the 5' region of mRNA. Revision by bioinformatics and cloning methods of 109 known genes located on human chromosome 21 (HC 21) shows that 60 mRNAs lack any in-frame stop upstream of the first-AUG, and that in five cases (DSCR1, KIAA0184, KIAA0539, SON, and TFF3) the coding region at the 5' end was incompletely characterized in the original descriptions. We describe the respective consequences for genomic annotation, domain and ortholog identification, and functional experiments design. We have also analyzed the sequences of 13,124 human mRNAs (RefSeq databank), discovering that in 6448 cases (49%), an in-frame stop codon is present upstream of the initiation codon, while in the other 6676 mRNAs (51%), identification of additional bases at the mRNA 5' region could well reveal some new upstream in-frame AUG codons in the optimal context. Proportionally to the HC 21 data, about 550 known human genes might thus be affected by this 5' end mRNA artifact.

mRNA 5' region sequence incompleteness: a potential source of systematic errors in translation initiation codon assignment in human mRNAs

D'ADDABBO, PIETRO;
2003-01-01

Abstract

The amino acid sequence of gene products is routinely deduced from the nucleotide sequence of the relative cloned cDNA, according to the rules for recognition of start codon (first-AUG rule, optimal sequence context) and the genetic code. From this prediction stem most subsequent types of product analysis, although all standard methods for cDNA cloning are affected by a potential inability to effectively clone the 5' region of mRNA. Revision by bioinformatics and cloning methods of 109 known genes located on human chromosome 21 (HC 21) shows that 60 mRNAs lack any in-frame stop upstream of the first-AUG, and that in five cases (DSCR1, KIAA0184, KIAA0539, SON, and TFF3) the coding region at the 5' end was incompletely characterized in the original descriptions. We describe the respective consequences for genomic annotation, domain and ortholog identification, and functional experiments design. We have also analyzed the sequences of 13,124 human mRNAs (RefSeq databank), discovering that in 6448 cases (49%), an in-frame stop codon is present upstream of the initiation codon, while in the other 6676 mRNAs (51%), identification of additional bases at the mRNA 5' region could well reveal some new upstream in-frame AUG codons in the optimal context. Proportionally to the HC 21 data, about 550 known human genes might thus be affected by this 5' end mRNA artifact.
File in questo prodotto:
File Dimensione Formato  
2003 Casadei-5UTR.pdf

accesso aperto

Descrizione: Articolo principale
Tipologia: Documento in Versione Editoriale
Licenza: Creative commons
Dimensione 343.37 kB
Formato Adobe PDF
343.37 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11586/196824
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 16
  • ???jsp.display-item.citation.isi??? 15
social impact