Simultaneous identification of specifically interacting paralogs and interprotein contacts by direct coupling analysis

Zamparo, Marco
2016-01-01

Abstract

Understanding protein-protein interactions is central to our understanding of almost all complex biological processes. Computational tools exploiting rapidly growing genomic databases to characterize protein-protein interactions are urgently needed. Such methods should connect multiple scales, from evolutionarily conserved interactions between families of homologous proteins, through the identification of specifically interacting proteins in the case of multiple paralogs inside a species, down to the prediction of residues in physical contact across interaction interfaces. Statistical inference methods detecting residue-residue coevolution have recently triggered considerable progress in using sequence data for quaternary protein structure prediction; they require, however, large joint alignments of homologous protein pairs known to interact. The generation of such alignments is a complex computational task in its own right; application of coevolutionary modeling has, in turn, been restricted to proteins without paralogs, or to bacterial systems in which the corresponding coding genes are colocalized in operons. Here we show that the direct coupling analysis of residue coevolution can be extended to connect the different scales, and simultaneously to match interacting paralogs, to identify interprotein residue-residue contacts, and to discriminate interacting from noninteracting families in a multiprotein system. Our results extend the potential applications of coevolutionary analysis far beyond the cases treatable so far.
Use this identifier to cite or link to this document: https://hdl.handle.net/11586/418222