
Journal:Proteins:3

From Proteopedia

Revision as of 14:41, 6 May 2023

Do Newly Born orphan proteins resemble Never Born proteins? A study using three deep learning algorithms

Newly Born proteins, or orphan proteins, have no sequence homology to other proteins and occur in single species or within a taxonomically restricted gene (TRG) family.

Never Born proteins are random polypeptides with amino acid content similar to that of native proteins.
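The definition above can be illustrated with a minimal sketch that samples a random polypeptide whose amino acid composition follows a set of reference sequences. This is not the procedure used to build the library in the cited study; the function names and the two toy reference sequences are hypothetical and serve only as illustration.

```python
import random
from collections import Counter


def composition(sequences):
    """Return the fractional amino acid composition of the given sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def never_born_sequence(length, freqs, rng):
    """Sample a random sequence whose residues follow the given composition."""
    residues = sorted(freqs)
    weights = [freqs[aa] for aa in residues]
    return "".join(rng.choices(residues, weights=weights, k=length))


# Toy placeholder sequences (hypothetical); in practice the composition
# would be estimated from a large set of native proteins.
reference = ["MKTAYIAKQR", "GSHMLEDPVK"]
freqs = composition(reference)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
seq = never_born_sequence(100, freqs, rng)
```

Each position is drawn independently, so the sequence carries no evolutionary signal beyond the overall composition.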

Can recently developed AI/Deep Learning tools for predicting 3D protein structures like:

  • AlphaFold2 (AF2)
  • RoseTTAFold (RTF)
  • Evolutionary Scale Modeling (ESM-2)

be useful to see if Newly Born proteins are similar to Never Born proteins? AF2 and RTF predict, by default, five top models, while ESM-2 predicts only one model. Morphing between the five top models of AF2, and between those of RTF, gives a visual impression of how similar the models are for each method.

True orphan proteins have no sequence homology to any existing protein. We thought, therefore, that the Never Born proteins generated and investigated by Tretyachenko et al.[1] would serve as a valuable benchmark for comparison. In their study they experimentally showed that some Never Born proteins folded into compact structures, e.g., as seen for Sequences #1856 and #6387.

RTF-1856 ESM-1856 AF2-1856
RTF-6387 ESM-6387 AF2-6387

Other Never Born proteins appear experimentally to belong to the category of intrinsically disordered proteins (IDPs)[2], e.g., as seen for Sequence #3703.

RTF-3703 ESM-3703 AF2-3703

We then went on to use the three algorithms on orphan proteins and taxonomically restricted gene products (TRGP) for which no experimental structures were available. We did this in order to see how the predictions of the three algorithms would compare, and whether they would predict novel folds. Although many ORFs have been identified which code for putative orphan proteins, only in a limited number of cases has their association with a well-defined biological activity been established. We have identified seven such proteins for which the necessary sequence data are also available. The number of amino acids for these seven orphans/TRGPs ranges from 109 to 632.

As an initial step in characterizing these seven proteins, we used FoldIndex[3] and flDPnn[4] to investigate whether they were predicted to be intrinsically disordered proteins (IDPs) or folded. Five of the proteins are predicted to be almost completely folded, while the other two, TaFROG and Newtic1, are classified as IDPs, since they are predicted to be disordered throughout almost their entire sequences. Of the seven proteins studied, only HCO_011565, a 632-residue nematode protein that was shown to be the target of the nematocidal small molecule UMW-868[5], appears to be fully folded. For this protein, the three algorithms predict almost identical structures, all with very high pLDDT scores. Most likely, this is for two reasons. Firstly, rather than being a true orphan, HCO_011565 is the product of a TRG[5]: a BLAST search revealed that the first 74 homologous sequences, those with the lowest E values, were all from nematodes. Secondly, the DALI server revealed a number of hits for the entire predicted structure, as well as for the three subdomains predicted by all three algorithms.
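To make the folded-versus-disordered classification concrete, here is a minimal sketch of the charge-hydropathy relation on which FoldIndex is based: I = 2.785⟨H⟩ − |⟨R⟩| − 1.151, where ⟨H⟩ is the mean Kyte-Doolittle hydrophobicity rescaled to the range 0 to 1 and ⟨R⟩ is the mean net charge per residue. The sketch scores a whole sequence at once. The FoldIndex server also reports a sliding-window profile, which is omitted here, and flDPnn is a separate machine-learning predictor that this sketch does not cover. The function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Net charge per residue; histidine is treated as neutral in this sketch.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def fold_index(sequence):
    """Whole-sequence score: positive suggests folded, negative disordered.

    Only the 20 standard one-letter codes are accepted; any other
    character raises KeyError.
    """
    seq = sequence.upper()
    # Rescale hydropathy from [-4.5, 4.5] to [0, 1] before averaging.
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)
    mean_r = sum(CHARGE.get(aa, 0) for aa in seq) / len(seq)
    return 2.785 * mean_h - abs(mean_r) - 1.151
```

A sequence rich in hydrophobic residues and low in net charge scores positive, while a highly charged, polar sequence scores negative, which is the pattern expected for IDPs such as TaFROG.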

RTF-HCO_011565 ESM-HCO_011565 AF2-HCO_011565
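The pLDDT scores mentioned above can be summarized per model with a short sketch. AlphaFold2 writes the per-residue pLDDT into the B-factor column of its PDB output. Whether a given RoseTTAFold or ESM-2 output uses the same column and the same 0 to 100 scale is an assumption that should be checked for each tool. The helper `_atom` and the two-residue demo model are hypothetical and exist only to make the example self-contained.

```python
def mean_plddt(pdb_text):
    """Average the B-factor column (columns 61-66) over all CA atoms."""
    scores = [
        float(line[60:66])
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    ]
    if not scores:
        raise ValueError("no CA atoms found")
    return sum(scores) / len(scores)


def _atom(serial, name, res, chain, resseq, bfactor):
    """Format one fixed-width PDB ATOM record (coordinates set to zero)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, res, chain, resseq, 0.0, 0.0, 0.0, 1.0, bfactor)


# Hypothetical two-residue model used only to exercise the function.
demo = "\n".join([
    _atom(1, " N", "MET", "A", 1, 90.0),
    _atom(2, " CA", "MET", "A", 1, 90.0),
    _atom(3, " CA", "LYS", "A", 2, 70.0),
])
average = mean_plddt(demo)
```

Comparing this average across the models produced by the three algorithms gives a quick numerical counterpart to the visual comparison.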

An example of an orphan protein that is predicted by all three algorithms to be an IDP is the wheat protein TaFROG. It contains 130 amino acids and is localized in the nucleus. It confers resistance in wheat to the mycotoxigenic fungus Fusarium graminearum[6].

RTF-TaFROG ESM-TaFROG AF2-TaFROG

References

  1. Tretyachenko V, Vymětal J, Bednárová L, Kopecký V Jr, Hofbauerová K, Jindrová H, Hubálek M, Souček R, Konvalinka J, Vondrášek J, Hlouchová K. Random protein sequences can form defined secondary structures and are well-tolerated in vivo. Sci Rep. 2017 Nov 13;7(1):15449. PMID:29133927 doi:10.1038/s41598-017-15635-8
  2. Dunker AK, Silman I, Uversky VN, Sussman JL. Function and structure of inherently disordered proteins. Curr Opin Struct Biol. 2008 Dec;18(6):756-64. Epub 2008 Nov 17. PMID:18952168 doi:10.1016/j.sbi.2008.10.002
  3. Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, Beckmann JS, Silman I, Sussman JL. FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics. 2005 Aug 15;21(16):3435-8. Epub 2005 Jun 14. PMID:15955783 doi:10.1093/bioinformatics/bti537
  4. Hu G, Katuwawala A, Wang K, Wu Z, Ghadermarzi S, Gao J, Kurgan L. flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions. Nat Commun. 2021 Jul 21;12(1):4438. PMID:34290238 doi:10.1038/s41467-021-24773-7
  5. Taki AC, Wang T, Nguyen NN, Ang CS, Leeming MG, Nie S, Byrne JJ, Young ND, Zheng Y, Ma G, Korhonen PK, Koehler AV, Williamson NA, Hofmann A, Chang BCH, Häberli C, Keiser J, Jabbar A, Sleebs BE, Gasser RB. Thermal proteome profiling reveals Haemonchus orphan protein HCO_011565 as a target of the nematocidal small molecule UMW-868. Front Pharmacol. 2022 Oct 14;13:1014804. PMID:36313370 doi:10.3389/fphar.2022.1014804
  6. Perochon A, Jianguang J, Kahla A, Arunachalam C, Scofield SR, Bowden S, Wallington E, Doohan FM. TaFROG Encodes a Pooideae Orphan Protein That Interacts with SnRK1 and Enhances Resistance to the Mycotoxigenic Fungus Fusarium graminearum. Plant Physiol. 2015 Dec;169(4):2895-906. PMID:26508775 doi:10.1104/pp.15.01056

Proteopedia Page Contributors and Editors

Joel L. Sussman, Jaime Prilusky

This page complements a publication in scientific journals and is one of Proteopedia's Interactive 3D Complement pages. For additional details please see I3DC.