The X-ray crystallography phase problem solved thanks to AlphaFold and RoseTTAFold models: a case study report
Irène Barbarin-Bocahu, Marc Graille [1]
Molecular Tour
The amino acid sequence of a protein determines the specific three-dimensional (3D) structure into which it folds, and this structure is crucial to its cellular and biochemical functions. Knowing the 3D structures of proteins is therefore very important for understanding their biological and biochemical functions, as well as the potential impact of disease-associated mutations. Very high accuracy in protein 3D structure prediction has recently been reached by the deep-learning-based programs AlphaFold and RoseTTAFold.
Here, we take advantage of these high-quality models to solve the crystal structure of a yeast protein involved in an mRNA quality control pathway. This article illustrates one important application of the 3D protein structure models generated by the AlphaFold and RoseTTAFold programs: solving the X-ray crystallography phase problem. We discuss the different strategies to generate search models. Finally, this article also contributes to validating the very high quality of these models.
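As an illustration of what generating a search model can involve, the sketch below removes low-confidence residues from a predicted model before it is used for molecular replacement. It is not the authors' protocol. It relies on one known convention, that AlphaFold PDB files store the per-residue confidence score (pLDDT) in the B-factor column, and on assumptions made here for the example: the function names `plddt_of` and `trim_low_confidence` and the cutoff of 70 are hypothetical.

```python
from typing import Iterable, List


def plddt_of(atom_line: str) -> float:
    """Read the B-factor field (PDB columns 61-66) of an ATOM record.

    AlphaFold writes the per-residue pLDDT into this field.
    """
    return float(atom_line[60:66])


def trim_low_confidence(pdb_lines: Iterable[str],
                        min_plddt: float = 70.0) -> List[str]:
    """Drop ATOM records whose pLDDT is below min_plddt.

    The default cutoff of 70 is an illustrative assumption, not a value
    taken from the article. Non-ATOM lines are kept unchanged.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith("ATOM") and plddt_of(line) < min_plddt:
            continue  # skip atoms from poorly predicted residues
        kept.append(line)
    return kept
```

The function takes the lines of a PDB file and returns the filtered lines, which can then be written to a new file and given to a molecular replacement program. The best cutoff, if any, depends on the model and the data.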
The KlNmd4 protein is made of a three-layered α/β/α core consisting of a central five-stranded parallel β-sheet surrounded by six α-helices (α1 to α4, α10 and α11) on one side and five (α5 to α9) on the other side. The structure is colored from its N-terminal (blue) to its C-terminal (red) extremities.
(beige). The full-length KlNmd4 AlphaFold model is colored according to the pLDDT values. The region 81-114 from the KlNmd4 crystal structure is highlighted in pink.
- pLDDT > 90 blue
- 70 < pLDDT < 90 cyan
- 50 < pLDDT < 70 yellow
- pLDDT < 50 orange
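The color legend above can be written as a small lookup function. This is a minimal sketch: the name `plddt_color` is hypothetical, and because the legend does not say which band the exact boundary values (90, 70, 50) belong to, the code assigns them to the lower band as an assumption.

```python
def plddt_color(plddt: float) -> str:
    """Map a pLDDT score (0-100) to the color used in the legend.

    Boundary values (exactly 90, 70 or 50) fall into the lower band;
    this is an assumption, since the legend leaves them unspecified.
    """
    if plddt > 90:
        return "blue"
    if plddt > 70:
        return "cyan"
    if plddt > 50:
        return "yellow"
    return "orange"


# Example: colors for a few illustrative scores (not data from the article)
for score in (95.0, 80.0, 60.0, 30.0):
    print(score, plddt_color(score))
```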
References
- Barbarin-Bocahu I, Graille M. The X-ray crystallography phase problem solved thanks to AlphaFold and RoseTTAFold models: a case-study report. Acta Crystallogr D Struct Biol. 2022 Apr 1;78(Pt 4):517-531. Epub 2022 Mar 16. PMID:35362474. doi:10.1107/S2059798322002157