Today we report the development and initial applications of RoseTTAFold, a software tool that uses deep learning to quickly and accurately predict protein structures based on limited information. Without the aid of such software, it can take years of laboratory work to determine the structure of just one protein. With RoseTTAFold, a protein structure can be computed in as little as ten minutes on a single gaming computer. This work was led by Baker lab postdoctoral scholar Minkyung Baek, Ph.D.
RoseTTAFold is a “three-track” neural network, meaning it simultaneously considers patterns in protein sequences, how a protein’s amino acids interact with one another, and a protein’s possible three-dimensional structure. In this architecture, one-, two-, and three-dimensional information flows back and forth, allowing the network to collectively reason about the relationship between a protein’s chemical parts and its folded structure.
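To make the idea of information flowing between tracks concrete, here is a minimal toy sketch in Python with NumPy. It is not the RoseTTAFold implementation: the real network uses learned layers, whereas this sketch swaps in fixed arithmetic with no trainable weights. The function name `three_track_round`, the array shapes, and the `step` parameter are all assumptions made for illustration only.

```python
import numpy as np


def three_track_round(seq, pair, xyz, step=0.1):
    """One toy round of information exchange between three tracks.

    seq  : (L, d)    per-residue features (the 1D track)
    pair : (L, L, d) residue-pair features (the 2D track)
    xyz  : (L, 3)    one coordinate per residue (the 3D track)

    Hypothetical illustration only; no learned parameters are involved.
    """
    # diff[i, j] = xyz[j] - xyz[i]; dist[i, j] is the distance between them.
    diff = xyz[None, :, :] - xyz[:, None, :]
    dist = np.linalg.norm(diff, axis=-1)

    # 1D -> 2D and 3D -> 2D: the pair track receives a combination of the
    # features of residues i and j plus their current distance.
    pair = np.tanh(pair + seq[:, None, :] * seq[None, :, :] + dist[:, :, None])

    # 2D -> 1D: each residue summarizes its row of the pair track.
    seq = np.tanh(seq + pair.mean(axis=1))

    # 2D -> 3D: pair features become softmax weights over partners j, and
    # each residue takes a small step toward its weighted partners.
    logits = pair.mean(axis=-1)
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights = weights / weights.sum(axis=1, keepdims=True)
    xyz = xyz + step * (weights[:, :, None] * diff).sum(axis=1)

    return seq, pair, xyz


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, d = 5, 4  # five residues, four feature channels (arbitrary toy sizes)
    seq = rng.normal(size=(L, d))
    pair = np.zeros((L, L, d))
    xyz = rng.normal(size=(L, 3))
    for _ in range(3):  # repeated rounds let the tracks keep informing each other
        seq, pair, xyz = three_track_round(seq, pair, xyz)
    print(seq.shape, pair.shape, xyz.shape)  # (5, 4) (5, 5, 4) (5, 3)
```

Each round updates all three representations from one another, which is the sense in which information "flows back and forth" here; the specific update rules are placeholders and carry no biological meaning.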
As reported in Science, our team used RoseTTAFold to compute hundreds of new protein structures, including many poorly understood proteins from the human genome. We also generated structures directly relevant to human health, including for proteins associated with problematic lipid metabolism, inflammation disorders, and cancer cell growth. And we show that RoseTTAFold can be used to build models of complex biological assemblies in a fraction of the time previously required.
“In just the last month, over 4,500 proteins have been submitted to our new web server, and we have made the RoseTTAFold code available through the GitHub website. We hope this new tool will continue to benefit the entire research community.”
This work was supported in part by Microsoft, Open Philanthropy Project, Schmidt Futures, Washington Research Foundation, National Science Foundation, Wellcome Trust, and the National Institutes of Health. A full list of supporters is available in the manuscript.
Baek M, et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science. July 2021.