An AI company called DeepMind, which was acquired by Google, may just have solved one of the most important challenges in biology. It’s a breakthrough that rivals the discovery of DNA’s double helix. It could change biotech, drug discovery, and even vaccine development forever.
In a major scientific advance, the latest version of DeepMind’s AI system AlphaFold has been recognized as a solution to the 50-year-old grand challenge of protein structure prediction, often referred to as the ‘protein folding problem’, according to a rigorous independent assessment. This breakthrough could significantly accelerate biological research over the long term, unlocking new possibilities in disease understanding and drug discovery, among other fields.
Thanks to AI, we now have stunningly powerful tools to decode the structures of life. Recently, both DeepMind and the University of Washington described deep learning-based methods for solving protein folding, the last step in compiling the programming in our DNA. The work has been hailed as a “once in a generation advance.”
Results from the CASP14 competition show that DeepMind’s latest AlphaFold system achieves incredible levels of accuracy in structure prediction. The system is able to determine highly accurate structures in a matter of days. CASP, the Critical Assessment of protein Structure Prediction, is a community-run assessment started in 1994, and it’s the standard for assessing predictive techniques. Participants must blindly predict the structure of proteins that have only recently, or in some cases not yet, been experimentally determined, and wait for their predictions to be compared to the experimental data.
Proteins are the constituents of life. They form our bodies, fuel our metabolism, and are the target of most of today’s medicines. They start out as a ribbon shape, translated from DNA, and subsequently fold into extremely complex three-dimensional architectures. Many proteins further assemble into massive, moving structures that can change their structure depending on their functional needs at the moment.
Misfolded proteins can be devastating, causing extreme health problems from sickle cell anemia to cancer and Alzheimer’s disease. One of biology’s grandest mysteries for the past 50 years has been deciphering how a simple one-dimensional ribbon-like structure turns into such 3D shapes.
“Lots of people have broken their head on it,” said Professor John Moult at the University of Maryland.
CASP uses the “Global Distance Test (GDT)” metric to assess accuracy, ranging from 0-100. The new AlphaFold system achieves a median score of 92.4 GDT overall across all targets. The system’s average error is approximately 1.6 Angstroms, about the width of an atom. According to Professor John Moult, also a co-founder and chair of CASP, a score of around 90 GDT is informally considered to be competitive with results obtained from the experimental methods.
“We have been stuck on this one problem, how do proteins fold up, for nearly 50 years,” Moult said. “To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts wondering if we’d ever get there, is a very special moment.”
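To make the GDT score mentioned above more concrete, here is a minimal sketch of how a GDT-style number can be computed. It assumes the common GDT_TS variant, which averages the percentage of residues whose predicted position lies within 1, 2, 4, and 8 Angstroms of the experimental position. The function name `gdt_ts` and the toy coordinates are hypothetical, and the sketch assumes the two structures have already been superimposed; the real assessment searches over many superpositions to find the best score, so treat this as an illustration rather than CASP's actual scoring code.

```python
import math

# Distance cutoffs in Angstroms, as used by the GDT_TS variant (assumed here).
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(predicted, experimental):
    """Return a GDT_TS-style score from 0 to 100.

    Both arguments are equal-length lists of (x, y, z) coordinates, one per
    residue, and are assumed to be already superimposed on each other.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")

    # Per-residue distance between predicted and experimental positions.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]

    # Fraction of residues falling within each cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in CUTOFFS
    ]

    # Average the four fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(CUTOFFS)


# Toy example: four residues, with prediction errors of 0.5, 1.5, 3.0 and 10.0 Angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (5.3, 0.0, 0.0), (10.6, 0.0, 0.0), (21.4, 0.0, 0.0)]
score = gdt_ts(predicted, experimental)
```

In this toy case one residue is within 1 Angstrom, two within 2, and three within both 4 and 8, so the score lands a little above 50. A prediction that matched the experimental coordinates exactly would score 100.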
The challenge mystified chemists and biologists for so long that computer-based protein structure prediction has spawned crowd-sourcing games, global competitions, and even a Nobel Prize in the search for a breakthrough.
DeepMind rose to fame with algorithms that outperformed humans in games such as Go and the entire Atari games list. The win for protein structure prediction, however, marks its most important entry into solving problems in the real world.
DeepMind isn’t the only contender in the protein folding game. AlphaFold relies on biological data and insights. This week, a group of experimental scientists also delivered. By tactically changing the genes of a complicated protein assembly and observing the outcome, the team was able to build an algorithm that reconstructs the protein with extremely high accuracy.
Together, we’re on a fast-track to a paradigm shift. “This will change medicine,” said Dr. Andrei Lupas at the Max Planck Institute for Developmental Biology. “It will change research. It will change bioengineering. It will change everything.”
A central belief in biology is “structure explains function.” The discovery of the double helix shape of DNA, for example, led to insights into how genetic information is copied and stored. Without the structure, we wouldn’t have gene editing, DNA computers, or storage devices.
Protein structures contain as much, if not more, information. But they’re far harder to decipher. Based on enormously complicated biophysics, much of which remains mysterious, the string of amino acids folds into delicate shapes, such as sheets of twisting and turning strands, or helices that wrap around each other. Many of these structures further couple into a megaplex. Only then can they function as intended to sustain life.
If we know a protein’s structure, we can make educated guesses about its function. And by mapping thousands of protein structures, we can begin to decipher the biology of life, and find ways to manipulate it.
As with all scientific research, there’s still more to be understood, including how multiple proteins form complexes, how they interact with DNA, RNA, or small molecules, and how to determine the precise location of all amino acid side chains. AlphaFold nonetheless breaks new ground in demonstrating the stunning potential of AI as a tool to aid fundamental scientific discovery, and DeepMind looks forward to collaborating with others to unlock that potential. “AlphaFold is one of our most significant advances to date,” wrote the DeepMind team. The progress “gives us further confidence that AI will become one of humanity’s most useful tools in expanding the frontiers of scientific knowledge, and we’re looking forward to the many years of hard work and discovery ahead!”