AlphaFold – a Game Changer in Structural Biology?

July 25, 2022
Maximilian Edich

Knowing a protein’s 3D structure enables scientists to investigate its general shape and stability, deduce its potential function, and run drug-binding simulations to find a cure if the protein is from a pathogen. However, solving a structure is not an easy task, as proteins are too small to observe with optical microscopes. Furthermore, the process of collecting and preparing a protein sample and getting the final structure out of the experimental data can take several weeks or even months and is, hence, quite expensive.
In 2021, everything changed when AlphaFold2 was published: a new protein structure prediction software that is not only incredibly accurate but also easy to use. The news was immediately dominated by hype, but did AlphaFold really change structural biology forever, and will it make conventional methods obsolete?

Making Protein Structures Visible

Before we dive into the solution to an old problem of structural biology, we have to cover a few basics. So, what are proteins again? In a nutshell, proteins are tiny nano-sized machines that perform all kinds of tasks in your body. Some proteins organize the replication of your cells, while others digest your food. Another group of proteins makes up your hair, and plenty of other jobs in your body are also done by proteins. A protein’s capabilities are determined by its shape, which is why its exact 3D structure is of great interest. Unfortunately, proteins are smaller than the wavelength of visible light, so we cannot observe them with any optical microscope. Wait, smaller than light? How is this possible, you may ask? This picture should help to explain:

Figure 1: Wavelengths in comparison to objects of vastly different size scales. Note that atoms and proteins are much smaller than the smallest wavelength of visible light, making objects on such scales literally invisible to us. The exact wavelength of light also determines the color we perceive.

While visible light can easily interact with objects that are larger than its own wavelength, it does not interact much with smaller objects like proteins and atoms. Luckily, we can also create “light” in the invisible spectrum and measure the proteins with a detector instead of our eyes. One method to do this is X-ray crystallography. The details are complicated, but the important steps are to produce your protein in bacteria and purify it, grow crystals out of your protein (yes, you can actually do this!) and shoot X-ray beams at them. With the data from the detector, you are able to create a 3D model of your protein, which may not represent reality perfectly, but is accurate enough to work with. You can find out more about these 3D models in another blog post.

However, a lot can go wrong with this: the protein could kill the bacteria that are supposed to produce it for you, crystals might not form properly, and turning the collected data into a model is not straightforward either. All in all, this method can produce the desired result, but only with a high investment of both money and time.

The Protein Folding Problem

Proteins consist of long chains of amino acids. Most life on earth, us included, uses twenty different amino acids, each one with its own unique properties. For example, some are positively charged and like to be surrounded by negatively charged molecules or water. Others are not charged at all and prefer to hide inside a protein core and stay away from the water in which the protein is located. A protein’s amino acids are chained together in sequence, and their properties define the shape, the final fold, of a protein. So, if you changed one amino acid in the chain into another one with vastly different properties, the whole protein would fold a bit differently. In fact, small changes might only modify the fold slightly, but multiple and large differences in the sequence usually result in large changes in the fold. Since shape determines a protein’s function, the new protein might also do different things. The important takeaway here is: the information about the 3D structure is hidden in the sequence of amino acids. However, how exactly the chain folds into its final shape remained a mystery, so the process could not be replicated in simulations. This mystery is today known as the protein folding problem.

The hidden information in the amino acid sequence motivated many scientists around the globe to work on protein structure prediction software. Soon, those scientists started to compete against each other in the CASP competition – the Critical Assessment of Techniques for Protein Structure Prediction. Every two years since 1994, the state-of-the-art techniques have been measured against each other, but for decades no method was reliable enough, and all of them made many mistakes. Only recently, at CASP13 in 2018, DeepMind’s AlphaFold succeeded for the first time with predictions of phenomenal quality for many of the given input sequences. Nevertheless, there was still much room for improvement. In 2020, AlphaFold2 followed and, for the very first time in history, produced structures almost as good as the conventionally solved ones. The protein folding problem itself has been known for more than 50 years now, but only with current technology and new approaches like deep learning did structure prediction finally become reality.

How accurate is AlphaFold?

When people talk about AlphaFold, they usually mean AlphaFold2. While scientists use calculations to measure the difference between a prediction and the experimentally solved structure it should resemble, probably the best way to demonstrate its accuracy to a newcomer in the field is a picture:

Figure 2: AlphaFold2 predictions (blue) in comparison to structures obtained from experimental data (green). On the left is the NAB domain from SARS-CoV-2 protein nsp3 (PDB code 7LGO), on the right is the Mac1 domain from the same protein (PDB code 6WEY).

The green structures are the ones obtained from experimental data; the blue structures are predicted by AlphaFold2. It is already a challenging task to predict which parts fold into helices and sheets (the spirals and the arrows), but AlphaFold2 even predicted the correct arrangement of those and also of the parts in between, resulting in nearly perfect agreement. The biggest deviations between experimental structure and prediction are usually found at the ends of the chains. While AlphaFold2 does not give such excellent results for all proteins, it still performs quite reliably and even provides feedback on its own confidence, making it easy to spot regions where the fold may be wrong.

Figure 3: AlphaFold2 prediction of Ubl1 domain of the protein nsp3 from SARS-CoV. It is colored according to AlphaFold2’s confidence, where deep blue means high confidence about a correct fold and red means low confidence.
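
For readers who want to look at these confidence values themselves: in the model files written by AlphaFold2, the per-residue confidence (a score from 0 to 100 called pLDDT) is stored in the column that normally holds the B-factor of a PDB file. The following Python sketch reads it from the C-alpha atoms. The helper names `atom_line` and `read_confidence`, the three made-up residues and the cutoff of 70 are assumptions chosen for illustration only; they are not part of AlphaFold.

```python
def atom_line(serial, name, res_name, chain, res_seq, xyz, confidence):
    """Build one fixed-column PDB ATOM line (illustrative helper only)."""
    x, y, z = xyz
    return (
        "ATOM".ljust(6) + str(serial).rjust(5) + " " + name.center(4) + " "
        + res_name.ljust(3) + " " + chain + str(res_seq).rjust(4) + " " * 4
        + f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{confidence:6.2f}"
    )


def read_confidence(pdb_lines):
    """Return {residue number: confidence} taken from the C-alpha atoms.

    Assumes an AlphaFold-style model in PDB format, where the B-factor
    column (characters 61-66 of each ATOM line) holds the confidence.
    """
    confidence = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residue_number = int(line[22:26])
            confidence[residue_number] = float(line[60:66])
    return confidence


# Made-up atoms for illustration, not taken from a real prediction.
example = [
    atom_line(1, "N", "MET", "A", 1, (9.0, 9.0, 9.0), 45.2),
    atom_line(2, "CA", "MET", "A", 1, (10.0, 10.0, 10.0), 45.2),
    atom_line(10, "CA", "LYS", "A", 2, (13.8, 10.0, 10.0), 72.5),
    atom_line(19, "CA", "VAL", "A", 3, (17.6, 10.0, 10.0), 95.1),
]

scores = read_confidence(example)
# Residues below the (assumed) cutoff of 70 deserve a closer look.
uncertain = [number for number, score in scores.items() if score < 70.0]
print(scores)
print(uncertain)
```

With a real prediction, one would pass the lines of the downloaded model file instead of the made-up `example` list.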

To put the breakthrough of AlphaFold2 into perspective: to measure how similar the predictions are to experimentally solved structures, one can calculate a similarity value with one of many available metrics. One metric used in the CASP competition is the GDT, the global distance test, which returns values from 0% (no similarity at all) to 100% (identical structures). While no earlier method passed the 60% mark on average, AlphaFold2 consistently achieved GDT scores of over 90%.
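
To give an idea of what such a metric does, here is a simplified Python sketch of the commonly reported variant GDT_TS. It averages the fraction of residues whose C-alpha atom lies within 1, 2, 4 and 8 Å of its experimental position. Note the assumptions: the function name `gdt_ts` and the toy coordinates are made up for illustration, and the sketch expects both structures to be superimposed already, whereas the real test also searches for the superposition that gives the best score.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS in percent for two pre-superimposed structures.

    `predicted` and `experimental` are equally long lists of (x, y, z)
    C-alpha coordinates in Angstrom, one entry per residue.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("both structures need the same, non-zero length")
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues, displaced by 0.5, 1.5, 3.0 and 10.0 Angstrom.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(gdt_ts(experimental, experimental))  # identical structures: 100.0
print(gdt_ts(predicted, experimental))     # 56.25
```

In the toy example, one residue is within 1 Å, two are within 2 Å, and three are within 4 Å and 8 Å, so the score is the average of 25%, 50%, 75% and 75%.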

Does AlphaFold replace conventional methods?

As we have seen, AlphaFold’s predictions are almost as accurate as the models obtained from X-ray crystallography. Does this mean that we no longer need those expensive and time-consuming methods? Well, for several reasons, this is not quite the case.

First and most importantly, predictions are not reality. They can help to simplify things or point us in the right direction, but they cannot incorporate all the details of real biology. AlphaFold only takes a protein’s sequence of amino acids into account, but in reality, proteins are surrounded by water, small molecules and other proteins, and all of these affect the fold as well. Depending on the environment and on interactions with other molecules, some proteins even switch between multiple folds that are quite stable. Therefore, AlphaFold predictions are only a small part of the complete picture.

Another problem is posed by the so-called membrane proteins, a whole class of proteins that anchor themselves in a membrane, for example the one that encloses a cell. Although AlphaFold does predict the individual parts across the membrane correctly, those parts are rarely arranged correctly relative to each other. The same problem occurs with very large proteins consisting of many smaller folded parts.

Last but not least, there are many proteins out there which were not predicted correctly by AlphaFold, so there’s still some room for improvement.
It is also important to mention that AlphaFold was trained on the PDB, a database of experimentally solved protein structures. Without new experimental data, prediction software like AlphaFold cannot be improved.

In summary, AlphaFold is good, but it is still not perfect.

In any case, the new predictions are useful not only to get a first glimpse of an unknown structure, but also to help solve structures by conventional methods. Remember that protein crystals are not always easy to grow? Well, this is sometimes due to regions which do not form a stable fold. AlphaFold2 can predict those regions and, thus, helps to design more successful experiments.
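
As a rough sketch of how such regions could be picked out, the following Python function lists stretches of consecutive residues with low prediction confidence. The function name `low_confidence_segments`, the cutoff of 50, the minimum length of three residues and the example scores are all assumptions chosen for illustration, and low confidence is only a hint that a region may be flexible, not proof.

```python
def low_confidence_segments(scores, cutoff=50.0, min_length=3):
    """Return (start, end) residue numbers of low-confidence stretches.

    `scores` holds one confidence value per residue, in chain order.
    Residue numbers are 1-based and both ends are inclusive. Only runs
    of at least `min_length` consecutive scores below `cutoff` count.
    """
    segments = []
    start = None  # first residue of the run currently being tracked
    for number, score in enumerate(scores, start=1):
        if score < cutoff:
            if start is None:
                start = number
        else:
            if start is not None and number - start >= min_length:
                segments.append((start, number - 1))
            start = None
    # A run may extend to the very end of the chain.
    if start is not None and len(scores) - start + 1 >= min_length:
        segments.append((start, len(scores)))
    return segments


# Made-up confidence values for a 13-residue chain.
example_scores = [35, 40, 42, 88, 93, 95, 91, 45, 90, 30, 28, 33, 25]
print(low_confidence_segments(example_scores))  # [(1, 3), (10, 13)]
```

In this made-up chain, both ends are flagged, while the single low value at residue 8 is ignored because it is shorter than the minimum length. Trimming such flagged ends from a construct is one way a prediction could inform the design of a crystallization experiment.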

All in all, it is an incredible tool that not only generates knowledge in a matter of hours instead of days, but can also be used by anyone, without laboratory access or expertise in structural biology. From here, we can only be curious about what the next generation of structure prediction will be capable of, as AlphaFold2 has already helped scientists around the globe to reveal the structural mysteries of various proteins.

