Knowing a protein’s 3D structure enables scientists to investigate its general shape and stability, deduce its potential function, and run drug-binding simulations to find a cure if the protein is from a pathogen. However, solving a structure is not an easy task, as proteins are too small to observe with optical microscopes. Furthermore, the process of collecting and preparing a protein sample and getting the final structure out of the experimental data can take several weeks or even months and is, hence, quite expensive.
In 2021, everything changed when AlphaFold2 was published: a new protein structure prediction software that is not only incredibly accurate but also easy to use. The hype immediately dominated the news, but did AlphaFold really change structural biology forever, and will it make conventional methods obsolete?
Making Protein Structures Visible
Before we dive into the solution to an old problem of structural biology, we have to cover a few basics first. So, what are proteins again? In a nutshell, proteins are tiny nano-sized machines that perform all kinds of tasks in your body. Some proteins organize the replication of your cells, while others digest your food. Another group of proteins makes up your hair, and plenty of other jobs in your body are also performed by proteins. A protein’s capabilities are determined by its shape, which is why its exact 3D structure is of great interest. Unfortunately, proteins are smaller than the wavelength of visible light, so we cannot observe them with any optical microscope. Wait, smaller than light? How is this possible, you may ask? This picture should help to understand this:
While visible light can easily interact with objects that are larger than its own wavelength, it does not interact much with smaller objects like proteins and atoms. Luckily, we can also create “light” in the invisible spectrum and measure the proteins with a detector instead of our eyes. One method to do this is X-ray crystallography. The details are complicated, but the important steps are to produce and purify your protein from bacteria, grow crystals out of your protein (yes, you can actually do this!) and shoot X-ray beams at them. With the data from the detector, you are able to create a 3D model of your protein, which may not represent reality perfectly, but is accurate enough to work with. You can find out more about these 3D models in another blogpost.
However, a lot can go wrong with this: the protein could kill the bacteria that are supposed to produce it for you, crystals might not form properly, and the process of turning the collected data into a model is not straightforward either. All in all, this method can produce the desired result, but only with a high investment of both money and time.
The Protein Folding Problem
Proteins consist of long chains of amino acids. Most life on earth, us included, uses twenty different amino acids, each one with its own unique properties. For example, some are positively charged and like to be surrounded by negatively charged molecules or water. Others are not charged at all and prefer to hide inside a protein core and stay away from the water in which the protein is located. A protein’s amino acids are chained together in sequence and their properties define the shape, the final fold, of a protein. So, if you changed one amino acid in the chain into another one with vastly different properties, the whole protein would fold a bit differently. In fact, small changes might only modify the fold slightly, but multiple large differences in the sequence usually result in large changes in the fold. Since shape determines a protein’s function, the new protein might also do different things. The important takeaway here is: the information about the 3D structure is hidden in the sequence of amino acids. However, how exactly the chain folds into its final shape remained a mystery, and the process could therefore not be replicated in simulations. This mystery is today known as the protein folding problem.
The hidden information in the amino acid sequence motivated many scientists around the globe to work on protein structure prediction software. Soon, those scientists started to compete against each other in the CASP competition, the Critical Assessment of Techniques for Protein Structure Prediction. Every two years since 1994, the state-of-the-art techniques have been measured against each other, but for decades no method was reliable enough, and all of them made many mistakes. Only recently, at CASP13 in 2018, DeepMind’s AlphaFold succeeded for the first time with predictions of phenomenal quality for many of the given input sequences. Nevertheless, there was still much room for improvement. In 2020, AlphaFold2 followed and, for the very first time in history, produced structures almost as good as the conventionally solved ones. The protein folding problem itself has been known for more than 50 years now, but only with current technology and new approaches like deep learning did structure prediction finally become reality.
How accurate is AlphaFold?
When people talk about AlphaFold, they usually mean AlphaFold2. While scientists use calculations to measure the difference between a prediction and the experimentally solved structure it should resemble, probably the best way to demonstrate its accuracy to a newcomer in the field is a picture:
The green structures are the ones obtained from experimental data; the blue structures are predicted by AlphaFold2. It is already a challenging task to predict which parts fold into helices and sheets (the spirals and the arrows), but AlphaFold2 even predicted a correct alignment of those and also the parts in between, resulting in nearly perfect agreement. The biggest deviations between experimental structure and prediction are usually found at the ends of the chains. While it does not give such excellent results for all proteins, it still performs pretty reliably and even provides some feedback on its confidence, making it easy to spot regions with a wrong fold.
And to put the breakthrough of AlphaFold2 into perspective: to measure how similar the predictions are to experimentally solved structures, one can calculate a similarity value with one of many available metrics. One metric used in the CASP competition is the GDT, the global distance test, which returns values from 0% (no similarity at all) to 100% (identical structures). While no earlier method passed the 60% mark on average, AlphaFold2 consistently scored GDTs of over 90%.
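To make the idea of such a similarity value more concrete, here is a minimal Python sketch of a simplified GDT score. It assumes that the two structures have already been superposed and that only one representative atom per amino acid (the C-alpha atom) is compared. It uses the commonly quoted GDT_TS definition, the average fraction of residues within 1, 2, 4 and 8 Å of their experimental position. The real CASP evaluation additionally searches over many superpositions to maximize each fraction, which this sketch does not do, and the function name and toy coordinates are purely illustrative.

```python
import math

def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (in percent) for two already-superposed C-alpha traces.

    Both arguments are equally long lists of (x, y, z) coordinates in Angstrom.
    Assumption: no search over superpositions is performed here.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Distance between each predicted atom and its experimental counterpart
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues that lie within each distance cutoff
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy data (made up for illustration, not a real protein)
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(predicted, experimental)
print(f"GDT_TS = {score:.2f}%")
```

In this toy example the last residue is far off, so it never counts towards any cutoff, which pulls the score down even though the other three positions are close.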
Does AlphaFold replace conventional methods?
As we have seen, AlphaFold’s predictions are almost as accurate as the models obtained from X-ray crystallography. Does this mean that we no longer need those expensive and time-consuming methods? Well, for several reasons, this is not quite the case.
First and most importantly, predictions are not reality. They can help to simplify things or let us work in the right direction, but they cannot incorporate all the details of real biology. AlphaFold just takes a protein’s sequence of amino acids into account, but in reality, proteins are surrounded by water, small molecules and other proteins, and all of these affect the fold as well. Depending on the environment and on interaction with other molecules, some proteins even switch between multiple folds that are quite stable. Therefore, AlphaFold predictions are only a small part of the complete picture.
Another problem is posed by the so-called membrane proteins, a whole class of proteins that anchor themselves in a membrane such as, for example, the wall of a cell. Although AlphaFold does predict the individual parts across the membrane correctly, those are rarely aligned to each other correctly. This alignment problem also occurs with very large proteins consisting of many smaller folded parts.
Last but not least, there are many proteins out there which were not predicted correctly by AlphaFold, so there’s still some room for improvement.
Important to mention is also the fact that AlphaFold was trained on the PDB, a database of experimentally solved protein structures. Without any new experimental data, prediction software like AlphaFold cannot be improved.
In summary, AlphaFold is good, but it is still not perfect.
In any case, the new predictions are useful not only to get a first glimpse of an unknown structure, but also to help solve structures by conventional methods. Remember that protein crystals are not always easy to grow? Well, this is sometimes due to regions which do not form a stable fold. AlphaFold2 can predict those regions and, thus, helps to design more successful experiments.
All in all, it is an incredible tool that not only generates knowledge in a matter of hours instead of days but can also be used by everyone, with no need for laboratory access or expertise in structural biology. From here, we can only be curious what the next generation of structure prediction will be capable of, as AlphaFold2 has already helped scientists around the globe to reveal the structural mysteries of various proteins.
The instructions and files below will allow you to create your own model of the virus! All you need is some spare time and a 3D printer. In addition, those without access to a 3D printer can still use the STL files to request printing from external services and then follow the instructions on assembling the same way. We do hope that this model will make the virus more tangible and that the model will not only be printed as a private project, but also be used for outreach activities and in educational institutions.
These instructions refer to the updated SARS-CoV-2 3D model released in early 2022 that considers new scientific insights and improves on the original. You can find details about the changes (and our reasons for them) in another blog post soon.
Our design is based on the best scientific evidence available. Not only are the shapes of the various proteins as close to the measured molecular structures as possible, but their numbers as well as the overall size of the virion match experimental results on a scale of 1:1,000,000. Therefore, 1 mm on the model represents 1 nm (10 Å). (By the way, this would make the RNA that is inside the virus hull 10 meters long and 1 mm thick.)
For easier printing and assembly, the virus structure has been broken down into individual components:
|Object||Virion (top & bottom)||Spike protein|
|Number of prints||1 each||26 in total|
|Details||The two virion components are completely filled, irregular hemispheres. The virus body was split in two to create flat surfaces. This minimizes the need for supports and reduces the amount of excess material during printing. The outer surfaces contain indentations in which the spike proteins sit at an inclined angle. This model has 26 such holes. The surface of the virion is textured to represent the E and M proteins of SARS-CoV-2.||The spike protein is the most difficult part of the printing process. Each spike consists of a complex crown-like surface at the head and a supporting stalk that connects it to the central virion. The individual spike STL files show the spike in different states and angles.|
To date, the structures have been printed successfully on several Fused Deposition Modelling (FDM) printers (Rostock MAX v2 & Prusa I3 MK3 printers), and we anticipate that even higher-quality structures will be feasible with alternative methods, such as stereolithography (watch this space). Each of the parts is available in STL format and is printable through any suitable slicer software. Personal discretion is advised when setting up the prints, as the exact details may differ depending on conditions and equipment. The procedure outlined below will serve as a good starting point. Let us know of your experience in the comments!
Printing of the component parts
The first step is to print the individual components. For the body parts this is very straightforward, as the flat surface negates the need for supports. The body objects can be printed with the minimum infill for support, though an infill of 10% is recommended for rigidity.
The spike proteins are a more challenging print. The spike protein must be printed 26 times to complete the model. To represent the variety of conformations among the proteins at any given moment, we provide spikes in three different tilt angles, each in both the extended and the retracted state. For a mixture of spike proteins that reflects a real-life virion reasonably well, we recommend this distribution:
- 3x 30° extended
- 4x 30° retracted
- 5x 40° extended
- 7x 40° retracted
- 3x 50° extended
- 4x 50° retracted
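As a quick sanity check before starting a long print job, the distribution above can be written down as a small Python table. This is only a bookkeeping sketch; the dictionary layout and variable names are our own choice and not part of the STL package.

```python
# (tilt angle in degrees, state) -> number of copies to print, as listed above
spike_counts = {
    (30, "extended"): 3,
    (30, "retracted"): 4,
    (40, "extended"): 5,
    (40, "retracted"): 7,
    (50, "extended"): 3,
    (50, "retracted"): 4,
}

# Total number of spikes that will be printed
total_spikes = sum(spike_counts.values())

# Count-weighted average tilt angle of the whole set
mean_tilt = sum(angle * n for (angle, _), n in spike_counts.items()) / total_spikes

# Number of spikes in the extended state
extended = sum(n for (_, state), n in spike_counts.items() if state == "extended")

print(f"{total_spikes} spikes, mean tilt {mean_tilt:.1f} degrees, {extended} extended")
```

The counts add up to the 26 spikes the model needs, and the weighted mean tilt of this mixture comes out at 40°.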
To represent their flexibility, the spikes can be fixed by springs to the body. The springs have to be 3.25 mm in outer diameter and 19 mm in length. We recommend using stainless steel springs. To bend the springs, they are pulled onto several solid wires, which are bent into different spiral shapes. The wires with springs are then placed on a hot plate at 250 °C for 30 minutes. By using this method, different, random spring angles can be realized and the springs can be prevented from bending back into their original position. For the model, 26 bent springs are required.
It is recommended that the spike protein is printed lying sideways, as this results in stronger stalks. It is not too difficult to remove the supports of the stalks without breaking them.
We used FDM printing and ubiquitous poly-lactic acid (PLA), which made the post-processing easier.
A dual extruder printer would be ideal for spike printing as it would allow supports to be printed with water-soluble plastic, speeding up post-processing. In any case, printing individual or at least fewer spikes with greater spacing generally produces nicer objects which are easier to work with at the price of longer printing time.
Regardless of the approach taken for printing, some tidying will typically be needed to get the virion ready for assembly. Removing the supports can be done with a pair of pliers, while the smaller artifacts and issues will need brushing off or sanding. A dental pick can be quite useful.
For PLA, we found the best thing to clean and smooth the surfaces (after support removal) is ethyl acetate. While ethyl acetate is readily available in many chemical labs or at a pharmacy, acetone-free nail-polish remover offers a commercially accessible alternative. You should wear safety glasses and well-fitting (!) gloves when handling ethyl acetate, ventilate the room well and, in case of skin contact, use a skin cream after washing your hands! It dissolves the plastic, breaking down the small extrusion artifacts on the surfaces, and can be applied in many ways. We found it best to leave the parts in a sealed ethyl acetate vapor environment, e.g. by putting it in a smaller open vessel into a stainless-steel pot, which should be cleaned carefully afterwards. This technique gives even and clean results, though it will take up to a few days to fully smooth each object. The faster method is to simply submerge the small objects in ethyl acetate for 10-30 seconds, and then remove each object, leaving them to dry out on a surface. For the larger virion parts, the surface can be smoothed by rubbing it down with a cloth dampened with ethyl acetate, which was also used to “weld” the two viral hull halves together. A small amount was dropped onto the flat surfaces on each section, before the two were pressed together until the plastic fused to become a single object. The seam was then smoothed down using the same process as before.
For acrylonitrile butadiene styrene (ABS), acetone may produce the same results.
Ready for Assembly!
Finally, the 3D model can be assembled. For assembling, the springs are first fixed with superglue in the holes of the body. UV resin is then used for their final fixation. The UV resin also serves as a filling material to completely close the holes. The spikes are attached to the springs in the same way as the springs are attached to the body. If the intention is to paint the model, we recommend assembling the model before starting with the coloring. In this way, the UV resin used to fill the holes can also be painted and a more visually appealing result can be achieved.
We hope that our adventure in 3D printing the coronavirus inspires you to give it a try! The process we described was completed in a little over a week. The printing jobs were completed in just over two days, the cleaning and post-processing took another two days, while the painting was done over the course of a weekend. This article describes our technique and should provide enough detail on how, with the necessary tools outlined above, you can create a similar result. The files have been distributed through Thingiverse under a Creative Commons BY-NC license: you may remix, adapt, and build upon this work non-commercially, provided you acknowledge the "Coronavirus Structural Task Force" as original author.
As with every 3D printed model, there are many different ways this could be tackled and achieved, and we look forward to seeing the many creative ways explored by others in this endeavor. Please do share experiences and results with us, either through the comments on Thingiverse or on Twitter (you can tag us @thornlab).
For a sense of perspective, we have also produced a model of the rhinovirus, which is one of the viruses that cause the common cold, at the same scale. It is available in STL format here: https://www.thingiverse.com/thing:4556845
We want to emphasize that the writing of this blog entry was a collaboration of several people:
Dale Tronrud and Thomas Splettstößer worked together to create the STL files for the 3D model. Dale was the person to suggest it first (with Andrea Thorn picking up on the idea). Thomas then selected the experimental models and placed all the parts to form a realistic representation. Dale provided the knowledge about the limitations imposed by the nature of 3D printing and broke up Thomas’ model into printable parts that can be assembled without too much difficulty. He printed and assembled the first virion from this design. The updated model was printed at the facilities of the Physics department at the Universität Hamburg with generous support from PhysNET and Martin Stieben. Yunyun Gao and Philip Wehling refined the model, and Matthias Stäb painted the one shown in the pictures.
Our goal was to create a 3D printable model of SARS-CoV-2 that is as close to the actual virus as possible, but there are a lot of confusing and contradictory descriptions of the coronavirus SARS-CoV-2 in the literature. Combing through them in order to establish a complete understanding and a clear image of the virus that causes COVID-19 is a tremendous task, and as new scientific observations come to light, our conceptions are being challenged and the need for adjustments arises.
At the start of the pandemic, when there were few measurements of the novel virus, its appearance was mainly inferred from knowledge about other coronaviruses, especially the closely related SARS-CoV-1, which caused the previous SARS pandemic. The Coronavirus Structural Task Force created a printable model based on those images, but since then a lot of imaging data has been collected on the new virus itself, and those images indicate that SARS-CoV-2 is sufficiently different. We have combed the literature and used what we learned to design an updated model.
The Shape of the Virus
Because viruses are so small, it is very difficult to directly observe them even with modern imaging techniques. (For more details, see here.) In the olden days, viruses were studied by coating samples with metal and then taking an electron microscopic image (see figure) so that the surface can be clearly seen. This is, by the way, the origin of the name “coronavirus”: The viral particles looked a bit like little suns surrounded by a corona of rays of light.
The most powerful ways to study the shape of a virus do not just investigate a single particle, but average the images of hundreds of them together, resulting in information that does not describe a single virion but instead their average. In reality, and particularly for coronaviruses, every individual virus particle looks different, some being larger, and some smaller. They are only perfectly round in the absence of any exterior forces or perturbing internal structure. It does not take much to deform the “wobbly” thin double membrane hull of SARS-CoV-2. For example, it has been suggested that coronavirus particles are created in a close-to-perfect spherical shape and that exposure to a slightly acidic environment causes virions to become a bit deformed, a change that might be important for infectiousness. It has also been postulated that the conformation of the M proteins affects the membrane curvature and, hence, the shape of the virus. Our model is therefore not exactly round but shaped more like a potato. We do not claim that this is “the shape” of the virus, but simply one of many possible shapes.
A second difference between the old and new data is that SARS-CoV-2 appears to be smaller than the other viruses studied before this pandemic. To reflect the new data we have reduced the diameter of our model by 12%, from 100 mm to 88 mm. (Our model is scaled so 1 mm corresponds to 1 nm.) As with viral shape, individual viruses have a variety of diameters, and 88 mm is simply one of many sizes that are consistent with the population. Due to the adjustments to the virus body, its hull surface was reduced by roughly 23%. Research suggests that M proteins are distributed on the surface of a virion with a roughly constant density, so we reduced their number by a matching 23%. The data for the E protein, however, indicates a roughly constant number in each virion, so we have left their number at twenty. At this scale, the RNA contained in the virion would be about 10 meters long and 1 mm thick. One of the models for the virion body that we supply has a hole in the bottom that both allows the model to sit on a table and be used as a paper weight, and allows a 10 m long piece of twine to be hidden inside. The twine can be extracted during a demonstration to emphasize the surprising length of the SARS-CoV-2 RNA molecule.
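The link between the 12% smaller diameter and the roughly 23% smaller surface can be checked with a few lines of Python. The sketch assumes an ideal sphere, which the potato-shaped model only approximates, so the result is an estimate rather than the exact surface of the printed hull. The function name is our own.

```python
import math

def sphere_surface_area(diameter):
    """Surface area of an ideal sphere with the given diameter (4*pi*r^2 = pi*d^2)."""
    return math.pi * diameter ** 2

old_diameter_mm = 100.0  # first model
new_diameter_mm = 88.0   # updated model

# Relative reduction of the diameter (a length)
diameter_reduction = 1 - new_diameter_mm / old_diameter_mm

# Relative reduction of the surface (an area, so it scales with the square)
area_reduction = 1 - sphere_surface_area(new_diameter_mm) / sphere_surface_area(old_diameter_mm)

print(f"diameter reduced by {diameter_reduction:.0%}")
print(f"surface reduced by {area_reduction:.1%}")
```

Because area scales with the square of the diameter, the surface shrinks by 1 - 0.88² ≈ 22.6%, which is where the "roughly 23%" for the hull and the M proteins comes from.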
Reconsidering the Spikes
As was found for the size of the virion, the experimental data derived from direct study of SARS-CoV-2 indicates a different number of spikes per particle than seen in previous studies, and the new number is far smaller. Current literature suggests the average number of spikes on the SARS-CoV-2 viral surface ranges from 26 to 48. Our first model of SARS-CoV-2, based on images of other coronaviruses, had around 100 spikes, but in our new model we have reduced their number to only 26.
While it is possible that SARS-CoV-2 does, indeed, have a different number of spikes than other coronaviruses, it is also possible that this substantial difference might well stem from changes in experimental approaches and circumstances. The microscopes and imaging techniques of today are much more powerful than those of even ten years ago.
In the past, despite the inconceivably small size of a single virion, coronavirus particles were too large for all of their parts to be in focus in a single image. Since one could only “see” the spikes in a limited region of a virus, some rather elaborate methods were employed to estimate the total number of spikes. One of these is the Tammes problem, a mathematical puzzle that aims to distribute the maximum number of nonoverlapping, equal-sized circles on the surface of a sphere. Estimates for the minimal distance between two spike proteins on the viral surface are readily available even from 2D imaging, and following the assumption that the entire surface of the virion is covered with spikes packed as closely as possible, the Tammes method suggests the presence of around 50 spikes per virus particle.
However, many images of SARS-CoV-2 show gaps between spikes, and it is quite likely that the density of spikes is lower. New techniques applied to SARS-CoV-2 can produce a sharp image of an entire, individual particle, allowing the spikes to simply be counted. These studies show an average of 26 spikes per virion, and this is probably the most reasonable estimate available until further experimental evidence emerges. This is just an average; the exact number of spikes on a particular virion varies around that value. The same publication also states that spikes are able to rotate freely on the viral surface and are not standing entirely upright, but are instead inclined by 40° on average, again varying a lot from spike to spike. Our updated version of the SARS-CoV-2 model incorporates these new findings. It has 26 spikes whose stalks are bent at angles of 30°, 40° and 50°. To improve the model’s flexibility and sturdiness, we also produced a version where the spike tops are connected to the body via springs. However, enthusiasts should remember that the spike proteins (as well as membrane and envelope proteins) are also “swimming” in the lipid bilayer that serves as the outer hull, which we could not represent in our model.
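For readers who want to play with this kind of estimate, here is a rough Python sketch. It does not solve the Tammes problem itself. Instead it uses a simple area argument: each spike claims a spherical cap whose size follows from the minimal spike-to-spike distance, and the caps cannot cover more of the sphere than the densest planar circle packing would (about 90.7%). The radius and distance in the example are placeholder values chosen for illustration, not measurements from the literature, and the function names are our own.

```python
import math

# Density of the densest circle packing in a plane (hexagonal arrangement)
HEX_PACKING_DENSITY = math.pi / math.sqrt(12)

def area_bound(radius_nm, min_distance_nm):
    """Upper bound on the number of non-overlapping caps from area alone.

    min_distance_nm is the straight-line distance between two spike centers.
    """
    if not 0 < min_distance_nm <= 2 * radius_nm:
        raise ValueError("distance must be positive and at most the diameter")
    # Angular radius of the cap that each spike claims for itself
    half_angle = math.asin(min_distance_nm / (2 * radius_nm))
    # Cap area divided by the full sphere area
    cap_fraction = (1 - math.cos(half_angle)) / 2
    return 1 / cap_fraction

def estimate_spike_count(radius_nm, min_distance_nm):
    """Rough estimate assuming the caps pack about as densely as circles in a plane."""
    return math.floor(HEX_PACKING_DENSITY * area_bound(radius_nm, min_distance_nm))

# Placeholder inputs (hypothetical, for illustration only)
example_radius_nm = 50.0
example_distance_nm = 25.0

bound = math.floor(area_bound(example_radius_nm, example_distance_nm))
estimate = estimate_spike_count(example_radius_nm, example_distance_nm)
print(f"area bound: {bound} spikes, packing estimate: {estimate} spikes")
```

The estimate is very sensitive to the assumed minimal distance, since the count grows roughly with the inverse square of that distance, which is one reason why directly counting spikes on a sharp image is preferable.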
Now, where’s the Model?
We made the updated files for the 3D-printable virus model available on Thingiverse.
In this blog article you can find detailed construction guidance for the new model.
And here's a 3D preview of the new model on Sketchfab: