Coronavirus
Structural Task Force

Spike Glycoprotein: Corona’s Key for Invasion

COVID-19 is caused by the new coronavirus SARS-CoV-2. This virus has a characteristic hull featuring surface proteins which are commonly called “spikes”. Protruding from the viral hull like the “spikes of a crown”, they give the coronavirus its name (corona = crown). These proteins make the first contact with human cells and are akin to keys that use a human receptor called “angiotensin-converting enzyme 2” (ACE2) as a backdoor to gain access to and infect the cell.

Fig. 1. SARS-CoV-2, animated picture. Numerous spike proteins, coloured in green, protrude from the virus hull, which is coloured in brown. Spikes enable the coronavirus to invade human epithelial cells. Image: Thomas Splettstoesser; www.scistyle.com

1. Function of ACE2

ACE2 is a membrane protein which is anchored in the cell membrane of human epithelial cells. This type of cell can be found on the surface of lung, intestine, heart and kidney tissue. As a type I membrane protein, its primary function is to take part in the maturation of angiotensin, a peptide hormone which controls vasoconstriction and blood pressure. ACE2 can be compared to a lock which can be unlocked by the coronavirus spike protein. The virus can then enter the cell and hijack its functions to reproduce itself, thus causing the COVID-19 infection, which poses a serious danger to humanity, especially to older people and people with pre-existing conditions. For this reason, one approach to combating SARS-CoV-2 is to target and inhibit the spike to prevent infection. In order to do so, knowledge of the structural features of the spike and of its interaction with ACE2 is indispensable. (Further information about how macromolecular structures are visualized can be found on our homepage: https://insidecorona.net/visualizing-macromolecular-structures/)

2. Spike: Structure and Fusion Mechanism

Fig. 2. Image of a spike protein (green) protruding out of the viral envelope (brown). This image shows the structure of a spike protein divided into several subdomains. Each subdomain has a specific function necessary for binding and fusion. The transmembrane domain anchors the spike protein in the virus membrane. Heptad repeats 1 and 2 and the fusion peptide play key roles in mediating the fusion process, and with the RBD, the virus makes contact with human cells. Note that only “stumps” of carbohydrate chains are shown. Image: Thomas Splettstoesser; www.scistyle.com

The spike protein is a trimer of three identical monomers. Each monomer can fold out, akin to a modern car key with a fold-out blade that carries specific teeth on its surface. This fold-out element is the so-called “receptor binding domain” (RBD). The spike can only interact with ACE2 when its RBD is in the folded-out position, exposing its teeth, the “receptor binding motif” (RBM). As the name suggests, the RBM comprises a motif of different amino acids which can bind and unlock the ACE2 receptor. This lock-and-key mechanism triggers a cascade of events initiating fusion with the host cell. First, protein scissors are recruited to the binding site. These scissors (furin and transmembrane serine protease 2) cleave the spike protein for subsequent activation. The active spike molecule then rearranges itself to form a long structural “hook” (formed of HR1, HR2 and FP; see Fig. 2) that brings the epithelial cell membrane and the viral membrane into close proximity for fusion. Once fusion is complete, the path is clear for the virus to transfer its genome, encoded in ribonucleic acid (RNA), into the host cell. This transfer enables the virus to multiply and finally spread from cell to cell, causing COVID-19 in its wake.

Fig. 3. This image shows a spike protein in complex with the human ACE2 receptor (PDB: 6vsb/6lzg). Left: The structure of a spike protein, coloured in orange, in complex with the human ACE2 receptor, coloured in light orange. The white box marks the interaction site, which is shown enlarged in the image on the right. Right: The interaction site between spike and ACE2. Spike's "receptor binding domain (RBD)" includes a "receptor binding motif (RBM)" whose amino acids interact with those of the human receptor through hydrophilic interactions. These amino acids are shown as sticks protruding from the RBM and ACE2. Image: Sabrina Stäb

3. Evading the Immune System with Carbohydrate Chains

The human immune system normally recognizes the surface proteins of foreign organisms such as viruses or bacteria and reacts with an immune response to combat them. Spike proteins are such surface proteins, but because of structural peculiarities, the coronavirus evades both the innate and the adaptive human immune system. The secret of these structural peculiarities is the N-glycans. These are long carbohydrate chains which sit on the spike’s surface. Each spike comprises 66 N-glycans, which form a protective shield around the protein. Hence the human immune system has problems recognizing spikes and identifying the coronavirus as an enemy.

Fig. 4. Ribbon diagrams of a spike trimer with N-glycans on its surface coloured in cyan (PDB: 6vxx). In image a, the spike protein is shown from the side, and in b, the trimer is seen from above. Unfortunately, neither X-ray crystallography nor cryo-EM can resolve long carbohydrate chains, so the chains shown here contain a maximum of three sugar monomers, while in most cases the carbohydrate chains are much longer, covering most of the contact surfaces of the upper spike protein. Image: Sabrina Stäb

The COVID-19 pandemic has a massive impact on our lives, our health and the global economy. Scientists around the world are trying to develop new drugs to combat the virus. Since the spike plays a critical role in the infection process, it is a prime target for drug development against the pandemic. One approach to inhibiting the interaction between spike and the ACE2 receptor is to cap the spike protein using antibodies. Antibodies are proteins normally produced by the human immune system to fight viruses. The idea is to treat patients with antibodies that cap the RBD of spike, thus preventing interactions with ACE2. This would lead to a nonfunctional spike, blocking the coronavirus from entering the cell (the key would no longer fit the lock). Another approach is the development of small molecules that target and inactivate the protein scissor transmembrane serine protease 2 (see section 2), as the spike’s functionality depends on its cleavage activity. Since the spike protein decorates the virus hull, it could even be part of a potential vaccine. For this reason, the spike protein could also become the key in the molecular fight against COVID-19.

Overview

The surface proteins, also called “spike” or S-proteins, protrude from the viral envelope of SARS-CoV-2 like the “spikes of a crown”, thus giving the coronavirus its name. They mediate entry into the host cell by binding to a cellular receptor called angiotensin-converting enzyme 2 (ACE2), triggering a cascade of events leading to membrane fusion and entry. The spike protein is formed by three identical monomers, each consisting of the two subunits S1 and S2. Subunit S1 comprises a receptor binding domain (RBD), which interacts with ACE2 on human epithelial cells. ACE2 is a type I membrane protein expressed in lungs, heart, kidneys, and intestines, and takes part in the maturation of angiotensin, a peptide hormone which controls vasoconstriction and blood pressure.

Fig. 1. Image of a spike glycoprotein (yellow) protruding out of the viral envelope. Spike subunits S1 and S2 can be divided into several subdomains. The S1 subunit comprises an N-terminal domain (NTD) followed by the receptor binding domain (RBD). The S2 subunit is mainly composed of a fusion peptide (FP) and two heptad repeats (HR1 and HR2), which play a key role in mediating fusion with the host cell. Spike proteins are anchored in the virus envelope via a transmembrane domain (TM) and the cytoplasmic tail (CP), neither of which has yet been structurally determined, so their depiction in this image is an educated guess. Note that only “stumps” of carbohydrate chains are shown. Image: Thomas Splettstoesser; www.scistyle.com

Binding Mechanism

To engage the ACE2 receptor, the RBD of S1 undergoes a hinge-like conformational rearrangement that transiently exposes the residues necessary for receptor binding. The heptad repeat 1 and 2 domains (HR1 and HR2) play a key role in mediating fusion and entry (see Fig. 1). The exact mechanism by which SARS-CoV-2 fuses with and enters the host cell is still not fully established, but it is likely that the fusion mechanism is similar to that of SARS-CoV. The putative mechanism is that after the RBD binds to the ACE2 receptor, the S2 subunit binds to the host membrane via a fusion peptide (FP) and changes conformation, triggering the association between the HR1 and HR2 domains to form the “fusion core”, which brings the viral and cellular membranes into close proximity for fusion.

The structure of the RBD in complex with the human ACE2 receptor reveals that the interaction occurs via the spike protein RBD and the ACE2 N-terminal peptidase domain. The RBD consists of a twisted five-stranded antiparallel β-sheet (β1, β2, β3, β4 and β7) forming the core together with short connecting α-helices, β-sheets and loops. These short α-helices, β-sheets and loops constitute the receptor binding motif (RBM), which is located as an extended insertion between two β-strands (β4 and β7) and contains most of the ACE2-contacting residues. The ACE2 N-terminal peptidase domain consists of two lobes that form the substrate binding site. The contact between the RBM and ACE2 is made at the bottom side of the ACE2 small lobe, with a concave outer surface in the RBM accommodating the N-terminal helix of ACE2 and thus generating an interface of 1687 Å² (see Fig. 2).

Fig. 2. This image shows the spike RBD/RBM in complex with the ACE2 receptor (PDB: 6lzg). a. The complex between the RBD (yellow-orange) and the small lobe of ACE2 (cyan) is shown. b. The interface of the RBM (yellow-orange) and the N-terminal α-helix of ACE2 (cyan) comprises 15 hydrophilic interactions (dashed lines). Image: Sabrina Stäb

The RBM/ACE2 interface contains a network of different interactions, including hydrophilic interactions with 13 hydrogen bonds and 2 salt bridges, which are shown in Fig. 3. Key residues for receptor binding include the amino acids Leu-455, Phe-486, Gln-493, and Asn-501. The RBD residues Gln-493 and Asn-501 form hydrogen bonds with the ACE2 residues Glu-35 and Tyr-41, respectively. Phe-486 interacts with the ACE2 amino acids Gln-24, Leu-79 and Tyr-83 and makes contact with Met-82 through van der Waals forces. Another important interaction takes place between the non-polar RBD residue Leu-455 and ACE2 Asp-30, Lys-31 and His-34. Outside the RBM, the amino acid Lys-417 and ACE2 Asp-30 contribute to receptor binding by forming a salt bridge. Binding of the host cell receptor by subunit S1 destabilizes the prefusion trimer and triggers a structural rearrangement, resulting in cleavage and shedding of the S1 subunit and transition of the S2 subunit to a stable postfusion conformation.
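The residue pairs named in the paragraph above can be written down as a small lookup table. The following is only a sketch for keeping track of them: the dictionary and function names are made up for illustration, and the table contains only the contacts mentioned in the text, not the full interface from PDB entry 6lzg.

```python
from collections import defaultdict

# RBD residue -> ACE2 residues it contacts, exactly as named in the text.
# This is NOT a complete contact list for the 6lzg interface.
RBD_TO_ACE2 = {
    "Leu-455": ["Asp-30", "Lys-31", "His-34"],
    "Phe-486": ["Gln-24", "Leu-79", "Met-82", "Tyr-83"],
    "Gln-493": ["Glu-35"],
    "Asn-501": ["Tyr-41"],
    "Lys-417": ["Asp-30"],  # salt bridge, located outside the RBM
}


def invert_contacts(contacts):
    """Turn an RBD -> ACE2 mapping into an ACE2 -> RBD mapping."""
    inverse = defaultdict(list)
    for rbd_residue, ace2_residues in contacts.items():
        for ace2_residue in ace2_residues:
            inverse[ace2_residue].append(rbd_residue)
    return dict(inverse)


def shared_ace2_residues(contacts):
    """Return ACE2 residues that are contacted by more than one RBD residue."""
    inverse = invert_contacts(contacts)
    return {ace2: rbd for ace2, rbd in inverse.items() if len(rbd) > 1}


ACE2_TO_RBD = invert_contacts(RBD_TO_ACE2)
SHARED = shared_ace2_residues(RBD_TO_ACE2)

if __name__ == "__main__":
    for ace2_residue, rbd_residues in sorted(ACE2_TO_RBD.items()):
        print(ace2_residue, "<-", ", ".join(rbd_residues))
    print("Contacted by more than one RBD residue:", SHARED)
```

Inverting the table makes it easy to see that, among the contacts listed here, ACE2 Asp-30 is the only residue touched by two different spike residues (Leu-455 and Lys-417).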

Fig. 3. This image shows some of the interactions between RBM and ACE2 (PDB: 6lzg). The table on the right side lists polar interactions between amino acids of the RBM and ACE2, such as hydrogen bonds and salt bridges. The images on the right (a-d) show the 13 hydrogen bonds present in this interface. The RBM is coloured in yellow-orange and ACE2 in cyan. Image: Sabrina Stäb

The Role of Glycosylation

The surface of coronavirus spike proteins is densely decorated with heterogeneous N-linked glycans protruding from the trimeric surface. SARS-CoV-2 spike comprises 22 N-linked glycosylation sequons per protomer. N-linked glycans play a key role in proper protein folding and in priming for fusion by host proteases. Glycans can also shield amino acid residues and other epitopes from recognition by cells and antibodies, so glycosylation enables the coronavirus to evade both the innate and adaptive immune responses. It may also play a role in binding to the host cell. Unfortunately, neither X-ray crystallography nor cryo-EM can resolve long carbohydrate chains, so the structures (below) contain a maximum of three sugars. In most cases, the carbohydrate chains are much longer, covering most of the contact surfaces of the upper spike protein.
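To illustrate what a "sequon" is: N-linked glycans attach to an asparagine in the sequence pattern Asn-X-Ser/Thr, where X is any amino acid except proline. The sketch below searches a one-letter amino acid sequence for this pattern. The function name and the toy sequence are invented for illustration (it is not the spike sequence), and finding a sequon only marks a potential glycosylation site, not proof that a glycan is attached there.

```python
import re

# N-X-S/T with X != proline. The lookahead keeps the match one residue wide,
# so neighbouring or overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

SEQUONS_PER_PROTOMER = 22   # value stated in the text
PROTOMERS_PER_TRIMER = 3    # the spike is a homotrimer


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Invented toy peptide, NOT the real spike sequence.
TOY_SEQUENCE = "MNGTANPSQNNSTK"

if __name__ == "__main__":
    print("Sequon positions:", find_sequons(TOY_SEQUENCE))
    print("Sequons per trimer:", SEQUONS_PER_PROTOMER * PROTOMERS_PER_TRIMER)
```

In the toy peptide, the asparagine followed by proline (position 6) is skipped, while positions 2, 10 and 11 are reported. With 22 sequons per protomer and three protomers, a trimer carries 66 potential N-glycosylation sites.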

Fig. 4. This image shows ribbon diagrams of a spike trimer with N-glycans on its surface coloured in cyan (PDB: 6vxx). In image a, the spike protein is shown from the side, and in b, the trimer is seen from above. Image: Sabrina Stäb

Summary

The spike protein acts as a key molecule for fusion and entry, so the development of drugs directly targeting this protein may be essential to contain the COVID-19 pandemic. "Capping" the spike proteins with antibodies would interrupt infection. Binding of antibodies to the S1 RBD could inhibit the RBD-ACE2 interaction, which could in turn prevent fusion with the host cell. In addition, in lung cells, spike functionality depends on furin-mediated pre-cleavage at the S1/S2 site for subsequent activation by TMPRSS2 (transmembrane serine protease 2). Thus, inhibitors of either furin or TMPRSS2 could also be considered as a potential treatment for COVID-19. As the spike protein decorates the virus hull, it could also be part of a vaccine. All of this makes the spike protein a major target in the molecular fight against COVID-19.


Further reading:

Chemical & Engineering News: Adding the missing sugars to coronavirus protein structures

https://www.nature.com/articles/s41423-020-0426-7

https://www.nature.com/articles/s41586-020-2180-5

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7102599/

https://www.dpz.eu/en/home/single-view/news/die-vermehrung-von-sars-coronavirus-2-im-menschen-verhindern.html

In view of the outbreak of pulmonary disease caused by SARS-CoV-2, the development of new drugs is essential to contain the COVID-19 pandemic. One promising drug target is the 3C-like protease, also known as main protease or Mpro. Most of the virus proteins are translated as one long polypeptide chain, which then has to be cleaved into functional proteins. In the viral polyproteins pp1a and pp1ab, 11 sites at the C-terminal end (downstream of nsp4) are cleaved by the main protease. As the RNA polymerase complex is part of this chain (nsp7, nsp8, nsp12 and nsp14), inhibition of the main protease stops replication.

The 3C-like protease is a cysteine protease which is characterized by a catalytic dyad consisting of the amino acids cysteine and histidine. The homodimer comprises two perpendicular protomers forming a catalytic cleft in between. Each of these protomers is composed of three domains. Domains I and II (N-terminal domain) form an antiparallel chymotrypsin-like β-barrel structure in which the substrate binding site is located. Domain III (C-terminal end) consists of five α-helices arranged in a cluster, regulating dimerization through a salt-bridge interaction between Glu-290 of one protomer and Arg-4 of the other.

The N-terminal residues, called the “N-finger” (see image), make contact predominantly with domain II of the other protomer, generating a contact interface of ~1394 Å². Dimerization is essential for protease activity, as the N-terminal residue Ser-1 of one protomer interacts with Glu-166 of the other protomer, keeping the substrate binding site in the right shape. This substrate binding site contains a catalytic dyad consisting of the residues Cys-145 and His-41. Next to the catalytic dyad is the substrate binding pocket called S1. It consists of the side chains of Phe-140 and His-163 and the main chain atoms of Glu-166, Asn-142, Gly-143 and His-172. This pocket mediates the high specificity for a Gln [Leu-Gln↓(Ser,Ala,Gly)] in the substrate to be cut, as the carbonyl oxygen of this Gln is stabilized by the amino acids Gly-143 and Cys-145.
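The recognition pattern Leu-Gln↓(Ser,Ala,Gly) can be expressed as a simple text search on a one-letter amino acid sequence. The sketch below is a deliberate simplification: it assumes that these three residues alone decide where the cut happens, whereas the real protease also depends on neighbouring residues and on the shape of the substrate. The function names and the toy sequence are invented for illustration.

```python
import re

# Leu-Gln followed by Ser, Ala or Gly. The cut (the arrow in the text) falls
# directly after Gln, so the following residue is only checked via lookahead.
CLEAVAGE_MOTIF = re.compile(r"LQ(?=[SAG])")


def find_cleavage_sites(sequence):
    """Return 1-based positions of the Gln residues after which a cut is predicted."""
    return [match.end() for match in CLEAVAGE_MOTIF.finditer(sequence.upper())]


def cleave(sequence):
    """Split the sequence into fragments at every predicted cleavage site."""
    cuts = [0] + find_cleavage_sites(sequence) + [len(sequence)]
    return [sequence[start:end] for start, end in zip(cuts, cuts[1:])]


# Invented toy sequence, NOT a real polyprotein.
TOY_POLYPROTEIN = "AVLQSGFRLQPTTLQAK"

if __name__ == "__main__":
    print("Cut after positions:", find_cleavage_sites(TOY_POLYPROTEIN))
    print("Fragments:", cleave(TOY_POLYPROTEIN))
```

In the toy sequence, the Leu-Gln at positions 9-10 is followed by proline and is therefore left uncut, so two cuts produce three fragments. In general, n cleavage sites yield n + 1 fragments.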

Picture of the coronavirus main protease structure with labels. One protomer is shown as a ribbon diagram in yellow with its three domains labelled. The other (identical) protomer is shown as sticks with the experimental electron density from PDB entry 6y2e in purple. Picture by Sam Horrell.

The specific substrate binding site S1 is a candidate site for binding an inhibitor that could work as a drug. Such an inhibitor would block the cleavage of the polyproteins and hence stop replication of the virus RNA. An advantage of the 3C-like protease as a drug target is that, to date, no human proteases with similar cleavage specificity are known, and as a consequence, newly designed drugs are less likely to be toxic. The potential inhibitors can be divided into two classes based on their chemical structures. The first class involves peptide chains that fit the catalytic site of the enzyme and form a covalent link with Cys-145, thereby blocking substrate binding. The second class consists of small organic compounds that bind to the enzyme active site, acting as competitive inhibitors that hinder the substrate from entering the active site cavity. A potential drug which belongs to the second class is Lopinavir, an HIV-1 protease inhibitor, which seems to be a promising candidate for the treatment of coronavirus infections. If the efficacy of Lopinavir against SARS-CoV-2 is confirmed, it would have the advantage that it is already approved as an HIV drug for humans.

Learn more:

https://www.the-scientist.com/image-of-the-day/image-of-the-day-coronavirus-in-3d-67315

https://science.sciencemag.org/content/early/2020/03/20/science.abb3405

https://www.jbc.org/content/280/35/31257

https://www.nature.com/articles/srep22677
