This article was written for the journal Crystallographic Reviews and was published online on 11 June 2022. For additional information, data and tables, please see the full publication:


Nsp15, the nidoviral endoribonuclease (NendoU) of SARS-CoV-2, is an Mn2+-dependent, uridylate-specific endoribonuclease that the virus uses to avoid the innate immune response by managing the stray RNA generated during replication. At the time of writing, 20 structures of SARS-CoV-2 nsp15 have been deposited in the PDB, most solved by X-ray crystallography and some by cryo-EM. These structures show that an nsp15 monomer consists of three conserved domains: the N-terminal oligomerisation domain, the middle domain, and the catalytic NendoU domain. Enzymatically active nsp15 forms a hexamer as a dimer of trimers (point group 32), whose assembly is facilitated by the oligomerisation domain. This review summarises the structural and functional information gained from SARS-CoV-2, SARS-CoV, and MERS-CoV nsp15 structures and compiles current structure-based drug design efforts and complementary knowledge, with a view to providing a clear starting point for downstream structure users interested in studying nsp15 as a novel drug target to treat COVID-19.


SARS-CoV-2 is a nidovirus with a non-segmented positive-sense RNA genome, meaning the RNA genome is read from 5′ to 3′ and can be directly translated into viral proteins; it is effectively messenger RNA. The RNA genome of SARS-CoV-2 is one of the largest among RNA viruses [1] and comprises a replicase gene, which encodes the non-structural proteins (nsps), together with genes for structural and accessory proteins. The genome can produce two different polyprotein chains (from ORF1a and ORF1b) through a ribosomal frameshift [2]. Once translated, these polyproteins are cleaved by one of the two encoded proteases (the 3C-like protease (nsp5) or the papain-like protease (nsp3)) to yield 15 or 16 non-structural proteins, which assemble into a large membrane-bound replication/transcription complex (RTC).

One of these non-structural proteins is nsp15, a 346 amino acid nidoviral uridylate-specific and Mn2+-dependent [3] endoribonuclease (NendoU). It is encoded towards the end of the non-structural protein region of the SARS-CoV-2 genome, on ORF1b (polyprotein residues 6453 to 6798 [4]). Nsp15 preferentially cleaves RNA on the 3′ side of uridine, producing a 2′-3′ cyclic phosphodiester and a 5′-hydroxyl terminus [5] (Figure 1). Nsp15 is conserved across coronavirus family members [6], to the point where it has been proposed as a universal genetic marker to distinguish nidoviruses [7] from all other RNA virus families. Although highly conserved (SARS-CoV-2 nsp15 shares 88% sequence identity with SARS-CoV, 50% with MERS-CoV, and 43% with HCoV-229E), nsp15 has been found to be non-essential for viral replication in Mouse Hepatitis Virus (MHV) [6], SARS-CoV, and HCoV-229E. Some nsp15 mutations completely abolished RNA synthesis; however, these mutations resulted in misfolded and insoluble nsp15 when expressed in E. coli [6]. As a result, the loss of RNA synthesis is thought to be a knock-on effect on neighbouring polyprotein components that are critical for replication, rather than a genuine effect of the lack of nsp15 on viral replication [6]. Further evidence of nsp15’s non-essential role in viral replication comes from insect nidoviruses and invertebrate roniviruses, which completely lack EndoU activity [8,9].

Figure 1: RNA Cleavage as performed by nsp15 to give a 2′‐3′ cyclic phosphodiester and 5′‐hydroxyl terminus from an RNA nucleotide phosphodiester from PDB entry 1RNA. Figure created using Protein Imager [10].

Although not essential for viral replication, recent studies suggest nsp15 plays a role in repressing activation of the host innate immune response [11–13]. During viral replication, positive-sense RNA is translated to produce the viral replication complex, which replicates the positive-sense RNA to produce negative-sense RNA. The negative-sense RNA then acts as a template to produce new positive-sense genomic RNA and subgenomic RNA. Subgenomic RNA consists of smaller transcribed sections of RNA produced by initiating transcription in the middle of the template strand (internal initiation), by falling off the template strand before reaching the 5′ stop codon (premature termination), or by jumping off the template strand and reinitiating transcription further down the template (discontinuous transcription). This process produces short and long double-stranded RNA intermediates with polyuridine tracts at the 5′ end, which can be recognised by pattern recognition receptors in the host cell such as RIG-I-like receptors (RLRs), protein kinase R (PKR), oligoadenylate synthases (OASes), and melanoma differentiation-associated gene 5 (MDA5). These sensors promote an innate antiviral immune response [11,14,15] by activating the type I and III interferon (IFN) response, which induces expression of interferon-stimulated genes through the signal transducer and activator of transcription proteins 1 and 2 (STAT1/2) signalling pathways. By cleaving the 5′-polyuridine tracts in negative-sense viral RNA, nsp15, along with nsp16 and nsp10, limits the accumulation of MDA5-dependent pathogen-associated molecular patterns and so delays the host’s immune response [16]. Loss of nsp15 activity has been shown to activate the interferon response and reduce viral titres in piglets infected with nsp15-deficient porcine epidemic diarrhea coronavirus (PEDV) [17] and in mice infected with nsp15-deficient Mouse Hepatitis Virus [11].
It has also been demonstrated that nsp15 plays a role in disrupting the formation of autophagosomes, which are double-membraned vesicles containing cellular material to be degraded.

Structural overview

SARS-CoV-2 nsp15 consists of an N-terminal oligomerisation domain (Figure 2, blue), a middle domain (Figure 2, purple), and the catalytic C-terminal NendoU domain (Figure 2, teal). The oligomerisation domain is formed from an anti-parallel β-sheet (β1-3) which wraps around helices α1 and α2. The middle domain consists of three β-hairpins (β5-6, β7-8, and β12-13), a mixed β-sheet (β4, β9, β10, β11, and β15), two α-helices (α3 and α4), and a right-handed 3₁₀ helix (η4). The catalytic NendoU domain contains two anti-parallel β-sheets (β16-18 and β19-21) which form a concave surface flanked by five α-helices (α6, α7, α8, α9, and α10). SARS-CoV-2 nsp15 shows high sequence identity with SARS-CoV nsp15 (88%) and lower sequence identity with MERS-CoV nsp15 (51%); however, the overall structural similarity between the three viruses is very high [1]. Three structures have been solved for SARS-CoV nsp15 (PDB entries 2H85 [18], 2OZK [19], and 2RHB [20]), one for MERS-CoV nsp15 (PDB entry 5YVD [21]), two for mouse hepatitis virus nsp15 (PDB entries 2GTH and 2GTI [3]), and one for human coronavirus 229E nsp15 (PDB entry 4S1T). At the time of writing, 20 structures of SARS-CoV-2 nsp15 have been solved with a variety of bound ligands using X-ray crystallography and cryo-EM (Table 1) [1,22,23].

Figure 2: Crystal structure of the nsp15 monomer represented as a transparent surface and cartoon (left) and as a cartoon (right), coloured by domain, using PDB entry 6X4I. The figure was created using Protein Imager [10].

The biological assembly of nsp15 is a double-ringed hexamer made up of a dimer of trimers (point group 32, Figure 3). The trimeric form retains some ribonuclease activity, but the monomer shows only residual cleavage [24]. The hexamer is stabilised by the N-terminal oligomerisation domain present in each monomer. A crystal structure of SARS-CoV nsp15 with a 28 amino acid N-terminal truncation (PDB entry 2OZK) showed a misfolded EndoU active site, suggesting oligomerisation may act as an allosteric activation switch [19]. The six monomers come together to form the active enzyme, which has a 100 Å long, 10 to 15 Å wide, negatively charged channel that is open to solvent at the top, at the bottom, and through three separate side openings in the middle of the hexamer. Formation of the hexamer is essential for enzymatic activity, making the oligomerisation interfaces a potential target for structure-based drug design.
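
The dimer-of-trimers arrangement described above can be illustrated with a short sketch. The Python snippet below generates the six rotations of point group 32 by combining a three-fold axis with a perpendicular two-fold axis and applies them to one protomer position. The axis orientations, the helper names, and the protomer coordinates are assumptions made for illustration and are not taken from any deposited structure.

```python
import math

def rot_z(angle_deg):
    """Rotation matrix about the z axis (taken here as the three-fold axis)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

# Two-fold rotation about the x axis, perpendicular to the three-fold axis.
TWO_FOLD_X = ((1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0))

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def apply(op, xyz):
    """Apply a 3x3 rotation to a point."""
    return tuple(sum(op[i][k] * xyz[k] for k in range(3)) for i in range(3))

def point_group_32():
    """Return the six rotations of point group 32 (Schoenflies symbol D3)."""
    trimer = [rot_z(0), rot_z(120), rot_z(240)]
    return trimer + [matmul(TWO_FOLD_X, op) for op in trimer]

operators = point_group_32()

# Hypothetical centre of mass of one protomer (in angstrom), chosen off both axes.
protomer = (30.0, 5.0, 20.0)
copies = [apply(op, protomer) for op in operators]

for x, y, z in copies:
    print(f"{x:8.2f} {y:8.2f} {z:8.2f}")
```

Three of the generated copies lie above the plane containing the two-fold axis and three below it, mirroring the two stacked trimeric rings of the hexamer.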

The active site of nsp15 is an electropositive pocket which lies at the interface between each monomer’s NendoU domain. The active site is highly conserved between the SARS-CoV-2, SARS-CoV, and MERS-CoV proteins. Six key residues (His235, His250, Lys290, Thr341, Tyr343, and Ser294) are arranged in a shallow groove in the C-terminal NendoU domain [1]. His235, His250, and Lys290 are proposed to act as a catalytic triad, using a mechanism similar to that observed in RNase A [23]. However, RNase A is metal-independent, while SARS-CoV-2 nsp15 is Mn2+-dependent, so the mechanism is not an exact match. Mutation of either histidine in the catalytic triad to alanine eliminates RNA cleavage activity in nsp15 but has no effect on the formation of stable hexamers, showing that these residues are not a factor in nsp15 oligomerisation [22]. This is unsurprising, as the N-terminal oligomerisation domain is the key player in the formation of the hexamer. Conversely, formation of the hexamer clearly plays an allosteric role in the formation of the active site, as activity in the monomer is significantly reduced.

Uracil specificity is proposed to be governed by Ser294 [20], with the main chain nitrogen of Ser294 predicted to interact with the carbonyl O2 oxygen of uracil and the hydroxyl group of Ser294 binding to uracil N3. Mutation studies on homologues have shown that a Ser294Ala mutation negates uridine specificity and significantly decreases activity, although without completely abolishing it [18]. Tyr343 is likely also important in governing uracil specificity, as shown by van der Waals stacking between the ribose sugar of uridine and Tyr343 in cryo-EM structures [20,21]. Mutation of the Tyr343-equivalent residues in SARS-CoV and MERS-CoV to alanine caused near-complete loss of nuclease activity [20,21], suggesting a key role in enzymatic activity.

Figure 3: The structure of the nsp15 hexamer generated by crystallographic symmetry using PDB entry 6X4I. On the left-hand side, the nsp15 hexamer is represented as a transparent surface and cartoon from a side-on view. On the right-hand side, the hexamer has been rotated 90 degrees towards the reader to give a top-down view looking down the 10-15 Å wide channel. The hexamer is coloured by trimer, with trimer one in blue (one monomer in light blue) and trimer two in teal. The figure was created using Protein Imager [10].

The structure of SARS-CoV-2 nsp15 has been solved in the presence of various catalytic intermediates, including 5′UMP (PDB entry: 6WLC), 3′UMP (PDB entry: 6X4I), 5′GpU (PDB entry: 6X1B), and uridine 2′,3′-vanadate (PDB entry: 7K1L). All intermediates bound to the C-terminal catalytic domain, interacting with conserved active site residues (His235, His250, Lys290, Trp333, Thr341, Tyr343, Ser294, Gly248, Lys345, and Val292), and the structures showed no significant conformational deviations from each other (Cα RMSD = 0.29 Å). The uracil moieties of 5′UMP, guanylyl(3′-5′)uridine (GpU), and uridine 2′,3′-vanadate are all bound by Ser294 and Leu346 (Figure 4, top left, bottom left, and bottom right, respectively), reinforcing the idea that uracil recognition is mediated by these residues. Together, these structures support the predicted parallels between the reaction mechanisms of SARS-CoV-2 nsp15 and RNase A. The 5′UMP, 5′GpU, and uridine 2′,3′-vanadate bound structures also support the previously proposed hypothesis about uracil and purine base discrimination, with Ser294 playing a key role [23]. In contrast, the 3′UMP bound structure shows the uracil base forming a stacking interaction with Trp333 (Figure 4, top right), the guanine binding site identified in the 5′GpU complex, suggesting that this site can accommodate both purine and pyrimidine bases. However, the Trp333-interacting base is likely less relevant when binding larger RNA molecules, as Trp333 provides a potential stacking interaction for bases without selectivity [23].
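
For readers less familiar with the metric, the Cα RMSD quoted above is the root-mean-square distance between equivalent atoms after superposition. The following is a minimal sketch in Python, assuming the two coordinate sets are already superposed and listed in the same residue order; the function name and the coordinates are invented for illustration and are not taken from the PDB entries discussed here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized, pre-superposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Toy C-alpha coordinates in angstrom (hypothetical values).
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.3), (7.6, -0.3, 0.0)]

print(f"C-alpha RMSD = {rmsd(model_a, model_b):.2f} angstrom")
```

In practice the superposition itself would be carried out first with a structure-analysis package; only the final averaging step is shown here.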

Figure 4: SARS-CoV-2 nsp15 active site crystal structures with bound reaction intermediates. 5′UMP (PDB entry: 6WLC) in the top left, 3′UMP (PDB entry: 6X4I) in the top right, 5′GpU (PDB entry: 6X1B) in the bottom left, and the cyclic intermediate mimic uridine 2′,3′-vanadate (PDB entry: 7K1L) in the bottom right. Proteins are coloured in teal and represented as a cartoon with active site residues and bound ligands represented as sticks. Bound ligands are coloured white. This figure was made using Protein Imager [10].

Comparison of these ligand-bound structures with RNase A catalytic sites suggests nsp15 acts through a similar reaction mechanism [23]. Based on these findings, a two-step mechanism has been proposed. It starts with a transphosphorylation reaction in which His250 acts as a base and deprotonates the 2′OH of the RNA ribose, with Lys290 stabilising the negative charge that builds up during the transition state. His235 then acts as a general acid, donating a proton to the departing 5′OH group. This is followed by a hydrolysis step in which the roles of His250 and His235 are reversed, with His235 deprotonating a water molecule and His250 acting as a proton donor for the 5′OH leaving group to convert the 2′-3′ cyclic phosphate back to 2′OH and a 3′-phosphoryl group. Despite the similar mechanisms, the structural environments of His235 in nsp15 and the RNase A equivalent (His119) differ significantly, with the residues being ~8 Å apart and making several different hydrogen bonding interactions. These differences may explain why nsp15 is much more sensitive to pH change than RNase A [22]. What remains unclear is the contribution of Mn2+ to the reaction mechanism, particularly as an Mn2+ binding site has not been located in SARS-CoV-2 nsp15 [22].

Therapeutic interest of the protein

As previously mentioned, knockout studies on nsp15 have shown it is not essential for viral replication. Despite this, an nsp15 inhibitor could provide an effective treatment against SARS-CoV-2 by hampering its evasion and modulation of the innate immune response, helping to promote longer-lasting immunity. Targeting nsp15 is particularly interesting as nsp15 has no close human homologues [25], which potentially reduces harmful side effects. A number of biochemical assays have been performed to screen previously approved drugs and various libraries for inhibition of nsp15, alongside a number of in-silico studies docking approved therapeutics to guide drug design efforts. A fragment screening study has also been performed, yielding six small molecule fragments.

Benzopurpurin B, C-473872 (CAS registry number: 331675-78-6), and Congo Red, as well as small-molecule RNase A inhibitors, have been shown to inhibit nsp15 activity and reduce infectivity of SARS-CoV in Vero cells [26], but further testing on SARS-CoV-2 nsp15 is required. Additionally, nsp15 has been screened against the ReFrame [27], Pandemic Response Box (Medicines for Malaria Venture (MMV) & Drugs for Neglected Diseases initiative (DNDi)), and Covid Box drug repurposing libraries for compounds giving 50% inhibition at concentrations below 10 µM, identifying 23, 1, and 0 hits from the respective libraries [25]. Two fluorescence resonance energy transfer (FRET) assays to determine the half-maximal inhibitory concentration (IC50) reduced the hits to 12 (11 in ReFrame, 1 in Pandemic Response Box). These were whittled down to 3 (Exebryl-1, Piroxantrone, and MMV1580853) after 9 were identified as false positives due to the production of reactive oxygen species such as H2O2, which destabilised the protein in the assay. Ligand binding was assessed using high-resolution mass spectrometry. Piroxantrone and MMV1580853 showed significantly weaker binding and ultimately no antiviral activity in SARS-CoV-2 assays. Exebryl-1 bound with a dissociation constant (Kd) of ~12 µM per monomer in the first instance, with approximately four molecules binding to one monomer on average at 100 µM Exebryl-1. Molecular docking of Exebryl-1 against PDB entry 6XDH using an automated Qvina docking workflow [28] showed binding in a pocket close to and within the active site. Exebryl-1 demonstrated antiviral activity in three separate assays at concentrations over 10 µM. However, blood plasma levels in Sprague-Dawley rats after an oral dose of 100 mg/kg reached only 9 µM after 1 hour and dropped to 4 µM after 4 hours, so Exebryl-1 is not expected to reach therapeutic levels in its current state [25].
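
As a rough illustration of why these plasma levels are considered insufficient, the sketch below compares the reported concentrations with the 10 µM antiviral threshold and estimates site occupancy from the reported Kd. It assumes simple 1:1 binding at a single site and that all compound in plasma is free to bind, which ignores both the multiple binding events noted above and plasma protein binding; the function and variable names are illustrative only.

```python
def fractional_occupancy(ligand_um, kd_um):
    """Fraction of sites occupied for simple 1:1 binding: [L] / (Kd + [L])."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)

KD_UM = 12.0                    # approximate Kd per monomer quoted in the text (uM)
PLASMA_UM = {1: 9.0, 4: 4.0}    # hours after dosing -> plasma concentration (uM)
ANTIVIRAL_THRESHOLD_UM = 10.0   # antiviral activity was reported above this level

for hours, conc in PLASMA_UM.items():
    occupancy = fractional_occupancy(conc, KD_UM)
    print(
        f"{hours} h: {conc:.0f} uM, estimated occupancy {occupancy:.0%}, "
        f"above antiviral threshold: {conc > ANTIVIRAL_THRESHOLD_UM}"
    )
```

Under these simplifying assumptions, fewer than half of the sites would be occupied at either time point, and neither concentration exceeds the level at which antiviral activity was observed.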

A repurposed colorectal cancer drug, Tipiracil, has been found to partially inhibit nsp15 activity in biochemical assays. However, its efficacy is greatly decreased at increased Mn2+ concentrations. A structure of nsp15 with Tipiracil bound in the uridine binding pocket has also been solved (PDB entry: 6WXC), with its uracil ring stacking against Tyr343 and forming several hydrogen bonds with Ser294, Lys345, and His250 (Figure 5), as well as several interactions with other active site residues through water and phosphate molecules. The only unique interaction for this ligand is between the iminopyrrolidine nitrogen of Tipiracil and Gln245 (Figure 5). Although not an immediate treatment option, this uracil derivative provides a potential scaffold for further SARS-CoV-2 nsp15 inhibitor development [23]. Based on Tipiracil binding at the active site, a library of 85 flavonoid compounds was docked to nsp15 (PDB entry 6WXC) using the molecular mechanics/generalized Born surface area (MMGBSA) method and molecular dynamics as part of an in-silico study, but binding was found to be significantly weaker than that of Tipiracil in all cases [29].

Figure 5: SARS-CoV-2 nsp15 active site crystal structure with bound Tipiracil from PDB entry 6WXC. The protein is coloured in teal and represented as a cartoon with active site residues and bound Tipiracil represented as sticks. Tipiracil is coloured white. This figure was made using Protein Imager [10].

Fragment screens have been performed on nsp15, with six structures currently available in the PDB without an accompanying publication. In addition to the soaked fragments present in these structures, all show a citrate molecule bound to the catalytic NendoU domain (Figure 6, CIT), with one fragment bound adjacent to citrate (PDB entry 5S70, Figure 6, EN300-181428 (WUS)) through a stacking interaction with Trp333 and a hydrogen bond between the NO3 hydrogen of EN300-181428 and O5 of the citrate molecule. Four fragments are bound at the interface between the middle domain (Figure 6, purple) and the N-terminal oligomerisation domain (Figure 6, blue): FUZS-5 (PDB entry 5S71, Figure 6, WUV), Z2889976755 (PDB entry 5S6X, Figure 6, WUG), BBL029427 (PDB entry 5S72, Figure 6, WUY), and PB2255187532 (PDB entry 5S6Z, Figure 6, WUM). Finally, BBL029427 (PDB entry 5S6Y, Figure 6, WUJ) is bound to a loop connecting beta strands in the middle domain. Unfortunately, the crystal packing in these structures prevents the active double-ringed hexamer from being generated from symmetry-related molecules, making it difficult to assess how the fragments interact with the active hexamer. However, this monomeric crystal form could provide a starting point for the design of a drug that disrupts formation of the active hexamer by interfering with surfaces on the N-terminal oligomerisation domain.

Figure 6: Small molecule fragment screening against SARS-CoV-2 nsp15, with nsp15 represented as a flat field coloured by domain (NendoU in teal, middle domain in purple, and N-terminal oligomerisation domain in blue). Fragment binding is shown as a flat field, coloured grey, with ligands represented as sticks in surrounding circles. This is a composite image of PDB entries 5S70 (EN300-181428, WUS), 5S71 (FUZS-5, WUV), 5S6X (Z2889976755, WUG), 5S72 (BBL029427, WUY), 5S6Y (BBL029427, WUJ), and 5S6Z (PB2255187532, WUM). This figure was made using Protein Imager [10].

Molecular docking, all-atom molecular dynamics, and an assessment of absorption, distribution, metabolism, and excretion (ADME) properties have been carried out on PDB entry 6W01 using 15 scalarane sesterterpenes, compounds purified from Red Sea marine sponges with a variety of relevant pharmacological activities, to assess their potential as nsp15 inhibitors [30]. Eight compounds were found to have equivalent or better binding energies compared to the reference ligand, Benzopurpurin 4B. All eight compounds bound the C-terminal catalytic domain in the large shallow active site, forming polar interactions with the catalytic triad (His235, His250, and Lys290), interacting with Trp333 through π-stacking, and forming at least one hydrogen bond with Lys290 and further anchoring hydrogen bonds with Gly248 and/or Gln245 [30]. Two of the eight were used in all-atom molecular dynamics simulations and showed good stability and strongly negative binding free energies, and scored well on ADME drug property predictions.

In-silico docking investigations of 32 phytochemicals from Asparagus racemosus have also been performed on nsp15 (PDB entry 6W01). The top five ligands (Asparoside-C, Asparoside-F, Rutin, Asparoside-D, and Racemoside-A) bound at the C-terminal active site with binding free energy scores between ‒7.165 kcal/mol and ‒5.993 kcal/mol. Complexes of nsp15 with Asparoside-C, -F, and -D were subjected to further analysis by 100 ns molecular dynamics simulations, which found Asparoside-D and -F to have favourable binding interactions and better affinity than the control ligand Remdesivir [31]. In addition, 23 previously approved drugs have been docked to nsp15, with three (Saquinavir, Aprepitant, and Valrubicin) demonstrating high predicted binding affinities between ‒9.1 and ‒9.6 kcal/mol [32]. However, the pocket to which Saquinavir, Aprepitant, and Valrubicin were docked lies on the opposite side of the protein from the active site pocket that houses the catalytic triad, approximately 17 Å away. Barring an undetermined allosteric effect caused by this binding, which the paper makes no mention of, the proposed further development of these drug candidates by “…modifying them to fit to the SARS-CoV-2 nsp15 active site pocket precisely” needs to be rethought, as the active site has not been targeted in the first instance.

Complementary knowledge

The enzymatic activity of nsp15 and its crystal structure have been demonstrated, but its exact role in viral replication remains unclear. In in situ studies, SARS-CoV nsp15 has been shown to co-localise with replicating RNA [33] around the nucleus, as well as with nsp8 and nsp12 from the replication/transcription complex [34], in the presence and absence of RNA. It was also shown that SARS-CoV nsp15 does not co-localise with the M protein [34]. Yeast two-hybrid screens and glutathione S-transferase (GST) pulldown assays have also identified nsp8 and nsp12 as potential binding partners of SARS-CoV nsp15 [35].

Furthermore, nsp15 has demonstrated a strong inhibitory effect on interferon (IFN) production and interferon regulatory factor 3 nuclear localisation in in-vitro co-expression assays against the Cantell strain of Sendai virus with nsp13, nsp14, and accessory protein ORF6 [36]. However, interferon antagonisation under in-vitro conditions is not necessarily representative of real infection: individual protein expression levels can vary greatly compared to overexpression studies, and altered localisation can have a significant effect [36]. The individual contribution or mechanism of nsp15 interferon inhibition is not discussed by Yuen et al. (2020) in this study. Overall, SARS-CoV-2 appears less effective at suppressing interferon signalling than SARS-CoV due to the loss of the SARS-CoV-2 papain-like protease (PLpro) as an interferon antagonist [36]. Reverse genetic studies (analysis of a resulting phenotype following genetic engineering) have suggested that ORF6 is the major player in interferon suppression instead [37]. However, ORF6 is also less conserved between SARS-CoV and SARS-CoV-2, with only 69% sequence identity and only 4 of the 10 key amino acids identified in SARS-CoV ORF6 being present in SARS-CoV-2 ORF6 [36].

It has been shown that nsp15 activity is highly dependent on the presence of Mn2+ ions, with greatly reduced activity in the presence of Mg2+ ions. In the presence of Mn2+, nsp15 was able to cleave all four uridine sites in an eicosamer, a 20-subunit oligomer with the sequence 5′GAACU↓CAU↓GGACCU↓U↓GGCAG3′, with no preference for sequence and with an increased cleavage rate at rising metal ion concentration [23]. This is particularly interesting because Mn2+ enhances activity in SARS-CoV nsp15, but activity of that protein does not depend on the presence of Mn2+, and no metal binding sites have been identified in coronavirus structures to date [18]. Considering that SARS-CoV-2 nsp15 shares 88% sequence identity with SARS-CoV nsp15, and that all active site residues are conserved, the dependence of SARS-CoV-2 nsp15 on Mn2+ is a significant difference between the enzymes. Further to this, nsp15 alone is promiscuous, cutting at any uridine site in RNA, but becomes site-specific when in complex with nsp8 and nsp12 and leaves uridine tails between 5 and 10 bases long [16].
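
The cleavage pattern of the eicosamer can be reproduced with a short sketch. The Python snippet below cuts an RNA string on the 3′ side of every uridine, as described for nsp15 acting alone; it assumes complete digestion at every site and ignores kinetics, metal dependence, and the site specificity seen in complex with nsp8 and nsp12. The function name is illustrative.

```python
def uridine_cleavage(rna):
    """Cut an RNA string on the 3' side of every U; return (positions, fragments)."""
    rna = rna.upper()
    # 1-based positions of every uridine in the sequence.
    positions = [i + 1 for i, base in enumerate(rna) if base == "U"]
    fragments, start = [], 0
    for pos in positions:
        fragments.append(rna[start:pos])  # fragment ends with the uridine
        start = pos
    if start < len(rna):
        fragments.append(rna[start:])     # remaining 3' fragment, if any
    return positions, fragments

EICOSAMER = "GAACUCAUGGACCUUGGCAG"  # substrate sequence from the text, arrows removed

positions, fragments = uridine_cleavage(EICOSAMER)
print(positions)   # 1-based uridine positions
print(fragments)   # products of complete digestion
```

For the eicosamer this yields four cut sites and five fragments, matching the arrows in the sequence given above.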

A library of 5000 small molecule compounds has been screened against nsp15 for inhibition of nuclease activity, with twelve compounds showing potential as antiviral treatments in a fluorescent biochemical kinetic screen. Further analysis using a gel-based assay found only one compound, NSC95397, able to inhibit nuclease activity at a concentration of 10 µM. However, tests on SARS-CoV-2 infected Vero E6 cells found the compound toxic at concentrations above 10 µM and ineffective at inhibiting viral growth at lower concentrations [38].

A fluorescence resonance energy transfer (FRET) assay has been performed to measure nsp15 activity on a 6-mer oligonucleotide (5′-AAAUAA) with a 5′-fluorescein and a 3′-TAMRA label [21,22]. Activity is measured through the increase in fluorescence caused by the removal of the 3′-TAMRA label. Nsp15 activity was confirmed for the wild-type protein and abolished in the H235A and H250A mutants [22]. FRET analysis was paired with liquid chromatography electrospray ionisation mass spectrometry to demonstrate that nsp15 cleavage products preferentially accumulate with a 2′-3′ cyclic phosphate (80%) rather than a 3′-phosphate. This is a significant difference from RNase A, which generates a 2′-3′ cyclic phosphate that is then hydrolysed to a 3′-phosphate.


SARS-CoV-2 nsp15 is an RNA uridylate-specific, Mn2+-dependent [3] endoribonuclease from the nidoviral EndoU (NendoU) family, which acts on single-stranded and double-stranded RNA to help SARS-CoV-2 evade detection by the innate immune response. Knockout studies have demonstrated that nsp15 is not essential for viral replication, but numerous studies have shown a reduction in viral titre and virulence in nsp15-deficient SARS-CoV-2 when studied in the presence of an effective immune response.

The sequence of nsp15 is highly conserved between SARS-CoV-2, SARS-CoV, MERS-CoV, and HCoV-229E, as is the fold of the monomer and active hexamer. The monomer consists of three domains: the N-terminal oligomerisation domain, a middle domain, and the NendoU catalytic domain, which houses the active site. The active site is a shallow groove made up of six key residues (His235, His250, Lys290, Thr341, Tyr343, and Ser294). A series of structures with different catalytic intermediates have been solved, and the reaction mechanism is predicted to resemble that of the well-studied enzyme RNase A. However, nsp15’s dependence on manganese, where RNase A’s activity is metal-independent, casts some doubt on this theory.

Three in-silico drug screening studies have been performed on nsp15, two using 6W01 and one using 6WXC as the protein model. 6W01 is a citrate-bound nsp15 structure solved to 1.9 Å resolution, with acceptable data processing and refinement statistics overall. The only minor concern is that 5% of the residues in both chains show one issue with their geometry, and a small subset of that 5% show an issue in their fit to the electron density. 6WXC is a Tipiracil-bound nsp15 structure solved to 1.85 Å resolution. It faces a similar minor problem to 6W01, with 7% of residues in both chains showing one issue with their geometry, but has fewer electron density fit outliers. Use of either model should present no major stumbling blocks for simulation studies.

Discussion & Outlook

Nsp15 has been one of the less explored SARS-CoV-2 proteins compared to others, such as the main protease and the papain-like protease, which have undergone extensive in-silico drug design studies through a number of large collaborative efforts between universities, synchrotrons, and other organisations [39–45] to feed into the COVID Moonshot project [46]. Overall, the structural work on nsp15 has been sound, and all available models could provide a good starting structure for computational drug design. A series of structures with catalytic intermediates suggests a mechanism akin to that of RNase A; however, the dependence of nsp15 on Mn2+ suggests a departure from this mechanism, as RNase A’s mechanism is metal-independent. Follow-up in-silico studies (described above) were based on well-validated models with acceptable statistics for the resolution at which the structures were solved, although none have yet pointed to a viable lead compound for clinical application. Nsp15 not being essential for viral replication makes it a much less desirable target for structure-based drug design than essential viral proteins. However, the impact of nsp15 on SARS-CoV-2’s virulence through repression of the innate immune response points to a potential avenue for weakening SARS-CoV-2: inhibiting nsp15 could allow the immune system to fight off infection before it becomes more severe.


This work was supported by the German Federal Ministry of Education and Research [grant no. 05K19WWA] and the Deutsche Forschungsgemeinschaft [project TH2135/2-1]. The authors would also like to thank Johannes Kaub and Rosemary Wilson for support and discussion. All figures are courtesy of the Coronavirus Structural Task Force, who retain copyright for the text and the figures.

[1]        Kim Y, Jedrzejczak R, Maltseva NI, et al. Crystal structure of Nsp15 endoribonuclease NendoU from SARS-CoV-2. Protein Science. 2020;29:1596–1605.

[2]        Cui J, Li F, Shi Z-L. Origin and evolution of pathogenic coronaviruses. Nat Rev Microbiol. 2019;17:181–192.

[3]        Ivanov KA, Hertzig T, Rozanov M, et al. Major genetic marker of nidoviruses encodes a replicative endoribonuclease. Proc Natl Acad Sci U S A. 2004;101:12694–12699.

[4]        Naqvi AAT, Fatima K, Mohammad T, et al. Insights into SARS-CoV-2 genome, structure, evolution, pathogenesis and therapies: Structural genomics approach. Biochim Biophys Acta Mol Basis Dis. 2020;1866:165878.

[5]        Bhardwaj K, Sun J, Holzenburg A, et al. RNA Recognition and Cleavage by the SARS Coronavirus Endoribonuclease. J Mol Biol. 2006;361:243–256.

[6]        Deng X, Baker SC. An “Old” protein with a new story: Coronavirus endoribonuclease is important for evading host antiviral defenses. Virology. 2018;517:157–163.

[7]        Snijder EJ, Decroly E, Ziebuhr J. Chapter Three - The Nonstructural Proteins Directing Coronavirus RNA Synthesis and Processing. In: Ziebuhr J, editor. Advances in Virus Research [Internet]. Academic Press; 2016 [cited 2022 Jan 7]. p. 59–126. Available from:

[8]        Nga PT, Parquet M del C, Lauber C, et al. Discovery of the First Insect Nidovirus, a Missing Evolutionary Link in the Emergence of the Largest RNA Virus Genomes. PLOS Pathogens. 2011;7:e1002215.

[9]        Lauber C, Ziebuhr J, Junglen S, et al. Mesoniviridae: a proposed new family in the order Nidovirales formed by a single species of mosquito-borne viruses. Arch Virol. 2012;157:1623–1628.

[10]      Tomasello G, Armenia I, Molla G. The Protein Imager: a full-featured online molecular viewer interface with server-side HQ-rendering capabilities. Bioinformatics. 2020;36:2909–2911.

[11]      Deng X, Hackbart M, Mettelman RC, et al. Coronavirus nonstructural protein 15 mediates evasion of dsRNA sensors and limits apoptosis in macrophages. PNAS. 2017;114:E4251–E4260.

[12]      Kindler E, Gil-Cruz C, Spanier J, et al. Early endonuclease-mediated evasion of RNA sensing ensures efficient coronavirus replication. PLOS Pathogens. 2017;13:e1006195.

[13]      Volk A, Hackbart M, Deng X, et al. Coronavirus Endoribonuclease and Deubiquitinating Interferon Antagonists Differentially Modulate the Host Response during Replication in Macrophages. Journal of Virology [Internet]. 2020 [cited 2022 Jan 6]; Available from:

[14]      Kato H, Takeuchi O, Sato S, et al. Differential roles of MDA5 and RIG-I helicases in the recognition of RNA viruses. Nature. 2006;441:101–105.

[15]      Mandilara G, Koutsi MA, Agelopoulos M, et al. The Role of Coronavirus RNA-Processing Enzymes in Innate Immune Evasion. Life (Basel). 2021;11:571.

[16]      Hackbart M, Deng X, Baker SC. Coronavirus endoribonuclease targets viral polyuridine sequences to evade activating host sensors. Proc Natl Acad Sci U S A. 2020;117:8094–8103.

[17]      Deng X, Geelen A van, Buckley AC, et al. Coronavirus Endoribonuclease Activity in Porcine Epidemic Diarrhea Virus Suppresses Type I and Type III Interferon Responses. Journal of Virology [Internet]. 2019 [cited 2022 Jan 6]; Available from:

[18]      Ricagno S, Egloff M-P, Ulferts R, et al. Crystal structure and mechanistic determinants of SARS coronavirus nonstructural protein 15 define an endoribonuclease family. PNAS. 2006;103:11892–11897.

[19]      Joseph JS, Saikatendu KS, Subramanian V, et al. Crystal Structure of a Monomeric Form of Severe Acute Respiratory Syndrome Coronavirus Endonuclease nsp15 Suggests a Role for Hexamerization as an Allosteric Switch. Journal of Virology [Internet]. 2007 [cited 2022 Jan 6]; Available from:

[20]      Bhardwaj K, Palaninathan S, Alcantara JMO, et al. Structural and Functional Analyses of the Severe Acute Respiratory Syndrome Coronavirus Endoribonuclease Nsp15*. Journal of Biological Chemistry. 2008;283:3655–3664.

[21]      Zhang L, Li L, Yan L, et al. Structural and Biochemical Characterization of Endoribonuclease Nsp15 Encoded by Middle East Respiratory Syndrome Coronavirus. Journal of Virology [Internet]. 2018 [cited 2022 Jan 6]; Available from:

[22]      Pillon MC, Frazier MN, Dillard LB, et al. Cryo-EM structures of the SARS-CoV-2 endoribonuclease Nsp15 reveal insight into nuclease specificity and dynamics. Nat Commun. 2021;12:636.

[23]      Kim Y, Wower J, Maltseva N, et al. Tipiracil binds to uridine site and inhibits Nsp15 endoribonuclease NendoU from SARS-CoV-2. Commun Biol. 2021;4:1–11.

[24]      Saramago M, Costa VG, Souza CS, et al. The nsp15 Nuclease as a Good Target to Combat SARS-CoV-2: Mechanism of Action and Its Inactivation with FDA-Approved Drugs. Microorganisms. 2022;10:342.

[25]      Choi R, Zhou M, Shek R, et al. High-throughput screening of the ReFRAME, Pandemic Box, and COVID Box drug repurposing libraries against SARS-CoV-2 nsp15 endoribonuclease to identify small-molecule inhibitors of viral activity. PLOS ONE. 2021;16:e0250019.

[26]      Ortiz-Alcantara J, Bhardwaj K, Palaninathan S, et al. Small molecule inhibitors of the SARS-CoV Nsp15 endoribonuclease. Virus Adaptation and Treatment. 2010;2:125–133.

[27]      Janes J, Young ME, Chen E, et al. The ReFRAME library as a comprehensive drug repurposing library and its application to the treatment of cryptosporidiosis. PNAS. 2018;115:10750–10755.

[28]      Alhossary A, Handoko SD, Mu Y, et al. Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics. 2015;31:2214–2216.

[29]      Mishra GP, Bhadane RN, Panigrahi D, et al. The interaction of the bioflavonoids with five SARS-CoV-2 proteins targets: An in silico study. Comput Biol Med. 2021;134:104464.

[30]      Elhady SS, Abdelhameed RFA, Malatani RT, et al. Molecular Docking and Dynamics Simulation Study of Hyrtios erectus Isolated Scalarane Sesterterpenes as Potential SARS-CoV-2 Dual Target Inhibitors. Biology (Basel). 2021;10:389.

[31]      Chikhale RV, Sinha SK, Patil RB, et al. In-silico investigation of phytochemicals from Asparagus racemosus as plausible antiviral agent in COVID-19. J Biomol Struct Dyn. 2021;39:5033–5047.

[32]      Mahmud S, Elfiky AA, Amin A, et al. Targeting SARS-CoV-2 nonstructural protein 15 endoribonuclease: an in silico perspective. Future Virol. :10.2217/fvl-2020–0233.

[33]      Shi ST, Schiller JJ, Kanjanahaluethai A, et al. Colocalization and Membrane Association of Murine Hepatitis Virus Gene 1 Products and De Novo-Synthesized Viral RNA in Infected Cells. Journal of Virology [Internet]. 1999 [cited 2022 Jan 6]; Available from:

[34]      Athmer J, Fehr AR, Grunewald M, et al. In Situ Tagged nsp15 Reveals Interactions with Coronavirus Replication/Transcription Complex-Associated Proteins. mBio [Internet]. 2017 [cited 2022 Jan 6]; Available from:

[35]      Imbert I, Snijder EJ, Dimitrova M, et al. The SARS-Coronavirus PLnc domain of nsp3 as a replication/transcription scaffolding protein. Virus Res. 2008;133:136–148.

[36]      Yuen C-K, Lam J-Y, Wong W-M, et al. SARS-CoV-2 nsp13, nsp14, nsp15 and orf6 function as potent interferon antagonists. Emerg Microbes Infect. 9:1418–1428.

[37]      Schroeder S, Pott F, Niemeyer D, et al. Interferon antagonism by SARS-CoV-2: a functional study using reverse genetics. The Lancet Microbe. 2021;2:e210–e218.

[38]      Canal B, Fujisawa R, McClure AW, et al. Identifying SARS-CoV-2 antiviral compounds by screening for small molecule inhibitors of nsp15 endoribonuclease. Biochem J. 2021;478:2465–2479.

[39]      Cantrelle F-X, Boll E, Brier L, et al. NMR Spectroscopy of the Main Protease of SARS-CoV-2 and Fragment-Based Screening Identify Three Protein Hotspots and an Antiviral Fragment. Angewandte Chemie International Edition. 2021;60:25428–25435.

[40]      Newman JA, Douangamath A, Yadzani S, et al. Structure, mechanism and crystallographic fragment screening of the SARS-CoV-2 NSP13 helicase. Nat Commun. 2021;12:4848.

[41]      Zhao Y, Du X, Duan Y, et al. High-throughput screening identifies established drugs as SARS-CoV-2 PLpro inhibitors. Protein Cell. 2021;12:877–888.

[42]      Ma C, Sacco MD, Xia Z, et al. Discovery of SARS-CoV-2 Papain-like Protease Inhibitors through a Combination of High-Throughput Screening and a FlipGFP-Based Reporter Assay. ACS Cent Sci. 2021;7:1245–1260.

[43]      Douangamath A, Fearon D, Gehrtz P, et al. Crystallographic and electrophilic fragment screening of the SARS-CoV-2 main protease. Nat Commun. 2020;11:5047.

[44]      Ahmad S, Abdullah I, Lee YK, et al. Extensive Crystallographic Fragment-Based Approach to Design SARS CoV2 3CLpro Main Protease Inhibitors and Related Metadata. 2021 [cited 2022 Jan 6]; Available from:

[45]      Günther S, Reinke PYA, Fernández-García Y, et al. X-ray screening identifies active site and allosteric inhibitors of SARS-CoV-2 main protease. Science. 2021;372:642–646.

[46]      Consortium TCM, Achdout H, Aimon A, et al. Open Science Discovery of Oral Non-Covalent SARS-CoV-2 Main Protease Inhibitor Therapeutics [Internet]. 2021 [cited 2022 Jan 6]. p. 2020.10.29.339317. Available from:

Since the outbreak of SARS-CoV-2, infection has continued to spread. At the same time, governmental agencies around the world have adjusted their rules to prevent its spread. These rules are based on scientific studies, public health research and simulation tests that assess how efficiently different mask types prevent the spread of infection by SARS-CoV-2. In this article, we will look at the mask types in use today, how much they can impede viral droplets and aerosols, and how the construction of different masks helps to protect us from infection by SARS-CoV-2.

SARS-CoV-2 droplet sizes and viral transmission

The SARS-CoV-2 virus can be transmitted via droplets and aerosols. 

Droplets are particles of sizes varying from 0.05 to 500 μm. They are directly emitted while breathing or talking. After being released into the air, larger droplets fall to the ground and others rapidly evaporate to form droplet nuclei less than 5 µm of size, also called aerosols, containing viruses in the range of 0.02 to 0.3 μm. Droplet nuclei can remain suspended in air for a longer time compared to large droplets and potentially contribute to airborne transmission1,2,3.
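The size ranges quoted above can be captured in a small lookup, which may help when comparing them against filter specifications later in the text. This is only a minimal sketch: the thresholds are the ones stated in the paragraph, while the function name and the labels are hypothetical and chosen purely for illustration.

```python
# Thresholds taken from the paragraph above; names and labels are illustrative.
DROPLET_MIN_UM = 0.05   # lower end of the quoted droplet size range (micrometres)
DROPLET_MAX_UM = 500.0  # upper end of the quoted droplet size range
AEROSOL_MAX_UM = 5.0    # droplet nuclei ("aerosols") are smaller than 5 micrometres


def classify_particle(diameter_um: float) -> str:
    """Return a coarse label for a respiratory particle of the given diameter."""
    if not DROPLET_MIN_UM <= diameter_um <= DROPLET_MAX_UM:
        return "outside quoted range"
    if diameter_um < AEROSOL_MAX_UM:
        return "aerosol-sized"
    return "droplet"


for diameter in (0.3, 4.9, 5.0, 100.0, 800.0):
    print(f"{diameter:>6} um -> {classify_particle(diameter)}")
```

The hard cut-off at 5 µm follows the text; in practice the boundary between droplets and aerosols is gradual.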

SARS-CoV-2 has been observed to be transmitted via 3 modes:4,5,6

  •   Contact transmission (usually via direct contact with infected persons, surfaces, or air)
  •   Droplet transmission over short distances when a person is close to an infected person
  •   Aerosol transmission over longer distances via inhalation of aerosols that remain airborne and travel with the air

Although maintaining a safe distance from an infected or possibly infected person will prevent viral spread via direct contact and droplet transmission, maintaining a safe distance may not be able to prevent spread of infection through airborne aerosols. This is why it becomes even more important to wear a mask.

Mask types and structure

Surgical masks, also called medical face masks or mouth-nose protection (MNS), are disposable products that are normally used in clinics or in doctor's offices on a daily basis. They are made of special plastics with multiple layers. They have a rectangular shape with wrinkles so that the mask can adapt to the face. The front (outside) is often coloured, the back (inside) is not. The masks have ear loops and a wire noseband (see Figure 1).

Due to the shape and fit of most medical face masks, some of the breathing air can flow past the edges. Especially during inhalation, unfiltered breathing air can be sucked in. Therefore, medical face masks usually offer the wearer less protection against pathogenic aerosols than particle-filtering half-masks (FFP). Medical face masks, however, can protect the mouth and nose of the wearer from pathogen transmission via direct contact, for example with contaminated hands.

Since they are medical devices, their manufacturing and distribution must be carried out in accordance with medical device law. They must therefore comply with the legal requirements and the European standard EN 14683:2019-10. Only then can manufacturers mark the medical masks with the CE mark and distribute them freely in Europe. This is subject to supervision by competent authorities7.

Figure 1: A surgical mask. Picture: CSTF.

Particle filtering half masks / filtering facepieces (FFP) are objects of personal protective equipment (PPE) within the framework of occupational health and safety. They protect the wearer of the mask from particles, droplets, and aerosols. When worn correctly, FFP masks are tightly attached and offer external and self-protection. Since the masks are disposable products as intended by the manufacturer, they should be changed regularly and disposed of after use.

FFP masks are produced either with or without an exhalation valve. Masks without exhalation valve filter both the inhaled air and the exhaled air over the mask surface and therefore offer both self-protection and external protection. Masks with valves offer less external protection because exhaled aerosols are not intercepted by the filter material but are only slowed down and swirled to a certain extent by the valve.

Like medical face masks, FFP masks must comply with clear requirements of laws and technical standards. In particular, the filter performance of the mask material is tested with aerosols in accordance with the European standard EN 149:2001+A1:2009. FFP2 masks must filter at least 94% of the test aerosols; for FFP3 masks, the minimum is 99%. They are therefore proven to provide effective protection against aerosols. The test standard, together with the CE mark and the four-digit identification number of the notified body, is printed on the surface of the FFP mask7.

Figure 2: An FFP2 mask. Picture: CSTF.

Mask standards

The table below shows the currently accepted standards for masks and how they are effective in filtering out bacteria as well as particles.

Table 1: Filtration capacity of mask standards. Evaluated standards include bacteria filtration efficiency (BFE), particle filtration efficiency (PFE), and penetration of filter material (PFM).

Mechanisms of protection

Masks ensure protection from viral spread in three main ways1,5:

Flow resistance inhibits the momentum of exhaled droplets and the velocity of incoming airborne aerosols. This significantly reduces the risk of infection in the vicinity of an infected person, protecting third parties as well. This is afforded by surgical masks, FFP2/N95/KN95, or better particle filtering respirator masks.

Droplet filtration blocks out large droplets via gravity sedimentation, inertial impaction, and minimizing contact of hand to mouth, nose, or other facial canals with access to the respiratory tract. It is afforded by most kinds of masks.

Aerosol filtration reduces the spread of aerosols via interception, diffusion, and electrostatic attraction. Electrostatic effects likely result in charge transfer with nanoscale aerosol particles. It is afforded by FFP2/N95/KN95 or better particle filtering respirator masks.

At small aerosol droplet sizes in the range of 0.1 to 1 μm, the mask layers prevent particles from passing mainly by blocking movement of particles with the fibers in the filter layer and, hence, not allowing diffusion. For nanometer-sized particles, which can easily slip between the openings in the network of filter fibers, electrostatic attraction is the main way by which mask layers remove low mass particles, which are attracted to and bind to the fibers. This filtering of particles by electrostatic attraction is generally most efficient at low speed of the particles such as the speed of aerosols released by breathing through a face mask.

It is important to note that openings and gaps (such as those between the mask edge and the face) can compromise the performance. Findings indicate that leakages around the mask area can reduce efficiencies by ∼50% or more, pointing out the importance of a proper “fit”8.
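To make the effect of a poor fit more concrete, the following sketch uses a deliberately simple bypass model: a fraction of the air is assumed to flow around the mask edge completely unfiltered, while the rest passes through the filter material. This model is an assumption made for illustration and is not the method used in the cited study; the minimum filter efficiencies are the FFP2 and FFP3 values quoted earlier, and the function name is hypothetical.

```python
# Minimum filter efficiencies quoted in the text (EN 149:2001+A1:2009).
MIN_FILTER_EFFICIENCY = {"FFP2": 0.94, "FFP3": 0.99}


def effective_efficiency(filter_efficiency: float, leak_fraction: float) -> float:
    """Fraction of particles removed when `leak_fraction` of the air bypasses the filter.

    Assumed model: leaked air is not filtered at all, the remaining air is
    filtered with `filter_efficiency`.
    """
    if not 0.0 <= filter_efficiency <= 1.0:
        raise ValueError("filter_efficiency must be between 0 and 1")
    if not 0.0 <= leak_fraction <= 1.0:
        raise ValueError("leak_fraction must be between 0 and 1")
    return (1.0 - leak_fraction) * filter_efficiency


for mask, efficiency in MIN_FILTER_EFFICIENCY.items():
    for leak in (0.0, 0.1, 0.5):
        result = effective_efficiency(efficiency, leak)
        print(f"{mask}, {leak:.0%} leakage -> {result:.1%} of particles removed")
```

Under this assumed model, the protection offered by even a high-grade filter falls in direct proportion to the share of air that bypasses it, which is why a tight fit matters as much as the filter material.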

Although a home-made fabric mask will at least offer some degree of protection against larger droplets and prevent access to facial features, it will not be very effective in protecting against respirable particles and droplets with a diameter of 0.3 to 2 μm, as these pass through the materials largely unfiltered5.

Thus, the inhalation of droplets containing viruses can be prevented by using a tight-fitting mask with particle filtering properties (self-protection). The FFP2/FFP3 mask type is very well suited to protect people from an infection by means of aerosol even when the environment is strongly contaminated with infectious droplets5.

How does mask structure affect particle filtration?

For high filtration and blocking efficiency, the construction of the mask layers is very important. The following factors contribute to this efficiency4,8:

Movement of droplets/aerosols is directly affected by interfiber spacing of the mask material and the number of layers. Combining layers of differing fiber arrangement to form hybrid masks uses mechanical filtering and may be an effective approach.

Electrostatic interaction impeding aerosol transmission is influenced by the type of mask material. Electrostatic attraction mainly affects the removal of low mass particles, which are attracted to and bind to the fibers. Leveraging electrostatic filtering may be another effective approach8.

The SEM pictures below show the structure and construction of mask fibers and give an insight into the factors that contribute to their high filtering and blocking efficiency.

An FFP2 mask combines layers featuring different spacing and fiber network types to form hybrid masks, employing both mechanical and electrostatic filtering.

Figure 3: SEM image of FFP2 filter layer fibers showing an incoming pseudo droplet and aerosol. A pseudo aerosol, shown here as a yellow dot, is bound to the mask fiber due to electrostatic attraction and, hence, cannot pass through the mask due to electrostatic filtering. A pseudo droplet shown here in blue is larger than the interfiber spacing of the mask fiber and, thus, cannot pass through the mask due to mechanical filtering. Picture: Carl Zeiss GmbH | Coronavirus Structural Task Force.

Why are FFP masks superior? 

Surgical and respiratory masks comply with regulations that guarantee certain standards are met (cf. Table 1). The superior protection of FFP masks stems partially from their filtering layer (cf. Figure 3), which uses electrostatic filtration to block smaller particles (~0.1 µm).


While maintaining a safe distance from an infected or possibly infected person will prevent spread of infection through direct contact and droplet transmission, maintaining a safe distance may not effectively prevent the spread of infection through airborne aerosols. This is where it becomes very important to wear a mask.

Masks offer self-protection and minimize transmission of potentially infectious exhaled droplets to the surrounding atmosphere. However, in some situations like closed rooms or highly contaminated places, only masks with high blocking and filtration efficiencies will offer this kind of protection, provided they are closely fitted to prevent air from flowing around the mask edges.

The authors would like to explicitly thank Carl Zeiss GmbH, who provided the microscopic images.


1.        Anand, S. & Mayya, Y. S. Size distribution of virus laden droplets from expiratory ejecta of infected subjects. Sci. Rep. 10, 1–9 (2020).

2.        Chirizzi, D. et al. SARS-CoV-2 concentrations and virus-laden aerosol size distributions in outdoor air in north and south of Italy. Environ. Int. 146, 106255 (2021).

3.        Lee, B. U. Minimum sizes of respiratory particles carrying SARS-CoV-2 and the possibility of aerosol generation. Int. J. Environ. Res. Public Health 17, 1–8 (2020).

4.        Sanchez, A. L., Hubbard, J. A., Dellinger, J. G. & Servantes, B. L. Experimental study of electrostatic aerosol filtration at moderate filter face velocity. Aerosol Sci. Technol. 47, 606–615 (2013).

5.        Kähler, C. J. & Hain, R. Fundamental protective mechanisms of face masks against droplet infections. J. Aerosol Sci. 148, (2020).

6.        COVID-19 Scientific Brief: SARS-CoV-2 and Potential Airborne Transmission. (2021).

7.                       Accessed 21 April 2021.

8.        Konda, A. et al. Aerosol Filtration Efficiency of Common Fabrics Used in Respiratory Cloth Masks. ACS Nano 14, 6339–6347 (2020).


This protein is known under many different names such as non-structural protein NSP1, leader protein, host translation inhibitor and host shutoff factor. Some of these names already tell us about the function and importance of this relatively small protein. It is found in all betacoronaviruses1 and, even though it only contains 180 amino acids2, it is indispensable for the viral life cycle and the pathogenicity of SARS-CoV-2.

NSP1 becomes important at the point where the virus needs its genetic information, which is encoded as a string of codons, to be read: the viral mRNA is translated into the corresponding amino acids that make up the viral proteins. Translation occurs either shortly after the virus has entered the host cell or after the viral mRNA has been replicated.

For this process, the virus does not have its own proteins; instead, it just uses the already existing translation machinery of the host cell: the ribosomes.

As ribosomes are responsible for synthesizing proteins by translating the information on the host’s mRNA into a string of amino acids, they are an important part of human cells. They consist of ribosomal RNA (rRNA) and ribosomal proteins, which form a larger (60S) and a smaller (40S) subunit3.

Here, NSP1 comes into play. It helps the virus hijack ribosomes and use them for the translation of its own mRNA, while the host cell’s translation is shut off4.

To understand how NSP1 is involved in all this, we will first have to take a closer look at the structure of the protein.

Structural features & interaction with ribosomes

Even though the full-length structure of NSP1 is unknown so far, we know what the two individual domains (connected via a linker that is 20 amino acids long) of the SARS-CoV-2 NSP1 look like and can even say a lot about its interaction with human ribosomes.

How SARS-CoV-2 takes over its host—NSP1, the Leader Protein 1

Figure 1: a: Schematic structure of NSP1. b: N-terminal domain (PDB: 7K7P). c: C-terminal domain with the KH motif (amino acids K164 and H165) in yellow (PDB: 6ZLW).

The first domain is the globular N-terminal domain (amino acids 1–128), which takes up most of the protein. It consists of a β-barrel of seven β-strands, two 3₁₀ helices and one α-helix5, as can be seen in Figure 1b.

The C-terminal domain is probably the more interesting one because of the crucial role it plays in the interaction with the human ribosome. It comprises three moieties (Figure 1c): the two α-helices, α1 and α2, and a loop connecting them4. The shape of this C-terminal domain and its surface charge match the mRNA entry channel of the ribosome perfectly, so that the domain covers the whole usual mRNA path4. In Figure 2, the small 40S ribosomal subunit (green) is shown in a complex with the C-terminal domain of NSP1 (pink).

Figure 2: a: Ribosomal 40S subunit in complex with the NSP1 C-terminal domain (PDB: 6ZLW). The C-terminal domain is bound to the mRNA channel between the “head” and “body” of the 40S. b & c: NSP1 C-terminal domain shown with and without surface.

While the C-terminal domain is bound to the mRNA entry channel of the host cell’s 40S ribosomal subunit, the N-terminal domain can move around it within a 60 Å radius, connected by the 20 amino acid long flexible linker6.

All these interactions lead to an inhibition of the translation of the host’s mRNA. But how does the viral mRNA get translated if NSP1 is bound to the ribosome’s mRNA entry channel?

Viral translation

The virus needs a mechanism to circumvent its own translational blockage to maintain the capability for translation of the viral mRNA. It is not yet completely clear how this is accomplished, but different suggestions exist.

The first theory involves the N-terminal domain of NSP1 and the 5’ untranslated region (5’UTR) of the viral mRNA7.

In most coronaviruses, the 5’UTR part of the viral mRNA is conserved with a complex secondary structure6. Some scientists7 suggest that it might interact with the N-terminal domain, making the interaction between NSP1 and the ribosome sterically impossible and therefore lifting the blockage. This was also based on their study indicating that the C-terminal domain alone can suppress the host’s protein synthesis, but the N-terminal domain is needed to bypass the translation inhibition. Also, extending the linker between the two domains artificially by additional amino acids could be shown to reduce the viral mRNA translation7.

The second theory suggests that the translational blockage induced by the viral NSP1 is not lifted. In this mechanism, most ribosomes would be blocked by the NSP1s, but those left unblocked could still synthesize proteins. Here the viral 5’UTRs would make the mRNA of the virus more favourable than the host’s mRNA. This would lead the ribosomes into translating the viral mRNA with a higher efficiency than the cellular mRNA6.

Effect on the cells and immune system interference

Translation inhibition of the cellular mRNA by NSP1 has another significant effect on the human cell. Besides impairing normal cell functions, it also inhibits the translation of proteins involved in the innate immune response. These include cytokines such as interleukin-8 and the interferons (proteins involved in antiviral activity8) IFN-β and IFN-γ1, as well as anti-viral factors that are stimulated by interferons, leading to a downregulation of the cell’s defence system4,9.

Earlier studies on SARS-CoV-1 also showed that NSP1 further induces cleavage of the host’s mRNA, probably by using one of the host’s proteins. Again, this does not apply to the viral mRNA10, making the impact on the host cell even greater.

Taken together, this protein is a major pathogenicity factor of SARS-CoV-2 and might therefore be an interesting drug target1.

Available structures

As of this writing, 16 structures of the SARS-CoV-2 NSP1 are available, of which two display the N-terminal domain. The other structures show the C-terminal domain in complex with a ribosome, ribosomal subunit or preinitiation ribosome. As no full-length structure has been solved so far, only predictions of the whole protein exist, for example those by Clark et al.5.

Available structures of the N-terminal: 7k7p, 7k3n.

Available structures of the C-terminal: 7k5i, 6zoj, 6zok, 6zm7, 6zlw, 6zmi, 6zp4, 6zon, 7jqb, 6zme, 6zmt, 6zn5, 6zmo, 7jpc.
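For readers who want to retrieve these entries programmatically, the sketch below collects the PDB IDs listed above and builds a download address for each one. The IDs are taken from the text; the URL template is an assumption based on the public RCSB file download service, and the mmCIF format is chosen because large ribosome complexes are not always distributed in the legacy PDB format. The function name is hypothetical, and no network request is made.

```python
# PDB IDs exactly as listed in the text above.
N_TERMINAL_IDS = ["7k7p", "7k3n"]
C_TERMINAL_IDS = [
    "7k5i", "6zoj", "6zok", "6zm7", "6zlw", "6zmi", "6zp4",
    "6zon", "7jqb", "6zme", "6zmt", "6zn5", "6zmo", "7jpc",
]

# Assumption: URL pattern of the RCSB file download service (mmCIF format).
RCSB_TEMPLATE = "https://files.rcsb.org/download/{pdb_id}.cif"


def download_url(pdb_id: str) -> str:
    """Build the assumed download address for a four-character PDB ID."""
    cleaned = pdb_id.strip().upper()
    if len(cleaned) != 4 or not cleaned.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_TEMPLATE.format(pdb_id=cleaned)


all_ids = N_TERMINAL_IDS + C_TERMINAL_IDS
print(f"{len(all_ids)} entries in total")
for pdb_id in all_ids:
    domain = "N-terminal" if pdb_id in N_TERMINAL_IDS else "C-terminal"
    print(f"{pdb_id.upper()}  {domain:<10}  {download_url(pdb_id)}")
```

Since new structures are deposited continuously, the list should be checked against the PDB before use.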


  1. de Lima Menezes, G. & da Silva, R. A. Identification of potential drugs against SARS-CoV-2 non-structural protein 1 (nsp1). Journal of Biomolecular Structure and Dynamics 1–11 (2020) doi:10.1080/07391102.2020.1792992.
  2. Yoshimoto, F. K. The Proteins of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS CoV-2 or n-COV19), the Cause of COVID-19. 19.
  3. Khatter, H., Myasnikov, A. G., Natchiar, S. K. & Klaholz, B. P. Structure of the human 80S ribosome. Nature 520, 640–645 (2015).
  4. Thoms, M. et al. Structural basis for translational shutdown and immune evasion by the Nsp1 protein of SARS-CoV-2. 8 (2020).
  5. Clark, L. K., Green, T. J. & Petit, C. M. Structure of Nonstructural Protein 1 from SARS-CoV-2. Journal of Virology 95, 12 (2021).
  6. Schubert, K. et al. SARS-CoV-2 Nsp1 binds the ribosomal mRNA channel to inhibit translation. Nat Struct Mol Biol 27, 959–966 (2020).
  7. Shi, M. et al. SARS-CoV-2 Nsp1 suppresses host but not viral translation through a bipartite mechanism. (2020) doi:10.1101/2020.09.18.302901.
  8. De Andrea M. et al. The interferon system: an overview. Eur J Paediatr Neurol (2002) doi:10.1053/ejpn.2002.0573.
  9. Vann, K. R. Inhibition of translation and immune responses by the virulence factor Nsp1 of SARS-CoV-2. 4.
  10. Huang, C. et al. SARS Coronavirus nsp1 Protein Induces Template-Dependent Endonucleolytic Cleavage of mRNAs: Viral mRNAs Are Resistant to nsp1-Induced RNA Cleavage. PLoS Pathog 7, e1002433 (2011).

A guest entry by Hauke Hillen

In order for the novel coronavirus SARS-CoV-2 to replicate, it has to achieve two basic tasks: it needs to make copies of its genome that can be packaged into new virus particles, and it needs to activate viral genes to produce the proteins that actually form new virus particles, such as spike or nucleocapsid. Both tasks are carried out by a specialized molecular copying machine called the replication and transcription complex, or RTC for short. The RTC is made up of a number of viral non-structural proteins (nsps) which act together to produce copies of the viral RNA genome. Some of the RTC components have been discussed in previous posts, for example the exonuclease nsp14, which can correct errors that occur during RNA copying (this is called proofreading), or the methyltransferases nsp14 and nsp16, which add chemical modifications to the RNA that help stabilize it and hide it from the immune system (this is called capping).

In this post, we will have a closer look at the enzyme that carries out RNA replication, the RNA-dependent RNA polymerase (RdRp) nsp12, and how it interacts with the other components to form the RTC.

RNA polymera…what?

First off, let’s briefly discuss what an “RNA-dependent RNA polymerase” is, and why it is important for the virus. Polymerases are enzymes found in every living cell carrying out one of the most fundamental tasks in biology: they replicate genetic information. While most cells store their genetic information in the form of DNA (deoxyribonucleic acid) and use RNA (ribonucleic acid) only as transient messenger molecules (mRNAs), some viruses rely on RNA for both information storage and transmission and are hence called “RNA viruses”. To replicate their genetic information, they need a polymerase that uses RNA as a template to copy the encoded information into a new RNA molecule – an RNA-dependent RNA polymerase. Chemically speaking, RNA is a polymer (a long string of nearly-identical individual building blocks) composed of the four nucleotides adenosine (A), guanosine (G), cytosine (C) and uracil (U), the sequence of which defines the genetic information.

The job of the RNA polymerase is to read the sequence of nucleotides in a template RNA and synthesize new RNA with the same sequence (technically with a complementary sequence) using individual nucleotides as building blocks. To do this, the RNA polymerase has to achieve three basic steps. First, it needs to bind the template RNA. Second, it has to read the sequence of nucleotides in the RNA. Third, it has to incorporate the correct matching nucleotide building blocks to polymerize a new RNA strand.
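The remark that the new strand technically has a complementary sequence can be illustrated with a few lines of code. The sketch below is a plain string operation, not a model of the enzyme: it assumes standard Watson-Crick pairing (A with U, G with C) and the usual convention of writing RNA sequences from 5′ to 3′; the function name and the example sequence are hypothetical.

```python
# Standard base pairing in RNA (assumed here: A-U and G-C only).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def copy_strand(template: str) -> str:
    """Return the complementary strand of `template`, both written 5' to 3'.

    The two strands of a duplex run antiparallel, so the complement has to be
    reversed to be written in the same 5'-to-3' direction.
    """
    template = template.upper()
    unknown = set(template) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(unknown)}")
    return "".join(COMPLEMENT[base] for base in reversed(template))


genome = "AUGC"                        # hypothetical example sequence
first_copy = copy_strand(genome)       # complementary strand
second_copy = copy_strand(first_copy)  # copying the copy restores the original
print(genome, "->", first_copy, "->", second_copy)
```

Copying the complementary strand once more yields the original sequence again, which is why two rounds of synthesis are needed to obtain a true copy of a genome.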

Coronaviruses are RNA viruses and therefore have an RNA-dependent RNA polymerase (RdRp, or nsp12). However, they are exceptional in several ways. First, their RNA genomes are almost 30,000 nucleotides long, which places them among the largest RNA virus genomes known to date. Second, their RNA polymerase nsp12 requires the additional proteins nsp7 and nsp8 in order to form the active “core” RdRp. Third, this core RdRp assembles with further viral proteins to form the RTC, which can carry out additional functions such as proofreading to remove copying errors, a highly unusual capability for RNA viruses.

Since the RNA polymerase has such a fundamental job during virus replication, it is an attractive drug target to combat viral infections. Indeed, many successful anti-viral drugs against Hepatitis C virus, HIV or Herpes virus act by inhibiting viral polymerases. Strikingly, viral RdRp enzymes are remarkably similar in their overall structure even between unrelated viruses, indicating that they share a common evolutionary ancestor and that their function is so essential that it does not allow for drastic changes. Therefore, some known anti-virals developed to treat other viral diseases have also been tested for their activity against the polymerase of SARS-CoV-2 and even approved for clinical use against COVID-19 [1]. However, these repurposed compounds are generally not as effective as many had hoped, because even rather subtle structural differences between the polymerase enzymes of different viruses can have strong effects on the action of anti-viral drugs. Thus, more specific and efficient drugs against SARS-CoV-2 are badly needed. In order to discover and develop such compounds, detailed knowledge of the structure and function of the RdRp is necessary.

Structure of the coronavirus RNA-dependent RNA polymerase (RdRp)

The first structure of a coronavirus polymerase was determined shortly before the outbreak of the current COVID-19 pandemic, when Kirchdoerfer and Ward reported the cryo-electron microscopy (cryo-EM) structure of SARS-CoV-1 RdRp [2]. Since SARS-CoV-2 emerged, scientists all over the world have been racing to determine the structures of its RdRp. As of February 2021, this has led to more than 20 structures of SARS-CoV-2 polymerase complexes deposited in the PDB, and often several groups of scientists reported similar structures around the same time.

These structures show that the RNA polymerase nsp12 resembles a right hand with individual domains called palm, fingers and thumb (Figure 1) [3–7]. This “hand” shape is typical for viral RNA polymerases and holds a tight grip on the double-stranded RNA helix that forms between template and product strand during RNA synthesis. Within the palm lies the “active center” of the enzyme, where nucleotides are added to the growing product chain. The active center is accessible from the surface of the enzyme through a special tunnel, so that nucleotides can enter the substrate-binding site. As the template strand is opposite to the substrate-binding site, each nucleotide entering is “sampled” for whether it can form base-pairing interactions with the template base. If this is the case, the nucleotide remains bound, and it is added to the 3’ end of the product strand by forming a chemical bond. After that, the RdRp enzyme must slide ahead on the template strand by one nucleotide, which moves the newly produced 3’ end of the product RNA from the substrate-binding site (which is sometimes also referred to as position +1) to the position where the previous 3’ end of the product was located prior to addition (position -1). This “translocation” completes the nucleotide addition cycle, as it positions the next templating base and frees up the substrate-binding site for the next matching building block.
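The nucleotide addition cycle described above can be sketched as a small toy model in Python. This is only an illustration under simplifying assumptions, not a description of the real enzyme kinetics: the function names (`add_one_nucleotide`, `synthesize`) and the idea of an ordered "pool" of candidate nucleotides are hypothetical, and only standard Watson-Crick pairing is considered.

```python
# Watson-Crick pairing rules for RNA (template base -> matching product base).
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def add_one_nucleotide(template, product, ntp_pool):
    """Run one nucleotide addition cycle and return the extended product.

    `template` is written 3'->5', so `product` grows 5'->3' at its 3' end.
    `ntp_pool` is an iterable of candidate nucleotides entering the tunnel
    (a hypothetical simplification of the real substrate mixture).
    """
    templating_base = template[len(product)]  # base opposite the substrate site
    for candidate in ntp_pool:  # each entering nucleotide is "sampled"
        if PAIRS[templating_base] == candidate:
            # The base pair forms, so the nucleotide stays bound and is joined
            # to the 3' end. Returning the longer product also models
            # translocation: the next call reads the next templating base.
            return product + candidate
    raise ValueError("no matching nucleotide in the pool")


def synthesize(template, ntp_pool="AUGC"):
    """Repeat the addition cycle until the whole template has been copied."""
    product = ""
    while len(product) < len(template):
        product = add_one_nucleotide(template, product, ntp_pool)
    return product


print(synthesize("UACG"))  # complementary product strand
```

In this sketch, translocation is implicit in the growing length of the product; a more detailed model would track the +1 and -1 positions explicitly.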

Figure 1 – Structure of SARS-CoV-2 RdRp

Left: Cryo-EM structure of SARS-CoV-2 RdRp (PDB 6YYT). Right: Enlarged active site with template strand in blue and nascent chain in red.

In addition to its polymerase domain, RdRp also contains a part that is only found in nidoviruses (the virus family that coronaviruses belong to) called the “nidovirus RdRp-associated nucleotidyltransferase domain”, or NiRAN domain for short. Scientists believe that this domain has the capability to transfer nucleotidyl residues, which means that it can form chemical bonds between nucleotides and other molecules. This hints that it may be involved in modification of the RNA (so-called “capping”) or in helping the enzyme kickstart RNA synthesis, but its precise role during coronavirus replication is still being studied.

In order to efficiently copy RNA, nsp12 requires two additional viral proteins, nsp7 and nsp8. The structures of coronavirus RdRp show that two molecules of nsp8 and one molecule of nsp7 bind on top of the hand-shaped nsp12. Interestingly, even though identical in amino acid sequence, the two nsp8 molecules adopt slightly different shapes. While one of them interacts with the finger domain of nsp12 directly, the interaction of the other one is mediated by nsp7. Both nsp8 molecules have long “arms” that protrude away from the polymerase and touch the RNA duplex as it emerges from the polymerase during replication. These “sliding poles” are unique to coronaviruses and most likely stabilize the RdRp on the RNA, which may help to make sure it doesn’t fall off during replication of the very large genome.

Seeing is believing - visualizing how anti-viral compounds block RdRp

So how exactly can this structural knowledge help to find new drugs against COVID-19? In many ways, enzymes are like tiny molecular machines. By studying their structure, one can analyze in detail how they work biochemically, and this in turn allows us to come up with ways to block their function. Most known anti-viral drugs that target RNA polymerases are so-called nucleoside analogs, which means they are molecules that structurally resemble the natural building blocks of RNA. These compounds can “trick” the RdRp by binding to the active site, but due to their chemical nature, they either cannot be incorporated into the product or lead to mutations that end the viral life cycle. The structures of SARS-CoV-2 RdRp reveal the exact architecture of the active site and show how the chemical environment that the enzyme creates around the product RNA and the substrate nucleotides facilitates polymerization (Figure 1). This knowledge can help to rationally design or improve compounds in such a way that they bind more efficiently.

This is exemplified by recent studies analyzing how repurposed anti-virals inhibit SARS-CoV-2 RdRp. One such drug is Remdesivir, a compound originally developed against Ebola and other viruses that has also been approved by the FDA and European agencies for treatment of COVID-19 (see also this previous post). Remdesivir chemically resembles adenosine triphosphate (ATP), but has an additional bulky chemical residue called a cyano group attached to the C1′ atom of its ribose moiety. In contrast to most nucleoside analogs, Remdesivir does not block the RdRp immediately, but only after another three nucleotides are added, a process called “delayed stalling” [8–10]. Initial structures of SARS-CoV-2 RdRp in the presence of Remdesivir showed how it can act as an adenosine analog and how it can be incorporated at the 3’ end of the nascent product RNA strand and translocated to the -1 position (Figure 2a,b) [5,6]. However, this could not explain how it would lead to inhibition of RNA synthesis.

To pinpoint why Remdesivir interferes with RNA synthesis exactly after three subsequent nucleotides are added, the authors of a recent study used a combination of synthetic chemistry and structural biology [11]. They systematically determined structures of SARS-CoV-2 RdRp bound to a template-product RNA duplex which contained Remdesivir and either two or three additional nucleotides at the 3’ end. In the first case, the structure showed that the RdRp was in the post-translocated state, with Remdesivir at position -3 and an empty substrate-binding site, as expected (Figure 2c). In contrast, the structure of the RdRp with an RNA containing Remdesivir and three additional nucleotides was not in the post-translocated state and Remdesivir was not located at position -4. Instead, it remained at position -3, and the third additional nucleotide at the 3’ end of the product was stuck in the substrate-binding site (Figure 2d). This state resembles the situation directly after addition of a new nucleotide but before translocation and is hence called the pre-translocated state. This suggests that Remdesivir inhibits the SARS-CoV-2 RNA polymerase by posing a translocation barrier, and the structures provide a molecular and chemical explanation for this. Initially, Remdesivir can be added to the growing RNA just like adenosine triphosphate and can also be translocated so that another two nucleotides are added. However, after binding and addition of a third nucleotide, Remdesivir cannot be translocated to the -4 position, because its bulky cyano group would clash with a serine residue (Ser861) in the thumb domain of nsp12 (Figure 2c). Therefore, the polymerase gets stuck in the pre-translocated state, which explains why exactly three nucleotides can be added after Remdesivir incorporation: addition of a fourth nucleotide would first require translocation. This proposed mechanism agrees with previous modelling [5,10] and was independently confirmed shortly afterwards by another structural study, in which the authors trapped an identical pre-translocated, stalled intermediate with Remdesivir in position -3 [12].
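The position bookkeeping behind delayed stalling can be made explicit with a short Python sketch. It is a toy model under stated assumptions: Remdesivir is written as `R`, the translocation barrier at position -4 is treated as absolute, and the function name `extend_with_remdesivir` and its parameters are hypothetical and introduced only for illustration.

```python
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}
CLASH_POSITION = -4  # assumption from the text: the cyano group would clash with Ser861 here


def extend_with_remdesivir(template, remdesivir_sites=()):
    """Copy `template` (written 3'->5') and report whether synthesis stalls.

    `remdesivir_sites` holds product indices at which Remdesivir ("R") is
    incorporated instead of adenosine. Returns (product, stalled).
    """
    product = []
    for index, templating_base in enumerate(template):
        nucleotide = PAIRS[templating_base]
        if index in remdesivir_sites:
            if nucleotide != "A":
                raise ValueError("Remdesivir only replaces adenosine")
            nucleotide = "R"
        product.append(nucleotide)  # the new 3' end sits in the substrate site (+1)

        # Attempt translocation: the 3' end would move to -1, so the nucleotide
        # at list index j would end up at position j - len(product).
        for j, n in enumerate(product):
            if n == "R" and j - len(product) == CLASH_POSITION:
                # Translocation is blocked: the enzyme stays pre-translocated,
                # with R at -3 and the newest nucleotide still at +1.
                return "".join(product), True
    return "".join(product), False


print(extend_with_remdesivir("GCUAGCAUG", {2}))  # stalls three nucleotides after R
print(extend_with_remdesivir("GCUAGCAUG"))       # full-length product, no stall
```

With Remdesivir at product index 2, exactly three further nucleotides are added before the model stalls, mirroring the structural observation. A real polymerase can sometimes overcome the barrier, which this sketch deliberately ignores.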

Figure 2 – How remdesivir inhibits SARS-CoV-2 RdRp

Structural snapshots of remdesivir (purple) moving through the active site of SARS-CoV-2 RdRp. When it reaches the third position after its incorporation into the RNA (-3), its further movement is blocked because the cyano group would bump into Ser861. A) Remdesivir at position +1 (PDB 7BV2). B) Remdesivir at position -1 (PDB 7C2K). C) Remdesivir at position -3 (PDB 7B3B). D) Remdesivir at position -3 with the nucleotide at the 3’ end stuck in the substrate-binding site (PDB 7B3C).

This mechanism also suggests how Remdesivir may at least partially escape the coronavirus proofreading enzyme nsp14, which removes misincorporated nucleotides at the 3’ end of the RNA and thus counteracts anti-virals that target RdRp. Since Remdesivir can be translocated until it reaches position -3 before it causes stalling, it may leave the active site of the enzyme before it can be recognized by the proofreading machinery.

Importantly, these studies also provide clues as to why Remdesivir has had limited success in fighting COVID-19. The structures show that the steric block between Ser861 and the cyano group of Remdesivir is not severe and can therefore be overcome by the enzyme, for example at high concentrations of substrate NTPs [8]. Consistent with this, substitution of Ser861 with residues that clash even less (alanine or glycine) makes the RdRp less sensitive or even resistant to Remdesivir [5,13]. This suggests that the translocation barrier could potentially be enhanced by a compound that leads to more severe clashes. One way to achieve this could be to modify Remdesivir to contain bulkier chemical moieties than the cyano group. Thus, the detailed molecular insights into the mechanism of Remdesivir also provide a rational basis for designing more potent anti-virals and testing their effect on the SARS-CoV-2 RdRp.

Similar structure-function studies are now also being undertaken for other promising anti-viral compounds, such as Favipiravir. Like Remdesivir, it is a nucleoside analog that was initially developed against other viruses, but showed some promising results against SARS-CoV-2. Structures of the SARS-CoV-2 RNA polymerase with Favipiravir show how it mimics either guanosine or adenosine in the active site of the enzyme by forming unusual base-pairing interactions with cytosine and uracil, respectively, and this leads to errors during RNA copying that eventually kill the virus [14,15]. Another study recently reported the structure of Suramin bound to SARS-CoV-2 RdRp [16]. In contrast to Remdesivir and Favipiravir, this compound is not a nucleoside analog and hence does not get incorporated into the RNA. Instead, two Suramin molecules can apparently bind to the RdRp and thereby prevent its association with template and product RNA, rendering it inactive.
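To illustrate why ambiguous base pairing produces copying errors, the following Python sketch enumerates the product strands that are possible once a strand contains Favipiravir (written `F`). It rests on two assumptions that go beyond the text above: that the same ambiguity (pairing with either C or U) applies when an F-containing strand later serves as a template, and that every pairing option is equally allowed. The names `PAIR_OPTIONS` and `possible_copies` are hypothetical.

```python
from itertools import product as cartesian

# Standard pairs plus the ambiguous pairing described for Favipiravir ("F").
PAIR_OPTIONS = {
    "A": ["U"],
    "U": ["A"],
    "G": ["C"],
    "C": ["G"],
    "F": ["C", "U"],  # assumption: F can template either C or U
}


def possible_copies(template):
    """Return every product strand that the pairing rules allow for `template`."""
    choices = [PAIR_OPTIONS[base] for base in template]
    return ["".join(combo) for combo in cartesian(*choices)]


print(possible_copies("GAU"))  # a single, unambiguous copy
print(possible_copies("GFA"))  # two different copies because of F
```

Each `F` in the template doubles the number of possible copies, which gives an intuition for how such an analog can spread errors through successive rounds of replication.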

These studies are good examples of how structural biology can visualize complicated chemical reactions in an intuitive way. Based on these results, drugs like Remdesivir or Favipiravir can be rationally optimized to combat COVID-19 more effectively and possibly with fewer side effects.

Dissecting the SARS-CoV-2 RTC structure by structure

In addition to providing detailed snapshots of how anti-viral compounds act to inhibit SARS-CoV-2 RdRp, structural studies are also helping scientists to understand how the unique coronavirus RTC combines different functions such as RNA synthesis, proofreading and modification. After the structure of the “core” SARS-CoV-2 RdRp was determined within a few months of the outbreak of COVID-19, scientists quickly moved to studying how additional non-structural proteins bind to it to form the RTC. One of these is nsp13, which belongs to a protein class called “helicases”. These are enzymes that bind to DNA or RNA and, with the help of chemical energy in the form of ATP, move along them or unwind helices. Cryo-EM structures of nsp13 bound to the SARS-CoV-2 RdRp show that two molecules of nsp13 can bind to the RdRp, and suggest that nsp13 may allow the polymerase to move backwards on the RNA (Figure 3) [17,18]. This finding seems unexpected at first, but experts think this may be required for the proofreading enzyme nsp14 to remove errors that the polymerase makes during copying or to produce the mRNAs for certain viral proteins. Another recent cryo-EM structure shows that the small protein nsp9 interacts with the NiRAN domain in the SARS-CoV-2 polymerase [19]. In the accompanying paper, the authors suggest that the NiRAN domain may be involved in RNA capping, and nsp9 seems to block its activity. However, others have proposed that nsp9 is in fact a substrate of nucleotidylation by the NiRAN domain, and that it may be involved in priming the RNA polymerase for initial RNA synthesis [20]. Thus, further studies are necessary to determine whether the NiRAN domain is involved in capping, priming or even both.

Figure 3 – Structures help uncover the roles of the different RTC components

Structure of the SARS-CoV-2 RdRp complex with nsp13 (salmon) and nsp9 (cyan) (PDB 7CYQ).

What’s next?

Over the past year, scientists have uncovered the structure of the SARS-CoV-2 RdRp and associated proteins at a record-breaking pace. While these structures provide impressive first glimpses at RdRp-complexes, researchers are already working to determine how the remaining nsp proteins interact with the RdRp to form the complete RTC complex. This may ultimately aid the quest to find new treatment options for COVID-19 that not only target the polymerase itself, but also proofreading or RNA capping. Such drugs are not only desperately needed for the current pandemic, but may also prove useful for future emerging coronaviruses, because the RNA polymerase is typically very similar even between different virus strains.

1. Ledford H: Hopes rise for coronavirus drug remdesivir. Nature 2020, doi:10.1038/d41586-020-01295-8.

2. Kirchdoerfer RN, Ward AB: Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors. Nat Commun 2019, 10:2342.

3. Gao Y, Yan L, Huang Y, Liu F, Zhao Y, Cao L, Wang T, Sun Q, Ming Z, Zhang L, et al.: Structure of the RNA-dependent RNA polymerase from COVID-19 virus. Science 2020, 368:779–782.

4. Hillen HS, Kokic G, Farnung L, Dienemann C, Tegunov D, Cramer P: Structure of replicating SARS-CoV-2 polymerase. Nature 2020, 584:154–156.

5. Wang Q, Wu J, Wang H, Gao Y, Liu Q, Mu A, Ji W, Yan L, Zhu Y, Zhu C, et al.: Structural Basis for RNA Replication by the SARS-CoV-2 Polymerase. Cell 2020, 182:417-428.e13.

6. Yin W, Mao C, Luan X, Shen D-D, Shen Q, Su H, Wang X, Zhou F, Zhao W, Gao M, et al.: Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS-CoV-2 by remdesivir. Science 2020, 368:1499–1504.

7. Peng Q, Peng R, Yuan B, Zhao J, Wang M, Wang X, Wang Q, Sun Y, Fan Z, Qi J, et al.: Structural and Biochemical Characterization of the nsp12-nsp7-nsp8 Core Polymerase Complex from SARS-CoV-2. Cell Reports 2020, 31:107774.

8. Gordon CJ, Tchesnokov EP, Woolner E, Perry JK, Feng JY, Porter DP, Götte M: Remdesivir is a direct-acting antiviral that inhibits RNA-dependent RNA polymerase from severe acute respiratory syndrome coronavirus 2 with high potency. J Biol Chem 2020, 295:6785–6797.

9. Tchesnokov EP, Feng JY, Porter DP, Götte M: Mechanism of Inhibition of Ebola Virus RNA-Dependent RNA Polymerase by Remdesivir. Viruses 2019, 11:326.

10. Gordon CJ, Tchesnokov EP, Feng JY, Porter DP, Götte M: The antiviral compound remdesivir potently inhibits RNA-dependent RNA polymerase from Middle East respiratory syndrome coronavirus. J Biol Chem 2020, doi:10.1074/jbc.ac120.013056.

11. Kokic G, Hillen HS, Tegunov D, Dienemann C, Seitz F, Schmitzova J, Farnung L, Siewert A, Höbartner C, Cramer P: Mechanism of SARS-CoV-2 polymerase stalling by remdesivir. Nat Commun 2021, 12:279.

12. Bravo JPK, Dangerfield TL, Taylor DW, Johnson KA: Remdesivir is a delayed translocation inhibitor of SARS CoV-2 replication in vitro. Biorxiv 2020, doi:10.1101/2020.12.14.422718.

13. Tchesnokov EP, Gordon CJ, Woolner E, Kocinkova D, Perry JK, Feng JY, Porter DP, Götte M: Template-dependent inhibition of coronavirus RNA-dependent RNA polymerase by remdesivir reveals a second mechanism of action. J Biol Chem 2020, 295:16156–16165.

14. Naydenova K, Muir KW, Wu L-F, Zhang Z, Coscia F, Peet MJ, Castro-Hartmann P, Qian P, Sader K, Dent K, et al.: Structural basis for the inhibition of the SARS-CoV-2 RNA-dependent RNA polymerase by favipiravir-RTP. Biorxiv 2020, doi:10.1101/2020.10.21.347690.

15. Peng Q, Peng R, Yuan B, Wang M, Zhao J, Fu L, Qi J, Shi Y: Structural basis of SARS-CoV-2 polymerase inhibition by Favipiravir. Innovation 2021, doi:10.1016/j.xinn.2021.100080.

16. Yin W, Luan X, Li Z, Zhou Z, Wang Q, Gao M, Wang X, Zhou F, Shi J, You E, et al.: Structural basis for inhibition of the SARS-CoV-2 RNA polymerase by suramin. Nat Struct Mol Biol 2021, 28:319–325.

17. Chen J, Malone B, Llewellyn E, Grasso M, Shelton PMM, Olinares PDB, Maruthi K, Eng ET, Vatandaslar H, Chait BT, et al.: Structural basis for helicase-polymerase coupling in the SARS-CoV-2 replication-transcription complex. Cell 2020, doi:10.1016/j.cell.2020.07.033.

18. Yan L, Zhang Y, Ge J, Zheng L, Gao Y, Wang T, Jia Z, Wang H, Huang Y, Li M, et al.: Architecture of a SARS-CoV-2 mini replication and transcription complex. Nat Commun 2020, 11:5874.

19. Yan L, Ge J, Zheng L, Zhang Y, Gao Y, Wang T, Huang Y, Yang Y, Gao S, Li M, et al.: Cryo-EM Structure of an Extended SARS-CoV-2 Replication and Transcription Complex Reveals an Intermediate State in Cap Synthesis. Cell 2021, 184:184-193.e10.

20. Slanina H, Madhugiri R, Bylapudi G, Schultheiß K, Karl N, Gulyaeva A, Gorbalenya AE, Linne U, Ziebuhr J: Coronavirus replication–transcription complex: Vital and selective NMPylation of a conserved site in nsp9 by the NiRAN-RdRp subunit. Proc National Acad Sci 2021, 118:e2022310118.

1.    Introduction:

With SARS-CoV-2 infections and related death rates continuing to rise worldwide and new variants emerging, the virus is still a great and present danger. Although we have gathered significant knowledge and the first vaccinations have started, new mutations can still set our efforts back and possibly make the virus even more potent. Thus, searches for new treatments are of paramount importance. The nucleocapsid structural protein, or N-Protein, could serve as another drug target.

The nucleocapsid’s main function is to protect the genomic RNA by packaging it into a ribonucleoprotein complex (RNP). Apart from this, the protein has other functions essential for the viral life cycle. It is involved in virion assembly, viral RNA synthesis, transcriptional regulation of genomic RNA, and translation of viral proteins [1].

2.    Structure

The SARS-CoV-2 nucleocapsid is an RNA-binding protein divided into five domains. Three of the domains are intrinsically disordered, which makes them challenging targets for conventional structural characterization. Two of these intrinsically disordered regions (IDRs) are located at the N- and C-termini of the protein, and the third acts as a linker between the two structured domains. Not much is known about the IDRs: their transient structural details are mostly predictions from molecular simulations [2]. The other two domains, the RNA-binding domain and the dimerization domain (DMD), are well organised and their structures have been determined by X-ray diffraction and NMR.


Disordered Domains:

 The Linker

Flexible linkers contain a large number of polar and charged amino acids. The resulting electrostatic repulsion and the lack of a stabilizing hydrophobic core prevent a well-structured conformation, resulting in disorder. Experiments have shown that the Linker region of the SARS-CoV-2 nucleocapsid incorporates such polar regions, which are repelled by the neighbouring folded domains. The Linker contains a positively charged serine- and arginine-rich motif, which likely functions as a phosphorylation site and as a site of direct interaction with RNA, the M (membrane) protein, and Nsp3 [1,2]. Simulations reveal that the Linker only occasionally adopts helical conformations, which are transient, but it may contribute to oligomerization or act as a recognition motif for the binding of other proteins. Intrinsically disordered regions, in general, are thought to be involved in a number of regulatory functions including modulation of transcription, translation, post-translational modifications such as phosphorylation, and cell signalling, often through ordering when in contact with another protein domain [1].

The disordered N- and C-terminal domains

The N- and C-terminal regions of the nucleocapsid protein are also disordered, but both contain several regions that may form transient helices [2]. The conformation of the N-terminal region is significantly affected by the neighbouring folded RNA binding domain. Electrostatic interactions are proposed to repel the positively charged N-terminal region from the positively charged surface of the RNA binding domain and to attract it to the slightly negatively charged parts [2].

The other disordered tail, the C-terminal domain, interacts with the neighbouring folded dimerization domain, competing with intradomain interactions [2].

Folded domains:

The RNA-binding domain and dimerization domain are well-organised folded domains. They make up 257 of the 422 residues in nucleocapsid. All five domains, nevertheless, have been proposed to be involved in RNA-binding [1].

The RNA-binding domain

This domain mainly interacts with RNA through residues in a positively charged β-hairpin and the so-called palm region. It is rich in aromatic and basic residues that are folded into a right-hand-like shape with a protruding basic finger, a basic palm, and an acidic wrist (see Fig. 1 B). Crystal structures from SARS-CoV-2 show two right-handed loops that surround the β-sheet core in a sandwiched structure. The β-sheet core consists of four antiparallel β-strands, a short 3₁₀ helix in front of the β2 strand and a protruding β-hairpin that is located between the β2 and β5 strands (see Fig. 1 A). The structural basis for RNA binding by the nucleocapsid is not yet known, but comparisons with the milder human coronavirus HCoV-OC43 suggest a unique potential RNA binding pocket beside the β-sheet core [3,4].

Fig. 1. Structure of the RNA binding domain of the Nucleocapsid protein of SARS-CoV-2 (PDB: 7CDZ). Image: Oliver Kippes

The dimerization domain (DMD)

The dimerization domain (DMD) is only stable when several nucleocapsid molecules form a dimer or oligomer. Its structure consists of three 3₁₀-helices, five α-helices and two antiparallel β-strands, which form a β-hairpin. Together with the other parts of the domain, this β-hairpin forms a shape resembling the letter “C”. Two domains form a tight homodimer with a rectangular slab shape, in which the β-hairpins from each N-Protein lie on one side and the helices on the opposite side. The dimer is stabilized through hydrogen bonds and hydrophobic interactions. The DMD may also have RNA binding activity: experiments showed that the amount of free SARS-CoV-2 RNA decreases when DMD proteins are added [4,5].

Fig. 2. Structure of the Dimerization domain of the Nucleocapsid protein of SARS-CoV-2 (PDB: 7C22). Image: Oliver Kippes

The PDB currently has 22 structures that show the RNA binding domain and the dimerization domain. The structures 7ACT and 7ACS are particularly interesting because they are the only ones in complex with RNA. The RNA binding domain is also a potential inhibitor target and a subject of inhibitor studies [6]. For the dimerization domain, there are structures of both the monomer and the dimer. There is no structure of the whole protein in the PDB yet.

3.        Ribonucleoprotein Complex (RNP):

To package the viral RNA genome, the nucleocapsid binds the RNA via the RNA binding domain to form a long, flexible, helical ribonucleoprotein complex [1]. Two key functionalities are necessary for this process: the nucleocapsid must interact with the nucleic acid, which is preferentially mediated by GGG motifs from the leader RNA sequences [7], and the nucleocapsids need the ability to oligomerize. They interact with the RNA at multiple sites through specific (sequence-dependent) and non-specific (sequence-independent) binding. Little is known about specific binding to the RNA, but non-specific binding is likely to involve interactions between the negatively charged phosphate backbone of the RNA and the positively charged groove formed by residues 248-280 of the N protein. It also seems clear that the nucleocapsid helps RNA folding [1]. The helical RNPs consist of coils 9–16 nm in diameter with a hollow interior 3–4 nm wide. They are frequently twisted upon themselves, and most of the RNPs are supercoiled into compact intertwined structures [1]. A recent cryo-electron tomography analysis of SARS-CoV-2 revealed another potential structure of the RNP, described as ‘beads on a string’, in which neighbouring RNPs are linked together; more research on this is urgently needed [16]. In addition, the exact mechanisms by which the nucleocapsid protects the RNA are still unknown [1].

Coronavirus Nucleocapsid & RNA - Components of the SARS-CoV-2 Virus. Image: Thomas Splettstößer

4.    Functions:

Nucleocapsids are multifunctional proteins necessary for the viral life cycle. The main function of the nucleocapsids is the packaging of the genomic RNA into ribonucleoprotein complexes to protect the RNA. An additional function is to enhance the stability of the entire virion through interactions with the membrane protein located in the enclosing viral membrane [8]. These interactions are also seen in SARS-CoV-1, where the membrane protein binds directly to the nucleocapsid via an ionic interaction [1]. The nucleocapsid of SARS-CoV-2 is an antagonist for interferons, suppressing the host’s defence mechanisms by preventing the synthesis of antiviral proteins [9]. Studies in both SARS-CoV-1 and SARS-CoV-2 have shown interactions between nucleocapsid and gRNA/sgRNA which indicate a role for the nucleocapsid in viral transcription and translation [1,10]. The N-Protein could also have an important role during viral assembly through interactions with envelope proteins [1,11].

Many of the supplementary functions of the SARS-CoV-2 N-Proteins are still up for debate. A complete atomic structure of the RNP complex would go a long way in answering these questions, but the labile nature of the full-length N-Protein makes this a difficult task [1].

5.    Comparison between Coronaviruses:

The Nucleocapsid is the most conserved of the structural proteins in all coronaviruses [12]. This has proven useful for the development of SARS-CoV-2 Rapid Antigen Tests (for example the Roche Test). The appearance of the SARS-CoV-2 variant VUI 202012/01, first identified in England, with changes to the spike protein, underlines the importance of having multiple drug and test targets, particularly those that are less likely to mutate, such as the nucleocapsid [13].

The high sequence similarity also allows functions to be compared within the coronavirus family, so a comparison between related nucleocapsids from β-Coronaviruses may shed light on these proteins’ structures and functions. The two ordered domains and the C-terminal IDR share a similar topological organization with other Coronaviruses and are involved in multiple functions in the viral life cycle. A study of the coronavirus Mouse Hepatitis Virus (MHV) analysed the recruitment of its nucleocapsid to Replication Transcription Complexes (RTCs) and revealed an interaction of regions on its N-terminal IDR and the serine/arginine-rich region of the Linker Domain with NSP3. The interactions with NSP3 stimulate RNA replication in MHV [10]. Experiments have shown that the nucleocapsid from SARS-CoV-1 binds to NSP3 from MHV, and interactions with NSP3 have been identified in other coronaviruses as well [14,15]. Thus, interactions between N-Proteins and non-structural proteins (Nsps) are proposed to stimulate RNA synthesis in Coronaviruses in general [10].

SARS-CoV-1 and MHV both exhibit helically packed RNP complexes [6]; however, it is believed that SARS-CoV-2 may have a different organization [16]. Crystal structures of the nucleocapsid RNA binding domain from SARS-CoV-1 and SARS-CoV-2 show different crystal symmetry and packing, which could mean that SARS-CoV-2 N-Proteins have other potential contacts than those of SARS-CoV-1 [4]. Other possible organizations include a lattice of nucleocapsid complexes with the viral RNA linked to neighbouring RNPs like ‘beads on a string’. This ‘string’ structure allows the large RNA genome to be packed efficiently and gives the virus particles the high steric flexibility that is required for incorporation into budding virions. The packaging mechanism of the SARS-CoV-2 nucleocapsid needs to be explained before we can expect to deduce an effective therapeutic approach or vaccination mechanism [17].


1. McBride R, van Zyl M, Fielding B. The Coronavirus Nucleocapsid Is a Multifunctional Protein. Viruses. Published online August 7, 2014:2991-3018. doi:10.3390/v6082991

2. Cubuk J, Alston JJ, Incicco JJ, et al. The SARS-CoV-2 nucleocapsid protein is dynamic, disordered, and phase separates with RNA. Published online June 18, 2020. doi:10.1101/2020.06.17.158121

3. Peng Y, Du N, Lei Y, et al. Structures of the SARS-CoV-2 nucleocapsid and their perspectives for drug design. EMBO J. Published online September 11, 2020. doi:10.15252/embj.2020105938

4. Kang S, Yang M, Hong Z, et al. Crystal structure of SARS-CoV-2 nucleocapsid protein RNA binding domain reveals potential unique drug targeting sites. Acta Pharmaceutica Sinica B. Published online July 2020:1228-1238. doi:10.1016/j.apsb.2020.04.009

5. Zhou R, Zeng R, von Brunn A, Lei J. Structural characterization of the C-terminal domain of SARS-CoV-2 nucleocapsid protein. Mol Biomed. Published online August 6, 2020. doi:10.1186/s43556-020-00001-4

6. Chang C, Hou M-H, Chang C-F, Hsiao C-D, Huang T. The SARS coronavirus nucleocapsid protein – Forms and functions. Antiviral Research. Published online March 2014:39-50. doi:10.1016/j.antiviral.2013.12.009

7. Lutomski CA, El-Baba TJ, Bolla JR, Robinson CV. Proteoforms of the SARS-CoV-2 nucleocapsid protein are primed to proliferate the virus and attenuate the antibody response. Published online October 6, 2020. doi:10.1101/2020.10.06.328112

8. Lu S, Ye Q, Singh D, Villa E, Cleveland DW, Corbett KD. The SARS-CoV-2 Nucleocapsid phosphoprotein forms mutually exclusive condensates with RNA and the membrane-associated M protein. Published online July 31, 2020. doi:10.1101/2020.07.30.228023

9. Mu J, Fang Y, Yang Q, et al. SARS-CoV-2 N protein antagonizes type I interferon signaling by suppressing phosphorylation and nuclear translocation of STAT1 and STAT2. Cell Discov. Published online September 15, 2020. doi:10.1038/s41421-020-00208-3

10. Cascarina SM, Ross ED. A proposed role for the SARS-CoV-2 nucleocapsid protein in the formation and regulation of biomolecular condensates. FASEB J. Published online June 20, 2020:9832-9842. doi:10.1096/fj.202001351

11. Chen H, Cui Y, Han X, et al. Liquid–liquid phase separation by SARS-CoV-2 nucleocapsid protein and RNA. Cell Res. Published online September 8, 2020:1143-1145. doi:10.1038/s41422-020-00408-2

12. Chechetkin VR, Lobzin VV. Ribonucleocapsid assembly/packaging signals in the genomes of the coronaviruses SARS-CoV and SARS-CoV-2: detection, comparison and implications for therapeutic targeting. Journal of Biomolecular Structure and Dynamics. Published online September 9, 2020:1-15. doi:10.1080/07391102.2020.1815581

13. Diagnostics Roche. SARS-CoV-2 Rapid Antigen Test. diagnostics.roche. Published February 22, 2021. Accessed February 22, 2021.

14. Hurst KR, Ye R, Goebel SJ, Jayaraman P, Masters PS. An Interaction between the Nucleocapsid Protein and a Component of the Replicase-Transcriptase Complex Is Crucial for the Infectivity of Coronavirus Genomic RNA. JVI. Published online July 21, 2010:10276-10288. doi:10.1128/jvi.01287-10

15. Cong Y, Ulasli M, Schepers H, et al. Nucleocapsid Protein Recruitment to Replication-Transcription Complexes Plays a Crucial Role in Coronaviral Life Cycle. Dutch RE, ed. J Virol. Published online November 15, 2019. doi:10.1128/jvi.01925-19

16. Hurst KR, Koetzner CA, Masters PS. Characterization of a Critical Interaction between the Coronavirus Nucleocapsid Protein and Nonstructural Protein 3 of the Viral Replicase-Transcriptase Complex. Journal of Virology. Published online June 12, 2013:9159-9172. doi:10.1128/jvi.01275-13

17. Klein S, Cortese M, Winter SL, et al. SARS-CoV-2 structure and replication characterized by in situ cryo-electron tomography. Published online June 23, 2020. doi:10.1101/2020.06.23.167064


During the Corona-dominated year 2020, scientists all over the world united and gathered as much information as possible to understand the exact mechanisms behind the life cycle of SARS-CoV-2.
The main question was: how can we stop the virus from invading the human cell and causing COVID-19? One focus in the quest to answer this question was the SARS-CoV-2 entry mechanism. The group of Janet Iwasa contributes to this ongoing research by providing a high-quality video animation of SARS-CoV-2 entry into the human host cell. The current version of the entry animation has already been shown on PBS News (08.12.20), and we aim to improve it with your help in 2021 (see below)!

The Entry Animation

Click this Link to see the Entry Animation on YouTube.

This entry animation is a collection of current knowledge about the SARS-CoV-2 entry mechanism. What we know at this point is that the mechanism starts with the viral approach. An individual can be infected with SARS-CoV-2 after inhaling airborne viral particles. These viruses can then travel into the airways, where they may encounter host cells of the respiratory epithelium in the trachea and lungs.

As you can read in a previous blogpost, the Spikes (teal) are Corona’s key to invade the host cell and thus of great interest in terms of vaccination and therapeutic approaches against COVID-19. The Spike protein recognizes a specific receptor on the human host cell surface, called ACE2 (purple). Usually, the Spikes are very dynamic and able to undergo opening, closing and bending movements. But after binding to ACE2, the protein is locked into its open position.  Another protein on the cell surface, called TMPRSS2 (orange), can then come along and cut the Spike protein in a specific location. These segments of the Spike protein fall away, exposing a portion of the Spike protein which was previously hidden. 

The Spike protein is then able to undergo a series of dramatic conformational changes. During the first stage, the Spike protein inserts itself into the membrane of the cell. In the second stage, segments of the Spike protein zipper back on themselves, forcing the membrane of the cell and the viral membrane to fuse. After fusion, the viral RNA is deposited into the host cell, where it will direct the cell to produce more virions. The conformation the Spike adopts at the end of this process is known as the post-fusion state.

The Annotation Tool

Click this Link to use the Annotation Tool.

Figure 1: Annotation tool with the animation in the center, annotations from the Iwasa Lab on the left and Comments on the right.
Figure 2: What it looks like when you hover over the video.

In January, the animation will be supplemented with an annotation tool so that knowledge about the SARS-CoV-2 entry mechanism can be discussed interactively by scientists all over the world. This online platform will serve as a basis for scientific discussion. Scientific users can set a pin at any point of the video and add suggestions, criticism or questions about the mechanism and the structure depictions (see Fig. 1 for a prototype). Based on these annotations, the Iwasa Group will improve the animation of the entry process to provide an up-to-date, detailed representation of this key process. The resulting entry animation is not only addressed to scientists; it is also used for public outreach and education.

Even though the entry mechanism is not entirely understood yet, it could already be depicted in the fantastic animation of the Iwasa Group. There are still a lot of details and additional information to be found out about this process. From January on, the annotation tool therefore will provide the opportunity to discuss this mechanism publicly.

Thanks to the Iwasa Group for this Christmas present!

Merry Christmas!


It is known as VUI‑202012/01 or B.1.1.7: the new variant of the coronavirus SARS-CoV-2. It may be responsible for a sharply increased number of infections in the southeast of England (1); however, the scientific results that led to very strict lockdown measures in the south of the UK and to travel restrictions across Europe are few and far between. Here, we have compiled what is known so far.

On mutations

Mutations are normal in the evolution of life, and of viruses. If two similar viruses have infected the same cell, their genomes can become mixed up, a process called recombination and one of the reasons why animal influenza strains are considered so dangerous. Mutations can also be caused by chemicals, radiation (including UV light) and errors during genome copying. A typical SARS-CoV-2 virus accumulates about two single-letter mutations per month in its genome, a rate of change about half that of influenza (2). This is because SARS-CoV-2 can repair RNA to some extent. But even so, this natural process has led to thousands of mutations since the beginning of the pandemic. If a mutation affected the virus life cycle negatively, that strain will likely have died out; if it made no difference or enhanced the virus's chances of survival, it may have persisted.

SARS-CoV-2 mutations as of 22/12/2020: Mutations happen a lot. A very good interface to the genetic variants of SARS-CoV-2 is Nextstrain. Screenshot by Andrea Thorn / Coronavirus Structural Task Force.

Many of the mutations that are observed occur in the spike protein, which both serves to recognize potential host cells and is what antibodies (i.e., the immune system) recognize.

Changes here can be crucial for the survival of the virus (“evolutionary pressure”) as they could significantly alter its affinity to the human receptor ACE2, which the virus uses as a gateway to our cells.

Animation of spike protein binding the host cell and the molecular mechanism merging host cell and virus. CC-BY-NC Coronavirus Structural Task Force / Iwasa Lab

What vaccines do

Most, if not all, potential COVID-19 vaccines expose our body to some part of the spike protein, which can be made by the body itself (mRNA vaccines) or carried by a harmless virus instead of SARS-CoV-2 (vector vaccines). Our body then produces antibodies which specifically recognize the spike and persist for several months. If we are exposed afterwards to the real virus, the body can recognize it immediately, and the risk of infection is much lower because the immune system swings into action at once. Earlier this year, the spike mutation D614G (amino acid residue number 614 changing from aspartic acid (D) to glycine (G)) caused quite a stir in the media and became the predominant form of SARS-CoV-2 (2, 3). However, whether and to what extent this was caused by natural selection is still debated (3). Another example which triggered increased media coverage was the spike mutation Y453F, which originated from infected minks in Denmark (4) and led to the culling of millions of animals. In any case, if we were vaccinated with a form of the spike protein different from the one in a virus we encounter later, there is a chance that the vaccine may be rendered ineffective. This chance is, however, small for SARS-CoV-2, and in any case much smaller than for HIV, which has famously evaded every attempt to develop a vaccine.

Model of spike (green) with bound antibody (yellow). Both models can be 3D printed (Instructions). Photo CC-BY-NC 2020 Andrea Thorn / Coronavirus Structural Taskforce.

What do we know?

There was a steep rise in infections in the UK recently, as in most other European countries.

A new variant of the virus has emerged and seems to be replacing the old version of SARS-CoV-2 (5). Thousands of patients have been found to carry this variant.

This new variant has more mutations at once than expected. These mutations have not been observed in this combination before.

The variant has been reported in the UK, the Netherlands, Denmark, Australia and Belgium so far.

What is striking to me as a scientist about these findings is one thing in particular: How could the British government find that thousands of people had the new SARS-CoV-2 variant, instead of the old, if the illness does not look any different? Sequencing samples from each and every patient would be technically very challenging, if not impossible. How could they know? The answer is: serendipity.


The main PCR test employed in the United Kingdom is Thermo Fisher's TaqPath COVID-19. This test identifies RNA at three different genome locations: in ORF1ab, nucleocapsid and spike. Recently, it stopped working for the spike portion of the test while the other two RNAs were still found to be present, which likely prompted scientists to sequence some of the samples in question. And indeed, the new variant has a deletion of histidine-69 and valine-70, called 69-70del. This permitted easy differentiation of patients with the old SARS-CoV-2 (3 hits) and the new (2 hits) and is the reason why we know so much about the epidemiology of this variant!* It also has to be said that this test is not used as often in other countries, such as Germany, and this could well be the reason why we do not know whether and how widely the variant has spread here. In addition, other countries sequence much smaller proportions of virus isolates than the UK, so ongoing circulation of this variant outside of the UK cannot be excluded.

The details of the mutation

The new SARS-CoV-2 variant VUI-202012/01 has 14 amino acid changes and three deletions affecting the genes for ORF1ab, spike and ORF8. One of these mutations (N501Y) occurs in the receptor binding domain and could lead to an increased binding affinity to human ACE2. The 69-70 deletion likely has an immunological role and is the reason this variant was detected so widely, as this RNA location is used for PCR tests. Another interesting mutation is P681H, which is next to a furin cleavage site that is biologically significant in membrane fusion. These mutations could be responsible for the increased transmissibility. The effects of the other mutations have not been fully investigated yet. Here is a list of the mutations which have been observed in the VUI‑202012/01 or B.1.1.7 variant:

T1001I in gene ORF1ab
A1708D in gene ORF1ab
I2230T in gene ORF1ab
SGF 3675-3677 deletion in gene ORF1ab
HV 69-70 deletion in spike: The 69-70 deletion in the spike protein is a recurring mutation that has been shown to often co-occur with other amino acid changes in the RBD (6, 7).
(1) Evasion of the human immune response, in association with other receptor binding domain changes (1)
(2) Immunological role (8)
(3) Leads to diagnostic failures which permit detection (see above, "Serendipity")
(4) Associated with immune escape in immunocompromised patients (8, 9)
Furthermore, the 69-70 deletion arose in multiple unrelated lineages and is associated with evasion of the immune response (9). It is hypothesized that this mutation undergoes strong positive selection when exposed to convalescent plasma therapy in an immunocompromised human host (7).
Y144 deletion in spike: Deletion in the spike N-terminal domain (9)
N501Y in spike: N501 is one of six key contact residues in the spike receptor binding domain; this mutation leads to an increased binding affinity to human and murine ACE2 (1).
A570D in spike: Mutation located at the spike receptor binding domain (10)
P681H in spike: The P681H mutation is located directly next to the furin cleavage site. It is one of the four residues which are insertions when compared to closely related coronaviruses, creating a furin cleavage site in the spike protein between the S1 and S2 domains (1). This S1/S2 furin cleavage site is not found in closely related coronaviruses and has been shown to promote entry into respiratory epithelial cells and transmission in animal models (9).
T716I in spike: Mutation in the S2 domain
S982A in spike: Mutation in the S2 domain (10)
D1118H in spike: Mutation in the S2 domain (8)
Q27 stop in ORF8: The Q27stop mutation truncates the ORF8 protein, which consists of only 121 amino acids, at residue 27. The likely consequence is a loss of function, which allows further downstream mutations to accrue (1). These and the other mutations could be responsible for the increased transmissibility of the B.1.1.7 variant.
R52I in ORF8
Y73C in ORF8
D3L in nucleocapsid
S235F in nucleocapsid
Spike mutation sites. Picture by the COVID-19 Genomics UK Consortium (​9​).

Why were there so many mutations at once?

This could be a result of prolonged or chronic SARS-CoV-2 infections, as studies of such infections reveal unusually large numbers of nucleotide changes and deletion mutations and often high ratios of non-synonymous changes. In addition, convalescent plasma treatment can cause intra-patient virus genetic diversity (11).

What does the new mutation mean in terms of impact and epidemiology?

There was an increase in cases with the new strain, both in total and in proportion to the old (1). What does that mean for us?

This is what the internet says:

The COVID-19 genomics UK consortium (COG) reports about a “priority set of SARS-CoV-2 Spike mutations that are of particular interest based on potential epidemiological significance in the UK and/or biological evidence based on the literature or unpublished work.” (​9​)

The New and Emerging Respiratory Virus Threats Advisory Group of the British government (NERVTAG) discussed the new variant on Friday and concluded that its growth rate is higher by 67-75% and that this is likely due to a selective advantage. “In summary, NERVTAG has moderate confidence that VUI-202012/01 demonstrates a substantial increase in transmissibility compared to other variants.” (12) This is very likely the source of Boris Johnson’s claim that this strain is “70% more infectious”.

The UK government writes that PHE (Public Health England) “is working with partners to investigate and plans to share its findings over the next 2 weeks. There is currently no evidence to suggest that the variant has any impact on disease severity, antibody response or vaccine efficacy. High numbers of cases of the variant virus have been observed in some areas where there is also a high incidence of COVID-19. It is not yet known whether the variant is responsible for these increased numbers of cases.” (13)


From this, we conclude that the British government, and we, do not know yet. It has not been conclusively shown whether the new variant is more infectious (likely), has an easier time evading the host immune system, or whether the vaccine will be less effective against it (very unlikely). The epidemiological model which predicts a higher transmissibility has yet to be published; the science is still in the making. Tests of vaccines against the new variant are ongoing and will take a few weeks. There is as yet little evidence that this new variant poses a significantly bigger threat than others, or that it does not.


While I am listed as the author of this article, it could not have been written without the help and research of Pairoh Seeliger, Lea von Soosten, Luise Kandler, Erik Nebelung and Oliver Kippes.
I would also like to thank Nicolai Wilk from Thermo Fisher Scientific, who quickly responded to my questions about their test.

The title picture shows mutation cards from the game Pandemic Expansion: On the Brink by Z-Man Games.

  1. ​*​
    The 69-70del mutation is predominantly observed in B.1.1 (including B.1.1.7), B.1.258, and the cluster 5 variant lineages of SARS-CoV-2.


  1. 1.
    A. Rambaut, Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations. (2020), (available at
  2. 2.
    E. Callaway, The coronavirus is mutating — does it matter? Nature, 174–177 (2020).
  3. 3.
    L. Zhang, C. B. Jackson, H. Mou, A. Ojha, H. Peng, B. D. Quinlan, E. S. Rangarajan, A. Pan, A. Vanderheiden, M. S. Suthar, W. Li, T. Izard, C. Rader, M. Farzan, H. Choe, SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity. Nat Commun (2020), doi:10.1038/s41467-020-19808-4.
  4. 4.
    ECDC, Detection of new SARS-CoV-2 variants related to mink. (2020), (available at
  5. 5.
    ONS UK , Percentage of COVID-19 cases that are positive for ORF1ab and N genes. (2020), (available at
  6. 6.
    R. M. Dawood, M. A. El-Meguid, G. M. Salum, K. El-Wakeel, M. Shemis, M. K. El Awady, Bioinformatics prediction of B and T cell epitopes within the spike and nucleocapsid proteins of SARS-CoV2. Journal of Infection and Public Health (2020), doi:10.1016/j.jiph.2020.12.006.
  7. 7.
    S. A. Kemp, D. A. Collier, R. Datir, S. Gayed, A. Jahun, M. Hosmillo, I. A. Ferreira, C. Rees-Spear, P. Mlcochova, I. U. Lumb, D. Roberts, A. Chandra, N. Temperton, K. Sharrocks, E. Blane, J. A. Briggs, K. G. Smith, J. R. Bradley, C. Smith, R. Goldstein, I. G. Goodfellow, A. Smielewska, J. P. Skittrall, T. Gouliouris, E. Gkrania-Klotsas, C. J. Illingworth, L. E. McCoy, R. K. Gupta, Neutralising antibodies drive Spike mediated SARS-CoV-2 evasion (2020), doi:10.1101/2020.12.05.20241927.
  8. 8.
    K. Kupferschmidt, Mutant coronavirus in the United Kingdom sets off alarms, but its importance remains unclear. Science (2020), doi:10.1126/science.abg2626.
  9. 9.
    COG, COG-UK update on SARS-CoV-2 Spike mutations of special interest Report 1. (2020), (available at
  10. 10.
    S. Kemp, W. Harvey, R. Datir, D. Collier, I. Ferreira, A. Carabelli, D. L. Robertson, R. K. Gupta, Recurrent emergence and transmission of a SARS-CoV-2 Spike deletion ΔH69/V70 (2020), doi:10.1101/2020.12.14.422555.
  11. 11.
    ECDC, Threat Assessment Brief: Rapid increase of a SARS-CoV-2 variant with multiple spike protein mutations observed in the United Kingdom. (2020), (available at
  12. 12.
    NERVTAG, NERVTAG meeting on SARS-CoV-2 variant under investigation VUI-202012/01. (2020), (available at
  13. 13.
    PHE, PHE investigating a novel variant of COVID-19 . (2020), (available at

On November 9th, 2020, Pfizer issued a press release stating their conclusion that the COVID-19 vaccine they developed with BioNTech appeared to be 90% effective. While their trial enrolled over 43,000 volunteers, only 94 cases of COVID-19 had been detected. How confident can you be with only 94 cases? I decided to explore this matter for myself.

I am but a lowly crystallographer, and I’m sure a proper mathematician could do a more rigorous job, but I’ll do the best I can.

The Experimental Design

I was not familiar with the design of this clinical trial but it seems rather straightforward. You take a whole lot of people and split them into two groups, keeping their group assignment secret from everyone who will be involved in their handling until the end of the trial. The members of one group are given the treatment which we hope is a vaccine while the others are given a sham treatment which is indistinguishable from the “vaccine” by both the participants and their doctors. You then wait to see if anybody comes down with COVID-19.

How long do you wait? You want to wait until you have enough cases to reliably answer the question you hope the study will answer but, to avoid bias, the end point has to be set before the start. If you constantly watch the results and decide to stop when the numbers look good, you could claim success when there is none. After all, life is filled with statistical fluctuations and the results might get worse with time.

The press release says that the trial was designed to end after 164 cases of COVID-19 had arisen among the volunteers, but that they would peek at the results after 32, 62, and 94 cases. For unspecified reasons they skipped the peek at 32 and, it appears, the case count shot up to the 94-case trigger while they were discussing the merit of the 62 threshold. I guess this is the only benefit to the world of the huge surge in COVID-19 cases this fall.

It was the 94 case checkpoint that led them to conclude that it was likely that their vaccine candidate was 90% effective at preventing the disease.

But how likely?

To judge the reliability of the 90% number I’ll need to do some statistics. That “proper mathematician” I mentioned earlier would be able to pull out the expected distributions for the experimental results and precisely calculate probabilities and likelihoods. That knowledge is not in my skill set so I’m left with running simulations.

I wrote a program in Mathematica Script to generate many simulations of vaccine trials and then examined their variability. This program has a loop that produces a person with a 100% chance of developing COVID-19 without intervention. That person is assigned either to the Placebo or the Vaccinated group. Those poor souls put in the Placebo group are counted as COVID-19 cases. Those in the Vaccinated group are only sickened if they lose a roll of the dice. For each series of simulations I assume a level of effectiveness for the vaccine. If the run is for a vaccine with 30% effectiveness, for example, the volunteer only gets sick if they roll over 30 (okay, I’m using percentile dice). Those folk protected by the vaccine are let go and the sick are counted as vaccine failures. When the total number of sick reaches the target, that trial is complete, the number of sick in each group is recorded, and the next trial is started. To ensure that I have a good sample of all possible clinical trials, I simulated a hundred thousand trials for each assumed efficacy.

To keep the numbers simple I ran 100 case trials instead of 94.
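
For readers who want to try this at home, here is a minimal Python sketch of the simulation described above. It is not the original Mathematica program: the function names, the fixed seed and the reduced number of trials in the usage lines are assumptions made for illustration, and the 50/50 group assignment is my reading of the description.

```python
import random
from collections import Counter


def simulate_trial(efficacy, total_cases=100, rng=random):
    """Simulate one trial and return (placebo_cases, vaccinated_cases).

    Every simulated person would catch COVID-19 without intervention.
    Assumption: each person has an equal chance of landing in either group.
    """
    placebo = 0
    vaccinated = 0
    while placebo + vaccinated < total_cases:
        if rng.random() < 0.5:
            # Placebo group: always counted as a case.
            placebo += 1
        elif rng.random() >= efficacy:
            # Vaccinated group: sick only if the "dice roll" beats the vaccine.
            vaccinated += 1
    return placebo, vaccinated


def difference_histogram(efficacy, n_trials=10_000, total_cases=100, seed=0):
    """Count how often each (placebo - vaccinated) difference occurs."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_trials):
        placebo, vaccinated = simulate_trial(efficacy, total_cases, rng)
        counts[placebo - vaccinated] += 1
    return counts


# Small demonstration run; the post used 100,000 trials per efficacy.
for efficacy in (0.0, 0.5, 0.75, 0.9):
    histogram = difference_histogram(efficacy, n_trials=2_000)
    most_common_difference = histogram.most_common(1)[0][0]
    print(f"efficacy {efficacy:.0%}: most common difference {most_common_difference}")
```

Because the total number of cases is fixed, the difference always has the same parity as that total, so with 100 cases only even differences can occur.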

When the vaccine is ineffective (can we still call such a thing a vaccine?) there will be an equal number of COVID-19 cases in the Placebo and Vaccinated groups, and this number will be around 50 but there will be variation. If the vaccine is 100% effective the Vaccinated group will be completely protected and all 100 cases will be in the Placebo group. The key result of a vaccine trial is the difference in the number of cases. This difference can never be greater than 100 because there aren’t enough cases to result in a bigger number. The difference can, however, be negative since it is possible to have more cases in the Vaccinated group. The most likely explanation for such a result is that the vaccine is very ineffective and the randomness of infections happens to result in this odd distribution.

Here are my results for a series of hypothetical vaccines with varying efficacy.

Plot showing four overlapping histograms, one each for 0%, 50%, 75% and 90% effective vaccines centered on differences of 0, 34, 60 and 82 cases. Each overlaps half of its neighbors’ width. Below the plot are four horizontal lines, each matching one histogram. Where the histogram is taller the color of the line is darker.

Histograms of the probability of a clinical trial of a vaccine with an assumed efficacy resulting in a particular difference in COVID-19 case numbers between the placebo and vaccinated groups. CC-BY-NC Dale E. Tronrud / Coronavirus Structural Task Force

There are a whole lot of interesting things in this graph. When the vaccine is completely ineffective the most common result of a trial is a difference of zero between the Placebo and Vaccinated groups. There is a fairly wide distribution of results that occur, however. This is the result of statistical fluctuations due to the small number of cases of COVID-19 in the sample (here 100).  The distributions for all the simulated efficacies have about the same width, with the exception of those near 100%. Since the difference can never be larger than 100 those distributions get sharper and develop a tail on the lower side.

Let’s look at some scenarios. The graph shows that the most common result of a trial of a vaccine with an efficacy of 50% is a difference of about 34 cases between the Placebo and Vaccinated groups, but sometimes the difference is larger and sometimes smaller. If the vaccine were worthless the most common trial result would be zero, but there is also variability. The two histograms overlap considerably, which indicates that one cannot distinguish between an efficacy of zero and one of 50% if the difference in the number of cases in your trial is in the range of zero to about 36. If the difference is greater than this you could conclude that the vaccine is more likely to be 50% effective than zero percent, and zero percent is the more likely of the two if the difference is negative. Still, there is a wide range of possible trial outcomes that have an ambiguous interpretation.

On the other hand, what if we have a difference of 80 (90 cases in the Placebo group and 10 in the Vaccinated group)? There isn’t any significant overlap between zero percent efficacy and 90% at the point where the difference is 80. It is much more likely that the vaccine is 90% effective than zero. There is overlap with the 75% effectiveness histogram and we have to admit that it is possible that the vaccine is only 75% effective, but 90% is more likely.

This leads us to realize that the result of a vaccine trial has to result in a range of possible efficacies, with a varying probability of each. My little plot doesn’t make such an assessment very easy. In fact, the plot is starting to show some problems. What it shows is the probability of a trial having a particular result given the effectiveness of the vaccine. What we really want is the probability of each possible efficacy given the result of the clinical trial.

We have to transform our probabilities!

Turning everything on its head

While the calculations I just discussed were easy to set up and understand, they do not really reflect the experiment being done in a vaccine trial. I was assuming an effectiveness of the vaccine and running many, many trials. In reality the effectiveness is unknown and only one trial is run. Where I calculated the probability of a particular difference in COVID-19 cases given the effectiveness of the vaccine, what I really want is the probability of the effectiveness of the vaccine given the results of a single clinical trial. It is often difficult to devise such a calculation from scratch but it is pretty straightforward to calculate it from the results I already have.

The first step toward the proper calculation is to expand the current plot. My first figure included simulations of just four possible efficacies. To display more possibilities, I need to abandon histograms. At the bottom of the plot I show four color-shaded bars. In these bars the color is darker when the corresponding histogram is taller. While not as visually clear these bars have the advantage that they can be stacked, and many more plotted in a single figure.

With this new tool I can calculate and display the simulated distribution of clinical trials for every vaccine efficacy from 0% to 100% in 1% steps. The new chart is displayed here.

Plot showing the difference between the number of cases in the Placebo group and the number in the Vaccinated group on the horizontal axis and the vaccine effectiveness on the vertical. There is a band of color, darkest in the center, which stands vertically in the plot leaning to the right and touching the upper right corner. Its bottom, at zero efficacy, has its darkest region right above a difference of zero. There is a line at 50% efficacy and the width of the band goes from about 5 to 70 centered on about 40. Another line is drawn at 75%. The band here goes from the upper 30’s to about 90 with a most probable value of around 65
Distribution of possible clinical trial outcomes for a given vaccine efficacy.
CC-BY-NC Dale E. Tronrud / Coronavirus Structural Task Force

This chart is read by locating the efficacy of your vaccine on the vertical axis and drawing a horizontal line there. The pattern of colors along that line represents the probability of each difference in cases between the Placebo and Vaccinated groups in a clinical trial. I have drawn two such lines, one for a vaccine with 75% effectiveness and another for one with 50%. You can see that the most likely result for the 75% one is about 60 cases (20 in Vaccinated and 80 in Placebo) and the other at about 34 cases. (You figure it out.) In this plot you can see the continuous change as the efficacy of the vaccine is changed. The key point is that there is a spread of results, but I described that before.
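
For anyone who would rather check the arithmetic than figure it out by hand, the following short Python sketch computes the expected split of cases. It assumes, as in the simulation described earlier, that both groups are the same size and equally exposed, so that a case falls in the Vaccinated group with probability (1 - efficacy) / (2 - efficacy). The function name is hypothetical.

```python
def expected_case_split(efficacy, total_cases=100):
    """Return the expected (placebo, vaccinated, difference) case counts.

    Assumption: equal group sizes and equal exposure, so for every placebo
    case we expect (1 - efficacy) vaccinated cases.
    """
    vaccinated_share = (1.0 - efficacy) / (2.0 - efficacy)
    vaccinated = total_cases * vaccinated_share
    placebo = total_cases - vaccinated
    return placebo, vaccinated, placebo - vaccinated


for efficacy in (0.5, 0.75):
    placebo, vaccinated, difference = expected_case_split(efficacy)
    print(f"{efficacy:.0%}: placebo {placebo:.1f}, vaccinated {vaccinated:.1f}, "
          f"difference {difference:.1f}")
```

These are expected values; in a real trial of 100 cases the counts are whole numbers, which is why the most common simulated difference lands on the nearest even number.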

With any set of probabilities the full set has to always add up to one. For each horizontal line of colored boxes in this plot the sum of their probabilities is one. A set of numbers with this property is said to be “normalized”.

The vertical lines are not normalized in this plot, as you can see by looking at its left side. There are just a few, very lightly colored or low probability boxes and above them is simply white, which represents zero probability (or at least very, very, very small). This side of the plot has a difference in COVID-19 cases of -24, or in other words the Vaccinated group had 24 more cases of disease than the Placebo. Such an outcome for a clinical trial is very unlikely for any vaccine that has even a tiny amount of success (and is pretty unlikely for one that is merely useless).

Since the probabilities along vertical lines are not normalized they cannot be used as a histogram. Conveniently for us, this can be corrected simply by normalizing them.  This is done by summing all the probabilities along each vertical line in this plot and dividing the probabilities in the line by that sum.  This gives us a new set of probabilities and a new plot.

How does this magic work? The procedure is justified by a hundreds-of-years-old mathematical theorem called Bayes’s Law. This blog post is already getting long and I leave the application of your favorite search engine to you.
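
The column normalization can be sketched in a few lines of Python. Note that this sketch swaps the simulated histograms for an exact binomial formula, which is my substitution and not what the post's program did: it assumes equal group sizes, so that each case falls in the Vaccinated group with probability (1 - efficacy) / (2 - efficacy). The function names and the 1% efficacy grid with equal prior weight on every grid point are likewise assumptions for illustration.

```python
from math import comb


def likelihood(vaccinated_cases, efficacy, total_cases=100):
    """Probability of seeing this many vaccinated cases for a given efficacy."""
    q = (1.0 - efficacy) / (2.0 - efficacy)  # chance a case is in the Vaccinated group
    return (comb(total_cases, vaccinated_cases)
            * q ** vaccinated_cases
            * (1.0 - q) ** (total_cases - vaccinated_cases))


def posterior_over_efficacy(difference, total_cases=100):
    """Normalize one vertical line of the plot (one observed difference).

    Returns the efficacy grid (0% to 100% in 1% steps) and the normalized
    probability of each efficacy, treating all grid points as equally
    plausible beforehand.
    """
    if (total_cases - difference) % 2 != 0:
        raise ValueError("difference must have the same parity as total_cases")
    vaccinated_cases = (total_cases - difference) // 2
    grid = [i / 100 for i in range(101)]
    column = [likelihood(vaccinated_cases, e, total_cases) for e in grid]
    column_sum = sum(column)
    return grid, [p / column_sum for p in column]


grid, posterior = posterior_over_efficacy(82)
best = max(range(len(grid)), key=posterior.__getitem__)
print(f"most probable efficacy for a difference of 82: {grid[best]:.0%}")
```

Dividing each column by its sum is Bayes's Law with a flat prior over the efficacy grid; a different prior would weight the column before normalizing.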

This plot is very similar to the last, with the largest difference in the lower left corner. Here it indicates that the most likely effectiveness of a vaccine with more COVID-19 cases in the Vaccinated group than in Placebo is near zero.

Probability of vaccine effectiveness as a function of clinical trial outcome.
CC-BY-NC Dale E. Tronrud / Coronavirus Structural Task Force

The first thing to note is that the new plot isn’t much different than the original. While the lower-left side has clearly changed, that area is not very interesting. On the right, where the action is, it looks the same. For this reason, many fields of science simply use the unnormalized plot.

The new plot allows us to draw vertical lines (but forbids horizontal lines!). I have drawn example lines at differences of 36 cases and 82 cases. If our clinical trial results in a difference in cases of 82 (91 in Placebo and 9 in Vaccinated) we can see from the line on the plot that the most probable effectiveness of that vaccine is about 90%! This is very close to the happy number reported in Pfizer’s press release. The plot also shows us that there is uncertainty in this number. The vaccine’s effectiveness could be in the mid 70’s or in the upper 90’s.

This is the nature of all experimental work. All results have uncertainties and it is as important to know the amount of uncertainty as it is to know the direct result. You can see how important this is by looking at the 36 case difference line. The darkest blocks along this line, and therefore the most probable efficacy, are near 50%. This would also indicate a useful vaccine, but look at the spread! The width of the uncertainty goes all the way to zero: this vaccine could be worthless. A clinical trial that waited until only 100 cases occurred cannot distinguish between a vaccine with 50% efficacy and a worthless one.

If you wait for more cases to develop the width of the stripe in the plot becomes narrower and the uncertainty drops. The goal of Pfizer was to develop a vaccine of at least 50% efficacy so their design was to wait for 164 cases to give them a narrow enough band to clearly distinguish 50% from zero percent. Just in case the vaccine was better than 50% they built into the design of their trial several points where they could peek and see what was going on. They, and we, lucked out!
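
The narrowing of the stripe can be checked numerically. The sketch below is an assumption-laden illustration rather than Pfizer's actual analysis: it uses equal group sizes, a flat prior over efficacies from 0% to 99.9%, and made-up case splits (33 of 100 and 55 of 164 cases in the Vaccinated group) chosen to sit near 50% efficacy. `efficacy_posterior` and `central_interval` are hypothetical helper names.

```python
from math import comb


def efficacy_posterior(vaccinated_cases, total_cases, steps=1000):
    """Normalized probabilities over an efficacy grid (flat prior assumed)."""
    efficacies = [i / steps for i in range(steps)]
    weights = []
    for e in efficacies:
        p = (1.0 - e) / (2.0 - e)  # chance that a case is in the Vaccinated group
        weights.append(comb(total_cases, vaccinated_cases)
                       * p**vaccinated_cases
                       * (1.0 - p) ** (total_cases - vaccinated_cases))
    total = sum(weights)
    return efficacies, [w / total for w in weights]


def central_interval(efficacies, posterior, mass=0.95):
    """Efficacy range that leaves (1 - mass) / 2 of the probability in each tail."""
    tail = (1.0 - mass) / 2.0
    cumulative = 0.0
    low = high = None
    for e, w in zip(efficacies, posterior):
        cumulative += w
        if low is None and cumulative >= tail:
            low = e
        if high is None and cumulative >= 1.0 - tail:
            high = e
    return low, high


interval_100 = central_interval(*efficacy_posterior(33, 100))
interval_164 = central_interval(*efficacy_posterior(55, 164))
print("100 cases:", interval_100)
print("164 cases:", interval_164)
```

Under these assumptions the interval for 164 cases is narrower than the one for 100 cases, and its lower end sits further from zero.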

What are all those other people for?

The surprising thing about this analysis is that the total number of people in the trial is unimportant when calculating the uncertainty of the result. That answer is the same for a trial with 500 volunteers and a trial with 50,000 volunteers. The only thing that is important is the number of cases of COVID-19.
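
A little arithmetic shows why the group size drops out. The sketch below assumes equally sized groups and a disease rare enough that the expected number of cases scales with the group size. The 1% attack rate is a made-up number and `vaccinated_share` is a hypothetical helper.

```python
def vaccinated_share(n_per_group, placebo_attack_rate, efficacy):
    """Expected fraction of all cases that fall in the Vaccinated group."""
    placebo_cases = n_per_group * placebo_attack_rate
    vaccinated_cases = n_per_group * placebo_attack_rate * (1.0 - efficacy)
    return vaccinated_cases / (placebo_cases + vaccinated_cases)


# 500 volunteers versus 50,000 volunteers, split evenly between the two groups
small_trial = vaccinated_share(250, 0.01, 0.75)
large_trial = vaccinated_share(25_000, 0.01, 0.75)
print(small_trial, large_trial)
```

Both the group size and the attack rate cancel in the ratio, leaving (1 - efficacy) / (2 - efficacy). Only the split of the observed cases carries information about the efficacy.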

All those tens of thousands of people are important for other reasons. Very relevant to the current pandemic is that a larger number of volunteers will accumulate the target number of cases sooner: if you double the number of volunteers you will reach the target in half the time. We all want to know as quickly as possible if these vaccines are effective so we want the trials to consist of as many people as the companies can manage.

The other use for large numbers is the search for grim and hopefully rare side effects. These are likely to arise at much lower rates than viral infections and much larger numbers of volunteers are required to achieve statistical significance. A side effect that only occurs in 1 of 5,000 people will require a very large number of participants to be detected. A compounding factor is that a search for an unknown result requires many more data points than the search for a specific outcome, such as COVID-19 infection. (The checks for side effects are only now being described to the public, and I’ll not go into them here.) While the doctors keep an eye out for life threatening problems during the trial, the secret books identifying which volunteers are in which group are kept closed until the end.
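
To get a feel for the numbers, here is a rough sketch of how many vaccinated volunteers are needed just to see a 1-in-5,000 side effect at least once. It assumes every volunteer is affected independently with the same probability, and the 95% target is an arbitrary choice for illustration. Seeing one event is much less than proving it is caused by the vaccine, so real requirements are higher.

```python
from math import ceil, log


def prob_at_least_one(n_people, rate):
    """Chance that at least one of n_people shows the side effect."""
    return 1.0 - (1.0 - rate) ** n_people


def people_needed(rate, confidence=0.95):
    """Smallest group size that gives the requested chance of at least one event."""
    return ceil(log(1.0 - confidence) / log(1.0 - rate))


rate = 1 / 5000
needed = people_needed(rate)
print("volunteers needed:", needed)
print("chance with 500 volunteers:", prob_at_least_one(500, rate))
```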

End game

After all this interesting math I find that, yes, 94 diseased people are quite enough to conclude that a vaccine is effective at around a 90% level, or at least 70%.


During the writing of this post Moderna issued a press release about their vaccine candidate. The analysis presented here applies equally to their trial since I made no assumptions at all about the nature of the vaccine being tested. The only complaint I have with their press release is that they quoted the effectiveness of their vaccine as 94.5%. As you now know, this level of precision is ridiculous.  It would be better just to say it is “somewhere around 95% effective”.

This article has been written by Cameron Fyfe and Lea von Soosten.

In the previous two articles we spoke of proteins involved in RNA synthesis and proteins involved in removing errors during that process. There are also proteins produced by SARS-CoV-2 that can mimic functions of the host cell to avoid its defense mechanisms.

VIP treatment: Very Important Proteins 12
Figure 1. mRNA end caps with methylation VIP tag. Nsp14 is responsible for adding a methylation to produce the Cap 0 structure and Nsp16 methylates the Cap 0 structure to produce Cap 1. Figure modified from Ramanathan et al. 2016 [1].

Eukaryotic cells have evolved various immune responses to fight infection or invasion by pathogens. One of these is to recognize and chop up any RNA that comes from other organisms, using enzymes called exoribonucleases. To differentiate "friendly" RNA from "foe" RNA, the cell gives its own RNA a VIP badge so that only unfriendly RNA will be shredded. These "VIP badges" are made of a 5’ to 5’ triphosphate linkage with two methylation modifications (see Fig. 1). In order to evade exoribonucleases, SARS-CoV-2 has a way of 5’ to 5’ capping as well as adding its own methyl group VIP badges to protect its RNA from the defense mechanisms of invaded cells. Two Very Important Proteins, nsp14 and nsp16, have this methyltransferase activity, using S-Adenosyl methionine (SAM) as a cofactor.

What are SAM methyltransferases?

Figure 2. A methyl group is transferred from the positively charged sulfur of S-Adenosyl methionine to a substrate resulting in a methylated product and S-Adenosyl homocysteine.

Methyltransferase enzymes are a large superfamily of proteins that perform the chemical addition of a methyl group (a carbon with three hydrogens) to a variety of substrates. These substrates include small molecules, other proteins, DNA, and RNA [2,3]. This superfamily of proteins often uses a small molecule, S-Adenosyl methionine (SAM), to transfer a methyl group to its target substrate (Figure 2). During this process, the methyl group bound to the positively charged sulfur is brought into proximity with the target atom of the substrate and transferred to it, resulting in the methylated product and the byproduct S-Adenosyl homocysteine (SAH).

Methyltransferases of SARS-CoV-2

Figure 3. The mRNA cap synthesis process in SARS-CoV-2. The process is performed by the sequential action of four enzymes: Nsp13 (red), a still unknown GTase, Nsp14 (green/orange) and Nsp16 (pink). The presence of the co-factor Nsp10 (blue) is fundamental for the activity of the last two enzymes. Figure modified from Romano, M. et al 2020.

In a previous article we spoke of the exoribonuclease (ExoN) proofreading activity of Nsp14 (not to be confused with the host cell's own exoribonucleases that are part of the immune system, see above). After the 5’ to 5’ guanine triphosphate addition has been performed on the mRNA, the guanine-N7-methyltransferase activity of Nsp14 comes into play, producing the first Cap0 structure with a VIP tag (Figure 1, 3). Only after this methylation has been performed can Nsp16 act and perform the second methylation, the 2’O-methylation, to produce the Cap1 structure (Figure 1, 3).

Not only do both of these proteins perform VIP methylations of mRNA, but they also both bind another non-structural protein, Nsp10. The binding of Nsp10 has been shown to increase both the ExoN activity of Nsp14 and the methyltransferase activity of Nsp16 [4]. Independently, Nsp10 has also been shown to have the ability to bind both single and double stranded DNA and RNA [5].

Structures of nsp14 and nsp16

Figure 4. Electrostatic surface of the methyltransferase domains of Nsp14 and Nsp16. A. Active site of the methyltransferase domain of Nsp14 (PDB: 5c8s) with bound Guanosine-P3-adenosine-5',5'-triphosphate (GpppA) and S-Adenosyl homocysteine (green). The hinge region, connecting ExoN to the methyltransferase domain, that covers the methyltransferase site is not present. B. Methyltransferase active site of Nsp16 (PDB: 6wks) with bound P1-7-methylguanosine-P3-adenosine-5',5'-triphosphate (m7GpppA) (teal) and S-Adenosyl methionine (green).

Nsp14 consists of two domains, each carrying out one specific task: the first is responsible for the ExoN activity, whilst the second executes the first methylation, at the Guanosine-N7 of the RNA end cap. The two domains are connected by a flexible region that acts like a hinge, allowing movement between the domains. The second domain has an unusual and unique structure which does not follow the typical Rossmann fold seen in other SAM methyltransferases. The methyltransferase active site has a negatively charged binding pocket that holds SAM (SAH in Figure 4A) in close proximity to the Guanosine-P3-adenosine-5',5'-triphosphate (GpppA) substrate (Figure 4A). The binding pocket holding the GpppA has a positive charge and the surface of the region below is also positively charged (Figure 4A). The distance between the N7 of the 5’ Guanosine and the sulfur that transfers the methyl group is 4.4 Å [5,6]. This close proximity of cofactor and substrate facilitates the methylation.

Similar to Nsp14, Nsp16 has a negatively charged binding pocket to position SAM in close proximity to the m7GpppA substrate (Figure 4B). The m7GpppA binding site has a positive charge. The space near the 3’ end of the m7GpppA also has an overall positive charge and would be expected to bind the extension of the full length RNA (Figure 4B) [4]. The distances from the methyl group and from the sulfur of SAM to the 2’O of the m7GpppA substrate are 3.1 Å and 4.9 Å, respectively.

Structure of nsp10 and its function

Figure 5. Allosteric activator Nsp10 (Blue) in complex with Nsp14 (A, PDB: 5c8s, Orange) and Nsp16 (B, PDB: 6w4h, Pink). Models aligned using Nsp10.

In a previous article, where we spoke about the exoribonuclease (ExoN) activity of the first domain of nsp14, we highlighted the interaction between nsp14 and nsp10 (Figure 5A). This is quite significant, as the activity of ExoN increases 30-fold when nsp10 and nsp14 are bound. Nsp10 also functions as a co-factor for nsp16, stabilizing the SAM-binding pocket [7] and significantly enhancing its methyltransferase activity [4] (Figure 5B). For SARS-CoV, and similarly for MERS-CoV, the affinity of nsp16 for m7GpppA-RNA and the m7GpppA cap analogue was found to be low until it bound nsp10, which enhanced the affinity for RNA [8,9]. Since the activity of Nsp16 is reduced in the absence of Nsp10 and the activity of the exonuclease domain of Nsp14 drops sharply, interfering with these interactions could result in decreased viability of SARS-CoV-2.

Methyltransferases Nsp14 and Nsp16 as drug targets

As both Nsp14 and Nsp16 use the cofactor SAM and have affinity for the end cap of RNA, these two binding sites could be worthwhile targets for drug development in the fight against SARS-CoV-2. Without the VIP status provided by the methylation of RNA, the host immune system could defend against the viral RNA. It might be possible to block these binding pockets by letting the protein bind something that is similar to SAM but cannot function as a methyl donor. An additional challenge is that the inhibitor has to be very specific to Nsp14 or Nsp16, so as not to affect similar human proteins in a negative way.

Sinefungin is a 5’-aminoalkyl analog of SAH and SAM, which can do exactly that: it has the ability to inhibit all SAM methyltransferases (Figure 6). Sinefungin was first discovered in 1973 from Streptomyces griseolus and was described as having antifungal antibiotic properties [10].

Figure 6. Sinefungin's similarity to SAM and SAH, and its recognition by nsp16 in the SAM methyltransferase active site. A. Chemical structure comparison of SAM, SAH, and sinefungin. B. Detailed view of sinefungin recognition. Important amino acid residues are shown in stick representation, waters as red spheres, and hydrogen bonds as dashed lines. Figure modified from Krafcikova et al. 2020 [4].

A major issue with targeting the SAM binding site of Nsps with compounds such as sinefungin (Figure 6) is that there are many proteins within humans that use SAM as a cofactor for normal function. This results in sinefungin and other similar compounds having toxic effects on human cells. Synthetic chemists have already been able to synthesize analogs of sinefungin with improved affinities to specific SAM methyltransferases. Recently, specific inhibitors have been developed to target a nicotinamide SAM methyltransferase [11]. These inhibitors were developed to have affinity to both the cofactor binding site and the substrate binding site by combining the nicotinamide substrate with the SAM cofactor. Recent work has looked at how sinefungin binds to the active site of Nsp16 in order to gain a detailed understanding of its interaction and to design more specific inhibitors that can target methyltransferases from SARS-CoV-2 [4]. Similar to the development of the nicotinamide SAM methyltransferase inhibitor, developing an inhibitor which binds to the substrate binding site as well as to the cofactor binding site could be effective. As Nsp14 and Nsp16 target different substrates, any inhibitors designed in this way would likely have specificity to only one of the two methyltransferases from SARS-CoV-2. Of the two, Nsp14 might be easier to target as it has a unique structure not similar to human SAM methyltransferases.

As both Nsp14 and Nsp16 interact with Nsp10 for normal function, interfering with this interaction could reduce the activity of these enzymes. Further still, as the interfaces of Nsp10 with Nsp14 and with Nsp16 overlap, the target for blocking the binding of these proteins is smaller.

One way to look for possible drugs is to repurpose those which are already approved for other diseases. An initial screen can be done in silico, by simulating the interaction between the protein and the already existing and approved drug. However, such studies are highly dependent on the protein structures employed being correct, which is why we are evaluating all structures that are published for SARS-CoV and SARS-CoV-2.

Available structures

If you would like to look at the currently available structures for Nsp10, Nsp14, and Nsp16, they are available from our database; we provide information on the quality of measurement data and models as well as improved structures.

All structures available for Nsp14 are bound to Nsp10 and are only available from SARS-CoV. The highest resolution structure of Nsp14 is PDB entry 5c8t at 3.2 Å. It has a bound S-Adenosyl methionine ligand as well as zinc ions present. Alongside this, another structure of Nsp14 bound to S-Adenosyl homocysteine and a guanosine-triphosphate-adenosine ligand as well as zinc has been published at 3.33 Å resolution (PDB: 5c8s). Additionally, two structures with zinc atoms but no ligands are available (PDB 5c8u at 3.4 Å and 5nfy at 3.34 Å). Both PDB entries 5c8t and 5nfy have been improved by our group.

Similar to Nsp14, all structures of Nsp16 are bound to Nsp10. There are currently 18 structures for Nsp16 bound to Nsp10 from SARS-CoV-2. The highest resolution structure is at 1.8 Å and has SAM, Guanosine triphosphate and Adenosine bound as well as zinc atoms. PDB entry 6wkq has Nsp16 bound to the methyltransferase inhibitor Sinefungin at 1.98 Å resolution. Two further structures of note are 7jhe and 7jib, which have various functional ligands. A further four structures are available from SARS-CoV.

Nsp10 alone: Currently there are two structures of Nsp10 from SARS-CoV-2, PDB 6zpe and 6zct, with the former having the highest resolution of 1.58 Å and bound zinc. There are also three structures of Nsp10 from SARS-CoV available, PDB 2fyg, 2g9t, and 2ga6.

  1. Ramanathan A, Robb GB, Chan S-H. mRNA capping: biological functions and applications. Nucleic Acids Res. Published online June 17, 2016:7511-7526. doi:10.1093/nar/gkw551
  2. Boriack-Sjodin PA, Swinger KK. Protein Methyltransferases: A Distinct, Diverse, and Dynamic Family of Enzymes. Biochemistry. Published online December 22, 2015:1557-1569. doi:10.1021/acs.biochem.5b01129
  3. Lyko F. The DNA methyltransferase family: a versatile toolkit for epigenetic regulation. Nat Rev Genet. Published online October 16, 2017:81-92. doi:10.1038/nrg.2017.80
  4. Krafcikova P, Silhan J, Nencka R, Boura E. Structural analysis of the SARS-CoV-2 methyltransferase complex involved in RNA cap creation bound to sinefungin. Nat Commun. Published online July 24, 2020. doi:10.1038/s41467-020-17495-9
  5. Ferron F, Subissi L, Silveira De Morais AT, et al. Structural and molecular basis of mismatch correction and ribavirin excision from coronavirus RNA. Proc Natl Acad Sci USA. Published online December 26, 2017:E162-E171. doi:10.1073/pnas.1718806115
  6. Ma Y, Wu L, Shaw N, et al. Structural basis and functional analysis of the SARS coronavirus nsp14–nsp10 complex. Proc Natl Acad Sci USA. Published online July 9, 2015:9436-9441. doi:10.1073/pnas.1508686112
  7. Rosas-Lemus M, Minasov G, Shuvalova L, et al. The crystal structure of nsp10-nsp16 heterodimer from SARS-CoV-2 in complex with S-adenosylmethionine. Published online April 20, 2020. doi:10.1101/2020.04.17.047498
  8. Romano M, Ruggiero A, Squeglia F, Maga G, Berisio R. A Structural View of SARS-CoV-2 RNA Replication Machinery: RNA Synthesis, Proofreading and Final Capping. Cells. Published online May 20, 2020:1267. doi:10.3390/cells9051267
  9. Chen Y, Su C, Ke M, et al. Biochemical and Structural Insights into the Mechanisms of SARS Coronavirus RNA Ribose 2′-O-Methylation by nsp16/nsp10 Protein Complex. Kuhn RJ, ed. PLoS Pathog. Published online October 13, 2011:e1002294. doi:10.1371/journal.ppat.1002294
  10. Hamill RL, Hoehn MM. A9145, a new adenine-containing antifungal antibiotic. J Antibiot. 1973;26(8):463-465. doi:10.7164/antibiotics.26.463
  11. Policarpo RL, Decultot L, May E, et al. High-Affinity Alkynyl Bisubstrate Inhibitors of Nicotinamide N-Methyltransferase (NNMT). J Med Chem. Published online October 7, 2019:9837-9873. doi:10.1021/acs.jmedchem.9b01238

The genome of the novel SARS-CoV-2 codes for an ORF1a/ORF1ab (open reading frame) polyprotein containing sixteen non-structural proteins (NSPs), as well as for four structural proteins. The genome also has multiple ORFs coding for accessory proteins through a frame shift. These accessory proteins are not necessary for viral replication but might play a key role in the pathogenesis of SARS-CoV-2. One such protein is the accessory protein 7a, which is predicted to contribute to COVID-19 by inducing apoptotic processes in human host cells [1].


SARS-CoV-2 is a very young virus and the structure and function of its accessory protein 7a have not yet been fully characterized. However, 7a of SARS-CoV-2 shows 85% sequence identity and 95.2% sequence similarity with the accessory protein 7a of SARS-CoV [2]. It is therefore conceivable that both accessory proteins have a similar structure and function. The sequence analysis of SARS-CoV predicts that ORF7a codes for a type I transmembrane protein with 122 amino acids, including a signal peptide at the N-terminus and a retrieval signal at the C-terminus [3]. The N-terminal ectodomain of ORF7a consists of seven β-strands compactly arranged in an immunoglobulin-like β-sandwich fold (Fig 1). These seven β-strands are ordered in two β-sheets containing four β-strands (A; G; F; C) in the first sheet and three (B; E; D) in the second one (see Fig 1: left) [4].
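
For readers unfamiliar with the term, percent identity is simply the share of aligned positions that carry the same residue. The sketch below illustrates one common convention, ignoring positions where either sequence has a gap. The two short sequences are made up for the example and are not the real 7a sequences, and `percent_identity` is a hypothetical helper. Sequence similarity additionally needs a substitution matrix to decide which different residues count as similar, so it is not computed here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned positions with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip alignment gaps (one of several possible conventions)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * matches / compared


# Toy, invented sequences that differ at one of ten positions
identity = percent_identity("MKIILFLALI", "MKIILFLTLI")
print(identity)
```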

Accessory Protein 7a: Key Role in Pathogenesis? 18
Fig. 1. Structure of the accessory protein 7a of SARS-CoV-2 (PDB: 6W37). Left: The β-sheets BED and AGFC form the ectodomain of the type I transmembrane protein. Right: Stabilizing disulphide bonds on top and bottom of the β-sheets coloured in cyan. Image by Sabrina Stäb

Both sheets are amphipathic, with the hydrophobic sides facing inwards and closely packed against each other. The top of the ectodomain is defined by the BC, DE and FG loops and the bottom by the AB, CD and EF loops. The β-sandwich structure is stabilized by two disulphide bonds linking the sheets at opposite edges. At the bottom of the structure, a disulphide bridge connects Cys8 on strand A with Cys43 at the end of strand E. At the top, Cys20 of the BC loop is linked to Cys52 at the end of strand F (see Fig 1: right). Additionally, on top of the BED sheet, the DE loop protrudes from the structure and forms a groove together with β-strands C and D. In the centre is Glu18, which contributes to the negatively charged bottom of the mainly hydrophobic groove. This groove may be a potential site for ligand interaction due to its central negative electrostatic potential [4].


In cell culture, the polypeptide 7a of SARS-CoV seems to have diverse biological functions [5]. It is possible that 7a plays a key role in cell cycle control. In HEK 293 cells, overexpression of 7a led to inhibition of cell growth and induction of G0/G1 phase cell cycle arrest. This arrest may favour coronavirus replication and exacerbate virus-induced pathogenicity. 7a is also predicted to induce apoptosis in human kidney epithelial cells by interaction with a protein called B-cell lymphoma-extra large (Bcl-XL). Bcl-XL belongs to a group of pro-survival proteins, the B-cell lymphoma-2 (Bcl-2) family, which prevent apoptosis in epithelial cells. The interaction between 7a and the C-terminal transmembrane domain of Bcl-XL may interfere with this pro-survival function, leading to apoptosis via the caspase-dependent pathway [6,7]. In addition to this, SARS 7a interacts with an Ap4A-hydrolase involved in cell proliferation, DNA replication, apoptosis and RNA processing. This interaction leads to downregulation of its hydrolase activity and an increased production of Ap4A (diadenosine tetraphosphate), which may also induce apoptosis [5]. Such a host cell specific modulation of apoptosis could enable the virus to evade the immune response or to spread to other target organs.

Another predicted function of ORF7a is the inhibition of bone marrow stromal antigen 2 (BST-2), which might restrict virus release by physically tethering the budding enveloped virion to the plasma membrane. ORF7a antagonizes this function by binding to the extracellular domain of BST-2 and preventing its glycosylation. Thus, the ORF7a-BST-2 interaction can be speculated to be a potential drug target for inhibitors that prevent it [8].

Taken together, ORF7a is a virulence factor that contributes in different ways to the pathogenicity of SARS-CoV-2. Therefore, targeted drug development against ORF7a could be a critical factor to reduce viral spread or attenuate severe disease progression.

PDB Structures Available

6W37: X-ray structure of the SARS-CoV-2 ORF7a encoded accessory protein.

1xak: SARS-CoV ORF7a accessory protein, a unique type I transmembrane protein of unknown function. Has a short cytoplasmic tail and a transmembrane domain. Consists of one chain (chain A), that forms a compact seven-stranded beta sandwich.

1y04: SARS Coronavirus ORF 7a coded X4 protein, also known as 7a, U122 or X4. Type-I transmembrane protein with immunoglobulin like beta-sandwich fold. Potential functions of X4 in virus replication and pathogenesis are discussed.


  1. Michel CJ, Mayer C, Poch O, Thompson JD. Characterization of accessory genes in coronavirus genomes. Virol J. Published online August 27, 2020. doi:10.1186/s12985-020-01402-1
  2. Yoshimoto FK. The Proteins of Severe Acute Respiratory Syndrome Coronavirus‑2 (SARS CoV‑2 or n‑COV19), the Cause of COVID‑19. Protein J. 2020;39:198-216. doi:10.1007/s10930-020-09901-4
  3. Fielding BC, Tan Y-J, Shuo S, et al. Characterization of a Unique Group-Specific Protein (U122) of the Severe Acute Respiratory Syndrome Coronavirus. JVI. Published online July 15, 2004:7311-7318. doi:10.1128/jvi.78.14.7311-7318.2004
  4. Hänel K, Stangler T, Stoldt M, Willbold D. Solution structure of the X4 protein coded by the SARS related coronavirus reveals an immunoglobulin like fold and suggests a binding activity to integrin I domains. J Biomed Sci. Published online November 23, 2005:281-293. doi:10.1007/s11373-005-9043-9
  5. Vasilenko N, Moshynskyy I, Zakhartchouk A. SARS coronavirus protein 7a interacts with human Ap4A-hydrolase. Virology Journal. Published online 2010:31. doi:10.1186/1743-422x-7-31
  6. Tan Y-J, Fielding BC, Goh P-Y, et al. Overexpression of 7a, a Protein Specifically Encoded by the Severe Acute Respiratory Syndrome Coronavirus, Induces Apoptosis via a Caspase-Dependent Pathway. JVI. Published online December 15, 2004:14043-14047. doi:10.1128/jvi.78.24.14043-14047.2004
  7. Tan Y-X, Tan THP, Lee MJ-R, et al. Induction of Apoptosis by the Severe Acute Respiratory Syndrome Coronavirus 7a Protein Is Dependent on Its Interaction with the Bcl-XL Protein. JVI. Published online April 11, 2007:6346-6355. doi:10.1128/jvi.00090-07
  8. Taylor JK, Coleman CM, Postel S, et al. Severe Acute Respiratory Syndrome Coronavirus ORF7a Inhibits Bone Marrow Stromal Antigen 2 Virion Tethering through a Novel Mechanism of Glycosylation Interference. García-Sastre A, ed. J Virol. Published online September 16, 2015:11820-11833. doi:10.1128/jvi.02274-15