This article has been written by Cameron Fyfe and Lea von Soosten.
In the previous two articles we spoke of proteins involved in RNA synthesis and proteins involved in removing errors during that process. There are also proteins produced by SARS-CoV-2 that can mimic functions of the host cell to avoid its defense mechanisms.
Eukaryotic cells have evolved various immune responses to fight infection or invasion by pathogens. One of these is to recognize and chop up any RNA that comes from other organisms, using enzymes called exoribonucleases. To differentiate "friendly" RNA from "foe" RNA, the cell gives its own RNA a VIP badge so that only unfriendly RNA is shredded. These "VIP badges" are made of a 5’ to 5’ triphosphate linkage with two methylation modifications (see Fig. 1). To evade exoribonucleases, SARS-CoV-2 has its own way of 5’ to 5’ capping and of adding methyl group VIP badges to protect its RNA from the defense mechanisms of invaded cells. Two Very Important Proteins, nsp14 and nsp16, have this methyltransferase activity, using S-Adenosyl methionine (SAM) as a cofactor.
Methyltransferase enzymes are a large superfamily of proteins that perform the chemical addition of a methyl group (a carbon with three hydrogens) to a variety of substrates. These substrates include small molecules, other proteins, DNA, and RNA 2,3. This superfamily of proteins often uses a small molecule, S-Adenosyl methionine (SAM), to transfer a methyl group to its target substrate (Figure 2). During this process, the methyl group bound to the charged sulfur is brought in proximity to the target atom of the substrate, transferring the methyl group (Figure 2), resulting in the methylated product and the byproduct S-Adenosyl homocysteine (SAH).
In a previous article we spoke of the exoribonuclease (ExoN) proofreading activity of Nsp14 (not to be confused with the host cell's own exoribonucleases that are part of the immune system, see above). After the 5’ to 5’ guanine triphosphate addition has been performed on the mRNA, the guanine-N7-methyltransferase activity of Nsp14 comes into play, producing the Cap0 structure, the first VIP tag (Figure 1, 3). Only after this methylation has been performed can Nsp16 act and perform the second methylation, at the 2’O position, to produce the Cap1 structure (Figure 1, 3).
Not only do both of these proteins perform VIP methylations of mRNA, but they also both bind another non-structural protein, Nsp10. The binding of Nsp10 has been shown to increase both Nsp14 ExoN activity and Nsp16 methyltransferase activity4. Independently, Nsp10 has also been shown to have the ability to bind both single- and double-stranded DNA and RNA5.
Nsp14 consists of two domains, each carrying out one specific task: the first is responsible for the ExoN activity, whilst the second executes the first methylation, at the Guanosine-N7 of the RNA end cap. The two domains are connected by a flexible region that acts like a hinge, allowing movement between the domains. The second domain has an unusual structure which does not follow the typical Rossmann fold seen in other SAM methyltransferases. The methyltransferase active site has a negatively charged binding pocket that holds SAM (SAH in Figure 4A) in close proximity to the Guanosine-P3-adenosine-5',5'-triphosphate (GpppA) substrate (Figure 4A). The binding pocket holding the GpppA has a positive charge and the surface of the region below is also positively charged (Figure 4A). The distance between the N7 of the 5’ Guanosine and the sulfur that transfers the methyl group is 4.4 Å5,6. This close proximity of cofactor and substrate facilitates the methylation.
Similar to Nsp14, Nsp16 has a negatively charged binding pocket to position SAM in close proximity to the m7GpppA substrate (Figure 4B). The m7GpppA binding site has a positive charge. The space near the 3’ end of the m7GpppA also has an overall positive charge and would be expected to bind the extension of the full length RNA (Figure 4B)4. The distances from the 2’O of the m7GpppA substrate to the methyl group and to the sulfur of SAM are 3.1 Å and 4.9 Å, respectively.
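Distances such as those quoted above for Nsp14 and Nsp16 are measured between atom positions in a structural model. As a minimal sketch of that measurement, the Python snippet below computes the straight-line distance between two atoms. The coordinates are hypothetical values chosen for illustration and are not taken from any PDB entry.

```python
import math

def atom_distance(a, b):
    """Euclidean distance between two 3D points, in the units of the input (here Å)."""
    return math.dist(a, b)

# Hypothetical coordinates in Å (illustration only, not from a real structure).
sam_sulfur = (0.0, 0.0, 0.0)
substrate_atom = (3.0, 4.0, 0.0)

print(f"{atom_distance(sam_sulfur, substrate_atom):.1f} Å")
```

With real data, the two coordinate triples would come from the atom records of a structure file for the cofactor and the substrate.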
In a previous article, where we spoke about the exoribonuclease (ExoN) activity of the first domain of nsp14, we highlighted the interaction between nsp14 and nsp10 (Figure 5A). This is quite significant, as the activity of ExoN increases 35-fold when nsp10 and nsp14 are bound. Nsp10 also functions as a co-factor for nsp16, stabilizing the SAM-binding pocket7 and significantly enhancing its methyltransferase activity4 (Figure 5B). For SARS-CoV, and similarly for MERS-CoV, the affinity of nsp16 for m7GpppA-RNA and the m7GpppA cap analogue was found to be low until nsp16 bound nsp10, which enhanced its affinity for RNA8,9. Because Nsp16 activity is reduced in the absence of Nsp10 and the exonuclease domain of Nsp14 loses most of its activity, interfering with these interactions could decrease the viability of SARS-CoV-2.
As both Nsp14 and Nsp16 use the cofactor SAM and have affinity for the end cap of RNA, these two binding sites could be worthwhile targets for drug development in the fight against SARS-CoV-2. Without the VIP status provided by the methylation of RNA, the host immune system could defend against the viral RNA. It might be possible to block these binding pockets by letting the protein bind to something that is similar to SAM but cannot function as a methyl donor. An additional challenge is that the inhibitor has to be very specific to Nsp14 or Nsp16, so as not to affect similar human proteins in a negative way.
Sinefungin is a 5’-aminoalkyl analog of SAH and SAM, which can do exactly that: it has the ability to inhibit all SAM methyltransferases (Figure 6). Sinefungin was first isolated in 1973 from Streptomyces griseolus and was described as having antifungal antibiotic properties10.
A major issue with targeting the SAM binding site of Nsps with compounds such as sinefungin (Figure 6) is that there are many proteins within humans that use SAM as a cofactor for normal function. This results in sinefungin and other similar compounds having toxic effects on human cells. Synthetic chemists have already been able to synthesize analogs of sinefungin with improved affinities to specific SAM methyltransferases. Recently, specific inhibitors have been developed to target a nicotinamide SAM methyltransferase11. This inhibitor was developed to have affinity to both the cofactor binding site and the substrate binding site by combining the nicotinamide substrate with the SAM cofactor. Recent work has looked at how sinefungin binds to the active site of Nsp16, in order to understand its interaction in detail and design more specific inhibitors that can target methyltransferases from SARS-CoV-24. Similar to the development of the nicotinamide SAM methyltransferase inhibitor, developing an inhibitor which binds to the substrate binding site as well as to the cofactor binding site could be effective. As Nsp14 and Nsp16 target different substrates, any inhibitors designed in this way would likely have specificity to only one of the two methyltransferases from SARS-CoV-2. Of the two, Nsp14 might be easier to target as it has a unique structure not similar to human SAM methyltransferases.
As both Nsp14 and Nsp16 interact with Nsp10 for normal function, interfering with this interaction could reduce the activity of these enzymes. Furthermore, as the regions of Nsp10 that contact Nsp14 and Nsp16 overlap, a single, smaller target region could be enough to block binding of both proteins.
One way to look for possible drugs is repurposing those which are already approved for other diseases. An initial screen can be done in silico, by simulating the interaction between the protein and an existing, approved drug. However, such studies are highly dependent on the protein structures employed being correct, which is why we are evaluating all structures that are published for SARS-CoV and SARS-CoV-2.
If you would like to look at the currently available structures for Nsp10, Nsp14, and Nsp16, they are available from our database; we provide information on the quality of measurement data and models as well as improved structures.
All structures available for Nsp14 are bound to Nsp10 and are only available from SARS-CoV. The highest resolution structure of Nsp14 is PDB entry 5c8t at 3.2 Å. It has a bound S-Adenosyl methionine ligand as well as zinc ions present. Alongside this, another structure of Nsp14 bound to S-Adenosyl homocysteine and a guanosine-triphosphate-adenosine ligand as well as zinc has been published at 3.33 Å resolution (PDB: 5c8s). Additionally, two structures with zinc atoms but no ligands are available (PDB 5c8u at 3.4 Å and 5nfy at 3.34 Å). Both PDB entries 5c8t and 5nfy have improved structures re-refined by our group.
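When comparing structures like these, a smaller resolution value in Å means finer detail. The short Python sketch below ranks the four SARS-CoV Nsp14 entries named above by resolution. The dictionary layout and function name are our own illustrative choices, not part of any database interface.

```python
# Resolutions (in Å) of the SARS-CoV Nsp14-Nsp10 entries named in the text.
nsp14_structures = {"5c8t": 3.2, "5c8s": 3.33, "5c8u": 3.4, "5nfy": 3.34}

def rank_by_resolution(structures):
    """Return (pdb_id, resolution) pairs, best (smallest Å value) first."""
    return sorted(structures.items(), key=lambda item: item[1])

ranked = rank_by_resolution(nsp14_structures)
for pdb_id, resolution in ranked:
    print(f"{pdb_id}: {resolution} Å")
```

Resolution is only one quality indicator, which is why the quality of the measurement data and of the model are also worth checking before using a structure for in silico screening.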
Similar to Nsp14, all structures of Nsp16 are bound to Nsp10. There are currently 18 structures for Nsp16 bound to Nsp10 from SARS-CoV-2. The highest resolution structure is at 1.8 Å and has SAM, Guanosine triphosphate and Adenosine bound as well as zinc atoms. PDB entry 6wkq has Nsp16 bound to the methyltransferase inhibitor Sinefungin at 1.98 Å resolution. Two further structures of note are 7jhe and 7jib, which have various functional ligands. A further four structures are available from SARS-CoV.
Nsp10 alone: Currently there are two structures of Nsp10 from SARS-CoV-2, PDB 6zpe and 6zct; the former has the higher resolution, 1.58 Å, and contains bound zinc. There are also three structures of Nsp10 from SARS-CoV available, PDB 2fyg, 2g9t, and 2ga6.
Storing the building plans for a virus in its genome is much like how we store ideas in language. This may sound strange but, as an example, typos in spelling, grammar, or word usage, can lead to the meaning of a sentence either changing dramatically, remaining virtually unchanged, or becoming complete nonsense. The SARS-CoV-2 genome consists of RNA. Transcription of this RNA runs into a similar problem: errors can lead to the loss of function, a gain of function, or be completely inconsequential to the resulting protein (Figure 1). Large changes may break the virus, but smaller changes may provide an advantage and are essential for evolution.
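To make the typo analogy concrete, here is a minimal Python sketch that classifies a single-codon change by its effect on the encoded amino acid. The codon assignments are from the standard genetic code, but the table is deliberately partial and the function name is our own. Whether a given amino acid change leads to a loss of function, a gain of function, or no noticeable effect depends on the protein and cannot be read from the codon alone.

```python
# Partial RNA codon table: only the entries needed for this illustration.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu", "GAU": "Asp",
    "UAC": "Tyr", "UAA": "Stop",
}

def classify_mutation(original, mutated):
    """Classify a codon change as 'silent', 'missense' or 'nonsense'."""
    before = CODON_TABLE[original]
    after = CODON_TABLE[mutated]
    if before == after:
        return "silent"      # same amino acid: the protein is unchanged
    if after == "Stop":
        return "nonsense"    # premature stop: the protein is truncated
    return "missense"        # different amino acid: the effect depends on context

print(classify_mutation("GAA", "GAG"))
print(classify_mutation("GAA", "GAU"))
print(classify_mutation("UAC", "UAA"))
```

A silent change corresponds to a typo that leaves the meaning of the sentence intact, while a premature stop is closer to the sentence being cut off halfway.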
In a previous article we spoke about the copy machinery of the virus, including the RNA-dependent RNA polymerase (RdRp), and drugs targeting it, such as Remdesivir. The goal of these drugs is to jam the enzyme and halt RNA production - or to cause more errors than are sustainable, with the end result being a less infectious virus. The reason the development of drugs targeting the copy machinery of RNA is worthwhile is that humans don’t have machinery to reproduce RNA from RNA. This means drugs targeting this machinery are less likely to interfere with normal processes in people. What if the virus could quickly repair these errors before the new genome is packed into a hull and kicked out the door? That would make finding a therapeutic much more difficult…
Unfortunately, SARS-CoV-2 has a way to repair the mistakes. When errors are introduced in transcription through environmental mutagenesis or even mutations caused by nucleotide analogs like Ribavirin1–3, the non-structural protein 14 (nsp14) has the ability to remove them. This multifunctional protein removes errors with the exoribonuclease (ExoN) activity of its N-terminal domain, while the C-terminal domain has the unrelated function of methylating the end cap of the viral RNA3,4.
However, this ExoN does not work alone. There is a replication complex made up of proteins performing many roles in the production of new RNA with high fidelity. Nsp12 is the main hub that makes a new RNA chain to complement the template. Nsp7 and nsp8 have a “processivity” role to enable nsp12 to function efficiently. In addition to these proteins there is a two-component proofreading system of Helicase (nsp13) and the ExoN domain of nsp14. Helicase can detect misshapen RNA helices caused by errors made by the copy machinery5. It then unwinds these double strands of RNA and feeds the strand containing the error into the ExoN domain of nsp14, where the error is chopped out. Nsp12 can then continue RNA replication where it left off.
The proofreading ability of Helicase and nsp14 ExoN allows SARS-CoV-2 to have a huge genome compared to other viruses6 (Figure 2). The large 29.9 kb genome of SARS-CoV-2 requires much more physical space to accommodate the necessary genetic information for reproduction when compared to other RNA viruses, such as Rhinovirus, which has a genome between 7.2 kb and 8.5 kb in size (Figure 3). When no ExoN proofreading is present, genomes cannot expand beyond 20 kb in size6 (Figure 2). Removing the exoribonuclease activity might therefore cause irreversible damage to the genome of SARS-CoV-2.
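A rough back-of-the-envelope sketch shows why genome size and proofreading are linked: at a fixed per-nucleotide error rate, the expected number of errors per copy grows in proportion to genome length. The error rate below is a hypothetical placeholder chosen only for illustration, not a measured value for any virus; the genome sizes are the ones quoted above.

```python
# Hypothetical per-nucleotide copying error rate (illustration only, not measured).
ASSUMED_ERROR_RATE = 1e-4

# Genome lengths in nucleotides, from the sizes quoted in the text.
GENOME_LENGTHS = {
    "SARS-CoV-2": 29_900,
    "Rhinovirus (smallest)": 7_200,
    "Rhinovirus (largest)": 8_500,
}

def expected_errors(genome_length, error_rate):
    """Expected number of errors in one genome copy without any proofreading."""
    return genome_length * error_rate

for name, length in GENOME_LENGTHS.items():
    print(f"{name}: {expected_errors(length, ASSUMED_ERROR_RATE):.2f} errors per copy")
```

Whatever rate is assumed, the ratio between the two viruses stays the same: the SARS-CoV-2 genome would collect roughly 3.5 to 4 times as many errors per copy as a Rhinovirus genome.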
In order to understand how nsp14 can do this, we need to find out its atomic structure; this may also allow us to develop a drug which hinders its function. To date, no structure of nsp14 from SARS-CoV-2 has been solved. However, structures have been solved of nsp14 in complex with another viral protein, nsp10, both from SARS-CoV (PDB entries 5nfy, 5c8s, 5c8t, 5c8u)2,7. As the protein sequences are very similar between SARS-CoV and SARS-CoV-2 (nsp14 is 95% and nsp10 is 97% identical), it can be assumed that the structure and function of the SARS-CoV-2 proteins are very similar to those of SARS-CoV. The active site of the ExoN domain of nsp14 from SARS-CoV-2 has a DEEDh motif (named for the one-letter codes of the amino acids involved) containing two aspartates, two glutamates and a histidine2,3,7,8.
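Identity figures such as the 95% and 97% above come from comparing two aligned protein sequences position by position. The Python sketch below shows the basic calculation for two pre-aligned sequences of equal length. The two fragments are made-up examples, not real nsp14 or nsp10 sequences, and the function ignores alignment gaps, which real tools have to handle.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)

# Hypothetical 20-residue fragments (one-letter amino acid codes), illustration only.
fragment_a = "MKTAYIAKQRQISFVKSHFS"
fragment_b = "MKTAYIAKQRQISFVKSHFT"

print(f"{percent_identity(fragment_a, fragment_b):.0f}% identical")
```

In this toy case, one difference in 20 positions gives 95% identity.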
The N-terminus of nsp14 interacts with nsp10 (pink and blue, respectively, in Figure 4). The following domain (orange) has been shown to have exoribonuclease activity on double-stranded RNA in a 3’ to 5’ direction9. When nsp10 is interacting with nsp14 there is a 35-fold increase in exoribonuclease activity, which is thought to occur due to conformational changes caused by formation of the complex2,9. The ExoN domain of nsp14 (orange) is connected to the methyltransferase domain (green) by a flexible hinge (black)7,10. This flexible region opens up the methyltransferase active site to allow methylation of the N7 of the 5’ Guanosine triphosphate of RNA10. There are three zinc finger motifs in nsp14, with two found in the ExoN domain and one in the methyltransferase domain2,7. In combination with the two further zinc sites in nsp10, these zinc fingers hold loops of the proteins together and are involved in nucleotide interaction2,7.
Nsp14 has also been demonstrated to form complexes with the copy machinery, nsp12, nsp7, and nsp8, although this interaction is independent of nsp102,11,12.
Scientists are searching for drugs that could be used to target nsp14 in order to find a cure for COVID-19. The active site of the ExoN domain of nsp14 has five residues that are essential for activity and that form a negatively charged pocket (Figure 5A)7. Currently researchers are using the nsp14 structure from SARS-CoV to model a SARS-CoV-2 structure which can be used to identify compounds that could bind to the active site (Figure 5). These in silico screens start with drugs that are currently used as antiviral treatments for other viruses, such as the nucleotide analogs Remdesivir and Ribavirin or the protease inhibitor Ritonavir13–15. These compounds are then modified to bind better to Nsp14’s active site in order to block it (Figure 5B).
As the ExoN is essential to support the huge 29.9 kb genome of SARS-CoV-2, targeting nsp14 could lead to an effective treatment for COVID-19. Although drugs that target just nsp14 could be effective at increasing the error rate in RNA production by the virus, a more effective treatment will require inhibition of the RdRp of the copy machinery at the same time!
If you would like to look at the currently available structures for Nsp14 (currently only available from SARS-CoV), they are available from our database; we provide information on the quality of measurement data and models as well as improved structures. The highest resolution structure of nsp14 is PDB entry 5c8t at 3.2 Å. This has a bound S-Adenosyl methionine ligand as well as zinc atoms present. Alongside this, another structure of Nsp14 bound to S-Adenosyl homocysteine and a guanosine-triphosphate-adenosine ligand as well as zinc has been published at 3.33 Å resolution (PDB: 5c8s). Additionally, two structures with zinc atoms but no ligands are available (PDB 5c8u at 3.4 Å and 5nfy at 3.34 Å). Both PDB entries 5c8t and 5nfy have improved structures re-refined by our group.