Coronavirus
Structural Task Force

Untangling nsp3 - Papain-like Protease


An important drug target

In the first part of this series we compared the protein nsp3 from SARS-CoV and SARS-CoV-2 by sequence. Now we delve deeper into the differences between these two proteins by analyzing the structure of one nsp3 domain in particular: the papain-like protease. This domain is a highly relevant drug target because it not only cleaves the viral polyprotein but also removes some of the post-translational modifications our cells use to fight these viruses. Without the papain-like protease, the virus would be unable to replicate and spread.

Like the entire nsp3 protein, the papain-like-protease (Pl2pro) domain is localized close to the endoplasmic reticulum’s (ER) membranes. The transmembrane domains hold it in place while the majority of the protein protrudes out of the ER membrane into the cytoplasm.[1]

SARS-CoV genome
Fig 1: Position of the nsp3 gene on the SARS-CoV-1 genome. Nsp3 is separated into 12 domains. Picture by Thomas Splettstoesser, scistyle.com.

Ubiquitin-like-domain 2

We cannot discuss the Pl2pro domain without its little neighbor, which has been speculated to influence protease domain functionality.

Ubiquitin-like domain 2 (Ubl2) resides directly adjacent to the N-terminus of the Pl2pro catalytic domain. In ubiquitin-specific proteases, comparable ubiquitin-like domains are thought to recruit substrates or to increase catalytic efficiency. Across coronavirus species, Ubl2 seems to be more conserved than Ubl1.[2]

If Ubl2 is removed in SARS-CoV or murine coronavirus (MHV), Pl2pro loses its structural integrity and is no longer able to act as an interferon (IFN) antagonist (see below). However, some studies suggest that the Ubl2 domain of MERS-CoV might not be as essential as originally thought: in cell-based studies of this virus, Pl2pro retained some of its enzymatic functions without the Ubl2 domain.[3]

To date, several inconsistent roles have been reported for Ubl2, and its exact function and inner workings remain enigmatic. This highlights that there are significant differences between coronaviruses and that we need to exercise caution when applying findings from other viruses to SARS-CoV-2.

Combating the Host's Immune System

In the family Coronaviridae, viruses with either one or two papain-like protease domains can be found; SARS-CoV and SARS-CoV-2 have only one. Confusingly, this single domain is still called Pl2pro, even though it is the only papain-like protease domain encoded in the viral genome.

Pl2pro cleaves the polyprotein from nsp1 (leader protein) up to nsp3. While Pl2pro cuts at nsp1-(ELNGG↓AV)-nsp2-(RLKGG↓AP)-nsp3-(SLKGG↓KI)-nsp4, nsp5 (the 3C-like protease) cleaves the rest of the polyprotein.[2] The cysteine protease Pl2pro is similar to human ubiquitin-specific proteases (USPs) in that it adopts a right-hand fold with "thumb", "palm" and "finger" subdomains.
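The three cleavage sites listed above share a visible pattern: a leucine, one variable residue and two glycines directly before the cut. The following minimal Python sketch checks that pattern. The regular expression `L.GG` is our own reading of the three sites quoted in the text, not a definition taken from the cited sources, and the names `SITES`, `upstream` and `matches_consensus` are purely illustrative.

```python
import re

# Cleavage-site contexts as quoted in the text; "↓" marks the scissile bond.
SITES = {
    "nsp1|nsp2": "ELNGG↓AV",
    "nsp2|nsp3": "RLKGG↓AP",
    "nsp3|nsp4": "SLKGG↓KI",
}

# Assumed consensus (derived only from the three sites above):
# Leu, any residue, Gly, Gly immediately upstream of the cut.
MOTIF = re.compile(r"L.GG$")


def upstream(site: str) -> str:
    """Return the residues on the N-terminal side of the cleavage arrow."""
    return site.split("↓")[0]


def matches_consensus(site: str) -> bool:
    """True if the four residues before the cut fit the assumed L-X-G-G pattern."""
    return MOTIF.search(upstream(site)) is not None


for junction, site in SITES.items():
    print(junction, upstream(site)[-4:], matches_consensus(site))
```

A pattern this short is expected to occur by chance in many proteins, so a match on its own should not be read as evidence of a real cleavage site.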

Different regions of Plpro
Fig. 2: Plpro of SARS-CoV nsp3 (PDB-ID: 5E6J) with the catalytic triad marked in red. The finger (blue), palm (light green) and thumb (forest green) subdomains are highlighted. Picture by Kristopher Nolte

Despite the variations of Pl2pro in different coronaviruses, the same catalytic triad of three amino acid residues is essential for the stability and proteolytic activity of the domain: Cys112 is located in the thumb subdomain, while His273 and Asp287 are located in the palm subdomain. (The numbers identifying these residues can vary between species.)

nsp3Plpro catalytic Mechanism
Fig 3: Catalytic cycle and proposed chemical mechanism of SARS-CoV PL2pro proteolysis. Active site residues of the catalytic triad (Cys112, His273, Asp287) and oxyanion hole residue Trp107 are shown in black. The peptide substrate is shown in green and a catalytic water molecule is shown in blue. [1] Source: The SARS-coronavirus papain-like protease: Structure, function and inhibition by designed antiviral compounds, Báez-Santos et al.

In addition, Pl2pro has deubiquitinating and deISGylating (removal of ISG15 from target proteins) activities.[4] Ubiquitin and ISG15 both regulate facets of the immune response: they can stimulate the production of cytokines, chemokines and other IFN-stimulated gene products with antiviral properties.[6] By removing them, Pl2pro acts as an antagonist of the human immune response. ISG15 is a ubiquitin-like modifier composed of two ubiquitin-like folds and has an essential role in marking newly synthesized proteins during the antiviral response.[3] Post-translational modification by ubiquitin and interferon-stimulated gene 15 (ISG15) is reversed by isopeptide bond hydrolysis. Figure 3 shows a proposed mechanism for the cleavage of isopeptide bonds by SARS-CoV Pl2pro.

Ubiquitin bound to Plpro
Fig. 4: Ubiquitin (light blue) bound to Plpro (green) with the catalytic triad marked red. (PDB-ID: 5E6J) Picture by Kristopher Nolte

An example

Toll-like receptors (TLRs) are an important part of the human immune response; they recognize pathogen-associated molecular patterns. Pl2pro diminishes the host cell's ability to transduce the Toll-like receptor 7 (TLR7)-mediated immune response (Fig. 5) by removing Lys63-linked ubiquitin from the TNF receptor-associated factors TRAF3 and TRAF6.[5]

In addition, SARS-CoV can hamper the antiviral activities of interferon. In combination with a transmembrane (TM) domain, the Pl2pro domain inhibits the STING-mediated activation of interferon expression. PL2pro-TM interacts with TRAF3, TBK1, IKKε, STING and IRF3, the key components that assemble into a regulatory complex for the activation of IFN expression.[5]

Fig. 5: Different ways in which Pl2pro of various coronaviruses interacts with the human immune response. A pointed circle symbol marks the binding of one protein to another. If the binding has a positive effect on the protein, it is marked with a plus. A triangle marks the cleavage of ubiquitin from the target protein. In addition, nsp3 cleaves ISG15 off its targets, as shown on the right. Picture by Kristopher Nolte.

Another tool to fight the coronavirus in human cells is the "guardian of the genome", p53. The tumor suppressor protein p53 impedes the replication of SARS-CoV, but the virus fights back with Pl2pro, which binds a p53 degradation stimulator named "RING finger and CHY zinc finger domain-containing protein 1" (RCHY1 for short). Supported by the macrodomains of nsp3, this binding increases the stability of RCHY1 and hence promotes the degradation of p53. In addition, Pl2pro blocks another crucial cellular defense mechanism: the NF-κB pathway, which regulates immune responses to infections. SARS-CoV Pl2pro can stabilize IκBα, an inhibitor of NF-κB.[3]

Although Pl2pro suppresses the immune response in all coronaviruses, its targets differ between species. For example, SARS-CoV Pl2pro preferentially processes Lys48-linked polyubiquitin chains, which are markers for proteasomal degradation. MERS-CoV Pl2pro, on the other hand, shows no difference in efficiency between Lys48- and Lys63-linked di-ubiquitin chains. Lys63-linked chains are involved in signal transduction cascades of the host immune system. Studies have shown that the specificity of Pl2pro for ubiquitin and ISG15 substrates can be altered by as little as a single amino acid change.[6] Despite these differences, it is likely that at least some of the functions are similar in SARS-CoV-2.

Structural comparison

To predict the function of Pl2pro in the novel coronavirus SARS-CoV-2, we start by aligning its sequence with that of SARS-CoV Pl2pro, as we did in the first part of this series. The two domains share a similarity of 82.8% over a length of 313 amino acids. This time, however, we go for a more detailed analysis of the 54 individual differences, which are:

T3R N14I V20V N48N H49S V56Y D60N E66V D75T S77P P95Y G99N S114A V115T L116A L119T E123I K125L P129P A134D A143E N155C H170S L171Y Q173F S179D K181C C191T T195Q T196Q G200K N214E L215Q K216F G218K I221Q C225T D228K A229Q Y232K F240P Y250Q L252E Q254K G255H C259T E262S H274K K278S I284C L289L S293S T300I S308N
(The first letter refers to SARS-CoV, and the second to the amino acid residue in SARS-CoV-2.)
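A list like the one above is easy to mis-transcribe, so a quick sanity check can help. The sketch below parses each entry, counts them, and flags entries whose two letters are identical, which would be worth double-checking against the underlying alignment. The simple identity estimate assumes an ungapped alignment over exactly 313 positions, which is our simplification and not necessarily how the quoted similarity was computed; all names are illustrative.

```python
import re

# Substitution list copied from the text (SARS-CoV residue, position, SARS-CoV-2 residue).
SUBSTITUTIONS = """
T3R N14I V20V N48N H49S V56Y D60N E66V D75T S77P P95Y G99N S114A V115T L116A
L119T E123I K125L P129P A134D A143E N155C H170S L171Y Q173F S179D K181C C191T
T195Q T196Q G200K N214E L215Q K216F G218K I221Q C225T D228K A229Q Y232K F240P
Y250Q L252E Q254K G255H C259T E262S H274K K278S I284C L289L S293S T300I S308N
""".split()

DOMAIN_LENGTH = 313  # alignment length quoted in the text


def parse(entry: str) -> tuple[str, int, str]:
    """Split an entry such as 'T3R' into (old residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", entry)
    if match is None:
        raise ValueError(f"not a substitution entry: {entry!r}")
    return match.group(1), int(match.group(2)), match.group(3)


parsed = [parse(entry) for entry in SUBSTITUTIONS]

# Entries whose old and new residue are the same letter are not real differences.
unchanged = [e for e, (old, _, new) in zip(SUBSTITUTIONS, parsed) if old == new]

# Rough identity estimate, assuming every listed entry is one differing position.
identity = (DOMAIN_LENGTH - len(parsed)) / DOMAIN_LENGTH

print(len(parsed), "entries;", "same-letter entries:", unchanged)
print(f"estimated identity: {identity:.1%}")
```

Whether the flagged same-letter entries are transcription slips or conservative positions reported by the alignment tool cannot be decided from the list alone.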

Figure 6: SARS-CoV (PDB-ID: 5Y3Q) and SARS-CoV-2 (PDB-ID: 6WZU) Pl2pro superimposed. RMSD = 0.758. Differences between SARS-CoV and SARS-CoV-2 are marked in red. Picture by Kristopher Nolte
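The RMSD (root-mean-square deviation) quoted in the caption of Figure 6 summarizes how far apart paired atoms are after two structures have been superimposed. As a minimal sketch of the quantity itself, the function below computes the RMSD for two coordinate lists that are assumed to be already superimposed and paired atom by atom. It does not perform the superposition, and the coordinates are made-up toy values, not taken from 5Y3Q or 6WZU.

```python
import math

Point = tuple[float, float, float]


def rmsd(coords_a: list[Point], coords_b: list[Point]) -> float:
    """Root-mean-square deviation between two equally long, pre-superimposed point sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates: every atom is shifted by exactly 1 unit along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(toy_a, toy_b))
```

In practice the value depends on which atoms are paired (for example only Cα atoms) and on how the superposition was done, so RMSD values from different tools are not always directly comparable.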

The mutations are evenly spread over the protein. None of the residues of the catalytic triad (Cys112, His273, Asp287) is changed, as is to be expected given their conservation in all other coronaviruses. On further investigation, however, six sites differ in the motif that interacts with ubiquitin: S170T, Y171H, F216L, Q195K, T225V, and K232Q. Earlier studies concluded that the mutation of position 232 from glutamine to lysine increases the affinity for ubiquitin at the expense of de-ubiquitination effectiveness.[6] The kinetics of SARS-CoV-2 Pl2pro were therefore studied to test whether it binds ubiquitin less effectively than Pl2pro from SARS-CoV and MERS-CoV.
All three Pl2pro variants cleave ISG15 more readily than ubiquitin. SARS-CoV Pl2pro has the fastest kinetics of the three. The slower kinetics of SARS-CoV-2 Pl2pro resemble those of MERS-CoV more than those of SARS-CoV, with a turnover rate (kcat) about 10 times higher as a deISGylase than as a deubiquitinase.[6]

Besides the kinetics, the preference of Pl2pro for different polyubiquitin linkage types was measured. The results show that while SARS-CoV-2 Pl2pro can cut K48-linked polyubiquitin chains, it seems to lack the ability to cut other polyubiquitin chains. Even the K48-linked chains are cleaved at a slower rate than by SARS-CoV Pl2pro. In this regard, SARS-CoV-2 differs from MERS-CoV, which can cleave K63 linkages. It has been suggested that the decreased deubiquitinase effectiveness could contribute to the often-mild symptoms that are one reason why SARS-CoV-2 has been able to evade our quarantine efforts. But this is mere speculation, and a lot more research is needed to resolve the matter.[6]

PL2pro as a drug target

Pl2pro was identified as a potential drug target early on in SARS-CoV-2 research. Hilgenfeld et al. name two major challenges we have to overcome to find a drug targeting Pl2pro. First, the binding sites are tailor-made to bind glycine residues. Second, this very specific binding motif is rather ubiquitous in our cells. Together, these two problems make it difficult to find an inhibitor that both fits and is specific to Pl2pro. However, scientists found a weak spot: a loop called Blocking Loop 2 (BL2) regulates substrate binding and may be a promising target for inhibiting Pl2pro.[2] Naphthalene-based inhibitors, which were earlier proposed to target the BL2 of SARS-CoV, were shown to also inhibit SARS-CoV-2 Pl2pro, in particular an inhibitor called GRL-0617.[6]

For in-silico drug development, it might be prudent to choose high-resolution structures which already have a ligand or inhibitor bound, such as 6yva, 6wuu, 6wx4 or 6yaa. Technically speaking, 6wrh, albeit being a mutant, is one of the highest-quality structures available for SARS-CoV-2 Pl2pro.
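For readers who want to fetch the entries named above, the following sketch builds download links for them. The URL pattern is an assumption about the RCSB file service and should be checked against the current RCSB documentation before use. The helper `download_url` is hypothetical, and no network request is made here.

```python
import re

# PDB entries named in the text.
PDB_IDS = ["6yva", "6wuu", "6wx4", "6yaa", "6wrh"]

# Assumed base URL of the RCSB file download service (verify before relying on it).
BASE_URL = "https://files.rcsb.org/download"


def download_url(pdb_id: str, fmt: str = "pdb") -> str:
    """Build a download link for a four-character PDB ID in 'pdb' or 'cif' format."""
    if not re.fullmatch(r"[0-9][A-Za-z0-9]{3}", pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    if fmt not in ("pdb", "cif"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}/{pdb_id.upper()}.{fmt}"


for pdb_id in PDB_IDS:
    print(download_url(pdb_id))
```

Keeping the ID check separate from the download step makes it easy to validate a whole list of candidate structures before any file is fetched.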

In fact, a lot of research is still required to consolidate our understanding of this protein and its domains. In spite of that, we are making progress in our endeavor to fight this virus - and every step we take is one more to win this fight.

Sources

[1] Báez-Santos YM, St John SE, Mesecar AD. The SARS-coronavirus papain-like protease: structure, function and inhibition by designed antiviral compounds. Antiviral Res. 2015;115:21-38. doi:10.1016/j.antiviral.2014.12.015, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896749/

[2] Lei J, Kusov Y, Hilgenfeld R. Nsp3 of coronaviruses: Structures and functions of a large multi-domain protein. Antiviral Res. 2018;149:58-74. doi:10.1016/j.antiviral.2017.11.001, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7113668/

[3] Clasman JR, Báez-Santos YM, Mettelman RC, O'Brien A, Baker SC, Mesecar AD. X-ray Structure and Enzymatic Activity Profile of a Core Papain-like Protease of MERS Coronavirus with utility for structure-based drug design. Sci Rep. 2017;7:40292. Published 2017 Jan 12. doi:10.1038/srep40292, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5228125/

[4] Lei J, Hilgenfeld R. RNA-virus proteases counteracting host innate immunity. FEBS Lett. 2017;591(20):3190-3210. doi:10.1002/1873-3468.12827, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7163997/

[5] Chen X, Yang X, Zheng Y, Yang Y, Xing Y, Chen Z. SARS coronavirus papain-like protease inhibits the type I interferon signaling pathway through interaction with the STING-TRAF3-TBK1 complex. Protein Cell. 2014;5(5):369-381. doi:10.1007/s13238-014-0026-3, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3996160/

[6] Freitas BT, Durie IA, Murray J, et al. Characterization and Noncovalent Inhibition of the Deubiquitinase and deISGylase Activity of SARS-CoV-2 Papain-Like Protease [published online ahead of print, 2020 Jun 4]. ACS Infect Dis. 2020;acsinfecdis.0c00168. doi:10.1021/acsinfecdis.0c00168, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274171/


The world holds its breath as the novel coronavirus continues to spread across the globe, bringing our lives to a halt. We have gathered a lot of knowledge about the virus, but there are still many gaps to fill. The non-structural protein 3 (nsp3) represents one of these gaps in our knowledge. As the largest protein encoded by the coronavirus genome, untangling its structure and function poses a huge task.

However, we can glean some knowledge about the specific function of SARS-CoV-2 nsp3 by looking at the virus's subfamily, Orthocoronavirinae. As related viruses share some common traits, academics were not completely unprepared when SARS-CoV-2 arrived. In the background, while only very few people were worried about a new coronavirus, scientists around the world had been investigating the invisible enemy for decades. Building on this past work, we look at the functions of proteins from other coronaviruses, like Murine Hepatitis Virus (MHV) and SARS-CoV, to learn more about how best to fight SARS-CoV-2.

Fig. 1: The crystal structure of papain-like protease of SARS CoV-2 nsp3 (PDB-ID: 6w9c). Picture by Kristopher Nolte.

The gene that encodes nsp3 lies in open reading frame 1a (ORF1a), which encodes polyprotein 1a. The nsp3 sequence of SARS-CoV is 1922 amino acids long and is sandwiched between nsp2 and nsp4. With its papain-like protease domain, nsp3 cleaves not only itself from the polyprotein but also nsp1 and nsp2. In coronaviruses, 18 different domains have been found in nsp3. Each virus has 10 to 16 of these, of which eight domains and two transmembrane regions form the conserved part of nsp3 that can be found in every coronavirus known to date [1]:

  1. Ubiquitin-like domain 1 (Ubl1)
  2. Ubiquitin-like domain 2 (Ubl2)
  3. Papain-like protease (PlPro)
  4. Macro domain / X domain (Mac)
  5. Hypervariable region / Glu-rich acidic domain (HVR)
  6. Transmembrane region 1 (TM1)
  7. Transmembrane region 2 (TM2)
  8. Ectodomain / Zinc finger domain (3ecto)
  9. Nidovirus-conserved domain of unknown function (Y1)
  10. Coronavirus-specific carboxyl-terminal domain (CoV-Y)

To start our investigation on SARS-CoV-2 related structural data, we will look into the protein sequences of SARS-CoV and SARS-CoV-2 to learn where they are similar and where they differ.

Genetic Comparison of SARS-CoV and SARS-CoV-2

The nsp3 protein of SARS-CoV has 16 domains, which span 1922 amino acids. The nsp3 protein of SARS-CoV-2 is a bit longer at 1945 amino acids. When the two are compared, there is an overall similarity of 75.97%.[2] In addition to the ten conserved domains, the nsp3 gene of SARS-CoV-2 codes for four further domains:

  1. Nucleic acid-binding domain (NAB)
  2. Betacoronavirus specific marker domain (βSM)
  3. Domain preceding Ubl2 and PL2pro (DPUP)
  4. Amphipathic helix 1 (AH1)

The two domains at the N-terminal end, Ubl1 and HVR, show an alignment of 79% and 64%, respectively. There seems to be a trend in coronaviruses for these domains to be poorly conserved in sequence, although Ubl1 still adopts the expected conserved fold.[4] Whether this also holds for SARS-CoV-2 could be tested by comparing the sequence alignment with the structural similarity. It is unsurprising that the hypervariable region lives up to its name and shows the worst alignment of all. In the related MHV nsp3, this domain is dispensable for replication.[5]
It has been speculated that the Mac1 domain functions as an ADP-ribose 1"-phosphatase; however, the effects of mutations in this region differ from virus to virus.[4] As a result, it is difficult to judge without further research what significance the poor alignment of this domain has for our understanding of SARS-CoV-2.
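Percentages such as those quoted above come out of a pairwise alignment. As a minimal sketch of what such a number means, the function below computes percent identity for two sequences that are already aligned and of equal length. The convention used here (identical columns divided by all aligned columns, gaps counting as mismatches) is one of several in use and is an assumption on our part; BLAST reports its own values and may count differently. The two short sequences are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns with identical residues (gaps count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap; they hold no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment, not real nsp3 sequence.
toy_cov1 = "MKT-LLV"
toy_cov2 = "MRTALLV"
print(f"{percent_identity(toy_cov1, toy_cov2):.1f}%")
```

Because the denominator differs between conventions, identity values are only comparable when they were computed the same way, which is worth keeping in mind when comparing per-domain numbers from different sources.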

Table 1: The amino acid range of each domain in SARS-CoV-1 was taken from Lei et al., 2018 [1]. The ranges for SARS-CoV-2 were determined by taking the amino acid ranges of CoV-1 and using BLAST [2] to search for the best alignment of the domain sequences. Picture by Kristopher Nolte

The Mac1 domain, also known as the X-domain, is followed by two macrodomains which were originally called "SARS-CoV Unique domains" (SUD-N and SUD-M), but were renamed when they were found not to be unique to SARS-CoV. It has since been observed that only Mac3 plays an essential role in viral RNA replication[6], which could explain why Mac3 is one of the most conserved domains in the alignment of SARS-CoV and SARS-CoV-2.

Pl2Pro and its neighbouring domain Ubl2 show some of the highest sequence similarities of all domain comparisons. This could be explained by their essential function in cleaving nsp3 from the polyprotein.
Little is known about the domains following Pl2Pro, and our current structural knowledge is limited to a nuclear magnetic resonance (NMR) structure of NAB. While the structure and function of Y1 and CoV-Y from SARS-CoV-2 are currently unknown, their sequence, which comprises about a fifth of nsp3, is highly conserved in all coronaviruses.

Fig. 2: The location of the aligned domains of SARS-CoV (abbreviated CoV-1) and SARS-CoV-2 (abbreviated CoV-2) is shown over the length of nsp3 (TM1 = 1, TM2 = 2, AH1 =A). Picture by Tim Scharf.

In the second part of the series Untangling Nsp3 of SARS-CoV-2, we will delve deeper into some structures of nsp3 from SARS-CoV-1 and SARS-CoV-2 and try to find out how the differences in sequence may have influenced the structure of the protein. For further in-depth reading on the topics discussed here, I highly recommend the sources below.

Table 2: For each domain and its respective counterpart in SARS-CoV-2, a BLAST search was conducted to find matching PDB-IDs. Last update: 18.05.2020. The scripts and the PDB data can be found in our Git repository [3]
Picture by Kristopher Nolte

Sources

  • [1] Lei J, Kusov Y, Hilgenfeld R. Nsp3 of coronaviruses: Structures and functions of a large multi-domain protein. Antiviral Res. 2018 Jan;149:58-74. doi: 10.1016/j.antiviral.2017.11.001. Epub 2017 Nov 8. PMID: 29128390; PMCID: PMC7113668.
  • [2] Madden T. The BLAST Sequence Analysis Tool. 2002 Oct 9 [Updated 2003 Aug 13]. In: McEntyre J, Ostell J, editors. The NCBI Handbook [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2002-. Chapter 16. Available from: http://www.ncbi.nlm.nih.gov/books/NBK21097/
  • [3] https://github.com/thorn-lab/coronavirus_structural_task_force
  • [4] Benjamin W. Neuman, Bioinformatics and functional analyses of coronavirus nonstructural proteins involved in the formation of replicative organelles, Antiviral Research, Volume 135, 2016, Pages 97-107, ISSN 0166-3542, https://doi.org/10.1016/j.antiviral.2016.10.005.
  • [5] Hurst KR, Koetzner CA, Masters PS. Characterization of a critical interaction between the coronavirus nucleocapsid protein and nonstructural protein 3 of the viral replicase-transcriptase complex. J Virol. 2013;87:9159-9172.
  • [6] Kusov Y, Tan J, Alvarez E, Enjuanes L, Hilgenfeld R. A G-quadruplex-binding macrodomain within the "SARS-unique domain" is essential for the activity of the SARS-coronavirus replication-transcription complex. Virology. 2015 Oct;484:313-22. doi: 10.1016/j.virol.2015.06.016. Epub 2015 Jul 3. PMID: 26149721; PMCID: PMC4567502.
