Lea C. von Soosten, Maximilian Edich, Kristopher Nolte, Johannes Kaub, Gianluca Santoni and Andrea Thorn
This blog post was published in Crystallography Reviews.
Please cite: https://doi.org/10.1080/0889311X.2022.2098281
Abstract
With up to 17 domains, non-structural protein 3 (nsp3) is the largest protein of SARS-CoV-2. In part due to its large size, many of its functions still remain a mystery. It is known that nsp3 fulfils several essential functions in the cycle of infection; however, most of its domains have not been structurally determined. One of its essential functions is to cleave the polyprotein, which is translated first upon infection, into other functional non-structural proteins. Nsp3 is also involved in the evasion of the host immune system and forms large pore complexes important for viral replication. Furthermore, it interacts with more than 30 other host and viral proteins, resulting in a multitude of potential ways to affect the host cell and viral replication. The many roles of this coronaviral Swiss army knife make it a promising drug target. In this review, we aim to clarify naming conventions and give an overview on the structures and functions of its domains as a starting point for further research.
Introduction
Non-structural protein 3 (nsp3) is the largest protein of SARS-CoV-2 [1] and plays an important role within the infection cycle. Even though many of its functions are currently still unknown or uncertain, several studies do enable insight into its mechanisms, some of which are essential for viral replication. This makes nsp3 a promising drug target for therapeutics against COVID-19. The complexity behind nsp3 however lies not only in its size, but also in its great number of separate functional domains (see Figure 1), making it a ‘Swiss army knife’ of viral proteins. Depending on the definition of the domain borders, it consists of up to 17 domains. At the time of this review, only six of those are structurally solved.
Upon infection by SARS-CoV-2, the two polyproteins pp1a and pp1ab are translated and cleaved into individual non-structural proteins by two viral proteases. While the proteins nsp1 to nsp3 are cleaved from the polyprotein by the Papain-like Protease domain of nsp3, the remaining proteins nsp4 to nsp16 are cleaved by the 3C-like protease (also known as Mpro or nsp5) [2]. Some of the other domains of nsp3 interact with host proteins and interrupt the anti-viral signal transduction [3], while others interact with nsp4 to form double membrane vesicles [4], in whose membranes nsp3 is anchored by two transmembrane domains [3] (see Figure 2). These vesicles protect the replication machinery against host proteases and ensure the replication of viral RNA. Together with nsp4 and nsp6, nsp3 also forms molecular pores in these vesicles through which the RNA can be exported for packing into new virions [5]. Altogether, nsp3 is an immensely complex protein, the overall fold of which is still unclear.

Figure 1. Domains of nsp3 in sequence. Structural examples for the domains from SARS-CoV-2 or SARS-CoV-1 are depicted below or above the domain chart, respectively. For domains depicted in grey, no Sarbecovirus structures have been solved yet. Figure was created using Protein Imager [40].
A detailed review on the nsp3 of coronaviruses up until 2018 (excluding SARS-CoV-2) was published by Lei et al. [3], giving a good overview of its function in different viruses.
Here, we provide an overview of the current state of nsp3 structural biology, with a view to recent discoveries boosted by the COVID-19 pandemic.

Figure 2. Interaction network of nsp3 domains and various interaction partners from the virus or the host. Domains are depicted in round boxes; viral components are shown as grey boxes and host components are shown in orange. The connecting lines show if the interaction was shown in vivo (black), shown in vitro (grey, solid) or only proposed (grey, dashed). Some of those interactions have only been proven for SARS-CoV-1 so far. The details for each interaction are found in the text.
Function and structure
Some nsp3 domains are present in all coronaviruses, whereas others only exist in specific coronaviruses. The multifunctionality of this protein along with inconsistent domain nomenclature makes it hard to understand in its entirety. An overview of all domains in SARS-CoV-2 nsp3 and their different names used in literature is given in Table 1. So far, no full-length structure of nsp3 has been solved and structural models are available for only 6 of its 17 domains. These include both ubiquitin-like domains (Ubl1 and Ubl2), the macrodomain 1 (Mac1), the Papain-like Protease (PL2pro) and the nucleic-acid-binding domain (NAB) as well as the Y3 domain. At the time of writing this review more than 300 experimentally determined structures of nsp3 domains are available in the PDB. A selection of those is listed in Table 2, describing at least one structure per available domain. In the following, a short introduction to the function of each domain is given, followed by a section with a closer look at existing structures, functional elements, and corresponding mechanisms, where available.
Domains
Table 1 Names and abbreviations. Cells shaded in grey indicate domains with available protein structures. The subsection column lists alternative names for (multi-)domain regions, which were used in the past. The used NCBI reference sequences are NP_828862.2 and YP_009742610.1 for SARS-CoV-1 and SARS-CoV-2, respectively. References: Lei et al. [3]; S. M. Korn et al. [6]; Schuller et al. [7]; N. Salvi et al. [8].
a These numbers are based on the predictions by the TMHMM 2.0 Server [9].
b These numbers are based on the numbers of known structures and/or the predicted transmembrane regions and these regions might include also disordered linker sequences. It was not possible to determine the exact amino acid separating Y1 and CoV-Y.
c Y3 is a part of CoV-Y. Ranges without annotation were determined via available structures.
d Sequence identity was calculated for Y1 combined with CoV-Y without Y3, as their separation is not clear.
Complete Name | Alternative Name | Used Abbreviation | Subsection | Amino acid numbers/range | Sequence identity to SARS-CoV-1 [%] |
Ubiquitin-like domain 1 | | Ubl1 | nsp3a | 1–111 | 76.58 |
Hypervariable region | Glu-rich acidic region, intrinsically disordered region (IDR) | HVR | nsp3a | 112–206 | 35.79 |
Macrodomain 1 | X-domain, Macrodomain, ADP-ribose-1″-phosphatase (ADRP), S2-MacroD, MacroD | Mac1 | nsp3b | 207–379 | 71.68 |
Macrodomain 2 | SARS-unique domain N (SUD-N) | Mac2 | nsp3c | 413–550 | 69.57 |
Macrodomain 3 | SARS-unique domain M (SUD-M) | Mac3 | nsp3c | 551–675 | 81.60 |
Domain preceding Ubl2 and PL2pro | SARS-unique domain C (SUD-C) | DPUP | nsp3c | 676–745 | 74.29 |
Ubiquitin-like domain 2 | | Ubl2 | nsp3d | 746–804 | 89.83 |
Papain-like Protease | PLpro | PL2pro | nsp3d | 805–1063 | 81.08 |
Nucleic-acid-binding domain | | NAB | nsp3e | 1089–1203 | 81.74 |
Betacoronavirus-specific marker domain | Group-2-specific marker domain (G2M) | βSM | | 1204–1412 (b) | 68.90 |
Transmembrane region 1 | | TM1 | | 1413–1435 (a) | 73.91 |
Nsp3 ectodomain | Lumenal loop | 3Ecto | | 1436–1522 (a) | 70.11 |
Transmembrane region 2 | | TM2 | | 1532–1554 (a) | 78.26 |
Amphipathic helix 1 | | AH1 | | 1561–1583 (a) | 86.96 |
Nidovirus-conserved domain of unknown function | | Y1 | | 1584–? (b) | 88.08 (d) |
Coronavirus-specific C-terminal domain | | CoV-Y | Y2 | ?–1843 (b) | 88.08 (d) |
Coronavirus-specific C-terminal domain | | CoV-Y | Y3 | 1844–1945 (c) | 90.20 |
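The residue ranges in Table 1 lend themselves to a simple lookup. The following is a minimal sketch and not part of the original review: the names `DOMAINS`, `domain_at`, `domain_length` and `gap_between` are illustrative, Y1 and Y2 are omitted because their shared border is undetermined, and residues that fall between listed ranges are simply reported as unassigned.

```python
# Residue ranges (1-based, inclusive) copied from Table 1.
# Y1 and Y2 are left out because their shared border is not determined.
# "betaSM" stands for the betacoronavirus-specific marker domain.
DOMAINS = [
    ("Ubl1", 1, 111),
    ("HVR", 112, 206),
    ("Mac1", 207, 379),
    ("Mac2", 413, 550),
    ("Mac3", 551, 675),
    ("DPUP", 676, 745),
    ("Ubl2", 746, 804),
    ("PL2pro", 805, 1063),
    ("NAB", 1089, 1203),
    ("betaSM", 1204, 1412),
    ("TM1", 1413, 1435),
    ("3Ecto", 1436, 1522),
    ("TM2", 1532, 1554),
    ("AH1", 1561, 1583),
    ("Y3", 1844, 1945),
]


def domain_at(residue):
    """Return the domain containing a residue number, or None if unassigned."""
    for name, start, end in DOMAINS:
        if start <= residue <= end:
            return name
    return None


def domain_length(name):
    """Return the number of residues in the named domain."""
    for dom, start, end in DOMAINS:
        if dom == name:
            return end - start + 1
    raise KeyError(name)


def gap_between(first, second):
    """Return the number of residues between the end of one domain and the start of another."""
    starts = {name: start for name, start, _ in DOMAINS}
    ends = {name: end for name, _, end in DOMAINS}
    return starts[second] - ends[first] - 1
```

With these ranges, `domain_length("HVR")` gives 95 and `gap_between("Mac1", "Mac2")` gives 33, which agrees with the residue counts quoted for the hypervariable region and the Mac1–Mac2 linker in the text.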
Nsp3a – ubiquitin-like domain 1 and the hypervariable region
Starting at the nsp3 N-terminus, the first two domains are the ubiquitin-like domain 1 (Ubl1) and the Glu-rich acidic region (AC domain), with the latter alternatively known as hypervariable region (HVR). Both exist in all coronaviruses and together are called nsp3a in some of the literature [3]. Although the specific function of the coronaviral ubiquitin-like domain is unknown, studies indicate that the domain interacts with the viral nucleocapsid protein [8,10] and is capable of binding to single-stranded RNA in SARS-CoV-1 [3,11]. Nucleocapsid packs and protects viral RNA in the virion. This suggests that Ubl1 acts as a key interaction partner to facilitate association between the nucleocapsid, the replication/transcription complex (RTC) and the viral RNA [10]. It may also play an essential role in the viral replication process; one study for example showed that mutants with deleted Ubl1 core regions in mouse hepatitis viruses were not able to replicate [3].
Table 2 PDB entries for nsp3 domains. For each entry we list ID, technique, a comment on its importance and an evaluation of the most intense Fourier difference peaks as calculated from Coot as well as a statement on the model quality. A structure for each domain was chosen if possible. For Mac2, Mac3, and DPUP, structures were only available for SARS-CoV-1. For Mac1 and PL2pro, we listed one structure bound to a ligand and an additional structure without a ligand at the highest available resolution.
SARS-CoV-2 | | | | |
ID | Method | Resolution in Å | Description | Comment about the highest Fourier difference peak from Coot |
7KAG | X-ray diffraction | 3.21 | Ubl1 domain shown as a dimer. Model includes 13 molecules of ethylene glycol and one sulfate. Both monomers show contact at their N-terminal β-sheet. | 22 peaks, with the highest being around Ser93. |
7KQP | X-ray diffraction | 0.88 | Macrodomain in complex with ADP-ribose. Only one monomer of the shown dimer is bound to the ligand. | 87 peaks, mainly water molecules missing. |
7KR0 | X-ray diffraction | 0.77 | Macrodomain without inhibitor at 100 K in highest resolution. Model shows a monomer. | 87 peaks, mainly water molecules or crystal solution components missing. |
7JRN | X-ray diffraction | 2.48 | PL2pro and Ubl2 in complex with inhibitor GRL0617. Model shows a dimer and includes, next to two inhibitor molecules, two zinc atoms located at the finger domains and five sulfates in total. | 12 peaks. The highest one suggests that sulfate 405 could be a water molecule instead. |
7D6H | X-ray diffraction | 1.6 | PL2pro and Ubl2 C111S mutant without inhibitor in highest resolution. Shown is a monomer with a zinc atom at the finger domain and a phosphate. | 59 peaks, mainly water molecules missing. |
7LGO | X-ray diffraction | 2.45 | NAB domain shown as a dimer surrounded by water. | 4 peaks, the highest could point to an alternative conformation of Thr92 in chain B. |
7RQG | X-ray diffraction | 2.17 | Tetramer of Y3 domain, where a loop and parts of the C-terminus of one monomer are not modelled. | 3 peaks, the highest indicates a possibly cleaved disulfide bridge between Cys1926 from chains B and D. |
SARS-CoV-1 | | | | |
2W2G | X-ray diffraction | 2.22 | Dimer of Mac2 and Mac3, where the linker between those is only partially modelled. The model includes two sulfates. | 2 peaks, no major problems can be found in the structure. |
2KQW | Solution NMR | | DPUP as a monomer. Although it contains the sequence of the SUD region, only the DPUP domain is modelled. | |
Comparison of coronavirus genomes indicates a co-evolution between nucleocapsid and Ubl1. While any coronaviral nucleocapsid binds to any Ubl1, strong binding affinity was only measured for proteins of the same virus. For example, in experiments with bovine coronavirus (BCoV) and mouse hepatitis virus (MHV), the binding of BCoV nucleocapsid and MHV Ubl1 was lower than the binding of MHV nucleocapsid to MHV Ubl1 by a factor of 260 [3]. Additionally, Ubl1 shows a high structural similarity to human ubiquitin and the ubiquitin-like domain of human interferon-stimulated gene 15 (ISG15). Therefore, it is suggested that Ubl1 interacts with ubiquitin- and/or ISG15-targeting proteins and thus interferes with anti-viral signal transduction, since these kinds of proteins are often involved in immune signal transduction pathways [3].
The function of the Glu-rich acidic region remains a mystery. It lives up to its alternative name (hypervariable region), as it is poorly conserved despite its presence in all coronaviruses. Of its 95 residues in SARS-CoV-2, glutamic acid and aspartic acid make up 22% and 11%, respectively. In contrast, the Glu-rich acidic region of SARS-CoV-1 consists of 69 residues, of which glutamic acid and aspartic acid make up 36% and 12%, respectively. It is also suggested to be intrinsically disordered [3]. Possible functions include a regulatory role and the interaction with other non-structural proteins of the virus [3], while Glu-/Asp-rich proteins in general are also adept at mimicking DNA or RNA [12], supporting the interaction between Ubl1 and the nucleocapsid.
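Percentages such as the Glu and Asp content quoted above follow from a simple residue count. The sketch below is illustrative only: `residue_percentages` is a hypothetical helper, and the 20-residue string is a made-up toy sequence, not the real hypervariable region of either virus.

```python
from collections import Counter


def residue_percentages(sequence, residues=("E", "D")):
    """Return the percentage of each requested one-letter residue in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    # Counter returns 0 for residues that do not occur at all.
    return {res: 100.0 * counts[res] / len(seq) for res in residues}


# Made-up toy sequence for demonstration, NOT the real hypervariable region.
toy_sequence = "EEDEAGSEEDKLEEPTEDVE"
print(residue_percentages(toy_sequence))
```

Applied to the actual 95-residue and 69-residue regions, the same count would be expected to reproduce the percentages given in the text.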
Overall structure and functional features
Currently, one structure of the SARS-CoV-2 ubiquitin-like domain 1 is available (PDB: 7KAG) (see Figure 3 and Table 2). For the hypervariable region, no structure has been deposited so far, likely due to its proposed disordered nature [3]. Disorder is consistent with its high variability among coronaviruses, which argues against common conserved structural elements.
For SARS-CoV-1 Ubl1, one NMR ensemble (PDB ID: 2GRI) and one conformer resembling the mean coordinates (PDB ID: 2IDY) have been published. Sequence identity between the structures from 7KAG and 2IDY amounts to 76.58%. The root mean square deviation (RMSD) between these structures’ Cα positions is 4.7 Å. However, both termini seem to be disordered. Thus, removing the first 18 residues and everything beyond residue 105 reduces the RMSD to 1.8 Å. Despite high sequence similarity, both folds differ at the disordered 12 N-terminal residues and slightly in the length of sheets and helices (Figure 3(a)). The secondary structure elements of 7KAG follow the sequence β1–β2–β3–α1–β4–α2–α3–α4–β5. The overall shape of the domains is similar in both viruses, indicating a conserved function.
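The effect of trimming disordered termini on the RMSD can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: the two coordinate sets are already superposed and share one residue numbering, `ca_rmsd` is a hypothetical helper, and the coordinates are toy values rather than data from 7KAG or 2IDY. No superposition step is performed here.

```python
import math


def ca_rmsd(coords_a, coords_b, keep=None):
    """RMSD over paired C-alpha coordinates that are already superposed.

    coords_a, coords_b: dicts mapping residue number -> (x, y, z) in angstrom.
    keep: optional (first, last) residue range; residues outside are ignored.
    """
    shared = sorted(set(coords_a) & set(coords_b))
    if keep is not None:
        first, last = keep
        shared = [res for res in shared if first <= res <= last]
    if not shared:
        raise ValueError("no residues in common")
    total = 0.0
    for res in shared:
        # Squared distance between the two positions of this residue.
        total += sum((a - b) ** 2 for a, b in zip(coords_a[res], coords_b[res]))
    return math.sqrt(total / len(shared))


# Toy coordinates: the two terminal residues deviate, the core is identical.
model_a = {1: (0, 0, 0), 2: (1, 0, 0), 3: (2, 0, 0), 4: (3, 0, 0)}
model_b = {1: (0, 0, 8), 2: (1, 0, 0), 3: (2, 0, 0), 4: (3, 0, 6)}
print(ca_rmsd(model_a, model_b))               # all residues
print(ca_rmsd(model_a, model_b, keep=(2, 3)))  # termini removed
```

For the comparison described above, the corresponding call would restrict the range to residues 19 to 105, assuming both models are numbered identically.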
Nsp3b – macrodomain 1/ ADP-ribose-phosphatase (ADRP)
The macrodomain 1, also known as nsp3b, X-domain or ADP-ribose phosphatase domain (ADRP), is a conserved domain found in all coronaviruses [3]. ADP-ribosylation of proteins and DNA is a post-translational modification, forming either monomeric ADP-ribose or poly-ADP-ribose (PAR) [13,14] conjugated through the C1 of the distal ribose at the end of the ADP-ribose molecule [7]. These modifications can be formed by the human poly(ADP-ribose) polymerases (PARPs), which are involved in numerous processes, including stress response, protein degradation and signalling [16]. Many of these PARPs are expressed during the interferon (IFN) response, are part of the immune response and hold antiviral activity [14,15].
Macrodomain 1, despite being called ADP-ribose phosphatase, is in fact a mono(ADP-ribosyl)hydrolase, i.e. it binds to monomeric ADP-ribosyl moieties (MAR) and hydrolyses them from target proteins [16]. The macrodomain 1 of SARS-CoV-2 has also been shown to reverse the human PARP14-derived ADP-ribosylation [15] and therefore antagonizes the host’s immune response. With this, the macrodomain plays an important role in the viral pathogenesis and therefore constitutes an interesting drug target [17]. This function is conserved throughout coronaviruses despite a sequence divergence of 28% between SARS-CoV-2 and SARS-CoV-1 (Table 1) and 59% between SARS-CoV-2 and MERS-CoV [16].
While a poly(ADP-ribosyl)hydrolase activity has been reported for other viral macrodomains, including that of SARS-CoV-1 [18,19], the SARS-CoV-2 macrodomain 1 binds poly(ADP-ribose) moieties only weakly and is unable to hydrolyse them. Its mono(ADP-ribosyl)hydrolase activity, in contrast, has been demonstrated [20].

Figure 3. Structures of the domains Ubl1 (a) (PDB-ID: 7KAG), NAB (b) (PDB-ID: 7LGO), and Y3 (c) (PDB-ID: 7RQG) from SARS-CoV-2. The left column shows the structures in cartoon representation with its surface, the central column shows it in cartoon alone and the right column shows for Ubl1 and NAB domains the superimposition between the respective structure from SARS-CoV-2 (blue) and its counterpart from SARS-CoV-1 in purple (PDB-ID for Ubl1: 2IDY; PDB-ID for NAB: 2K87). Figure was created using Protein Imager [40].
Overall structure and functional features
The overall structure of this domain is formed by a seven-stranded β-sheet wedged between two layers of helices, following the characteristic structure of a MacroD-type macrodomain. The β-sheet comprises the strands in the order β1, β2, β7, β6, β3, β5 and β4; the two layers encompassing the β-sheet consist of α1, α2 and α3 on one side and η1, α4/η2, η3, α5 and α6 on the other side [17] (see Figure 4).
The ADP-ribose-binding pocket is formed by the C-terminal ends of the β-strands β3, β5, β6 and β7 in the centre of the sheet and the N-terminus of the α1-helix. The substrate is further surrounded by the 3₁₀-helix η3 formed by the loop between β6 and α5 as well as the connecting loop between β3 and α2 [17].
The pocket can be divided into the adenosine site, formed by the subsites involved in the binding of adenine and the proximal ribose, and the catalytic site, formed by the subsites involved in binding the diphosphate and distal ribose [7,17]. Within the structure, four regions that are highly conserved in different coronaviruses can be identified [16,17]. The first in sequence is the VNAAN motif (residues 36–40) in β3, with N40 forming hydrogen bonds with the 3″ hydroxyl group of the distal ribose. The second conserved region resides in the loop between β3 and α2 around the GGG motif (residues 46–48); it is involved in binding the 2″ and 1″ hydroxyl groups of the distal ribose [16]. The third region is located at the end of β5 with the amino acids VGP at positions 96–98 [16,17]. The last highly conserved region can be found at the end of β6 up to the end of η3, containing the GIF motif at 130–132. It interacts with the phosphates and is also involved in the binding of the distal ribose, where Phe132 might be involved in hydrolysis [16,17]. The substrate-binding regions show some degree of flexibility, which does not lead to large conformational changes within the rest of the protein, but instead to local shifts and rearrangements of certain structural elements like the β3–α2 or β6–(η3–)α5 loop [17].
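Locating such conserved motifs in a domain sequence is a plain substring search with 1-based residue numbering. The following is a minimal sketch: `find_motif` is a hypothetical helper and the example string is a made-up toy sequence, not the macrodomain 1 sequence.

```python
def find_motif(sequence, motif):
    """Return 1-based (start, end) positions of every occurrence of a motif.

    Overlapping occurrences are reported as separate hits.
    """
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        # Continue one residue further so overlapping hits are not skipped.
        start = sequence.find(motif, start + 1)
    return hits


# Made-up toy sequence for demonstration only.
toy_sequence = "AAVNAANGGGA"
for motif in ("VNAAN", "GGG", "VGP", "GIF"):
    print(motif, find_motif(toy_sequence, motif))
```

Run on the macrodomain 1 sequence numbered as in the cited structures, the search would be expected to return the residue ranges listed above for VNAAN, GGG, VGP and GIF.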

Figure 4. The SARS-CoV-2 macrodomain 1. Left: with surface and its substrate ADP-ribose, PDB ID: 7KQP. Right: labelled structure without substrate (PDB ID: 7KQO). Figure was created using Protein Imager [40].
Available structures
To date, 265 structures of the SARS-CoV-2 macrodomain 1 are available. Here, a selection of relevant structures will be given. A total of 10 apo-structures are currently available for SARS-CoV-2 nsp3 macrodomain 1, ranging from 0.77 Å to 2.03 Å in resolution (PDB IDs: 7KR0, 7KQO, 7KQW, 6WEY, 5S74, 5S73, 6WEN, 7KG3, 7KR1, 6VXS). Of the macrodomain bound to ADP-ribose, 7 structures are available, their resolution ranging from 0.88 Å to 3.83 Å (PDB IDs: 7KQP, 6W02, 6Z5T, 6WOJ, 6YWL, 7CZ4, 7C33). The remaining 248 structures show the domain in complex with various molecules, including buffers like MES (PDB IDs: 6WCF, 6YWM) and HEPES (PDB ID: 6YWK), nucleotides (PDB IDs: 6W6Y, 7BF4), ADP-ribose analogues (PDB IDs: 7BF5, 7BF6), adenosine (PDB ID: 7BF3) and cAMP (PDB ID: 7JME). Other structures show complexes with the novel compounds PARG-345 (PDB ID: 7LG7) and PARG-329 (PDB ID: 7KXB) as well as the remdesivir metabolite GS-441524 (PDB ID: 7BF6) and ADP-ribose phosphate (PDB ID: 7BF5). Numerous structures were solved as part of a study conducted by Schuller et al. [7] through a combination of computational docking and crystallographic screening of small molecule fragments, resulting in more than 230 structures with 214 unique fragments bound to macrodomain 1.
In total, 192 of the fragments were located at the active site, 14 were bound to a distant pocket at Lys90 located at the backside of the protein, the rest was spread across the protein’s surface. Another interesting finding of the study is the distribution of fragments within the active site. While most were found to bind at the adenine subsite, 54 fragments were found to bind to a location formed by the backbone nitrogens of Phe156 and Asp157 close to the adenine subsite, and only a few bound to the catalytic site [7].
Further, a comparison between crystal structures in the apo-form at different temperatures (100 K: PDB ID 7KR0; 310 K: PDB ID 7KR1) was carried out, showing a more compact structure and some loop displacement close to the active site at lower temperature [7]. Besides crystallization conditions such as the presence of ligands, a number of studies additionally observed a strong influence of small alterations within the N- and C-termini of the domain’s amino acid sequence on the resulting crystal forms [7,15,16]. Additionally, different crystal forms exhibit different accessibility for molecules to the active site of the macrodomain, as their symmetry mates partly obstruct possible access points, for example in the C2 crystal packing. Hence, the slightly more accessible P4₃ crystal packing appears to be the most represented among the available structures [7].
The macrodomain 1 is an interesting drug target as it acts as an antagonist of ADP-ribosylation by cellular (ADP-ribosyl)transferases such as PARP14 [15], making it a crucial part of the virus due to its ability to interfere with the host immune response [14,15]. Even though the exact mechanism of action is not completely known yet, the vast number of available structures of different complexes already provides plenty of information about this domain and forms a good baseline for the further search for suitable inhibitors.

Figure 5. Nsp3c or SUD of SARS-CoV-1: (A) Structure of macrodomain 2 with (left) and without surface (right), PDB ID: 2W2G, amino acids 389–516. (B) Structure of macrodomain 3 with (left) and without surface (right), PDB ID: 2W2G, amino acids 527–652. (C) Structure of the DPUP with (left) and without surface (right), PDB ID: 2KQW. Figure was created using Protein Imager [40].
Nsp3c – macrodomain 2, macrodomain 3 and the domain preceding Ubl2 and PL2pro
Nsp3c, formerly known as the SARS-unique domain (SUD), is predominantly found in the clade of Sarbecoviruses, which contains SARS-CoV-1 and SARS-CoV-2. After it had also been found outside of SARS coronaviruses, the domains were renamed, but all former names are currently still in use. Nsp3c consists of macrodomain 2 (Mac2, formerly SUD-N), macrodomain 3 (Mac3, formerly SUD-M) and the domain preceding Ubl2 and PL2pro (DPUP, formerly SUD-C) [3]. Mac2 is separated from Mac1 via a linker of 33 residues. Notably, the Mac2 domain does not exist in MERS-CoV [21]. So far, no protein structures of these domains of SARS-CoV-2 are available, but a few were published for SARS-CoV-1, enabling predictions about their relatives in SARS-CoV-2 due to the high sequence similarity (Table 1). These include one structure of Mac2 (PDB ID: 6YXJ), five of Mac3 (PDB IDs: 2JZD, 2JZE, 2JZF, 2RNK, 2KQV), two structures spanning Mac2 and Mac3 (PDB IDs: 2W2G, 2WCT) and one showing DPUP (PDB ID: 2KQW).
Overall structure and functional features
Available NMR chemical shift assignments for SARS-CoV-2 reveal details about the secondary structure elements of the three domains. The Mac2 consists of the elements β1–α1–β2–α2–α3–β3–β4–α4 [22], Mac3 of the elements β1–α1–β2–α2–β3–β4–α3–β5–α4–α5–β6–α6 and the DPUP of α1–β1–β2–β3–β4–α2 [23]. When comparing these to the available structures of SARS-CoV-1 (Figure 5), all three domains share a very high secondary structure identity with their SARS-CoV-1 counterparts.
Even though Mac2 and Mac3 also follow an α–β–α sandwich fold as seen in macrodomain 1, they differ in their respective secondary structure elements as well as their function [23]. In both SARS-CoV-1 and SARS-CoV-2, the Mac2 domain has been shown to bind to the middle domain of the human poly(A)-binding protein-interacting protein 1 (Paip1) [21]. Paip1 is part of the host translation machinery and stimulates translation [24]. For SARS-CoV-1, the crystal structure of Mac2 in complex with the middle domain of Paip1 (PDB ID: 6YXJ) shows that the Mac2 mainly interacts with Paip1 via its N-terminal loop. Furthermore, the SUD has been shown to increase the binding affinity between Paip1 and its binding partner human poly(A)-binding protein. The SUD has been shown in cellulo to increase only viral protein translation levels, suggesting an interaction with the aforementioned human proteins. As the sequence identity between these domains in SARS-CoV-1 and SARS-CoV-2 (Table 1) and especially in the N-terminal loop of Mac2 binding to Paip1 is relatively high, a similar function can be assumed for these domains in both viruses [21]. Another interesting feature found in the SARS-CoV-1 Mac2 and Mac3 domains is the binding to oligo(G)-containing nucleic acids capable of forming G-quadruplexes [25,26]. A replacement of the amino acids interacting with the oligo(G)/G-quadruplex as well as a deletion of the Mac3 domain has been shown to lead to an abrogation of the viral genome replication [25]. Furthermore, the cellular tumour suppressor p53, which has in other viruses already been shown to hold antiviral activity, can be degraded by the involvement of the SUD together with the PLpro domain [27].
Nsp3d – ubiquitin-like domain 2 and Papain-like Protease 2
While many coronaviruses encode nsp3 proteins with two Papain-like Protease domains, which are known as PLpro and PL2pro, nsp3 of SARS-CoV-2, SARS-CoV-1 and MERS-CoV only encodes one such protease [28]. Together with the ubiquitin-like domain 2 (Ubl2), it forms a region that is also called nsp3d. Although only the protease corresponding to the PL2pro domain is found in the Sarbecoviruses, both names, PLpro and PL2pro, are in use. Sometimes, even the whole nsp3 protein is found under the name of PLpro, adding further to the confusion. Ubiquitin-like domain 2 (Ubl2) is directly followed by PL2pro and is sometimes seen as a subdomain of the protease [29]. One of the major cleavage targets of the protease is the viral polyprotein, which includes all 16 nsps. Human cells use similar proteases connected to ubiquitin-like domains, the class of deubiquitinating enzymes, to regulate several pathways by cleaving ubiquitin from the target protein. Such ubiquitin-specific proteases can be regulated by their respective ubiquitin-like domain. However, such a regulation of PL2pro activity by its attached Ubl2 in SARS-CoV-1 and SARS-CoV-2 is unlikely, due to structural differences in the linker compared to its counterparts in human cells [3]. The exact function of Ubl2 has not yet been identified. Mutation and deletion experiments in Ubl2 led to inconsistent results; while the deletion of Ubl2 had no impact on the thermal stability of PL2pro, a mutation in the hydrophobic core of Ubl2 did. The mutation, however, may disrupt the fold. Nonetheless, Ubl2 is more conserved than Ubl1 among different coronaviruses [3], which may indicate an important function. PL2pro recognizes an LXGG↓XX motif and cleaves the polyproteins pp1a and pp1ab to release nsp1, nsp2 and nsp3 after translation of the viral genome, making it indispensable for infection and replication and thus a promising drug target.
As an LRGG motif constitutes the C-terminus of ubiquitin and the interferon-stimulated gene 15 protein (ISG15), PL2pro further exhibits activity for de-ubiquitination and removal of ISG15 [2]. The substrate binding prior to the cleaving is performed via two ubiquitin-binding sites, Ub1 and Ub2, and is regulated via the blocking loop BL2 [3]. In addition, it can also bind polyubiquitin and cleaves di-ubiquitin from those. Since both ubiquitin and ISG15 are involved in various host pathways including pathways related to immune response, PL2pro can interact with the immune response. Two of the main pathways are the interferon regulatory factor 3 and NF-κB pathways, which produce antiviral cytokines [3].
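The LXGG↓XX recognition pattern can be expressed as a regular expression to locate candidate cleavage positions in a sequence. This is a hedged sketch only: it treats X as any residue, requires two residues after the scissile bond, and ignores everything else that governs real substrate recognition. The names `LXGG`, `cleavage_sites` and `cleave` are illustrative, and the example is a made-up toy sequence, not a polyprotein.

```python
import re

# L-X-G-G followed by at least two further residues; cleavage occurs after GG.
# The lookahead allows overlapping candidate sites to be reported.
LXGG = re.compile(r"(?=(L[A-Z]GG)[A-Z]{2})")


def cleavage_sites(sequence):
    """Return 1-based positions of the residue after which cleavage would occur."""
    return [match.start() + 4 for match in LXGG.finditer(sequence)]


def cleave(sequence):
    """Split a sequence at every candidate LXGG site."""
    pieces, previous = [], 0
    for site in cleavage_sites(sequence):
        pieces.append(sequence[previous:site])
        previous = site
    pieces.append(sequence[previous:])
    return pieces


# Made-up toy sequence with two LXGG motifs, for demonstration only.
toy_sequence = "MKTLNGGAYTQQLKGGAPTKV"
print(cleavage_sites(toy_sequence))
print(cleave(toy_sequence))
```

Because the pattern demands two trailing residues, a free C-terminal LRGG is not reported as a site; it becomes one only when something is attached after the final glycine, as in a ubiquitin or ISG15 conjugate.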
Besides its de-ubiquitinating and de-ISG-ylating activity, PL2pro also interacts with the immune response in additional ways. It inhibits proteins from the interferon regulatory factor 3 (IRF3) pathway and the NF-κB signalling pathway, although due to inconsistent results the exact mechanisms are still up for debate [3]. Interestingly, SARS-CoV-1 PL2pro and SARS-CoV-2 PL2pro show different preferences for their substrates. While PL2pro of SARS-CoV-1 favours ubiquitin chains, the variant from SARS-CoV-2 prefers the interaction with ISG15, which leads to a de-ISG-ylation of interferon regulatory factor 3 and could thus lead to better evasion from immune responses [30]. Differences were also observed in the enzyme activities, as SARS-CoV-2 PL2pro is 2500–3000 times more efficient towards ISG15 as a substrate than towards the polyprotein, while SARS-CoV-1 PL2pro is only 100 times more efficient for ISG15 over the polyprotein [28]. Also, SARS-CoV-1 PL2pro demonstrates differences in substrate specificity between in vivo and in vitro experiments. The digestion of Lys48-linked polyubiquitin chains from TRAF3 and TRAF6, two regulators of IRF3 and NF-κB, is preferred over the digestion of Lys63-linked polyubiquitin in vitro. In vivo, however, the Lys48-linked chains are not removed at all [3]. A possible reason might be the interaction between viral proteins and host factors [3], as well as the rearrangement of PL2pro and other domains within nsp3.
Different constructs consisting of nsp3 regions including PL2pro also interact with the viral proteins nsp2, nsp4, nsp6, the viral RNA polymerase nsp12 as well as with ORF3a, ORF7a, and ORF9b proteins, as shown through various protein–protein interaction assays [3]. The interaction with nsp2, ORF3a, and ORF9b was shown for the region from PL2pro to the C-terminus, while the interaction with nsp4, nsp6, nsp12, and ORF7a was shown for the region from PL2pro to the betacoronavirus-specific marker domain. The nature of these interactions remains unclear [3].
Overall structure and functional features
SARS-CoV-2 PL2pro comprises four subdomains [29] and follows a ‘thumb-palm-finger’ architecture (Figure 6) [31]. The four subdomains are: the N-terminal ubiquitin-like domain (Ubl2), the thumb, the zinc-finger and the palm subdomain [29], of which the last three form the catalytic part [31]. The Ubl2 subdomain consists of one α-helix and one 3₁₀-helix as well as five β-strands. The thumb consists of six α-helices and a β-hairpin. The palm subdomain has six β-strands and contains the catalytic triad of Cys111, His272, and Asp286 at its interface to the thumb subdomain [31]. The residues Gly266–Gly271 form a mobile β-loop close to the active site, known as the blocking loop 2 (BL2), which is involved in the regulation of substrate binding [3] by recognizing the LXGG↓XX motif and changing its conformation when binding to a substrate or inhibitor [29,31].
The finger subdomain consists of six β-strands and two α-helices. Within the loops between the β-strands reside four conserved cysteine residues (Cys189, Cys192, Cys224, Cys226), forming a zinc finger and coordinating a zinc ion [29,31].
The PL2pro domain exhibits two different binding sites, S1 and S2 (Figure 6), for cleavage of ubiquitin molecules or ISG15. The two binding sites differ in substrate specificity and activity, which is important for characterizing the protein’s effect on the host immune system. The S1 Ub/Ubl-binding site can interact specifically with ubiquitin and ISG15: ubiquitin interacts with the palm and finger subdomains; its C-terminus reaches into the catalytic centre, forming an open hand conformation. ISG15 also interacts with the palm subdomain, but with the thumb subdomain instead of the fingers. The main contacts between ISG15 and PL2pro are facilitated by Trp123 and Pro130/Glu132 of ISG15 interacting with the α7-helix of the PL2pro thumb domain (Figure 6) [32]. The S2 ubiquitin-binding site preferentially cleaves polyubiquitin linked at its residue Lys48, with higher activity for longer ubiquitin chains. The binding site consists of Phe69, located within the conserved α2-helix in the thumb domain [32]. The SARS-CoV-2 PL2pro closely resembles SARS-CoV-1 PL2pro [29] with a sequence identity of 82.8% and an RMSD of 0.8 Å. Out of 54 residue differences, only six are located in the ubiquitin-interacting motif. Previous mutations in SARS-CoV-1 PL2pro at these sites showed increased activity for ISG15 at the cost of lower de-ubiquitinase activity [28]. This suggests that PL2pro from SARS-CoV-1 and from SARS-CoV-2 might differ in their specific substrate activities due to these mutations [28]. This is of particular interest when studying the interference of the SARS-CoV-2 PL2pro with the innate immune system and the host’s inflammatory signalling pathways. A full explanation of the effects would be beyond the scope of this review. However, a detailed depiction of the involved processes has been written by Shin et al. [30].
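Sequence identity and the number of differing residues are both read off a pairwise alignment. The sketch below shows one common convention, counting only columns where both sequences have a residue; the review does not state which convention or alignment tool was used, so this is an assumption. `identity_stats` is a hypothetical helper and the aligned strings are made-up toy sequences.

```python
def identity_stats(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, mismatch count) over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no aligned residue pairs")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns), len(columns) - matches


# Made-up toy alignment: one substitution and one gap.
percent, differences = identity_stats("MKTAYLGGDE", "MKSAYLGG-E")
print(round(percent, 2), differences)
```

Other conventions divide by the full alignment length or by the shorter sequence, which can shift the reported percentage slightly for the same pair of sequences.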

Figure 6. Structure of PL2pro (PDB ID: 7JRN) with the attached Ubl2 domain (purple), bound to the inhibitor GRL-0617. In addition to Ubl2, PL2pro consists of the fingers (dark blue, left), thumb (cyan, right), and palm (light blue, bottom) subdomains. The catalytic active site is highlighted in the lower inset, and the blocking loop interacting with the inhibitor is highlighted in the upper inset. The S2 binding site is located at Phe69. The location of S1 varies depending on the substrate and is therefore not labelled. The pink helix α7 interacts with ISG15 bound at S1. The figure was created using Protein Imager [40].
Available structures
Currently, 41 X-ray crystallographic structures of the SARS-CoV-2 PL2pro domain are available. Since this list is constantly growing, we refer here to the repository of the Coronavirus Structural Task Force, which is updated every week [33]. Of these 41 structures, 28 are bound to a potential inhibitor.
It is also worth noting that crystallizing a wild-type PL2pro domain without a bound ligand is challenging. Many of the available structures are therefore of the C111S mutant [29], which is easier to crystallize.
Therapeutic interest
Due to its many functions in viral replication and its interactions with the host cell and immune response pathways, nsp3 is, along with Mpro and nsp12, a major drug target for COVID-19 [3,31]. The main targets within nsp3 are the macrodomain 1 and the Papain-like Protease, which have therefore been the subjects of numerous studies on the development or discovery of suitable inhibitors and drugs [7,28]. However, targeting the Papain-like Protease comes with two major challenges, as stated by Lei et al. [3] and by Báez-Santos et al. [34]. First, the S1 and S2 binding sites bind tightly to glycines in the substrate, and the need to mimic such a substrate limits inhibitor design. Second, similar binding motifs are also used by host proteins, which makes it more difficult to design drugs specific to PL2pro. The blocking loop 2 (Figure 6), on the other hand, is better suited as a SARS-CoV-2-specific drug target because it is unique among host proteins and coronaviruses [3]; it has recently been shown to bind GRL-0617, a strong non-covalent inhibitor, in SARS-CoV-2 [28].
Nsp3e – nucleic-acid-binding domain and the betacoronavirus-specific marker domain
The nucleic-acid-binding domain (NAB) and the betacoronavirus-specific marker domain (βSM) together form nsp3e and are unique to betacoronaviruses [3]. In SARS-CoV-1, the nucleic-acid-binding domain has been shown to unwind DNA and to bind single-stranded RNA consisting of repeats of three consecutive guanines [3,35], but its exact targets or specific function remain unclear [6]. As the structures of the nucleic-acid-binding domain are similar between SARS-CoV-1 and SARS-CoV-2, the function is likely the same in both Sarbecoviruses (see below). Another group of coronaviruses sharing a similar region are gammacoronaviruses, which contain a gammacoronavirus-specific marker domain instead of a betacoronavirus-specific marker domain. Although these might share a common function, no structural information is yet available and previous structure predictions indicated that most of this region is intrinsically disordered [3].
Overall structure and functional features
Currently, two nucleic-acid-binding domain structures are available: a crystal structure (PDB ID: 7LGO) for SARS-CoV-2 and an NMR structure (PDB ID: 2K87) for SARS-CoV-1 (Figure 3(b)). They exhibit a sequence identity of 81.74% and similar structures with a Cα RMSD of 2.7 Å. The secondary structure elements of PDB entry 7LGO (Figure 3(b), Table 2) occur in the order β1–β2–α1–β3–β4–η1–β5–β6–α2, where η1 represents a 3₁₀ helix with a proline-induced kink in its middle. In the nucleic-acid-binding domain of SARS-CoV-1, RNA binding is achieved through a positively charged surface patch [3] formed by Lys74, Lys75, Lys98, and Arg105, which bind single-stranded RNA with the preferred sequence pattern mentioned above [3]. Sequence alignment reveals that these amino acids are conserved in SARS-CoV-2.
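For readers less familiar with the measure, the Cα RMSD quoted above is the root-mean-square distance between paired Cα atoms. The following minimal sketch assumes that the two structures have already been superposed and that the Cα atoms have been paired via a sequence alignment; it does not perform the superposition itself (commonly done with the Kabsch algorithm), and the coordinates are made up for illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in Å between two equal-length lists of (x, y, z) Cα coordinates.

    Assumes the structures are already superposed and the atoms are paired.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: the second set is the first shifted by 1 Å along z.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(ca_rmsd(model_a, model_b))  # 1.0
```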
Transmembrane domains, amphipathic helix and the ectodomain
One of the key features of nsp3 is that it is anchored in the membrane. Among the other non-structural proteins, only nsp4 and nsp6 share this property [36]. Two transmembrane domains, TM1 and TM2, pass through the membrane of the endoplasmic reticulum or the attached vesicles, with the nsp3 ectodomain (3Ecto) located on the lumenal side of the membrane between the two transmembrane domains [3]. Two N-glycosylation sites are located in the ectodomain of SARS-CoV-1. Both sites are also found in SARS-CoV-2, although the first site has changed from Asn-Ser-Ser to Asn-Ser-Thr. The ectodomain is sometimes referred to as the zinc-finger domain, since such a structural feature was found in this region. However, the zinc finger is not conserved among all coronaviruses, which led to the renaming of this domain [3]. For SARS-CoV-1 nsp3, three transmembrane domains were predicted in silico, but glycosylation experiments investigating their topology indicated that only the first two of these traverse the membrane [36]. The third one is an amphipathic helix (AH1) and follows the second transmembrane domain. It could interact with the membrane, but the exact mode is unknown [3]. For SARS-CoV-2, the transmembrane prediction tool TMHMM 2.0 [9] finds four potential transmembrane helices, located at the regions of TM1, TM2, AH1 and at the C-terminal region of the ectodomain. Whether the last of these spans the membrane has, however, not been examined experimentally in SARS-CoV-1. For both SARS-CoV-1 and murine hepatitis virus, it was shown that the nsp3 N-terminus and C-terminus are located in the cytoplasm, which requires an even number of transmembrane domains [36]. Otherwise, PL2pro and the corresponding cleavage site between nsp3 and nsp4 would be separated by the membrane. In the absence of the transmembrane region, however, no cleavage by PL2pro takes place [37].
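The parity argument can be illustrated with a minimal sketch: every membrane-spanning segment moves the chain to the opposite side of the membrane, so both termini can lie in the cytoplasm only if the number of spanning segments is even. The function name and the two-compartment model (cytosol versus lumen) are simplifying assumptions made here, not part of the cited work.

```python
def terminus_sides(n_membrane_spans: int, n_terminus_side: str = "cytosol"):
    """Return (N-terminus side, C-terminus side) for a chain crossing the membrane n times."""
    if n_terminus_side not in ("cytosol", "lumen"):
        raise ValueError("side must be 'cytosol' or 'lumen'")
    if n_membrane_spans < 0:
        raise ValueError("number of membrane spans cannot be negative")
    opposite = "lumen" if n_terminus_side == "cytosol" else "cytosol"
    # An even number of crossings returns the chain to its starting side.
    c_terminus_side = n_terminus_side if n_membrane_spans % 2 == 0 else opposite
    return n_terminus_side, c_terminus_side


for spans in (2, 3, 4):
    print(spans, terminus_sides(spans))
# 2 ('cytosol', 'cytosol')
# 3 ('cytosol', 'lumen')
# 4 ('cytosol', 'cytosol')
```

With the N-terminus in the cytosol, two or four spanning helices return the C-terminus to the cytosol, whereas three would place it in the lumen.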
For SARS-CoV-2, no experiments on the membrane topology have been performed so far, but it likely shares the topology of SARS-CoV-1. Further, in murine hepatitis coronavirus (MHV), also a betacoronavirus, nsp3 has been identified as a major component in the formation of molecular pores within the membranes of double membrane vesicles (DMVs); these pores consist not only of nsp3 but also of the transmembrane proteins nsp4 and nsp6 [5]. These vesicles serve as viral replication organelles, protecting the viral RNA and replication machinery from host proteins, and are formed from the membrane of the endoplasmic reticulum [36]. For translation, the viral mRNA strands are released from the double membrane vesicles into the cytosol through the molecular pores. These pores show an overall six-fold symmetry and are complexes consisting of multiple proteins, of which nsp3 contributes the largest mass [5]. The double membrane vesicles are formed by the interaction between the ectodomain of nsp3 and the lumenal regions of nsp4, which induces an increased membrane curvature [4]. So far, no high-resolution structure of the ectodomain or the transmembrane domains has been experimentally determined. However, a 30.5 Å resolution architecture of the double membrane vesicle pore complex from murine hepatitis coronavirus, obtained by cryogenic electron microscopy, provides first insights into the shape of the complex. An experiment with green fluorescent protein fused to the N-terminus of nsp3 also revealed the location of the Ubl1 domain within this complex. These insights are likely also applicable to SARS-CoV-2 [5].
Nidovirus-conserved domain of unknown function and the coronavirus-specific carboxyl-terminal domain (CoV-Y)
The last transmembrane domain TM2 is followed by the nidovirus-conserved domain of unknown function (Y1) and the coronavirus-specific carboxyl-terminal domain (CoV-Y), which consists of the subdomains Y2 and Y3. Together, they make up the C-terminus of nsp3 with a total length of 362 residues. While Y1 is conserved among the whole order of nidoviruses, CoV-Y is only conserved within coronaviruses [3]. Studies have shown an improved binding of nsp3 to nsp4 if the domains Y1 and CoV-Y are present [3]. This interaction is important for the formation of double membrane vesicles, which in turn is important for the protection against proteases from the host cell. Additionally, possible interactions between the C-terminal domains of nsp3 and several other viral nsps have been shown in various in vitro experiments (Figure 2), including yeast two-hybrid screening, protein complex immunoprecipitation, and GST pull-down assays, although these findings might not apply in vivo [3]. So far, only one structure of the Y3 domain (PDB ID: 7RQG) has been deposited, and at the time of writing no publication for this PDB entry has been made available. A high sequence identity of 90% was found in the alignment between SARS-CoV-2 Y3 and the C-terminal region of SARS-CoV-1 nsp3. A sequence alignment between both viruses for the proposed region of the Y1 and Y2 domains shows a sequence identity of 88%. The observed high sequence similarity, in comparison to other coronaviruses, suggests an important functional role [38]. No comparable structures from other viruses have been published, so no further comparison could be drawn.
Summary
With 1945 amino acids, non-structural protein 3 is the largest protein of SARS-CoV-2 and consists of up to 17 domains. An overview is given in Table 1. However, to date few domains have been studied in great detail. Exceptions are Mac1 and PL2pro, which are involved in host immune response evasion and in the viral replication process, respectively. In addition to these two, only structures of Ubl1, Ubl2, NAB and Y3 have been determined so far. While PL2pro cleaves nsp1 to nsp3 from the polyproteins pp1a and pp1ab, Mac1 removes post-translational modifications applied by the host, leading to an interference with the host’s signalling system, including its reaction. The ubiquitin-like domain 1, known as Ubl1, interacts with the nucleocapsid protein. The structurally similar Ubl2 is seen as a subdomain of PL2pro and probably supports its activity, although the details are not clear. The nucleic-acid-binding domain (NAB) was shown to bind DNA and RNA, but its exact function has not been revealed yet. From the C-terminus of nsp3, only the structure of the Y3 domain has been solved, and studies indicate it could be involved in the formation of double membrane vesicles. In addition, nsp3 contains several further domains that are either disordered or whose structures remain to be solved. Of those, functions have been suggested only for the transmembrane domains and the SUD-region, consisting of the domains Mac2, Mac3, and DPUP. While the transmembrane domains anchor nsp3 to the double membrane vesicles, the domains of the SUD-region are involved in the enhancement of viral replication by binding to oligo(G)/G-quadruplexes and human Paip1.
Various interactions between the remaining domains of nsp3 with other viral or host proteins have been observed. Some domains such as those of the SUD-region have been demonstrated to be indispensable or of high importance to SARS-CoV-1 replication and might therefore play a more important role in SARS-CoV-2 than commonly thought. A lot of effort has gone into researching the major drug target domains of nsp3 so that their structures and functions are well established. But to complete the picture of the role of nsp3 in the infection cycle, both the overall architecture and the remaining domains need to be understood as well.
Discussion and outlook
Non-structural protein 3 is known to play an essential role in the viral replication cycle, although the functions of the majority of its domains are still under discussion. Extensive research has been carried out on the domains Mac1 and PL2pro, which have been identified as promising drug targets. However, to fully understand the role of nsp3 in infection, the structures of the other domains need to be solved and the overall architecture of nsp3 resolved. Additionally, all interactions between nsp3 and other non-structural proteins of the virus, as well as all interactions with host proteins and metabolites should be investigated to get a complete picture of the effects this huge protein has on host cells. Only then can we understand all of its functions and its evolution, as well as identify all related drug targets.
A good starting point is to establish the exact role of nsp3 and the arrangement of each domain within the molecular pore complex. This could reveal new interactions between domains within nsp3, or consolidate proposed ones, and could also give an idea of the interactions between the replication/transcription complex and the pore complex. Nevertheless, solving the entire protein structure at atomic resolution is difficult, as many domains are separated by flexible linkers or disordered regions. This is exacerbated by the assembly into a large multimeric membrane-spanning pore complex as the only known biologically active form. These difficulties currently leave low-resolution cryo-EM imaging and electron tomography as the only viable experimental options, which can, however, be combined with computational methods such as structure prediction and integrative modelling. A question still waiting to be answered concerns the unusual length of nsp3. Polypeptides of immense length have a higher chance of accumulating errors during translation or folding, leading to misfolded molecules [39]. Viruses producing large proteins would therefore require a mechanism to prevent or counteract these errors, or to ensure a sufficient number of copies of the protein. Nevertheless, having all domains of nsp3 in a single polypeptide seems to be advantageous in all betacoronaviruses, since the containment of all those functional domains within one polypeptide is well preserved, although only a few mutations would be necessary for a PL2pro or Mpro cleavage site to emerge. One possible selection pressure could lie on the pore complex of the double membrane vesicles, as each pore consists of a hexameric complex, where each unit is made of nsp3, nsp4, nsp6 and an unknown number of additional components [5]. If all domains were individual non-structural proteins, their self-arrangement into such a complex could become a very rare event, if it happened at all.
Having 17 of those domains connected already, however, minimizes the number of moving parts and limits the number of possible arrangements of components within the complex. And although most domains remain functional as isolated proteins, their colocalization might allow for so far unknown functions. The nonstandard naming conventions of the individual domains make surveying current literature difficult, hence we give an overview of all synonyms in Table 1; a consistent naming scheme, as well as clear definitions of where each domain begins and where it ends within nsp3, would further increase understanding and avoid confusion in coronavirus nsp3 research. In conclusion, research focus should move away from individual nsp3 domains to looking at interactions and colocalization of multiple domains within the pore complex with the aim of answering open questions concerning the multi-layered mechanisms of coronavirus replication.
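The point raised earlier in this discussion, that long polypeptides are more likely to accumulate translation errors, can be illustrated with a back-of-the-envelope estimate. The sketch assumes independent errors with a uniform per-residue error rate; the rate used below is a placeholder chosen for illustration, not a measured value for the translation of coronavirus proteins, and the 300-residue comparison protein is hypothetical.

```python
def fraction_error_free(length: int, per_residue_error_rate: float) -> float:
    """Probability that a chain of the given length contains no mistranslated residue."""
    if length < 0 or not 0.0 <= per_residue_error_rate <= 1.0:
        raise ValueError("length must be >= 0 and the rate must lie in [0, 1]")
    return (1.0 - per_residue_error_rate) ** length


NSP3_LENGTH = 1945          # number of amino acids stated in the Summary
ASSUMED_ERROR_RATE = 5e-4   # hypothetical per-residue error rate (assumption)

for name, length in (("300-residue protein", 300), ("nsp3", NSP3_LENGTH)):
    print(name, round(fraction_error_free(length, ASSUMED_ERROR_RATE), 2))
```

Under these assumptions, the error-free fraction drops from roughly 86% for the 300-residue chain to roughly 38% for a chain the length of nsp3; the absolute numbers depend entirely on the assumed rate.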
Acknowledgements
The authors would also like to thank Rosemary Wilson and Caitlin Hatton for support and discussion. All figures are courtesy of the Coronavirus Structural Task Force (insidecorona.net) who retains the rights for text and figures. L. C. v. S. and M. E. both contributed equally in writing this review and share first authorship. The decision on the first author was made by a game of rock-paper-scissors.
Funding
This work was supported by the German Federal Ministry of Education and Research [grant number 05K19WWA] and the Deutsche Forschungsgemeinschaft [project TH2135/2-1].
References
- Yoshimoto FK. The proteins of severe acute respiratory syndrome coronavirus-2 (SARS CoV-2 or n-COV19), the cause of COVID-19. Protein J. 2020;39(3):198–216.
- Rut W, Lv Z, Zmudzinski M, et al. Activity profiling and crystal structures of inhibitor-bound SARS-CoV-2 papain-like protease: a framework for anti–COVID-19 drug design. Sci Adv. 2020;6:eabd4596.
- Lei J, Kusov Y, Hilgenfeld R. Nsp3 of coronaviruses: structures and functions of a large multi-domain protein. Antiviral Res. 2018;149:58–74.
- Hagemeijer MC, Monastyrska I, Griffith J, et al. Membrane rearrangements mediated by coronavirus nonstructural proteins 3 and 4. Virology. 2014;458–459:125–135.
- Wolff G, Zheng S, Koster AJ, et al. A molecular pore spans the double membrane of the coronavirus replication organelle. Science. 2020;369:1395–1398.
- Korn SM, Dhamotharan K, Fürtig B, et al. 1H, 13C, and 15N backbone chemical shift assignments of the nucleic acid-binding domain of SARS-CoV-2 non-structural protein 3e. Biomol NMR Assign. 2020;14:329–333.
- Schuller M, Correy GJ, Gahbauer S, et al. Fragment binding to the Nsp3 macrodomain of SARS-CoV-2 identified through crystallographic screening and computational docking. Sci Adv. 2021;7:eabf8711.
- Salvi N, Bessa LM, Guseva S, et al. 1H, 13C and 15N backbone chemical shift assignments of SARS-CoV-2 nsp3a. Biomol NMR Assign. 2021;15:173–176.
- Krogh A, Larsson B, von Heijne G, et al. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305:567–580.
- Carlson CR, Asfaha JB, Ghent CM, et al. Phosphoregulation of phase separation by the SARS-CoV-2 N protein suggests a biophysical basis for its dual functions. Mol Cell. 2020;80:1092–1103.e4.
- Serrano P, Johnson MA, Almeida MS, et al. Nuclear magnetic resonance structure of the N-terminal domain of nonstructural protein 3 from the severe acute respiratory syndrome coronavirus. J Virol. 2007;81:12049–12060.
- Chou C-C, Wang AH-J. Structural D/E-rich repeats play multiple roles especially in gene regulation through DNA/RNA mimicry. Mol BioSyst. 2015;11:2144–2151.
- Munnur D, Bartlett E, Mikolčević P, et al. Reversible ADP-ribosylation of RNA. Nucleic Acids Res. 2019;47:5658–5669.
- Alhammad YMO, Fehr AR. The viral macrodomain counters host antiviral ADP-ribosylation. Viruses. 2020;12:384.
- Rack JGM, Zorzini V, Zhu Z, et al. Viral macrodomains: a structural and evolutionary assessment of the pharmacological potential. Open Biol. 2020;10:200237.
- Alhammad YMO, Kashipathy MM, Roy A, et al. The SARS-CoV-2 conserved macrodomain is a mono-ADP-ribosylhydrolase. J Virol [Internet]. 2021 [cited 2022 Feb 23];95. Available from: https://journals.asm.org/doi/10.1128/JVI.01969-20.
- Michalska K, Kim Y, Jedrzejczak R, et al. Crystal structures of SARS-CoV-2 ADP-ribose phosphatase: from the apo form to ligand complexes. IUCrJ. 2020;7:814–824.
- Li C, Debing Y, Jankevicius G, et al. Viral macro domains reverse protein ADP-ribosylation. J Virol. 2016;90:8478–8486.
- Eckei L, Krieg S, Bütepage M, et al. The conserved macrodomains of the non-structural proteins of chikungunya virus and other pathogenic positive strand RNA viruses function as mono-ADP-ribosylhydrolases. Sci Rep. 2017;7:41746.
- Lin M-H, Chang S-C, Chiu Y-C, et al. Structural, biophysical, and biochemical elucidation of the SARS-CoV-2 nonstructural protein 3 macro domain. ACS Infect Dis. 2020;6:2970–2978.
- Lei J, Ma-Lauer Y, Han Y, et al. The SARS-unique domain (SUD) of SARS-CoV and SARS-CoV-2 interacts with human Paip1 to enhance viral RNA translation. EMBO J [Internet]. 2021 [cited 2022 Feb 23];40. Available from: https://onlinelibrary.wiley.com/doi/10.15252/embj.2019102277.
- Gallo A, Tsika AC, Fourkiotis NK, et al. 1H,13C and 15N chemical shift assignments of the SUD domains of SARS-CoV-2 non-structural protein 3c: ‘the N-terminal domain-SUD-N’. Biomol NMR Assign. 2021;15:85–89.
- Gallo A, Tsika AC, Fourkiotis NK, et al. 1H,13C and 15N chemical shift assignments of the SUD domains of SARS-CoV-2 non-structural protein 3c: ‘the SUD-M and SUD-C domains’. Biomol NMR Assign. 2021;15:165–171.
- Derry MC, Yanagiya A, Martineau Y, et al. Regulation of poly(A)-binding protein through PABP-interacting proteins. Cold Spring Harbor Symp Quant Biol. 2006;71:537–543.
- Kusov Y, Tan J, Alvarez E, et al. A G-quadruplex-binding macrodomain within the ‘SARS-unique domain’ is essential for the activity of the SARS-coronavirus replication–transcription complex. Virology. 2015;484:313–322.
- Tan J, Vonrhein C, Smart OS, et al. The SARS-unique domain (SUD) of SARS coronavirus contains two macrodomains that bind G-quadruplexes. PLoS Pathog. 2009;5:e1000428.
- Ma-Lauer Y, Carbajo-Lozoya J, Hein MY, et al. p53 down-regulates SARS coronavirus replication and is targeted by the SARS-unique domain and PLpro via E3 ubiquitin ligase RCHY1. Proc Natl Acad Sci USA. 2016;113:E5192–E5201.
- Freitas BT, Durie IA, Murray J, et al. Characterization and noncovalent inhibition of the deubiquitinase and deISGylase activity of SARS-CoV-2 papain-like protease. ACS Infect Dis. 2020;6:2099–2109.
- Gao X, Qin B, Chen P, et al. Crystal structure of SARS-CoV-2 papain-like protease. Acta Pharm Sin B. 2020;11:237–245.
- Shin D, Mukherjee R, Grewe D, et al. Papain-like protease regulates SARS-CoV-2 viral spread and innate immunity. Nature. 2020;587:657–662.
- Osipiuk J, Azizi S-A, Dvorkin S, et al. Structure of papain-like protease from SARS-CoV-2 and its complexes with non-covalent inhibitors. Nat Commun. 2021;12:743.
- Klemm T, Ebert G, Calleja DJ, et al. Mechanism and inhibition of the papain-like protease, PLpro, of SARS-CoV-2. EMBO J [Internet]. 2020 [cited 2020 Dec 9];39. Available from: https://onlinelibrary.wiley.com/doi/10.15252/embj.2020106275.
- Croll TI, Diederichs K, Fischer F, et al. Making the invisible enemy visible. Nat Struct Mol Biol. 2021;28:404–408.
- Báez-Santos YM, John SES, Mesecar AD. The SARS-coronavirus papain-like protease: structure, function and inhibition by designed antiviral compounds. Antiviral Res. 2015;115:21–38.
- Neuman BW, Joseph JS, Saikatendu KS, et al. Proteomics analysis unravels the functional repertoire of coronavirus nonstructural protein 3. J Virol. 2008;82:5279–5294.
- Oostra M, Hagemeijer MC, van Gent M, et al. Topology and membrane anchoring of the coronavirus replication complex: not all hydrophobic domains of nsp3 and nsp6 are membrane spanning. J Virol. 2008;82:12392–12405.
- Harcourt BH, Jukneliene D, Kanjanahaluethai A, et al. Identification of severe acute respiratory syndrome coronavirus replicase products and characterization of papain-like protease activity. J Virol. 2004;78:13600–13612.
- Neuman BW. Bioinformatics and functional analyses of coronavirus nonstructural proteins involved in the formation of replicative organelles. Antiviral Res. 2016;135:97–107.
- Allan Drummond D, Wilke CO. The evolutionary consequences of erroneous protein synthesis. Nat Rev Genet. 2009;10:715–724.
- Tomasello G, Armenia I, Molla G. The protein imager: a full-featured online molecular viewer interface with server-side HQ-rendering capabilities. Bioinformatics. 2020;36:2909–2911.