Oliver Kippes, Andrea Thorn & Gianluca Santoni
This blog post was published in Crystallography Reviews.
Please cite: https://doi.org/10.1080/0889311X.2022.2072835
Abstract
The main focus of drug development against COVID-19 is on the spike protein and proteases. However, such drugs can be problematic because of mutations (in the case of the spike protein) or because they may harm cellular homologs (in the case of the proteases). Here, we review a viral protein that, due to its conserved and multifunctional nature, may be an alternative drug target: the SARS-CoV-2 nucleocapsid. This protein consists of two ordered and three disordered domains, all of which exhibit RNA binding activity and are important for ribonucleoprotein complex assembly. This complex protects the viral RNA and is important for viral replication. Nucleocapsid might also be connected to modulation of the host cell cycle, replication, translation, viral assembly, and other parts of the infection cycle. The two ordered domains, the RNA-binding domain and the dimerization domain, mediate packaging of the RNA into the ribonucleoprotein complex and bind it to membrane proteins. The actual organization of this complex has not been conclusively verified yet, but the large SARS-CoV-2 RNA genome is packed efficiently while remaining very flexible. A better understanding of this protein could lead to an efficient therapeutic measure against the virus and would improve our understanding of COVID-19.
Role and function
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and the COVID-19 pandemic it triggered are the focus of international research. New viral mutations remain a constant threat in the race against the virus and necessitate adaptation to face the resulting variants efficiently. Other, highly conserved viral proteins are hence a good opportunity to gain an advantage against the mutations, as they offer a therapeutic or vaccine target [1].
The viral RNA of SARS-CoV-2 encodes many proteins: accessory and non-structural proteins, which facilitate the viral infection cycle after infection, and four types of structural proteins. Structural proteins are present in the virion to initiate infection and protect the viral RNA: these are the spike protein, envelope protein, membrane protein and nucleocapsid. Nucleocapsid is encoded, along with the other structural proteins, by the last third of the viral genome (5′ to 3′) [2]. Nucleocapsid is 419 amino acid residues long and one of the most conserved proteins of all coronaviruses, with 91% homology to SARS-CoV. In this review, we present the latest structural information and discuss opportunities for exploiting the nucleocapsid for developing novel therapeutics.
The primary role of the nucleocapsid is to protect the viral RNA by packaging it in a ribonucleoprotein complex inside the virion. In this complex, the nucleocapsid proteins need to be oligomerised, which increases the protein activity [3] and thus allows the formation of the complex with the single-stranded viral RNA [4]. Dimerized nucleocapsid has been shown to bind RNA with higher affinity, suggesting a structural relevance of the dimerization domain in latching onto the RNA and forming a stable complex [5]. The protection of the viral RNA makes nucleocapsid an essential protein for the viral infection cycle and viral assembly [6]. The interactions between the four types of structural proteins, in particular between nucleocapsid and membrane protein [7], are also very important for viral assembly [8]. It has been suggested that nucleocapsid may also play a role in virus budding. It has been shown not to be strictly required for the budding process [9], but virus-like particle experiments showed that the presence of nucleocapsid proteins can increase the virus-like particle yield, which suggests that the protein at least participates in the budding process [10]. Furthermore, overexpression of nucleocapsid proteins enhanced the replication of SARS-CoV-2. This phenomenon can be explained by the nucleocapsid's ability to antagonize the type I interferon-mediated antiviral pathway, meaning that the nucleocapsid protein can suppress the immune signalling of cells [11]. As an antagonist, the nucleocapsid inhibits phosphorylation of STAT1 and STAT2, which suppresses their translocation to the nucleus of the host cell. This means that inhibition of nucleocapsid function would not only hinder the viral replication cycle but may also permit a better immune response of the host cell, making it an excellent drug target.
Since the nucleocapsid protein is required for optimal replication of coronaviruses [6,12], it is also possible that the protein is involved in RNA synthesis. This is implied by the importance of nucleocapsid translation for stimulating the infectivity of genomic RNA [13] and by the observation that SARS-CoV nucleocapsids colocalise intracellularly with replicase components at an early stage of infection [14]. Experiments with mouse hepatitis virus nucleocapsids also showed that the interaction between non-structural protein 3 (nsp3) and the nucleocapsid protein is important for forming a complex on the 3′ end of the mouse coronavirus genome that initiates RNA synthesis, thereby promoting infectivity of genomic RNA [15,16]. In addition to its relevance for viral replication, nucleocapsid also acts in host cell regulation: it modulates the host cell cycle by regulating cyclin-dependent kinase activity, which halts S phase progression, i.e. the DNA replication phase [17]. This blockage of the synthesis phase of cell division is postulated to give the virus enough time to use the cellular raw materials for replication of its genome, viral assembly and budding [18]. Studies on the transmissible gastroenteritis coronavirus and SARS-CoV measured RNA chaperone activity, and in silico studies of other coronaviruses showed similar patterns; this strongly suggests that SARS-CoV-2 nucleocapsids also have RNA chaperone activity. As an RNA chaperone, the nucleocapsid would help the virus to enhance ribozyme cleavage, enable rapid and accurate RNA annealing, and facilitate strand transfer and exchange [19]. Furthermore, the nucleocapsid protein also affects the stress responses of the host cell, as elongation factor 1α has been shown to interact with the dimerization domain of SARS-CoV nucleocapsid proteins, thereby suppressing translation. The binding of elongation factor 1α also inhibits cytokinesis [20], i.e. the separation of the eukaryotic cell into two daughter cells. While many of these mechanisms have only been demonstrated in SARS-CoV, the similarity between both nucleocapsids makes it likely that most, if not all, are also relevant for SARS-CoV-2.
Structural overview
The SARS-CoV-2 nucleocapsid consists of five different domains: three intrinsically disordered domains and two ordered domains (Figure 1). Both ordered domains, described below, show RNA binding activity, which promotes ribonucleoprotein packaging [21]. Intrinsically disordered regions (IDRs) are challenging for conventional structural characterization and are mostly studied by molecular simulations [22]. As they affect the conformation of the nucleocapsid in the ribonucleoprotein complex, the precise details of the binding to RNA are still poorly understood. It is still unclear, for example, how this complex is organized in the virus. There are two proposed arrangements of the complex, both of which are described in this review. Two of the intrinsically disordered regions are located at the N- and C-terminus of the protein, with a third located in the middle, linking the two ordered domains. The two ordered domains are called the RNA-binding domain (N-terminal) and the dimerization domain (C-terminal); the structures of both have been determined by X-ray diffraction and solution NMR [23,24]. All the available structures are summarized in Table 1.
Ordered domains
The two ordered domains, the RNA-binding and the dimerization domain, together make up 257 of the 419 residues. Although one domain is called the RNA-binding domain, the binding epitope is distributed over all five nucleocapsid domains (see Figure 1), including the disordered ones [6].
The RNA-binding domain
During the viral infection cycle, the RNA-binding domain captures the viral RNA and mediates the packaging of the RNA into the ribonucleoprotein complex [25]. The RNA-binding domain is rich in aromatic and basic residues arranged in a right-hand-like shape with a protruding basic finger, a basic palm, and an acidic wrist (see Figure 2(A)). The SARS-CoV-2 nucleocapsid structures show similar loops that surround the β-sheet core in a sandwiched structure. The β-sheet core consists of four antiparallel β-strands, a short 3₁₀-helix in front of the β2-strand and a protruding β-hairpin that is located between the β2- and β5-strands (see Figure 2(B)). The structural basis for RNA binding by nucleocapsid is not yet known, but comparisons with the less pathogenic human coronavirus OC43 suggest that SARS-CoV and SARS-CoV-2 have a unique potential RNA-binding pocket beside the β-sheet core [26,27]. Another possible position for the RNA-binding site is between the basic finger and the palm region. Analysis of the electrostatic potential reveals a highly positively charged cleft, which could bind RNA. NMR-based titration with single-stranded RNA revealed that the RNA interacted with the residues L56, G60, K61, K65, F66, A90, R93, I94, R95, K102, D103, L104, T165, T166, G175, and R177, forming a U-shaped binding epitope. The structures determined in this experiment are deposited in the PDB under the IDs 6YI3, 7ACT and 7ACS [28]. In addition, simulations suggested that residues T57, H59, S105A, R107A, G170, F171 and Y172 are interesting targets for drug research due to their simulated connection with RNA binding of SARS-CoV-2 [25].
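The residue list from the NMR titration can be inspected programmatically, for example to see how many of the reported contacts are basic residues, in line with the positively charged cleft described above. The following is only a minimal sketch: it assumes the one-letter-code-plus-position notation used in the text, and `parse_residues` is a hypothetical helper introduced here for illustration, not part of any published analysis.

```python
import re
from collections import Counter

# Residues reported to interact with single-stranded RNA in the NMR titration [28],
# copied from the text above.
NMR_EPITOPE = ("L56, G60, K61, K65, F66, A90, R93, I94, R95, K102, "
               "D103, L104, T165, T166, G175, R177")


def parse_residues(text):
    """Hypothetical helper: turn 'L56, G60' into [('L', 56), ('G', 60)]."""
    return [(m.group(1), int(m.group(2)))
            for m in re.finditer(r"\b([A-Z])(\d+)", text)]


epitope = parse_residues(NMR_EPITOPE)

# Amino acid composition of the epitope (one-letter codes).
composition = Counter(aa for aa, _ in epitope)

# Basic residues (lysine and arginine) carry the positive charge
# that can contact the RNA phosphate backbone.
basic = [f"{aa}{pos}" for aa, pos in epitope if aa in "KR"]

print(f"{len(basic)} of {len(epitope)} epitope residues are basic: {basic}")
```

Run on the list above, this reports that 6 of the 16 listed residues are lysine or arginine. The same helper could be applied to the simulation-derived list, although entries such as S105A would be read as S105 only.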

Figure 1. Schematic overview of the SARS-CoV-2 Nucleocapsid domains with structural examples below. The protein consists of two ordered domains, the RNA-binding domain and the Dimerization domain, connected with a disordered linker. At the ends of the protein are the intrinsically disordered N-terminal arm and the C-terminal tail. Creator: Coronavirus Structural Task Force - Protein Imager, Oliver Kippes License: cc-by-sa
The dimerization domain
The nucleocapsid dimerization domain anchors nucleocapsid to the viral membrane inside the virion, and the RNA binding affinity of the dimerization domain allows a physical linkage between the viral membrane and the RNA [25]. The dimerization domain consists of three 3₁₀-helices, five α-helices and two antiparallel β-strands, which create a β-hairpin. This β-hairpin forms a C-shape together with other parts of the dimerization domain. Two domains from two nucleocapsid molecules form a tight homodimer with a rectangular slab shape, with the β-hairpins from each nucleocapsid protein on one side and the helices on the opposite side. The dimer is stabilized through hydrogen bonds and hydrophobic interactions. The dimerization domain is only stable when several nucleocapsid molecules form a dimer or oligomer, and the domain mediates self-association into tetrameric, hexameric and higher oligomeric forms [29]. It can be assumed that this dimerization is a driving force for the viral ribonucleoprotein assembly [30]. An electrophoretic mobility shift assay showed that the amount of free 17-mer single-stranded SARS-CoV-2 RNA oligonucleotides decreased when they were combined with the recombinantly produced dimerization domain of the SARS-CoV-2 nucleocapsid protein [26,31]. This shows that the dimerization domain also has RNA binding activity.
As of the writing of this review, the PDB has 26 SARS-CoV-2 nucleocapsid structures and 6 SARS-CoV structures, with the majority of the structures showing the RNA-binding and the dimerization domains. There are also some special structures that show the RNA-binding domain in complex with double-stranded RNA and structures that show short segments from the protein. Most of the structures are based on X-ray diffraction data and a few on NMR data; all of them are listed and commented on in Table 1.

Figure 2. Structures of the RNA-binding domain (A: PDB 7CDZ, B: 6VYO). A: Electrostatic surface of the RNA-binding domain; red shows the negative charge potential and blue the positive charge potential. The protein consists of an acidic wrist, the basic finger and the basic palm. The area between the basic finger and the basic palm is marked as a possible RNA binding site. B: The core of the protein consists of four antiparallel β-strands (β1-β2-β5-β6) together with a short 3₁₀-helix in front of the β2-strand. The core is surrounded by loops that enclose it. The basic finger is formed by a β-hairpin (β3-β4). Creator: Coronavirus Structural Task Force - Protein Imager, Oliver Kippes License: cc-by-sa
Table 1. PDB entries for nucleocapsid. For each entry we list the ID, the technique, a comment on its importance, an evaluation of the most intense Fourier difference peaks as calculated with Coot, and a statement on the model quality.
ID | Method | Resolution (Å) | Description | Comment on the highest Fourier difference peaks from Coot |
SARS-CoV-2 | | | | |
6m3m | X-ray diffraction | 2.7 | RNA-binding domain. It consists of four identical monomers. The structure displays a great overall similarity to other nucleocapsid protein RNA-binding domains but exhibits a unique potential RNA-binding pocket alongside the β-sheet core. | Near C-terminal Ala 174, disordered conformer for this residue. |
6vyo | X-ray diffraction | 1.7 | RNA-binding domain. The structure consists of four identical monomers and exhibits interactions with the ligands Cl⁻, Zn²⁺, glycerol, and 2-(N-morpholino)ethanesulfonic acid. The N-terminal domain provides structural features for RNA binding. | Probable unmodeled buffer molecule near Thr 54. Glycerol binding seems questionable. |
6wji | X-ray diffraction | 2.05 | Dimerization domain. The structure consists of six identical monomers forming three homodimers and exhibits interactions with the ligand Cl⁻. The C-terminus provides structural features for oligomerisation. | Highest FoFc peaks all seem simply related to poorly resolved sidechains of surface residues. |
6wkp | X-ray diffraction | 2.67 | RNA-binding domain in a monoclinic crystal form. The structure consists of four identical monomers and exhibits interactions with the ligands Zn²⁺ and 2-(N-morpholino)ethanesulfonic acid. The N-terminal domain provides structural features for RNA binding. | 3 peaks; the highest is a 5.2σ peak near Glu118, possibly due to a missing water molecule. |
6wzo | X-ray diffraction | 1.42 | Dimerization domain in a triclinic (P1) crystal form. In solution, this structure was found to form a homodimer, but it is also proposed to be involved in tetramer formation. Structures 6wzo and 6wzq are each a tetramer of two homodimers, and 6wzq exhibits interactions with the ligand SO₄²⁻. | 52 peaks in the map. Highest at 6.97σ is an alternative conformer for Thr334 in chain A. |
6wzq | X-ray diffraction | 1.45 | See 6wzo. | 29 peaks. Many are clustered around residues 280-284, suggesting incorrect modelling of this loop in chains B, C, and D. |
6yi3 | Solution NMR | | Monomeric RNA-binding domain. The N-terminal domain provides structural features for RNA binding. This model was selected because it has the fewest restraint violations. | |
6yun | X-ray diffraction | 1.44 | Dimerization domain in an orthorhombic (P2₁2₁2₁) crystal form. The structure is a homodimer and provides structural features for oligomerisation. | 45 peaks above 5σ. A 12.4σ peak shows a wrongly positioned sidechain for Arg31 in chain A. Alternate conformers are also missing from multiple residues (e.g. Arg11 or Ser14). |
6zco | X-ray diffraction | 1.36 | Dimerization domain in an orthorhombic (I4₁) crystal form. The structure is a monomer and provides structural features for oligomerisation. | 7 peaks, unmodeled water molecules. |
7acs | Solution NMR | | N-terminal RNA-binding domain in complex with 7mer dsRNA. | |
7act | Solution NMR | | N-terminal RNA-binding domain. It is in complex with single-stranded RNA 5'-UCUCUAAACG-3'. It demonstrates the binding capability of the charged binding groove. | |
7c22 | X-ray diffraction | 2.0 | Dimerization domain in a triclinic (P1) crystal form. The structure is a tetramer of two identical homodimers and exhibits interactions with the ligands diethylene glycol and acetate ion. | 5 peaks; the strongest (5.3σ) could suggest an alternative conformation of Met 317/D. |
7cdz | X-ray diffraction | 1.8 | RNA-binding domain. Part of a discussion about possible ways to form the ribonucleoprotein complex. | Badly modelled regions highlighted by 35 peaks. Chain A: loop 98-104 and 10 residues at the C-terminus; chain B: 98-104. Overall modelling seems questionable. |
7ce0 | X-ray diffraction | 1.39 | Dimerization domain in dimerized form. The group published this structure together with an N-terminal domain structure to discuss possible ways in which the ribonucleoprotein complex could form. | 75 peaks, highest at 13σ; overall bad fit between map and model. |
7de1 | X-ray diffraction | 2.0 | Dimerization domain in dimerized form. The structure was used to show that the dimerization domain has a role in viral RNA binding and transcriptional regulatory sequences. | 17 peaks, no major faults with this structure. |
Structures containing peptides from SARS-CoV-2 | | | | |
7kgo | X-ray diffraction | 2.15 | Human leukocyte antigen HLA-A*0201 in complex with the nucleocapsid peptides 351-359, 316-324, 222-230, 159-167, 128-146, 226-234 and human leukocyte antigen HLA-B*0702 in complex with the nucleocapsid epitope SPRWYFYYL. HLA-A*0201 and HLA-B*0702 each consist of an MHC class I antigen and a beta-2-microglobulin. The peptides are derived from the nucleocapsid and bound between the alpha-helices. The structures were solved to determine if and how nucleocapsid-derived SARS-CoV-2 peptides trigger CD8+ T-cell immune responses. | |
7kgp | X-ray diffraction | 1.396 | See 7kgo. | |
7kgq | X-ray diffraction | 1.34 | See 7kgo. | |
7kgr | X-ray diffraction | 1.55 | See 7kgo. | |
7kgs | X-ray diffraction | 1.58 | See 7kgo. | |
7kgt | X-ray diffraction | 1.9 | See 7kgo. | |
7lg0 | X-ray diffraction | 2.296 | See 7kgo. | |
7ltu | X-ray diffraction | 1.12 | 6-residue segments from nucleocapsid. Shows the three 6-residue segments that are connected to liquid-liquid phase separation and drive amyloid fibril formation. 7ltu and 7lux have the sequence AALALL, 7luz has the sequence GQTVTK and 7lv2 has the sequence GSQASS. | Peptide structures with negligible, if any, difference peaks. |
7lux | X-ray diffraction | 1.3 | See 7ltu. | See 7ltu. |
7luz | X-ray diffraction | 1.1 | See 7ltu. | See 7ltu. |
7lv2 | X-ray diffraction | 1.3 | See 7ltu. | See 7ltu. |
SARS-CoV | | | | |
1ssk | Solution NMR | | RNA-binding domain. It is a monomer with a length of 158 amino acids and exhibits a five-stranded β-sheet. This structure binds single-stranded RNA to enable the packaging of the viral genome RNA into a helical ribonucleocapsid (RNP). | |
2cjr | X-ray diffraction | 2.5 | This crystal structure shows the dimerization domain. Interactions within this structure stabilise the oligomerisation. Crystal packing of the octamer forms the helical structure of the nucleocapsid. The structure also has RNA-binding activity. | 3 peaks present. A 6.4σ peak shows a wrong conformation for Ile331/E. |
2gib | X-ray diffraction | 1.75 | This crystal structure shows the dimerization domain. The structure is a dimer of two identical monomers and exhibits interactions with the ligand SO₄²⁻. Strong interactions between both subunits suggest that the dimeric form is a functional unit. | High negative peaks on Glu324 and 368 suggest possible radiation damage. |
2jw8 | Solution NMR | | Dimerization domain. The structure is a homodimer and was modelled by the stereo-array isotope labelling (SAIL) method to determine a high-quality solution structure. This model was selected because it has the fewest restraint violations. | |
2ofz | X-ray diffraction | 1.17 | These crystal structures (2ofz and 2og3) show the RNA-binding domain. The domain provides structural features for RNA binding; the study tried to identify important residues of the protein to understand the RNA binding mechanism, using comparisons with the homologous avian infectious bronchitis virus. 2ofz shows a monomer in a monoclinic crystal form and exhibits interactions with the ligand 1,2-ethanediol. | 56 peaks, clear indication of radiation damage on Glu. |
2og3 | X-ray diffraction | 1.85 | See 2ofz. | 18 peaks. At the C-terminus, a peak of 8.9σ indicates a missing residue. |
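Table 1 lends itself to simple programmatic queries. The sketch below transcribes the ID, domain and resolution of the SARS-CoV-2 X-ray domain entries from the table and looks up the finest-resolution entry per domain. The data structure and the function name `best_by_domain` are illustrative assumptions, not part of the original analysis.

```python
# (PDB ID, domain, resolution in Å), transcribed from Table 1.
# Only X-ray structures of the two ordered SARS-CoV-2 domains are included.
ENTRIES = [
    ("6m3m", "RNA-binding", 2.7),
    ("6vyo", "RNA-binding", 1.7),
    ("6wji", "dimerization", 2.05),
    ("6wkp", "RNA-binding", 2.67),
    ("6wzo", "dimerization", 1.42),
    ("6wzq", "dimerization", 1.45),
    ("6yun", "dimerization", 1.44),
    ("6zco", "dimerization", 1.36),
    ("7c22", "dimerization", 2.0),
    ("7cdz", "RNA-binding", 1.8),
    ("7ce0", "dimerization", 1.39),
    ("7de1", "dimerization", 2.0),
]


def best_by_domain(entries):
    """Return {domain: (pdb_id, resolution)} for the finest resolution per domain.

    A smaller value in Å means finer detail, so the minimum is selected.
    """
    best = {}
    for pdb_id, domain, resolution in entries:
        if domain not in best or resolution < best[domain][1]:
            best[domain] = (pdb_id, resolution)
    return best


for domain, (pdb_id, resolution) in best_by_domain(ENTRIES).items():
    print(f"{domain}: {pdb_id} at {resolution} Å")
```

Resolution alone is not a measure of model quality: the difference-peak comments in Table 1 show that some high-resolution entries (for example 7ce0) still fit their maps poorly, so a selection like this is best combined with the validation column.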
Disordered domains
Intrinsically disordered domains generally feature a larger number of polar and charged amino acids compared to ordered domains. The electrostatic repulsion together with the lack of a stabilizing hydrophobic core prevents them from assuming a well-defined structure.
Despite the disordered nature of the N- and C-terminal regions, it is possible that they form transient helices made up of various residues [22]. The helices from the N-terminal domain and the flexible linker flank the RNA-binding domain and orient arginine residues in the same direction, which drives RNA binding. Transient helices from the C-terminal domain showed positive charges, which are critical for protein–RNA interactions. They also mediate membrane protein binding in other coronaviruses, so it is likely that they have a similar function in SARS-CoV-2. The N-terminal conformation is affected significantly by the neighbouring ordered RNA-binding domain through electrostatic repulsion of the positively charged N-terminal domain from its positive surface and an attraction of the slightly negatively charged parts. This could lead to engagement of RNA [22]. The linker domain is composed of residues 174–246. It incorporates polar regions which are repelled by the neighbouring ordered domains. A positively charged serine- and arginine-rich motif is likely to function as a phosphorylation site for a direct interaction with RNA, viral membrane proteins, and nsp3 [22]. Simulations suggest that the linker either contributes to oligomerisation or acts as a recognition motif for the binding of other proteins. Intrinsically disordered regions could be involved in a number of regulatory functions, including modulation of transcription, translation, post-translational modifications such as phosphorylation, and cell signalling, by becoming ordered upon contact with another protein domain [6].
The intrinsically disordered regions of SARS-CoV-2 nucleocapsid are thought to be responsible for liquid–liquid phase separation, which is an important process in eukaryotic cells. It involves the formation of a macromolecule-rich, fluid compartment that is separate from the cytosol without a membrane layer. One can think of these regions as droplets or granules within the cytosol. This also comes with certain dangers: liquid–liquid phase separation concentrates proteins, which at high concentration tend to trigger aggregation or jamming and thereby prevent chemical reactions [32,33]. The flexible linker of nucleocapsid shows residue similarities with other proteins that drive liquid–liquid phase separation [21]. During this process nucleocapsid forms amyloid-like fibrils, which may encapsulate RNA during viral replication. Further research with fibril inhibitors in SARS-CoV-2-infected cells could give more insight into the actual functions of these fibrils. Given the neurological complications that occur with COVID-19 and the connection between amyloid fibrils and illnesses such as dementia and Parkinson's disease, inhibiting this fibril formation could be relevant for a better understanding of both COVID-19 and long COVID, as well as for therapeutic strategies [34].
Ribonucleoprotein complex
In order to protect the viral RNA, nucleocapsid must interact with the nucleic acid, which is preferentially mediated by GGG motifs from the leader RNA sequences [35], and the nucleocapsids need to oligomerise. Based on the SARS-CoV nucleocapsid, the protein interacts with the RNA at multiple sites through the negatively charged phosphate backbone of RNA and the positively charged groove formed by residues 248 through 280 of nucleocapsid (dimerization domain). However, the exact mechanism is unknown. It is also suspected that the nucleocapsid helps in RNA folding [6,36]. Inspection of tomogram slices showed a ‘G-shaped’ architecture of ribonucleoprotein complexes, 15 nm in diameter and 16 nm in height. Further 2D classification of the ribonucleoprotein complex revealed three classes of complexes: hexagonally packed, triangularly packed, and closely packed against the envelope. 3D refinement then yielded two assembly models whose occurrence differs with the virion shape. Spherical virions seem to have a higher number of membrane-proximal ribonucleoprotein complex assemblies (‘eggs-in-a-nest’) (see Figure 3(A)), whereas ellipsoidal virions have a higher number of membrane-free ribonucleoprotein complex assemblies (‘pyramid’). It was also observed that pyramid-shaped complexes can assemble into ‘eggs-in-a-nest’-shaped complexes. The native ribonucleoprotein complexes are highly heterogeneous and densely packed, as well as being locally ordered in the virus. These may interact with the RNA in a ‘beads on a string’ stoichiometry [37]. Another possible organization of the ribonucleoprotein complex is the helical shape as in SARS-CoV (see Figure 3(B)). In the helical arrangement the RNA is surrounded by nucleocapsids in an octamer formation, and the complex shows a positively charged surface that could bind RNA via electrostatic interactions. The complex forms a twin helix, with the octamers formed by two tetramers that are wound around each other. The RNA-binding domain alternates between one protomer on the inner side of the helical core and one protomer on the outer side of the helix [35,36]. The exact organization of the ribonucleoprotein complex in SARS-CoV-2 is not known, but it seems likely that it differs from that in SARS-CoV. However, the reasons for this change in organization are not yet known.
Nucleocapsid as therapeutic target
The multifunctional and conserved nature of the nucleocapsid makes it an interesting drug target, but research is hindered by its disordered nature and the resulting lack of a complete atomic structure. Inhibition of its functions could disturb the viral infection cycle and improve the host immune response. A promising strategy for this is inhibition of the structurally well-established RNA-binding domain and, through this, prevention of ribonucleoprotein complex formation. Inhibition of the RNA-binding domain prevents viral replication [38]. The analysis of nucleocapsid protein from human coronavirus OC43 compared with SARS-CoV-2 suggests distinct ribonucleotide binding patterns between the protein molecules, and through this a potential inhibition pocket in SARS-CoV-2 could be identified [26]. This pocket lies alongside the β-sheet core (Figure 4). Drug candidate PJ34 is a potent inhibitor of RNA-binding activity in human coronavirus OC43. Like AMP, it binds to residues 48N, 49N, 50T, 51A, 110Y, 112Y and fits into the ribonucleotide-binding pocket of the RNA-binding domain (residues R88, T91, R93, R107, Y109, Y111, R149) [38]. A comparison between the HCoV-OC43 and SARS-CoV-2 nucleocapsid structures shows that the key residues S41, F53, Y109, Y111, and R149 (SARS-CoV-2 numbering) are conserved, which means that human coronavirus OC43 and SARS-CoV-2 have similar binding pockets [27]. A second strategy would be to prevent oligomerisation or to induce abnormal aggregation. Experiments with MERS-CoV showed that the inhibitor 5-benzyloxygramine (P3) stabilizes protein–protein interactions between MERS-CoV nucleocapsids, leading to an abnormal full-length oligomerization [39]. P3 showed antiviral activity against MERS-CoV by stabilizing a non-native dimeric configuration. P3 targets the interface of the dimeric N-terminal domain and binds two hydrophobic pockets in two N-terminal domains. The important residues for the hydrophobic pockets in MERS-CoV are W43 and F135 on monomer 1 and G104, T105, G106 and A109 on monomer 2; these residues are highly conserved [39]. A comparison between SARS-CoV-2 and MERS-CoV also showed that the responsible residues are conserved, except for F135 in MERS-CoV, which is replaced by I146 in SARS-CoV-2 [39].

Figure 3. Two proposed organizations of the ribonucleoprotein complex in the virion. A: Shows the ‘eggs-in-a-nest’ organization that is also called ‘beads on a string’, the nucleocapsid proteins bind to the RNA and oligomerise like beads on a string. B: shows the helical organization which was observed in SARS-CoV, the nucleocapsids organize in a decamer around the RNA. The RNP number indicates the number of nucleocapsid proteins in a single bead or helix turn respectively. Creator: Coronavirus Structural Task Force - Protein Imager, Oliver Kippes License: cc-by-sa
Coronaviruses also show a functional link between the nucleocapsid and the non-structural protein 3. An interaction between these two proteins seems to be essential for the virus to enhance infectivity [15] and virus replication [40]. The gRNA of coronaviruses is only minimally infectious upon transfection into host cells. Infectivity increases upon co-transfection of mRNA that is translated into nucleocapsid [15]. Experiments showed that gRNA infectivity cannot be initiated if the ubiquitin-like domain of non-structural protein 3 is mutated. NMR titration showed that the complex of nucleocapsid and non-structural protein 3 involves residues from the ubiquitin-like domain 1 and the serine-arginine-rich region of the nucleocapsid [40] (SARS-CoV-2 residues: 176–206). The interaction may also be important for the transcription of the virus. The nucleocapsid of the mouse hepatitis virus can bind to the transcriptional regulatory sequence RNA, which prevents the formation of the complex of nucleocapsid and non-structural protein 3. This competition between the non-structural protein 3 and the transcriptional regulatory sequence is an indicator of a switch between viral transcription and replication. It was shown that phosphorylation of the serine-rich region of the mouse hepatitis virus nucleocapsid decreases the binding affinity to the ubiquitin-like domain 1 [40]. Similar mechanisms could be present in SARS-CoV-2 but have not been investigated so far. An in silico study tried to predict the residues of SARS-CoV-2 that could be responsible for such interactions in the virus. The residues S188, S190, R191, N192, R195, S197, T198, P199, G200, S201, K237, G238, Q239, Q241, G243, Q244, T245, V246, T247, K248, F314, P309, S310, A311, S312, and A313 from the non-structural protein 3 and the residues S183, S184, R185, S186, S187, S188, R189, S190, R191, S193, S194, R195, and N196 from the SARS-CoV-2 nucleocapsid are likely to be involved in this proposed interaction [41]. Further studies are urgently needed. Inhibition of this interaction could be another therapeutic strategy.

Figure 4. Structure of the dimerization domain in monomer form A (PDB:7C22) and dimerized form B (PDB: 2GIB). A: The dimerization domain shows the order η1-α1-α2-η2-α3-α4-β1-β2-α5-η3. The two antiparallel β-strands form a C shape β hairpin. In the dimer form, the beta sheet of one monomer docks in between the equivalent sheet and the alpha helices of the other monomer. B: Dimeric form. The green copy is shown in the same orientation as in A. Creator: Coronavirus Structural Task Force - Protein Imager, Oliver Kippes License: cc-by-sa
Discussion and conclusion
Nucleocapsid is vital to SARS-CoV-2 infection, but due to its relatively large size (ca. 114 kDa in dimer formation [42]) as well as the presence of disordered regions, it is hard to study and much is yet unknown. Indeed, nucleocapsid is a reminder of the so-called ‘dark proteome’. Since a large portion of the protein world is disordered, the biochemistry of these molecules remains poorly studied and understood. Still, nucleocapsid may be an excellent drug target due to its essential role in the viral infection cycle, its repression of the immune system and its low rate of mutation, which is a further indication of its fundamental role in SARS-CoV-2 biology. Nucleocapsid could also be an interesting target for vaccines, as it is the second protein from the virion to interact with the immune system during the infection process and it is much more conserved than the spike protein, probably due to its RNA interaction and other vital functions in the infection cycle. We do know that inhibition of nucleocapsid leads to a highly decreased RNA binding affinity. Its essential role includes the formation of the ribonucleoprotein complex, the structure of which similarly eludes us. Current data suggest two potential structural arrangements: the helical hypothesis and the ‘beads on a string’ hypothesis. Furthermore, there seem to be additional function–structure relations which we do not yet understand. A deeper understanding of the ribonucleoprotein structure will shed light on the viral infection cycle as well as provide an alternative therapeutic strategy against SARS-CoV-2. Assumptions and inferences from SARS-CoV or other coronaviruses should be validated for SARS-CoV-2, in particular the organization of the ribonucleoprotein complex and the nsp3–nucleocapsid interaction, which is inferred from mouse hepatitis virus. A fold search on the PDB shows that both ordered domains have a fold which is unique to the coronavirus family and is not shared with other kinds of proteins.
More structure solutions are urgently needed, especiallywith inhibitor trials to unveil the precise structuralmechanisms of the interaction with single stranded RNA of RNA recognition. As a comparison, there are over 200 structures of spike protein domains from SARS-CoV-2 but less than 40 for nucleocapsid domains. SARS-CoV-2 will likely stay a common pathogen and hence, it is necessary to find long-term solutions against COVID-19. More knowledge about the biology of the virus is urgently needed, and the nucleocapsid has still a lot of secrets which we need to unravel to get there.
Acknowledgements
The authors would also like to thank Johannes Kaub and Rosemary Wilson for support and discussion. All figures are courtesy of the Coronavirus Structural Task Force (insidecorona.net). Figures 1, 2 and 3 were produced using the Protein Imager [43].
References
[1] Matchett WE, Joag V, Stolley JM, et al. Nucleocapsid vaccine elicits spike-independent SARS-CoV-2 protective immunity. J Immunol [Internet]. 2021 [cited 2022 Jan 13]; Available from: https://www.jimmunol.org/content/early/2021/06/30/jimmunol.2100421
[2] Pyrc K, Berkhout B, van der Hoek L. The novel human coronaviruses NL63 and HKU1. J Virol. 2007;81:3051–3057.
[3] Surjit M, Liu B, Kumar P, et al. The nucleocapsid protein of the SARS coronavirus is capable of self-association through a C-terminal 209 amino acid interaction domain. Biochem Biophys Res Commun. 2004;317:1030–1036.
[4] Zlotnick A. Theoretical aspects of virus capsid assembly. J Mol Recognit. 2005;18:479–490.
[5] Forsythe HM, Rodriguez Galvan J, Yu Z, et al. Multivalent binding of the partially disordered SARS-CoV-2 nucleocapsid phosphoprotein dimer to RNA. Biophys J. 2021;120:2890–2901.
[6] McBride R, van Zyl M, Fielding BC. The coronavirus nucleocapsid is a multifunctional protein. Viruses. 2014;6:2991–3018.
[7] Sturman LS, Holmes KV, Behnke J. Isolation of coronavirus envelope glycoproteins and interaction with the viral nucleocapsid. J Virol [Internet]. 1980 [cited 2022 Jan 17]; Available from: https://journals.asm.org/doi/abs/10.1128/jvi.33.1.449-462.1980
[8] He R, Leeson A, Ballantine M, et al. Characterization of protein-protein interactions between the nucleocapsid protein and membrane protein of the SARS coronavirus. Virus Res. 2004;105:121–125.
[9] Vennema H, Godeke GJ, Rossen JW, et al. Nucleocapsid-independent assembly of coronavirus-like particles by co-expression of viral envelope protein genes. EMBO J. 1996;15:2020–2028.
[10] Siu YL, Teoh KT, Lo J, et al. The M, E, and N structural proteins of the severe acute respiratory syndrome coronavirus are required for efficient assembly, trafficking, and release of virus-like particles. J Virol [Internet]. 2008 [cited 2022 Jan 17]; Available from: https://journals.asm.org/doi/abs/10.1128/JVI.01052-08
[11] Mu J, Fang Y, Yang Q, et al. SARS-CoV-2 N protein antagonizes type I interferon signaling by suppressing phosphorylation and nuclear translocation of STAT1 and STAT2. Cell Discov. 2020;6:1–4.
[12] Zúñiga S, Cruz JLG, Sola I, et al. Coronavirus nucleocapsid protein facilitates template switching and is required for efficient transcription. J Virol [Internet]. 2010 [cited 2022 Jan 17]; Available from: https://journals.asm.org/doi/abs/10.1128/JVI.02011-09
[13] Hurst KR, Ye R, Goebel SJ, et al. An interaction between the nucleocapsid protein and a component of the replicase-transcriptase complex is crucial for the infectivity of coronavirus genomic RNA. J Virol. 2010;84:10276–10288.
[14] Stertz S, Reichelt M, Spiegel M, et al. The intracellular sites of early replication and budding of SARS-coronavirus. Virology. 2007;361:304–315.
[15] Hurst KR, Koetzner CA, Masters PS. Characterization of a critical interaction between the coronavirus nucleocapsid protein and nonstructural protein 3 of the viral replicase-transcriptase complex. J Virol [Internet]. 2013 [cited 2022 Jan 18]; Available from: https://journals.asm.org/doi/abs/10.1128/JVI.01275-13
[16] Züst R, Miller TB, Goebel SJ, et al. Genetic interactions between an essential 3′ cis-acting RNA pseudoknot, replicase gene products, and the extreme 3′ end of the mouse coronavirus genome. J Virol [Internet]. 2008 [cited 2022 Jan 18]; Available from: https://journals.asm.org/doi/abs/10.1128/JVI.01690-07
[17] Surjit M, Kumar R, Mishra RN, et al. The severe acute respiratory syndrome coronavirus nucleocapsid protein is phosphorylated and localizes in the cytoplasm by 14-3-3-mediated translocation. J Virol [Internet]. 2005 [cited 2022 Jan 18]; Available from: https://journals.asm.org/doi/abs/10.1128/JVI.79.17.11476-11486.2005
[18] Surjit M, Liu B, Chow VTK, et al. The nucleocapsid protein of severe acute respiratory syndrome-coronavirus inhibits the activity of cyclin-cyclin-dependent kinase complex and blocks S phase progression in mammalian cells. J Biol Chem. 2006;281:10669–10681.
[19] Zúñiga S, Sola I, Moreno JL, et al. Coronavirus nucleocapsid protein is an RNA chaperone. Virology. 2007;357:215–227.
[20] Zhou B, Liu J, Wang Q, et al. The nucleocapsid protein of severe acute respiratory syndrome coronavirus inhibits cell cytokinesis and proliferation by interacting with translation elongation factor 1α. J Virol. 2008;82:6962–6971.
[21] SARS-CoV-2 nucleocapsid protein phase-separates with RNA and with human hnRNPs. The EMBO Journal. 2020;39:e106478.
[22] Cubuk J, Alston JJ, Incicco JJ, et al. The SARS-CoV-2 nucleocapsid protein is dynamic, disordered, and phase separates with RNA. bioRxiv. 2020;2020.06.17.158121.
[23] Takeda M, Chang C, Ikeya T, et al. Solution structure of the c-terminal dimerization domain of SARS coronavirus nucleocapsid protein solved by the SAIL-NMR method. J Mol Biol. 2008;380:608–622.
[24] Jayaram H, Fan H, Bowman BR, et al. X-ray structures of the N- and C-terminal domains of a coronavirus nucleocapsid protein: implications for nucleocapsid formation. J Virol. 2006;80:6612–6620.
[25] Khan A, Tahir Khan M, Saleem S, et al. Structural insights into the mechanism of RNA recognition by the N-terminal RNA-binding domain of the SARS-CoV-2 nucleocapsid phosphoprotein. Comput Struct Biotechnol J. 2020;18:2174–2184.
[26] Kang S, Yang M, Hong Z, et al. Crystal structure of SARS-CoV-2 nucleocapsid protein RNA binding domain reveals potential unique drug targeting sites. Acta Pharm Sin B. 2020;10:1228–1238.
[27] Structures of the SARS-CoV-2 nucleocapsid and their perspectives for drug design. The EMBO Journal. 2020;39:e105938.
[28] Dinesh DC, Chalupska D, Silhan J, et al. Structural basis of RNA recognition by the SARS-CoV-2 nucleocapsid phosphoprotein. PLoS Pathog. 2020;16:e1009100.
[29] Ye Q, West AMV, Silletti S, et al. Architecture and self-assembly of the SARS-CoV-2 nucleocapsid protein. Protein Sci. 2020;29:1890–1901.
[30] Lu S, Ye Q, Singh D, et al. The SARS-CoV-2 Nucleocapsid phosphoprotein forms mutually exclusive condensates with RNA and the membrane-associated M protein. bioRxiv. 2020;2020.07.30.228023.
[31] Zhou R, Zeng R, von Brunn A, et al. Structural characterization of the C-terminal domain of SARS-CoV-2 nucleocapsid protein. Mol Biomed. 2020;1:2.
[32] Hyman AA, Weber CA, Jülicher F. Liquid-liquid phase separation in biology. Annu Rev Cell Dev Biol. 2014;30:39–58.
[33] Wang B, Zhang L, Dai T, et al. Liquid-liquid phase separation in human health and diseases. Signal Transduct Target Ther. 2021;6:290.
[34] Tayeb-Fligelman E, Cheng X, Tai C, et al. Inhibition of amyloid formation of the Nucleoprotein of SARS-CoV-2 [Internet]. 2021 [cited 2022 Jan 18]. p. 2021.03.05.434000. Available from: https://www.biorxiv.org/content/10.1101/2021.03.05.434000v2.
[35] Chang C, Hou M-H, Chang C-F, et al. The SARS coronavirus nucleocapsid protein – forms and functions. Antiviral Res. 2014;103:39–50.
[36] Chen C-Y, Chang C-K, Chang Y-W, et al. Structure of the SARS coronavirus nucleocapsid protein RNA-binding dimerization domain suggests a mechanism for helical packaging of viral RNA. J Mol Biol. 2007;368:1075–1086.
[37] Yao H, Song Y, Chen Y, et al. Molecular Architecture of the SARS-CoV-2 virus. Cell. 2020;183:730–738.e13.
[38] Ren P-X, Shang W-J, Yin W-C, et al. A multi-targeting drug design strategy for identifying potent anti-SARS-CoV-2 inhibitors. Acta Pharmacol Sin. 2022;43:483–493.
[39] Lin S-M, Lin S-C, Hsu J-N, et al. Structure-based stabilization of non-native protein–protein interactions of coronavirus nucleocapsid proteins in antiviral drug design. J Med Chem. 2020;63:3131–3141.
[40] Lei J, Kusov Y, Hilgenfeld R. Nsp3 of coronaviruses: structures and functions of a large multidomain protein. Antiviral Res. 2018;149:58–74.
[41] Khan MT, Zeb MT, Ahsan H, et al. SARS-CoV-2 nucleocapsid and Nsp3 binding: an in silico study. Arch Microbiol. 2021;203:59–66.
[42] Zeng W, Liu G, Ma H, et al. Biochemical characterization of SARS-CoV-2 nucleocapsid protein. Biochem Biophys Res Commun. 2020;527:618–623.
[43] Tomasello G, Armenia I, Molla G. The Protein Imager: a full-featured online molecular viewer interface with server-side HQ-rendering capabilities. Bioinformatics. 2020;36:2909–2911. https://doi.org/10.1093/bioinformatics/btaa009