There is a secret code that virologists use to talk about the new coronavirus. This code is made up of synonyms and abbreviations for each of the 28 proteins that facilitate the viral life cycle. In this article, we will shed some light on this cryptic language.
First of all, SARS-CoV-2 has three classes of proteins:
Structural proteins: the spike, membrane and envelope proteins, plus the nucleocapsid protein, which forms an extra shell around the single-stranded RNA. They are also known as the S, M, E and N proteins.
Non-structural proteins (NSPs) keep the viral life cycle running but are not part of the envelope or the nucleocapsid. They are conveniently numbered 1–16.
And then there are accessory proteins, which seem to be more important in vivo than in vitro; the structures of most of them have not yet been determined.
Of course, this nice and clear naming scheme tells you little about the function and properties of the different proteins, which is why virologists invented plenty of other names for them. And this is where the confusion begins.
NSP3, for example, contains two ubiquitin-like domains (UBL1 and UBL2), a papain-like protease domain (PLpro or PL2pro, which includes a zinc finger), a "macro" domain (also known as X domain, Mac1, or ADP ribose phosphatase), a hypervariable region (HVR, also called Glu-rich acidic domain), two transmembrane domains (TM1 and TM2), an ecto domain (3Ecto, which is also a zinc finger), a conserved domain of unknown function called Y1, and a coronavirus-specific carboxyl-terminal domain (CoV-Y). The SARS-unique domains (SUDs), namely SUD-N, SUD-M, and SUD-C, were all renamed once it turned out that they are not unique to SARS: SUD-N is now Mac2, SUD-M is Mac3, and SUD-C is called DPUP.
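If you work with older papers or annotation files, the renaming of the SUD domains can be applied mechanically. The following Python snippet is a minimal sketch of that idea: the mapping comes from the paragraph above, while the names `SUD_RENAMES` and `modernize_sud_names` are made up for this illustration and do not belong to any existing tool. It assumes the legacy names are spelled exactly as "SUD-N", "SUD-M" and "SUD-C".

```python
import re

# Legacy SARS-unique domain names and their newer equivalents (from the text above).
SUD_RENAMES = {"SUD-N": "Mac2", "SUD-M": "Mac3", "SUD-C": "DPUP"}

# One pattern that matches any of the three legacy names as a whole token.
_SUD_PATTERN = re.compile(r"\bSUD-[NMC]\b")


def modernize_sud_names(text):
    """Replace legacy SUD domain names in a string with their current names."""
    return _SUD_PATTERN.sub(lambda match: SUD_RENAMES[match.group(0)], text)


print(modernize_sud_names("The construct spans SUD-N, SUD-M and SUD-C."))
# prints: The construct spans Mac2, Mac3 and DPUP.
```

Other spellings (for example "SUD N" or lower case) would need a looser pattern.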
If this is not enough to convince you how confusing all of this is, here are some additional names:
Spike: S-Protein, surface glycoprotein, E2 glycoprotein
NSP1: leader protein
NSP5: 3CLpro, SARS-CoV-2 3C-like protease, 3C-like proteinase, main protease, NSP5A_3CLpro, NSP5B_3CLpro, Mpro, Non-structural protein 5
NSP9: Non-structural protein 9, ssRNA-binding protein
NSP10: Non-structural protein 10, growth factor-like protein, GFL
NSP12: RNA Polymerase, RNA-dependent RNA Polymerase, NiRAN, RdRp
NSP13: NSP13-pp1ab, non-structural protein 13, helicase, NTPase, Hel
NSP14: NSP14A2_ExoN, SARS-CoV-2 3'-to-5' exonuclease, non-structural protein 14, NSP14B_NMT
NSP15: NSP15-A1, SARS-CoV-2 endoRNAse, NSP15B-NendoU, NendoU, uridylate-specific endoribonuclease NendoU
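For anyone who has to reconcile these names in a script, the list above maps naturally onto a lookup table. The Python sketch below is only an illustration: it contains a subset of the synonyms listed above, and the names `SYNONYMS`, `build_alias_index` and `canonical_name` are hypothetical helpers, not part of any library. Ignoring upper and lower case when matching is an assumption that may not suit every data source.

```python
# Illustrative subset of the synonym list above; extend as needed.
SYNONYMS = {
    "Spike": ["S-Protein", "surface glycoprotein", "E2 glycoprotein"],
    "NSP1": ["leader protein"],
    "NSP5": ["3CLpro", "3C-like proteinase", "main protease", "Mpro"],
    "NSP9": ["ssRNA-binding protein"],
    "NSP10": ["growth factor-like protein", "GFL"],
    "NSP12": ["RNA-dependent RNA Polymerase", "RdRp"],
    "NSP13": ["helicase", "Hel"],
    "NSP14": ["NSP14A2_ExoN", "NSP14B_NMT"],
    "NSP15": ["NendoU", "uridylate-specific endoribonuclease NendoU"],
}


def build_alias_index(synonyms):
    """Map every lower-cased alias (and canonical name) to its canonical name."""
    index = {}
    for canonical, aliases in synonyms.items():
        for name in [canonical, *aliases]:
            index[name.strip().lower()] = canonical
    return index


ALIAS_INDEX = build_alias_index(SYNONYMS)


def canonical_name(name):
    """Return the canonical protein name, or None if the alias is unknown."""
    return ALIAS_INDEX.get(name.strip().lower())


print(canonical_name("Mpro"))   # prints: NSP5
print(canonical_name("rdrp"))   # prints: NSP12
print(canonical_name("ORF8"))   # prints: None (not in this subset)
```

A flat alias-to-name index keeps each lookup to a single dictionary access; returning None for unknown names makes gaps in the table easy to spot.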
All these names are certainly hard to remember, but as a scientist you need them in order to save the world! So, we made a handy glossary for you that you can access here.
If you have any more suggestions or corrections for the glossary, please let us know in the comments!