More new SARS-CoV-2 variants are likely

In a recent report published on the bioRxiv* preprint server, researchers from the United Kingdom and Uganda describe unique protein features of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) compared with other members of the Sarbecovirus subgenus.

The causative agent of coronavirus disease 2019 (COVID-19), SARS-CoV-2, belongs to the Betacoronavirus genus and the Sarbecovirus subgenus of the Coronaviridae family, a group of enveloped, single-stranded RNA viruses.

Other members of Sarbecovirus include SARS-CoV (the coronavirus that caused the SARS outbreak of 2002 and 2003), as well as numerous SARS-like coronaviruses that have been identified and analyzed in detail.

Among the viral structural proteins, the spike glycoprotein plays a key role in host range, cellular tropism, entry, and infectivity. It is therefore no wonder that it is considered the main target of the host's immune response.

A thorough comparative analysis of viral proteins would therefore greatly improve our understanding of viral biology and pathology, providing insight into the virus's origins as well as the conditions that led to the ongoing COVID-19 pandemic.

That is why Dr. Matthew Cotten, Dr. David L. Robertson, and Dr. My V.T. Phan, from Uganda and the United Kingdom, aimed to identify peptide regions unique to SARS-CoV-2 compared with all available Sarbecoviruses, in order to assess features that could allow SARS-CoV-2 to replicate efficiently and transmit between humans.

Study: Unique protein properties of SARS-CoV-2 compared to other Sarbecoviruses. Image credit: NIAID

Genome comparison and estimation of evolutionary distance

In short, the research group examined genomes across the Sarbecovirus subgenus using profile hidden Markov models (pHMMs). This modeling approach is based on a statistical description of the properties of viral proteins and their amino acid sequences.

Specifically, ten early SARS-CoV-2 genomes were compared with a representative subset of Sarbecovirus genomes derived from humans, bats, pangolins, and civets. These were selected after the same analysis had been applied to all available Betacoronavirus genomes, to avoid missing any unexpectedly close regions of the viral genome.

To estimate the total domain distance between viral groups, normalized bit scores were summed across all domains for each genome, with genomes grouped into SARS-CoV-2 and human, bat, pangolin, and civet Sarbecoviruses.
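The summation described above can be illustrated with a minimal sketch. It assumes that a normalized bit score is the query genome's bit score for a domain divided by the score an early SARS-CoV-2 reference genome obtains for the same domain; the article does not spell out the normalization, so this is an assumption. The function names, genome names, and numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def summed_normalized_scores(raw_scores, reference_scores):
    """Sum of normalized bit scores over all reference domains for one genome.

    Assumption: normalized score = query bit score / reference bit score,
    and a domain with no hit in the query counts as 0.
    """
    return sum(raw_scores.get(domain, 0.0) / ref
               for domain, ref in reference_scores.items())

def group_means(genomes, groups, reference_scores):
    """Average the per-genome sums within each host group."""
    per_group = defaultdict(list)
    for name, raw in genomes.items():
        per_group[groups[name]].append(
            summed_normalized_scores(raw, reference_scores))
    return {group: sum(vals) / len(vals) for group, vals in per_group.items()}

# Toy data (invented): two domains, three genomes.
reference = {"d1": 50.0, "d2": 40.0}
genomes = {
    "sars2_a": {"d1": 50.0, "d2": 40.0},
    "bat_x": {"d1": 25.0, "d2": 40.0},
    "civet_y": {"d1": 10.0},  # no hit for d2
}
groups = {"sars2_a": "SARS-CoV-2", "bat_x": "bat", "civet_y": "civet"}
means = group_means(genomes, groups, reference)
```

In this toy setup a genome identical to the reference sums to the number of domains, and lower sums indicate greater distance from early SARS-CoV-2.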

Scheme of analysis. (A) Profile hidden Markov model (pHMM) domains were generated from a set of 35 early lineage B SARS-CoV-2 genome sequences. All open reading frames were translated and then sliced into either 44-amino-acid peptides with a step size of 22 amino acids or 15-amino-acid peptides with a step of 8 amino acids. Peptides were clustered using Uclust (13), aligned with MAFFT (14), and each alignment was then built into a pHMM using HMMER-3 (10). (B) The pHMM set was used to query the Sarbecovirus genome sequences; bit scores were collected as a measure of the similarity between each pHMM and the query sequence. (C) Bit scores were collected and analyzed to reveal regions that differ between the early SARS-CoV-2 genomes and the query genome.
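The slicing step in panel (A) amounts to a sliding window over each translated protein. The sketch below uses the window and step sizes given in the caption; the function name is hypothetical, and the handling of trailing residues that do not fill a whole window (dropped here) is an assumption, since the caption does not say how the authors treated them.

```python
def slice_peptides(protein, window, step):
    """Return overlapping peptides of length `window`, one every `step` residues.

    Assumption: residues at the end that do not fill a complete window are
    dropped; a protein shorter than the window is returned as a single peptide.
    """
    if len(protein) <= window:
        return [protein]
    return [protein[start:start + window]
            for start in range(0, len(protein) - window + 1, step)]

# Toy protein of 88 residues (not a real sequence).
toy_protein = "ACDEFGHIKLMNPQRSTVWY" * 4 + "ACDEFGHI"
long_peptides = slice_peptides(toy_protein, window=44, step=22)
short_peptides = slice_peptides(toy_protein[:31], window=15, step=8)
```

With a step of half the window length, each residue away from the protein ends is covered by two peptides, which is what gives adjacent domains their overlap.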

The unique nature of SARS-CoV-2

The changes detected in the SARS-CoV-2 spike glycoprotein, compared with a large number of known Sarbecoviruses, indicate that a recent zoonotic source of this virus has not yet been found, and also support the rather unique nature of the SARS-CoV-2 genome.

Consistent with previous reports, a small set of bat-derived and pangolin-derived Sarbecoviruses showed the greatest similarity to SARS-CoV-2, while the proteome similarity measure indicated that bat Sarbecoviruses are unlikely to be the direct source of the pandemic virus.

Furthermore, the areas of variance identified in this study may indicate either functional changes in SARS-CoV-2 proteins or amino acid positions that can be modified without compromising the required protein functions.

A detailed analysis of the spike protein revealed 82 domains of 15 amino acids that showed large variation across Sarbecoviruses, while 29 of these domains showed changes in the variants of concern relative to the early lineage of SARS-CoV-2.

Proteome differences in SARS-CoV-2 relative to close bat, human, and civet Sarbecoviruses. All open reading frames from 35 early lineage B SARS-CoV-2 genomes were translated and processed into 44 aa peptides (with 22 aa overlaps), clustered at 0.65 identity using Uclust (11), aligned with MAFFT (12), and converted into pHMMs using HMMER-3 (10). The presence of these domains was sought in a set of Sarbecovirus plus SARS-CoV-2 genomes, and the genomes were then clustered using hierarchical clustering based on the normalized domain bit scores (e.g., the similarity of the identified query domain to the reference lineage B SARS-CoV-2 domain). Each row represents a genome, each column represents a domain. Domains are shown in their order across the SARS-CoV-2 genome. Red = low normalized domain bit score (less similar to lineage B SARS-CoV-2) = distant from SARS-CoV-2; darkest gray = normalized domain bit score of 1 = highly similar to lineage B SARS-CoV-2. Coronavirus groups are indicated to the right of the image. (A) Domain differences across the Sarbecovirus subgenus. (B) For each domain, the mean bit score was calculated across the whole set of Sarbecovirus genomes, and the value 1 - mean bit score was plotted. Domains are colored by the protein from which they were derived, using the color code indicated below the image.
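The quantity plotted in panel (B), 1 minus the mean bit score of each domain, can be sketched as follows. This is only an illustration: it assumes the scores have already been normalized to the 0 to 1 range, treats a missing domain as a score of 0, and uses hypothetical names and invented numbers.

```python
def domain_variability(normalized):
    """Map each domain to 1 - mean normalized bit score across all genomes.

    `normalized` is {genome: {domain: normalized bit score}}.
    Assumption: a domain absent from a genome contributes a score of 0.
    """
    domains = sorted({d for scores in normalized.values() for d in scores})
    n_genomes = len(normalized)
    return {
        d: 1.0 - sum(scores.get(d, 0.0) for scores in normalized.values()) / n_genomes
        for d in domains
    }

# Toy data (invented): d1 is conserved, d2 varies between genomes.
toy_scores = {
    "genome_1": {"d1": 1.0, "d2": 0.5},
    "genome_2": {"d1": 1.0},
}
variability = domain_variability(toy_scores)
```

Values near 0 mark domains that are conserved across the genome set, and values near 1 mark domains that differ strongly from the early SARS-CoV-2 reference.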

Watch out for viral adaptation

This study lends credibility to the idea of continuous genomic monitoring of variants, which should translate into the readiness of vaccine manufacturers to incorporate such glycoprotein changes into the next generation of vaccine updates.

“In a broader sense, the evolution of SARS-CoV-2 observed in the current variants of concern has sampled only 36% of the possible spike changes that have historically occurred in Sarbecovirus evolution,” say the study’s authors in this bioRxiv paper.

“It is likely that a large number of new SARS-CoV-2 variants are possible with changes in these regions, compatible with virus replication and expected in the coming months, unless global virus replication is severely reduced,” they added.

In conclusion, the high mutation rate of SARS-CoV-2, combined with the exceptional number of SARS-CoV-2 infections worldwide, leads to massive viral adaptation. Further experiments will therefore be needed to distinguish true functional changes from neutral evolution.

* Important notice

bioRxiv publishes preliminary scientific reports that have not been peer-reviewed and, therefore, should not be considered conclusive, used to guide clinical practice or health-related behavior, or treated as established information.
