Genomic constellations of RVA detected in Brazil from 1986 to 2016: a temporal and geographical distribution and occurrence of reassortments

Introduction: Species A rotavirus (RVA) infections are a major cause of severe gastroenteritis in children <5 years of age worldwide. In Brazil, before vaccination, RVA was associated with 3.5 million episodes of acute diarrheal disease per year. Due to the segmented nature of their genomes, rotaviruses can exchange genes during co-infections, generating new virus strains and reinfections. Objective: To evaluate the genomic diversity of RVA isolated in Brazil over 30 years, from 1986 to 2016, and to investigate possible changes in the frequency of genotype constellations before and after the implementation of the vaccine. Methods: In total, 4,474 nucleotide sequences were obtained from the Virus Variation Database. Genomic constellations were compared, and the proportion of rotavirus genotypes was analyzed by time and geographic region. Results: Our results showed that the major known genotypes were circulating in the country during the period under analysis, with a prevalence of the G1P[8] Wa-like genotype, which decreased only in the period immediately after the introduction of the vaccine. Regarding the geographical distribution, most of the constellations, 62 (39.2%) and 50 (31.6%), were concentrated in the North and Northeast regions, respectively. Our analysis also showed the circulation of multiple strains during the periods when the DS-1-like and AU-1-like genotypes were co-circulating with the Wa-like genotype. Conclusion: It is likely that inter-genogroup reassortments are still occurring in Brazil, so it is important to establish an efficient surveillance system to follow the emergence of novel reassorted strains that might not be targeted by the vaccine.


INTRODUCTION
Rotavirus A (RVA) remains the main viral agent that causes acute gastroenteritis in children ≤5 years old worldwide, affecting children from both developed and developing countries [1][2][3][4][5]. In Brazil, before the vaccine introduction, RVA was responsible for approximately 650,000 outpatient visits, 92,000 admissions, and 850 deaths per year in children under five years of age 6. Due to the importance of RVA, in 2006 two vaccines with proven efficacy were licensed. In the same year, one of these vaccines was implemented in the National Immunization Program in Brazil: the monovalent G1P[8]-based rotavirus vaccine Rotarix® (GlaxoSmithKline Vaccines, Rixensart, Belgium) 7.
The vaccine was available free of charge to all children of eligible age to reduce the number of deaths caused by gastroenteritis in children. Surveillance studies show that the goal was achieved, with a considerable reduction in the number of hospitalizations and deaths related to gastroenteritis, especially in children up to 12 months [7][8][9].
RVA possesses a double-stranded RNA genome with 11 gene segments. The segmented nature of the genome enables reassortment between and within human and animal strains, favoring greater genomic diversity of this virus 10.
The traditional classification is based on the genes that encode the outer capsid proteins, VP4 (P-genotype) and VP7 (G-genotype) 10. More recently, the genome classification of RVA strains has been extended to include all 11 genes: Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx, corresponding to the genes VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6, respectively 11. Based on this classification, human RVAs possess one of the following genotype constellations along with the G and P genotypes: Wa-like (I1-R1-C1-M1-A1-N1-T1-E1-H1), of porcine origin; DS-1-like (I2-R2-C2-M2-A2-N2-T2-E2-H2), of bovine origin; or, more rarely, AU-1-like (I3-R3-C3-M3-A3-N3-T3-E3-H3), possibly of canine or feline origin [10][11][12][13][14].
https://doi.org/10.7322/abcshs.2021169.1882
The Wa-like genotype is responsible for more than 50% of diarrhea cases in children worldwide and thus has a fundamental role in the emergence of new cases 15,16. In Brazil as well, previous studies have shown a prevalence of the G1P[8] Wa-like genotype. However, some reassortment events between different constellations have been described previously, forming new strains that may be of epidemiological importance but are not identified when only the G and P genotypes are analyzed 10,17. The classification of the entire viral genome makes it possible to visualize evolutionary events in genes that are not under pressure from the vaccine but that may be of great clinical or epidemiological importance, such as the gene encoding the NSP4 protein, a viral enterotoxin released by infected cells 18. Although some studies have already described the genotype constellations of RVA strains circulating in Brazil 10,17,19, no study has yet carried out a long-term evaluation of all available strains to verify temporal changes in the genotypes.
In addition, Brazil is a country of continental proportions, and no studies have examined regional differences in the RVA genomic constellation over time.
Therefore, this study aimed to evaluate the genomic diversity of RVA isolated in Brazil over a period of 30 years to investigate possible temporal and/or regional changes in the genomic constellation prevalence.
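The constellation nomenclature described in this Introduction can be illustrated with a minimal Python sketch. It assumes a constellation is written as an 11-part, hyphen-separated string in the order VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6; the function name `classify_constellation` and the label "mixed" are illustrative choices, not the procedure used in this study.

```python
# Genotype prefixes of the nine backbone genes, in the order
# VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6.
BACKBONE_PREFIXES = ["I", "R", "C", "M", "A", "N", "T", "E", "H"]

# Backbone genotype number -> genogroup name, as listed in the text.
GENOGROUPS = {"1": "Wa-like", "2": "DS-1-like", "3": "AU-1-like"}


def classify_constellation(constellation):
    """Classify a string such as 'G1-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1'.

    Hypothetical helper: returns the genogroup name when all nine
    backbone genes share one genotype number, 'mixed' when they differ,
    and 'other' when the shared number is not 1, 2, or 3.
    """
    parts = constellation.split("-")
    if len(parts) != 11:
        raise ValueError("expected 11 genotypes, got %d" % len(parts))
    backbone = parts[2:]  # skip the G (VP7) and P (VP4) genotypes
    numbers = set()
    for prefix, genotype in zip(BACKBONE_PREFIXES, backbone):
        if not genotype.startswith(prefix):
            raise ValueError("unexpected genotype %r" % genotype)
        numbers.add(genotype[len(prefix):])
    if len(numbers) == 1:
        return GENOGROUPS.get(next(iter(numbers)), "other")
    return "mixed"
```

In this sketch a single backbone gene with a different genotype number (for example N2 in an otherwise Wa-like backbone) is enough to label the strain as mixed, which is how a candidate reassortant would be flagged for closer inspection.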

Sequence alignments and phylogenetic reconstruction of the RVA sequences
Nucleotide sequences for each rotavirus genome segment were aligned using the MUSCLE algorithm with default parameters 20, as implemented in the MEGA5 software 21. The phylogenetic relationships of RVA sequences were determined by Bayesian inference using MrBayes v3.2.7 22,23, with 12,000,000 generations for the Markov chain Monte Carlo (MCMC) algorithm. A burn-in of 25% was discarded to eliminate iterations at the beginning of the MCMC run. For Bayesian inference tree reconstruction, the general time reversible (GTR) model with Gamma distribution (+G) and a proportion of invariable sites (+I), as indicated by jModelTest v2.1.10 24, was used for the NSP1, VP1, VP3, VP4, and VP7 datasets. For the NSP2 and NSP4 segments, the 3-parameter models (TPM) TPM1uf+G and TPM2uf+I+G were used, respectively. For the NSP3 and VP2 segments, the transitional models (TIM) TIM1+I+G and TIM3+G were used, respectively. The models used for phylogenetic reconstruction of the NSP5 and VP6 segments were Hasegawa-Kishino-Yano (HKY)+G and Tamura-Nei (TrN)+I+G, respectively. The tree for each viral segment was edited and visualized with iTOL v5.6.3 (https://itol.embl.de/).
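As a compact summary of the settings above, the following Python sketch records the substitution model reported for each segment and works out how many MCMC generations a 25% burn-in discards. The variable and function names are illustrative assumptions; the sketch does not run MrBayes or jModelTest.

```python
# Substitution model reported for each genome segment (from the text above).
SUBSTITUTION_MODELS = {
    "NSP1": "GTR+G+I",
    "VP1": "GTR+G+I",
    "VP3": "GTR+G+I",
    "VP4": "GTR+G+I",
    "VP7": "GTR+G+I",
    "NSP2": "TPM1uf+G",
    "NSP4": "TPM2uf+I+G",
    "NSP3": "TIM1+I+G",
    "VP2": "TIM3+G",
    "NSP5": "HKY+G",
    "VP6": "TrN+I+G",
}

N_GENERATIONS = 12_000_000  # MCMC generations per run
BURNIN_FRACTION = 0.25      # fraction discarded at the start of the run


def burnin_generations(n_generations, fraction):
    """Number of initial generations discarded as burn-in."""
    return int(n_generations * fraction)


discarded = burnin_generations(N_GENERATIONS, BURNIN_FRACTION)
retained = N_GENERATIONS - discarded
```

With these values, 3,000,000 generations are discarded and 9,000,000 are retained for summarizing each tree.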

RESULTS
A total of 4,474 nucleotide sequences were obtained from the Rotavirus database at the Virus Variation Resource. Regarding the length of the sequences, 4,214 had a length above 500 bp. The sequences smaller than 500 bp were distributed among the 10 RVA genotypes, with the NSP1, NSP2, and NSP3 genotypes having one sequence each.
The analysis of the genomic constellation by time and region showed that most of the isolates, 62 (39.2%) and 50 (31.6%), were concentrated in the North and Northeast regions, respectively. After the introduction of the vaccine, the Wa-like constellations showed a small increase and strains of the DS-1-like constellation appeared for the first time in the country. The period 2009-2011 presented the largest number of representatives, with 59 Wa-like and 22 DS-1-like constellations. The number of mixed constellations doubled after the introduction of the vaccine, from three to six (Figure 4).

The phylogenetic analysis of RVA based on each of the 11 gene segments showed that all RVA genes presented a specific pattern of clustering of the isolates from Brazil according to the genotypes and their respective lineages. It was also possible to observe a temporal pattern of clustering for the lineages in all reconstructed phylogenies. In addition, the internal groups of most lineages presented a regional cluster. However, the temporal clustering was the most evident (Supplementary files 3-13). In the phylogeny of NSP1, it was possible to identify a cluster of sequences from the South (KM026548), Southeast (KM026553), and Midwest (KM026547) regions, shown in black. The P[8] genotype of VP4 also showed this clustering pattern. In this segment, an orange sequence (KM027056) from the Southeast region is highlighted, clustering together with sequences from the Midwest region (KM027057, KM027059, and JX437037), in yellow. The clusters highlighted here shared the same period, 1988-1998. Other segments analyzed also presented this clustering pattern, with sequences highlighted or marked with different colors within the same genotype. A sequence in A3 of NSP1 (JQ715659) from the State of Pará did not present a clustering pattern based on year or geographical region. The posterior probabilities for the external nodes were above 95% and, for most internal nodes in the 11 gene segments, the posterior probability values were greater than 75%. The VP4 segment had the lowest branch support for several internal nodes of the phylogenetic tree (Supplementary files 3-13).

DISCUSSION
Despite the existence of two licensed and effective vaccines, RVA remains an important agent of acute gastroenteritis and diarrhea-related deaths in children under 5 years of age, especially in developing countries [1][2][3][4]. In Brazil, the vaccine has been available since 2006 and has maintained high levels of coverage, which has led to a decrease in the number of hospitalizations and deaths from diarrhea in the country 9.
However, despite the vaccine's effectiveness, new cases of RVA are reported annually in several studies 10,[25][26][27][28]. Based on the analyzed data, during 30 years of surveillance for Brazilian RVA, the G1P[8] genotype was the most prevalent when considering the entire study period, but there was a sharp decrease in the circulation of this genotype shortly after the introduction of the vaccine in the country (2007-2008 period). In the same period, it is also possible to see an increase in G2P[x] genotypes, especially G2P[4], which is heterotypic to the Rotarix® vaccine strain (G1P[8]) (Figure 3). This decrease in the G1P[8] genotype and increase in G2P[4] in 2007-2008 was also seen in countries where the pentavalent RotaTeq® vaccine was introduced, such as Australia 29, as well as in countries that had not introduced any vaccine against RVA, such as Argentina 30, which suggests that these fluctuations of genotypes over time may be a natural viral mechanism of maintenance in the human population 31.
Before the vaccine, G1P[8] represented 6.7%, 14.4%, and 33.9% of the strains in the 1986-1995, 1996-2000, and 2001-2006 periods, respectively. After the vaccine introduction, it represented 1.3%, 32.9%, and 10.6% of the strains in 2007-2008, 2009-2011, and 2012-2016, respectively. These variations in the prevalence of the G1P[8] genotype, with a decrease mainly in the periods following the start of vaccination, have already been described by Santos et al. 25 in a systematic review of publications between 1986-2015. Here, we analyzed the nucleotide sequences available in databases for all 11 RVA genes, along with the genomic constellations, and evaluated their distribution by time (1986-2016) and by geographic region of the country. We conclude that there was no linear reduction in the frequency of this genotype among the total circulating RVA after the implementation of the vaccine, although there was a reduction in the total number of deaths and hospitalizations due to infection with this virus.
The Wa-like genomic constellation was the most prevalent during the entire study period, which was expected considering that this is the genotype most frequently reported in human infections around the world [32][33][34][35][36]. However, in specific periods, the other genotypes were also identified in Brazil. AU-1-like, which is a rarer genotype, was found only in the period 1996-2000, and the DS-1-like genotype was identified in two consecutive periods (2007-2008 and 2009-2011). Despite their low frequencies, mixed genotypes, which had undergone reassortment events, were also identified in the same periods in which the DS-1-like and AU-1-like genotypes were identified (Figure 4). Some DS-1-like and AU-1-like genotypes, along with mixed genotypes, have already been described in other studies in Brazil as well as in other regions of the world 16,18,22,24,35,37. Recently, the spread of a mixed equine-like G3P[8] strain in Brazil has been identified, showing that these mixed strains are still emerging and need to be identified 37. Some studies have reported the existence of inter-genogroup reassortments among different animal and human RVA genogroups 14,15,38. Reassortment events have been reported in the NSP3 gene in strains from Maranhão and Rio de Janeiro 14,39. Our results indicate reassortment events in the NSP2, NSP3, NSP4, VP2, and VP3 genes in strains collected from Brazil that belong to the N2, T2, E2, C2, and M3 genotypes, associated with the co-circulation of DS-1-like and AU-1-like genomic constellations. Despite the evolutionary importance of reassortment events in RVA genomes, the effects of these events on RVA vaccines still need to be addressed in further studies.
Our analysis of the genomic constellations by time and geographic region showed that, before the vaccine was introduced, there was a greater number of genotype constellations in the Southeast and Northeast regions. The North region had no constellations available in the pre-vaccine periods, which may indicate a lack of interest in monitoring RVA in this region before the vaccine introduction. In the post-vaccine periods, the Northeast continued to have the second-highest number of available constellations, but the number of available sequences decreased in the Southeast and the North emerged as the region with the most sequences available. This demonstrates an inconsistency in RVA surveillance in the country that can make it difficult to monitor the epidemiology of the virus as well as the effectiveness of the vaccine. The Central-West region had the fewest constellations available throughout the analyzed period, with none obtained in the post-vaccination period, which shows how little we know about the vaccine's impact on the complete RVA genome constellations circulating in this region.
It is important to remember that the human Wa-like strain of RVA has the same phylogenetic origin as porcine strains of RVA, and the appearance of new strains is likely to happen during co-infection in pig cells 13. Therefore, in environments where humans handle these animals, as well as cattle and canines, there is a potential for the emergence of new mixed strains between human and animal hosts. Most likely, these new strains will not be able to infect human cells. However, it is necessary to maintain control measures, including good sanitary conditions, continued vaccination to control the spread of the virus, and continued surveillance of viral circulation and analysis of the RVA genome, to identify these changes and implement effective control measures 13.
In this study, based on the analyzed genomic data, it was possible to conclude that strains of mixed genotype or animal origin have not been successful in maintaining themselves in the human population in Brazil over the years, with the Wa-like genotype being the most prevalent throughout the analyzed period. However, the evolutionary history between pathogen and host is dynamic, and it is necessary to remain vigilant to detect any event that could change the course of what is observed today. Phylogenetic analysis and evolutionary history were inferred for all 11 genome segments of RVA, which exhibited similar temporal and regional clustering patterns.
Sequences from different regions were identified in some internal groups. Thus, it was possible to verify that even in a continental country such as Brazil, the temporal grouping overlaps the regional grouping in the analyzed phylogenies. Perhaps this is explained by the constant temporary migrations from one region to another in the country 40.
Although our findings are relevant, our study has a limitation. Our data were obtained from a genomic database, in which some periods may be better represented than others, since genomic sequencing of rotavirus has become more accessible in recent years. Therefore, many more sequences are available for recent years than for some earlier years. Also, some regions may be over- or under-represented, and the results presented here may not represent the real situation of each region. Even with these limitations, these results are relevant because they present the distribution of circulating genotypes in Brazil over a period of 30 years, as well as the possibility that inter-genogroup reassortment is occurring in Brazil. It is therefore important to establish an efficient surveillance system to follow the emergence of novel genotype constellations that might not be targeted by the vaccine.

(186) and VP4 (177) for Northeast. The years 2009 and 2011 presented the largest number of representatives (Figure 1a, 1b). In the Central-West, Southeast, and South regions, most of the genotypes obtained were from the pre-vaccination period, especially between the years 2001 and 2006 (Figure 1c and 1e).

Most of the isolates were concentrated in the North and Northeast regions and the period 2009-2011, with 59 (37.3%) and 28 (17.7%) constellations, respectively. The Southeast and South regions presented 28 (17.7%) and 12 (7.6%) constellations, respectively. Most of the isolates in the Southeast and South regions were identified between 1996-2000 and 2001-2006. In the period 1996-2000, only constellations identified in the Southeast region were registered in the analyzed database. They represented 13 (8.2%) of the total constellations analyzed in this study. The number of constellations identified in the Southeast and South regions in the period 2001-2006 was 9 (5%) and 10 (6.3%), respectively. The periods 1986-1995, 2007-2008, and 2012-2016 had the lowest number of constellations identified, 9 (5%), 6 (3.8%), and 3%), respectively. The Central-West region, with 6 (3.8%) constellations, presented the lowest frequency identified. Based on the genomes deposited in the database, before the introduction of the vaccine, only the Wa-like and AU-1-like constellations were present in Brazilian strains, with the period 2001-2006 housing 41 of the 64 constellations present in the 1986-2006 time interval (Figure 4).
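The regional percentages reported in the Results are consistent with a denominator of 158 complete constellations, which is also the sum of the five regional counts. The following Python sketch reproduces that arithmetic; the names `REGION_COUNTS` and `regional_percentages` are illustrative and not part of the original analysis.

```python
# Number of complete constellations per region, as reported in the Results.
REGION_COUNTS = {
    "North": 62,
    "Northeast": 50,
    "Southeast": 28,
    "South": 12,
    "Central-West": 6,
}


def regional_percentages(counts):
    """Return each region's share of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {
        region: round(100 * n / total, 1)
        for region, n in counts.items()
    }


TOTAL_CONSTELLATIONS = sum(REGION_COUNTS.values())
PERCENTAGES = regional_percentages(REGION_COUNTS)
```

Under these counts, the North and Northeast regions together account for 112 of the 158 constellations, which is why the discussion of regional sampling bias focuses on them.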

Figure 1: Distribution of the gene segments in the different regions of Brazil from 1986 to 2016.(A) North; (B) Northeast; (C) Central West; (D) Southeast; (E) South.

Figure 2: Distribution of the sequences of the G and P genotypes in the period from 1986 to 2016.

Figure 3: Distribution of the combined G and P genotypes of Rotavirus A in Brazil from 1986 to 2016.

Figure 4: Genomic constellation distribution of Rotavirus A in Brazil from 1986 to 2016.

Analysis of the gene segments and constellations for time and geographic region
To create a local database, 4,474 nucleotide sequences of the 11 RVA segments from viruses infecting the human host, with lengths between 201 and 3302 bp, and 158 fully sequenced RVA isolates from 1986 to 2016, with sequence lengths between 528 and 3302 bp, were collected from the Rotavirus database at the Virus Variation Resource (https://www.ncbi.nlm.nih.gov/genome/viruses/variation/).
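One way to identify fully sequenced isolates in such a local database is to keep only those with a sequence for every one of the 11 segments. The Python sketch below illustrates the idea under the assumption that each record is an (isolate name, segment name) pair; the record layout and the function name `complete_isolates` are hypothetical and do not reflect the actual format of the Virus Variation Resource export.

```python
from collections import defaultdict

# The 11 RVA genome segments (NSP5 stands for NSP5/6).
SEGMENTS = (
    "VP7", "VP4", "VP6", "VP1", "VP2", "VP3",
    "NSP1", "NSP2", "NSP3", "NSP4", "NSP5",
)


def complete_isolates(records):
    """Return the sorted names of isolates that cover all 11 segments.

    `records` is an iterable of (isolate, segment) pairs; segments
    outside SEGMENTS are ignored.
    """
    seen = defaultdict(set)
    for isolate, segment in records:
        if segment in SEGMENTS:
            seen[isolate].add(segment)
    return sorted(
        isolate for isolate, segments in seen.items()
        if len(segments) == len(SEGMENTS)
    )
```

Isolates that pass this filter are the ones for which a full genotype constellation can be assigned; partial isolates still contribute to the per-segment counts.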