Genotypic and phenotypic diversity within the neonatal HSV-2 population

Abstract

More than 14,000 neonates are infected with herpes simplex virus (HSV) annually. Approximately half display manifestations limited to the skin, eyes, or mouth (SEM disease). The rest develop invasive infections that spread to the central nervous system (CNS disease) or systemically (disseminated disease). This invasive HSV disease is associated with significant morbidity and mortality, but viral and host factors that predispose neonates to these forms are unknown. To define viral diversity within the neonatal population, we evaluated ten HSV-2 isolates from newborns with a range of clinical presentations. To assess viral fitness independent of host immune factors, we measured viral growth characteristics in cultured cells and found diverse in vitro phenotypes. Isolates from neonates with CNS disease were associated with larger plaque size and enhanced spread, with isolates from cerebrospinal fluid (CSF) exhibiting the most robust growth. We sequenced complete viral genomes of all ten neonatal viruses, providing new insights into HSV-2 genomic diversity in this clinical setting. We found extensive inter-host and intra-host genomic diversity throughout the viral genome, including amino acid differences in more than 90% of the viral proteome. The genes encoding glycoprotein G (gG, US4), gI (US7), gK (UL53), and viral proteins UL8, UL20, UL24, and US2 contained variants that were found only in association with CNS isolates. Many of these viral proteins are known to contribute to cell spread and neurovirulence in mouse models of CNS disease. This study represents the first application of comparative pathogen genomics to neonatal HSV disease.

Significance Statement

Herpes simplex virus (HSV) causes invasive disease in half of infected neonates, resulting in significant mortality and permanent cognitive morbidity. The factors that contribute to invasive disease are not understood.
This study reveals diversity among HSV isolates from infected neonates, and makes the first associations between viral genetic variations and clinical disease manifestations. We found that viruses isolated from newborns with invasive brain disease show enhanced spread in culture. These viruses contain protein-coding variations not found in viruses causing non-invasive disease. Many of these variations are found in proteins known to impact neurovirulence and viral spread between cells. This work advances our understanding of HSV diversity in the neonatal population and will aid the development of neuroprotective anti-viral therapies.


Each year an estimated 10,000 neonates are infected with HSV-2, and 4,000 infected with HSV-1, worldwide (1). Infants are typically infected at the time of birth by maternal genital shedding of HSV; most often mothers are not aware of their infection (2-4). The recent increase in genital HSV-1 incidence among women of childbearing age, particularly in developed nations, suggests that the burden of neonatal infection will continue to rise (1,5). While some infected infants exhibit only superficial infection limited to the skin, eyes, or mouth (SEM disease; 45%), about half develop invasive systemic (disseminated disease; 25%) or central nervous system (CNS disease; 30%) infections associated with significant morbidity and mortality (6,7). Currently, the antiviral medication acyclovir is the standard therapy for all forms of neonatal HSV disease. Although this intervention has reduced mortality due to invasive disease, most survivors are left with permanent neurodevelopmental deficits (8,9).

The factors that predispose a neonate to invasive HSV infection are not entirely known. Recent studies have found that a portion of individuals outside of the neonatal period who experience HSV infection of the brain have a host genetic defect within the Toll-like receptor-3 (TLR3) pathway (10,11). HSV encephalitis outside of the neonatal period is uncommon, and its association with the rare host defects in the TLR3-pathway has not been examined. By contrast, half of HSV-infected neonates experience invasive CNS or disseminated disease, making it less likely that host genetic defects alone could account for all of the observed cases of invasive infection in neonates. Prior clinical data on mother-to-infant transmission of HSV indicate that most cases of neonatal disease, including invasive forms of disease, result from newly acquired HSV infection in the absence of maternal immune protection (2-4, 12).
This suggests a window of opportunity where the contributions of viral genetic variation to the progression of invasive infection and disease may be greater than in adults.

Prior studies have identified viral genetic factors that influence virulence or disease for reoviruses, influenza virus, HIV, and others (13-18). In contrast to these RNA viruses, HSV was presumed to have lower genetic diversity and potential for variation in virulence, due to its relatively stable DNA genome and long co-evolutionary history with humans (19). The assumption of limited HSV heterogeneity was supported by early studies that utilized low-resolution restriction fragment length polymorphism (RFLP) or single-gene analyses to compare multiple HSV isolates (20-22). However, Rosenthal and colleagues used RFLP and PCR analysis of a single locus to demonstrate that a heterogeneous HSV population can exist in an invasive neonatal infection, and provided proof of principle that natural genetic variation can impact neurovirulence (23,24). More recently, advances in high-throughput sequencing (HTSeq) have enabled a re-evaluation of herpesvirus genome-wide variation, which suggests that herpesviruses harbor extensive diversity both between strains or individuals (inter-host variation) as well as within a single individual (intra-host variation) (25-27). These minor genetic variants may become clinically important if a variant within the viral population becomes the new dominant allele or genotype as a result of a bottleneck at transmission, entry into a new body compartment, or selective pressure such as antiviral therapy (28,29). [...] infections (36-38).
HTSeq-based examination of vaccine-associated rashes due to the alpha-herpesvirus varicella zoster virus (VZV) demonstrated that adult skin vesicles contain a subset of the viral population introduced during vaccination, and found at least 11 VZV genomic loci that were linked to rash formation (26). Recent HTSeq comparisons of adult genital HSV-2 revealed the first evidence of a single individual shedding two distinct strains (39), demonstrated changes in the viral genome over time in a recently infected host (40), and provided the first evidence of ancient recombination between HSV-1 and HSV-2 (41,42). However, to date there has been no evaluation of genome-wide variation in neonatal HSV isolates to determine the levels of diversity in this population, or the potential impact(s) of viral genetic variants on disease.
Until recently, several technical barriers prevented thorough assessment of neonatal HSV genomes. A key constraint on studies of neonatal disease has been the availability of cultured viral samples associated with clinical information that have also been maintained in a low-passage state appropriate for sequencing and further experimental studies. Historically, viral culture was part of the HSV diagnostic workflow, but this has been superseded in clinical laboratory settings by the speed and sensitivity of viral detection by PCR (43,44,6). This change limits neonatal HSV sample availability for in vitro and animal model studies. In addition, many previously archived neonatal HSV isolates have been passaged extensively, allowing them to acquire mutations that enhance viral growth in culture (23,24). Additional challenges for HTSeq approaches to neonatal HSV include the large size of the viral genome (~152 kb), its high G+C content (~70%), and a large number of variable-number tandem repeats in the viral genome (>240 mini/micro-satellite repeats and >660 homopolymers of ≥6 base pairs) (45,27). Therefore, many studies of HSV diversity (21, 46-49) or the effect of HSV genetic variation on disease (50-52) have relied on low-resolution restriction fragment length polymorphism (RFLP) or single-gene PCR analyses, due to the speed and ease of analysis in comparison to whole-genome approaches (53-58). To overcome these challenges, we combined our expertise in HSV comparative genomics and phenotypic analysis (59, 53, 60, 58) with a unique resource of low-passage, well-annotated neonatal specimens (8,61,9).

Here we analyzed a set of ten low-passage clinical HSV-2 isolates collected from neonates with HSV infection, enrolled in one of two clinical studies that spanned three decades of patient enrollment (1981-2008) (8,61,9).
These samples represented a wide range of clinical manifestations including SEM, CNS, and disseminated disease, and each sample was associated with a robust set of de-identified clinical information. We defined the level of diversity in this population using comparative genomics and an array of cell-based phenotypic assays. We found that HSV-2 isolates displayed diverse in vitro phenotypes, as well as extensive inter- and intra-host diversity distributed throughout the HSV-2 genome. Finally, we found coding variations in several HSV-2 proteins associated with CNS disease. This study represents the first-ever application of comparative pathogen genomics to neonatal HSV disease and provides a basis for further exploration of genotype-phenotype links in this clinically vulnerable patient population.

Neonatal HSV-2 samples represent a diverse clinical population

We utilized samples collected from ten HSV-2-infected neonates enrolled by the National Institute of Allergy and Infectious Diseases Collaborative Antiviral Study Group (CASG) for clinical trials between 1981 and 2008 (8,61,9). These infants encompassed a range of clinical disease manifestations (see Table 1), with about half experiencing invasive CNS disease (5 patients) or disseminated (DISS) disease with CNS involvement (2 patients), and the remainder experiencing non-invasive SEM disease (3 patients). Extensive clinical information was available for each patient, including long-term neurocognitive and motor outcomes (Table 1). This population was also diverse with respect to sex, race, gestational age, and enrollment center (Table 1; enrollment center data not shown). All samples were collected at the time of diagnosis, prior to initiation of acyclovir therapy. Each isolate was cultured once as part of the diagnostic process, with expansion only for the experiments shown here. Although the sample size was constrained by the rarity of neonatal HSV infection and availability of appropriately maintained isolates, our group is similar in size to prior HTSeq comparisons of congenital HCMV samples (31,32,34,35), and is the largest group of neonatal HSV samples ever subjected to comparative genomic and phenotypic analysis.

Neonatal HSV-2 isolates have different fitness in culture

To determine whether the viruses isolated from this neonatal population (Table 1) were intrinsically different, we assessed viral growth in culture, which provides a consistent environment that is independent of host genetic variation. To minimize the impact of immune pressure we selected Vero monkey kidney cells, which lack an interferon response (62, 63). Each viral isolate was applied to a confluent monolayer of cells in vitro, and allowed to form plaques for 100 hours (Figure 1A). Average plaque size differed between isolates, with large plaques being more frequent among viruses derived from neonatal CNS disease (Figure 1B). Viruses isolated directly from the cerebrospinal fluid (CSF; isolates CNS11 and DISS14) produced plaques with an average size that was statistically larger than those isolated from the skin (one-way ANOVA followed by Holm-Sidak's multiple comparisons test, p<0.05). Plaque size was assessed throughout passage in culture and remained constant from the time the isolates were received in our laboratory (passage 2) through their genetic and phenotypic analysis (passage 4). The low-passage HSV-2 clinical isolate SD90e, which was isolated from an adult patient, was used as a control for comparison (64). The variance in plaque sizes produced by a given isolate was not statistically different between isolates (Figure 1B). The differences in average plaque size between isolates suggested that the HSV-2 populations found in each neonatal isolate are indeed intrinsically different.

Entry kinetics, DNA replication, protein expression, and virus production do not account for differences in plaque size

Plaque formation is a complex endpoint that involves the ability of the virus to enter the cell, replicate its double-stranded DNA genome, produce viral proteins, and assemble new virions that spread to adjacent cells. Therefore, we explored whether the differences in plaque formation observed in Vero cells reflected inherent differences in the ability of isolates to complete each of these stages of the viral life cycle. For these comparisons, two large-plaque-forming isolates associated with CNS disease were selected, including one isolated from the CSF (CNS11) and one from the skin (CNS03). These were compared to two small-plaque-forming isolates associated with either CNS (CNS12) or SEM disease (SEM02), both of which were isolated from the skin. First, we compared each isolate's rate of cell entry. Virus was applied to chilled cells, followed by warming to synchronize cell entry (Figure 2A). A low pH solution was applied at various points over the first hour of cell entry to inactivate any virus that had not yet entered a cell, and plaque formation was then allowed to proceed for 100 hours. We found no difference in rates of cell entry between these four representative viral isolates (Figure 2B), suggesting that large plaques did not result from increased rates of virus entry into cells.

We next infected Vero cells at high multiplicity of infection (MOI=5) to compare the outcome of a single round of viral replication. We found that all four isolates produced similar numbers of genome copies (as measured by qPCR for the gB gene; see Methods for details) (Figure 2C), indicating that differences in viral DNA replication did not influence plaque size. We quantified the production of infectious virus by counting plaque-forming units (PFU) on Vero cell monolayers, as well as on the highly permissive U2OS human bone osteosarcoma epithelial cell line (65).
No differences in virus production were noted between the four isolates when quantified on either cell type (Figure 2D, E). U2OS cells lack innate sensing of viral infection through the STING pathway (66) and can even support the growth of highly-defective isolates that lack ICP0 function (65). All isolates formed large plaques on the highly-permissive U2OS cell monolayers (Figure S1), allowing us to rule out the possibility that very small foci of infection were missed during titering of the small plaque-forming isolates on Vero cells (Figure 2D, E). Finally, we compared viral protein production for these isolates, and found no [...] immunofluorescence (Figure 3D, E, and Figure S2). Viral titers recovered from harvested cells [...] ANOVA followed by Tukey's multiple comparison test, p<0.05 at 72h). Together these data indicated that large-plaque-forming isolates associated with CNS disease shared an enhanced ability to spread cell-to-cell through culture, in comparison to small-plaque-forming isolates.

Comparative genomics reveals genetic diversity in neonatal HSV-2 isolates

The differences identified in cell-to-cell spread between neonatal isolates in culture indicated the existence of intrinsic differences between these viruses. To reveal how genetic variation may contribute to viral phenotypes in culture, and ultimately to clinical disease manifestations, we sequenced the complete viral genome of all ten neonatal HSV-2 isolates. For each isolate, we sequenced purified viral nucleocapsid DNA and assembled a consensus genome, which represents the most common genotype at each nucleotide locus in the viral population. The clinical trials utilized in this study enrolled HSV-infected infants from multiple sites across the United States (8,9,61). Therefore, we first assessed the overall degree of relatedness between these viral genomes to understand whether any similarities in viral or geographic origin might have contributed to in vitro or clinical phenotype patterns. In light of the known potential for recombination in the phylogenetic history of HSV (49,67,53,42), we used a graph-based network to investigate the phylogenetic relationship between these isolates. We found a similar degree of divergence among all ten neonatal HSV-2 consensus genomes (Figure 4A), which was further corroborated by their distribution in a network graph of available HSV-2 genomes from GenBank (Figure 4B, see Table S1 for HSV-2 GenBank accessions). This suggested that similarities in viral genetic origin were not responsible for determining the cellular or clinical outcomes of neonatal HSV-2 infection.

Overall protein-coding diversity in neonatal HSV-2 isolates is similar to that observed in adult HSV-2 strains

We next asked whether overt defects in any single HSV-2 protein might be associated with clinical or in vitro spread phenotypes. In examining the coding potential of all ten neonatal HSV-2 isolates, we found no protein deletions or truncations encoded by any of these viral genomes. These comparisons excluded the viral proteins ICP34.5 (RL1) and ICP4 (RS1), whose coding sequence is not fully determined in these isolates due to sequencing gaps and/or incomplete assembly at G+C-rich tandem repeats in these regions. Issues in sequencing and/or assembly of these two genes, as well as the nearby open reading frame (ORF) for ICP0 (RL2), have been observed in all prior HTSeq studies of HSV-2 genomes (68-70). For all other HSV-2 proteins, we found a total of 784 nucleotide differences in 71 genes (see Table S2 [...] Table S2). We also compared these neonatal genomes to the full set of 58 annotated adult HSV-2 genomes from GenBank (listed in Table S1). As expected from the difference in sample size, there was more overall diversity found across 58 adult HSV-2 genomes than in 10 neonatal genomes. A comparison of the dN/dS ratio in neonatal vs. adult HSV-2 genomes revealed a similar trend with a few outliers visible on each axis, e.g. UL38 and US8A had a higher dN/dS ratio in neonates than in adult isolates (see Table S2 and Figure S3 for full comparison). These data indicated that at the consensus level, neonatal HSV-2 isolates display substantial inter-host coding diversity spread throughout the genome, but do not possess strikingly more diversity or an excess of genetic drift as compared to adult isolates.
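The neonatal versus adult dN/dS comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function names and the per-protein counts are hypothetical placeholders, the ratio is taken as a simple count of nonsynonymous over synonymous differences, genomes with no synonymous differences are skipped because the ratio is undefined there, and the labeling threshold (a difference of at least 1 between group averages) follows the Figure S3 legend.

```python
from statistics import mean


def average_dn_ds(counts):
    """Average dN/dS over genomes; `counts` is a list of (dN, dS) tuples.

    Assumption: genomes with dS == 0 are skipped (ratio undefined).
    Returns None when no genome has a defined ratio.
    """
    ratios = [dn / ds for dn, ds in counts if ds > 0]
    return mean(ratios) if ratios else None


def flag_outliers(neonatal, adult, threshold=1.0):
    """Map each protein whose group averages differ by >= threshold
    to the group ('neonatal' or 'adult') with the higher average."""
    flagged = {}
    for protein in sorted(set(neonatal) & set(adult)):
        n = average_dn_ds(neonatal[protein])
        a = average_dn_ds(adult[protein])
        if n is None or a is None:
            continue  # no defined ratio in one of the groups
        if abs(n - a) >= threshold:
            flagged[protein] = "neonatal" if n > a else "adult"
    return flagged


# Placeholder counts invented for illustration only (not data from this study).
neonatal_counts = {"geneA": [(4, 1), (6, 2)], "geneB": [(1, 2), (1, 2)]}
adult_counts = {"geneA": [(1, 1), (2, 2)], "geneB": [(1, 2), (2, 4)]}

print(flag_outliers(neonatal_counts, adult_counts))
```

In this toy input only the placeholder `geneA` is flagged, because its neonatal average (3.5) exceeds its adult average (1.0) by more than the threshold.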

We next focused our attention on differences below the consensus level in each intra-host viral population. The amino acid (AA) variations described above exist in the consensus genomes of each isolate. Since viral replication creates a population of genomes, we next assessed whether minor allelic variants existed within the viral population of any neonatal HSV-2 isolate, thereby expanding the viral genetic diversity within each host. The significant depth of coverage from deep-sequencing of each isolate allowed us to search for minority variants at every nucleotide position of each genome. We defined a minor variant as any nucleotide allele (single nucleotide polymorphism, or SNP) or insertion/deletion (INDEL) with frequency below 50%, but above 2% (our limit of detection; see Methods for additional criteria). We found minor variants in the viral genome population of all 10 neonatal HSV-2 isolates, albeit to a different degree in each isolate (Figure 6A). In total, there were 1,821 minor variants, distributed across all genomic regions (Figure S4). For both SNPs and INDELs, intergenic minority variants outnumbered those in genes (genic), likely reflecting the higher selective pressures against unfavorable mutations in coding regions. The neonatal isolate DISS29 had 8-10-fold higher levels of minority variants than other neonatal isolates (Figures 6A and S4), and these variants were often present at a higher frequency or penetrance of the minor allele than observed in other neonatal isolates (Figures 6B and S4). We further examined the distribution of minor variants that occurred in genes, and found that nearly every HSV-2 protein harbored minority variants in at least one neonatal isolate (Figure 6C). Only UL3, UL11, UL35, and UL55 were completely devoid of minority variants. Three of these genes (UL3, UL11, UL35) were also devoid of AA variations at the consensus level (Figure 5).
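The frequency window that defines a minor variant (above 2% and below 50%) can be expressed as a small filter. The sketch below is illustrative only: the function names and read counts are hypothetical, and the additional criteria described in the Methods are not reproduced here.

```python
def allele_frequency(variant_reads, total_reads):
    """Fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def is_minor_variant(freq, lower=0.02, upper=0.50):
    """True when lower < freq < upper (2% detection limit, 50% consensus cutoff)."""
    return lower < freq < upper


# Hypothetical sites: (genome position, variant-supporting reads, total reads).
sites = [(1200, 5, 1000), (3400, 30, 1000), (9100, 480, 1000), (15000, 900, 1000)]

minor_positions = [
    pos for pos, var, total in sites
    if is_minor_variant(allele_frequency(var, total))
]
print(minor_positions)
```

With these placeholder counts, the sites at 0.5% (below the detection limit) and 90% (the consensus allele) are excluded, and the sites at 3% and 48% are retained.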
These data revealed the breadth of potential contributions of minority variants to neonatal HSV-2 biology, which could undergo selection over time or in specific niches.

Coding variations identified between neonatal HSV-2 isolates are associated with [...]

[...] (Table S3). Isolates collected from infants with CNS disease and disseminated involvement also shared a variant in the HSV-2 UL20 gene (P129L) (Figure 7). One variant in the HSV-2 UL24 gene (V93A) was shared only by the SEM isolates, with all CNS isolates containing a valine at this position (Figure 7). The sample size of these comparisons was constrained by the overall limits of neonatal HSV-2 availability. While CNS11 and DISS14 appeared together frequently in the variants listed in Figure 7, these viral genomes were not genetically similar overall at the consensus genome level (Figure 4), and these isolates harbored unique variants in several of the proteins that also had specific shared variants (e.g. gK, UL8, gG). This indicated that their similarities in these residues were not due to a common viral strain origin. Many of the coding variations that correlate with neonatal CNS disease phenotypes impact viral proteins known to modulate cell-to-cell spread (71-75) and/or contribute to neurovirulence in mouse models of CNS infection (73, 76-82) (Figure 7 and Table S2); however, their role in human disease has not yet been assessed.
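The idea of a variant found only in association with one disease group reduces to a set comparison. The following is a hedged sketch rather than the analysis performed here: the isolate names, variant labels, and the function name are invented placeholders, and how isolates with mixed presentations (for example, disseminated disease with CNS involvement) are grouped is left to the caller.

```python
def group_exclusive_variants(isolates, group):
    """Variants present in at least one isolate of `group`
    and absent from every isolate outside that group."""
    inside, outside = set(), set()
    for info in isolates.values():
        if info["disease"] == group:
            inside.update(info["variants"])
        else:
            outside.update(info["variants"])
    return inside - outside


# Placeholder isolates and amino acid variants, invented for illustration only.
isolates = {
    "isoA": {"disease": "CNS", "variants": {"geneX:A10T", "geneY:P5L"}},
    "isoB": {"disease": "CNS", "variants": {"geneX:A10T"}},
    "isoC": {"disease": "SEM", "variants": {"geneY:P5L", "geneZ:V3A"}},
}

print(sorted(group_exclusive_variants(isolates, "CNS")))
print(sorted(group_exclusive_variants(isolates, "SEM")))
```

With only ten isolates, a variant can be exclusive to one group by chance, so an output of this kind is a list of candidates for follow-up and not evidence of causation.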

Host factors have not been identified to explain the >50% of neonates experiencing invasive CNS or disseminated forms of HSV infection. There is growing evidence that most herpesviruses contain significant genetic variation, including HSV-1 and HSV-2 (53, 27, 56, 68-70). The potential contributions of viral genetic variation to clinical disease in neonates therefore warrants exploration. Here, we analyzed genetic and phenotypic diversity for HSV-2 isolated from ten neonatal patients spanning two clinical studies (8,61,9). We found that neonatal HSV-2 isolates exhibited diverse growth characteristics in culture, with isolates obtained from neonates with CNS disease being associated with larger average plaque size and enhanced viral spread through culture. Using comprehensive comparative genomics, we further demonstrated that these neonatal HSV-2 isolates contained extensive genetic diversity both within and between hosts. These data revealed several specific viral genetic variations that were associated with cases of CNS disease, in proteins known to contribute to cell-to-cell spread and/or neurovirulence in mouse models of CNS disease. Further studies are required to determine the impact of these variations on HSV-2 neurovirulence and progression to CNS disease.

Genomic comparison of these neonatal isolates revealed a wide range of genetic diversity. At the consensus genome level, which reflects the most common allele in each viral population, we found that coding differences between strains were as numerous as between previously described sets of adult HSV-2 isolates (Figure 5) (68-70). Furthermore, the specific genetic variations associated with neonatal CNS disease in our study can also be found in genital HSV-2 isolates which are not associated with CNS disease in adults.
This suggests that the dramatic differences in clinical manifestations following HSV-2 infection in neonates, with significantly higher rates of invasive CNS infection, are not due to unique neonatal HSV-2 strains. It is more likely that genetic variations occurring naturally in the adult HSV-2 population, when present in the specific context of neonatal infection, may confer neurovirulence and progression to CNS disease.

At the level of minority variants (MV), which represent rare alleles that exist within each intra-host viral population, we found that DISS29 harbored 8-10-fold and CNS15 harbored 3-4-fold more MV than other neonatal virus genomes (Figures 6 and S4). This could be indicative of a mixed viral population (e.g. a multi-strain infection) or of decreased polymerase fidelity (28, 27). Diversity in viral populations has also been observed in congenital HCMV infection (31, 32, 35, 34). However, this is the first time that evidence has been found for this level of viral population diversity with HSV-2. These minor genotypes may be selected or genetically isolated in particular niches (e.g. CSF), as observed in a comparison of VZV skin vesicles (26) [...]

This comparison of viral genotype to clinical phenotype revealed associations between neonatal CNS disease and several viral protein variants that may impact neurovirulence through modulation of cell-to-cell spread. Although the sample set in this proof-of-concept study is small, we observed potential patterns that warrant exploration in a larger dataset. It is important to acknowledge that there is limited availability of samples from neonatal infection, both due to the rarity of these infections, and the fragility and limited body size of the infected infants. These natural circumstances lead to minimal sample collection from infected neonates.
The finding that CNS-associated isolates exhibit enhanced spread between cells in culture, particularly those derived directly from the CSF, suggests that one or more of these variants could be functionally significant. Coding differences in viral proteins not known to contribute to neurovirulence were also found to be associated with neonatal CNS disease, and represent potential novel contributions to invasive infection. These promising results warrant exploration in a larger study, ideally with isolates from multiple time points and/or body sites from each infected infant. This would enable a better understanding of how overall viral genetic diversity contributes to neuroinvasion.

Viruses were collected from neonates enrolled in clinical studies (8, 9,

The genomes of all 10 neonatal HSV-2 isolates were combined with all annotated HSV-2 genomes available in GenBank (see Table S1 for full list; all derived from adults) and aligned using MAFFT (90). The genome-wide alignment used a trimmed genome format (lacking the terminal repeats) to avoid giving undue weight to these duplicated sequences. The MAFFT alignment was used to generate a NeighborNet phylogenetic network in SplitsTree with Uncorrected P distances (49,91,92). A diverse subset of ten adult HSV-2 isolates was selected for protein-level comparisons with the ten neonatal isolates (indicated in
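For readers unfamiliar with the distance measure named above, an uncorrected P distance is the proportion of aligned sites at which two sequences differ, with no correction for multiple substitutions at a site. The sketch below computes it directly from aligned strings. It is an illustration only and does not reproduce SplitsTree's implementation: the function names and toy sequences are hypothetical, and skipping columns that contain a gap or an N in either sequence is an assumption about ambiguity handling.

```python
def p_distance(seq_a, seq_b, skip_chars="-N"):
    """Uncorrected p-distance: differing sites / compared sites.

    Assumption: alignment columns where either sequence carries a gap
    or an N are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Pairwise p-distances for a {name: aligned_sequence} mapping."""
    names = list(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for x in names for y in names}


# Toy aligned sequences, invented for illustration only.
toy_alignment = {
    "strain1": "ACGTACGT",
    "strain2": "ACGTACGA",
    "strain3": "AC-TACGA",
}
matrix = distance_matrix(toy_alignment)
print(matrix[("strain1", "strain2")])
```

A matrix of this kind is the input a distance-based network method works from; here `strain1` and `strain2` differ at one of eight compared sites.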

The authors declare no competing interests.

Tables

Table 1: Clinical characteristics associated with HSV-2 isolates from ten patients.

[...] HSV-2 genomes (B), reveals a lack of geographic clustering and the wide genetic distribution of these unrelated isolates. Unlike HSV-1, HSV-2 genomes have been previously noted to lack geographic separation into clades. The network graph was created using SplitsTree4, from a MAFFT trimmed genome alignment. See Table S1 for complete list of accessions and geographic origins of adult HSV-2 strains.

[...] Table S1. The ratio of synonymous nucleotide differences in each open reading frame (ORF) was also compared to the number of nonsynonymous coding differences; these data are summarized in Table S2 and Figure S3. Proteins such as ICP27 (UL54) and US9 display an absence of coding variation in neonatal HSV-2 genomes (see also Figure S3).

More detailed information regarding cell-to-cell spread and neurovirulence can be found in Table S3.

[...] Connor J, Jacobs R, Nahmias A, Soong S-J, the National Institute of Allergy

The ratio of non-synonymous (dN) to synonymous (dS) coding variations was plotted for each HSV-2 protein. The x-axis value represents the average dN/dS ratio for each protein in 58 adult HSV-2 strains, while the y-axis value represents the average dN/dS ratio in 10 neonatal isolates. Proteins with a difference in average dN/dS ratios ≥1 in neonatal vs. adult HSV-2 genomes are labeled (green indicates a higher average dN/dS in neonatal HSV-2 genomes; red indicates a higher dN/dS in adult HSV-2 genomes). The average dN/dS ratios for all proteins are listed in Table S2.

For each neonatal HSV-2 isolate, the graph on the left plots spatial location in the genome (x-axis) against the frequency at which each minority variant was observed. The plot on the right summarizes the number of minor variants (y-axis height) in binned increments of 1% (x-axis). The data reveal the distinctly different distribution of minority variants in DISS29, and to a lesser extent CNS15, than in other isolates. Color code matches that used in Figure 6.

*Indicates 10 adult HSV-2 strains used for AA comparisons in Figure 5.

Glycoprotein at virion and cell surface; interacts with gB and UL20 (33); important for cytoplasmic envelopment, viral egress, and virus-induced cell fusion. gK null virus exhibits poor neuronal spread in culture including decreased retrograde and anterograde transport (34); reduced spread within eye and to nervous system, as well as increased survival, following ocular challenge in mice (35).

Membrane-associated protein interacts with gK and gB (33); required for gK glycosylation, cell surface expression (49), and virus-induced cell fusion (50); important for viral egress (51). UL20 null virus forms small plaques in culture (50); RNA inhibition of UL20 results in decreased rates of encephalitis following HSV footpad infection in mice (52).

* present as minority variant
^ Prediction by SMART Analysis