Long Persistence of a Streptococcus pneumoniae 23F Clone in a Cystic Fibrosis Patient

Streptococcus pneumoniae is a common resident in the human nasopharynx. However, carriage can result in severe diseases due to a unique repertoire of pathogenicity factors that are rare in closely related commensal streptococci. We investigated a penicillin-resistant S. pneumoniae clone of serotype 23F isolated from a cystic fibrosis patient on multiple occasions over an unusually long period of over 3 years that was present without causing disease. Genome comparisons revealed an apparent nonfunctional pneumococcus-specific gene encoding a hyaluronidase, supporting the view that this enzyme adds to the virulence potential of the bacterium. The 23F clone harbored unique mosaic genes encoding penicillin resistance determinants, the product of horizontal gene transfer involving the commensal S. mitis as donor species. Sequences identical to one such mosaic gene were identified in an S. mitis strain from the same patient, suggesting that in this case S. pneumoniae played the role of donor.

rates of over 80% are reported occasionally (4)(5)(6) and especially in developing countries (7,8). Carriage may lead to a variety of diseases, such as otitis media, pneumonia, septicemia, and meningitis, especially in young children, elderly people, and immunocompromised patients (for a review, see reference 9). In fact, pneumococcal infections cause more deaths than other infectious diseases worldwide. The pathogenic potential distinguishes S. pneumoniae from other members of the group of viridans streptococci (10).
Numerous virulence factors of S. pneumoniae have been described (for a review, see references 11 and 12), but most of them are also present in the closely related commensal Streptococcus species S. mitis, S. pseudopneumoniae, and S. oralis (13)(14)(15). Typical for S. pneumoniae is the polysaccharide capsule, which is crucial for the pathogenicity of this species. Over 90 capsular types have been reported, based on biochemical and genetic analyses (16), and the potential to cause disease depends on the serotype (ST) (17). Before introduction of pneumococcal conjugate vaccines (PCVs), only a few serotypes accounted for the majority of invasive diseases, including types 4, 6B, 9V, 14, 18C, 19F, and 23F. The prevalence of serotypes changed after introduction of the first seven-valent conjugated vaccine (PCV7) in 2000, which covered serotypes 4, 6B, 9V, 14, 18C, 19F, and 23F, followed later by PSV10, which also included serotypes 1, 5, and 7F, and by PCV13, with the additional serotypes 3, 6A, and 19A. Vaccination was accompanied by the appearance of antibiotic-resistant clones expressing nonvaccine serotypes (18). Other important virulence factors present in most S. pneumoniae strains are the pneumolysin Ply, choline-binding proteins (CBPs), which include the autolysin LytA as well as the variable CBPs PspA, PspC, and PspA, and the hyaluronidase HlyA (11,12).
Due to its ability for genetic transformation, the genomes of S. pneumoniae isolates are highly diverse and include a large accessory genome. The increasing number of available genome sequences has provided an insight into the astounding repertoire of genes available in the pan-genome of pneumococcus (19,20). The current standard for the definition of clones is based on comparative sequence analysis of housekeeping genes, which are part of the core genome common to all strains of the species; the methods is termed multilocus sequence typing (MLST) (21). The Streptococcus pneumoniae MLST database (https://pubmlst.org/spneumoniae) listed 13,126 STs in February 2017. Different capsular serotypes may be found within one ST due to capsule switching (22,23). Genomes of an identical ST may vary considerably in their accessory genome content (24).
We report here on a rare clone of serotype 23F S. pneumoniae representing isolates with intermediate penicillin resistance which have been collected over a period of over 3 years from a patient in Berlin, Germany, with cystic fibrosis (CF); presence of the clone was not associated with disease. The genome sequences of three isolates of the same clone, including one isolate obtained from a different hospital in Germany, were used for comparative analysis of penicillin resistance determinants, the penicillin-binding proteins PBP2x, PBP1a, and PBP2b, and the main pneumococcal virulence factors.

RESULTS
Twenty-nine S. pneumoniae isolates were obtained between 1992 and 1995 from the Wannsee-Lungenklinik-Heckeshorn in Berlin. Seven of these isolates were recovered from one CF patient over a period of 37 months and were not associated with disease ( Table 1). All seven strains expressed serotype 23F and showed identical antibiotic resistance patterns that were distinct from patterns of the other 22 strains (data not shown). MLST revealed that these seven isolates were members of the same clone of a new ST, ST10523. Screening of our strain collection detected another member of ST10523, strain D219, which was isolated in Leipzig, Germany, in 1989 (25). In order to see whether special virulence factors are associated with this clone and whether the isolates from the CF patient differed from D219, the genomes of the first (D122) and last (D141) isolate from Berlin and of D219 from Jena were sequenced.
Antibiotic resistance and penicillin-binding proteins. All seven ST10523 strains had intermediate resistance to beta-lactam antibiotics but were sensitive to tetracy-cline, erythromycin, and chloramphenicol (Table 1). Since PBP2x, PBP2b, and PBP1a play key roles in penicillin resistance, these genes were analyzed in detail to see whether they are related to PBP alleles of other penicillin-resistant S. pneumoniae (PRSP) strains. The PBP sequences of D219, D122, and D141 were identical and contained sequence blocks that diverged from PBP genes of the penicillin-sensitive R6 laboratory strain bỹ 20%. They did not match any other PBP sequences in the NCBI database except for pbp1a (see below). Interestingly, the genomic regions containing pbp2x (spr302 to spr307; yllC to clpL) and pbp2b (spr1513 to spr1517; mutT to pbp2b) contained a significantly larger amount of single nucleotide polymorphisms (SNPs; Ͼ5.3%) in the ST10523 strains than in the entire genome of the R6 strain, excluding variable genes (Ͻ0.7%), indicating transfer of large sequence blocks flanking the PBP genes. Similar observations have been reported previously (26). In contrast, flanking genes of pbp1a showed no signs of recombination events.
The PBP2x gene of ST10523 has a complex mosaic structure ( Fig. 1). It contains sequence blocks that are highly related (Ͻ3% difference on the DNA level) to pbp2x of putative penicillin-sensitive donor S. mitis strains M3, SV01, and NCTC10712 (27), in addition to other divergent sequences of unknown origin (Fig. 1A). The deduced protein sequence included mutations A 338 and E 552 , which are known to confer resistance to beta-lactams; both amino acid changes are frequent in clinical PRSP isolates. The only other 2 amino acids that did not match any of the sensitive reference S. mitis strains were T 410 , present in pbp2x of low-level-resistant viridans streptococci, and T 434 , which occurs in some penicillin-sensitive strains (Fig. 1B) (27).
Furthermore, we obtained pbp2x sequences from four commensal streptococcus isolates from the same patient, isolates D122 and D141 ( Table 1). The PBP2x genes of S. mitis strains B8 and B10 were identical and were distinct from pbp2x of S. oralis B11 and that of S. pneumoniae ST10523. The gene pbp2x B11 contained a central sequence block almost identical to that of Spain 23F -1, and 3=-sequences were related to S. oralis ATCC 35307 (Fig. 1A). BLAST searches revealed one Romanian isolate, HMC3243, of unknown serotype (28) which was partially identical to D219 up to codon 517, whereas the C-terminal part represented R6 sequences ( Fig. 1A and B). Interestingly, pbp2x of S. mitis B93-4 contained a 284-nucleotide (nt) sequence block almost identical to PBP2x D219 (codons 284 to 377), including two SNPs resulting in 1 amino acid change, D 368 N, suggesting that this block was acquired from S. pneumoniae ST10523 (Fig. 1A). The flanking 5=-sequence block was identical to S. oralis B11, and the 3=-end was identical to the S. pneumoniae R6 sequence, documenting multiple interspecies gene transfer events.
The internal mosaic blocks of ST10523 pbp2b and pbp1a were highly related to the respective genes in S. mitis NCTC10712 (see Fig. S1 in the supplemental material). PBP2b contained the mutation A 446 close to the conserved S 443 SN motif, which mediates low resistance levels (29) and which is present in most penicillin-resistant isolates. Moreover, it had one more amino acid, Y 430 , that resulted in a deduced protein of 681 residues (Fig. S1). Regarding PBP1a of ST10523, the change of four consecutive residues, T 574 SQF to NTGY, has been associated with penicillin resistance, but the mutation A 371 within the active site motif S 470 TMK, which also has been implicated in penicillin resistance, was not present (30)(31)(32). The PBP1a gene of ST10523 was identical to that of S. pneumoniae strain HMC3243 (Fig. S1) except for three silent SNPs. In contrast, the mosaic structure of pbp2b was entirely different. This clearly indicated that the three PBP genes were acquired from different sources or occasions in S. pneumoniae HMC3243 compared to ST10523. In summary, all three PBPs contained amino acid mutations that have been associated with the resistance phenotype corresponding to the intermediate penicillin resistance of ST10523.
Genomic comparison of S. pneumoniae genomes. Since ST10523 is a new sequence type, the 2,050,063-nt draft genome of strain D219 was first compared to the R6 genome. Genes with no match in R6 were then used in a BLAST search of the NCBI database. Excluding transposases, the genome of D219 differed from the R6 genome bỹ 8%, including parts of a bacteriocin cluster (SPND219_00557 to SPND219_00567), the cps biosynthesis cluster encoding the 23F capsule (SPND219_00380 to SPND219_00398), and two clusters encoding phage-related genes (SPND219_00003 to SPND219_00023 [phage relict] and SPND219_01526 to SPND219_01585 [prophage]). This percentage corresponds to data obtained in comparative genomic hybridization on an oligonucleotide microarray representing the TIGR4 genome (33). Large parts of the phage relict were present in several genomes of S. pneumoniae. The prophage shows high similarity to S. pneumoniae phage 040922 (GenBank accession number FR671406), which is associated with a Tn916-like element in one S. pneumoniae strain, 18C/3 (34). The prophage contains two large genes (fragments SPND219_01501-3 and SPND219_01535) that encode the surface-expressed tail fiber PblB and the tape measure protein PblA. Homologues of PblA and PblB of S. mitis phage SM1 have been shown to be involved in the platelet-binding activity of S. mitis SF100 (35) and for its virulence in an animal model of infective endocarditis (36). No genes exclusively carried by ST10523 could be detected in BLAST searches.
Between the genome of D219 on one hand and D122 and D141 on the other hand, little difference regarding gene content was noted. The only exceptions were the phage relict and the prophage mentioned above, which were absent in D122 and D141. One gene cluster, SPND122_00705 to SPND122_00709 and SPND141_00707 to SPND141_00711 related to the R6 genes spr0623 to spr0627 (ABC transporter, lactate monooxygenase, 2-lysyl-tRNA synthetase) were not found in D219. However, since these genes are all located on small contigs, including repeat elements such as BOX and RUP (37,38), which result in problems during the genome assembly process, verification of their absence in D219 will require further analyses.
The genomes of D122 and D141 differed from each other by approximately 0.01% SNPs in 49 genes, less than that found in D219 (0.02 to 0.024% affecting a total of 178 genes in D219), i.e., the isolates from the patient are more closely related to each other than to D219, in agreement with their distinct place of isolation. One clear difference was observed in the gene encoding the two-component sensor kinase HK07 (39), which is not essential in S. pneumoniae (40). It was intact in strains D122 and D219 (SPND122_00180 and SPND219_00200), whereas an ISL3 family transposase fragment was inserted into the D141 gene, resulting in three incomplete gene fragments, SPND141_00182 to SPND141_00184.
Virulence genes in ST10523. All genes encoding the main S. pneumoniae specific virulence factors lytA, ply, Nana, and nanB and the CBPs pspC, pspA, and pcpC were present in the three ST10523 genomes. The deduced protein sequences were identical to each other and identical or highly similar to R6 proteins except for pspA, which represented a distinct genetic variant in ST10523. Differences in the repeat regions in genes encoding CBPs were not considered, since they are most likely caused by assembly problems of the choline-binding repeat regions. Other genes encoding surface proteins of important biological function, pavA and, for the IgA proteases, zmpB and zmpA were also present and identical in all three strains, except for the IgA protease, which contained gaps in the sequence of D219 and two SNPs plus a single-nucleotide deletion in D122, resulting in a premature stop codon. No pilus cluster was detectable. The capsule cluster SPND219_00380 to SPND219_00394 differed from that of S. pneumoniae strain Spain 23F -1 by 20 SNPs, as expected for genes of distinct clonal origins. The last four-gene rml operon encoding enzymes involved in dTDP-rhamnose synthesis and which is common to several serotypes (41) included two short divergent regions in rmlB identical to rmlB of many serotype 19F strains.
However, all three ST10523 genomes contained significant differences in hlyA, which encodes the hyaluronidase (SPND219_00348, SPND122_00329, and SPND141_00331) and which are unusual in other S. pneumoniae genomes. Within the HlyA gene, a 4-bp deletion corresponding to the position 128 nt downstream of the putative ATG start codon of R6 resulted in a premature stop codon. Moreover, a 12-nt deletion within the promoter region 10 nt upstream of the ATG start codon was present (Fig. 2). This strongly suggests that no functional hyaluronidase is expressed in ST10523 strains. Searches in the whole-genome contig NCBI database revealed another six genomes which contained this peculiarity ( Table 2), but they differed by up to four SNPs from the D219 region shown in Fig. 2.

DISCUSSION
Prolonged carriage of the same S. pneumoniae clone for a period of 37 months, as observed for serotype 23F isolates obtained from a CF patient in our study, is unusual.  S. pneumoniae is not considered a persistent colonizer in CF patients, unlike Pseudomonas aeruginosa and Staphylococcus aureus (42)(43)(44). Long-term persistence has been reported for Staphylococcus aureus for up to 70 months (45). The duration of pneumococcal carriage in healthy children is a few weeks, ranging from 2 days and in rare cases up to 6 to 12 months, depending on the age of the carrier and the serotype of the isolate (2,3,(46)(47)(48). Some serogroups, including serotype 23F, are generally carried for longer periods than other serogroups (46,49) and have a low propensity to cause invasive disease (50). Interestingly, an inverse relationship between the attack rate of a given capsular serotype and its duration of carriage has been noted (3). Carriage rates for S. pneumoniae isolated from CF patients are similar, with colonization rates ranging between 3 and 20% (51)(52)(53)(54)(55)(56). No special serotypes appear to be associated with CF (57), but some serotypes may be more common, depending on the geographic area. Serotype 23F isolates were prevalent in a CF unit in Madrid, mainly due to the clone Spain 23F -1 of a varied multiresistance phenotype (52). Serotype 3 prevailed in another study which reported no 23F serotype isolates in a CF center in Rome, probably because all patients had received vaccination (58). In both studies, S. pneumoniae was recovered more than once from some patients. However, only three strains of serotype 23F isolated over a period of 3 months showed identical SmaI restriction patterns, revealed by pulsed-field gel electrophoresis (PFGE), and the same antibiotic resistance profile, and thus most likely represented members of the same clone (52). Three patients carried S. pneumoniae with the same serotype and identical SmaI PFGE pattern for 1 to 8 months (58). In these cases, S. pneumoniae was considered a colonizer, since at the time of isolation the patients showed no evidence of pulmonary exacerbation.
Genomic comparisons showed that the two strains, D122 and D141, from the CF patient are more closely related to each other than to D219, which was isolated 3 years before from a different geographical site. They differed from D219 by the absence of two large gene clusters encoding a prophage and a phage relict, by the presence of a five-gene cluster, including an amino acid ABC transporter, and by the estimated number of SNPs. The prophage carries two genes encoding large proteins PblA and PblB. Homologues of these proteins are frequent in S. pneumoniae phages (59) and have been shown to play a role in adhesion and virulence in S. mitis (35,36). It these proteins play a similar role in S. pneumoniae, it is conceivable that their absence in D122 and D141 supports extended carriage.
The variation between D122 and D141 concerned only 49 genes, and D141 contained an insertion of an ISL3 family transposase fragment into the gene encoding the histidine protein kinase HK07, which was absent in the D122 gene. This element was present at another three sites in the D141 genome and at one site in the D122 genome, whereas it could not be detected in D219. Other studies have supported little genomic variation during carriage of the same S. pneumoniae clone. Minimal variation was observed during carriage established experimentally with a single serotype 6B strain of ST138 (60). The maximum SNP distance between any of the 229 isolates obtained over a period of 35 days versus the reference strain was three SNPs (60). It should be noted that the genomic comparison between two isolates of strain D39, a historically important serotype 2 isolate from the early 1940s (61) and which have been cultivated separately for at least 21 years, revealed only five mutations (62). Similarly, some strains isolated during a 7-month period from a child with chronic pneumococcal infection varied by only Յ30 SNPs (63). However, those authors noted there were also multiple events of horizontal gene transfer in some strains, which most likely occurred during polyclonal infection. In contrast, we saw no evidence of gene acquisition in D141 versus D122, and the S. mitis strain B93-4 contained a pbp2x fragment identical to pbp2x of ST10523. The mosaic PBP2x and PBP2b genes of ST10523 represent new gene variants and are distinct from all others found in the NCBI database. Therefore, this finding indicates interspecies gene transfer from S. pneumoniae to S. mitis in the same host.
No genes specifically associated with ST10523 genomes were identified. This is not astounding, given the vast number of actually available genomes of S. pneumoniae.
Based on a pan-genome analysis of 158 S. pneumoniae genomes, it has been predicted that only 0.3 new genes will be discovered in a new genome if a data set from 1,000 genomes is already available, and only 0.06 new genes will be discovered from 5,000 genomes (24). However, an unusual hyaluronidase gene, hlyA, was present in the ST10523 genomes which contained a stop codon (Fig. 2) distinct from that described in a serotype 3 clone, ST180, where an SNP at position 376 of the hyaluronidase coding sequence resulted in a stop codon and truncation of the protein after 125 amino acids (24). Moreover, a 12-nt deletion within the promoter region 10 nt upstream of the ATG was detected. Hyaluronidase is produced by almost all clinical isolates of pneumococci (64). It is one of the genes which have not been found in closely related viridans streptococci, except for some S. oralis isolates, but not in S. mitis strains (65), i.e., it represents a typical component of the species S. pneumoniae. The enzyme depolymerizes hyaluronic acid, which is an important component of the host connective tissue and extracellular matrix (66). No significant impact on virulence was found in a mouse intraperitoneal infection model when a single hlyA mutant was used (67); however, when a double mutant that was also deficient for the pneumolysin gene ply was tested, virulence was significantly decreased (68). Hyaluronidase significantly potentiated pneumolysin-mediated ciliary slowing and epithelial damage in an in vitro model, suggesting that its presence favors colonization and subsequently extrapulmonary dissemination of the pneumococcus (69). It is therefore tempting to assume that the lack of a functional hyaluronidase predestines ST10523 to survive within the host for an extended period without causing disease. In conclusion, the unusually long carriage rate observed for S. pneumoniae isolates D122 and D141 might not be related to mutations or genetic variants acquired during persistence in the human host, but rather to the loss of a large prophage carrying potential virulence factors and to the absence of a complete hyaluronidase gene.

MATERIALS AND METHODS
Bacterial strains and growth conditions. S. pneumoniae strains D122 to D141, S. mitis B8, B93-4, and B10, and S. oralis B11 ( Table 1) were part of a strain collection obtained from the Wannsee-Lungenklinik-Heckeshorn in Berlin; strain D219, isolated in Leipzig, has been described elsewhere (25). Strains were grown at 37°C without aeration in complex C medium (70) supplemented with 0.1% yeast extract (C ϩ Y). MICs were determined by the agar dilution method (for beta-lactams) or by using E-test strips for all other antibiotics (Oxoid GmbH, Basingstoke, United Kingdom) on D-agar plates supplemented with 3% sheep blood (71).
DNA sequencing and analysis. Chromosomal DNA from streptococci was isolated as described previously (72). Internal sequences of the seven housekeeping genes were obtained with primers described on the Streptococcus pneumoniae MLST homepage (https://pubmlst.org/spneumoniae). PBP2x gene fragments were amplified with the primers pn2xup and pn2xdown (72), and direct sequencing of PCR products was performed with consecutive primers. PCR products were purified using a JetQuick DNA purification kit (GenoMed). PCRs were performed using either Goldstar Red Taq polymerase (Eurogentec) or DreamTaq polymerase (Fermentas), according to the manufacturer instructions. The genomes of D219, D122, and D144 were sequenced using a 454 Life Sciences FLX sequencer, and reads were assembled by the 454 Newbler Assembler version 2.6. Contigs were aligned to the S. pneumoniae R6 genome sequence (73). The rapid annotation subsystem technology (RAST) server (74) designed for annotation of bacterial and archaeal genomes was applied to obtain EMBL-formatted files containing protein, tRNA, and rRNA annotations from a large set of several output formats.
DNA analysis and bioinformatic tools. For the analysis of SNPs in the three ST10523 genomes, only sequences that were 350 nt from contig ends were included, to avoid potential errors generated by the 454-generated sequences. Individual open reading frames were investigated manually and compared with other genome sequences by using BLAST analyses and the NCBI database (nucleotides and whole-genome contigs). As a reference for the serotype 23F capsule cluster, genes of strain ATCC (Spain 23F -1) were used (75). Alignments were prepared using Clustal X2 (76). Codon sites were included manually and trimmed by using the program Clustal Formatter 3 (http://nbc11.biologie.uni-kl.de/ sequence_analysis/ClustalFormatter3/documentation.html) to reveal only sites that differ from the reference sequence shown in Fig. 1B.