mSphere
Research Article | Host-Microbe Biology

Assessment of In Vitro and In Silico Protocols for Sequence-Based Characterization of the Human Vaginal Microbiome

Luisa W. Hugerth, Marcela Pereira, Yinghua Zha, Maike Seifert, Vilde Kaldhusdal, Fredrik Boulund, Maria C. Krog, Zahra Bashir, Marica Hamsten, Emma Fransson, Henriette Svarre Nielsen, Ina Schuppe-Koistinen, Lars Engstrand
Lifeng Zhu, Editor
Author affiliations:
  • a: Centre for Translational Microbiome Research, Department of Microbiology, Tumour and Cell Biology, Science for Life Laboratory, Karolinska Institutet, Solna, Sweden (Luisa W. Hugerth, Marcela Pereira, Yinghua Zha, Maike Seifert, Fredrik Boulund, Marica Hamsten, Emma Fransson, Ina Schuppe-Koistinen, Lars Engstrand)
  • b: Department of Medicine Solna, Division of Infectious Diseases, Karolinska University Hospital, Center for Molecular Medicine, Karolinska Institutet, Solna, Sweden (Vilde Kaldhusdal)
  • c: The Recurrent Pregnancy Loss Unit, Capital Region of Denmark, Rigshospitalet and Hvidovre Hospital, Copenhagen, Denmark (Maria C. Krog, Zahra Bashir, Henriette Svarre Nielsen)
  • d: Department of Clinical Immunology, Copenhagen University Hospital, Copenhagen, Denmark (Maria C. Krog)
  • e: Department of Obstetrics and Gynaecology, Holbæk Hospital, Holbæk, Denmark (Zahra Bashir)
  • f: Department of Women’s and Children’s Health, Uppsala University, Uppsala, Sweden (Emma Fransson)
  • g: Department of Obstetrics and Gynecology, Hvidovre Hospital, Copenhagen, Denmark (Henriette Svarre Nielsen)
Editor: Lifeng Zhu, Nanjing Normal University
DOI: 10.1128/mSphere.00448-20

This article has a correction. Please see:

  • Erratum for Hugerth et al., “Assessment of In Vitro and In Silico Protocols for Sequence-Based Characterization of the Human Vaginal Microbiome”
    - December 23, 2020

ABSTRACT

The vaginal microbiome has been connected to a wide range of health outcomes. This has led to a thriving research environment but also to the use of conflicting methodologies to study its microbial composition. Here, we systematically assessed best practices for the sequencing-based characterization of the human vaginal microbiome. As far as 16S rRNA gene sequencing is concerned, the V1-V3 region performed best in silico, but limitations of current sequencing technologies meant that the V3-V4 region performed equally well. Both approaches presented very good agreement with qPCR quantification of key taxa, provided that an appropriate bioinformatic pipeline was used. Shotgun metagenomic sequencing presents an interesting alternative to 16S rRNA gene amplification and sequencing but requires deeper sequencing and more bioinformatic expertise and infrastructure. We assessed different tools for the removal of host reads and the taxonomic annotation of metagenomic reads, including a new, easy-to-build and -use reference database of vaginal taxa. This curated database performed as well as the best-performing previously published strategies. Despite the many advantages of shotgun sequencing, none of the shotgun approaches assessed here agreed with the qPCR data as well as the 16S rRNA gene sequencing.

IMPORTANCE The vaginal microbiome has been connected to various aspects of host health, including susceptibility to sexually transmitted infections as well as gynecological cancers and pregnancy outcomes. This has led to a thriving research environment but also to conflicting available methodologies, including many studies that do not report their molecular biological and bioinformatic methods in sufficient detail to be considered reproducible. This can lead to conflicting messages and delay progress from descriptive to intervention studies. By systematically assessing best practices for the characterization of the human vaginal microbiome, this study will enable past studies to be assessed more critically and assist future studies in the selection of appropriate methods for their specific research questions.

INTRODUCTION

The human vaginal microbiome plays a key role in maintaining the gynecological health of women of reproductive age. Estrogen is responsible for the cyclic maturation of the vaginal epithelium and the deposition of glycogen in vaginal epithelial cells (1). Shed glycogen-rich cells are an excellent carbon source for lactic acid bacteria (2). Lactic acid lowers the local pH and has bactericidal and immune regulatory effects (3). In addition to keeping bacterial balance and preventing bacterial vaginosis (BV) and aerobic vaginitis (AV) (4), the vaginal microbiome has been shown to play a protective role against infections with viruses such as human papillomavirus (HPV) (5), herpes simplex virus 2 (HSV-2) (6), and human immunodeficiency virus (HIV) (7). The vaginal microbiome might also be protective against adverse pregnancy outcomes, such as early miscarriage (8) and preterm birth (9), as well as gynecological cancers (10).

In clinical practice, the diagnosis of bacterial vaginosis is often based on experienced vaginal symptoms and pH testing, sometimes combined with a visual assessment of a vaginal smear wet mount under microscopy. Systems such as the Amsel criteria (11) and Nugent scoring (12) have been developed to assist in this assessment but are low resolution and low throughput. In research settings, by contrast, it has become standard to sequence part of the 16S rRNA gene to characterize the vaginal microbiome. However, no consensus exists in this field for experimental or bioinformatic best practices, with different studies (sometimes within the same research group) focusing on different variable regions of the 16S rRNA gene (Table 1) (13–20).

TABLE 1

List of primer pairs considered for in silico analysis, including region, sequence, and citation

While extensive work has been published assessing best practices for characterizing free-living bacterial communities (21) or human-associated microbes as a whole (15), these findings are not directly translatable to the human vaginal microbiome for a few reasons. First, clinically important species such as Mycoplasma genitalium and Chlamydia trachomatis have an unusual pattern of substitutions in their rRNA genes, meaning that optimizing for a broad taxonomic range might have the unwanted effect of missing these species. Even more importantly, the 16S rRNA gene is generally regarded to provide taxonomic resolution only down to the genus level (22). However, for the human vaginal microbiome, distinguishing between different Lactobacillus species is crucial, since, e.g., Lactobacillus crispatus often plays a protective role not exerted by Lactobacillus iners (5, 7, 23).

One way to bypass the tradeoffs involved in selecting a PCR primer set is to perform full metagenomic shotgun sequencing. This approach presents several advantages and some serious challenges. Among the advantages of metagenomics is the possibility of going deeper than species-level classification, including identifying strains and specific genes. Recent work applying metagenomics to a large set of vaginal samples has identified extensive intraspecies variation in several important taxa, such as various Lactobacillus species, Gardnerella vaginalis, and Atopobium vaginae (24). It is also known that the degree of stability of the vaginal microbiome can be quite different between individuals (25). Together, intraspecies variation and variable stability make subspecies resolution necessary to explain why certain microbiomes are more resilient than others.

While all of the methods described above can broadly assess a wide range of taxa, they are only semiquantitative and may introduce different biases at the library preparation and bioinformatic steps. To systematically assess the effect of different variable regions, different bioinformatic approaches, and different taxonomic annotation pipelines on the observed microbial profile of human vaginal samples, we have attempted to identify all primer pairs used in published human vaginal microbiome studies in the past decade. Each of these primers was assessed in silico for taxonomic coverage and annotation accuracy. Different annotation schemes were used for each primer pair. The pairs with the best performance were taken into the lab and used to amplify the same set of samples. Furthermore, shotgun metagenomic sequencing was applied to each of these samples as well. This way, we can directly compare the results between primer sets and sequencing strategies.

The gold standard for quantifying specific organisms is still qPCR, a fully quantitative method. Here, we performed qPCR on three key vaginal taxa (Lactobacillus crispatus, Lactobacillus iners, and Gardnerella vaginalis) to provide a ground truth against which each of the other methods could be assessed. The results described here can guide the implementation of future vaginal microbiome studies and provide valuable information for the comparison of previous studies which have used diverging methods. A summary of all parameters assessed is presented in Table 2.

TABLE 2

Summary of the analyses presented in this work, including parameters varied and where to find the relevant results

RESULTS AND DISCUSSION

Coverage of each primer. To assess how well each primer sequence or primer combination covers potential vaginal taxa, all sequences matching each primer or primer combination were extracted from the database with regular expressions allowing only exact matches to the full length of any variant of each degenerate primer. A problem for the 27f primer variants is that many sequences in the database are incomplete at their 5′ ends, which makes this assessment impossible. The same was not true at the 3′ end: the coverage for this region does not wane until after the V8 region, so it did not affect the assessment of any primers. The total coverage of each primer is depicted in Table 3, and coverage for primer pairs is shown in Table 4. Pair 967-1061 performed much more poorly than the remainder, with the exception of the 27f primers, which could not be properly assessed.
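The exact-match approach described above can be illustrated with a minimal Python sketch. It is not the authors' script: the function names and the toy sequences are hypothetical, and strand orientation handling (reverse-complementing reverse primers where needed) is left out for brevity. Each IUPAC degeneracy code is expanded into a regular expression character class, so only full-length exact matches to some variant of the degenerate primer count as covered.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def primer_to_regex(primer):
    """Compile a degenerate primer into a regex matching any exact variant."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))

def coverage(primer, sequences):
    """Fraction of sequences containing an exact full-length primer match."""
    pattern = primer_to_regex(primer)
    hits = sum(1 for seq in sequences if pattern.search(seq.upper()))
    return hits / len(sequences) if sequences else 0.0

# Toy sequences (hypothetical, not taken from the study's database)
toy_db = [
    "AAGTGCCAGCAGCCGCGGTAATT",  # C at the degenerate position: covered
    "AAGTGCCAGCAGCTGCGGTAATT",  # T at the degenerate position: covered
    "AAGTGCCAGCAGCAGCGGTAATT",  # A at the degenerate position: missed
]
primer_534r = "GTGCCAGCAGCYGCGGTAA"
print(f"coverage: {coverage(primer_534r, toy_db):.2f}")
```

Because no mismatches are tolerated, this kind of estimate is conservative: a primer may still amplify templates with one mismatch in vitro.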

TABLE 3

Coverage of each primer assessed individually

TABLE 4

Coverage of primers in relevant pairs

FIG S1

Effect of error correction/clustering strategy on the estimated alpha-diversity of samples based on different metrics. Simpson’s and Shannon’s diversity scores are calculated either on the full data set or on the data set with exclusion of low-abundance ASV/OTU. Chao1 and ACE richness metrics should be calculated only on the original data set and are therefore presented only in this way. All observed effects are much larger for concatenated reads than for merged reads.
Copyright © 2020 Hugerth et al.

This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

FIG S2

Taxonomic profile of each amplicon sample, with different primer sets and different annotation databases. The reproducibility within triplicates is very high. There is good agreement between SILVA and RDP, but GTDB assigns a very large fraction of reads to Bifidobacterium.

TABLE S3

Coverage of each vaginotropic genus by each primer pair combination (in percentage points).

In addition to covering a large percentage of all sequences, it is important that primers avoid taxonomic bias. The taxonomic coverage of each primer pair is depicted in Table S3. Three of the genera that are mostly missed are Propionibacterium, Chlamydia, and Mycoplasma. Propionibacterium is well covered by 341f-805r and possibly 27f-338r. These same pairs perform well with Mycoplasma, but only the former also covers Chlamydia. To add Chlamydia coverage to the 27f pool, one extra degeneracy has to be added to the reverse primer, making it either 515r 6× 5′-GTGBCAGCMGCCGCGGTAA-3′ + 5′-GTGCCAGCAGCTGCGGTAA-3′ or 534r 5′-GTGCCAGCAGCYGCGGTAA-3′. Figure 1 shows a heat map with the taxonomic coverage of each primer pair, assuming a match of the 27f primers, for which an assessment was impossible.

FIG 1

Heat map of taxon coverage at the genus level for commonly used primer pairs. Each column represents a primer pair, and each row depicts a vaginotropic genus. The percentage of sequences in each genus covered by each primer is indicated through a color scale. Most previously published work uses 16S primers with good coverage, but a few genera remain a problem, such as Chlamydia and Sneathia.

Importantly, primer amplification bias goes beyond entirely missing certain clades. G+C-rich templates might perform differently than those rich in A+T (26), and taxa with longer variants might not be detected as efficiently as others with a shorter variable region (27). These biases are compounded by the exponential nature of PCR amplification. A single copy amplified with an efficiency of 1.9× per cycle will, after 30 cycles, appear to be 5 times less abundant than one amplified at perfect efficiency.
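The arithmetic behind this statement can be checked with a short sketch. The function name is hypothetical; the only inputs are the per-cycle amplification factors (2.0 for perfect doubling, 1.9 for the less efficient template) and the cycle number quoted above.

```python
def fold_underrepresentation(efficiency, cycles, perfect=2.0):
    """Ratio of product from a perfectly amplified template to that
    from a template amplified with a lower per-cycle factor."""
    return (perfect / efficiency) ** cycles

ratio = fold_underrepresentation(1.9, 30)
print(f"{ratio:.2f}-fold underrepresented after 30 cycles")
```

The result, (2/1.9)^30, lies between 4.6 and 4.7, in line with the roughly fivefold figure given in the text. Lowering the cycle number shrinks this gap, which is one reason to avoid unnecessary cycles.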

Taxonomic annotation strategies. Even when provided with a primer pair that is potentially informative, researchers must use appropriate bioinformatic pipelines to retrieve this information. At this step, we assume that we have perfect error correction capability and do not attempt to simulate PCR and sequencing errors. For long amplicons, where merging of forward and reverse reads might not be possible, we present results for both merged and unmerged reads. Figure 2 (top panels) presents the taxonomic accuracy for each primer pair and taxonomic annotation strategy for the full set of vaginal taxa. V1-V2 and V1-V3 perform better for the vaginal microbiome than other regions, provided that they are merged, since processing reads separately entails a loss of precision and accuracy as large as a switch to a different region. Species-level accuracy is particularly critical for the genus Lactobacillus, since, e.g., L. iners is associated with a very different outcome for the host subject than L. crispatus (5, 23). Figure 2 (bottom panels) presents bar plots of taxonomic accuracy for the 114 Lactobacillus species included in our study. The trends observed are very similar to the ones observed for the full vaginal database.

FIG 2

Bar plots showing the taxonomic classification accuracy of each primer pair under two classification schemes. DADA2 taxonomic annotation gives higher taxonomic resolution than mapping to a comparable database, both in general and for Lactobacillus in particular. The entire OptiVag database was extracted in silico with each of the candidate primer sets, without errors. (Top left) The complete database, annotated by mapping; (top right) the complete database, annotated with DADA2’s algorithm; (bottom left) same as the top left panel but focusing only on Lactobacillus; (bottom right) same as the top right panel but focusing only on Lactobacillus.

Amplicon sequencing. To assess the accuracy of these algorithms, 8 pools of vaginal swabs (coming from 4 consecutive days of sampling from a single individual each) were amplified using either the V1-V3 or the V3-V4 region. For the V1-V3 region, two primer pairs were assessed: 27f-534r has the potential to amplify Chlamydia trachomatis, which is lacking from most other primer pairs. However, its length could create other issues, which is why 27f-515r was also assessed. The V3-V4 region was amplified by the primer pair 341f-805r. The results observed for this pair can be naturally extended to the also popular pairs V3V4 341f-806r and V4 515f-806r.

For the V3-V4 region, two experimental approaches were compared, using either a single PCR (which both amplifies this region and barcodes it), or two consecutive PCRs (one for amplification and one for barcoding). The one-step PCR approach is more cost-effective, since a single cleaning step is necessary, and minimizes the risk of cross-contamination between wells, since at no point are there samples amplified but not barcoded. However, the long PCR primers can be challenging to obtain, and reaction conditions are also more delicate. Here, both approaches performed very similarly (Fig. 3a), but in some replicates there is a difference in richness (Fig. 3b). These results mean that either the 2-step approach produces more artifacts or the 1-step approach did not capture the full richness of the sample due to worse PCR performance for the long primers. Since the triplicates for the two-step approach yielded more similar results, the latter is the more likely explanation. However, it is also worth considering that while the negative extraction controls for the 1-step approach yielded a total of 3 16S reads post-quality control (QC), the 2-step approach had 2,044 reads, highlighting the risk of working with amplified but not barcoded molecules, especially in a high-throughput setting.

FIG 3

Effect of various analysis parameters on alpha- and beta-diversity of real amplicons. Orange, V3-V4, 2-step PCR; blue, V3-V4, 1-step PCR; green, V1-V3–515r; gray, V1-V3–534r. (a) Nonmetric multidimensional scaling of the 8 pools, processed in triplicate with V3-V4 primers, shows good replicability within triplicates and regardless of PCR set-up (single-step versus nested reactions). (b) Chao1 richness estimate for each of the samples in panel a. The 1-step PCR approach generally yields a lower richness estimate but has slightly higher variability within triplicates. (c) Box plots depicting the estimated relative abundance of Lactobacillus spp. in each sample when the reads were merged or simply concatenated. There is a disproportional loss of Lactobacillus spp. upon attempting to merge long amplicons. (d) Bar plots showing the number of ASV or operational taxonomic units (OTU) of different cluster size classes obtained with each primer pair and error correction or clustering method. The effect of these choices on alpha-diversity estimates can be seen in Fig. S1.

The V1-V3 amplicons are too long for current paired-end 300-bp approaches to accurately bridge the space between reads. Although ca. 80% of reads in each sample could be merged (medians, 79% for 27f-515r and 85% for 27f-534r), there is a strong taxonomic bias on the reads kept. Indeed, for pools 3 and 4, which are strongly Lactobacillus dominated, less than 1% of reads could be merged. Figure 3c shows the percentage of Lactobacillus in each sample pool when merging or concatenating (classified with the DADA2 classifier on the SILVA database). Due to this strong bias, read concatenation must be used rather than read merging. Failing to merge decreases the accuracy of this middle region, which is generally already low due to the declining accuracy of sequencing along the read length (28). This poses a challenge. To achieve species-level resolution and an accurate estimate of total species, it has been shown to be crucial to use an error correction strategy rather than a clustering one (29). However, the additional errors kept by not merging reads could potentially make error correction more error prone than simple clustering. Here, we compared two error-correcting strategies, DADA2 (30) and Unoise3 (31), as well as traditional average-linkage clustering at 97% identity.
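The difference between merging and concatenating a read pair can be sketched as follows. This is a simplified illustration, not the procedure used by any specific tool: it requires an exact overlap, ignores quality scores, and the minimum overlap of 12 bases and the spacer of 10 Ns are illustrative choices. The toy amplicon is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_or_concatenate(fwd, rev, min_overlap=12, spacer="N" * 10):
    """Merge a read pair on an exact overlap, else concatenate with a spacer."""
    rev_rc = reverse_complement(rev)
    # Try the longest candidate overlap first
    for k in range(min(len(fwd), len(rev_rc)), min_overlap - 1, -1):
        if fwd[-k:] == rev_rc[:k]:
            return fwd + rev_rc[k:], "merged"
    # No sufficient overlap: keep both reads, joined by a spacer
    return fwd + spacer + rev_rc, "concatenated"

# Hypothetical 40-bp amplicon
amplicon = "ATGGCGTACGTTAGCCGATAGCTAGGCTTAACGGATCCAT"

# Long reads (28 bp each) overlap by 16 bp and can be merged
long_pair = (amplicon[:28], reverse_complement(amplicon[-28:]))
# Short reads (15 bp each) leave a gap and can only be concatenated
short_pair = (amplicon[:15], reverse_complement(amplicon[-15:]))

print(merge_or_concatenate(*long_pair)[1])
print(merge_or_concatenate(*short_pair)[1])
```

A taxon whose amplicon is longer than the combined read length always falls in the second case, which is how a merge-only workflow can selectively lose taxa with long variable regions.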

DADA2 is optimized to correct sequencing errors and will not eliminate PCR errors, so this algorithm is recommended only in combination with a high-fidelity polymerase to avoid large numbers of false positives. Unoise3 eliminates both amplification and sequencing errors but also presents a higher risk of excluding rare but correct sequences, which generally makes Unoise3 a more conservative approach (29). Indeed, in the case of error-prone concatenated reads, DADA2 generated more low-abundance amplicon sequence variants (ASV) than Unoise3 (Fig. 3d). Clustering yielded even more operational taxonomic units (OTU) than either error correction method yielded ASV, strongly suggesting that the ASV results are more correct. These differences also affect estimates of alpha-diversity (Fig. S1). Since a large number of added singletons can push diversity estimates either up or down, Shannon's and inverted Simpson's diversity did not differ significantly between methods. Estimates of richness, however, differed significantly between all methods (all P < 0.0001), with Unoise3 giving the lowest estimates and clustering the highest.
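To illustrate why spurious low-abundance units affect richness estimates far more than diversity indices, the sketch below computes Shannon's index, the inverse Simpson index, and the bias-corrected Chao1 estimator on two hypothetical count vectors: a clean community and the same community with ten added singletons. The counts are invented for illustration and are not data from this study.

```python
import math

def shannon(counts):
    """Shannon diversity (natural log) from a list of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def inverse_simpson(counts):
    """Inverse Simpson diversity from a list of counts."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao1 richness estimate."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

clean = [500, 300, 200]              # three true taxa
noisy = [500, 300, 190] + [1] * 10   # same depth, ten spurious singletons

for name, counts in (("clean", clean), ("noisy", noisy)):
    print(name, round(shannon(counts), 3),
          round(inverse_simpson(counts), 3), chao1(counts))
```

In this toy case the Chao1 estimate rises from 3 to 58, while the two diversity indices change only marginally, which is consistent with richness being the metric most sensitive to the choice of error correction or clustering method.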

DADA2 taxonomic assignment had a higher rate of reads assigned at the species level than the mapping strategy, confirming the results in Fig. 2 (Fig. 4a). The taxonomic composition of each PCR triplicate analyzed with the best possible setup for each primer set is highly comparable (Fig. S2). The effect of the database used for annotation can be larger than that of the region used. Remarkably, the very well established SILVA and RDP databases yield very similar results (Fig. 4b). Primer pair V1-V3–534r yielded slightly worse taxonomic resolution than the other two primers analyzed. Compared to qPCR, all three approaches are extremely accurate, regardless of the database used (Fig. 4c).

FIG 4

Effects of various parameters on the taxonomic annotation of real amplicons. The taxonomic accuracy of each of the 16S primer sets is good, but V1-V3–534r yields more shallow annotations. It can, however, reliably detect Chlamydia trachomatis spike-in DNA. (a) Box plots showing the depth of classification for each sequence with different classification strategies. The DADA2 classifier yielded higher taxonomic resolution than simply mapping, regardless of the database used. (b) Taxonomy bar plots for each of the pools, processed with 2-step PCR with each of the primer sets and with each of the databases SILVA, RDP, and GTDB. An average for each triplicate is shown. Each technical replicate can be seen in Fig. S2. Only ASV with >10 counts are included in this figure. (c) Same samples as in panel a, compared to qPCR results for Lactobacillus iners, Lactobacillus crispatus, and Gardnerella vaginalis. For each sample, the sum of these three taxa was normalized to 1, to make them comparable to the qPCR results in the triaxial plot.

Despite its somewhat lower taxonomic resolution with the read lengths obtained, primer pair V1-V3–534r is the only one expected to amplify and detect Chlamydia trachomatis. To confirm this, a spike-in experiment was conducted (Fig. S3). The varying amount of human DNA initially found in each sample means that a spike-in of 5% of total DNA may correspond to >50% of bacterial DNA, making this analysis harder to interpret. In general, there is a good correlation between spiked-in and observed C. trachomatis.
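The effect of human DNA content on the spike-in proportion can be made concrete with a small calculation. The sketch assumes that the spike-in fraction is expressed relative to the final total DNA and that the human fraction refers to the sample before spiking; both conventions, and the function name, are assumptions made for illustration.

```python
def spike_share_of_bacterial(spike_fraction, human_fraction):
    """Share of the spike-in among non-human (bacterial) DNA.

    spike_fraction: spike-in as a fraction of the final total DNA (assumed)
    human_fraction: human DNA as a fraction of the sample before spiking (assumed)
    """
    native_bacterial = (1.0 - spike_fraction) * (1.0 - human_fraction)
    return spike_fraction / (spike_fraction + native_bacterial)

# Human DNA fractions spanning the range reported for the pools
for human in (0.86, 0.95, 0.98):
    share = spike_share_of_bacterial(0.05, human)
    print(f"human {human:.0%}: spike is {share:.0%} of bacterial DNA")
```

Under these assumptions, a 5% spike-in exceeds half of the bacterial DNA once the human fraction reaches about 95%, which matches the saturation seen in the pools with the highest human DNA content.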

FIG S3

Percentage of C. trachomatis detected in each sample as a function of the DNA spike-in. Differences in human DNA content affect the observed bacterial counts, and for the three samples with highest DNA content (pools 3, 4, and 6), the assay quickly becomes saturated. Dashed gray lines mark 1%, 5%, and 10%, which were the proportions used for the spike-in experiment.

In silico removal of human DNA from metagenomic data. An alternative to PCR amplification is performing full shotgun metagenomic sequencing of samples. The first challenge for processing metagenomic reads derived from vaginal swabs is the large amount of human DNA in these samples. In our pools, 86 to 98% of the reads could be mapped to the human genome. While human DNA depletion can be performed in vitro prior to sequencing (32, 33), this depends on the storage condition of the samples and was not evaluated in this work. Instead, we focused on in silico removal of human reads.

Removal of reads of human origin is a conceptually straightforward process consisting of mapping reads to a reference genome. However, two critical factors can affect the outcome: the mapping algorithm and the masking applied to the reference genome to hide regions exhibiting homology to Bacteria and Fungi. Strict mapping is time and memory intensive. A looser mapping is less resource intensive but might remove more bacterial reads or retain more human reads, depending on how strictly the reference is masked. Many mappers provide a preformatted human genome reference for host removal. Here, we tested three of them: BMTagger, BBMap, and Kraken2, the latter in both “quick” and “standard” modes. We also ran Bowtie2 with the --fast-local setting to contrast it with the --very-sensitive-local setting that we used as the gold standard for this analysis. The percentage of human DNA left in each sample after human DNA removal with the different techniques is depicted in Fig. 5a. The percentage of bacterial DNA kept, from the initial bacterial pool, is shown in Fig. 5b. These two quality scores are combined in Fig. 5c, where the optimal method would place all samples in the upper left corner. BBMap and Bowtie2 retained the most human DNA but also the most bacterial reads. Conversely, BMTagger and Kraken2 removed the most human reads, at the expense of also decreasing the microbial pool. Based on the results above, Kraken2, in quick mode, was chosen for downstream analysis.
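The two quality scores described above can be computed from read identifiers once a reference labeling of human and bacterial reads is available (here, that would be the gold-standard mapping). The sketch below is a hypothetical illustration with invented read IDs, not the authors' evaluation code.

```python
def host_removal_metrics(human_ids, bacterial_ids, kept_ids):
    """Return (% human reads among kept reads, % of bacterial reads retained).

    human_ids, bacterial_ids: sets of read IDs labeled by the reference method
    kept_ids: set of read IDs that survive the host-removal tool under test
    """
    kept_human = len(human_ids & kept_ids)
    kept_bacterial = len(bacterial_ids & kept_ids)
    pct_human_left = 100.0 * kept_human / len(kept_ids) if kept_ids else 0.0
    pct_bacterial_kept = (
        100.0 * kept_bacterial / len(bacterial_ids) if bacterial_ids else 0.0
    )
    return pct_human_left, pct_bacterial_kept

# Toy sample: 90 human and 10 bacterial reads (hypothetical IDs)
human = {f"h{i}" for i in range(90)}
bacterial = {f"b{i}" for i in range(10)}
# A hypothetical tool that lets 5 human reads through and drops 1 bacterial read
kept = {f"h{i}" for i in range(5)} | {f"b{i}" for i in range(9)}

pct_human, pct_bacterial = host_removal_metrics(human, bacterial, kept)
print(f"human left: {pct_human:.1f}%, bacterial kept: {pct_bacterial:.1f}%")
```

Plotting the first value on the horizontal axis and the second on the vertical axis places the ideal tool in the upper left corner, as in the combined view described in the text.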

FIG 5

Effect of human-DNA removal strategies on the amount of bacterial and human DNA retained. The human-DNA content in the 8 pools varied from 86 to 98% of the total DNA content. Different DNA removal methods leave various amounts of human DNA in the filtered sample but also retain various amounts of the original microbial pool. Kraken2 and BMTagger remove the most human DNA but also the most microbial reads. (a) Percentage of human reads in each sample before and after each human removal strategy. (b) Percentage of the original pool of microbial reads kept in each sample after each human removal strategy. (c) The two measurements in panels a and b are combined into a scatterplot to give an overview of the performance of each tool.

Interestingly, all tools followed the same general trend, removing more bacterial reads and also retaining more human reads in samples with a very high (>95%) initial human DNA content. Of note, these three samples (pools 3, 4, and 6) also have the highest Lactobacillus counts. To assess whether removal of human content causes selective removal of specific bacterial taxa, we also attempted to assign taxonomy to these putative human reads (detected with Bowtie2 in very-sensitive-local mode). For each sample pool, >98.4% of putative human reads could not be assigned to any bacterial genome, strongly suggesting that these are indeed eukaryotic reads. About two-thirds of the remaining ~1.5% of reads were classified as Zobellia, a genus of marine Flavobacteriaceae not known to infect humans. The remaining third was chiefly assigned to Chlamydia psittaci and Chlamydia abortus. While we have not evaluated the read alignments in detail, we speculate that the reference genomes of these intracellular parasites may contain small amounts of sequence of human origin, generating this misleading assignment. Therefore, the larger amount of human DNA observed in Lactobacillus-rich samples is likely to be true human DNA, connected to the shedding of glycogen-rich epithelial cells that feeds the Lactobacillus community.

Taxonomic annotation of metagenomic data. Five approaches were assessed for taxonomic assignment on these data: a general marker gene-based approach (MetaPhlAn2), a marker gene-based approach built from a curated set of vaginal bacteria (VIRGO), a k-mer-based approach with a broad taxonomic database (Kraken2; see Materials and Methods for details), a k-mer-based approach with a vaginal-only database (Kraken2 with OptiVagDB), and a novel prefiltering and alignment tool (Metalign). The taxonomic profile inferred by each method for each pool is depicted in Fig. 6a. Metalign stands out in identifying Chlamydia trachomatis in almost every pool, as well as in detecting Veillonella spp. and Prevotella spp. more frequently. Kraken2 with its standard database failed to identify L. iners, despite this species being present in the database. Kraken2 with OptiVagDB, MetaPhlAn2, and VIRGO tended to present similar results, with a few notable differences. First, the clade called BVAB3 in VIRGO takes its current name, Mageeibacillus indolicus, in the other two references. MetaPhlAn2 fails to identify BVAB1, perhaps because this genome is still not in NCBI’s RefSeq database. OptiVag is alone in identifying significant amounts of Peptoniphilus in three of the Gardnerella-dominated samples. This clade has been identified in women with bacterial vaginosis (34) but is generally not considered a key taxon for this condition. Finally, VIRGO stands out in not identifying any Sneathia organisms, even in samples where all other methods are in agreement.

FIG 6

Effect of taxonomy assignment strategy on the perceived taxonomic profile of each sample. Assigning taxonomy to shotgun metagenomic reads with various tools yields somewhat different community profiles. (a) Taxonomy for each pool assigned with MetaPhlAn2, Metalign, Kraken2 with its complete microbial database, Kraken2 with the OptiVag database, or VIRGO. (b) Same samples as in panel a, compared to qPCR results for Lactobacillus iners, Lactobacillus crispatus, and Gardnerella vaginalis. For each sample, the sum of these three taxa was normalized to 1, to make them comparable to the qPCR results in the triaxial plot. (c) Manhattan distance between each sample and method and its corresponding qPCR profile. In this three-dimensional structure, the Manhattan distance is strictly limited between 0 (identical profiles) and 3 (maximum distance for each of the three species considered).

Comparison to qPCR showed that none of the shotgun methods was as accurate as the PCR-based methods (Fig. 6b; contrast to Fig. 4b). Still, when each pool is considered, VIRGO and OptiVag performed better than the other methods (Fig. 6c). It is possible that assessing taxonomy after assembly would yield more accurate results (35), but this was not possible with the current sampling depth. Still, this could be a valid alternative for samples sequenced more deeply, or for a different experimental design, e.g., a time series from the same woman, which would enable coassembly across closely related samples.
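The comparison to qPCR described above (three taxa normalized to sum to 1, followed by a Manhattan distance per pool and method) can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical and are not taken from the study's code or data.

```python
def normalize(profile):
    """Scale a dict of taxon -> abundance so that the values sum to 1."""
    total = sum(profile.values())
    if total == 0:
        raise ValueError("profile has no signal for the selected taxa")
    return {taxon: value / total for taxon, value in profile.items()}


def manhattan(profile_a, profile_b):
    """Sum of absolute per-taxon differences between two profiles."""
    taxa = set(profile_a) | set(profile_b)
    return sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)


# Hypothetical values for one pool (illustration only, not study data).
qpcr = normalize({"L. iners": 600, "L. crispatus": 300, "G. vaginalis": 100})
shotgun = normalize({"L. iners": 50, "L. crispatus": 40, "G. vaginalis": 10})
distance = manhattan(qpcr, shotgun)
```

A distance of 0 means that the sequencing-based profile reproduces the qPCR proportions exactly; larger values indicate stronger disagreement for that pool.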

Conclusions. None of the methods assessed here is superior in all respects. With regard to amplicons, V3-V4 yielded the most plausible alpha-diversity estimates and had very good taxonomic coverage. However, much of the existing literature is based on region V1-V3 (14–16). The major drawback of 16S amplicons is their failure to detect eukaryotic taxa such as Candida spp. and Trichomonas vaginalis. An ITS (internal transcribed spacer)-based amplicon approach could selectively amplify fungi without amplifying human DNA (36), but it would miss the pathogenic parabasalid T. vaginalis. Therefore, no simple combination of one or two primer sets can accurately profile all relevant taxa in the human vaginal environment.

To overcome the limitations imposed by primer selection, shotgun metagenomic sequencing presents an interesting alternative, since it is not a priori bound by phylogeny. Its cost, which used to be prohibitive, is now low enough to compete with a multiprimer PCR-based approach. In addition to taxonomic classification, shotgun data allow researchers to assess the functional gene content of a sample and, given enough sequencing depth, assemble draft genomes of strains of interest.

The main practical obstacle to a broader application of shotgun metagenomics in the field of obstetrics and gynecology is the large amount of human DNA in vaginal swabs, but this can potentially be bypassed, either with molecular biology techniques or a combination of deep sequencing and in silico human DNA removal. The bioinformatic skill set and computational requirements necessary to handle this type of data are also significantly larger than those needed for marker gene (16S) analyses.

Comparing data sets derived from amplicon or shotgun sequencing also requires an understanding of the specific biases in each of these technologies. Despite using different primer sets and enzymes, it is not entirely unexpected that the PCR-based data have better agreement with the qPCR data, since these share many common biases, such as copy number variations. The linear amplification strategy used with DNBSeq (37) is potentially less biased than PCR-based strategies, but these claims have not yet been supported by independent research groups. The role of GC bias, which is significant for most other massively parallel sequencing technologies (26), is also currently unknown for this technology.

Here, we present a thorough comparison of multiple methods available for the survey of the vaginal microbiota. Since none of the methods is universally optimal, it is still up to each research center to select the appropriate method for their specific research question. While this will necessarily limit comparability between studies, acknowledging the strengths and weaknesses of each method is already a substantial improvement to the current state of the field.

MATERIALS AND METHODS

Construction of the databases. To create a corresponding shotgun database, we started from the list of vaginotropic species published by Diop et al. (38). In addition to these previously published results, a data set of 480 vaginal swabs collected throughout the menstrual cycle of a healthy Danish cohort (M. C. Krog et al., submitted for publication) and sequenced by CoreBiome (St. Paul, MN, USA) using BoosterShot technology was used. For every bacterial species identified in the data set and not present in the Diop database, manual searches of PubMed and NucCore were done, and the species was kept if it had been previously identified in the human urogenital tract. Eukaryotic species were added by searching NucCore with the search key “((vagina[All Fields] AND “Eukaryota”[Organism]) NOT “Metazoa”[Organism]) NOT “Viridiplantae”[Organism] AND (biomol_genomic[PROP] AND refseq[filter]).” Finally, a free-text search for “BVAB” retrieved metagenome-associated genomes representative of the bacterial vaginosis-associated Clostridiales group. The resulting list of taxa is available in Table S1. When a taxon could not be programmatically included in the database, manual searches against NCBI’s Taxonomy database were used to verify whether the taxon name had been updated. Not all taxa could be retrieved as full genomes, as some are present in the databases only as single genes; these taxa are missing from the current version of the database. The resulting database (v0.1) and the scripts used for producing a genome database based on a taxon list are available at https://github.com/ctmrbio/optivag/tree/master/database.

TABLE S1

Species included in the OptiVagDB, including the source of their description as vaginotropic. Download Table S1, CSV file, 0.08 MB.
Copyright © 2020 Hugerth et al.

This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

Simulated amplicons. Amplicons were extracted from the 16S rRNA gene database based on exact matches to the primers. For amplicons starting at the 27f position, which is often not included in the reference sequence due to its location, two alternative approaches were compared. The pessimistic approach extracts only sequences containing the primer regions, while the optimistic approach assumes that all sequences lacking the 5′ end would be amplified by the 27f primer. The truth is likely somewhere between these extremes.
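The pessimistic and optimistic extraction strategies can be sketched as follows. This is an illustration under simplifying assumptions, not the code used in the study: primers are treated as non-degenerate strings, the reference is given in the forward orientation, and the function name `extract_amplicon` and its `optimistic` switch are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_amplicon(reference, forward, reverse, optimistic=False):
    """Return the amplicon (primers included), or None if it cannot be extracted.

    Pessimistic mode requires an exact match to both primers.
    Optimistic mode assumes that a reference lacking the forward primer site
    is truncated at its 5' end and would still be amplified, so the amplicon
    is taken from the first base of the reference.
    """
    start = reference.find(forward)
    if start == -1:
        if not optimistic:
            return None
        start, search_from = 0, 0
    else:
        search_from = start + len(forward)
    # The reverse primer anneals to the opposite strand, so its binding site
    # appears as the reverse complement on the forward strand.
    reverse_site = reverse_complement(reverse)
    end = reference.find(reverse_site, search_from)
    if end == -1:
        return None
    return reference[start:end + len(reverse_site)]


# Toy sequences (hypothetical, far shorter than real primers and references).
fwd, rev = "AGAGTT", "GCTGCC"
full = fwd + "ACGTAC" + reverse_complement(rev)
truncated = full[3:]  # mimics a reference missing its 5' end
```

With these toy inputs, the truncated reference is rejected in pessimistic mode and recovered in optimistic mode, which mirrors the two bounds compared in the text.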

In silico reads were extracted from the OptiVagDB v0.1 for each primer pair, using a read length of 250 bp. While it is possible to sequence longer fragments with the commercial kits available today, this is a realistic read length after trimming primer pairs and low-quality base pairs. We did not simulate PCR and sequencing errors for these reads, since the goal of this step was to assess the performance of primers under ideal conditions. For amplicons <500 bp long, the resulting reads were merged; otherwise, they were treated independently. When the resulting amplicon length was very close to 500 bp, both approaches were considered, since the ability to merge reads becomes dependent on the accuracy of the sequencer used.
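A minimal sketch of this read simulation, assuming error-free reads and a strict 500-bp cutoff, is given below. The function name `simulate_reads` and the returned dictionary layout are hypothetical; the borderline case in which both approaches were considered is not handled here.

```python
READ_LENGTH = 250   # read length used for the in silico reads
MERGE_LIMIT = 500   # amplicons shorter than this were merged

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_reads(amplicon, read_length=READ_LENGTH, merge_limit=MERGE_LIMIT):
    """Simulate an error-free read pair and merge it when the reads overlap."""
    forward = amplicon[:read_length]
    reverse = reverse_complement(amplicon)[:read_length]
    if len(amplicon) < merge_limit:
        # Without sequencing errors, merging overlapping reads simply
        # reconstructs the amplicon itself.
        return {"merged": True, "sequences": [amplicon]}
    # Longer amplicons: the two reads are kept and treated independently.
    return {"merged": False, "sequences": [forward, reverse]}


short_result = simulate_reads("ACGT" * 115)  # 460 bp, merged
long_result = simulate_reads("ACGT" * 130)   # 520 bp, kept as two reads
```

Because no errors are simulated, the merged product carries no information about how merging would behave on real reads; it only reflects which part of the gene each primer pair can cover.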

Sample collection. Women were recruited by advertisements in student magazines, university notice boards, and social media and were included between September 2017 and January 2018 at Rigshospitalet, Copenhagen, Denmark. The women were provided with self-collection kits and received instructions for vaginal swab collection. In short, they were instructed to separate the labia majora with one hand (in order to reduce the risk of contamination with microbiota from external genitals), insert a swab (FLOQSwabs [CP520CS01; Copan Flock Technologies, Brescia, Italy]) into the vagina with the other hand, and rotate it for 10 to 15 s before placing the swab in the provided collection tube (FluidX tube [65-7534; Brooks Life Sciences, Chelmsford, MA, USA] containing 0.8 ml DNA/RNA-shield [R1100-250; Zymo Research, Irvine, CA, USA]) and breaking off the handle. Samples were kept at room temperature for up to 2 weeks and then at −20°C for up to 4 weeks before being transferred to −80°C. All participants gave oral and written consent to participate in the study and were remunerated with 3,000 Danish kroner (DKK) after completing sample collection. All data were collected and managed using REDCap electronic data capture tools (39), hosted at the Capital Region of Denmark. The study was approved by The Regional Committee on Health Research Ethics (H-17017580) and the Data Protection Agency in the Capital Region of Denmark (2012-58-0004).

DNA extraction. DNA extraction was performed with the Quick-DNA Magbead Plus kit (D4082; Zymo Research, Irvine, CA, USA), according to the manufacturer’s instructions with a few modifications. Prior to extraction, the samples were subjected to bead beating for 1 min at 1,600 rpm using ZR Bashing Bead lysis matrix (S6012; Zymo Research, Irvine, CA, USA). After bead beating, samples were treated with a lysozyme solution at 37°C for 60 min (lysozyme recipe: 20 mM Tris-Cl, pH 8; 2 mM sodium EDTA [Tris-EDTA; Sigma-Aldrich, catalog no. T9285]; lysozyme [Sigma-Aldrich, catalog no. L6876-100G] to 100 mg/ml) and with proteinase K at 55°C for 30 min (20 mg/ml, part of the extraction kit), prior to DNA cleanup using a Freedom EVO robot (Tecan, Männendorf, Switzerland). Eight sample pools were created for this study, consisting of 4 consecutive daily vaginal swabs from each of 8 individuals from a cohort of healthy young women. All eight sample pools were used for each of the experimental approaches attempted.

Sequence amplification, sequencing, and error correction. The following PCR set-ups were used: (i) one-step PCR amplification of the V3-V4 region, (ii) two-step PCR amplification of the V3-V4 region, (iii) two-step PCR amplification of the V1-V3 region using reverse primer 515r, and (iv) two-step PCR amplification of the V1-V3 region using reverse primer 534r. The same settings were used for an experiment with a Chlamydia DNA spike-in (gblocks gene fragment; Integrated DNA Technologies, Coralville, IA, USA). DNA was spiked in at 1%, 5%, or 10%.

The primer sequences and specific PCR conditions are described in Table S2. All PCRs were performed in 50-μl reaction mixtures using Phusion Hot Start II high-fidelity PCR master mix (F-565L; Thermo Fisher Scientific, MA, USA). The 1-step PCR included 1.5 μl of dimethyl sulfoxide (DMSO). All PCR products were purified with Agencourt AMPure XP beads (A63881; Beckman Coulter, Brea, CA, USA). For the two-step reactions, the purified sample was used as the template for barcoding with Nextera XT index kit v2 (FC-131-1002; Illumina, Inc., San Diego, CA, USA). The finished libraries were normalized to 4 nM, pooled, and sequenced in a MiSeq system using V3 chemistry (Illumina, Inc.).

TABLE S2

PCR conditions for all reactions described in this work. Download Table S2, CSV file, 0.00 MB.

Cutadapt (40) was used to trim primers, remove sequences not containing the expected primer pairs, and remove bases with a Phred score of <15.

Merging and error correction were performed with DADA2 (30) or Unoise (31), as described in Results. For amplicons in the V1-V3 region which were too long to be merged appropriately, concatenation of the forward and reverse reads was performed. In this case, reads were trimmed to 270 bp each. Amplicons for which at least one read did not reach 270 bp with a Phred score of >15 were discarded. Amplicons for which the expected error rate over the resulting 540 bp was >4 were discarded. The resulting concatenated products were subjected to either error correction as described above or clustering at 97% identity and chimera removal with Vsearch (41).
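The concatenation filter can be sketched as below. This is one possible reading of the procedure, not the pipeline used in the study: reads are assumed to have already been quality trimmed (so only their remaining length is checked), the expected number of errors is computed as the sum of the per-base error probabilities 10^(−Q/10), and the orientation of the reverse read in the concatenated product is left as supplied. All function names are hypothetical.

```python
TRIM_LENGTH = 270          # bases kept from each read
MAX_EXPECTED_ERRORS = 4.0  # threshold over the concatenated 540 bp


def expected_errors(phred_scores):
    """Expected number of erroneous bases given per-base Phred scores."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def concatenate_pair(fwd_seq, fwd_qual, rev_seq, rev_qual):
    """Return the concatenated 540-bp sequence, or None if the pair is discarded.

    Inputs are assumed to be quality-trimmed reads with one Phred score
    per base (lists of integers).
    """
    # Discard pairs in which either read is shorter than 270 bp after trimming.
    if len(fwd_seq) < TRIM_LENGTH or len(rev_seq) < TRIM_LENGTH:
        return None
    quals = fwd_qual[:TRIM_LENGTH] + rev_qual[:TRIM_LENGTH]
    if expected_errors(quals) > MAX_EXPECTED_ERRORS:
        return None
    # Assumption: the reverse read is concatenated in the orientation given.
    return fwd_seq[:TRIM_LENGTH] + rev_seq[:TRIM_LENGTH]


# Toy reads (hypothetical): 280 bp each with uniform quality.
read = "A" * 280
good = concatenate_pair(read, [30] * 280, read, [30] * 280)
noisy = concatenate_pair(read, [20] * 280, read, [20] * 280)
too_short = concatenate_pair(read[:269], [30] * 269, read, [30] * 280)
```

At a uniform Phred score of 30 the 540 retained bases carry about 0.54 expected errors and the pair is kept, whereas at a uniform score of 20 the expectation is about 5.4 and the pair is discarded.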

Taxonomic annotation of amplicons. Taxonomic annotation of in silico amplicons was performed with DADA2’s (v1.5) (30) built-in sequence classifier, based on the SILVA database (v128) (42), or by direct mapping to the SILVA v128 database. In addition to these two approaches, real amplicons were also classified with the DADA2 classifier against the RDP v16 (43) or GTDB v86 (44).

For amplicons that could not be merged, a consensus between the potentially distinct annotations of forward and reverse reads was established as follows. (i) If the two annotations were incompatible, the lowest common ancestor was kept (e.g., in cases where families agreed but genera diverged, only family-level annotation was kept). (ii) If one annotation was more detailed than the other (e.g., to genus level versus to family level) but the two annotations agreed on all levels where they overlapped, the most detailed annotation was kept. (iii) For species-level annotations where more than one species was possible, the intersection of the species suggested for each of the reads was kept (e.g., if the forward read was annotated as “Lactobacillus crispatus/gasseri/jensenii” and the reverse as “Lactobacillus gasseri/jensenii/longum,” the resulting annotation would be “Lactobacillus gasseri/jensenii”).
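The three consensus rules can be expressed compactly. The sketch below assumes that each annotation is a list of names ordered from kingdom downward, with the species epithet(s) at index 6 and ambiguous species separated by “/”; this representation and the function name `consensus` are hypothetical and not taken from the study's scripts.

```python
SPECIES_RANK = 6  # index of the species field in a kingdom-to-species lineage


def consensus(forward, reverse):
    """Combine the annotations of a forward and a reverse read."""
    shared_depth = min(len(forward), len(reverse))
    result = []
    for rank in range(shared_depth):
        f, r = forward[rank], reverse[rank]
        if rank == SPECIES_RANK:
            # Rule iii: keep only the species suggested by both reads.
            reverse_options = set(r.split("/"))
            common = [s for s in f.split("/") if s in reverse_options]
            if common:
                result.append("/".join(common))
            return result
        if f != r:
            # Rule i: incompatible annotations, stop at the lowest common ancestor.
            return result
        result.append(f)
    # Rule ii: the annotations agree wherever they overlap, so the more
    # detailed one is kept.
    longer = forward if len(forward) > len(reverse) else reverse
    return result + list(longer[shared_depth:])


genus = ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
         "Lactobacillaceae", "Lactobacillus"]
example = consensus(genus + ["crispatus/gasseri/jensenii"],
                    genus + ["gasseri/jensenii/longum"])
```

For the example given in the text, the species field of the result is “gasseri/jensenii”. When the two reads share no species, this sketch falls back to the genus, in line with rule i.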

Metagenomic shotgun sequencing. The same eight pools were used for whole-genome library preparation. MGI FS DNA library prep kit (16×, 1000006987; MGI, Shenzhen, China) was used according to the manufacturer’s instructions, except that 50 ng of DNA was used as input instead of the suggested 200 ng. Due to the smaller amount of input DNA, instead of double bead cleanup for size selection, a single cleanup step was applied. MGI sequencing technology uses enzymatic fragmentation of DNA followed by barcoding of samples using PCR (7 PCR cycles in this study), single-strand circularization, and DNA nanoball construction. All procedures were automated using SP-960 and SP-100 robots (MGI). The sequencing step was performed in a DNBSEQ-G400 sequencer (MGI) using the high-throughput sequencing set (PE150 1000016952; MGI) with DNA libraries loaded onto the flow cell using the DNB loader MGIDL-200 (MGI).

Human-DNA removal. Human reads were removed in silico by one of the following strategies: (i) Bowtie2 v2.3.5 (45) with the setting --fast-local; (ii) BMTagger v1.1.0 (46) mapping to the GRCh38 reference library with standard masking; (iii) BBMap v38.68 (47) against the hg19 reference library, masked as described in http://seqanswers.com/forums/showthread.php?t=42552; (iv) Kraken2 v2.0.8-beta (48) against its built-in GRCh38 human reference, setting the confidence parameter to 0.1; (v) Kraken2 with the parameters named above, adding the flag --quick.

To be able to independently assess the human read removal performance of the aforementioned methods, reads were mapped to the hg19 masked reference using Bowtie2 v2.3.5 (45) with the setting --very-sensitive-local.
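Given the gold-standard mapping, the two quality scores reported in Fig. 5a and b can be computed from sets of read identifiers. The sketch below is an illustration only: it assumes that every read not flagged as human by the gold standard counts as microbial, and the function name `removal_metrics` and the toy read identifiers are hypothetical.

```python
def removal_metrics(all_reads, human_reads, kept_reads):
    """Return (% human among kept reads, % of original microbial reads kept).

    all_reads:   IDs of every read in the unfiltered sample
    human_reads: IDs flagged as human by the gold-standard mapping
    kept_reads:  IDs remaining after the removal tool under evaluation
    """
    microbial_reads = all_reads - human_reads
    if not microbial_reads:
        raise ValueError("no microbial reads according to the gold standard")
    if kept_reads:
        pct_human_after = 100 * len(kept_reads & human_reads) / len(kept_reads)
    else:
        pct_human_after = 0.0
    pct_microbial_kept = 100 * len(kept_reads & microbial_reads) / len(microbial_reads)
    return pct_human_after, pct_microbial_kept


# Toy sample (hypothetical): 100 reads, 90 of them human.
all_reads = {f"r{i}" for i in range(100)}
human_reads = {f"r{i}" for i in range(90)}
# A tool that lets 2 human reads through and loses 2 of the 10 microbial reads.
kept_reads = {"r88", "r89"} | {f"r{i}" for i in range(90, 98)}
pct_human_after, pct_microbial_kept = removal_metrics(all_reads, human_reads, kept_reads)
```

In this toy case the filtered sample still contains 20% human reads and retains 80% of the original microbial pool; an ideal tool would approach 0% and 100%, respectively.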

Taxonomic annotation of shotgun reads. For assigning taxonomy to the remaining microbial reads, four approaches were assessed: (i) MetaPhlAn2 v2.9.21 (49) with standard parameters; (ii) Kraken2 v2.0.8-beta (48) to a general database (built using --download-library flags for archaea, bacteria, viruses, fungi, and human) setting confidence to 0.5, followed by Bracken v2.0 (50) with threshold set to 1 read per million; (iii) Kraken2 with the same parameters, except for using the curated vaginal database described above; (iv) Metalign v0.9.1 (51) with length normalization.

qPCR quantification of key taxa. To further validate the results observed by sequencing, three key taxa, namely, Lactobacillus crispatus (VPI-3199), Lactobacillus iners (ATCC-55195), and Gardnerella vaginalis (CCUG-44120), were quantified by qPCR using LightCycler 480 (Roche, Mannheim, Germany) and a SYBR green assay from Bio-Rad (1725270; Bio-Rad, Sundbyberg, Sweden). The primer sequences and PCR conditions are described in Table S2. These primers were originally described by Zozaya-Hinchliffe et al. (52) and were further validated by Akutsu et al. (53). In the triaxial plots presented, the sum of these three taxa is normalized to 1 for each method presented, to allow a direct comparison.

Data availability. All sequencing data analyzed in this study are available from the European Nucleotide Archive under project number PRJEB37382. 16S reads have the identifiers ERR4704801 to ERR4704929, and shotgun reads have the identifiers ERR4705195 to ERR4705329.

ACKNOWLEDGMENTS

We thank Pia Angelidou from the Centre for Translational Microbiome Research for her efforts in DNA extraction.

L.W.H. wrote code, performed in silico experiments, planned in vitro experiments, analyzed the sequencing data, and wrote the manuscript; M.P., Y.Z., and M.S. planned and performed in vitro experiments and wrote the manuscript; V.K. wrote code and performed in silico experiments; F.B., I.S.K., and M.H. planned experiments and wrote the manuscript; E.F. and L.E. wrote the manuscript; M.C.K., Z.B., and H.S.N. obtained ethics and data protection approval, planned and organized the study cohort, included participants, secured informed consent, and collected samples. All authors read and approved the final manuscript.

FOOTNOTES

    • Received May 12, 2020.
    • Accepted October 21, 2020.
  • [This article was published on 18 November 2020 with 11th author's name misspelled. The byline was updated in the current version, posted on 11 December 2020.]

  • Copyright © 2020 Hugerth et al.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.

REFERENCES

  1. Godha K, Tucker KM, Biehl C, Archer DF, Mirkin S. 2018. Human vaginal pH and microbiota: an update. Gynecol Endocrinol 34:451–455. doi:10.1080/09513590.2017.1407753.
  2. Linhares IM, Sisti G, Minis E, de Freitas GB, Moron AF, Witkin SS. 2019. Contribution of epithelial cells to defense mechanisms in the human vagina. Curr Infect Dis Rep 21:30. doi:10.1007/s11908-019-0686-5.
  3. Amabebe E, Anumba DOC. 2018. The vaginal microenvironment: the physiologic role of lactobacilli. Front Med (Lausanne) 5:181. doi:10.3389/fmed.2018.00181.
  4. Donders GGG, Bellen G, Grinceviciene S, Ruban K, Vieira-Baptista P. 2017. Aerobic vaginitis: no longer a stranger. Res Microbiol 168:845–858. doi:10.1016/j.resmic.2017.04.004.
  5. Norenhag J, Du J, Olovsson M, Verstraelen H, Engstrand L, Brusselaers N. 2020. The vaginal microbiota, human papillomavirus and cervical dysplasia: a systematic review and network meta‐analysis. BJOG 29:171–180. doi:10.1111/1471-0528.15854.
  6. Cherpes TL, Meyn LA, Krohn MA, Lurie JG, Hillier SL. 2003. Association between acquisition of herpes simplex virus type 2 in women and bacterial vaginosis. Clin Infect Dis 37:319–325. doi:10.1086/375819.
  7. Farcasanu M, Kwon DS. 2018. The influence of cervicovaginal microbiota on mucosal immunity and prophylaxis in the battle against HIV. Curr HIV/AIDS Rep 15:30–38. doi:10.1007/s11904-018-0380-5.
  8. Eckert LO, Moore DE, Patton DL, Agnew KJ, Eschenbach DA. 2003. Relationship of vaginal bacteria and inflammation with conception and early pregnancy loss following in-vitro fertilization. Infect Dis Obstet Gynecol 11:11–17. doi:10.1155/S1064744903000024.
  9. Freitas AC, Bocking A, Hill JE, Money DM, VOGUE Research Group. 2018. Increased richness and diversity of the vaginal microbiota and spontaneous preterm birth. Microbiome 6:117. doi:10.1186/s40168-018-0502-8.
  10. Łaniewski P, Ilhan ZE, Herbst-Kralovetz MM. 2020. The microbiome and gynaecological cancer development, prevention and therapy. Nat Rev Urol 17:232–250. doi:10.1038/s41585-020-0286-z.
  11. Amsel R, Totten PA, Spiegel CA, Chen KC, Eschenbach D, Holmes KK. 1983. Nonspecific vaginitis. Diagnostic criteria and microbial and epidemiologic associations. Am J Med 74:14–22. doi:10.1016/0002-9343(83)91112-9.
  12. Nugent RP, Krohn MA, Hillier SL. 1991. Reliability of diagnosing bacterial vaginosis is improved by a standardized method of gram stain interpretation. J Clin Microbiol 29:297–301. doi:10.1128/JCM.29.2.297-301.1991.
  13. Frank JA, Reich CI, Sharma S, Weisbaum JS, Wilson BA, Olsen GJ. 2008. Critical evaluation of two primers commonly used for amplification of bacterial 16S rRNA genes. Appl Environ Microbiol 74:2461–2470. doi:10.1128/AEM.02272-07.
  14. Ravel J, Gajer P, Abdo Z, Schneider GM, Koenig SSK, McCulle SL, Karlebach S, Gorle R, Russell J, Tacket CO, Brotman RM, Davis CC, Ault K, Peralta L, Forney LJ. 2011. Vaginal microbiome of reproductive-age women. Proc Natl Acad Sci U S A 108(Suppl 1):4680–4687. doi:10.1073/pnas.1002611107.
  15. Jumpstart Consortium Human Microbiome Project Data Generation Working Group. 2012. Evaluation of 16S rDNA-based community profiling for human microbiome research. PLoS One 7:e39315. doi:10.1371/journal.pone.0039315.
  16. Brotman RM, Shardell MD, Gajer P, Tracy JK, Zenilman JM, Ravel J, Gravitt PE. 2014. Interplay between the temporal dynamics of the vaginal microbiota and human papillomavirus detection. J Infect Dis 210:1723–1733. doi:10.1093/infdis/jiu330.
  17. Hugerth LW, Wefer HA, Lundin S, Jakobsson HE, Lindberg M, Rodin S, Engstrand L, Andersson AF. 2014. DegePrime, a program for degenerate primer design for broad-taxonomic-range PCR in microbial ecology studies. Appl Environ Microbiol 80:5116–5123. doi:10.1128/AEM.01403-14.
  18. Mändar R, Punab M, Borovkova N, Lapp E, Kiiker R, Korrovits P, Metspalu A, Krjutškov K, Nõlvak H, Preem J-K, Oopkaup K, Salumets A, Truu J. 2015. Complementary seminovaginal microbiome in couples. Res Microbiol 166:440–447. doi:10.1016/j.resmic.2015.03.009.
  19. Elovitz MA, Gajer P, Riis V, Brown AG, Humphrys MS, Holm JB, Ravel J. 2019. Cervicovaginal microbiota and local immune response modulate the risk of spontaneous preterm delivery. Nat Commun 10:1305. doi:10.1038/s41467-019-09285-9.
  20. Chen C, Song X, Wei W, Zhong H, Dai J, Lan Z, Li F, Yu X, Feng Q, Wang Z, Xie H, Chen X, Zeng C, Wen B, Zeng L, Du H, Tang H, Xu C, Xia Y, Xia H, Yang H, Wang J, Wang J, Madsen L, Brix S, Kristiansen K, Xu X, Li J, Wu R, Jia H. 2017. The microbiota continuum along the female reproductive tract and its relation to uterine-related diseases. Nat Commun 8:875. doi:10.1038/s41467-017-00901-0.
  21. Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, Glöckner FO. 2013. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res 41:e1. doi:10.1093/nar/gks808.
  22. Escobar-Zepeda A, Godoy-Lozano EE, Raggi L, Segovia L, Merino E, Gutiérrez-Rios RM, Juarez K, Licea-Navarro AF, Pardo-Lopez L, Sanchez-Flores A. 2018. Analysis of sequencing strategies and tools for taxonomic annotation: defining standards for progressive metagenomics. Sci Rep 8:12034. doi:10.1038/s41598-018-30515-5.
  23. Petricevic L, Domig KJ, Nierscher FJ, Sandhofer MJ, Fidesser M, Krondorfer I, Husslein P, Kneifel W, Kiss H. 2015. Characterisation of the vaginal Lactobacillus microbiota associated with preterm delivery. Sci Rep 4:5136. doi:10.1038/srep05136.
  24. Ma B, France MT, Crabtree J, Holm JB, Humphrys MS, Brotman RM, Ravel J. 2020. A comprehensive non-redundant gene catalog reveals extensive within-community intraspecies diversity in the human vagina. Nat Commun 11:940. doi:10.1038/s41467-020-14677-3.
  25. Ravel J, Brotman RM, Gajer P, Ma B, Nandy M, Fadrosh DW, Sakamoto J, Koenig SS, Fu L, Zhou X, Hickey RJ, Schwebke JR, Forney LJ. 2013. Daily temporal dynamics of vaginal microbiota before, during and after episodes of bacterial vaginosis. Microbiome 1:29. doi:10.1186/2049-2618-1-29.
  26. Browne PD, Nielsen TK, Kot W, Aggerholm A, Gilbert MTP, Puetz L, Rasmussen M, Zervas A, Hansen LH. 2020. GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms. Gigascience 9:giaa008. doi:10.1093/gigascience/giaa008.
  27. Peng W, Li X, Wang C, Cao H, Cui Z. 2018. Metagenome complexity and template length are the main causes of bias in PCR-based bacteria community analysis. J Basic Microbiol 58:987–997. doi:10.1002/jobm.201800265.
  28. Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. 2013. Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl Environ Microbiol 79:5112–5120. doi:10.1128/AEM.01043-13.
  29. Prodan A, Tremaroli V, Brolin H, Zwinderman AH, Nieuwdorp M, Levin E. 2020. Comparing bioinformatic pipelines for microbial 16S rRNA amplicon sequencing. PLoS One 15:e0227434. doi:10.1371/journal.pone.0227434.
  30. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. 2016. DADA2: high-resolution sample inference from Illumina amplicon data. Nat Methods 13:581–583. doi:10.1038/nmeth.3869.
  31. Edgar RC. 2016. UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing. bioRxiv doi:10.1101/081257.
  32. Wagner AO, Malin C, Knapp BA, Illmer P. 2008. Removal of free extracellular DNA from environmental samples by ethidium monoazide and propidium monoazide. Appl Environ Microbiol 74:2537–2539. doi:10.1128/AEM.02288-07.
  33. Hunter SJ, Easton S, Booth V, Henderson B, Wade WG, Ward JM. 2011. Selective removal of human DNA from metagenomic DNA samples extracted from dental plaque. J Basic Microbiol 51:442–446. doi:10.1002/jobm.201000372.
  34. Diop K, Diop A, Michelle C, Richez M, Rathored J, Bretelle F, Fournier P-E, Fenollar F. 2019. Description of three new Peptoniphilus species cultured in the vaginal fluid of a woman diagnosed with bacterial vaginosis: Peptoniphilus pacaensis sp. nov., Peptoniphilus raoultii sp. nov., and Peptoniphilus vaginalis sp. nov. Microbiologyopen 8:e00661. doi:10.1002/mbo3.661.
  35. Sczyrba A, Hofmann P, Belmann P, Koslicki D, Janssen S, Dröge J, Gregor I, Majda S, Fiedler J, Dahms E, Bremges A, Fritz A, Garrido-Oter R, Jørgensen TS, Shapiro N, Blood PD, Gurevich A, Bai Y, Turaev D, DeMaere MZ, Chikhi R, Nagarajan N, Quince C, Meyer F, Balvočiūtė M, Hansen LH, Sørensen SJ, Chia BKH, Denis B, Froula JL, Wang Z, Egan R, Don Kang D, Cook JJ, Deltel C, Beckstette M, Lemaitre C, Peterlongo P, Rizk G, Lavenier D, Wu Y-W, Singer SW, Jain C, Strous M, Klingenberg H, Meinicke P, Barton MD, Lingner T, Lin H-H, Liao Y-C, et al. 2017. Critical assessment of metagenome interpretation—a benchmark of metagenomics software. Nat Methods 14:1063–1071. doi:10.1038/nmeth.4458.
  36. Martin KJ, Rygiewicz PT. 2005. Fungal-specific PCR primers developed for analysis of the ITS region of environmental DNA extracts. BMC Microbiol 5:28. doi:10.1186/1471-2180-5-28.
  37. Xu Y, Lin Z, Tang C, Tang Y, Cai Y, Zhong H, Wang X, Zhang W, Xu C, Wang J, Wang J, Yang H, Yang L, Gao Q. 2019. A new massively parallel nanoball sequencing platform for whole exome research. BMC Bioinformatics 20:153. doi:10.1186/s12859-019-2751-3.
  38. Diop K, Dufour J-C, Levasseur A, Fenollar F. 2019. Exhaustive repertoire of human vaginal microbiota. Human Microbiome J 11:100051. doi:10.1016/j.humic.2018.11.002.
  39. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O'Neal L, McLeod L, Delacqua G, Delacqua F, Kirby J, Duda SN, REDCap Consortium. 2019. The REDCap consortium: building an international community of software platform partners. J Biomed Inform 95:103208. doi:10.1016/j.jbi.2019.103208.
  40. Martin M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17:10–12. doi:10.14806/ej.17.1.200.
  41. Rognes T, Flouri T, Nichols B, Quince C, Mahé F. 2016. VSEARCH: a versatile open source tool for metagenomics. PeerJ 4:e2584. doi:10.7717/peerj.2584.
  42. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO. 2013. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 41:D590–D596. doi:10.1093/nar/gks1219.
  43. Cole JR, Wang Q, Fish JA, Chai B, McGarrell DM, Sun Y, Brown CT, Porras-Alfaro A, Kuske CR, Tiedje JM. 2014. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Res 42:D633–D642. doi:10.1093/nar/gkt1244.
  44. Parks DH, Chuvochina M, Waite DW, Rinke C, Skarshewski A, Chaumeil P-A, Hugenholtz P. 2018. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36:996–1004. doi:10.1038/nbt.4229.
  45. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. doi:10.1038/nmeth.1923.
  46. Rotmistrovsky K, Agarwala R. 2012. BMTagger: Best Match Tagger for removing human reads from metagenomics datasets. http://hmpdacc.org/resources/tools_protocols.php
  47. Bushnell B. 2014. BBMap: a fast, accurate, splice-aware aligner. 9th Annual Genomics of Energy & Environment Meeting.
  48. Wood DE, Lu J, Langmead B. 2019. Improved metagenomic analysis with Kraken 2. Genome Biol 20:257. doi:10.1186/s13059-019-1891-0.
  49. Truong DT, Franzosa EA, Tickle TL, Scholz M, Weingart G, Pasolli E, Tett A, Huttenhower C, Segata N. 2015. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat Methods 12:902–903. doi:10.1038/nmeth.3589.
  50. Lu J, Breitwieser FP, Thielen P, Salzberg SL. 2017. Bracken: estimating species abundance in metagenomics data. PeerJ Comput Sci 3:e104. doi:10.7717/peerj-cs.104.
  51. LaPierre N, Alser M, Eskin E, Koslicki D, Mangul S. 2020. Metalign: efficient alignment-based metagenomic profiling via containment min hash. Genome Biol 21:242. doi:10.1186/s13059-020-02159-0.
  52. Zozaya-Hinchliffe M, Lillis R, Martin DH, Ferris MJ. 2010. Quantitative PCR assessments of bacterial species in women with and without bacterial vaginosis. J Clin Microbiol 48:1812–1819. doi:10.1128/JCM.00851-09.
  53. Akutsu T, Motani H, Watanabe K, Iwase H, Sakurada K. 2012. Detection of bacterial 16S ribosomal RNA genes for forensic identification of vaginal fluid. Leg Med (Tokyo) 14:160–162. doi:10.1016/j.legalmed.2012.01.005.
  54. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, Fierer N, Knight R. 2011. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci U S A 108(Suppl 1):4516–4522. doi:10.1073/pnas.1000080107.
Assessment of In Vitro and In Silico Protocols for Sequence-Based Characterization of the Human Vaginal Microbiome
Luisa W. Hugerth, Marcela Pereira, Yinghua Zha, Maike Seifert, Vilde Kaldhusdal, Fredrik Boulund, Maria C. Krog, Zahra Bashir, Marica Hamsten, Emma Fransson, Henriette Svarre Nielsen, Ina Schuppe-Koistinen, Lars Engstrand
mSphere Nov 2020, 5 (6) e00448-20; DOI: 10.1128/mSphere.00448-20

KEYWORDS

16S rRNA
PCR
amplicon
human microbiome
metagenomics
molecular methods
quantitative methods
vaginal microbiome

Copyright © 2021 American Society for Microbiology

Online ISSN: 2379-5042