mSphere
Research Article | Ecological and Evolutionary Science

Environmental Viral Genomes Shed New Light on Virus-Host Interactions in the Ocean

Yosuke Nishimura, Hiroyasu Watai, Takashi Honda, Tomoko Mihara, Kimiho Omae, Simon Roux, Romain Blanc-Mathieu, Keigo Yamamoto, Pascal Hingamp, Yoshihiko Sako, Matthew B. Sullivan, Susumu Goto, Hiroyuki Ogata, Takashi Yoshida
Hideyuki Tamaki, Editor
Yosuke Nishimura
(a) Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
(b) Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Hiroyasu Watai
(b) Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Takashi Honda
(b) Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Tomoko Mihara
(a) Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
Kimiho Omae
(b) Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Simon Roux
(c) Department of Microbiology, The Ohio State University, Columbus, Ohio, USA
Romain Blanc-Mathieu
(a) Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
Keigo Yamamoto
(d) Research Institute of Environment, Agriculture and Fisheries, Osaka Prefecture, Osaka, Japan
Pascal Hingamp
(a) Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
(e) CNRS, IGS UMR 7256, Aix Marseille Université, Marseille, France
Yoshihiko Sako
(b) Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Matthew B. Sullivan
(c) Department of Microbiology, The Ohio State University, Columbus, Ohio, USA
(f) Department of Civil, Environmental and Geodetic Engineering, The Ohio State University, Columbus, Ohio, USA
Susumu Goto
(a) Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
Hiroyuki Ogata
(a) Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
Takashi Yoshida
(b) Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Hideyuki Tamaki
National Institute of Advanced Industrial Science and Technology
Roles: Editor
DOI: 10.1128/mSphere.00359-16

ABSTRACT

Metagenomics has revealed the existence of numerous uncharacterized viral lineages, which are referred to as viral “dark matter.” However, our knowledge regarding viral genomes is biased toward culturable viruses. In this study, we analyzed 1,600 (1,352 nonredundant) complete double-stranded DNA viral genomes (10 to 211 kb) assembled from 52 marine viromes. Together with 244 previously reported uncultured viral genomes, a genome-wide comparison delineated 617 genus-level operational taxonomic units (OTUs) for these environmental viral genomes (EVGs). Of these, 600 OTUs contained no representatives from known viruses, thus putatively corresponding to novel viral genera. Predicted hosts of the EVGs included major groups of marine prokaryotes, such as marine group II Euryarchaeota and SAR86, from which no viruses have been isolated to date, as well as Flavobacteriaceae and SAR116. Our analysis indicates that marine cyanophages are already well represented in genome databases and that one of the EVGs likely represents a new cyanophage lineage. Several EVGs encode many enzymes that appear to function for an efficient utilization of iron-sulfur clusters or to enhance host survival. This suggests that there is a selection pressure on these marine viruses to accumulate genes for specific viral propagation strategies. Finally, we revealed that EVGs contribute to a 4-fold increase in the recruitment of photic-zone viromes compared with the use of current reference viral genomes.

IMPORTANCE Viruses are diverse and play significant ecological roles in marine ecosystems. However, our knowledge of genome-level diversity in viruses is biased toward those isolated from a few culturable hosts. Here, we determined 1,352 nonredundant complete viral genomes from marine environments. Lifting the uncertainty that clouds short incomplete sequences, whole-genome-wide analysis suggests that these environmental genomes represent hundreds of putative novel viral genera. Predicted hosts include dominant groups of marine bacteria and archaea with no isolated viruses to date. Some of the viral genomes encode many functionally related enzymes, suggesting a strong selection pressure on these marine viruses to control cellular metabolism by accumulating genes.

INTRODUCTION

Viruses outnumber microbes such as bacteria in the oceans (1), and the destructive lytic infections caused by viruses are thought to have crucial effects on energy and nutrient cycles driven by marine microorganisms (2, 3). Genomics-based research has been a powerful approach used to clarify the biology of viruses, including their infection strategies as well as their ecological significance (4–7). However, the diversity of viral genomes is still underrepresented in publicly available genome databases (8, 9). For example, SAR11 (Pelagibacterales) and SAR116 are major marine prokaryotic components, but only four and one phage genomes have been sequenced for these bacteria, respectively (10, 11). Cyanophages, for which about 100 genomes have already been characterized, are the sole exception.

To address the issue of the paucity of viral genomic data, Roux et al. analyzed publicly available prokaryotic genome sequence data to mine marine and nonmarine viral genomes that have been sequenced along with the genomes of their hosts (12). They identified 12,498 viral DNA sequences (either long fragments or whole circular genomes) representing 264 predicted new genera.

Culture-independent viral metagenomics is also an effective research option for analyzing viral genomes in complex marine microbial communities (9, 13–16). A decisive advantage of viral metagenomics stems from the small genomes of viruses. Viral genomes have so far been assembled from the metagenomes of the following viral types: RNA viruses (17, 18), single-stranded DNA (ssDNA) viruses (19–26), and double-stranded DNA (dsDNA) viruses (27–29). Among these viruses, the genomes of dsDNA viruses have been the most difficult to assemble from metagenomes because of their relatively large genomes. However, recent advances in the construction of libraries (30), sequencing technologies, and bioinformatics software have resulted in the generation of larger assemblies. For example, 7 complete dsDNA viral genomes have been reported for a hypersaline lake (27), 18 for the deep-sea hydrothermal vent plumes (28), and 54 for glacial cryoconite holes (29). An interesting approach involved the construction of metagenomic fosmid libraries from virus-infected prokaryotes, which revealed 1 (31) and 42 (32) complete viral DNA genomes for solar salterns and 208 marine tailed-phage genomes (33). These studies indicated that marine viral metagenomics investigations have advanced from focusing on environmental genetics (i.e., collections of genes) to analyzing environmental genomics (i.e., collections of complete genomes), helping to unveil the evolutionary histories, life cycles, and metabolic strategies of individual viruses. In this study, we analyzed nine novel marine viral metagenomes (i.e., viromes) generated using a benchtop Illumina/MiSeq sequencer as well as previously published large-scale viromes (9). We identified 1,352 nonredundant complete viral genomes, the vast majority of which corresponded to previously unidentified viral lineages.

RESULTS AND DISCUSSION

Choice of assemblers. We generated nine viromes (Osaka Bay viromes [OBVs]; 8.5 M read pairs; 2.4 Gbp) from water samples collected over a 24-h period in Osaka Bay, Japan (see Materials and Methods). We first compared four assemblers (SPAdes [34], metaSPAdes, IDBA-UD [35], and Ray Meta [36]) regarding their ability to assemble viromes. SPAdes, metaSPAdes, and IDBA-UD clearly outperformed Ray Meta in terms of the total size of >10-kb contigs (Table 1). Of the first three assemblers, SPAdes produced the largest assembly (11.9 Mb, compared with 6.8 Mb for metaSPAdes and 5.3 Mb for IDBA-UD). Regarding assembly error rates assessed by REAPR (37), SPAdes (8.48 regions/kb), metaSPAdes (8.73), and IDBA-UD (8.80) had similar error rates, which were slightly higher than that of Ray Meta (6.42). Most (99.97%) of these errors were short insertions/deletions (REAPR type 1 and type 3 errors), while there were very few (0 to 0.00662 regions/kb) scaffolding errors (type 2 and type 4 errors) (Table 1). On the basis of these results, we chose SPAdes as the best assembler for the subsequent analyses.

TABLE 1

Comparison of four assemblers

Forty-six genomes assembled from the Osaka Bay viromes. Given that the nine samples were collected at the same location over a short period and that the reads were relatively long (i.e., 2 × 150 or 2 × 300 bp), a coassembly of the nine pooled samples was also prepared. The coassembly resulted in 879 contigs (>10 kb) that likely originated from dsDNA viruses (see Materials and Methods). Of these, 46 (28.5 to 192 kb; average, 54.2 kb) were assembled in a circular form (see Fig. S1 in the supplemental material). Thus, we refer to these 46 contigs as environmental viral genomes (EVGs).
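
The criterion used to call a contig circular is described in Materials and Methods, which is not reproduced in this section. As a minimal sketch, assuming that circularity is inferred from an exact repeat shared by the two ends of a contig (a common heuristic, not necessarily the procedure used here), the check could look like the following. The function names and the `min_overlap` default are hypothetical.

```python
def terminal_overlap(seq, min_overlap=10):
    """Return the length of the longest exact match between the start and
    the end of seq (at least min_overlap long), or 0 if there is none.

    Assumed heuristic: a contig whose 3' end repeats its 5' end is a
    candidate circular genome.
    """
    seq = seq.upper()
    # The repeat cannot exceed half the contig length, otherwise the two
    # terminal copies would overlap each other.
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def trim_circular(seq, min_overlap=10):
    """Remove the duplicated terminal repeat from a circular candidate."""
    k = terminal_overlap(seq, min_overlap)
    return seq[:-k] if k else seq
```

The scan is quadratic in the worst case, so for contigs of tens to hundreds of kilobases a seed-and-extend search restricted to the contig ends would be a more practical choice.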

FIG S1

Length and read coverage distributions of OBV contigs. Lengths and read coverages of 46 EVGs (i.e., circular contigs; red) and 833 noncircular contigs (blue) are presented. Boxes represent first quartile, median, and third quartile. (A) Lengths. (B) Coverages.
Copyright © 2017 Nishimura et al.

This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

The EVGs did not contain any scaffolding errors (REAPR type 2 and type 4 errors), indicating high structural integrity for the contigs. To further assess the integrity of these EVGs, we mapped the contigs assembled from individual viromes on the EVGs. Of the 46 EVGs, 16 were completely covered by the contigs from individual assemblies, thus decreasing the possibility of artefactual chimeras due to coassembly for these 16 EVGs. The remaining 30 EVGs contained 1 to 24 regions (229 in total) that were supported only by coassembly and were not observed in the individually assembled contigs. We randomly selected 21 such weakly supported regions and tested the coassemblies by PCR assays (using the environmental DNA samples as a template) and sequencing. The results verified all of the tested regions of the coassembled contigs (Fig. S2A). Furthermore, 18 of the 46 EVGs exhibited complete or nearly complete genomic colinearity with closely related reference genomes (Fig. S2B; see Materials and Methods for the definition of genomic colinearity) or with the other independently determined EVGs described below (Fig. S2C). These results further corroborated the accuracy of the overall structure of the EVG assemblies.

FIG S2

Assessment of the integrity of OBV contigs. (A) PCR validation of 21 weakly supported regions of OBV_N00005, OBV_N00020, OBV_N00021, and OBV_N00023. All PCR products for the 21 regions were the expected sizes, and their sequences confirmed the OBV coassemblies. Ladder: 2-log DNA ladder (0.1 to 10.0 kb; New England Biolabs, Ipswich, MA). (B) Dot plots of 46 OBV-EVGs and their most closely related reference genomes. The sequences in the comparison were selected on the basis of the best SG score for the RVG data set (i.e., prokaryotic dsDNA viruses). (C) Dot plots of 46 OBV-EVGs and their most closely related viral genomes. The sequences in the comparison were selected on the basis of the best SG score for the EVGs and RVGs (dsDNA viruses). Genome sequence IDs beginning with “TARA_” represent TOV-EVGs obtained in this study, and IDs with “AP” represent other previously described EVGs (33). In panels B and C, Bg represents the percentage of OBV-EVG genes that had orthologous relationships (bidirectional best hits of BLASTp; E value, <1e−5) with the genome in the comparison. Cg corresponds to the percentage of genes in each OBV-EVG that are present in colinearity regions as determined by MCScanX with default settings. These dot plots are sorted by Bg (from upper left to bottom right on three pages). All tBLASTx alignments are shown. The sequences in the dot plots are circularly permuted and/or reversed for clarity. Grid lines are drawn for every 10 kb, and bold grid lines indicate 50-kb intervals. The color scale represents tBLASTx percent identity.

SNPs and nucleotide diversity. Each of the individual EVGs likely corresponds to genomes of closely related viruses because the sequence assemblies were obtained from environmental viral populations. To assess the genetic diversity of each EVG, we analyzed single nucleotide polymorphisms (SNPs) and calculated the nucleotide diversity of each EVG. Nucleotide sites containing SNPs that were supported by at least one read were present in genomes at a rate of 0.558 to 7.897% (median, 2.473%) (see Table S1A in the supplemental material). The nucleotide diversity of EVGs was 0.073 to 1.734% (median, 0.423%). These results are within the ranges for genomes from the same viral species (38). We conclude that each of the EVGs represents a consensus genome of a viral species.
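
The exact estimator behind these figures is not given in this section. As an illustrative sketch, assuming that per-site allele counts are taken from reads mapped to an EVG and that nucleotide diversity is the mean, over all sites, of the probability that two randomly drawn reads differ, the two statistics could be computed as follows. The function names and the pileup representation (one dictionary of base counts per genome position) are assumptions made for the example.

```python
def site_diversity(counts):
    """Probability that two reads drawn without replacement differ at a site.

    counts maps each observed base to its read count at that site.
    """
    n = sum(counts.values())
    if n < 2:
        return 0.0
    # Ordered read pairs carrying the same base, over all ordered pairs.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def genome_diversity(pileup):
    """Mean per-site diversity over every position of the genome."""
    if not pileup:
        return 0.0
    return sum(site_diversity(c) for c in pileup) / len(pileup)


def snp_site_fraction(pileup, min_reads=1):
    """Fraction of sites where a non-major base has at least min_reads reads."""
    if not pileup:
        return 0.0
    threshold = max(min_reads, 1)
    snp_sites = 0
    for counts in pileup:
        ranked = sorted(counts.values(), reverse=True)
        # The second-largest count is the best-supported alternative base.
        if len(ranked) > 1 and ranked[1] >= threshold:
            snp_sites += 1
    return snp_sites / len(pileup)
```

For a four-site toy pileup such as `[{"A": 10}, {"A": 9, "G": 1}, {"C": 5, "T": 5}, {"G": 10}]`, half of the sites carry a SNP, while the mean diversity is much lower because the singleton variant at the second site contributes little.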

TABLE S1

(A) SNP and nucleotide diversity of 46 OBV-EVGs. (B) List of 4,240 prokaryotic dsDNA virus genomes in the viral proteomic tree (sorted in the order used for Fig. 1; clockwise) and the corresponding genus-level gOTUs. (C) Host predictions of 29 EVGs by genome-wide sequence similarity (SG). (D) Predicted virion structural and packaging proteins encoded among 58 putative archaeal EVGs. (E) Predicted Fe-S cluster assembly genes and Fe-S-related genes in nine T4-like EVGs. (F) Photosynthetic genes detected in a set of EVGs/RVGs. (G) Gene annotation table of aceBA encoding TOV-EVG (TARA_ERS478052_N000008). (H) PCR primer pairs for validating OBV-EVG assemblies. (I) Seed sequences of photosynthetic genes for PSI-BLAST analysis.

One thousand five hundred genomes assembled from the Tara Oceans viromes. Prompted by the detection of 46 OBV-EVGs in a modest sequencing effort, we applied our genome assembly and complete genome identification protocol to the Tara Oceans viromes (TOV), which consist of 43 viromes representing 26 oceanic locations (9). Given the wide geographic areas and seasons covered by these samples and the large volume of sequence data for individual TOV samples (i.e., average, 50 M reads; 2 × 100 bp), we assembled these 43 viromes individually. We obtained 1,554 TOV-EVGs (i.e., circular complete contigs, 10 to 211 kb) with a predicted viral origin. Only 64 were detected as complete in the previously reported original TOV assemblies (9), and 85.6% of the remaining EVGs (i.e., 1,275 EVGs) were detected in the original assemblies as smaller contigs less than half the size of the contigs in these new assemblies. Clustering on the basis of the nucleotide sequence identity among the OBV-/TOV-EVGs resulted in 1,352 nonredundant complete genomes (i.e., 46 OBV-EVGs and 1,306 TOV-EVGs).

After discarding possible eukaryotic virus genomes, we obtained 1,567 complete genomes that were likely of prokaryotic dsDNA viral origin (45 OBV-EVGs and 1,522 TOV-EVGs; see Materials and Methods). Of these genomes, 1,404 (89.6%) were predicted to encode homologs of tailed-virus hallmark proteins (i.e., terminase large subunits [89.5%], major capsid proteins [34.4%], or portal proteins [60.2%]), suggesting that the genomes were derived from tailed viruses. Of the remaining 163 EVGs, 72 were predicted to encode integrase homologs.

Diversity of environmental viral genomes. To investigate the global novelty offered by culture-independent viral genome sequencing efforts, we compiled a set of 1,811 EVGs (>10 kb) composed of the 45 OBV-EVGs, the 1,522 TOV-EVGs, and 244 EVGs from other studies (29, 33, 39). We also compiled a set of 2,429 prokaryotic dsDNA viral genomes (>10 kb) from cultured viruses, which are referred to here as reference viral genomes (RVGs) (Fig. S3; Table S1B).

FIG S3

Length and percent G+C content of EVGs and RVGs. Lengths (top panel) and percent G+C content (bottom panel) of 1,811 EVGs and 2,429 RVGs are presented. The EVGs (i.e., 45 OBV-EVGs, 1,522 TOV-EVGs, and 244 other EVGs) were categorized on the basis of data sources. The other EVGs include viral genomes from uvMED (33), cryoconite (29), and SAG (39). Boxes represent the first quartile, median, and third quartile. Dot colors represent the following data sources: OBV (red), TOV (blue), other EVG (green), and RVG (gray).

We first generated a viral proteomic tree (40) on the basis of genomic similarity scores (denoted by SG) derived from tBLASTx scores. The SG value is 1 when two genomes in a comparison are identical and decreases to 0 when a tBLASTx search fails to detect any sequence similarities. The viral proteomic tree revealed a clear separation between EVG and RVG clades, with most of the EVGs grouped with other EVGs and not with the RVGs (Fig. 1). We also used average linkage clustering of the EVGs/RVGs to delineate operational taxonomic units (i.e., genomic OTUs [gOTUs]) on the basis of the SG value, with six different clustering cutoff values (Fig. 2 for cutoff SG = 0.15 and Fig. S4 for all cutoff values from 0.1 to 0.9). The EVG-containing gOTUs outnumbered the RVG-containing gOTUs at five of six tested SG cutoff values. For example, we observed a 1.6-fold EVG-to-RVG gOTU overrepresentation ratio at SG = 0.3 (Fig. S4A). The proteomic tree and comparative genome maps are available at http://www.genome.jp/viptree/EVG2017.
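
To make the gOTU delineation concrete, the following is a minimal sketch of average linkage clustering applied directly to pairwise SG values. It assumes that clusters are merged for as long as their mean pairwise SG is at or above the cutoff and that unreported pairs have SG = 0; the software and tie handling actually used are not stated in this section, and the function name and input layout are hypothetical.

```python
from itertools import combinations


def average_linkage_gotus(names, sg, cutoff):
    """Group genomes into gOTUs by average linkage on similarity.

    names  : list of genome identifiers
    sg     : dict mapping frozenset({a, b}) to the SG value of that pair;
             missing pairs are treated as SG = 0 (assumption)
    cutoff : merging stops when the best average SG falls below this value
    """
    clusters = [[name] for name in names]

    def avg_sim(c1, c2):
        total = sum(sg.get(frozenset((a, b)), 0.0) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the pair of clusters with the highest mean pairwise SG.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]),
        )
        if avg_sim(clusters[i], clusters[j]) < cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    # Sort for a deterministic, readable result.
    return sorted(sorted(c) for c in clusters)
```

This naive version recomputes all pairwise averages at every step and is meant only to show the logic. For thousands of genomes, an optimized routine such as `scipy.cluster.hierarchy.linkage` with `method="average"` on the distance 1 − SG would be the usual choice.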

FIG S4

Genomic operational taxonomic unit (gOTU) richness with six different thresholds. The EVGs and RVGs were clustered together, and the gOTU subsets of the EVGs and RVGs were then constructed by extracting each member. (A) Rarefaction curves for the number of gOTU clusters. The cutoffs of the genome-wide proteomic similarity score (i.e., SG) for clustering are indicated in each plot. Numbers in parentheses represent the actual number of genomes and gOTUs. Rarefaction curves are presented with shading representing 95% confidence intervals obtained from 100 bootstrap replicates using the R package iNEXT (107). Dashed curves represent extrapolations to 5,000 genome sequences. Chao1 richness estimates for the EVGs and RVGs are indicated. (B) Proportions of three types of clusters, EVG-only clusters (red), RVG-only clusters (blue), and shared clusters (gray). Proportions determined by the number of gOTU clusters (left) and by the number of genomes (right) are presented.

FIG 1

Proteomic tree. The dendrogram represents proteome-wide similarity relationships among 4,240 prokaryotic dsDNA virus genomes. Branches are colored orange (EVG, environmental viral genome) or black (RVG, reference viral genome), and branch lengths are indicated using a logarithmic scale. TOV, Tara Oceans viromes; OBV, Osaka Bay viromes. The tree is midpoint rooted. Rings outside the dendrogram represent, from inside to outside, sources of genome data, taxonomic groups of known hosts, and viral family classifications.

FIG 2

Genus-level genomic OTU (gOTU) richness. The genome-wide similarity score (SG) cutoff for clustering was set to 0.15 (i.e., viral-genus-level cutoff). The EVGs and RVGs were clustered together, and subsets of the EVGs and RVGs were then constructed by extracting each member. (A) Rarefaction curves for the number of gOTUs. Rarefaction curves are presented with shading representing 95% confidence intervals obtained from 100 bootstrap replicates using the R package iNEXT (107). Dashed curves represent extrapolations to 5,000 genome sequences. Numbers in parentheses represent the number of genomes and gOTUs. Chao1 richness estimates for the EVGs and RVGs are indicated. (B) Proportions of genus-level gOTU clusters. Colors represent the following cluster categories: EVG-only clusters (red), RVG-only clusters (blue), and shared clusters (gray).

Genus-level operational taxonomic units. We analyzed the viral taxonomic classification of the RVGs and evaluated the correspondence between viral genera and gOTUs using different SG cutoff values. The SG values between 0.07 and 0.2 were associated with relatively high adjusted Rand index values (i.e., > 0.79), and SG = 0.15 (adjusted Rand index = 0.847) was determined to be the most accurate cutoff value for a genus-level classification (Fig. S5). With this cutoff value, we obtained 1,087 gOTUs for the EVGs/RVGs. The 2,429 RVGs were distributed across 487 gOTUs, whereas the 1,811 EVGs were distributed across 617 gOTUs (i.e., 1.27-fold-higher richness), with only 1.6% of the total gOTUs (17 of 1,087) containing both EVGs and RVGs (Fig. 2B). Therefore, the EVGs potentially represent 600 new viral genera. Of the 600 gOTUs, 497 were composed exclusively of OBV-/TOV-EVGs. To complement this analysis, we added 11,779 mined viral genomes (MVGs; genome sizes, >10 kb) (12). We observed only a limited overlap of gOTUs among the EVGs, RVGs, and MVGs (i.e., only two gOTUs with sequences from all three sets), and 590 genus-level gOTUs remained specific to the EVGs.
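
The adjusted Rand index used to pick the cutoff measures how well two partitions of the same items agree, corrected for chance agreement. The sketch below is a plain implementation of the standard Hubert-Arabie formula, shown here only to illustrate the comparison between genus assignments and gOTU assignments; the toy labels are invented, and the tool used in the study is not named in this section.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0
    # Pairs of items placed together in both partitions.
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values()
    )
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case: both partitions are trivial and identical.
        return 1.0
    return (together_both - expected) / (maximum - expected)
```

With hypothetical genus labels `["T4", "T4", "T7", "T7", "P22", "P22"]`, a clustering that recovers the three genera exactly scores 1, whereas one that lumps the first four genomes into a single gOTU scores lower. Scanning the SG cutoff and keeping the value with the highest index corresponds to the selection described above.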

FIG S5

Evaluation of SG cutoff levels for the delineation of genus-level gOTUs. The correspondence between the genus-level classification of known viruses and their grouping into gOTUs was evaluated using the adjusted Rand index for various SG cutoff values (see Materials and Methods). Numbers in parentheses represent the SG value (0.15) where the adjusted Rand index reached the highest value (0.847).

Virus-host interactions. (i) Host prediction on the basis of genomic similarity. Because of the dissimilarity between EVGs and RVGs, host predictions on the basis of similarities to known viral genomes (i.e., RVGs) were difficult to make. Using information regarding RVG hosts, we calculated an optimal SG threshold that separated viruses into those that infect similar hosts and those that do not. The threshold was an SG value of >0.2937 (>90% precision) for the prediction of pairs of viruses infecting host organisms that are evolutionarily related at the genus level (Fig. S6). With this cutoff, we predicted host groups for only 29 of 1,811 EVGs (2 OBV-EVGs, 13 TOV-EVGs, and 14 other EVGs; Table S1C). Of the 29 EVGs, 18, 10, and 1 were predicted to be cyanophages, Pelagibacter phages, and Pseudoalteromonas phages, respectively. Two additional host prediction methods based on tRNA genes and clustered regularly interspaced short palindromic repeat (CRISPR) spacer sequences (41) failed to predict possible hosts for the EVGs. However, the physical linkage of genes on the EVGs provided additional clues about their hosts and biology. In the following sections, we describe virus-host interactions inferred from the genomic contexts of EVGs.
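
A precision-based threshold of this kind can be derived from a list of scored genome pairs labeled by whether the two viruses share a host group. The sketch below assumes such a list and returns the lowest observed SG value at which the predictions at or above it still reach the target precision. It is an illustration of the idea rather than the procedure behind Fig. S6, and the function name and input format are hypothetical.

```python
def lowest_cutoff_for_precision(scored_pairs, target=0.9):
    """Lowest SG cutoff whose predictions reach the target precision.

    scored_pairs : iterable of (sg, same_host_group) tuples
    Returns the smallest observed SG value t such that, among pairs with
    SG >= t, the fraction with same_host_group=True is >= target, or None
    if no cutoff qualifies.
    """
    ranked = sorted(scored_pairs, key=lambda pair: pair[0], reverse=True)
    best = None
    true_positives = 0
    for rank, (sg, same_host) in enumerate(ranked, start=1):
        true_positives += bool(same_host)
        # Evaluate tied scores together, at the last member of the tie.
        if rank < len(ranked) and ranked[rank][0] == sg:
            continue
        if true_positives / rank >= target:
            best = sg
    return best
```

Because precision is not guaranteed to decrease monotonically as the cutoff is lowered, this function keeps the lowest qualifying value. A more conservative variant would stop at the first cutoff where precision drops below the target.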

FIG S6

Evaluation of host group predictions. A precision curve for predictions of host taxonomic groups (mainly at the genus level, except for Cyanobacteria and Enterobacteriaceae; see Materials and Methods) was generated using the genome-wide proteomic similarity score (i.e., SG). We estimated the precision of host group predictions for a subset of the RVGs used in this study (i.e., 2,429 prokaryotic dsDNA viruses), each of which was uniquely assigned to the host taxonomic group. For each RVG, the best SG values for the members of the same host taxonomic group (i.e., mostly the same genus), and for the different host taxonomic groups, were recorded (i.e., 2,570 SG scores). The precision curve was generated using sliding SG cutoff values.

(ii) MGII viruses. Four previously undescribed lineages that likely infect unculturable marine group II (MGII) Euryarchaeota species were revealed in the proteomic tree. These four clades were exclusively composed of OBV/TOV-EVGs, with 18, 13, 23, and 4 EVGs in clades 1, 2, 3, and 4, respectively (Fig. 3). Phylogenetic analyses of the DNA polymerases encoded in those EVGs strongly support the existence of the four clades identified in the proteomic tree (Fig. 4A). These clades were grouped with homologs from haloviruses and euryarchaea. The identification of archaeal hosts for the 58 EVGs was also supported by their gene content. Of the genes in the EVGs with homologs in cellular organisms, an average of 36.1% (14.3 to 60.0%) were most closely matched to archaeal proteins. Additionally, one to five tailed-virus structural protein homologs were detected in each of the EVGs (Table S1D). Archaeal tailed viruses have been detected only in Euryarchaeota species (42), with the exception of a provirus of Nitrososphaera viennensis (Thaumarchaeota) isolated from soil (43).

FIG 3

Fifty-eight putative archaeal virus genomes. (A) Part of the proteomic tree with 2 OBV-EVGs (red) and 56 TOV-EVGs (orange), predicted to be derived from euryarchaeal tailed viruses infecting marine group II (MGII) species. Genomes with genes encoding DNA polymerase B (squares) and chaperonin (triangles) are indicated. Clade names and genus-level gOTUs are indicated. Numbers in parentheses represent the number of genomes of each clade or gOTU. The ranges of genome sizes and percent G+C contents for each clade are presented, with the exception that clade 2 includes a long contig (121 kb; asterisk). Branch lengths are logarithmically scaled from the root of the entire proteomic tree in Fig. 1. (B) Genome map of nine archaeal viral genomes that are indicated by stars in panel A. The sequences are circularly permuted and/or reversed. Red arrows indicate the original start position of the sequences. Putative gene functions are indicated. All tBLASTx alignments are represented by colored lines between the two genomes. The color scale represents tBLASTx percent identity.

FIG 4

Gene phylogenetic trees of DNA polymerase B and chaperonin. (A) Maximum likelihood tree of DNA polymerase B. The tree is rooted by four distant bacterial sequences (not shown) and includes 348 sequences. (B) Maximum likelihood tree of chaperonin. The tree is midpoint rooted and includes 381 sequences. In panels A and B, numbers in parentheses represent the number of sequences in each collapsed node. Colors represent taxonomies. Asterisks indicate collapsed nodes that include MGII (*) and MGI (**) sequences. The scale bar refers to the estimated number of amino acid substitutions per site. Numbers near the nodes represent bootstrap percentages of >50%. MGIIA and MGIIB indicate sequences from reported genomes (45 and 46, respectively).

We observed that the EVGs contained chaperonin genes (Fig. 3A). Thirty-eight of the 58 EVGs encode chaperonin homologs, even though chaperonin genes have rarely been identified in sequenced viral genomes (i.e., only 7 of the 2,429 RVGs encode chaperonins). In some viruses, chaperonins, which are usually provided by the hosts, are responsible for the correct assembly of viral particles (44). All 18 EVGs in clade 1 encode archaeon-type chaperonin homologs (i.e., thermosome; group II chaperonin), while 20 EVGs in clades 2 to 4 encode bacterium-type chaperonin homologs (i.e., GroEL; group I chaperonin). We detected both groups of chaperonin genes in the MGII genomes (45, 46). The group I and group II chaperonin sequences from the EVGs were grouped with these MGII chaperonins (Fig. 4B), suggesting that MGII species serve as hosts for these environmental viruses.

The following three archaeal taxa are abundant in the marine water column: marine group I Thaumarchaeota (MGI), MGII, and marine group III Euryarchaeota (MGIII) (47). Of these, currently cultivated representatives exist only in MGI (48). The members of MGII are abundant in particle-rich surface waters (49, 50), while those of MGIII have been observed almost exclusively in deep seas (47). A recent study revealed that MGII members can temporarily become the most abundant (up to 40%) prokaryotic components in the days following a spring bloom (51). The 58 EVGs were derived from surface or deep chlorophyll maximum viromes, suggesting their photic-zone habitat. These observations and the genomic context described above suggest that the 58 EVGs represent genomes of tailed viruses infecting MGII Euryarchaeota species.

(iii) A SAR86 phage encoding IscU. Iron-sulfur (Fe-S) cluster proteins are involved in a variety of biological processes, including gene regulation, electron transfer, catalytic reactions, and oxygen-iron sensing (52). In previous studies, Fe-S cluster assembly protein genes (e.g., sufA and iscU) were identified as auxiliary metabolic genes (AMGs) of photic-zone viromes (15, 53). However, the lack of complete genome data hindered further characterizations of the viruses carrying these genes. We identified 16 OBV/TOV-EVGs with Fe-S cluster assembly protein genes, including 14 EVGs containing an Fe-S cluster A-type carrier (ATC) gene (54) and 6 EVGs carrying the IscU gene (Fig. 5A). These genomes are scattered across four groups of viruses in the proteomic tree, and many of their close relatives (i.e., other EVGs and Pelagibacter phage HTVC008M in Fig. 5A) do not contain these genes. The ATC and IscU proteins function as scaffolds in which Fe and S atoms are assembled into Fe-S clusters (55, 56). Phylogenetic trees of IscU (Fig. 5B) and ATC (Fig. S7) revealed that all six EVG-encoded IscU genes form a clade with gammaproteobacterial homologs. Of these, an IscU gene from OBV_N00005 was phylogenetically closely related to homologs from SAR86 (57), suggesting that SAR86 members represent potential hosts for OBV_N00005. The prevalence of these viral genes in photic-zone viromes (15) appears to be linked to the wide distribution of these bacteria.

In addition to the Fe-S scaffolding proteins, some of the EVGs encode several Fe-S cluster proteins that use Fe-S clusters as prosthetic groups, such as radical S-adenosylmethionine (SAM) superfamily enzymes (58) and CRISPR-associated Cas4 exonucleases (59, 60). The EVGs also encode proteins involved in the metabolism of Fe-S cluster proteins, such as glutaredoxins (Grx), the phenylacetyl-coenzyme A oxygenase component PaaD (61, 62), and ClpP, which is a serine protease targeting Fe-S cluster proteins (15). A notable example is the T4-like TARA_ERS488813_N000010 (183 kb; group iv in Fig. 5A), which includes an ATC gene, 12 genes for radical SAM superfamily enzymes, and cas4, grx, and paaD (16 genes in total; Table S1E). Other T4-like EVGs encoding ATC and/or IscU proteins contain two to seven additional Fe-S-related genes. Of these genes, paaD has not been previously associated with a virally encoded protein and thus represents a novel AMG. These observations suggest that Fe-S cluster assembly proteins encoded in these viral genomes function as a part of Fe-S cluster-related metabolic processes involving not only host proteins but also many virally encoded proteins.

FIG S7

Maximum likelihood tree of ATC genes. The tree includes OBV_N00005 (red), 13 TOV-EVGs (orange), and 58 published reference sequences (54). The reference sequences are colored as follows: alphaproteobacteria and mitochondria (brown), betaproteobacteria (cyan), gammaproteobacteria (blue), and others (black). The group numbers (i.e., ii, iii, and iv) correspond to the groups of EVGs described in the Fig. 5A legend. Subfamilies of ATCs (i.e., ATC-I, ATC-II, and ATC-III) correspond to those described in reference 54. Escherichia coli ErpA, IscA, and SufA are indicated. The tree is midpoint rooted. The scale bar refers to the estimated number of amino acid substitutions per site. Numbers close to the nodes represent bootstrap percentages of >50%. Download FIG S7, PDF file, 0.3 MB.
Copyright © 2017 Nishimura et al.

This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

FIG 5

Genomes with Fe-S cluster assembly-related genes. (A) Four parts of the proteomic tree with genomes carrying Fe-S cluster assembly genes (i.e., ATC [■] and IscU [▲] genes). Branch lengths are logarithmically scaled as described for Fig. 3A. Genus-level gOTUs and genome identifiers (IDs), lengths, and percent G+C compositions are indicated. (B) Maximum likelihood tree of IscU genes. The tree contains protein sequences encoded in OBV_N00005 (red), five TOV-EVGs (orange), and 21 Proteobacteria and cyanobacterial genomes (black). The scale bar refers to the estimated number of amino acid substitutions per site. Numbers close to the nodes represent bootstrap percentages of >50%. The tree is rooted by the cyanobacterial Nostoc species sequence. (C) Genome map of OBV_N00005 and Pelagibacter phage HTVC008M. The HTVC008M sequence is circularly permuted at 97,000 bp and reversed. A red arrow indicates the original start position of the HTVC008M sequence. Putative gene functions of OBV_N00005 and HTVC008M (described in reference 10) are indicated. All tBLASTx alignments are represented by colored lines between the two genomes. The color scale represents tBLASTx percent identity. FAD, flavin adenine dinucleotide; NAD, nicotinamide adenine dinucleotide.

(iv) A novel cyanophage lineage. The RVG set included 114 cyanophage genomes, which were grouped into 17 viral-genus-level gOTUs. There were no other RVGs classified into these gOTUs. Of these 17 gOTUs, 5 included 34 EVGs (i.e., 3 OBV-EVGs, 16 TOV-EVGs, and 15 previously described EVGs [33]), which are likely to have been derived from cyanophages or their relatives. Screening all EVGs with 11 photosynthesis-related AMGs (see Materials and Methods) led to the identification of 11 predicted cyanophage EVGs, of which 10 were included in the gOTUs mentioned above (Table S1F). The remaining EVG (i.e., TARA_ERS489084_N000023; gOTU G241), which carries psbA and hli, formed a singleton gOTU and represents a new cyanophage group. To characterize the approximate abundances of these 18 cyanophage gOTUs (149 genomes; Table 2), we mapped the TOV and OBV reads on these putative cyanophage genomes. The following five most abundant gOTUs represented >98% of the total cyanophage content: (i) G386, including T4-like myoviruses (35.1%); (ii) G14, including podoviruses (33.7%); (iii) G234, including a siphovirus and dwarf myoviruses (23.4%); (iv) G238, including Synechococcus phage S-EIV1 (63) (3.3%); and (v) G15, including Prochlorococcus phage P-RSP2 (3.2%) (Tables 2 and S1B for the list of genomes). Thus, marine cyanophage genomes are well represented in the current databases.

TABLE 2

Photosynthetic genes and abundance of cyanophage genomes

(v) Diverse marine Bacteroidetes phages. Bacteroidetes is one of the most abundant bacterial phyla in the oceans (e.g., 30% of the bacterioplankton during phytoplankton blooms) (64). Members of this phylum are involved in the decomposition and remineralization of phytoplankton biomass (65). A recent study revealed that an algal bloom is followed by the presence of a rapid succession of diverse Flavobacteriaceae bacteria (64). To the best of our knowledge, the genomes of the following 38 phages infecting marine Bacteroidetes (Flavobacteriaceae) have been described: psychrophilic Flavobacterium phage 11b (66), Croceibacter phage P2559S (67), 2 Persicivirga phages (68), 31 Cellulophaga phages (69), Flavobacterium phage 1/32 (70), and 2 Polaribacter phages (71). Polaribacter was reported to be abundant following a spring phytoplankton bloom (64), while Cellulophaga phages (31 of 38) likely represent a “rare biosphere” rather than abundant marine phages (69). We detected two groups (i.e., groups 1 and 2) of putative Flavobacteriaceae phage genomes (i.e., 5 RVGs, 8 OBV-EVGs, 222 TOV-EVGs, and 9 EVGs from another study; Fig. 6). Group 1 and group 2 consisted of 29 and 25 gOTUs, respectively. Of these, 23 and 21 gOTUs were exclusively composed of OBV/TOV-EVGs. Of the genes in the OBV/TOV-EVGs having homologs in cellular organisms, 64.4% (15.8% to 92.3%) on average for the members of group 1 and 32.4% (10.5% to 59.1%) on average for the members of group 2 were most similar to Bacteroidetes genes. For example, the gene20 sequence of OBV_N00025 (group 2, G506; Fig. 6B) was most similar to the RNA polymerase sigma-70 factor sequence of a Flavobacteria strain from marine surface water (WP_009781949; Leeuwenhoekiella blandensis; E value = 1e-30) (72, 73). Genomes of these groups also encode conserved virion structural or morphogenetic proteins. 
For the members of group 1, we detected putative portal gene homologs in 148 EVGs (93.7%) and prohead protease homologs in 145 EVGs (91.8%). For the members of group 2, we detected homologs of two to six structural proteins of Cellulophaga phage phi38:1 (i.e., a member of group 2) in 78 EVGs (100%). Additionally, we detected GroEL homologs in 36 EVGs of the members of group 2 (Fig. 6B) which were phylogenetically related to the homologs in Cellulophaga phages (Fig. 4B). Therefore, these EVGs probably correspond to viruses of Flavobacteriaceae species and may prove to be useful genetic markers for studying viruses affecting bacterial decomposer communities.

FIG 6

Two parts of the proteomic tree with EVGs of putative Flavobacteriaceae phages. Branch lengths are logarithmically scaled as described for Fig. 3A. Genus-level gOTUs are indicated. Numbers in parentheses represent the number of genomes in each gOTU. (A) Group 1 distributed in 29 gOTUs, including two Persicivirga phages (black), 5 OBV-EVGs (red), 147 TOV-EVGs (orange), and 7 other EVGs (blue). (B) Group 2 distributed in 25 genus-level gOTUs, including two Cellulophaga phages (phi40:1 and phi38:1; black; G508), IAS virus (black; G520), 3 OBV-EVGs (red), 75 TOV-EVGs (orange), and 2 other EVGs (blue). Genomes encoding chaperonins are indicated by a triangle.

(vi) A virus potentially enhancing the adaptation of its host. Isocitrate lyase (AceA) and malate synthase (AceB) catalyze two reactions in the glyoxylate shunt, which bypasses the CO2-generating steps of the tricarboxylic acid cycle and enables the net assimilation of carbon from acetyl-coenzyme A (acetyl-CoA), leading to gluconeogenesis (i.e., generation of glucose) and cell growth (74, 75). We identified an aceBA operon in a TOV-EVG (TARA_ERS478052_N000008; 179 kb; see Table S1G for gene description) that included homologs of three structural genes from T4-like phages. Our genomic similarity and gene composition analysis did not provide any clue about the host of this virus. A previous study detected aceA and aceB in ocean viromes (14), but this is the first time, to our knowledge, that an aceBA operon has been observed in a complete viral genome. The genome also encoded six enzymes (i.e., Gmd, WcaG, ManC, NeuA, KdsA, and WaaG) for the biosynthesis of lipopolysaccharides (LPS) and capsular polysaccharides, important components of bacterial cell wall and capsule (76, 77). Previous studies identified LPS synthesis genes in temperate and lytic phages and proposed that these genes function to modify cell surface compositions to prevent other viruses from attaching to the cell during the lysogenic or pseudolysogenic phase, in the latter of which a lytic process is halted due to suboptimal host cell growth (78, 79). Following this “lock out” hypothesis, the aceBA-carrying virus (i.e., TARA_ERS478052_N000008) may have a provirus phase, and AceA and AceB may function to promote the growth of host cells. gene40 of the TOV-EVG was predicted to encode a homolog of zeta toxin proteins (Table S1G) thought to be involved in a toxin-antitoxin system. Toxin-antitoxin systems enhance the stability of plasmids and prophages by postsegregational killing (80). 
This is consistent with the existence of a lysogenic phase of this virus, though there was no other evidence for lysogeny in the viral genome. It should be further noted that the function of LPS is not limited to protection of the cell from viral infection but that LPS on the bacterial outer membrane confers tolerance of temperature and oxidative stresses as well as resistance to antibiotics (81). Therefore, aceBA and the cell wall biogenesis genes in the TOV-EVG may contribute to a host’s survival and environmental adaptation by altering carbon metabolism and cell surface compositions during the lysogenic phase.

(vii) Temperate phages of SAR116. Our analysis also unveiled phage genomes likely infecting members of the SAR116 clade, which is one of the most abundant marine bacterial lineages (11). OBV_N00085 (40 kb) and three closely related TOV-EVGs (40 to 41 kb; SG for OBV_N00085 = 0.25 to 0.26) exhibited clear colinearity with an approximately 40-kb genomic segment from “Candidatus Puniceispirillum marinum” IMCC1322 of the SAR116 clade (class: Alphaproteobacteria) (Fig. 7 for OBV_N00085) (82). This suggests that these EVGs are derived from temperate phages infecting SAR116 or related bacteria. These genomes consistently encode integrases.

FIG 7

Genomic alignment between the whole sequence of OBV_N00085 and a genomic region (385,000 to 425,000 bp) of “Candidatus Puniceispirillum marinum” IMCC1322. The OBV_N00085 sequence is circularly permuted at 15,000 bp for clarity, and a red arrow indicates the original start position of the sequence. Putative gene functions and function categories of OBV_N00085 are indicated by texts and colors. All tBLASTx alignments are presented. The color scale represents tBLASTx percent identity.

(viii) Phages related to SAR11 phages. Seven EVGs (OBV_N00073, three TOV-EVGs, and three other EVGs; 39 to 42 kb) exhibited high genome-wide sequence similarities to Pelagibacter podovirus HTVC019P (10) (SG = 0.34 to 0.44; 42 kb; a dot plot comparing OBV_N00073 and HTVC019P is presented in Fig. S2B). On the basis of the SG values (i.e., >0.2937; estimated precision, >90%), we predict that these EVGs infect host species in the genus Pelagibacter (Table S1C). Another Pelagibacter podovirus (i.e., HTVC010P), which is believed to be a member of the most abundant virus subfamily in the biosphere (10), was classified in a different group of the proteomic tree together with 102 EVGs (OBV_N00107, 77 TOV-EVGs, and 24 other EVGs; 31 to 73 kb; Fig. S8). These 102 genomes carry homologs of HTVC010P structural protein genes. The G+C content of the HTVC010P genome is 32% (10), while the EVGs of this group span a wide range of G+C content (i.e., 31 to 57%). Low levels of G+C content (i.e., 28.6 to 32.3%) are a common genomic feature of the SAR11 clade members (83). Since high levels of correlation between the G+C content of prokaryotic viruses and that of their hosts were previously reported (84, 85), the variation in the levels of their G+C content suggests that the viruses in this group infect a wide range of host species.

FIG S8

Part of the proteomic tree with Pelagibacter podovirus HTVC010P (black), OBV_N00107 (red), 83 TOV-EVGs (orange), and 25 other EVGs (blue). Branch lengths are logarithmically scaled as described for Fig. 3A. Genus-level gOTUs are indicated. Numbers in parentheses represent the number of genomes in each gOTU. Download FIG S8, PDF file, 0.3 MB.

Environmental viral genomes as a reference during marine virome analyses. We mapped protein sequences and raw reads from independently generated photic virome data (i.e., the Pacific Ocean viromes [POV]) (86) on the RVGs and EVGs. The RVG set recruited 4.70% of the POV proteins, while the EVG/RVG union set recruited 22.6% of the proteins (i.e., a 4.8-fold increase; Fig. 8A). At the nucleotide sequence level, the RVG set recruited 1.02% of the POV reads, while the EVG/RVG union set recruited 4.20% of the reads (i.e., a 4.1-fold increase; Fig. 8B). Thus, the EVGs serve as an effective additional reference viral genome data set for exploring viromes from photic oceans.

FIG 8

Recruitment of photic POV sequences to RVGs (blue) and to a pool of EVGs and RVGs (red). Mappings were performed with tBLASTn (proteins) and BLASTn (reads). In both mappings, the initial filtering of hits involved an E value of <1e−3, and an additional filtering was based on ≥60% identity and ≥80% alignment length of the query sequence. (A) Recruitment of proteins. (B) Recruitment of reads.

Conclusion. From the assemblies of 52 marine viromes, we obtained 1,567 circular complete genomes that are most likely of prokaryotic dsDNA viral origin. The acquisition of the complete genome sequences helped classify the viral lineages and provided important clues about their hosts and metabolisms. The genome-based clustering of the metagenome-derived viral genomes together with previously reported ones suggests that 600 of the 617 gOTUs represent new genera of prokaryotic viruses. Additionally, they contain greater genome richness than the reference genomes of cultured prokaryotic viruses that have so far been sequenced. Our analyses also predicted the relationships among the EVGs and the major groups of marine prokaryotes, for which no viruses have been isolated (i.e., MGII and SAR86). Given the lack of isolation of viruses, the physiological features of the sequenced EVGs are unclear. However, some of the newly identified EVGs carried functionally related AMGs, such as those encoding proteins related to Fe-S clusters (16 genes) and to carbon assimilation/cell wall biogenesis enzymes (8 genes). These AMGs may function to coordinate the supply/recycling of Fe-S clusters and to enhance host adaptation during the lysogenic cycle. Previous studies also revealed that cyanophages carry multiple functionally linked photosynthesis and lipopolysaccharide synthesis genes for their efficient replication (79, 87, 88). Therefore, viral survival strategies in marine viruses involving many functionally related AMGs appear to target not only the biosynthesis of molecular building blocks (e.g., nucleotides) but also diverse metabolic and cellular processes.

MATERIALS AND METHODS

Sample preparation and sequencing. Seawater samples (9 × 4 liters) were collected at a 5-m depth at the entrance of Osaka Bay (34°19′28″N, 135°7′15″E), Japan, every 3 h for 24 h on 25 and 26 August 2014. Seawater was filtered through a 142-mm-diameter (3.0-μm-pore-size) polycarbonate membrane (Millipore, Billerica, MA) and then through a 142-mm-diameter (0.22-μm-pore-size) Durapore polyvinylidene fluoride membrane (Millipore). The filtrates were stored at 4°C prior to treatments. The viruses in the filtrate were concentrated by FeCl3 precipitation (89) and purified using DNase and a CsCl density centrifugation step (90). The DNA was then extracted as previously described (91). Libraries were prepared using a Nextera XT DNA sample preparation kit (Illumina, San Diego, CA) according to the manufacturer’s protocol, except that we used 0.25 ng viral DNA. Samples were sequenced with a MiSeq sequencing system and MiSeq V2 (2 × 150 bp; five samples) or V3 (2 × 300 bp; four samples) reagent kits (Illumina, San Diego, CA).

Genome assembly and error estimation. Nine OBVs were individually assembled using the following four assemblers: SPAdes, metaSPAdes (http://bioinf.spbau.ru/spades), IDBA-UD, and Ray Meta. SPAdes 3.1.1 was used with default k-mer lengths as well as the accompanying BayesHammer (92) and MismatchCorrector. The metaSPAdes 3.7.0 program was used with default k-mer lengths and BayesHammer. The IDBA-UD 1.1.1 program was used with fixed multiple k-mer lengths (24 to 124, increased by 10 for 2 × 300 bp reads; 24 to 84, increased by 10 for 2 × 150 bp reads) and the option of a preread correction with a minimum contig length of 300 bp. Ray Meta 2.3.1 was used with a fixed k-mer length (k = 41). Additionally, we used scaffolds of these assemblies, which we called contigs for simplicity. The REAPR 1.0.18 program was used to assess the quality of the assemblies. This program reports four types of errors categorized as short insertion/deletion errors (i.e., types 1 and 3) or scaffolding errors (i.e., types 2 and 4).

Nine sets of OBV reads were also coassembled by SPAdes with the same settings as described above. We determined that a contig was circular (i.e., complete) if its 5′ and 3′ terminal regions were nearly identical (i.e., >94% identity over ≥50 bp). We identified 40 circular contigs (>10 kb) satisfying this condition. A coassembly involving the merged paired-end reads generated by FLASH was also prepared (93). We included the merged and remaining unmerged reads for the assembly. With this second coassembly, we detected 34 circular contigs (>10 kb), of which 6 were not detected in the first coassembly. We incorporated these 6 contigs in our data set, and we ultimately obtained 934 OBV contigs (>10 kb), including 46 circular ones. Forty-three TOV samples were similarly analyzed, except that the sequence assemblies were prepared sample by sample and only with raw reads (i.e., not from merged paired-end reads). Code for circular contig detection is downloadable at ftp://ftp.genome.jp/pub/db/community/EVG2017.
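The circularity criterion can be illustrated with a minimal sketch. This is not the authors' downloadable code: it assumes an ungapped comparison of the first and last bases of a contig, and the function names and the `max_len` search bound are hypothetical choices made for illustration.

```python
def terminal_overlap(seq, min_len=50, min_identity=0.94, max_len=1000):
    """Length of the longest ungapped start/end overlap, or 0 if none qualifies.

    Assumption: the 5' and 3' terminal regions are compared position by
    position without gaps; a gapped aligner may behave differently.
    """
    seq = seq.upper()
    upper = min(max_len, len(seq) // 2)
    for length in range(upper, min_len - 1, -1):
        head = seq[:length]    # 5' terminal region
        tail = seq[-length:]   # 3' terminal region
        matches = sum(a == b for a, b in zip(head, tail))
        if matches / length > min_identity:
            return length
    return 0


def is_circular(seq, **kwargs):
    """A contig is called circular if a qualifying terminal overlap exists."""
    return terminal_overlap(seq, **kwargs) > 0
```

In such a sketch, the redundant terminal copy would normally be trimmed before downstream analyses; the text does not state how the authors handled this step.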

Gene prediction and annotation. Gene predictions were completed using MetaGeneMark (94). Homology searches were conducted using BLASTp against the NCBI-nr database (E value, <1e−5), RPS-BLAST against the COG database (as of April 2015; E value, <1e−4), and HMMER against the Pfam (as of May 2015; E value, <1e−4) and TIGRFAMs (release 15; E value, <1e−4) databases. For predictions of tailed-virus hallmark genes and integrase genes, we used HHsearch (E value, <1e−9) against the Pfam database after constructing query hidden Markov models (HMMs) using jackhmmer (part of the HMMER package) with default settings (95, 96). We also used PSI-BLAST to identify homologs of specific genes.

Discrimination of viral and prokaryotic contigs and PCR assays. We used a newly developed method (see Text S1 in the supplemental material) and VirSorter (97) to distinguish between viral and prokaryotic contigs. We discarded all contigs predicted to be of prokaryotic origin by either or both methods. Finally, 879 of the 934 OBV contigs (including 46 circular ones) and 1,554 of the 1,618 TOV circular contigs were considered to originate from viruses.

TEXT S1

New method for discriminating between viral and prokaryotic contigs. Download TEXT S1, DOCX file, 0.05 MB.

We conducted PCR assays for 21 weakly supported regions in four randomly selected OBV circular contigs (i.e., OBV_N00005, OBV_N00020, OBV_N00021, and OBV_N00023; see Fig. S2A in the supplemental material). Primer sequences are provided in Table S1H in the supplemental material.

Genomic colinearity. Colinearity was evaluated on the basis of the percentage of OBV-EVG genes that had orthologous relationships with the most closely related genome (i.e., Bg in Fig. S2B). If ≥60% of the OBV-EVG genes had orthologs in the closest relative, we considered the OBV-EVG to exhibit nearly complete genomic colinearity. Eighteen OBV-EVGs (39%) were observed to exhibit complete or nearly complete colinearity with other viral genomes. Additionally, we identified colinear genomic regions using MCScanX (98) and calculated the percentage of OBV-EVG genes in these regions (i.e., Cg in Fig. S2B).

Quality control of reads. We used raw reads for the above assemblies, but the reads underwent a quality-control screening before being back-mapped to contigs with the following procedure: (i) duplicated reads were removed using FastUniq (99); (ii) paired-end reads were merged with FLASH, and the merged and unmerged reads were kept; (iii) reads were removed if the percentage of high-quality nucleotide positions (i.e., quality score >30) was <80%; and (iv) reads were removed if the sum of the lengths of ambiguous nucleotide positions and low-complexity regions detected by DUST was >40% of the total length. If one of the paired-end reads was removed in step iii or step iv, the mate was retained as a single read.
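Steps iii and iv of this screening can be sketched as follows. This is an illustrative sketch only: the function names and the read representation (a dict with `seq`, `quals`, and `low_complexity_len` keys) are hypothetical, and the length of the DUST-masked regions is assumed to have been computed beforehand by an external tool.

```python
def passes_quality(phred_scores, min_q=30, min_fraction=0.80):
    """Step iii: keep the read unless fewer than 80% of positions have Q > 30."""
    if not phred_scores:
        return False
    high = sum(q > min_q for q in phred_scores)
    return high / len(phred_scores) >= min_fraction


def passes_complexity(seq, low_complexity_len, max_fraction=0.40):
    """Step iv: remove the read if ambiguous plus low-complexity length is >40%.

    low_complexity_len is the total length of DUST-masked regions
    (assumed to be supplied by the caller).
    """
    if not seq:
        return False
    ambiguous = sum(base not in "ACGT" for base in seq.upper())
    return (ambiguous + low_complexity_len) / len(seq) <= max_fraction


def surviving_reads(pair):
    """Return the reads of a pair that pass both steps.

    If only one read survives, it is returned alone, mirroring the
    'mate was retained as a single read' rule.
    """
    return [
        read for read in pair
        if passes_quality(read["quals"])
        and passes_complexity(read["seq"], read["low_complexity_len"])
    ]
```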

Detection of single nucleotide polymorphisms and calculation of nucleotide diversity. To detect SNPs and assess nucleotide diversity, we mapped quality-controlled reads on contigs using the Bowtie 2 program. To minimize the inclusion of sequencing errors among the mapped nucleotides, we considered only high-quality nucleotides (i.e., quality score, >30). Nucleotide diversity was defined as previously described (100) and was calculated using equation 1 of a published method (101). The SNPs were detected for positions with ≥5× sequence coverage using the following six criteria: (i) at least one read, (ii) at least two reads, (iii) more than 10% coverage, (iv) more than 20% coverage, (v) more than 10% coverage or at least two reads, and (vi) more than 10% coverage and at least two reads. These criteria were applied to the second-most-frequent nucleotide at each position.
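The six SNP criteria can be expressed compactly for a single position. The sketch below is an assumption-laden illustration: it takes the high-quality base calls at one position as a string, interprets "coverage" as the number of such calls, and uses a hypothetical function name.

```python
from collections import Counter


def snp_calls(bases, min_coverage=5):
    """Evaluate criteria i to vi on the second-most-frequent nucleotide.

    bases: high-quality base calls covering one contig position.
    Returns None when coverage is below min_coverage, else a dict that
    maps each criterion label to True/False.
    """
    coverage = len(bases)
    if coverage < min_coverage:
        return None
    counts = Counter(bases).most_common()
    second = counts[1][1] if len(counts) > 1 else 0  # reads with the minor allele
    fraction = second / coverage
    return {
        "i": second >= 1,
        "ii": second >= 2,
        "iii": fraction > 0.10,
        "iv": fraction > 0.20,
        "v": fraction > 0.10 or second >= 2,
        "vi": fraction > 0.10 and second >= 2,
    }
```

For example, a position covered by nine A calls and one C call satisfies only criterion i, because the minor allele is supported by one read and makes up exactly 10% of the coverage.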

Redundancy of obtained environmental viral genomes. To detect redundancies among TOV-EVGs and OBV-EVGs, an all-against-all BLASTn search was conducted. We merged high-scoring segment pairs (HSPs) for each resulting pair, and if the merged HSPs covered ≥80% of the shorter EVG, with ≥95% average identity, the EVGs were considered redundant. Nonredundant EVGs were obtained by single-linkage clustering of these redundant pairs.
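The redundancy test and the single-linkage step can be sketched as below. The helper names are hypothetical, HSP coordinates are assumed to be 1-based inclusive intervals on the shorter genome, and the average identity is assumed to be precomputed; how a representative is chosen from each cluster is not specified in the text and is therefore left out.

```python
def merged_coverage(intervals):
    """Total number of bases covered after merging overlapping HSP intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def is_redundant(shorter_len, hsp_intervals, avg_identity):
    """Merged HSPs cover >=80% of the shorter EVG with >=95% average identity."""
    return (merged_coverage(hsp_intervals) / shorter_len >= 0.80
            and avg_identity >= 95.0)


def single_linkage(genomes, redundant_pairs):
    """Group genomes connected by any chain of redundant pairs (union-find)."""
    parent = {g: g for g in genomes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in redundant_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())
```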

Viral genomes. We first compiled 46 OBV-EVGs, 1,554 TOV-EVGs, and 247 EVGs from three projects, including 192 complete contigs (33), 54 circular consensus genomes (29), and a complete viral genome obtained from samples from single amplified genomes (SAG) (39). The RVGs were retrieved from RefSeq (release 75; March 2016), EBI Genomes Pages (May 2015), and CAMERA. We selected dsDNA viral genomes that were larger than 10 kb. We then removed the genomes of eukaryotic viruses identified using the GenomeNet Virus–Host Database (85). Thirty-six EVGs (i.e., 1 OBV, 32 TOVs, and 3 others) were most similar to eukaryotic viral genomes among RVGs and were removed from the proteomic tree and gOTU analyses, which were used to compare the levels of diversity of the RVGs and EVGs of prokaryotic viruses.

Proteomic tree. We constructed a proteomic tree as previously described (102). Briefly, the all-against-all distance matrix of the EVG/RVG data set was calculated on the basis of the normalized bit score of tBLASTx (SG), and the proteomic tree was built with BIONJ using the distance matrix. The proteomic tree, gene annotations, and genome alignment views are accessible at http://www.genome.ad.jp/viptree/EVG2017.

Genus-level operational taxonomic units. The genus-level threshold value for gOTU clustering was estimated from a subset of the RVGs used in this study (i.e., 345 prokaryotic dsDNA viruses), each of which was assigned to a viral genus (i.e., 82 genera in total). We constructed gOTUs with different SG cutoffs (intervals of 0.01) and evaluated how closely the resulting gOTUs corresponded to the genus-level viral classifications using the adjusted Rand index (103).
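The cutoff scan can be illustrated with a small sketch that computes the adjusted Rand index from two label vectors and picks the cutoff whose clustering best matches the genus assignments. The function names and the `clusterings` input (a mapping from each SG cutoff to the gOTU label of every genome) are hypothetical; how gOTUs are built at a given cutoff is not reproduced here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items."""
    n = len(labels_true)
    if n < 2:
        return 1.0
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


def best_cutoff(genera, clusterings):
    """Return the cutoff whose gOTU labels agree best with the genus labels.

    clusterings: dict mapping an SG cutoff to a list of gOTU labels,
    ordered like `genera` (assumed to be produced elsewhere).
    """
    return max(clusterings, key=lambda c: adjusted_rand_index(genera, clusterings[c]))
```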

Host predictions according to proteomic similarities. We attempted to predict host taxonomic groups for EVGs on the basis of viral genomic similarities measured with SG. We estimated the precision of our prediction method on the basis of RVGs (i.e., 1,285 prokaryotic dsDNA viruses), each of which was linked to a uniquely assigned host taxonomic group according to the Virus-Host Database. Regarding host taxonomic groups, Cyanobacteria (phylum) and Enterobacteriaceae (family) were regarded as individual host taxonomic groups because closely related viruses are known to infect hosts of different genera belonging to these host groups. The remaining viral hosts were grouped at the genus level. For each RVG, the best SG values for the members of the same host group, and for the members outside the host group, were recorded (i.e., 2,570 SG scores in total). A precision curve was generated using sliding SG cutoff values (Fig. S6). When the SG cutoff value was >0.3889 or >0.2937, the viral pairs were predicted to infect hosts in the same group at >95% or >90% precision, respectively.
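One plausible reading of the precision curve is sketched below: each recorded SG score is labeled by whether the best-matching genome belongs to the same host group, and the precision at a cutoff is the fraction of same-group scores among all scores above that cutoff. This interpretation, the function names, and the toy scores are assumptions, not the authors' implementation.

```python
def precision_at(cutoff, scores):
    """Fraction of same-host-group pairs among pairs with SG > cutoff.

    scores: iterable of (sg, same_host_group) tuples.
    Returns None if no score exceeds the cutoff.
    """
    selected = [same for sg, same in scores if sg > cutoff]
    if not selected:
        return None
    return sum(selected) / len(selected)


def min_cutoff_for_precision(scores, target):
    """Smallest observed SG value at which precision exceeds `target`.

    Note: precision need not increase monotonically with the cutoff, so a
    real analysis should inspect the whole curve rather than this value alone.
    """
    for cutoff in sorted({sg for sg, _ in scores}):
        precision = precision_at(cutoff, scores)
        if precision is not None and precision > target:
            return cutoff
    return None


# Toy data (hypothetical values, for illustration only).
toy_scores = [(0.9, True), (0.8, True), (0.5, True),
              (0.4, False), (0.3, True), (0.2, False)]
```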

Photosynthetic gene identification. To detect photosynthetic genes in the EVG/RVG data set, we used PSI-BLAST (E value, 1e−6; inclusion_ethresh, 1e−6; num_iterations, 3) and the query sequences listed in Table S1I.

Phylogenetic trees. Multiple sequences were aligned using the MAFFT program (version 7.245) (104), with the FFT-NS-i mode and a maximum of 1,000 iterations (--retree 2, --maxiterate 1000). Conserved positions in the alignments were selected with the trimAl program (version 1.3) (105). Maximum likelihood trees with 100 bootstrap replicates were calculated with RAxML (version 8.2.4) (106) using the rapid bootstrapping mode, and models were selected by the use of ProteinModelSelection.pl (i.e., LGF for DNA polymerase B and LG for chaperonins, IscU, and ATC).

Recruitment of Pacific Ocean virome sequences. Reads (3.68 M sequences) and proteins (2.78 M sequences) of 16 photic POV samples were downloaded from iMicrobe (http://data.imicrobe.us). These sequences were mapped on EVGs and RVGs using BLASTn (for reads; E value, <1e−3) and tBLASTn (for proteins; E value, <1e−3); hits were retained only if the alignment showed ≥60% identity and covered ≥80% of the query sequence.
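The hit filtering used for recruitment can be sketched as follows. The sketch assumes that hits have been parsed into dicts keyed by standard BLAST tabular field names (`qseqid`, `pident`, `qstart`, `qend`, `evalue`) and that query lengths are supplied separately; the function names are hypothetical.

```python
def recruited_queries(hits, query_lengths, max_evalue=1e-3,
                      min_identity=60.0, min_query_cov=0.80):
    """Return the set of query IDs with at least one hit passing all filters."""
    recruited = set()
    for hit in hits:
        if hit["evalue"] >= max_evalue:
            continue
        if hit["pident"] < min_identity:
            continue
        aligned = abs(hit["qend"] - hit["qstart"]) + 1
        if aligned / query_lengths[hit["qseqid"]] >= min_query_cov:
            recruited.add(hit["qseqid"])
    return recruited


def recruitment_rate(hits, query_lengths, **kwargs):
    """Fraction of all query sequences recruited to the reference set."""
    if not query_lengths:
        return 0.0
    return len(recruited_queries(hits, query_lengths, **kwargs)) / len(query_lengths)
```

Running the same filter against the RVG set alone and against the EVG/RVG union gives the two rates whose ratio is reported as the fold increase (e.g., 22.6% / 4.70% ≈ 4.8 for proteins).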

Accession number(s). Read and assembled sequences obtained from OBV were deposited at the DNA Data Bank of Japan (DDBJ) under accession numbers DRR053207 to DRR053215 and SAMD00045684 to SAMD00045692. The sequence data for the OBV project are accessible under DDBJ BioProject accession number PRJDB4437. Sequences and additional data are available at ftp://ftp.genome.jp/pub/db/community/EVG2017.

ACKNOWLEDGMENTS

We thank the Tara Oceans consortium, people, and sponsors who supported the Tara Oceans expedition (http://www.embl.de/tara-oceans/) for making the data accessible. Computational work was completed at the Supercomputer System, Institute for Chemical Research, Kyoto University.

This work was supported by the Canon Foundation (no. 203143100025), JSPS/KAKENHI (no. 26430184 and 16KT0020), Scientific Research on Innovative Areas from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan (no. 16H06429, 16K21723, and 16H06437), and the Collaborative Research Program of the Institute for Chemical Research, Kyoto University (no. 2016-28). P.H. was supported by the OCEANOMICS “Investissements d’Avenir” program of the French Government (no. ANR-11-BTBR-0008). M.B.S. was supported by Gordon and Betty Moore Foundation grants (no. 3790 and GBMF2631), and S.R. was partially supported by the University of Arizona Technology and Research Initiative Fund through a grant from the Water, Environmental, and Energy Solutions Initiative and the Ecosystem Genomics Institute to M.B.S.

This is contribution number 51 of the Tara Oceans Expedition 2009–2012.

FOOTNOTES

    • Received December 7, 2016.
    • Accepted February 2, 2017.
  • Copyright © 2017 Nishimura et al.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.

REFERENCES

  1. Bergh O, Børsheim KY, Bratbak G, Heldal M. 1989. High abundance of viruses found in aquatic environments. Nature 340:467–468. doi:10.1038/340467a0.
  2. Falkowski PG, Fenchel T, Delong EF. 2008. The microbial engines that drive Earth’s biogeochemical cycles. Science 320:1034–1039. doi:10.1126/science.1153213.
  3. Proctor LM, Fuhrman JA. 1990. Viral mortality of marine bacteria and cyanobacteria. Nature 343:60–62. doi:10.1038/343060a0.
  4. Mann NH, Cook A, Millard A, Bailey S, Clokie M. 2003. Marine ecosystems: bacterial photosynthesis genes in a virus. Nature 424:741. doi:10.1038/424741a.
  5. Brüssow H, Canchaya C, Hardt WD. 2004. Phages and the evolution of bacterial pathogens: from genomic rearrangements to lysogenic conversion. Microbiol Mol Biol Rev 68:560–602. doi:10.1128/MMBR.68.3.560-602.2004.
  6. Sullivan MB, Lindell D, Lee JA, Thompson LR, Bielawski JP, Chisholm SW. 2006. Prevalence and evolution of core photosystem II genes in marine cyanobacterial viruses and their hosts. PLoS Biol 4:e234. doi:10.1371/journal.pbio.0040234.
  7. Hatfull GF. 2008. Bacteriophage genomics. Curr Opin Microbiol 11:447–453. doi:10.1016/j.mib.2008.09.004.
  8. Rohwer F. 2003. Global phage diversity. Cell 113:141. doi:10.1016/S0092-8674(03)00276-9.
  9. Brum JR, Ignacio-Espinoza JC, Roux S, Doulcier G, Acinas SG, Alberti A, Chaffron S, Cruaud C, de Vargas C, Gasol JM, Gorsky G, Gregory AC, Guidi L, Hingamp P, Iudicone D, Not F, Ogata H, Pesant S, Poulos BT, Schwenck SM, Speich S, Dimier C, Kandels-Lewis S, Picheral M, Searson S, Tara Oceans Coordinators, Bork P, Bowler C, Sunagawa S, Wincker P, Karsenti E, Sullivan MB. 2015. Ocean plankton. Patterns and ecological drivers of ocean viral communities. Science 348:1261498. doi:10.1126/science.1261498.
  10. Zhao Y, Temperton B, Thrash JC, Schwalbach MS, Vergin KL, Landry ZC, Ellisman M, Deerinck T, Sullivan MB, Giovannoni SJ. 2013. Abundant SAR11 viruses in the ocean. Nature 494:357–360. doi:10.1038/nature11921.
  11. Kang I, Oh HM, Kang D, Cho JC. 2013. Genome of a SAR116 bacteriophage shows the prevalence of this phage type in the oceans. Proc Natl Acad Sci U S A 110:12343–12348. doi:10.1073/pnas.1219930110.
  12. Roux S, Hallam SJ, Woyke T, Sullivan MB. 2015. Viral dark matter and virus-host interactions resolved from publicly available microbial genomes. Elife 4:e08490. doi:10.7554/eLife.08490.
  13. Hingamp P, Grimsley N, Acinas SG, Clerissi C, Subirana L, Poulain J, Ferrera I, Sarmento H, Villar E, Lima-Mendez G, Faust K, Sunagawa S, Claverie JM, Moreau H, Desdevises Y, Bork P, Raes J, de Vargas C, Karsenti E, Kandels-Lewis S, Jaillon O, Not F, Pesant S, Wincker P, Ogata H. 2013. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes. ISME J 7:1678–1695. doi:10.1038/ismej.2013.59.
  14. Hurwitz BL, Hallam SJ, Sullivan MB. 2013. Metabolic reprogramming by viruses in the sunlit and dark ocean. Genome Biol 14:R123. doi:10.1186/gb-2013-14-11-r123.
  15. Hurwitz BL, Brum JR, Sullivan MB. 2015. Depth-stratified functional and taxonomic niche specialization in the ‘core’ and “flexible” Pacific Ocean virome. ISME J 9:472–484. doi:10.1038/ismej.2014.143.
  16. Roux S, Brum JR, Dutilh BE, Sunagawa S, Duhaime MB, Loy A, Poulos BT, Solonenko N, Lara E, Poulain J, Pesant S, Kandels-Lewis S, Dimier C, Picheral M, Searson S, Cruaud C, Alberti A, Duarte CM, Gasol JM, Vaqué D, Tara Oceans Coordinators, Bork P, Acinas SG, Wincker P, Sullivan MB. 2016. Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses. Nature 537:689–693. doi:10.1038/nature19366.
  17. Culley AI, Lang AS, Suttle CA. 2006. Metagenomic analysis of coastal RNA virus communities. Science 312:1795–1798. doi:10.1126/science.1127404.
  18. Culley AI, Mueller JA, Belcaid M, Wood-Charlson EM, Poisson G, Steward GF. 2014. The characterization of RNA viruses in tropical seawater using targeted PCR and metagenomics. mBio 5:e01210-14. doi:10.1128/mBio.01210-14.
  19. López-Bueno A, Tamames J, Velázquez D, Moya A, Quesada A, Alcamí A. 2009. High diversity of the viral community from an Antarctic lake. Science 326:858–861. doi:10.1126/science.1179287.
  20. Rosario K, Duffy S, Breitbart M. 2009. Diverse circovirus-like genome architectures revealed by environmental metagenomics. J Gen Virol 90:2418–2424. doi:10.1099/vir.0.012955-0.
  21. Tucker KP, Parsons R, Symonds EM, Breitbart M. 2011. Diversity and distribution of single-stranded DNA phages in the North Atlantic Ocean. ISME J 5:822–830. doi:10.1038/ismej.2010.188.
  22. Diemer GS, Stedman KM. 2012. A novel virus genome discovered in an extreme environment suggests recombination between unrelated groups of RNA and DNA viruses. Biol Direct 7:13. doi:10.1186/1745-6150-7-13.
  23. Roux S, Krupovic M, Poulet A, Debroas D, Enault F. 2012. Evolution and diversity of the Microviridae viral family through a collection of 81 new complete genomes assembled from virome reads. PLoS One 7:e40418. doi:10.1371/journal.pone.0040418.
  24. Labonté JM, Suttle CA. 2013. Previously unknown and highly divergent ssDNA viruses populate the oceans. ISME J 7:2169–2177. doi:10.1038/ismej.2013.110.
  25. McDaniel LD, Rosario K, Breitbart M, Paul JH. 2014. Comparative metagenomics: natural populations of induced prophages demonstrate highly unique, lower diversity viral sequences. Environ Microbiol 16:570–585. doi:10.1111/1462-2920.12184.
  26. Zawar-Reza P, Argüello-Astorga GR, Kraberger S, Julian L, Stainton D, Broady PA, Varsani A. 2014. Diverse small circular single-stranded DNA viruses identified in a freshwater pond on the McMurdo Ice Shelf (Antarctica). Infect Genet Evol 26:132–138. doi:10.1016/j.meegid.2014.05.018.
  27. Emerson JB, Thomas BC, Andrade K, Allen EE, Heidelberg KB, Banfield JF. 2012. Dynamic viral populations in hypersaline systems as revealed by metagenomic assembly. Appl Environ Microbiol 78:6309–6320. doi:10.1128/AEM.01212-12.
  28. Anantharaman K, Duhaime MB, Breier JA, Wendt KA, Toner BM, Dick GJ. 2014. Sulfur oxidation genes in diverse deep-sea viruses. Science 344:757–760. doi:10.1126/science.1252229.
  29. Bellas CM, Anesio AM, Barker G. 2015. Analysis of virus genomes from glacial environments reveals novel virus groups with unusual host interactions. Front Microbiol 6:656. doi:10.3389/fmicb.2015.00656.
  30. Duhaime MB, Sullivan MB. 2012. Ocean viruses: rigorously evaluating the metagenomic sample-to-sequence pipeline. Virology 434:181–186. doi:10.1016/j.virol.2012.09.036.
  31. Santos F, Meyerdierks A, Peña A, Rosselló-Mora R, Amann R, Antón J. 2007. Metagenomic approach to the study of halophages: the environmental halophage 1. Environ Microbiol 9:1711–1723. doi:10.1111/j.1462-2920.2007.01289.x.
  32. Garcia-Heredia I, Martin-Cuadrado AB, Mojica FJ, Santos F, Mira A, Antón J, Rodriguez-Valera F. 2012. Reconstructing viral genomes from the environment using fosmid clones: the case of haloviruses. PLoS One 7:e33802. doi:10.1371/journal.pone.0033802.
  33. Mizuno CM, Rodriguez-Valera F, Kimes NE, Ghai R. 2013. Expanding the marine virosphere using metagenomics. PLoS Genet 9:e1003987. doi:10.1371/journal.pgen.1003987.
  34. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi:10.1089/cmb.2012.0021.
  35. Peng Y, Leung HCM, Yiu SM, Chin FYL. 2012. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28:1420–1428. doi:10.1093/bioinformatics/bts174.
  36. Boisvert S, Raymond F, Godzaridis E, Laviolette F, Corbeil J. 2012. Ray Meta: scalable de novo metagenome assembly and profiling. Genome Biol 13:R122. doi:10.1186/gb-2012-13-12-r122.
  37. Hunt M, Kikuchi T, Sanders M, Newbold C, Berriman M, Otto TD. 2013. REAPR: a universal tool for genome assembly evaluation. Genome Biol 14:R47. doi:10.1186/gb-2013-14-5-r47.
  38. Bao Y, Chetvernin V, Tatusova T. 2014. Improvements to pairwise sequence comparison (PASC): a genome-based web tool for virus classification. Arch Virol 159:3293–3304. doi:10.1007/s00705-014-2197-x.
  39. Labonté JM, Swan BK, Poulos B, Luo H, Koren S, Hallam SJ, Sullivan MB, Woyke T, Wommack KE, Stepanauskas R. 2015. Single-cell genomics-based analysis of virus-host interactions in marine surface bacterioplankton. ISME J 9:2386–2399. doi:10.1038/ismej.2015.48.
  40. Rohwer F, Edwards R. 2002. The phage proteomic tree: a genome-based taxonomy for phage. J Bacteriol 184:4529–4535. doi:10.1128/JB.184.16.4529-4535.2002.
  41. Paez-Espino D, Eloe-Fadrosh EA, Pavlopoulos GA, Thomas AD, Huntemann M, Mikhailova N, Rubin E, Ivanova NN, Kyrpides NC. 2016. Uncovering Earth’s virome. Nature 536:425–430. doi:10.1038/nature19094.
  42. Ackermann HW, Prangishvili D. 2012. Prokaryote viruses studied by electron microscopy. Arch Virol 157:1843–1849. doi:10.1007/s00705-012-1383-y.
  43. Krupovic M, Spang A, Gribaldo S, Forterre P, Schleper C. 2011. A thaumarchaeal provirus testifies for an ancient association of tailed viruses with archaea. Biochem Soc Trans 39:82–88. doi:10.1042/BST0390082.
  44. Hildenbrand ZL, Bernal RA. 2012. Chaperonin-mediated folding of viral proteins. Adv Exp Med Biol 726:307–324. doi:10.1007/978-1-4614-0980-9_13.
  45. Iverson V, Morris RM, Frazar CD, Berthiaume CT, Morales RL, Armbrust EV. 2012. Untangling genomes from metagenomes: revealing an uncultured class of marine Euryarchaeota. Science 335:587–590. doi:10.1126/science.1212665.
  46. Martin-Cuadrado AB, Garcia-Heredia I, Moltó AG, López-Úbeda R, Kimes N, López-García P, Moreira D, Rodriguez-Valera F. 2015. A new class of marine Euryarchaeota group II from the Mediterranean deep chlorophyll maximum. ISME J 9:1619–1634. doi:10.1038/ismej.2014.249.
  47. Fuhrman JA, Davis AA. 1997. Widespread Archaea and novel Bacteria from the deep sea as shown by 16S rRNA gene sequences. Mar Ecol Prog Ser 150:275–285. doi:10.3354/meps150275.
  48. Könneke M, Bernhard AE, de la Torre JR, Walker CB, Waterbury JB, Stahl DA. 2005. Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature 437:543–546. doi:10.1038/nature03911.
  49. Massana R, DeLong EF, Pedrós-Alió C. 2000. A few cosmopolitan phylotypes dominate planktonic archaeal assemblages in widely different oceanic provinces. Appl Environ Microbiol 66:1777–1787. doi:10.1128/AEM.66.5.1777-1787.2000.
  50. DeLong EF, Preston CM, Mincer T, Rich V, Hallam SJ, Frigaard NU, Martinez A, Sullivan MB, Edwards R, Brito BR, Chisholm SW, Karl DM. 2006. Community genomics among stratified microbial assemblages in the ocean’s interior. Science 311:496–503. doi:10.1126/science.1120250.
  51. Needham DM, Fuhrman JA. 2016. Pronounced daily succession of phytoplankton, archaea and bacteria following a spring bloom. Nat Microbiol 1:16005. doi:10.1038/nmicrobiol.2016.5.
  52. Barras F, Loiseau L, Py B. 2005. How Escherichia coli and Saccharomyces cerevisiae build Fe/S proteins. Adv Microb Physiol 50:41–101. doi:10.1016/S0065-2911(05)50002-X.
  53. Sharon I, Battchikova N, Aro EM, Giglione C, Meinnel T, Glaser F, Pinter RY, Breitbart M, Rohwer F, Béjà O. 2011. Comparative metagenomics of microbial traits within oceanic viral communities. ISME J 5:1178–1190. doi:10.1038/ismej.2011.2.
  54. Vinella D, Brochier-Armanet C, Loiseau L, Talla E, Barras F. 2009. Iron-sulfur (Fe/S) protein biogenesis: phylogenomic and genetic studies of A-type carriers. PLoS Genet 5:e1000497. doi:10.1371/journal.pgen.1000497.
  55. Lill R, Dutkiewicz R, Elsässer HP, Hausmann A, Netz DJ, Pierik AJ, Stehling O, Urzica E, Mühlenhoff U. 2006. Mechanisms of iron-sulfur protein maturation in mitochondria, cytosol and nucleus of eukaryotes. Biochim Biophys Acta 1763:652–667. doi:10.1016/j.bbamcr.2006.05.011.
  56. Shepard EM, Boyd ES, Broderick JB, Peters JW. 2011. Biosynthesis of complex iron-sulfur enzymes. Curr Opin Chem Biol 15:319–327. doi:10.1016/j.cbpa.2011.02.012.
  57. Dupont CL, Rusch DB, Yooseph S, Lombardo MJ, Richter RA, Valas R, Novotny M, Yee-Greenbaum J, Selengut JD, Haft DH, Halpern AL, Lasken RS, Nealson K, Friedman R, Venter JC. 2012. Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage. ISME J 6:1186–1199. doi:10.1038/ismej.2011.189.
  58. Grell TA, Goldman PJ, Drennan CL. 2015. SPASM and twitch domains in S-adenosylmethionine (SAM) radical enzymes. J Biol Chem 290:3964–3971. doi:10.1074/jbc.R114.581249.
  59. White MF, Dillingham MS. 2012. Iron-sulphur clusters in nucleic acid processing enzymes. Curr Opin Struct Biol 22:94–100. doi:10.1016/j.sbi.2011.11.004.
  60. Hooton SP, Connerton IF. 2014. Campylobacter jejuni acquire new host-derived CRISPR spacers when in association with bacteriophages harboring a CRISPR-like Cas4 protein. Front Microbiol 5:744. doi:10.3389/fmicb.2014.00744.
  61. Roche B, Aussel L, Ezraty B, Mandin P, Py B, Barras F. 2013. Iron/sulfur proteins biogenesis in prokaryotes: formation, regulation and diversity. Biochim Biophys Acta 1827:455–469. doi:10.1016/j.bbabio.2012.12.010.
  62. Fernández C, Ferrández A, Miñambres B, Díaz E, García JL. 2006. Genetic characterization of the phenylacetyl-coenzyme A oxygenase from the aerobic phenylacetic acid degradation pathway of Escherichia coli. Appl Environ Microbiol 72:7422–7426. doi:10.1128/AEM.01550-06.
  63. Chénard C, Chan AM, Vincent WF, Suttle CA. 2015. Polar freshwater cyanophage S-EIV1 represents a new widespread evolutionary lineage of phages. ISME J 9:2046–2058. doi:10.1038/ismej.2015.24.
  64. Hahnke RL, Bennke CM, Fuchs BM, Mann AJ, Rhiel E, Teeling H, Amann R, Harder J. 2015. Dilution cultivation of marine heterotrophic bacteria abundant after a spring phytoplankton bloom in the North Sea. Environ Microbiol 17:3515–3526. doi:10.1111/1462-2920.12479.
  65. Teeling H, Fuchs BM, Becher D, Klockow C, Gardebrecht A, Bennke CM, Kassabgy M, Huang S, Mann AJ, Waldmann J, Weber M, Klindworth A, Otto A, Lange J, Bernhardt J, Reinsch C, Hecker M, Peplies J, Bockelmann FD, Callies U, Gerdts G, Wichels A, Wiltshire KH, Glöckner FO, Schweder T, Amann R. 2012. Substrate-controlled succession of marine bacterioplankton populations induced by a phytoplankton bloom. Science 336:608–611. doi:10.1126/science.1218344.
  66. Borriss M, Lombardot T, Glöckner FO, Becher D, Albrecht D, Schweder T. 2007. Genome and proteome characterization of the psychrophilic Flavobacterium bacteriophage 11b. Extremophiles 11:95–104. doi:10.1007/s00792-006-0014-5.
  67. Kang I, Kang D, Cho JC. 2012. Complete genome sequence of Croceibacter bacteriophage P2559S. J Virol 86:8912–8913. doi:10.1128/JVI.01396-12.
  68. Kang I, Jang H, Cho JC. 2012. Complete genome sequences of two Persicivirga bacteriophages, P12024S and P12024L. J Virol 86:8907–8908. doi:10.1128/JVI.01327-12.
  69. Holmfeldt K, Solonenko N, Shah M, Corrier K, Riemann L, Verberkmoes NC, Sullivan MB. 2013. Twelve previously unknown phage genera are ubiquitous in global oceans. Proc Natl Acad Sci U S A 110:12798–12803. doi:10.1073/pnas.1305956110.
  70. Senčilo A, Luhtanen AM, Saarijärvi M, Bamford DH, Roine E. 2015. Cold-active bacteriophages from the Baltic Sea ice have diverse genomes and virus-host interactions. Environ Microbiol 17:3628–3641. doi:10.1111/1462-2920.12611.
  71. Kang I, Jang H, Cho JC. 2015. Complete genome sequences of bacteriophages P12002L and P12002S, two lytic phages that infect a marine Polaribacter strain. Stand Genomic Sci 10:82. doi:10.1186/s40793-015-0076-z.
  72. Pinhassi J, Bowman JP, Nedashkovskaya OI, Lekunberri I, Gomez-Consarnau L, Pedrós-Alió C. 2006. Leeuwenhoekiella blandensis sp. nov., a genome-sequenced marine member of the family Flavobacteriaceae. Int J Syst Evol Microbiol 56:1489–1493. doi:10.1099/ijs.0.64232-0.
  73. Gómez-Consarnau L, González JM, Coll-Lladó M, Gourdon P, Pascher T, Neutze R, Pedrós-Alió C, Pinhassi J. 2007. Light stimulates growth of proteorhodopsin-containing marine Flavobacteria. Nature 445:210–213. doi:10.1038/nature05381.
  74. Cozzone AJ. 1998. Regulation of acetate metabolism by protein phosphorylation in enteric bacteria. Annu Rev Microbiol 52:127–164. doi:10.1146/annurev.micro.52.1.127.
  75. Dunn MF, Ramírez-Trujillo JA, Hernández-Lucas I. 2009. Major roles of isocitrate lyase and malate synthase in bacterial and fungal pathogenesis. Microbiology 155:3166–3175. doi:10.1099/mic.0.030858-0.
  76. Vimr ER, Steenbergen SM. 2009. Early molecular-recognition events in the synthesis and export of group 2 capsular polysaccharides. Microbiology 155:9–15. doi:10.1099/mic.0.023564-0.
  77. Smyth KM, Marchant A. 2013. Conservation of the 2-keto-3-deoxymanno-octulosonic acid (Kdo) biosynthesis pathway between plants and bacteria. Carbohydr Res 380:70–75. doi:10.1016/j.carres.2013.07.006.
  78. Williamson SJ, McLaughlin MR, Paul JH. 2001. Interaction of the PhiHSIC virus with its host: lysogeny or pseudolysogeny? Appl Environ Microbiol 67:1682–1688. doi:10.1128/AEM.67.4.1682-1688.2001.
  79. Sullivan MB, Coleman ML, Weigele P, Rohwer F, Chisholm SW. 2005. Three Prochlorococcus cyanophage genomes: signature features and ecological interpretations. PLoS Biol 3:e144. doi:10.1371/journal.pbio.0030144.
  80. Dziewit L, Jazurek M, Drewniak L, Baj J, Bartosik D. 2007. The SXT conjugative element and linear prophage N15 encode toxin-antitoxin-stabilizing systems homologous to the tad-ata module of the Paracoccus aminophilus plasmid pAMI2. J Bacteriol 189:1983–1997. doi:10.1128/JB.01610-06.
  81. Thomsen LE, Chadfield MS, Bispham J, Wallis TS, Olsen JE, Ingmer H. 2003. Reduced amounts of LPS affect both stress tolerance and virulence of Salmonella enterica serovar Dublin. FEMS Microbiol Lett 228:225–231. doi:10.1016/S0378-1097(03)00762-6.
  82. Oh HM, Kwon KK, Kang I, Kang SG, Lee JH, Kim SJ, Cho JC. 2010. Complete genome sequence of “Candidatus Puniceispirillum marinum” IMCC1322, a representative of the SAR116 clade in the Alphaproteobacteria. J Bacteriol 192:3240–3241. doi:10.1128/JB.00347-10.
  83. Grote J, Thrash JC, Huggett MJ, Landry ZC, Carini P, Giovannoni SJ, Rappé MS. 2012. Streamlining and core genome conservation among highly divergent members of the SAR11 clade. mBio 3:e00252-12. doi:10.1128/mBio.00252-12.
  84. Cardinale DJ, Duffy S. 2011. Single-stranded genomic architecture constrains optimal codon usage. Bacteriophage 1:219–224. doi:10.4161/bact.1.4.18496.
  85. Mihara T, Nishimura Y, Shimizu Y, Nishiyama H, Yoshikawa G, Uehara H, Hingamp P, Goto S, Ogata H. 2016. Linking virus genomes with host taxonomy. Viruses 8:66. doi:10.3390/v8030066.
  86. Hurwitz BL, Sullivan MB. 2013. The Pacific Ocean virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology. PLoS One 8:e57355. doi:10.1371/journal.pone.0057355.
  87. Breitbart M, Thompson LR, Suttle CA, Sullivan MB. 2007. Exploring the vast diversity of marine viruses. Oceanography 20:135–139. doi:10.5670/oceanog.2007.58.
  88. Thompson LR, Zeng Q, Kelly L, Huang KH, Singer AU, Stubbe J, Chisholm SW. 2011. Phage auxiliary metabolic genes and the redirection of cyanobacterial host carbon metabolism. Proc Natl Acad Sci U S A 108:E757–E764. doi:10.1073/pnas.1102164108.
  89. John SG, Mendez CB, Deng L, Poulos B, Kauffman AK, Kern S, Brum J, Polz MF, Boyle EA, Sullivan MB. 2011. A simple and efficient method for concentration of ocean viruses by chemical flocculation. Environ Microbiol Rep 3:195–202. doi:10.1111/j.1758-2229.2010.00208.x.
  90. Hurwitz BL, Deng L, Poulos BT, Sullivan MB. 2013. Evaluation of methods to concentrate and purify ocean virus communities through comparative, replicated metagenomics. Environ Microbiol 15:1428–1440. doi:10.1111/j.1462-2920.2012.02836.x.
  91. Kimura S, Yoshida T, Hosoda N, Honda T, Kuno S, Kamiji R, Hashimoto R, Sako Y. 2012. Diurnal infection patterns and impact of Microcystis cyanophages in a Japanese pond. Appl Environ Microbiol 78:5805–5811. doi:10.1128/AEM.00571-12.
  92. Nikolenko SI, Korobeynikov AI, Alekseyev MA. 2013. BayesHammer: Bayesian clustering for error correction in single-cell sequencing. BMC Genomics 14(Suppl 1):S7. doi:10.1186/1471-2164-14-S1-S7.
  93. Magoč T, Salzberg SL. 2011. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27:2957–2963. doi:10.1093/bioinformatics/btr507.
  94. Zhu W, Lomsadze A, Borodovsky M. 2010. Ab initio gene identification in metagenomic sequences. Nucleic Acids Res 38:e132. doi:10.1093/nar/gkq275.
  95. Söding J. 2005. Protein homology detection by HMM-HMM comparison. Bioinformatics 21:951–960. doi:10.1093/bioinformatics/bti125.
  96. Eddy SR. 2011. Accelerated profile HMM searches. PLoS Comput Biol 7:e1002195. doi:10.1371/journal.pcbi.1002195.
  97. Roux S, Enault F, Hurwitz BL, Sullivan MB. 2015. VirSorter: mining viral signal from microbial genomic data. PeerJ 3:e985. doi:10.7717/peerj.985.
  98. Wang Y, Tang H, Debarry JD, Tan X, Li J, Wang X, Lee TH, Jin H, Marler B, Guo H, Kissinger JC, Paterson AH. 2012. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res 40:e49. doi:10.1093/nar/gkr1293.
  99. Xu H, Luo X, Qian J, Pang X, Song J, Qian G, Chen J, Chen S. 2012. FastUniq: a fast de novo duplicates removal tool for paired short reads. PLoS One 7:e52249. doi:10.1371/journal.pone.0052249.
  100. Nei M, Li WH. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci U S A 76:5269–5273. doi:10.1073/pnas.76.10.5269.
  101. Shao W, Kearney MF, Boltz VF, Spindler JE, Mellors JW, Maldarelli F, Coffin JM. 2014. PAPNC, a novel method to calculate nucleotide diversity from large scale next generation sequencing data. J Virol Methods 203:73–80. doi:10.1016/j.jviromet.2014.03.008.
  102. Bhunchoth A, Blanc-Mathieu R, Mihara T, Nishimura Y, Askora A, Phironrit N, Leksomboon C, Chatchawankanphanich O, Kawasaki T, Nakano M, Fujie M, Ogata H, Yamada T. 2016. Two Asian jumbo phages, ϕRSL2 and ϕRSF1, infect Ralstonia solanacearum and show common features of ϕKZ-related phages. Virology 494:56–66. doi:10.1016/j.virol.2016.03.028.
  103. Hubert L, Arabie P. 1985. Comparing partitions. J Classif 2:193–218. doi:10.1007/BF01908075.
  104. Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30:772–780. doi:10.1093/molbev/mst010.
  105. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. 2009. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972–1973. doi:10.1093/bioinformatics/btp348.
  106. Stamatakis A, Hoover P, Rougemont J. 2008. A rapid bootstrap algorithm for the RAxML Web servers. Syst Biol 57:758–771. doi:10.1080/10635150802429642.
  107. Chao A, Gotelli NJ, Hsieh TC, Sander EL, Ma KH, Colwell RK, Ellison AM. 2014. Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecol Monogr 84:45–67. doi:10.1890/13-0133.1.
Environmental Viral Genomes Shed New Light on Virus-Host Interactions in the Ocean
Yosuke Nishimura, Hiroyasu Watai, Takashi Honda, Tomoko Mihara, Kimiho Omae, Simon Roux, Romain Blanc-Mathieu, Keigo Yamamoto, Pascal Hingamp, Yoshihiko Sako, Matthew B. Sullivan, Susumu Goto, Hiroyuki Ogata, Takashi Yoshida
mSphere Mar 2017, 2 (2) e00359-16; DOI: 10.1128/mSphere.00359-16


KEYWORDS

genome
marine ecosystem
metabolism
metagenomics
virus


Copyright © 2021 American Society for Microbiology

Online ISSN: 2379-5042