Campylobacter Abundance in Breastfed Infants and Identification of a New Species in the Global Enterics Multicenter Study.

Campylobacter is the primary cause of bacterial diarrhea in the United States and can lead to the development of the postinfectious autoimmune neuropathy known as Guillain-Barré syndrome. Also, drug-resistant campylobacters are becoming a serious concern both locally and abroad. In low- and middle-income countries (LMICs), infection with Campylobacter is linked to high rates of morbidity, growth stunting, and mortality in children, and breastfeeding is important for infant nutrition, development, and protection against infectious diseases. In this study, we examined the relationship between breastfeeding and Campylobacter infection and demonstrate the increased selection for C. jejuni and C. coli strains unable to metabolize fucose. We also identify a new Campylobacter species coinfecting these infants with a high prevalence in five of the seven countries in sub-Saharan Africa and South Asia examined. These findings indicate that more detailed studies are needed in LMICs to understand the Campylobacter infection process in order to devise a strategy for eliminating this pathogenic microbe.

molecular techniques to examine the relative abundances of campylobacters and other resident gut microbes in asymptomatic and symptomatic Campylobacter infections. We also screened the C. jejuni and C. coli isolates for their ability to utilize L-fucose and D-glucose and assessed whether this correlated with diarrheal disease. Through these studies, we identified a new Campylobacter species that is present in stool samples of infants from multiple countries and is as prevalent as C. coli.
The V4 hypervariable region of the 16S rRNA gene amplifies Bifidobacterium effectively (26,27), and a higher proportion of Bifidobacterium in exclusively breastfed than in nonbreastfed infants was observed following sequencing of the V4 hypervariable region (Fig. 1C). The Bifidobacterium genus is the most abundant in exclusively breastfed infants, but a higher abundance of Escherichia-Shigella was still found in cases (50.01%) than in controls (15.22%) with no breastfeeding (not statistically significant due to the smaller sample size of the V4 data set).
Relative abundance of Campylobacter in symptomatic and asymptomatic infants. Although Campylobacter strains were isolated from all the fecal samples included in our study, 16S rRNA sequencing (V6-V8 regions) of the fecal DNA detected Campylobacter in only a fraction of them (cases with exclusive breastfeeding, 61/75; controls with exclusive breastfeeding, 31/65; cases with no breastfeeding, 34/49; controls with no breastfeeding, 10/40). As shown in Fig. 1D and E, the 16S rRNA sequences of Campylobacter had identity values of >0.99 to C. jejuni, C. coli, C. lari, C. hyointestinalis, C. upsaliensis, and other campylobacters. A greater abundance of Campylobacter species was found in cases with breastfeeding (~10%) than in other groups (Fig. 1B). Consistent with results obtained from sequencing the V6-V8 rRNA regions, the results with the V4 region (Fig. 1C) showed that Campylobacter species are also more abundant in cases with breastfeeding (~10%) than in all other groups (not statistically significant because of the smaller V4 sample size).
The relative abundance of Campylobacter in the fecal microbiome was highly variable in each group (Fig. 1D). In cases with breastfeeding, the highest abundance of Campylobacter was found in one child in whom 83% of the fecal microbiome consisted of one strain with a 0.99 identity value to C. hyointestinalis subsp. lawsonii/uncultured Campylobacter by 16S rRNA sequencing. Our results indicate that exclusively breastfed infants with diarrheal symptoms had the highest abundance of campylobacters, significantly higher than that of the other groups, including nonbreastfed cases (Fig. 1E). Among nonbreastfed infants, cases had higher levels of Campylobacter than controls (Fig. 1E). Among controls, exclusively breastfed infants had higher Campylobacter levels than nonbreastfed infants, but the difference was not statistically significant (Fig. 1E). Similar trends were found when the data were examined for different sites and ages (Fig. S3C to H).
Fucose metabolism in Campylobacter strains isolated from GEMS. Despite previous research suggesting that breastfeeding protects against C. jejuni-induced diarrhea (11,28), reports from studies such as GEMS and MAL-ED indicated that a large proportion of breastfed children with diarrhea were Campylobacter positive, and our results unexpectedly showed even higher proportions of this genus in breastfed infants than in nonbreastfed infants. C. jejuni and C. coli were isolated in the original GEMS using selective plates for these two species (25). To assess whether there were any correlations between C. jejuni and C. coli abundances and carbohydrate metabolism, PCR screening for the fucose permease gene fucP was done. We used nested colony PCR for the fucP gene (with the 16S rRNA gene as a control) to screen for strains with the operon for fucose metabolism (Fig. 2A). The subset of isolates that were PCR positive was subsequently tested in growth assays in limited growth medium supplemented with L-fucose to confirm that they possessed a functional pathway (Fig. 2B). The results indicate that in nonbreastfed infants, approximately 50% of the strains were fucose utilizing versus nonutilizing (P = 0.5775 and P = 0.4652 for cases and controls, respectively), but in breastfed infants, significantly fewer fucose-utilizing strains (33% and 22% in cases and controls, respectively; P < 0.05) were found (Fig. 2C).
Fig. 2 legend (partial). [...] and error bars represent 1 standard deviation (fucP+ strains are highlighted in boldface type). White, control; black, L-fucose addition. (C) Pie charts comparing the correlations between fucose metabolism and the isolated Campylobacter strains. Black, L-fucose-utilizing strains; white, non-L-fucose-utilizing strains. P values were determined by chi-squared testing compared to 50%.
Previous studies suggested that there is a negative correlation between the possession of the fucose metabolic locus and the γ-glutamyl transpeptidase (GGT) gene (ggt), which provides the strain the ability to utilize glutamine and glutathione (29). Since glutamine is one of the most abundant free amino acids in breast milk (30), we screened for the ggt gene in all strains and found that only 15 of 137 (11%) screened strains possessed the ggt gene: 12 of 79 (15%) strains lacking fucP and 3 of 58 (5%) fucP+ strains. For the utilization of D-glucose, 25% of strains were randomly selected and tested in growth assays using limited growth medium supplemented with D-glucose, and none of the tested strains showed enhanced growth in glucose.
A putative new Campylobacter species, "Candidatus Campylobacter infans." Multiplex PCR of the lipid A biosynthesis gene lpxA was used to confirm the presence of C. jejuni, C. coli, and C. upsaliensis identified by 16S rRNA sequencing (31). However, no molecular identification method was available for C. hyointestinalis, and only one C. hyointestinalis subsp. lawsonii genome was present in GenBank (accession number CP015576), so we sequenced nine additional C. hyointestinalis subsp. lawsonii strains (32), and together with the genome of C. hyointestinalis subsp. hyointestinalis (GenBank accession number NZ_CP015575.1), we designed a primer specific for the C. hyointestinalis lpxA gene of both subspecies to be used for multiplex PCR. The revised lpxA multiplex PCR was performed on the known C. hyointestinalis strains (Fig. S4A) and applied to the GEMS fecal DNA samples that were putatively designated to belong to this species by 16S rRNA sequencing (Fig. S4B). Only samples G9, G21, and G22 have the expected 285-bp product out of the predicted 26 GEMS fecal samples and were confirmed to contain C. hyointestinalis, suggesting that the other campylobacters identified as C. hyointestinalis subsp. lawsonii/uncultured Campylobacter by 16S rRNA sequencing may actually be a new species. One of the 26 infants (G1) (Fig. S4C) had prolonged diarrhea for 9 days, and 83.6% of the fecal microbiome was comprised of this species, with no other Campylobacter species being detected by 16S rRNA analysis, although one C. jejuni strain was isolated from the fecal sample. Metagenomic sequencing of this fecal DNA sample was performed, and assembly of the Illumina reads yielded 6,058 total contigs and 75 contigs identified as Campylobacter, of which 56 were larger than 5,000 bp. 
Overall, the metagenomic analysis of the fecal sample found the dominant clusters of orthologous groups (COGs) to be associated with functions in metabolism, information storage and processing, and cellular processes and signaling, and in particular, 10.3% of sequence reads were associated with protein metabolism (Fig. S5A and B). While 16S rRNA sequencing found a higher percentage of the fecal microbiome composition to be Campylobacter (83.6%), metagenomic analysis confirmed at the order level (74.3%) and genus level (66.6%) that it was the dominant member of the microbial community (Fig. S5C to E). Furthermore, 33.8% of all the sequence data that were assembled into contigs, particularly those of >10 kb, were identified as Campylobacter (Fig. S5F), with an average of 76× coverage for these contigs, which further demonstrated that it was the major genus present in the fecal microbiome and provided sufficient coverage for genome assembly. BLASTP analysis of the 1,476 putative coding sequences identified in the 75 Campylobacter contigs indicated that the Campylobacter species present in this fecal sample was not one of the current validly described Campylobacter taxa; however, 72.8% (1,074/1,476) of the matches showed strong similarity to proteins from members of the C. fetus group (i.e., C. fetus, C. hyointestinalis, C. lanienae, and C. iguaniorum). Data from core gene/protein phylogenetic analyses (Fig. 3A and B) and average nucleotide identities (ANI) (Fig. 3C) are consistent with these results, suggesting that the Campylobacter strain in this fecal sample (referred to here as "Candidatus Campylobacter infans") represents a novel Campylobacter species (Fig. 3) that is related to, but distinct from, C. fetus, C. hyointestinalis, C. lanienae, and C. iguaniorum.
Fig. 3 legend (partial). Sequences of 20 core genes from various Campylobacteraceae or their cognate proteins were concatenated and aligned using CLUSTALX. Included in the alignment were the same 20 concatenated genes or proteins extracted from the metagenomic sequences obtained from a fecal DNA sample containing 83% Campylobacter sequences ("Candidatus Campylobacter infans"). The dendrograms were constructed using the neighbor-joining algorithm and the Kimura two-parameter (gene set) (A) or Poisson (protein set) (B) distance estimation method. Bootstrap values of >75%, generated from 500 replicates, are shown at the nodes. (C) Average nucleotide identity (ANI) of "Candidatus Campylobacter infans" among known related bacteria, including Campylobacter, Sulfurimonas, Arcobacter, Sulfurospirillum, and Helicobacter species.
Using the metagenomic sequencing data, we designed lpxA primers for "Candidatus Campylobacter infans" for comparison with the lpxA primers used for the multiplex PCR that we performed previously and tested all 26 fecal samples that had 16S rRNA predictions closest to C. hyointestinalis subsp. lawsonii/uncultured Campylobacter. The PCR results indicated that 18 of the 26 samples had the expected PCR product, except samples G8, G9, G19, G20, G24, G25, and G26 (Fig. S4C). For further confirmation, specific primers for the "Candidatus Campylobacter infans" full-length atpA gene were designed and used to amplify 20 of the remaining 25 samples. The phylogenetic tree constructed from the aligned amplicon and reference atpA sequences showed that the campylobacters in these 20 samples are very similar to "Candidatus Campylobacter infans" (Fig. 4A). Combining the lpxA and atpA results, 24 of 26 fecal samples were confirmed to contain "Candidatus Campylobacter infans," and 19 of them were from exclusively breastfed infants, while 5 were from nonbreastfed infants (Fig. 4B), from five out of the seven GEMS countries, including The Gambia, Mali, Mozambique, India, and Pakistan.
As expected, genes encoding pathways for the utilization of amino acids and peptides, including amino acid transporters and peptidases, were identified in the "Candidatus Campylobacter infans" genome. The respiratory enzyme NADH:quinone oxidoreductase (also called complex I) is a major electron source in many bacteria, including C. jejuni (33), but all 14 nuo genes encoding complex I are absent in "Candidatus Campylobacter infans," suggesting that it depends on other electron donors, such as hydrogen and formate, unless single-protein NADH dehydrogenases serve the same purpose. Genes required for utilizing hydrogen and formate as electron donors were identified in the metagenome, including those encoding the [Ni-Fe]-type hydrogenase HydABC, a nickel uptake system, and a formate dehydrogenase that can oxidize formate and release protons and electrons. Virulence genes, including those encoding invasion antigens (ciaB and ciaD), the zonula occludens toxin (zot), an outer membrane fibronectin-binding protein (cadF), heavy metal resistance genes, a multidrug efflux pump (cmeABC), and other drug resistance transporters, are potentially part of the "Candidatus Campylobacter infans" genome. Also, "Candidatus Campylobacter infans" may express an S-layer since we detected the presence of an S-layer secretion system (sapDEF), without homologs of the surface layer proteins (sapA and sapB), but other proteins may be involved.
Prevalence and coinfection of Campylobacter spp. in GEMS. As expected, since we used a subset of GEMS samples that tested positive for C. jejuni or C. coli, C. jejuni is the most prevalent species among symptomatic and asymptomatic Campylobacter infections, colonizing nearly twice as many infants as the total of all other species (107 colonized with C. jejuni and 68 colonized with other species). However, it is interesting to note that several other Campylobacter species are found, including C. coli (24 infants), "Candidatus Campylobacter infans" (24), and C. upsaliensis (18), and the order of abundance is generally consistent when split into the four groups (Fig. 4B). Among the 131 infants with Campylobacter infections, 36 (27.5%) were coinfected with different Campylobacter species, and among these, 8 (6.1%) were coinfected with three Campylobacter species (Table S4).

DISCUSSION
In LMICs, diarrheal diseases are the second leading cause of mortality in children under 5 years of age, and infection with Campylobacter is one of the major causes of gastroenteritis in these children (1,2,7,34). Previous studies suggest that breastfeeding protects infants from Campylobacter-induced diarrhea by reducing the ingestion of contaminated water/foods and lessening contact with unsanitary surroundings through closer interactions with the mother (7,12,28). In addition, breastfeeding increases exposure to fucosylated HMOs, particularly 2′-fucosyllactose in secretor mothers, which is predicted to act as a binding decoy to prevent C. jejuni colonization of the intestine (11). Yet exceedingly high rates of campylobacteriosis are reported among exclusively breastfed infants in LMICs (7). Consistent with these reports, our studies comparing 16S rRNA profiles of fecal DNA isolated from infants in GEMS show a trend of a higher Campylobacter abundance in exclusively breastfed than in nonbreastfed infants, leading us to evaluate the metabolic preferences among these isolates. We previously demonstrated that >50% of sequenced C. jejuni isolates possess the fuc locus for L-fucose catabolism (16), and the possession of this pathway provides a competitive colonization advantage in the piglet model of diarrheal disease (17). Since C. jejuni lacks fucosidases, it grows only on fucose-modified gut oligosaccharides in the presence of other gut microbes, such as Bacteroides vulgatus, suggesting a scavenging lifestyle within the intestine (35). Many commensal bacteria in the gut, including Bifidobacterium and Bacteroides, possess fucosidases and are predominant in breastfed infants (18-20), suggesting that breastfeeding may benefit fuc+ C. jejuni strains, particularly since fecal levels of free L-fucose can reach milligram amounts in breastfed infants (22).
In this study, we indeed found high levels of Bifidobacterium in breastfed infants, as expected. However, we found that although nonbreastfed infants were colonized with equal proportions of L-fucose-metabolizing and -nonmetabolizing C. jejuni and C. coli strains, the proportion of strains unable to metabolize L-fucose increased significantly in exclusively breastfed infants, although we cannot exclude the possibility that some infants were colonized with multiple strains of these species. This was particularly evident when comparing infants with symptomatic infections, who exhibited 6-fold-higher levels of this organism. Thus, HMOs may indeed function as decoys for Campylobacter fuc+ strains, and this may correlate with our recent observation that the fuc locus protein Cj0485 is necessary for Campylobacter L-fucose chemotaxis (16) and new studies in our laboratory comparing chemotaxis and sugar adhesion illustrating differences between strains capable of sugar metabolism and asaccharolytic strains (H. Nothaft, unpublished data). However, these results also suggest that there is a selection for C. jejuni or C. coli strains that are better capable of thriving within the infant gut. In addition to HMOs, breast milk contains high quantities of proteins (especially caseins) and their breakdown products (36), which could promote the growth of campylobacters, particularly since even carbohydrate-utilizing C. jejuni strains prefer amino acids such as serine, glutamate, and aspartate over fucose, indicating the existence of a metabolic hierarchy (35). Thus, a preference for the proteinaceous components in breast milk may make L-fucose or D-glucose utilization irrelevant. Additionally, the quantities of free L-fucose could be smaller than expected if mothers are not secretors or produce smaller quantities of HMOs due to nutrient availability (37,38).
Glutamine is one of the most abundant free amino acids in human milk, and the possession of the γ-glutamyl transpeptidase (GGT) enzyme has been negatively correlated with the presence of the fucose pathway in C. jejuni (29,30), but only 15% of the isolated C. jejuni and C. coli strains lacking fuc examined in this study contain the ggt gene (lower than the 31% expected) (39), so it is likely that other carbon sources from breast milk fuel the metabolism of the isolated Campylobacter strains. In addition to C. jejuni and C. coli, "Candidatus Campylobacter infans" was also detected nearly three times as frequently in exclusively breastfed infants as in nonbreastfed infants, and the metagenomic data identified pathways for amino acid and peptide utilization, without any obvious clusters for carbohydrate metabolism. Based on the unexpected observation that exclusive breastfeeding increases the Campylobacter abundance in infants <1 year of age, more studies are needed to assess the effect of breast milk on Campylobacter infection, particularly since a recent study in preterm pigs fed human milk oligosaccharides showed increased levels of Proteobacteria dominated by Campylobacter and Helicobacter with longer HMO exposure (40). In particular, targeted epidemiological studies are required to determine whether higher Campylobacter levels are correlated with more severe dysenteric diarrhea. While C. jejuni or C. coli strains were isolated from all infants included in our study, 16S rRNA analysis did not detect Campylobacter in all the corresponding fecal DNA samples. This is consistent with our previous studies on chicken gut microbiome compositions that demonstrated that C. jejuni cannot be detected by 16S rRNA sequencing when <10^5 CFU per g of cecal content are present (41,42) and with studies by other groups that found that culture-based methods need to be used for bacteria present at levels below the detection threshold (<10^6 CFU/g feces) to surpass the "depth bias" of genomic approaches (43,44).
C. jejuni and C. coli are the two most commonly isolated pathogenic Campylobacter species worldwide (8); however, recently emerging non-C. jejuni/C. coli Campylobacter species have also been associated with gastrointestinal diseases and diarrhea (15). Here, we show that C. jejuni and C. coli are the predominant Campylobacter species in infants <1 year of age in GEMS, and non-C. jejuni/C. coli campylobacters, especially "Candidatus Campylobacter infans" and C. upsaliensis, are also prevalent in this age group. The prevalence of other Campylobacter species may be underestimated in this study since only fecal samples positive for C. jejuni and C. coli were included. Nonetheless, the new species "Candidatus Campylobacter infans" is detected at the same level as C. coli and accounts for approximately 1.6% of the fecal microbiome in exclusively breastfed infants with diarrhea and, in one extreme case, 83% of the total fecal microbiome, which suggests that "Candidatus Campylobacter infans" can be a relevant pathogen in infants. As the third most prevalent Campylobacter species in this study, C. upsaliensis has been previously reported to cause diarrhea in humans (45), and a recent study demonstrated that non-C. jejuni/C. coli species, such as C. hyointestinalis, account for more Campylobacter infections in children <2 years of age in Peru (15,46). In the present study, coinfection with multiple Campylobacter species is not uncommon, which provides more opportunities for genomic rearrangements between campylobacters to escape selective pressures and alter virulence properties (47). Taken together, our data indicate that vaccines to prevent Campylobacter infection should not only focus on eliminating C. jejuni and C. coli but also consider other nonthermophilic species, including "Candidatus Campylobacter infans" and C. upsaliensis, for preventing diarrheal disease in infants in LMICs (42,48).
However, among all the infants with Campylobacter infection (both symptomatic and asymptomatic), approximately 16% are infected solely with Campylobacter, and 84% showed comorbidities with other diarrheal pathogens (viruses, other bacteria, and parasites), including 67% of infants diagnosed with Campylobacter-related diarrhea. Many studies have reported that coinfection with multiple intestinal pathogens is common in infants in LMICs, and one proteobacterial infection (such as Escherichia coli, Salmonella, and Campylobacter) increases the host's susceptibility to other Proteobacteria (i.e., the proteobacterial bloom), so this should be kept in mind when considering the correlations observed in this study (1,49,50). Overall, our results indicate that approaches to control the risk of infection by other intestinal pathogens, such as sanitization, water treatment, and vaccination, are important to reduce diarrheal diseases in low-resource settings, but microbial gut compositions may also play a role.
In this study, we demonstrate that infants with diarrhea had the lowest fecal microbial diversity, consistent with data from previous studies that showed that diarrheal disease decreases the diversity of the gut microbiome (51). Diarrhea-free controls had significantly higher gut microbial diversity, which shows that asymptomatic Campylobacter infection does not devastate the gut microbiota as seen in diarrheal cases. When comparing cases and controls, the levels of several different genera were significantly elevated in controls among both exclusively breastfed and nonbreastfed infants, most of which were negatively correlated with Campylobacter yet positively correlated with each other, including Blautia and Dorea, Blautia and Faecalibacterium, and Erysipelatoclostridium and the Ruminococcus gnavus group. Among these genera, Blautia has been observed to have beneficial anti-inflammatory effects in illnesses such as graft-versus-host disease (52); however, other co-occurring bacteria (such as Faecalibacterium prausnitzii) may also protect against diarrhea (53), indicating that more studies need to be done to explore the potential protective roles of other microbes in diarrheal diseases.
Overall, our study points to the need to further explore the contributions of emerging pathogens such as "Candidatus Campylobacter infans" and C. upsaliensis to diarrheal disease in infants in LMICs and the potentially protective roles of intestinal bacteria, particularly those of the Blautia genus, that are associated with asymptomatic Campylobacter carriage. We also report the unexpected observation that a greater Campylobacter abundance is detected in exclusively breastfed infants with diarrheal disease, and further epidemiological studies with corroborating experimental work are warranted. These infants show a reduced incidence of infection with C. jejuni and C. coli fucose-utilizing strains, suggesting that HMOs may indeed function as decoys for fuc+ strains but also select for strains with metabolic properties that render them more capable of thriving in the infant gut. Current studies are focused on the interplay between C. jejuni chemotaxis, binding, and metabolism in exclusively breastfed infants and the development of inexpensive novel intervention and treatment strategies.

MATERIALS AND METHODS
Samples. Fecal DNA samples were obtained from GEMS or the follow-up study GEMS1a, and the study design and methodology were previously described (24). Briefly, GEMS was a prospective case-control study of moderate-to-severe diarrhea (MSD) in children under 5 years of age conducted between 1 December 2007 and 3 March 2011 in seven low- and middle-income countries: The Gambia, Kenya, Mali, Mozambique, Bangladesh, India, and Pakistan. Cases with acute diarrhea were included, each representing a new episode and meeting at least one criterion of MSD (sunken eyes, loss of skin turgor, initiation of intravenous rehydration, dysentery, or hospitalization with diarrhea or dysentery). Controls had no diarrhea in the previous 7 days, were enrolled from the same demographic surveillance system area as the cases, and were matched to cases by age, sex, residence, and time. Illness management strategies, including feeding, fluid administration, and breastfeeding practices (no breastfeeding, partial breastfeeding, and exclusive breastfeeding), were queried from the caretakers. A follow-up study called GEMS1a also enrolled children with MSD as well as children with less-severe diarrhea (LSD) (acute diarrhea without signs of MSD) and matched controls from 31 October 2011 to 14 November 2012, using essentially the same methodology as the one for GEMS. To study the spread of Campylobacter and the effect of breastfeeding on campylobacteriosis, we included DNA samples isolated from stools of cases (both MSD and LSD) and controls under 1 year of age whose caretakers indicated that they were exclusively breastfed or not breastfed (infants who were still breastfed but also received complementary food were excluded from the analysis). Only samples from which Campylobacter had been isolated were included; isolation used selective Campy-BAP plates for cephalothin-resistant Campylobacter species, including most C. jejuni and C. coli strains, grown at 42°C for 3 days. Subsequently, a single colony per sample was subcultured and used in this study (9).
16S rRNA sequencing and data analysis. The V6-V8 and V4 hypervariable regions of the 16S rRNA gene were amplified from fecal DNA using universal primers described previously (42,54). The amplification products were cleaned up, normalized, barcoded, and sent to the Georgia Genomics and Bioinformatics Core (GGBC) for sequencing using the Illumina MiSeq PE300 kit. Normalization was based on the double-stranded DNA (dsDNA) concentration detected by using the NanoDrop One C system (Thermo Scientific) and a Qubit 2.0 fluorometer (Thermo Scientific). A total of 12,387,812 sequences were obtained, with a median of 53,166 sequences per sample. DADA2, an open-source software package for modeling and correcting Illumina-sequenced-amplicon errors, was used in QIIME2 to filter, trim, dereplicate, and join the yielded paired-end sequences and construct an amplicon sequence variant (ASV) table for further analysis (55-57). The alpha diversity of the fecal microbiome was calculated using the Shannon index in QIIME2. Vsearch was used to cluster representative sequences against the SILVA SSU r128 database with 99% identity (58) to known operational taxonomic units (OTUs). Correlation between microbial OTUs was calculated using SparCC, a method to estimate correlation values from compositional data (59).
[...] which was confirmed to also work for C. coli strains (data not shown). To verify that a fucP PCR product corresponded to a strain's ability to use fucose, we performed growth experiments in a subset of the strains lacking fucP and all fucP+ strains as described previously (16). Briefly, Campylobacter strains were adjusted to an optical density at 600 nm (OD600) of 0.05 and inoculated into 5 ml MEMα (minimum essential medium α; Thermo Fisher Scientific) (with 20 μM iron added) with or without 25 mM L-fucose. The inoculated culture was incubated microaerobically at 37°C for 18 h, and the OD600 was measured using the NanoDrop One C system (Thermo Scientific).
Screening for the ggt gene was carried out as previously described (29). To test for D-glucose utilization, we performed growth experiments as described above, using a random subset of 33 strains in DMEM (Dulbecco's modified Eagle's medium) without glucose (Thermo Fisher Scientific), and the strains were incubated microaerobically for 48 h. Campylobacter cuniculorum LMG 24588 was used as the positive control.
Verification of C. jejuni, C. coli, C. upsaliensis, C. lari, and C. hyointestinalis. Representative sequences of 16S rRNA (V6-V8) of Campylobacter classified by Scikit-learn (60) in QIIME2 were aligned and classified using SINA with the SILVA SSU r128 database, and bacterial species with a 0.99 minimum identity to the query sequence were obtained (61). A multiplex PCR of the lipid A biosynthesis gene lpxA was used to confirm the presence of C. jejuni, C. coli, C. upsaliensis, and C. lari in fecal DNA, as previously described (31). A novel lpxA forward primer was designed here for C. hyointestinalis (lpxAC.hyo [CAA AGA CGC GGT TTT GGG CGA TGA AGT CGT GG]). This primer is compatible with the lpxA primers described previously (31), yielding (in conjunction with the previously described reverse primer) a product of 285 bp. The lpxAC.hyo and lpxARKK2m primer set was used to verify the presence of C. hyointestinalis (both subspecies) in fecal DNA.
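As a quick sanity check on a primer such as the one listed above, its length and GC fraction can be computed directly from the printed sequence. The following is a minimal sketch, not part of the published workflow; the function name primer_stats is hypothetical, and the only input taken from the text is the lpxAC.hyo sequence.

```python
def primer_stats(primer: str) -> tuple[int, float]:
    """Return (length in nt, GC fraction) for a primer written with optional spaces."""
    seq = primer.replace(" ", "").upper()
    # Reject anything that is not an unambiguous DNA base.
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be non-empty and contain only A, C, G, T")
    gc_count = sum(base in "GC" for base in seq)
    return len(seq), gc_count / len(seq)


# Sequence copied from the text (lpxAC.hyo forward primer).
LPXA_C_HYO = "CAA AGA CGC GGT TTT GGG CGA TGA AGT CGT GG"

length, gc_fraction = primer_stats(LPXA_C_HYO)
print(f"lpxAC.hyo: {length} nt, GC = {gc_fraction:.1%}")
```

The same helper can be applied to any other primer string; it makes no claim about melting temperature or specificity, which would need dedicated primer design tools.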
Detection of a new Campylobacter species. Illumina reads were initially trimmed to remove adapters and quality filtered (>Q20) using Trimmomatic (v0.39) (62), and human genome contamination was then removed by mapping reads to the human GRCh37/hg19 reference genome with Bowtie2 (v2.1.0) (63). Next, filtered reads were submitted to MG-RAST (64) for gene function prediction and taxonomic abundance determination, while the reads were assembled and the taxonomic abundance was confirmed using the following programs: metaSPAdes (assembly; v3.12.0) (65), Kraken2 (taxonomic abundance; v2.0.8) (66), and Bracken (taxonomic abundance; v2.0.0) (67). The metaSPAdes assembly resulted in 33 contigs identified as Campylobacter, including 24 contigs of >5,000 bp, while an additional assembly with Newbler (v2.6) yielded 75 contigs identified as Campylobacter, including 56 contigs of >5,000 bp. Both sets of assembled Campylobacter contigs from metaSPAdes and Newbler were independently analyzed for completeness and contamination using CheckM software (v1.0.12) (68), which demonstrated that the 33 contigs from metaSPAdes were 94.2% complete, with 4.9% contamination, whereas the 75 contigs from Newbler were 98.5% complete, with only 0.19% contamination (classified as an extremely low level of contamination and within the margin of error for the CheckM software). Therefore, all additional metagenomic Campylobacter analyses were conducted with the contigs from the Newbler assembly. Coding sequences present within the 75-contig set were determined using GeneMark (69). A custom protein database was constructed, which combined the proteomes derived from all currently validated (and fully annotated) Campylobacter and Arcobacter genomes (one to six proteomes per taxon) present in GenBank. The metagenomic proteome was compared to this custom database through an all-versus-all pairwise BLASTP analysis.
Genes within the metagenome were assigned a function based on matches above the minimum requirement of 40% similarity and where the match length was ≥75% of the subject and query protein lengths. The sequences of 20 core genes (aroC, atpA, dnaN, eno, fabH, frr, glnA, groEL, hemB, ileS, lpxA, miaB, mrp, nrdB, pnp, prfA, queA, speA, spoT, and tkt) and their cognate proteins were extracted from the metagenome and 50 Campylobacter, Arcobacter, and Sulfurospirillum genomes. These gene and protein sets were concatenated in the order mentioned above for each genome and aligned using CLUSTALX. Neighbor-joining phylogenetic trees were constructed using MEGA v6 (70): the nucleotide tree was constructed using the Kimura two-parameter model, and the amino acid tree was constructed using the Poisson model. Additionally, the 75 Campylobacter contigs were concatenated and used for an average nucleotide identity (ANI) analysis (71), along with the genomes of 45 Campylobacter, Arcobacter, Sulfurimonas, Sulfurospirillum, and Helicobacter taxa. Using the metagenome sequence, novel lpxA (lpxAC.GEMS [GCA AAA AGC AGT GGC GAG GGT TG], compatible with lpxARKK2m) and atpA (atpA-F/R [GTG AGC GCA AAA TTA AAA GCA G/TTA TTC AGC ACT AAA AGT AGC C]) primers were designed and used to verify the presence of this putative new Campylobacter species in the other 25 fecal samples. The atpA sequences were aligned using CLUSTALW, and neighbor-joining phylogenetic trees were constructed with MEGA v7 using the Tamura three-parameter method (72).
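The function-assignment criterion stated at the start of this paragraph (at least 40% similarity, with the match spanning at least 75% of both the query and the subject protein) can be sketched as a simple filter. This is an illustrative sketch only, not the authors' script: the Hit record and the function name are hypothetical, "match length" is assumed to mean alignment length in residues, and "similarity" is assumed to be the percent similarity reported for the alignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical summary of one BLASTP match (field names are assumptions)."""
    query_len: int         # length of the metagenomic protein, in residues
    subject_len: int       # length of the database protein, in residues
    align_len: int         # length of the aligned region, in residues
    similarity_pct: float  # percent similarity over the aligned region


def passes_assignment_filter(hit: Hit,
                             min_similarity: float = 40.0,
                             min_coverage: float = 0.75) -> bool:
    """Apply the similarity and two-sided length-coverage thresholds."""
    covers_query = hit.align_len >= min_coverage * hit.query_len
    covers_subject = hit.align_len >= min_coverage * hit.subject_len
    return hit.similarity_pct >= min_similarity and covers_query and covers_subject


# Illustrative values only; these are not data from the study.
good = Hit(query_len=300, subject_len=310, align_len=290, similarity_pct=62.0)
partial = Hit(query_len=300, subject_len=600, align_len=290, similarity_pct=62.0)
print(passes_assignment_filter(good), passes_assignment_filter(partial))
```

Requiring coverage of both proteins is what excludes matches such as the second example, where a short query aligns to only one domain of a much longer database protein.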
Statistics. For analysis of the fecal microbiome, a Kruskal-Wallis test was used to compare alpha diversity values between different groups; differences in the relative abundances of bacterial phyla between groups were calculated by a Kruskal-Wallis H test and corrected by the false discovery rate (FDR) in STAMP (73). For analysis of strains capable of metabolizing fucose, a chi-square test was used to determine if the proportion of samples that can metabolize fucose in both cases and controls significantly differed from 50%.
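The chi-square comparison against 50% described above is a one-degree-of-freedom goodness-of-fit test, which can be sketched without external libraries. The sketch below assumes no continuity correction (the text does not say whether one was applied), the function name is hypothetical, and the counts in the example are illustrative rather than the study's data. For one degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(x/2)).

```python
import math


def chi_square_vs_half(n_utilizing: int, n_total: int) -> tuple[float, float]:
    """Test whether the proportion of fucose-utilizing strains differs from 50%.

    Returns (chi-square statistic, P value) for a goodness-of-fit test with
    1 degree of freedom and no continuity correction (an assumption here).
    """
    if n_total <= 0 or not 0 <= n_utilizing <= n_total:
        raise ValueError("counts must satisfy 0 <= n_utilizing <= n_total, n_total > 0")
    expected = n_total / 2  # expected count in each category under the 50% null
    observed = (n_utilizing, n_total - n_utilizing)
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts for illustration: 10 utilizing strains out of 40 tested.
chi2, p = chi_square_vs_half(10, 40)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

With small group sizes, an exact binomial test may be a more conservative choice than the chi-square approximation.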

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.