Ecological Stability Properties of Microbial Communities Assessed by Flow Cytometry

Microbial communities drive many processes which affect human well-being directly, as in the human microbiome, or indirectly, as in natural environments or in biotechnological applications. Due to their complexity, their dynamics over time is difficult to monitor, and current sequence-based approaches are limited with respect to the temporal resolution. However, in order to eventually control microbial community dynamics, monitoring schemes of high temporal resolution are required. Flow cytometry provides single-cell-based data in the required temporal resolution, and we here use such data to compute stability properties as easy to interpret univariate indicators of microbial community dynamics. Such monitoring tools will allow for a fast, continuous, and cost-effective screening of stability states of microbiomes. Applicable to various environments, including bioreactors, surface water, and the human body, it will contribute to the development of control schemes to manipulate microbial community structures and performances.

The original inoculum contained 15 known OTUs distributed over 14 genera while the pre-cultured inoculum contained 19 OTUs distributed over 15 genera.
Thirty samples in addition to the two mock strains were sequenced by Illumina® to a maximum of 50 000 reads per sample. This low number of reads was chosen because we were interested to verify the species richness in the AMC (which consists of only three organisms), in the CMC, and in the sorted gates of both microbial communities. To get a high resolution in diversity of samples was not the aim of this experiment. All samples (besides the mock strains) were subsampled to the minimum reads (2 631 reads, green line; maximum reads were 49 397) to enable for comparison of samples according to the workflow and testing of Props (Props et al. 2016). The rarefaction curves of all samples are presented in Figure S5.2 and the minimal amount of reads are marked by the green line ( Fig. 5.2). Table S5.1: List of the samples sorted (controls and gates) with their respective information; sampling time, gate number and the number of cleaned sequences obtained for each of them. The sample which has been taken as a reference for the normalization procedure is marked in grey. The gates (Fig. S5.4) that were investigated by Illumina® sequencing were sorted using the sort option of the MoFlo by sorting 500 000 cells at a rate not higher than 2 500 particles per second and using the 'single-sort high-purity mode'. Figure S5.3 represents example gates which cover diverse numbers of OTUs that were phylogenetically affilitated by using the SILVA data base version 123. It needs to be mentioned that even by using the 'single-sort high-purity mode' (99 %) a sorting error of 1 % can be expected. Figure S5.3 clearly shows that some of the gates almost entirely comprise only one pure OTU.

S5.2: DNA Extraction and quality testing
The DNA extracted from the sorted 500 000 cells following the Chelex protocol (Koch et al. 2013) was not enough in quantity to be detected by the Qubit® 3.0 for quality control (Life Technologies, Carlsbad, California, USA). Therefore a PCR step was performed to evaluate the quality of the isolated DNA by testing their amplified products by gel-electrophoresis. The PCR step was done with 35 cycles in a S1000 Thermal cycler (Biorad) by using the universal primers Forward 27F 5'-AGAGTTTGATCMTGGCTCAG-3' and Reverse 1492R 5'-TACGGYTACCTTGTTACGACTT-3' following the recommandations of Lane (1991).

S5.3: Library preparation for Illumina®
DNA directly isolated from sorted cells was used for the PCR reaction to prepare a library for the Illumina® sequencing. Primers were designed to amplify the V3-V4 regions of the bacterial 16S rRNA Miseq machine using the V3 kit, 2 x 300 bp (V3 kit, Illumina, San Diego, California, USA).

S5.4: Evaluation of sequence data
Regarding the average quality score of the sequencing data set (q = 27.25) the reduction of sequencing and PCR errors was done by merging the Forward and Reverse sequences before the library demultiplexing and cleaning procedure by using Mothur version 1.36 (Schloss et al. 2009). After removing chimeras by Uchime (Edgar et al. 2011) the remaining sequences were gathered into OTUs using Mothur's average neighbor clustering algorithm and a 97 % sequence similarity cut off. The rarefaction curves were drawn using ggplot2 package in R (Wickham 2009). The analysis was done from a sequencing set of 615 096 Forward-Reverse overlapped sequences. Mothur was also used for the normalisation procedure that enabled to compare samples of different read numbers (samples from AMC and CMC) which were subsampled to 2 631 reads (see also Section S5.1).
OTU classification was done in Mothur using the data base SILVA version 123 (Quast et al. 2013).
Obtained rarefaction curves contained a certain number of OTUs per sample, when a plateau was reached for each of them. The two mock strains showed only one OTU, respectively, while the mock community ended up with eight OTUs under the conditions that 2 631 reads were used as part of the normalization procedure and a relatively high threshold of 0.8 % was chosen. This threshold was defined according to recommendations given by Bokulich (Bokulich et al. 2013) towards an adequate OTU abundance threshold. This threshold allowed us to detect the three OTUs for the samples belonging to the AMC and to define a maximum list of 31 different OTUs found in the bioreactor (435 h, Table S5.2). The Illumina® sequencing data was used to verify the cytometric data. For the two mock strains only one OTU respectively was found while the whole CMC mock community was represented by eight most abundant OTUs which were present at the sampling time of 242 h. For a natural microbial community, OTU number expectation can be higher than what we found in this study. However, we choose a procedure that aimed at the determination of the most abundant instead of rare and unknown OTUs.
The PCR reaction was set to only 28 cycles to avoid over-amplification of the abundant species and to exclude the rare biosphere which would need more cycles to appear in our analysis. In addition, samples were pooled and only a maximum of 50 000 -200 000 reads per sample were sequenced. The OTU abundance threshold was set to 0.8 % in our application which is rather conservative (Degnan & Ochman 2012 Cell sorting was used to test for presence and location of organisms (Section S5.1). The sorting scheme is shown Figure S5.4. For the CMC mock community a maximum of eight OTUs were found above the defined 0.8 % threshold ( Fig. S5.6). 12 gates were sorted and comprised between one and eight OTUs which are listed in Table   S5.2. The data show that the establishment of the CMC (at 216 h) was not as prevailing as expected from the start because the Bacillus was still found dominant in gates G10, G16, G20, and G21 at 220 h. Later on, after 238 h very different organisms can be detected in the gates which are also different between gates. The members of the original AMC were not present any more besides the very low abundancies of the Comamonadaceae (G5, 1.06 %) at the end of the continuous reactor experiment (412 h).

S5.5: Influence of PFA and ethanol treatments on sequencing results
The influence of PFA stabilization and ethanol storage was tested using a wastewater sample. Only minor differences were found between fresh and PFA/ethanol fixed samples (Fig. S5.7).