Key sub-community dynamics of medium-chain carboxylate production

Background
The carboxylate platform is a promising technology for substituting petrochemicals in the provision of specific platform chemicals and liquid fuels. It includes the chain elongation process that exploits reverse β-oxidation to elongate short-chain fatty acids and forms the more valuable medium-chain variants. The pH value influences this process through multiple mechanisms and is central to effective product formation. Its influence on the microbiome dynamics was investigated during anaerobic fermentation of maize silage by combining flow cytometric short-interval monitoring, cell sorting and 16S rRNA gene amplicon sequencing.

Results
Caproate and caprylate titres of up to 6.12 g L−1 and 1.83 g L−1, respectively, were achieved in a continuous stirred-tank reactor operated for 241 days. Caproate production was optimal at pH 5.5 and connected to lactate-based chain elongation, while caprylate production was optimal at pH 6.25 and linked to ethanol utilisation. Flow cytometry recorded 31 sub-communities with cell abundances varying over 89 time points. It revealed a highly dynamic community, whereas the sequencing analysis displayed a mostly unchanged core community. Eight key sub-communities were linked to caproate or caprylate production (rS > |±0.7|). Amongst other insights, sorting and subsequently sequencing these sub-communities revealed the central role of Bifidobacterium and Olsenella, two genera of lactic acid bacteria that drove chain elongation by providing additional lactate, which served as electron donor.

Conclusions
High-titre medium-chain fatty acid production in a well-established reactor design is possible using complex substrate without the addition of external electron donors. This will greatly ease scaling and profitable implementation of the process. The pH value influenced the substrate utilisation and product spectrum by shaping the microbial community. Flow cytometric single-cell analysis enabled fast, short-interval analysis of this community and was coupled with 16S rRNA gene amplicon sequencing to reveal the major role of lactate-producing bacteria.

Electronic supplementary material
The online version of this article (10.1186/s12934-019-1143-8) contains supplementary material, which is available to authorized users.


S2: Feed composition
S3: Substrate degradation
S4: Gas production and composition
S5: Miscellaneous reactor parameters
S6: Gas chromatography detection and calibration limits
S7: Concentrations of non-target carboxylates
S8: Flow cytometric analysis - gating strategy
S9: Flow cytometric analysis - controls
S10: Flow cytometric analysis - microbial community dynamics
S11: Correlation analysis
S12: Flow cytometric cell sorting
S13: Sequencing protocols and details
References

S1: Fermenter system setup
Figure S1: Reactor system consisting of A fermenter vessel, B overhead stirrer, C heating system, D substrate input port, E gasbag, F gas flow meter, G pH sensor, H pH controller, I sodium hydroxide reservoir, J peristaltic pump and waste port below the fermenter vessel.
The fermenter was run semi-continuously and fed every 24 h. Prior to the feeding procedure, 4 L of the fermentation broth was drained and recirculated into the feeding port to prevent the build-up of a floating layer and to homogenise the fermentation broth.

S2: Feed composition Corn Silage
Over the course of the experiment three different corn silage batches from a nearby farm in Neichen, Germany, were fed into the fermenter. The first batch was fed from day 1 to 55, the second from day 56 to 139, and the third from day 140 to the end on day 241. The freshly obtained corn silage was manually compressed in 80 L plastic barrels with gastight caps and stored at 4 °C. The barrels were opened biweekly to prepare the daily substrate portions and the barrel headspace was flushed with nitrogen to reduce the oxygen exposure and the risk of mould infestation. The prepared substrate was stored in vacuum bags at 4 °C. The soluble (Table 1 S2) and insoluble (Table 2 S2) compounds of the substrate as well as its pH value were analysed.

[1]) and measured by headspace gas chromatography according to published protocols [2], detection and calibration limits in Additional file 1 S6). The measurements were performed in triplicate.

Table 2 S2: Composition of the corn silage batches fed into the reactor over the course of the experiment as determined by substrate analysis according to standard procedures [3,4]. The pH value, total solids (TS), volatile solids (VS) corrected according to [5], raw water, raw ash, raw protein, raw fat, raw fiber, nitrogen-free extractives (NfE), non-fibrous carbohydrates (NFC), hemicellulose, cellulose, lignin, neutral detergent fiber (NDF) and acid detergent fiber (ADF) are given.

) and urea (up to 10 g L−1 d−1) added before did not result in increased TAN concentrations.

Table S3: Average degrees of substrate degradation during the experimental stages with their respective standard deviations. VS values in the fermentation broth were measured two to three times a week. The degree of degradation was calculated according to [6] and is shown as an average over each experimental stage.
Table S10: Relative cell abundances of the 31 sub-communities of the master gate template (gating strategy in Additional file 1 S8) over all sampling points in %. The time periods of the eight experimental stages are colour-coded along the x axis: 1 - start-up, 2 - TAN shortage, 3 - consolidation, 4 - pH 5.75, 5 - pH 6.0, 6 - pH 6.25, 7 - pH 6.5 and 8 - pH 7. The relative cell abundances of sorted sub-communities are shaded in light green (Additional file 1 S11, S12).

Table 1 S11. Sub-communities chosen for sorting are marked in bold and additionally provided in Additional file 1 Table 3 S11. The strong correlations (rS > |±0.7|) these sub-communities were chosen for are marked with a white dot.

Flow cytometry is able to detect the alternating physiological states of proliferating populations and communities [7,8]. This is illustrated by G16, G17, G18, G21 and G23, which were dominated by the same pair of organisms, namely Bifidobacterium and Olsenella (Figure 5, main manuscript). Single species can develop multiple distinctly clustering subpopulations comprising cells of different size (linked to FSC) and DNA content (linked to DAPI fluorescence). This effect is generally caused by the proliferation cycle of the respective species, which is in turn tightly connected to its metabolic activity. The respective gates are typically situated close to each other in the gate template and display cells with one, two and multiple chromosomes. This was also observed in this study (Figure 1 S12) and was clearly exemplified by the E. coli BL21 (DE3) strain used as biological control during cytometer set-up (Additional file 1 S9).
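The selection criterion used for the correlation analysis (rS > |±0.7|) can be illustrated with a minimal sketch of a Spearman rank correlation followed by the threshold test. The function names and the example values are hypothetical and serve only as illustration; the software actually used for the correlation analysis is not stated here, and ties are handled by average ranks as an assumption.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rS: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def is_strong(r_s: float, threshold: float = 0.7) -> bool:
    """True if |rS| exceeds the threshold (0.7, as in the criterion above)."""
    return abs(r_s) > threshold


# Hypothetical example: relative abundance of one gate (%) vs. caproate (g L-1).
gate_abundance = [2.1, 3.4, 5.0, 4.8, 7.9, 9.3]
caproate = [1.0, 1.6, 2.9, 2.5, 4.4, 6.1]
r_s = spearman(gate_abundance, caproate)
print(f"rS = {r_s:.2f}, strong correlation: {is_strong(r_s)}")
```

Because the coefficient is computed on ranks, it captures monotonic rather than strictly linear relationships, which suits abundance data that span several orders of magnitude.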
S13: MiSeq amplicon sequencing protocols and details
Figure S13: Rarefaction analysis of 16S rRNA gene amplicon sequencing reads with A: raw data and B: data normalised to 69,760 reads (green line) and classified to genus level.

with 35 cycles in a S1000 Thermal Cycler (Bio-Rad, Hercules, California, USA) by using the universal primers forward 27F 5'-AGAGTTTGATCMTGGCTCAG-3' and reverse 1492R 5'-TACGGYTACCTTGTTACGACTT-3' according to [10]. All utilised primers were synthesised by Eurofins (Eurofins Scientific, Luxembourg City, Luxembourg).
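The primer sequences contain degenerate positions written in IUPAC nucleotide codes (for example M in 27F and Y in 1492R). As a minimal sketch, and not part of the published protocol, the following shows how such a primer can be expanded into a regular expression and how many distinct oligonucleotides it represents. The function names are hypothetical.

```python
import re
from math import prod

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer: str) -> "re.Pattern[str]":
    """Translate a degenerate primer into a compiled regular expression."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        # Unambiguous bases are emitted as-is, degenerate ones as a character class.
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


# Primer sequences as given in the text.
primers = {
    "27F": "AGAGTTTGATCMTGGCTCAG",
    "1492R": "TACGGYTACCTTGTTACGACTT",
}
for name, seq in primers.items():
    print(name, primer_to_regex(seq).pattern, degeneracy(seq))
```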

Mock community
The mock community MBARC26 [11] was used as positive control for the sequencing run as well as the analysis pipeline. MBARC26, composed of 26 cultivable species (23 bacteria, 3 archaea) in different abundances, was designed to mimic the diversity of a natural microbial community.

Library preparation for Illumina MiSeq sequencing technique
The V3-V4 region of the bacterial 16S rRNA gene was amplified with the primers Pro341F 5'-CCTACGGGNBGCASCAG-3' [12] and Pro805R  kit, 2 x 300 bp, 600 cycles. To minimise technical bias, the PCRs were performed in triplicate. The triplicates were purified and quantified separately. The MBARC26 mock community was processed in the same way to verify the quality of the PCR steps and the subsequent data analysis.

Sequencing data analysis
The Illumina dataset was quality-trimmed with PRINSEQ [14] from the 3' end at a minimum of Q=30 in a window of 20 bases. The remaining sequences were demultiplexed, merged and pre-clustered using Mothur version 1.39 [15]. Chimeras were removed using UCHIME [16]. The OTU classification was done with Mothur's average neighbour clustering algorithm with a 97% sequence similarity cut-off on the SILVA database version 128 [17]. The analysis was based on 3,261,920 forward-reverse overlapped sequences out of 4,146,676 raw sequences. All raw data are available under the BioProject accession number PRJNA504543.
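The 3' quality trimming described above can be sketched as a sliding-window procedure. This is an illustrative re-implementation, not the code of the trimming tool itself: it assumes that the mean Phred score of the last 20 bases is evaluated and that one base is removed per step, neither of which is specified in the text. The function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def trim_3prime(seq: str, quals: Sequence[int],
                min_q: float = 30, window: int = 20) -> Tuple[str, List[int]]:
    """Trim a read from its 3' end until the trailing window passes the cut-off.

    Assumptions (not stated in the text): the window statistic is the mean
    Phred score, and the read is shortened by one base per iteration.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality scores differ in length")
    end = len(seq)
    while end > 0:
        start = max(0, end - window)
        window_quals = quals[start:end]
        if sum(window_quals) / len(window_quals) >= min_q:
            break  # trailing window is good enough, stop trimming
        end -= 1   # drop the last base and test the shifted window
    return seq[:end], list(quals[:end])


# Hypothetical read: 30 high-quality bases followed by a low-quality 3' tail.
read = "ACGT" * 10
scores = [40] * 30 + [10] * 10
trimmed_seq, trimmed_scores = trim_3prime(read, scores)
print(len(read), "->", len(trimmed_seq))
```

With a mean-based window, a few low-quality bases can remain at the 3' end as long as the window average stays at or above the cut-off. A stricter variant would test the minimum score in the window instead.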
Rarefaction curves were plotted using the ggplot2 package in R [18]. The cleaning procedure yielded a total of 1,883,520 forward-reverse overlapped cleaned sequences. The individual samples were represented by 69,760 (G17 sorted on day 52) to 226,958 (G27 sorted on day 185) cleaned reads. The lowest read count, 69,760, was chosen as the subsampling threshold for the normalisation procedure to allow diversity comparison between samples.
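Normalisation to a common read depth amounts to drawing the same number of reads from every sample at random without replacement. The following is a minimal sketch of that idea, assuming per-sample OTU count tables; the function name, the OTU labels and the counts are hypothetical, and the actual normalisation was presumably carried out within the analysis pipeline rather than with this code.

```python
import random
from collections import Counter
from typing import Dict


def subsample_counts(counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Randomly draw `depth` reads without replacement from an OTU count table."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError(f"cannot draw {depth} reads from a sample with {total} reads")
    # Expand the table into one label per read, then sample without replacement.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return dict(Counter(rng.sample(pool, depth)))


# Hypothetical sample, scaled down for illustration; in the study the depth
# was 69,760 reads, the size of the smallest sample.
sample = {"OTU_A": 500, "OTU_B": 300, "OTU_C": 200}
normalised = subsample_counts(sample, depth=100)
print(normalised, sum(normalised.values()))
```

Using the smallest sample as the common depth means that no sample has to be discarded, at the cost of ignoring part of the reads from deeper-sequenced samples.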
The OTU abundance threshold was set to 0.1%. As the study did not focus on the rare biosphere but on the identification of process-relevant active species, the data visualisation focused on organisms with an abundance of over 1%. 24 out of the 26 species comprising the MBARC26 mock community were recovered. The two species that were not detected in our data set, i.e. N. dassonvillei and S. bongori, were of very low abundance in the mock community and were also barely detected by [11].
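The two abundance cut-offs described above (0.1% for retaining OTUs, more than 1% for visualisation) can be sketched as a simple filter on relative abundances. The function names and counts are hypothetical, and treating the 0.1% threshold as inclusive is an assumption, since the text does not specify it.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert read counts into relative abundances in percent."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {otu: 100.0 * n / total for otu, n in counts.items()}


def apply_thresholds(counts: Dict[str, int],
                     keep_pct: float = 0.1,
                     display_pct: float = 1.0):
    """Return (retained, displayed) relative abundances.

    retained:  OTUs at or above keep_pct (inclusive cut-off is an assumption).
    displayed: OTUs strictly above display_pct, as used for visualisation.
    """
    rel = relative_abundance(counts)
    retained = {otu: p for otu, p in rel.items() if p >= keep_pct}
    displayed = {otu: p for otu, p in retained.items() if p > display_pct}
    return retained, displayed


# Hypothetical count table for one sorted sub-community.
counts = {"OTU_A": 9745, "OTU_B": 200, "OTU_C": 50, "OTU_D": 5}
retained, displayed = apply_thresholds(counts)
print(sorted(retained), sorted(displayed))
```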