Investigation of Bar-seq as a method to study population dynamics of Saccharomyces cerevisiae deletion library during bioreactor cultivation

Background Despite the latest advancements in metabolic engineering for genome editing and characterization of host performance, the successful development of robust cell factories used for industrial bioprocesses and accurate prediction of the behavior of microbial systems, especially when shifting from laboratory-scale to industrial conditions, remains challenging. To increase the probability of success of a scale-up process, data obtained from thoroughly performed studies mirroring cellular responses to typical large-scale stimuli may be used to derive crucial information to better understand potential implications of large-scale cultivation on strain performance. This study assesses the feasibility to employ a barcoded yeast deletion library to assess genome-wide strain fitness across a simulated industrial fermentation regime and aims to understand the genetic basis of changes in strain physiology during industrial fermentation, and the corresponding roles these genes play in strain performance. Results We find that mutant population diversity is maintained through multiple seed trains, enabling large scale fermentation selective pressures to act upon the community. We identify specific deletion mutants that were enriched in all processes tested in this study, independent of the cultivation conditions, which include MCK1, RIM11, MRK1, and YGK3 that all encode homologues of mammalian glycogen synthase kinase 3 (GSK-3). Ecological analysis of beta diversity between all samples revealed significant population divergence over time and showed feed specific consequences of population structure. Further, we show that significant changes in the population diversity during fed-batch cultivations reflect the presence of significant stresses. Our observations indicate that, for this yeast deletion collection, the selection of the feeding scheme which affects the accumulation of the fermentative by-product ethanol impacts the diversity of the mutant pool to a higher degree as compared to the pH of the culture broth. The mutants that were lost during the time of most extreme population selection suggest that specific biological processes may be required to cope with these specific stresses. Conclusions Our results demonstrate the feasibility of Bar-seq to assess fermentation associated stresses in yeast populations under industrial conditions and to understand critical stages of a scale-up process where variability emerges, and selection pressure gets imposed. Overall our work highlights a promising avenue to identify genetic loci and biological stress responses required for fitness under industrial conditions.

Background Saccharomyces cerevisiae is one of the most widely used microbial hosts in biotechnological processes and has been engineered to produce a variety of industrially relevant compounds ranging from pharmaceuticals to biofuels [1][2][3][4]. Production strains are typically engineered and optimized in small scale cultivations, while the bioproduction processes take place in large scale bioreactors. Even though it is understood that microbial physiology at larger scales differ from that in shake flask batch cultures, it is typically left to later project stages to optimize the strain and conditions for production at scale [5,6]. Previous work has indicated that before any candidate strains are employed in industrial scale production environments, it is prudent to adopt de-risking steps that characterize and optimize the strain performance at industrially relevant scales [7]. Given the low throughput and high cost of large-scale bioconversion and fermentation, this step represents a bottleneck for the development of industrially useful microbial factories [5]. A better understanding of the differences between the culturing conditions in shake flask and large-scale bioreactors will aid our ability to preemptively engineer better microbial hosts before attempting costly and risky scale up.
Multiple functional genomics methods (such as transcriptomics, proteomics, metabolomics, fitness profiling, and fluxomics) have been used to examine critical aspects of strain development such as carbon source utilization, tolerance to toxic substrates or final products, and metabolic flux optimization [8][9][10][11][12][13]. However, few studies applying omics-level techniques have investigated the differences in process scales on the physiology of a microbial production strain [14][15][16][17]. Furthermore, most reports focus on individual aspects of the fermentation process, including oxygen supply [18][19][20] as well as substrate heterogeneity [21][22][23][24]. While these studies shed light on specific physiological changes due to known stresses encountered during the scale up process, such as mass transfer and nutrient heterogeneity, there still remains a dearth of knowledge regarding the biological impact of stresses and bottlenecks on the microbial population at various scales.
Among genome-wide techniques that have proven valuable, massively parallel fitness profiling techniques such as Transposon-Sequencing (Tn-Seq) [25], Barcode-Sequencing (Bar-seq) [26], and Random Barcode Transposon Sequencing (RB-Tn-Seq) [27] allow rapid identification of genetic loci controlling myriad important phenotypes. In this study, we use Bar-seq, which quantifies changes in the population by measuring changes in the abundance of short nucleotide barcodes associated with a known mutation. The Bar-seq methodology has been used extensively since the advent of barcoded deletion and overexpression yeast collections. These collections have led to impressive findings in a wide array of genome-wide phenotypic assays aimed towards increased understanding of biological functions, stress responses, and mechanisms of drug action [28][29][30][31][32]. Similar approaches have been used to assign functions to thousands of genes across many bacteria [33], as well as to identify and eliminate metabolism detrimental to production of a desired molecule [34,35].
Most industrial biotechnological processes employing microbial production hosts are performed using batch or fed-batch cultivation, which are typically more costeffective compared to continuous cultivations [36,37]. In this study, we explored the selective pressures that microbial populations encountered in shake flasks as well as batch and fed-batch bioreactor cultivations to characterize the timing and the identity of these selective pressures. We performed scale-up processes from seed train stage to batch and fed-batch bioreactor cultivation and analyzed the population dynamics of a pooled S. cerevisiae deletion collection using Bar-seq. Our results show that the response to these conditions using the yeast Barseq library strongly depends on the conditions examined and that certain conditions impose a higher selective pressure.

Results
We employed the pooled S. cerevisiae deletion library to examine the impact of different cultivation conditions on the physiology of S. cerevisiae and compared potential global population differences between cultivations in shake flasks versus bioreactors. We were able to characterize the impact of process parameters commonly subject to adjustments during fed-batch cultivations on the library. To better simulate industrial processes, we included a two-stage seed train to generate an adequate amount of actively growing cells to inoculate a production bioreactor, as part of our workflow. To avoid any potential impact from amino acid insufficiencies on the Conclusions: Our results demonstrate the feasibility of Bar-seq to assess fermentation associated stresses in yeast populations under industrial conditions and to understand critical stages of a scale-up process where variability emerges, and selection pressure gets imposed. Overall our work highlights a promising avenue to identify genetic loci and biological stress responses required for fitness under industrial conditions. fitness of the mutant pools, a prototrophic deletion collection was used in all experiments performed in this study [38].

Design of Bar-seq scale up fermentation experiments
It has long been recognized that most strains do not perform the same way when cultivated in a bioreactor compared to a shake flask [6]. The main differences between these two general scenarios are dictated by the geometries of vessels and impellers along with enhanced control capabilities available during cultivations in bioreactors. Better gas transfer rates [5,39,40] can be achieved through presence of impellers and air spargers, better control of nutrient feed rates through external feeding capabilities as well as robust control over pH through acid or base addition. We examined a selection of these conditions between a shake flask cultivation, a bioreactor cultivation in batch mode as well as in fed-batch format, and individually across different fed-batch strategies. To assess the effect of any given variable on the population dynamics of a pooled S. cerevisiae deletion library, we performed growth competition experiments in two sets, each comprising two seed train stages followed by different final cultivation environments varying in either vessel architecture (set 1: bioreactor versus shake flask) or cultivation parameter (set 2: feeding mode, pH) ( Fig. 1, for Additional file 1: Fig S1). For each of the two sets, we used an individual aliquot of the pooled library to inoculate the first seed train stage (seed 1) and measured the relative abundances of mutants within the population at least once every 24 h throughout the course of each cultivation using barcode sequencing [41].
In contrast to microbial cultivation in shake flasks, where gas transfer and mixing are achieved predominantly by shaking, bioreactors employ oxygen spargers and agitators to improve oxygen transfer in the cultures. To test if the difference in gas transfer impacts the population dynamics of the mutant pool, we performed batch cultivations without DO or pH control. All substrate was added to the culture media at the beginning of the cultivation and the composition of the resulting mutant pool after 72 h was compared. To assess potential differences in oxygen transfer in shake flask cultivations, we tested two different shake flask filling volumes (20% V f /V max and 40% V f /V max ) (Additional file 1: Figs. S1, S2).
Additionally, bioreactors allow for pH control to ensure an optimal production environment as well as external feed of additional substrate to extend the process by minimizing nutrient limitations. To assess the impact of a feeding regime and pH on the population dynamics of  [38] was used to inoculate a two stage shake flask seed train in YPD. Growth competition experiments were performed in two sets, each consisting of two seed train stages (Seed 1 and Seed 2) followed by different main cultivation environments. Set 1 (left) varied in vessel architecture that included shake flasks with different culture volumes (SF1 and SF2) and a batch bioreactor (BR). Set 2 (right) varied in cultivation parameter settings that included four fed-batch mode bioreactors (CF4, CF6, DF4 and DF6) with two different feeding modes (CF = constant feed, DF = DO-signal based feed) and two different pH (4 and 6). All cultures were run at 30 °C, the dissolved oxygen (DO) was not controlled. Samples were taken at the end of each seed stage and throughout the bioreactor runs the mutant pool, we performed four fed-batch cultivations: (i) two pH set points, pH 4 and pH 6, and (ii) two glucose feeding schemes, constant rate feeding and a DO signal-based pulse feeding (Fig. 1, Additional file 1: Figs. S1, S3). In a DO signal feeding regimen, a feeding solution is added upon complete exhaustion of available carbon sources and thus is responsive to the metabolic activity of the cells. Whereas, under constant rate feeding regimen, feed solution is continuously added to the fermentation broth independent of the metabolic activity. If the feed rate is not optimized for the process, then this feeding strategy may result in large accumulation of fermentative by-products, including ethanol, due to potential overfeeding of the culture. In all cases, the fed-batch phase was triggered after initial DO spikes, indicating full consumption of the carbon sources available during the batch phase.

Mutant population succession is dictated by fermentation regime
We interrogated the changes detected in the mutant population structure of each cultivation over time and visualized the data using a correlation matrix showing the diversity of the mutant pool via the total number of observed mutants (Fig. 3). Of the 6002 total barcoded genes present in this deletion collection, 2949 were detected at t0 with at least 10 counts. To minimize statistical noise, we only included genes with a threshold of at least 10 counts in the Seed 2 population in this analysis as described by Payen et al. [31]. It may be noted that the general trend of observed mutants per condition was robust and independent of this threshold (Additional file 1: Fig. S4).
Generally, the population structure of the two individual seed trains were highly similar (r > 0.99 across all seed populations) when compared via Pearson correlation of mutant barcode counts (Additional file 1: Fig. S5, Table S1). In the studies investigating vessel geometry (set 1), microbial growth, measured by optical density (OD), and glucose consumption profiles look similar for all batch cultivations, independent of the cultivation vessel (BR vs. SF1 and SF2 in Additional file 1: Fig. S2a). The overall structure of the populations cultivated in both shake flask conditions (SF1 and SF2) and the batch bioreactor (BR) were highly similar when compared via Pearson correlation (Fig. 3) of mutant barcode counts (r > 0.99 across all time points), with minor changes at 48 h (for additional details see Additional file 1: Fig. S2b and c). As the overall population structure is highly similar in all conditions and time points tested, we conclude that the difference in vessel architecture is not reflected in the population structure of the deletion mutant pool.
For population changes between batch, and a fed-batch environment for the tested conditions, we analyzed the abundance of individual barcode counts in populations under the respective conditions, comparing BR to CF 4, 6 and DF 4, 6. Overall, only 8.6% (212 barcodes) of mutant barcodes initially present in the respective seed 2 culture were retained in all populations cultivated in a bioreactor environment independent of the cultivation mode (batch versus fed-batch), and 0.53% (13 barcodes) of mutant barcodes were lost in all conditions at the final timepoints ( Fig. 2b). Upon examination of the mutant populations in the fed-batch conditions specifically, we observed a decrease in the overall diversity of our mutant population over the course of the fermentation for all four conditions (Fig. 2b). As expected from findings from the batch bioreactor study (BR) and the correlation matrix (Fig. 3, Additional file 1: Fig. S3b), over 90% of the mutants in the initial Seed 2 mutant population are still present throughout the batch phase (< 33 h EFT, EFT = elapsed fermentation time) with a highly similar population structure across all conditions (CF 4, 6 and DF 4, 6) before the feed start. Even then, all samples taken up to 48 h EFT show high correlation to each other, indicating little change in the pool of detected barcodes. However, large differences in population composition were observed in the populations grown in a constant rate feeding regime, in CF4 and 6, by the end of the fermentation at 119 h. The constant rate feed resulted in four times higher glucose addition (~ 250 g) but only slightly higher OD values (OD 600 ~ 52) than DO signal based feeding (OD 600 ~ 40, < 75 g glucose added). Most of the glucose in CF 1 and 2 accumulated as the fermentative by-product ethanol (~ 37 g/L), much higher compared to that in DF4 and 6 (< 1 g/L ethanol produced) (Fig. 2a, Additional file 1: Fig S1). Both populations grown under a DO signal-based feeding regime, DF4 and 6, remained highly correlated to the initial population detected in Seed 2 and each other throughout the cultivation independent of the culture pH (DF4 = pH 4, DF6 = pH 6, r = 0.98). Further, less than 200 mutants (i.e. less than 7.5% of the mutants present in the Seed 2 population) in reactor DF6 were lost over the course of Correlation matrix of mutant abundance. Heatmap shows pairwise Pearson correlation coefficients of the raw number of mutant barcodes counted in batch and fed-batch cultures tested with constant rate and DO signal feeding at two levels of pH, 4 and 6. Any gene that had less than 10 barcode counts in every condition was not considered for analysis. Scale bar shows Pearson correlation coefficient from 1 to 0 the entire cultivation, indicating that the conditions (pH 6, DO signal based feeding) did not result in sufficient selective pressure for a significant change of the mutant pool population over the course of 119 h. In both reactors cultivated at pH 4, CF4 and DF4, an additional 10% of the mutants in the respective mutant population were lost within a 24 h window in the fed-batch phase between 48 and 72 h compared to DF6. In contrast, the population cultivated in CF6 (pH 6, constant rate feed) exhibited the start of diversity loss earlier, losing 23% of initial mutant barcodes between 33 h EFT and 48 h EFT and additional 55% of these mutants between 48 and 72 h, resulting in a decrease of its diversity to about 35% of its initial mutant pool after 72 h. This observation is also reflected in the correlation matrix: the mutant population of CF4 (pH 4) showed a modest decrease in correlation to DF6 after 48 h (pH 6, r = 0.88), whereas the CF6 population (pH 6) changed dramatically in comparison to DF6 at that time point (r = 0.12), and only showed moderate correlation to CF4 (r = 0.54). Upon comparison of the barcode abundance trends present under the constant feeding conditions in CF4 and 6, we found that the majority of loss in diversity takes place between 48 and 72 h for both of these populations, which is representative of the onset of stationary phase. This time point seems to coincide with significant accumulation of ethanol in these 2 bioreactor environments reaching ethanol concentration of over 12 g/L by 72 h from 3 g/L at 48 h in both reactors (Fig. 2a, red line), which may suggest that the presence of high concentrations of ethanol in CF4 and CF6 contributes to the loss in diversity under these conditions. Next, we applied standard statistical techniques commonly applied in microbial ecology to better understand how mutant populations diverged from each other over time. These techniques and methods include calculations of Bray-Curtis dissimilarities, which is a statistic to quantify the compositional dissimilarity between two different samples, non-metric multidimensional scaling (NMDS), which attempts to accurately represent the dissimilarity of multidimensional data in lower dimensional space, and beta dispersion, which tests for the homogeneity of dispersion within groups.
An NMDS analysis of Bray-Curtis dissimilarities between mutant populations, revealed that samples diverged from one another as fermentations progressed, indicating that elapsed fermentation time may be an important driver for population divergence (Fig. 4a). To test this hypothesis, we applied common statistical tests on our dataset. Indeed, a one-way ANOVA test revealed that time was a highly significant driver of beta dispersion variance between samples (p = 0.001) and a PERMANOVA test, corrected for repeated measures (which is necessary due to repeated sampling of the same samples over time), showed that time accounted for ~ 36% of beta dispersion variance within the data (p < 0.001), while feeding strategy accounted for ~ 7% of the variance (p = 0.327). Over the course of fermentation, the overall beta dispersion of mutant populations tended to increase (Fig. 4b). When time is not taken into account the beta dispersion in populations from continuously fed reactors was significantly different via ANOVA analysis followed by Tukey HSD, compared to seed cultures (Fig. 4c). These observations indicate that, for this yeast deletion collection, the selection of the feeding scheme which affects the accumulation of the fermentative byproduct ethanol may impact the diversity of the mutant pool to a higher degree as compared to feeding regimes that do not accumulate these byproducts.

Deletion mutants of genes associated with mitochondria are commonly lost across all conditions tested
Amongst the deletion mutants that were maintained throughout the entire fermentation, we observed that each condition (shake flask, batch bioreactor as well as fed-batch processes) resulted in overlapping as well as distinct mutant pools with increased or decreased fitness (Fig. 5). Our analysis revealed that 47 mutants were overrepresented specifically in the batch bioreactor, 4 mutants in the shake flask and 30 mutants were uniquely overrepresented in the fed-batch conditions at the final timepoint. While almost 50% of the mutants overrepresented in the shake flask population, were also overrepresented in the batch bioreactor population (3 of 7 barcodes), no overlap was detected for overrepresented mutants between the fed-batch bioreactors and the shake flask. However, out of 47 mutants overrepresented in the fed-batch environment, 17 were also overrepresented in the batch bioreactor but not in shake flasks.
To better understand the biological relevance of these commonly lost and commonly retained mutants, we performed gene ontology enrichment analysis of these gene lists using the Benjamini-Hochberg procedure to decrease the false discovery rate of our analysis. We found that only 3% of the barcodes initially present in Seed 2 (n = 77) were commonly lost across all the fedbatch bioreactors while 31% of the barcodes (n = 796) were commonly retained by the end of the experiment (119 h) (Fig. 2b). Of the 77 deletion mutants commonly lost across all fed-batch bioreactors, almost 50% (28 genes) are annotated to be associated with mitochondria. Of these 28, the majority of genes (20) are annotated to associate to the mitochondrial envelope and 9 genes are associated with mitochondrial organization including the assembly of the mitochondrial respiratory complex. Of the 796 deletion mutants retained in all fed-batch bioreactors with 787 unique gene identifiers, the annotated genes associated with response to salt stress as well as diphosphate activity were observed as significantly enriched.
Amongst the 10 genes annotated to be associated to cellular response to salt stress (GO ID: 0071472) are MSN4, a stress-responsive transcriptional activator involved in the yeast general stress response as well as all four genes MCK1, RIM11, MRK1, and YGK3, that all encode homologues of mammalian glycogen synthase kinase 3 (GSK-3). Yeast GSK-3 homologues are suggested Fig. 4 Beta-diversity of mutant populations in different generalized feeding regimes over time. a NMDS plot of feeding regimes over time. Colors show timepoints when samples were taken, and symbols show culturing condition. Greater distance between samples (points) indicates greater population dissimilarity (based on Bray-Curtis dissimilarity). b Boxplots show beta dispersion of all samples in the study over time. As time increases there is a general trend of increased beta dispersion. c Boxplots show beta dispersion of all samples in the study grouped by feeding regime, "a" denoted statistical significance between continuously fed and seed populations to promote formation of a complex between the stressresponsive transcriptional activator Msn2p, a paralog to Msn4p and DNA, which is required for the proper response to different forms of stress [42,43]. As these five deletion mutants (msn4 − , mck1 − , rim11 − , mrk1 − , ygk3 − ) were also retained throughout the study performed in the batch bioreactor and SF2, these genes may be promising candidates for gene deletion to improve fitness independent of glucose availability, pH or vessel architecture. However, further experiments need to be performed to confirm this assumption. A summary of identified mutant barcodes and associated annotated gene functions are listed in Additional file 1: Table S2.

Bioreactor-specific stress responses revealed by Bar-seq
In reactors CF4 and 6, which were run under a constant rate feeding scheme, the drastic loss of diversity coincided with the onset of ethanol accumulation. In DF4, the implementation of a DO signal feeding scheme resulted in no substantial accumulation of ethanol but rather a frequent switch from respiratory to fermentative metabolic state triggered by the addition of small glucose boluses that are characteristic of a DO signal feeding scheme (Additional file 1: Fig. S3). To determine whether the groups of mutants that were either lost or selected against during this cultivation period in the respective populations are indicative of the specific stresses imposed on the microbial population, we performed a direct comparison of the mutants specifically reduced or lost in each of the bioreactors followed by gene ontology enrichment analysis of those mutant groups. To allow the determination of statistically significant enrichments of genes in these groups, we excluded the population in bioreactor CF6 from further analysis, as it was reduced to less than 50% of its initial mutant diversity by 72 h.
Overall, 174 mutants were uniquely selected against in bioreactor CF4, 253 in bioreactor DF4 and only 39 mutants were either lost or appreciably selected against in the population cultivated in bioreactor DF6 (Fig. 6). Our analysis revealed that 112 mutants were selected against in both populations cultivated at pH 4 (CF4 and DF4), while only 2 mutants were commonly lost or reduced in all bioreactors at this time period. Glucose addition in DF6 was controlled by measurements of dissolved oxygen within the culture broth (DO signal), which is an indirect indicator for metabolic activity of the cells in suspension. The frequency of DO spikes, and thus the rate at which the cells are consuming all the carbon delivered per feed bolus, significantly decreases over time in case of DF6 compared to DF4, starting at around 50 h EFT. The observed low number of mutants lost in bioreactor DF6 could be attributed to the comparably lower metabolic activity of that culture (Additional file 1: Fig.  S3).
The culture in bioreactor CF4 was maintained at pH 4 and was fed at a constant rate with glucose, which resulted in a high accumulation of the fermentative byproduct ethanol starting at 48 h. The group of mutants selected against during this period of cultivation in CF4 is enriched with genes associated with cellular stress (5.14-fold enrichment, p = 0.0415), autophagy (5.9-fold enrichment, p = 0.0023), and endosomal transport (6.66fold enrichment, p = 0.00144). Of those genes associated with cellular stress, the majority is either associated with DNA damage (MSH4, TSA1, GEM1, DPB4, CMR1, RAD57) or protein misfolding (CMR1, SSM4, CNE1). Additionally, we observed an overrepresentation of genes involved in the Cytoplasm-to-vacuole targeting (Cvt) pathway, a constitutive and specific form of autophagy that uses autophagosomal-like vesicles for selective transport of hydrolases to the vacuole, in the pool of mutants that were specifically selected against in bioreactor CF4 between 48 and 72 h (ATG2, ATG3, ATG4, ATG8, ATG9. ATG31, VPS30). Similar to bioreactor CF4,  Fig. 2b, the majority of population changes occurred before 72 h in fed-batch cultivations. Venn diagrams show the number of mutants common or unique between parameters tested the culture in bioreactor DF4 was also maintained at pH 4 but was fed intermittently based on DO spike signals. Here, the pool of mutants that were significantly selected against was enriched for genes associated with the TCA cycle (6.09-fold enrichment, p = 0.0182) as well as oxidative stress (4.68-fold enrichment, p = 0.0142). The group of mutants selected against predominantly comprised of genes associated with the mitochondria (CIT3, IDH1, COX8, TIM18, UPS3, AIM25, DIC1) as well as general stress response genes (GPH1, GPD1) and oxidative stress responses (MXR1, YBP1, CTA1).
In addition, we found that 220 mutants were underrepresented specifically in the batch bioreactor, 79 in the shake flask and 131 were uniquely underrepresented in the fed-batch conditions at the final timepoint (Fig. 5). Unlike the overrepresented mutant pool analysis, a subset of underrepresented mutants is shared in two conditions and 7 mutants are underrepresented in all three conditions. The majority of these mutants, namely Rps0A − , Rpl8B − , Esa1 − , Ric1 − , Msc6 − , can be associated with ribosomal and translational activity [44][45][46][47]. Similar to the overrepresented mutant pool at the final time point, we observed a large overlap of the underrepresented mutant pool in the shake flask population and batch bioreactor population (47%, 85 of 183 underrepresented in shake flasks). The pool of 131 mutants, specifically underrepresented but still present in fed-batch bioreactors, was enriched with genes associated with mitochondria (p = 0.01, n = 39, GO ID = 0,005,739). Of the 39 mitochondria associated genes, 11 play a role in mitochondrial organization while 7 are associated with cellular respiration.

Discussion
In this study, we demonstrate the feasibility to correlate physiological changes observed under controlled cultivation conditions, including different pH set points and fed-batch regimes, with significant changes in population structures of yeast barcoded mutants. Specifically, we found that the choice of feeding strategy generating toxic by-products had a great effect on the population structure, highlighting the importance of feed rate adjustments and dynamic feeding strategies during the biocatalysis phase in fed-batch processes.
We found the S. cerevisiae population structure to be robust throughout the entire seed train as indicated by a Pearson correlation of population makeup and by evaluation of beta diversity between samples (Figs. 3, 4). This Fig. 6 Mutants selected against in bioreactors from 48 to 72 h. Venn diagram shows mutants that either were lost (had a total barcode count < 10) or were 50% less relatively abundant from 48 to 72 h. Red text shows GO terms that were enriched in lost or less abundant mutants in bioreactor CF4, while green test shows enriched GO terms from bioreactor DF4. Specific genes from each GO term are written below in black finding is critical as significant perturbations in the population structure at seed train stage would render the analysis of subsequent scale-up experiments infeasible. Interestingly, while no significant change in population structure was observed in the seed train, changes in the mutant pool composition were observed in all conditions mimicking production cultivations, including shake flask and bioreactor experiments, allowing us to identify mutants in genes that were specifically selected against in each of the tested conditions.
We found that the pool of deletion mutants specifically selected against in fed-batch conditions tested in this study was enriched with genes associated with mitochondria, specifically in mitochondrial organization and cellular respiration. As a Crabtree-positive organism, S. cerevisiae predominantly metabolizes glucose by fermentation even under aerobic conditions [48], which causes the cells to be in a non-respiratory metabolic state during typical batch fermentation processes that employ glucose as a carbon source [49]. We observed that deletion mutants associated with the mitochondria are specifically selected against in fed-batch regimes and not in the batch bioreactors. Generally, we did not observe significant changes in the population in any tested environment until glucose depletion occurred, which constitutes the initiation of the feeding phase in fed-batch processes (Fig. 2, Additional file 1: Fig. S3). The apparent selection against mutants harboring deletions of genes associated with mitochondrial organization and cellular respiration in fed-batch processes could be an artifact of the extended growth phase caused by the addition of glucose in the fed-batch processes: While we did not detect an enrichment of genes associated in the first "wave" of genes selected again, there might be a "second round of losers" which only became apparent due to the additional generations. As mitochondria harbor important metabolic pathways for various native and non-native bioproducts [50][51][52], better understanding of the cause of this negative selection is required to enable better process and strain designs aimed at maintaining functional mitochondria during scale up.
An important metric to determine the suitability of Bar-seq to elucidate stresses involved in industrial scaleup was that loss of population diversity correlated to the onset of observable stress. Temporal analyses of mutant populations showed a greater than 10% loss of overall mutant diversity in 3 out of 4 fed-batch bioreactors from 48 to 72 h of fermentation runs (Fig. 2). In the constant rate feed bioreactors CF4 and 6, this observation correlated with the onset of ethanol production, which may have imposed a strong selective pressure against many mutants. Our results suggest that DO signal feeding may lead to a less selective environment than constant rate fed-batch fermentations. As DO signal based feeding leads to several occasions of complete glucose starvation, it is expected to impose a stronger selective pressure on the microbial culture [53]. Fluctuations in glucose and oxygen availability are known to result in global changes in gene expression patterns including initiation of cellular starvation responses as well as cell-cycle arrest amongst others [14,22,54], which is thought to result in an increased cellular stress. Studies investigating the effect of increased concentrations of ethanol on S. cerevisiae cells have shown that the presence of ethanol affects the intracellular pH as well as membrane fluidity, resulting in growth inhibition [55,56]. Our study suggests that yeast cells may be more resilient to fluctuating glucose and dissolved oxygen levels resulting from a DO signal based feeding strategy as compared to the presence of ethanol stress caused by overfeeding using a static feed [57].
We detected that deletion mutants of genes associated with autophagy and a general stress response were specifically selected against under constant rate feeding conditions (CF4, Fig. 6). Autophagy is the process whereby cytoplasmic components and excess organelles are degraded and is known to be initiated upon starvation for nutrients such as carbon, nitrogen, sulfur, and various amino acids, or upon endoplasmic reticulum stress [58,59]. Our findings are in agreement with Piggott et al., who found genes that function in autophagy to be required for optimal survival during fermentation when performing a genome-wide study of S. cerevisiae gene requirements during grape juice fermentation [60]. The authors concluded that the recycling of cellular components by autophagy enables yeast to survive the stressful conditions of fermentation and maximize fermentative output [60]. In S. cerevisiae, the general stress response is regulated by two homologous transcription factors, Msn2p and Msn4p, which bind to the stress response element (STRE) located in the promoters of ~ 200 genes in response to several stresses, including heat shock, osmotic shock, oxidative stress, glucose starvation, and high ethanol concentrations [61][62][63]. Specifically, expression of genes associated with metabolic homeostasis including proteolysis, protein repair including heat shock proteins, prevention of oxidative damage, and cellular reorganization is found to be upregulated during the general stress response [62,64]. Hirata et al. observed that the deletion of all 4 genes encoding GSK-3 homologues resulted in reduced cell viability in response to different types of stress, including heat, salt, and oxidative stresses, which has also been observed for a msn2 msn4 mutant and suggested that GSK-3 is necessary for the formation of a complex between Msn2p and STRE to initiate a proper stress response [42,61]. In the majority of the conditions tested in this study, individual deletion mutants of all four homologues of mammalian glycogen synthase kinase 3 (GSK-3), which interact with Msn2p, as well as the deletion mutant of MSN4p were retained. This finding indicates that a weakened general stress response caused by unstable binding of Msn2p and/or Msn4p to the STRE in stress regulated genes may be beneficial for cellular fitness under the conditions tested in this study. Weakening the initiation of the stress response may result in less significant changes in gene expression patterns associated with the general stress response, which in turn may decrease the metabolic burden that comes with transcription and translation machinery required to initiate the full change in gene expression patterns stress response. Further investigations are necessary to fully understand the effect of a potentially weakened general stress response for cultivations in a controlled environment and if this phenotype would translate between benchtop bioreactors and large scale production bioreactors, which typically exhibit drastic nutrient and oxygen gradients [65,66].
In addition to the Bar-Seq technology applied in this study, there are other genome-wide engineering techniques developed for strain engineering, functional genomics or fitness profiling studies that may be applicable to study the population dynamics of a microbial culture during cultivation in bioreactors. To date, MAGE (Multiplex Automated Genome Engineering) [67], TRACE (Tracking Combinatorial Engineered Libraries) [68] and CREATE (CRISPR EnAbled Trackable genome Engineering) [69] have been successfully tested and applied in bacterial systems (e.g. E. coli), while MAGIC (Multi-functional Genome-wide CRISPR) [70] is the only other technology, besides Bar-Seq, that has been successfully applied for engineering in yeast. Feasibility assessments to use CREATE in yeast have been performed [69]. While these high-throughput multiplexing technologies hold immense potential in the future they have some challenges that need addressing before they are successfully applied in yeast or other higher eukaryotic systems. As a result of these associated challenges and the proven track record of successful applications of the Bar-Seq technology in S. cerevisiae [26,31,71], we have employed the Bar-Seq method to de-risk our study of population dynamics in bioreactor settings. Challenges regarding use of any recombineering based engineering strategies, specifically for S. cerevisiae, include comparatively transformation efficiency in yeast, constraints in balancing library coverage with number of edits per cycle as well as trackability of the mutant population. For example, tracking genomic mutations in MAGE becomes increasingly challenging with an increasing number of edits as each targeted loci has to be individually amplified and sequenced for analysis. Further, the instability associated with plasmid-based barcodes might pose a tracking limitation in techniques such as CREATE and MAGIC. Molecular barcodes used in TRACE and Bar-Seq are commonly regarded as more stable and easier to track.
Our work was able to identify genetic loci that are selected for or against in various bioreactor schemes. However, the time and cost associated with each individual fermentation run precluded biological replicates and established stochastic variability occurring over prolonged bioreactor studies. While our study confirms earlier findings that genes involved in autophagy can be associated with increased fitness during fermentation, future work should focus on robust replication to increase statistical relevance and allow identification of deletion mutants with more subtle impact on fitness in an industrial setting.

Yeast strain collections
The prototrophic yeast deletion collection in the haploid S. cerevisiae BY4742 background was obtained as individual colonies from Prof. Amy Caudy (Univ of Toronto, Canada) plated on Omnitrays containing YPD agar with 200 mg/mL G418. Balanced pools of the deletion collection were obtained by harvesting all individual colonies into a total of 100 mL YPD containing 200 mg/mL G418 resulting in a cell suspension of OD 600 = 50. 1 mL aliquots of the yeast deletion pools with a final concentration of 25% glycerol were prepared and stored at − 80 °C until further use.

Seed cultivation
The yeast strain collection was cultured in a two-tiered seed train. The first seed culture was grown in 250 mL baffled shake flasks containing 50 mL YPD media (10 g/L yeast extract, 20 g/L peptone and 20 g/L glucose) inoculated with a 0.7% (v/v) inoculum directly from the glycerol stock. The second seed culture was grown in 500 mL baffled shake flasks containing 100 mL YPD media using a 10% (v/v) inoculum size. All seed cultures were incubated at 30 °C, 200 rpm (1″ throw) for 24 h.

Shake flask experiments
Shake flask experiments were carried out in 500 mL baffled Erlenmeyer flasks containing 100 mL (V f / V max = 20%) and 200 mL (V f /V max = 40%) of YPD media with an initial pH of 6.1. V f is defined as the volume of media in the flask and V max is the total volume of the flask. Flasks were inoculated from second seed cultures (4% (v/v) and incubated at 30 °C and 200 rpm (Orbit diameter 25 mm). Cultures for serial dilution experiments were grown for 24 h and back diluted into a flask with fresh YPD media. This was done two times until 72 h total fermentation time was reached. Samples were taken once a day and centrifuged at 15,000×g for 5 min. Supernatant was stored at 4 °C and the cell pellet was stored at -80 °C.

Bioreactor experiments
Batch and fed-batch bioreactor experiments were performed in 2 L bench top glass fermentors (Biostat B, Sartorius Stedim, Göttingen, Germany) equipped with two 6-blade Rushton impellers. All tanks were batched with 960 mL of YPD media and autoclaved at 121 °C for 30 min. The bioreactors were inoculated with second seed cultures with inoculum size of 4% (v/v) initial batch volume, i.e. 40 mL in 1 L. Temperature, agitation and air flow were maintained constant at 30 °C, 400 rpm and 0.5 vvm, respectively and pH was controlled to 4.0 and 6.0 using 2 N NaOH and 2 N HCl. The batch bioreactor experiment started at 20 g/L glucose, which was consumed within the first 20 h of the study. Ethanol and other carbon sources generated in the first few hours lasted for up to 72 h, with DO not reaching 100% until then. To mimic the shake flask studies, for BR, glucose was not replenished.
Glucose was replenished in fed-batch experiments: CF1, CF2, DF1, and DF2. Two different feeding strategies, i.e. a linear feed profile (y = 0.117 mL/h 2 x + 4 mL/h) and a dissolved oxygen (DO) signal-triggered pulse feeding loop (∆DO = 30%, Flow rate = 40 mL/h, Pump duration = 6 min). Linear feed was conducted at a constant rate that was chosen based on previous experience to avoid glucose starvation of the microbial culture. The DO signal-based pulse feeding can be explained as follows: Upon depletion of total available carbon (glucose, ethanol, organic acids, etc.), the metabolic activity stalls and consumption of oxygen is discontinued. This leads to a sharp increase in dissolved oxygen in the culture media. Once a certain threshold value in ΔDO is detected, a customized LabVIEW (National Instruments, Austin, TX, USA) feed control software is triggered to administer predetermined bolus of substrate. This type of DO signal based pulsed feeding continues until the fermentation process is ended. The initial DO spikes for CF2, DF1, and DF3 occurred within a 2-h time period. The DO spike for CF1, however, occurred 8 h later due to the presence of excess ethanol in the media. The feed media composition was as follows: 10 g/L yeast extract, 20 g/L peptone and 250 g/L glucose. 1 ml samples were taken in regular intervals (5 h, 24 h, 33 h, 48 h, 72 h and 119 h) and centrifuged at 15,000×g for 5 min. The supernatant was filtered (0.2 mm) and stored at 4 °C. The cell pellet was stored at − 80 °C.

Preparation of genomic DNA and Bar-seq analysis
Genomic DNA was extracted from dry, frozen cell pellets using the "Smash-and-Grab" method published by Hoffman and Winston [72] and used for barcode verification. Amplifications of the barcodes using previously published primers [31] and sequencing of the libraries using an Illumina MiSeq was performed at the Vincent J. Coates Genomics Sequencing Laboratory, California Institute for Quantitative Biosciences (QB3) (Berkeley, California, USA). Each barcode was reassigned to a gene using a standard binary search program as described by Payen et al. [31]. Only reads that matched perfectly to the reannotated yeast deletion collection were used [25]. Multiple genes with the same barcodes were discarded. Strains with less than 10 counts in the starting pool (t0) were discarded. The numbers of strains identified in the conditions are summarized in Additional file 1: Table S1. To avoid division by zero errors, each barcode count was increased by 10 before being normalized to the total number of reads for each sample. Distributions of barcode counts per mutants in each sample can be found in Additional file 1: Fig S6.

Statistical analysis
Pearson correlations were calculated with the SciPy Python library [73]. To identify the time points of "mass extinction" / biggest steps in loss of diversity (here, defined as gene richness), we looked at the number of genes either lost or retained during the time course for each of the bioreactors using different count thresholds for mutants present in the Seed train (> 0, 5, 10 or 20). Beta dispersion was calculated and PERMANOVAs were conducted using the Adonis function in the vegan R library [74].

Gene ontology enrichment
Gene ontology enrichment analysis was performed using the List analysis tool provided by YeastMine (https :// yeast mine.yeast genom e.org), populated by SGD and powered by InterMine [75] against the annotated S. cerevisiae S288C genome. The tests were corrected using the Benjamini-Hochberg procedure to decrease the false discovery rate of our analysis with a maximum acceptable p-value of 0.05 unless indicated otherwise.