Insights into metabolic osmoadaptation of the ectoines-producer bacterium Chromohalobacter salexigens through a high-quality genome scale metabolic model
© The Author(s) 2018
Received: 27 July 2017
Accepted: 20 December 2017
Published: 9 January 2018
The halophilic bacterium Chromohalobacter salexigens is a natural producer of ectoines, compatible solutes with current and potential biotechnological applications. As production of ectoines is an osmoregulated process that draws away TCA intermediates, bacterial metabolism needs to be adapted to cope with salinity changes. To explore and use C. salexigens as cell factory for ectoine(s) production, a comprehensive knowledge at the systems level of its metabolism is essential. For this purpose, the construction of a robust and high-quality genome-based metabolic model of C. salexigens was approached.
We generated and validated a high quality genome-based C. salexigens metabolic model (iFP764). This comprised an exhaustive reconstruction process based on experimental information, analysis of genome sequence, manual re-annotation of metabolic genes, and in-depth refinement. The model included three compartments (periplasmic, cytoplasmic and external medium), and two salinity-specific biomass compositions, partially based on experimental results from C. salexigens. Using previous metabolic data as constraints, the metabolic model allowed us to simulate and analyse the metabolic osmoadaptation of C. salexigens under conditions for low and high production of ectoines. The iFP764 model was able to reproduce the major metabolic features of C. salexigens. Flux Balance Analysis (FBA) and Monte Carlo Random sampling analysis showed salinity-specific essential metabolic genes and different distribution of fluxes and variation in the patterns of correlation of reaction sets belonging to central C and N metabolism, in response to salinity. Some of them were related to bioenergetics or production of reducing equivalents, and probably related to demand for ectoines. Ectoines metabolic reactions were distributed according to its correlation in four modules. Interestingly, the four modules were independent both at low and high salinity conditions, as they did not correlate to each other, and they were not correlated with other subsystems.
Our validated model is one of the most complete curated networks of halophilic bacteria. It is a powerful tool to simulate and explore C. salexigens metabolism at low and high salinity conditions, driving to low and high production of ectoines. In addition, it can be useful to optimize the metabolism of other halophilic bacteria for metabolite production.
Chromohalobacter salexigens is a moderately halophilic gamma-proteobacterium of the family Halomonadaceae, phylogenetically related to the genus Halomonas . It displays one of the broadest known growth salinity ranges , and is considered as a biological model to study prokaryotic osmoadaptation. C. salexigens has been proposed, among other natural producers, as a cell factory for the production of the biostabilizing compatible solutes ectoine and hydroxyectoine [3, 4]. The microorganism has several advantages as a cell factory: (i) it is easily cultured by using conventional media, (ii) it has broad metabolic versatility being capable of using different carbon sources , (iii) a high diversity of genetic tools are available for its study , and (iv) its genome has been sequenced .
To further explore and use C. salexigens as cell factory for ectoine(s), a comprehensive understanding of its metabolism is essential. In C. salexigens, production of ectoines is an osmoregulated process that burdens central metabolic routes by quantitatively drawing TCA cycle intermediaries. The ectoine biosynthetic pathway starts with the phosphorylation of L-aspartate and consumes oxaloacetate (OAA) and acetyl-CoA, the latter of which is produced by oxidative decarboxylation of pyruvate by pyruvate dehydrogenase. This use of intermediaries of the TCA cycle is of special relevance, since active anaplerotic pathways are needed in order to replenish the cycle . Consequently, central metabolism is adapted to the requirements of this biosynthetic route.
Genome-scale metabolic models (GEMs) are currently the only approach that enables the modeling and global analysis of an organism’s metabolic and transport network by a global analysis . A genome-wide constraint-based model consists of a stoichiometric reconstruction of all reactions known to act in the metabolism of the organism, along with an accompanying set of constraints on the fluxes of each reaction in the system . A major advantage of this approach is that the model does not require knowledge on the kinetics of the reactions. These models define the organism’s global metabolic space, network structural properties, potential flux distribution, and provide a framework with which to navigate through the metabolic wiring of the cell [10, 11]. Through various analysis techniques, constraint-based models can help to predict cellular phenotypes, given particular environmental conditions. Constraint-based analysis techniques (i.e. flux-balance analysis [FBA]), have been instrumental in elucidating metabolic features in a variety of organisms  and so far have been used for certain biotechnology endeavors [12, 13].
Here, we describe a highly detailed and curated metabolic reconstruction of Chromohalobacter salexigens DSM 3043, one of the mayor ectoine(s) natural producers. The reconstruction was built using the COBRA approach [10, 14] and validated using flux balance analysis . The in silico metabolic network was further evaluated by comparing predicted growth rate capacities in different carbon sources. The model was used to gain insight into the physiology of C. salexigens metabolic stress response at conditions of low and high ectoines production. In addition to optimize C. salexigens strains for ectoines production, iFP764 may become an important tool to understand the metabolism of halophilic microorganisms, and may serve as a platform to test metabolic capabilities of halophilic bacteria at low and high salinity.
Metabolic reconstruction and refinement
Database sources used for C. salexigens iFP764 metabolic reconstruction
BioMet ToolBox 2.0
Conserved Domain Database
Formulation of biomass reaction
Given the metabolic adaptation to the salinity of C. salexigens , the biomass composition defined in the model was adapted to simulate high and low salinity conditions. For that purpose, two different biomass reaction compositions (BIO) were estimated: “BIO_H” to simulate high salinity (2.5 M NaCl), and “BIO_L” to simulate low salinity (0.6 M NaCl). According to Thiele and Palsson , the detailed biomass composition needs to be experimentally determined. However, in some occasions, it may not be possible to obtain a detailed biomass composition of the organism of interest. In this case, the main alternative strategy is the utilization of the experimentally determined biomass composition of a non-related microorganism (i.e. Escherichia coli) . A second, more accurate, strategy is to partially reformulate an experimentally-determined biomass composition by estimating the relative fraction of the biomass metabolites of the organism of interest that can be experimentally calculated. In this work, both biomass compositions were obtained by partially reformulating Escherichia coli biomass composition [25–27] by using previously published and experimental data from C. salexigens [7, 24].
Appropriate amino acid molar ratio for both biomass reactions were calculated assuming that all proteins encoded by C. salexigens genome were expressed equally [26, 28]. The proportion of total proteins in response to salinity was calculated based on previous results by our group  and the molar ratio of membrane phospholipids in response to salinity were previously determined by Vargas and co-workers . The percentage of G+C in C. salexigens genome was experimentally estimated . This was utilized to calculate the genome nucleotide composition, which was used to establish the deoxyribonucleotide molar ratio. The ribonucleotide composition was deduced from the composition of C. salexigens rRNA and tRNA in the proportions defined previously . Generally, the relative composition of the nucleic acids does not change with osmolarity, as the overall DNA and RNA content related to dry weight does not seem to be affected by osmotic stress . This assertion was assumed when establishing the relative composition of DNA and RNA. The proportions of the other compounds were assumed not to vary between E. coli and C. salexigens, and these coefficients were derived from values reported in the literature . Ectoine and hydroxyectoine were included in the biomass composition, because they are growth-associated metabolites and are accumulated in the cells in response to salinity . The coefficients for these compounds were determined based on previous results . Units of each component of a biomass reaction were mmol gDW−1 (milimoles per gram cell dry weight). Flux through a biomass reaction has h−1 units and is equivalent to the exponential growth rate of the organism . Detailed information on the biomass composition calculation is included in Additional file 1: Tables S1 and S2.
Conversion of the reconstruction into a mathematical model
The reactions and their participating metabolites in the metabolic network were connected via the stoichiometric matrix (S), which contains the stoichiometric coefficients for each metabolite (row) in each of the reactions of the metabolic model (column). The values in this matrix correspond to the stoichiometry of the metabolite in the reaction. A negative number represents consumption and a positive number represents production of the metabolite. This matrix is the base of the metabolic model that considers metabolite mass conservation and that the accumulation of metabolites (b) in the system is equivalent to the product of S matrix and the vector of reaction fluxes (v): S·v = b. When using FBA to solve the differential equations system, one of the requirement is that the organism is in steady-state growth so, b = 0, resulting in a linear system of equations . Then the system is bound to a large solution space which needs to be constrained by applying upper and lower bounds to each individual reaction flux (vi): vi lower ≤ v i ≤ v i upper . Thus, in this process, system boundaries were defined, converting the general reconstruction of C. salexigens into a specific model. The initial model differed in scope and boundaries from the final model, which was accomplished after multiple iterations of validation and refinement, and which was used to simulate phenotypic behaviour in a prospective manner. The upper and lower bounds were defined for each reaction based on the flux they could carry. Reversible reactions have an upper bound value of 1000 mmol gDW−1 h−1 and a lower bound value of − 1000 mmol gDW−1 h−1, making them practically unconstrained, while irreversible reaction have a lower bound value of zero. By default, the biomass reaction was set as the objective to be maximized. All exchange reactions have zero lower bound value except for the carbon source (glucose, − 10 mmol gDW−1 h−1), and oxygen (− 20 mmol gDW−1 h−1), as simulations were run considering growth on a glucose mineral medium in aerobic conditions. Upper and lower bounds of exchange reactions for inorganic ions required by biomass reaction were settled as 1000 and − 1000 mmol gDW−1 h−1, respectively. The default value for lower bound of glucose uptake was based on the typical glucose uptake rates . Biomass reaction was set as objective function and was maximized to calculate the feasible flux distribution. The obtained value for this reaction was equal to maximum cell growth rate .
All model simulations were performed using COBRA Toolbox 1.8  in MATLAB (R2014b) (The MathWorks Inc., Natick, MA), linked to the GLPK or Gurobi linear program solver.
Computational network analysis
Minimal medium simulation
The minimal medium M63  was simulated in silico by allowing entrance of some simple salts and ions, water, and O2 (aerobic simulations) in the system. For that purpose, upper and lower bounds of exchange reactions for those metabolites exchange reactions were settled as 1000 and − 1000 mmol gDW−1 h−1, respectively. The composition of the M63 medium is described in Additional file 1: Table S3. The carbon and nitrogen source was changed according to the objective of simulation as well as the secretion rate of pyruvate and acetate (Additional file 1: Table S4).
All blocked metabolites in the iPF764 model were identified by using the GapFind MILP algorithm  from the COBRA Toolbox. GapFind was run twice, once with each mass balance constraint option. Root no-production and no-consumption metabolites were identified from the model stoichiometric matrix (S) by searching for rows containing only negative or positive coefficients, respectively. Downstream no-production and upstream no-consumption gaps were identified by removing the root gaps from the GapFind outputs .
In silico validation of GEM model: prediction of C. salexigens growth phenotype
To validate the model for the prediction of growth phenotypes of C. salexigens, the ability to utilize different compounds as the sole carbon sources for growth was determined. For this purpose, the lower bound of glucose exchange reaction was constrained to zero, the lower bound of each different carbon exchange reaction, one at a time, was set to − 10 mmol gDW−1 h−1 (a typical uptake rate for growth-supporting substrates), and growth was maximized by FBA. The target substrate was considered growth supporting if the predicted growth rate was above zero. The simulations were carried out in minimal medium using BIO_L as biomass equation (see Additional file 1: Tables S2, S4). The results arising from the simulations were compared with experimental data  and used to validate the model.
To assess the robustness of the metabolic network to genetic perturbations (e.g., knock-out mutations), we carried out an in silico analysis of the essentiality of single genes and reactions. This enabled us to identify the most fragile nodes of the iFP764 network. Gene essentiality simulations were performed by systematically removing each gene from the network and in silico assessing (via FBA) the ability of the model to produce biomass in minimal medium with a sole carbon source (at low and high salinity). In terms of the metabolic simulations implemented in this study, no-growth following perturbation (deletion of gene-associated reaction) indicates the complete disruption of robustness in C. salexigens and indicates that this gene/reaction is essential for growth.
Simulation at low and high salinity conditions
The effect of salinity over C. salexigens metabolism at low and high salinity (conditions for low and high ectoines production, respectively) was examined by flux balance analysis (FBA). Rates for uptake and secretion of metabolites in minimal medium M63 with 0.6 M or 2.5 M NaCl previously described , were used as constraints to simulate aerobic growth at both conditions. For this purpose, metabolites present in the medium were first allowed to enter and leave the network by setting the flux limits on the corresponding exchange reactions vi, min > − 1000 mmol gDW−1 h−1 and vi, max < 1000 mmol gDW−1 h−1. Second, the glucose and ammonium uptake rates, and the acetate and pyruvate secretion rates, were changed according to experimentally measured fluxes at those salinities, and a biomass reaction (“BIO_L” or “BIO_H”) was selected depending on the salinity simulation conditions.
Calculation of fluxes distribution at high and low salinity by Monte Carlo Random sampling
The steady-state flux space is a finite polytope containing all possible steady-state flux distributions . Thus, computing the exact volume for large, n-dimensional polytopes is a hard problem . However, it is possible to solve systems of low dimension to examine the accuracy by Monte Carlo approximations. The exact polytope volume calculation algorithm  was implemented in Matlab (Mathworks Inc, Massachusetts).
The first step in the Monte Carlo calculation was to determine all the independent variables in the system. This essentially ‘projects’ the flux polytope into a full-dimensional object. The full-dimensional polytope was sampled uniformly over the independent variables, and the dependent variables were back-calculated based on the dependencies in the stoichiometric matrix, S. The result was a random flux distribution, V. The next step was to determine if the flux distribution was valid. Back-calculating the dependent flux values often results in invalid flux distributions (i.e. negative flux values or flux values that violate the Vmax constraints). A valid flux distribution was included in the sampled set. Once a set of approximately 20,000 valid points were collected, the process of determining the fractional volumes could start. These valid points were all part of the ‘base’ flux polytope. In this work, Monte Carlo Random Sampling was carried out using the BIO_L and BIO_H to verify the influence of the composition of the formula of the biomass on the internal flows. In addition, specific- salinity constraints based on previous experimental data  (Additional file 1: Table S4) were imposed, and the fraction of points meeting the new criteria determined the reduction in volume. The set of all possible flux values for each reaction were represented as a distribution histogram. Each histogram showed one-dimensional information on its x axis, in terms of the extent of possible values for that particular flux. The y axis represents the “size” of space in the other r–1 dimensions resulting from slicing the metabolic solution space along a specific value of the flux through the indicated reaction . The median of the values obtained was normalized by the salinity-specific glucose uptake rate, and was used to analyze the internal flows in all scenarios calculated.
The variation in the flux distribution between both conditions (low and high salinity) was evaluated. For that purpose, flux vector values obtained in the sampling for each reaction in both conditions were compared and evaluated by a Wilcoxon rank sum test. Those reactions which showed significant different distribution (Z statistic over 50 and p value < 0.01) were selected. Additionally, to see the extent of those differences, 2000 points from each reaction vector were randomly selected from the distribution values obtained after sampling in both conditions. Those random points were subtracted to each other between conditions. Then, the average from all computed differences for each reaction was considered as an estimate of flux distribution variation between both conditions.
Correlation analysis of metabolic fluxes at high and low salinity
To detect correlated reaction sets in the metabolic network within each condition, a Spearman Rank correlation matrix was performed taking as an input the 20,000 values of each reaction flux vector obtained during the Monte Carlo sampling. A Spearman Rank correlation value over 0.7 or below − 0.7 was considered as the minimal threshold to select reactions that were correlated to each other, respectively. The resulting filtered adjacency matrix was visualized and analyzed in CYTOSCAPE 3.2. as a correlation network, were each node represented a metabolic reaction and each edge represented an observed correlation. Nodes were also grouped according to their membership to different pathways subsystems, which were defined in the metabolic reconstruction iFP764.
Results and discussion
Construction of the high-quality metabolic model iFP764
In C. salexigens, the differential use of the periplasmic or cytoplasmic variants of the Entner–Doudoroff pathway contributes to modulating the production of reducing equivalents (NADPH), adapting it to the requirements for ectoine synthesis . For this reason, the C. salexigens metabolic model iFP764 generated in this work included three distinct cellular compartments: the cytoplasm, the periplasm and the extracellular space.
Properties of metabolic reconstructions of C. salexigens
In silico metabolic model
No. of total reactions
No. of total ORFs associated
No. ORFs no associated
No. of total metabolic reactions
No. of total transport reactions
Extracellular to periplasmic
Periplasmic to cytoplasmic
Periplasmic to extracellular
Cytoplasmic to periplasmic
Extracellular to cytoplasmic
No. of total metabolites
Unique functional proteins
Genes involved in complexes
Instances of isozymes
Formulation of two biomass reactions to simulate low salinity and high salinity conditions
Two specific biomass reactions, “BIO_H” and “BIO_L”, were formulated to simulate high and low salinity conditions, respectively, and incorporated into the reconstruction. The formulation of both reactions was partially based on previous C. salexigens experimental data, and on the formulation defined for Escherichia coli . Ectoine and hydroxyectoine were included in both biomass reactions because they are growth-associated metabolites , and their specific contents are affected by the salinity of growth medium . This provided two condition-specific models that will serve as platforms to test metabolic capabilities at low and high salinity. Detailed information on the biomass composition formulation can be found in “Methods” and Additional file 1: Tables S1 and S2.
Properties of the iFP764 reconstruction
Comparison of iFP764 with the previous C. salexigens metabolic reconstruction iOA584
After completion of the iFP764 metabolic network reconstruction, it was compared with the previous semi-automatically-generated, reconstruction of C. salexigens (iOA584) . Table 2 details the number of reactions of iFP764 in each subsystem and the comparison with the previous reconstruction iOA584. Up to 169 genes that were found in the iOA584 model were not include in the iFP764 model, since these routes have not been described in C. salexigens, and the reactions catalyzed by these enzymes were not connected to any pathway.
Whereas periplasmic and extracellular metabolic reactions were considered in iFP764, iOA584 did not include any transport reaction other than exchange reactions. In addition, the semi-automated reconstruction contains just single genes associated with reactions, but genes associated to enzymatic complexes or isoenzymes were not annotated, as in iFP764. This improvement is very relevant to implement strategies for in silico knockout analysis or strain design for metabolic engineering.
The final reconstruction iFP764 included some incomplete metabolic pathways that contained gaps or blocked reactions, which generated 98 dead-end metabolites: 27 root no-production metabolites (metabolites that participates in consuming reactions but not in producing reactions) and 71 no-consumption metabolites (metabolites that are present in producing reactions but not in consuming reactions) (Additional file 1: Table S5). In total, 11.17% of the metabolites in iFP764 were blocked under all conditions due to gaps. The major root no-production metabolites were those implicated in the production of those, like tRNAs, serve for non-metabolic functions once they are produced. This problem is also present in the E. coli iJO1366 metabolic reconstruction . On the other hand, out of the 71 no-consumption metabolites detected mainly belonged to pathways related to tRNA (9.2%), cell envelopes, including cell wall and cell membranes (15.6%), cofactors and prosthetic groups synthesis (8.6%) or carbon metabolism (7.1%). With the exception of the metabolites found in the routes of carbon metabolism, the rest of dead ends metabolites were found in pathways insufficiently described in C. salexigens. An important flaw in the automatically generated iOA584 was the high number of gaps and dead ends metabolites, as we found 272 no-production metabolites and 645 no-consumption metabolites.
The effort in the exhaustive refinement process leading to the iFP764 metabolic network resulted in an improvement of connectivity compared with iOA584. A graphic representation of both networks, performed using CYTOSCAPE, evidenced a connectivity failure in the iOA584, as several disconnected reactions and dead-end metabolites from the central network were found. This was highly solved up in iFP764 (Fig. 4b).
Even the most complete models are not definitive, if they contain gaps or missing information . However, filling in the gaps as much as possible is important to make the model more realistic, improving the predictive capability, and this degree of completion is especially needed when the model does not predict growth under some specific condition (i.e., given sets of nutrients and secretions) when submitted to FBA simulation . This occurs due to the fact that the reactions producing or consuming dead-end metabolites can never carry flux under any condition, so one fundamental requirement for a reconstructed metabolic network is its ability to produce flux through the biomass reaction (i.e., to predict growth) . When the iOA584 model was analyzed by FBA with the COBRA Tool box, it could not produce biomass, and consequently it could not simulate cell growth, in contrast to iFP764. This is likely due to the numerous flaws present in the automated network that are absent (or, at least, present in a much lower number) in the curated iFP764 model. Among these, incorrect reaction stoichiometry, repetitions, dead-end metabolites, problems with proton balances, and severe troubles in connectivity, were found in iOA584.
In summary, in addition to the enlargement, this comparison revealed great differences between the two models, and a large number of drawbacks associated to semi-automatically-generated networks.
Flux balance analysis (FBA) can be used with a constraint-based model to predict metabolic flux distributions, growth rates, and substrate uptake and production secretion rates. Thus, FBA analyses of the model with a set objective function (i.e. biomass reaction that is equal to growth) to generate maximal amounts of biomass precursors (i.e., simulated optimal growth) from substrates available in growth media can generate results that are quantitatively consistent with experimental data [25, 33]. In this context, the utilization of different carbon sources by the iFP764 network was simulated and compared to previous reports [1, 7].
The growth in minimal medium M63 was simulated by fixing the set values for the exchange reactions for the import of metabolites present in the simulated growth medium. Each substrate was taken as the sole carbon and energy source in the simulation and the uptake rates were all set to the same value (10 mmol gDW−1 h−1) in aerobic conditions. The biomass reaction utilized for the validation was BIO_L.
In silico prediction of utilization of various metabolites as carbon sources and comparison with experimental data
In silico simulation
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Vargas et al. 
Arahal et al. 
Data not shown
Arahal et al. 
Arahal et al. 
Arahal et al. 
Pastor et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Vargas et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Arahal et al. 
Disagreement between in silico simulations and literature reports was found for five compounds. On the one hand, choline, lysine and glycine betaine support C. salexigens growth, according to experimental data, whereas the simulation did not predict growth with these compounds. Conversely, the iFP764 model predicted growth with L-lactate and D-lactate, which is in disagreement with experimental data , as C. salexigens is not able to grow with any lactate isomer. Disagreements between the computational and experimental data are acceptable and can be divided into two main categories. Instances where experimental growth is observed and no growth is predicted computationally (i.e. with choline, lysine, or glycine betaine) points to areas where further biochemical characterization is needed and define targeted areas for biological discovery . In contrast, cases in which computational growth is predicted and not observed experimentally (i.e., with lactate) indicate possible areas where there are either errors in the reconstruction or, alternatively, where regulation limits the utilization of pathways needed for growth.
Theoretical analysis of C. salexigens physiological capabilities
Growth in different carbon sources
The interest of the in silico prediction is to extend the knowledge about which pathways are active, and for which processes and to what extent different substrates are utilized by the organism. This information can be used to guide labeling studies as well as to interpret results from experiments using similar conditions such as gene and protein expression data. The iFP764 computational model contains exchange reactions for 90 different compounds. It is therefore possible to use iFP764 to predict the growth capabilities of C. salexigens on a very wide range of media conditions. As a demonstration, FBA was used to forecast growth on the 90 possible carbon sources under aerobic conditions using iFP764. As shown in Additional file 1: Table S6, the iFP764 model predicted that 51 out of these 90 carbon sources are growth supporting.
Essentiality of genes and reactions depending on salinity
One of the major applications of GEMs is improving the performance of microbial cell factories by a system wide analysis of the metabolic network. Robustness is generally defined as the cellular ability to maintain biological processes against internal and external perturbations, including disruption of genes and inactivation of enzymes (e.g., knock-out mutations) .
In this regard, iFP764 robustness is crucial for predicting ectoine production under certain conditions. Thus, reaction essentiality simulations were performed by systematically removing each gene from the network and assessing the ability of the model to predict growth in minimal medium with glucose as carbon source. These FBA simulations were conducted at low and high salinity conditions (See "Methods").
Analysis of metabolic osmoadaptation by FBA
In the previous section, we capitalized on the current knowledge of C. salexigens metabolism to compare the in silico prediction of the model with physiological data. Here, we used FBA to investigate the metabolic adaptation of C. salexigens to high and low salinity by using our metabolic network (iFP764).
The in silico growth rate is influenced not only by the structure of the metabolic network, but also by other factors including biomass composition and the use of constraints. Therefore, since both the biomass composition and constraints values vary with salinity, we evaluated the influence of these factors on the predicted growth yield. First, we analyzed the effects of changes in the ratios of biomass components on the iFP764 growth yield. This was done by varying only BIO_L or BIO_H when simulating at low and high salinity with no other constraints, and performing FBA with the biomass reaction set as the objective function. These analyses indicated that maintaining the glucose entry at 10 mmol gDW−1 h−1, without using any other constraints, did not affect growth rate (1.83 and 1.84 h−1 at low and high salinity, respectively). Similar results were obtained when simulating with a Pseudomonas putida model, a 20% increase of any single biomass constituent had an effect of less than 1% on the growth yield . However, although the growth rate was not affected, the internal fluxes distribution could vary according to the formula of the biomass utilized in each condition. This finding is discussed below.
The second factor influencing the efficiency of the in silico optimal growth is the use of constraints. Metris and co-workers simulated osmotic stress using FBA and different biomass reactions that included compatible solutes as metabolites and their respective salinity-specific proportion . They suggested that this was not sufficient, and that additional constraints were needed to explain the chemical-physical limitations of the growth rate simulation. On the other hand, Covert and Palsson introduced additional constraints using Boolean operators, and this approach was shown to eliminate a large number of extreme pathways .
In this work, the simulations performed corroborated previously obtained growth rates in vivo where a higher growth rate was observed at low salinity. This was mainly due to the contribution of the salinity dependent-experimental-based constraints. However, additional experimental constrains could be added in a future to accurate growth rate or fluxes prediction.
In silico prediction of metabolic flux distributions shifts in response to salinity
The above data show that the iFP764 model is coherent and reproduces the major metabolic features of C. salexigens. However, FBA can only generate a single flux distribution corresponding to maximal growth under any given environmental condition. An alternative approach is to characterize all flux distributions that are allowed by the mass balance (stoichiometric) and flux capacity constraints. There are a number of methods that allow characterizing entire flux solution spaces, including various forms of pathway analysis . Among them, Monte Carlo sampling method is a rapid and scalable way to characterize the structure of the allowed space of metabolic fluxes. These analyses can provide all possible steady-state distribution for unknown metabolic fluxes and guide making informative experimental measurements .
In this work, Monte Carlo sampling was used to calculate the probability of flux distributions for all the reactions of iFP764 model in two scenarios. In the first scenario, specific constraints (for glucose and ammonium consumption or pyruvate an overflow metabolites secretion) and biomass composition for high and low salinity, were used to simulate low and high salinity conditions. In the second scenario only specific biomass reactions (but no salinity-specific constraints), were used to simulate both salinities. The set of all possible flux values for each reaction was represented as a flux distribution histogram.
The second scenario was used to evaluate only the effect of each biomass reactions (BIO_L and BIO_H) on flux distribution. After random sampling, modifications of the flux distribution were also observed, but differences were not as evident as in the first scenario (data not shown). This finding suggests that biomass composition formulated at low and high salinity does not significantly influence the flux distribution under these conditions.
Overall, the above results indicate that our model serves as an excellent platform to analyse metabolic adaptation to salinity. It also indicates that although specific-biomass composition is needed to be accurate when performing simulations, specific-experimental constraints are more relevant and necessary to guarantee that in silico fluxes are close to those experimentally observed.
Correlation between metabolic fluxes at low and high salinity
Modularization may be a useful concept to understand large metabolic networks by breaking it apart into manageable pieces . The highly correlated clusters may be considered as functional modules, which mainly include groups of enzymes that are jointly needed to obtain a specific product. We analysed the correlation between fluxes of sets of reactions at high and low salinity obtained by Monte Carlo sampling. This analysis provided us significant clues on the correlations between individual reactions and between sets of reactions, probably involved in metabolic adaptation to salinity and/or ectoines demand.
Regarding nitrogen metabolism, the valine/leucine/isoleucine sub-system was found highly inter-correlated at high and low salinity, both between their internal reactions and also with other subsystems, like the alanine/aspartate, cysteine, TCA cycle, and folate metabolism subsystems. On the contrary, some amino acids subsystems showed to be independent, as they presented a high degree of internal correlation but their reactions did not correlate with other subsystems. This was observed for the histidine and tyrosine/phenylalanine/tryptophan subsystems. This fact occurred in both conditions and no significant variation between the correlation patterns was observed (Fig. 8).
The TCA cycle cluster was more correlated with other clusters at high salinity than at low salinity (Fig. 8a, b). At low salinity, this subsystem showed correlated reactions with the valine/leucine/isoleucine, cysteine, alternate Carbon, and oxidative phosphorylation subsystems. However, at high salinity an important number of additional correlations appeared with other subsystems, like the pentose phosphate (this subsystem included also reactions from ED pathway) and glucolysis/gluconeogenesis clusters. All these clusters were clearly intercorrelated at high salinity (Fig. 8b). Interestingly, the recently described ED-EMP cycle, which recruits activities from the ED, EMP and PP pathways to ensure an appropriate supply of NADPH reducing power for coping with environmental stress , is present in C. salexigens. Thus, the PPi dependent 6-phosphofructokinase (Csal_1534), acting in the reverse reaction as a bisphosphatase, would supply the lack of Pfk, catalyzing the conversion of fructose 1,6- bisphosphate into fructose-6-phosphate , and completing the cycle. However, correlation analysis grouped the ED-EMP cycle reactions in three modules in both salinities. The first cluster was formed by Pgi (GLUC6), Zwf, and Pgl (PGL6), the second cluster was formed by Fda (FRUB) and Fbp, and the third cluster was constituted by the Edd and Eda reactions. Interestingly, the first cluster of reactions [two of them from pentose phosphate (PP) pathway and one of the from EMP pathway], were correlated with some TCA cycle reactions at high salinity [Aconitase (ACOHY2), aconitate hydratase (ACOHY1), Fumarate hydratase (FUM1), and Cs (CITSY)], and with additional reactions of the pentose phosphate pathway. The third cluster of reactions (both of them belonging to the ED pathway, but included in the pentose phosphate subsystem) showed correlation at high salinity with Pdh (Glycolysis/Gluconeogenesis subsystem). The ED-EMP cycle has been suggested to be an essential distributor of carbon through the central metabolic metabolism and an enhanced supply of reducing equivalents . Therefore, the above results could pave the way for future experimental design, in order to figure out how this cycle is functioning in C. salexigens, and if it is partially coupled with the TCA cycle at high salinity, where high ectoines demand requires high levels of reducing power and carbon.
Regarding ectoines metabolism, the correlation analysis distributed this subgroup into four cyclic modules. The first module included reactions for the synthesis and degradation of hydroxyectoine (EutC, EutB and EctDE), the second module corresponded to the futile cycle for ectoine synthesis and degradation (EctC, EctA, DoeA, DoeB and EctA) , the third module was responsible for the l-aspartate semialdehyde conversion into diamino butyric acid, or viceversa, with an interconversion coupled reaction of glutamate to 2-oxoglutarate (EctB and DoeD). The last module included reactions responsible of the synthesis of L-aspartate to L-aspartate semialdehyde, or viceversa, [DoeC, AsD, aspartate kinase (ASPK)]. Interestingly, all the four modules were independent both at low and high salinity conditions, as they did not correlate to each other, and they were not correlated with other subsystems (Fig. 8).
Regarding other compatible solutes, trehalose synthesis and degradation reactions (OtsA, OtsB and TreF) formed a separate module including internal reactions of central C subsystem, particularly those involved in the conversion of glucose-1-P into UDP-glucose (UTPGLPHO, UTP-glucose 1-P-uridylyl transferase) and glucose-6P (PGM, phosphoglucomutase). Both products are precursors of trehalose-6-phosphate. As observed for ectoines metabolism, this correlation pattern of the trehalose metabolism module did not vary with salinity (Fig. 8).
The predicted modular distribution of correlated compatible solutes metabolism sets of reactions is very intriguing, and deserves experimental work to better understand C. salexigens osmoadaptive mechanisms, and especially ectoines production. Most of them were cyclic (not only the ectoine futile cycle), independent from other subsystems, and with no interdependence pattern variation in response to salinity. It is plausible that reactions included in each module respond to a similar regulation, and/or that a coordinated functioning of cyclic modules is necessary to adapt solute compatible metabolism, and therefore metabolic osmoadaptation, to external conditions. Our results are only an example (“the tip of the iceberg”) of the potential of the joint rational use of computational tools, together with experimental data, in order to explore metabolism in halophilic microorganisms.
The information resulting from this work showed that the iFP764 model is a powerful tool to simulate and explore C. salexigens metabolism at different salinity conditions. This halophilic bacterium has one of the broadest salinity range of growth, and is considered as a model to study osmoadaptation. Notwithstanding the existing limitations in reconstructing and validating a C. salexigens specific network, the iFP764 model is one of the most complete and curated networks of halophilic bacteria, and accounts for all existing relevant experimental information. Thus, it may become an important tool to understand the metabolism of halophilic microorganisms, helping to drive the generation of new experimental hypotheses. In addition, C. salexigens is considered as a good cell factory for ectoines, whose production is salinity-dependent. Therefore, this model could be a robust and reliable tool to be used for the design of new C. salexigens strains optimized for the production of ectoines and other biotechnologically interesting compounds, avoiding the expenditure of resources.
MA and CV conceived and supervised the study. FP, MS, and VB performed the model construction and simulations. FP, MS, MA, VB, JMP, JJN, MC and CV were involved in the analysis and discussion of results. FP, MA and CV drafted the manuscript. All authors read and approved the final manuscript.
We thank Dr. L.N. Csonka for language editing of the manuscript.
The authors declare that they have no competing interests.
Availability of data and materials
All data generated or analyzed during this study are included in this published article and its additional files.
Consent for publication
Ethics approval and consent to participate
This work has been partly funded by Projects BIO2014-54411-C2-1-R and BIO2015-63949-R (MINECO/FEDER, UE), Junta de Andalucía (Spain; Project P08-CVI-03724), and Spanish National Network on Extremophilic Microorganisms (BIO2015-71815-REDT).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Arahal DR, García MT, Vargas C, Cánovas D, Nieto JJ, Ventosa A. Chromohalobacter salexigens sp. nov., a moderately halophilic species that includes Halomonas elongata DSM 3043 and ATCC 33174. Int J Syst Evol Microbiol. 2001;51:1457–62.View ArticleGoogle Scholar
- Vargas C, Argandoña M, Reina-Bueno M, Rodríguez-Moya J, Fernández-Aunión C, Nieto JJ. Unravelling the adaptation responses to osmotic and temperature stress in Chromohalobacter salexigens, a bacterium with broad salinity tolerance. Saline Syst. 2008;4:14.View ArticleGoogle Scholar
- Fallet C, Rohe P, Franco-Lara E. Process optimization of the integrated synthesis and secretion of ectoine and hydroxyectoine under hyper/hypo-osmotic stress. Biotechnol Bioeng. 2010;107(1):124–33.View ArticleGoogle Scholar
- Rodríguez-Moya J, Argandoña M, Iglesias-Guerra F, Nieto JJ, Vargas C. Temperature- and salinity-decoupled overproduction of hydroxyectoine by Chromohalobacter salexigens. Appl Environ Microbiol. 2013;79(3):1018–23.View ArticleGoogle Scholar
- Argandoña M, Vargas C, Reina-Bueno M, Rodríguez-Moya J, Salvador M, Nieto JJ. An extended suite of genetic tools for use in bacteria of the Halomonadaceae: an overview. In: Lorence A, editor. Recombinant gene expression: reviews and protocols. Methods molecular biology, vol. 824. Ciudad: SpringerLink; 2012. p. 167–201.View ArticleGoogle Scholar
- Copeland A, O’Connor K, Lucas S, Lapidus A, Berry KW, Detter JC, Del Rio TG, Hammon N, Dalin E, Tice H, Pitluck S, Bruce D, Goodwin L, Han C, Tapia R, Saunders E, Schmutz J, Brettin T, Larimer F, Land M, Hauser L. Complete genome sequence of the halophilic and highly halotolerant Chromohalobacter salexigens type strain (1H11T). Stand Genomic Sci. 2011;5(3):379–88.View ArticleGoogle Scholar
- Pastor JM, Bernal V, Salvador M, Argandoña M, Vargas C, Csonka L, Sevilla A, Iborra JL, Nieto JJ, Cánovas M. Role of central metabolism in the osmoadaptation of the halophilic bacterium Chromohalobacter salexigens. J Biol Chem. 2013;288(24):17769–81.View ArticleGoogle Scholar
- O’Brien EJ, Monk JM, Palsson BO. Using genome-scale models to predict biological capabilities. Cell. 2015;161(5):971–87.View ArticleGoogle Scholar
- Lewis NE, Nagarajan H, Palsson BO. Constraining the metabolic genotype-phenotype relationship using a phylogeny of in silico methods. Nat Rev Microbiol. 2012;10(4):291.View ArticleGoogle Scholar
- Price ND, Reed JL, Palsson BO. Genome scale models of microbian cells: evaluating the consequences of constraints. Nat Rev Microbiol. 2004;2:886–97.View ArticleGoogle Scholar
- Papin JA, Price ND, Wiback SJ, Fell DA, Palsson BO. Metabolic pathways in the post-genome era. Trends Biochem Sci. 2003;28:250–8.View ArticleGoogle Scholar
- Lee KH, Park JH, Kim TY, Kim HU, Lee SY. Systems metabolic engineering of Escherichia coli for l-threonine production. Mol Syst Biol. 2007;3:149.View ArticleGoogle Scholar
- Pharkya P, Burgard AP, Maranas CD. Exploring the overproduction of amino acids using the bilevel optimization framework OptKnock. Biotechnol Bioeng. 2003;84:887–99.View ArticleGoogle Scholar
- Reed JL, Famili I, Thiele I, Palsson BO. Towards multidimensional genome annotation. Nat Rev Genet. 2006;7(2):130–41.View ArticleGoogle Scholar
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.View ArticleGoogle Scholar
- Marchler-Bauer A, Panchenko AR, Shoemaker BA, Thiessen PA, Geer LY, Bryant SH. CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res. 2002;30(1):281–3.View ArticleGoogle Scholar
- Schultz J, Milpetz F, Bork P, Ponting CP. SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci. 1998;95:5857–64.View ArticleGoogle Scholar
- Snel B, Lehmann G, Bork P, Huynen MA. STRING: a web-server to retrieve and display the repeatedly occurring neighborhood of a gene. Nucleic Acids Res. 2000;28(18):3442–4.View ArticleGoogle Scholar
- Despalins A, Marsit S, Oberto J. Absynte: a web tool to analyze the evolution of orthologous archaeal and bacterial gene clusters. Bioinformatics. 2011;27(20):2905–6.View ArticleGoogle Scholar
- Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1990;22(22):4673–80.View ArticleGoogle Scholar
- Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.View ArticleGoogle Scholar
- Schomburg I, Chang A, Schomburg D. BRENDA, enzyme data and metabolic information. Nucleic Acids Res. 2002;30:47–9.View ArticleGoogle Scholar
- Thiele I, Palsson BO. A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc. 2010;5(1):93–121.View ArticleGoogle Scholar
- Vargas C, Kallimanis A, Koukkou AI, Calderon MI, Canovas D, Iglesias-Guerra F, Drainas C, Ventosa A, Nieto JJ. Contribution of chemical changes in membrane lipids to the osmoadaptation of the halophilic bacterium Chromohalobacter salexigens. Syst Appl Microbiol. 2005;28(7):571–81.View ArticleGoogle Scholar
- Orth JD, Conrad TM, Na J, Lerman JA, Nam H, Feist AM, Palsson BO. A comprehensive genome-scale reconstruction of Escherichia coli metabolism. Mol Syst Biol. 2011;7:535.View ArticleGoogle Scholar
- Neidhardt FC, Van Bogelen RA. Heat shock response. Escherichia coli and Salmonella typhimurium; cellular and molecular biology. In: Neidhardt FC, Ingraham JL, Low KB, Magasanik B, Schaechter M, Umbarger HE, editors. American society for microbiology. Washington: American Society for Microbiology; 1987.Google Scholar
- Cayley S, Record MT. Roles of cytoplasmic osmolytes, water, and crowding in the response of Escherichia coli to osmotic stress: biophysical basis of osmoprotection by glycine betaine. Biochemistry. 2003;42(43):12596–609.View ArticleGoogle Scholar
- Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007;3:121.View ArticleGoogle Scholar
- Feist AM, Palsson BO. The biomass objective function. Curr Opin Microbiol. 2010;3:344–9.View ArticleGoogle Scholar
- Reed JL, Patel TR, Chen KH, Joyce AR, Applebee MK, Herring CD, Bui OT, Knight EM, Fong SS, Palsson BO. Systems approach to refining genome annotation. Proc Natl Acad Sci USA. 2006;103:17480–4.View ArticleGoogle Scholar
- Becker SA, Feist AM, Mo ML, Hannum G, Palsson BO, Herrgard MJ. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nat Protoc. 2007;2(3):727–38.View ArticleGoogle Scholar
- Csonka LN. A third l-proline permease in Salmonella typhymurium which functions in media of elevated osmotic strength. J Bacteriol. 1982;151:1433–43.Google Scholar
- Kumar SV, Dasika MS, Maranas CD. Optimization based automated curation of metabolic reconstructions. BMC Bioinform. 2007;8:212.View ArticleGoogle Scholar
- Orth JD, Palsson BO. Systematizing the generation of missing metabolic knowledge. Biotechnol Bioeng. 2010;107(3):403–12.View ArticleGoogle Scholar
- Wiback SJ, Famili I, Greengerg HJ, Palsson BO. Monte Carlo sampling can be used to determine the size and shape of the steady-state flux space. J Theor Biol. 2004;228:437–47.View ArticleGoogle Scholar
- Lawrence J. Polytope volume computation. Math Comput. 1991;195:259–71.View ArticleGoogle Scholar
- Bartell JA, Blazier AS, Yen P, Thøgersen JC, Jelsbak L, Goldberg JB, Papin JA. Reconstruction of the metabolic network of Pseudomonas aeruginosa to interrogate virulence factor synthesis. Nat Commun. 2017;8:14631.View ArticleGoogle Scholar
- Oh Y-K, Palsson BØ, Park SM, Schilling CH, Mahadevan R. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J Biol Chem. 2007;282:28791–9.View ArticleGoogle Scholar
- McAnulty MJ, Yen JY, Freedman BG, Senger RS. Genome-scale modeling using flux ratio constraints to enable metabolic engineering of clostridial metabolism in silico. BMC Syst Biol. 2012;6:42.View ArticleGoogle Scholar
- Diken E, Ozer T, Arikan M, Emrence Z, Oner ET, Ustek D, Arga KY. Genomic analysis reveals the biotechnological and industrial potential of levan producing halophilic extremophile, Halomonas smyrnensis AAD6T. SpringerPlus. 2015;4:393.View ArticleGoogle Scholar
- Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997;278(5338):631–7.View ArticleGoogle Scholar
- Ates O, Oner ET, Arga KY. Genome-scale reconstruction of metabolic network for a halophilic extremophile, Chromohalobacter salexigens DSM3043. BMC Syst Biol. 2011;5:12.View ArticleGoogle Scholar
- Ponce-de-León M, Montero F, Peretó J. Solving gap metabolites and blocked reactions in genome-scale models: application to the metabolic network of Blattabacterium cuenoti. BMC Syst Biol. 2013;7:114.View ArticleGoogle Scholar
- Larocque M, Chénard T, Najmanovich R. A curated C. difficile strain 630 metabolic network: prediction of essential targets and inhibitors. BMC Syst Biol. 2014;8:117.View ArticleGoogle Scholar
- Schatschneider S, Persicke M, Watt SA, Hublik G, Puhler A, Niehaus K, Vorholter FJ. Establishment, in silico analysis, and experimental verification of a large-scale metabolic network of the xanthan producing Xanthomonas campestris pv. campestris strain B100. J Biotechnol. 2013;167:123–34.View ArticleGoogle Scholar
- Kitano H. Towards a theory of biological robustness. Mol Syst Biol. 2007;3:137.View ArticleGoogle Scholar
- Ye YN, Hua ZG, Huang J, Rao N. Guo FB1. CEG: a database of essential gene clusters. BMC Genom. 2013;14:769.View ArticleGoogle Scholar
- Oren A. Bioenergetic aspects of halophilism. Microbiol Mol Biol Rev. 1999;63:334–48.Google Scholar
- Puchałka J, Oberhardt MA, Godinho M, Bielecka A, Regenhardt D, Timmis KN, Papin JA, Martins dos Santos VA. Genome-scale reconstruction and analysis of the Pseudomonas putida KT2440 metabolic network facilitates applications in biotechnology. PLoS Comput Biol. 2008;4(10):e1000210.View ArticleGoogle Scholar
- Pfeiffer T, Schuster S. Game-theoretical approaches to studying the evolution of biochemical systems. Trends Biochem Sci. 2005;30:20–5.View ArticleGoogle Scholar
- Schuster S, Pfeiffer T, Fell DA. Is maximization of molar yield in metabolic networks favored by evolution? J Theor Biol. 2008;252:497–504.View ArticleGoogle Scholar
- Metris A, George S, Baranyi S. Modelling osmotic stress by Flux Balance Analysis at the genomic scale. Int J Food Microbiol. 2012;152(3):123–8.View ArticleGoogle Scholar
- Covert MW, Palsson BO. Transcriptional regulation in constraints-based metabolic models of Escherichia coli. J Biol Chem. 2002;277(31):28058–64.View ArticleGoogle Scholar
- Papin JA, Reed JL, Palsson BO. Hierarchical thinking in network biology: the unbiased modularization of biochemical networks. Trends Biochem Sci. 2004;29(12):641–7.View ArticleGoogle Scholar
- Nikkel PI, Chavarría M, Fuhrer T, Sauer U, de Lorenzo V. Pseudomonas putida KT2440 strain metabolizes glucose through a cycle formed by enzymes of the Entner–Doudoroff, Embden–Meyerhof–Parnas and Pentose phosphate pathways. J Biol Chem. 2015;43:25920–32.View ArticleGoogle Scholar
- Schwibbert K, Marin-Sanguino A, Bagyan I, Heidrich G, Lentzen G, Seitz H, Rampp M, Schuster SC, Klenk HP, Pfeiffer F, Oesterhelt D, Kunte HJ. A blueprint of ectoine metabolism from the genome of the industrial producer Halomonas elongata DSM 2581T. Environ Microbiol. 2011;13(8):1973–94.View ArticleGoogle Scholar