HYBMFA: a bioinformatics' tool for batch-to-batch bioprocess optimisation supported by elementary flux analysis


The central metabolic pathways of many biological systems of industrial interest are now known. Knowledge of intracellular fluxes is crucial for understanding cell metabolism, and bioreactor dynamic optimisation schemes could profit from incorporating this knowledge [1, 2]. A number of methods have been developed to study the structure of biochemical networks. The elementary flux modes (EFMs) method is particularly attractive because it reduces network complexity to a minimal set of reactions [3].

In a previous study [4], a batch-to-batch bioprocess optimisation scheme supported by a hybrid model was developed and applied to a BHK culture expressing the fusion glycoprotein IgG1-IL2. The main contribution of the present study is to improve that method by incorporating knowledge of the metabolic network. Incorporating the metabolic network in the form of EFMs may improve the generalisation properties of the model and may thus increase the success rate of the optimisation method.


The proposed methodology is based on the premise that the biological system under consideration is only partially known in a mechanistic sense. Following this principle, a hybrid parametric/nonparametric representation of the biological system was adopted to support a batch-to-batch optimisation scheme (Figure 1).

Figure 1. Proposed optimisation scheme.

In the first step, the metabolic network structure of the biological system under study is analysed using the elementary flux modes technique. Elementary flux modes are the simplest paths within a network that connect substrates with end-products [3]; they thus define the minimum set of n species that must be considered for modelling and how these species are connected in a simplified reaction mechanism. The EFM analysis of a given biosystem results in m elementary flux modes and the corresponding n × m stoichiometric matrix K, with n the number of compounds that must be considered for modelling. The BHK metabolic network analysed in this work considers the most relevant pathways involving the two main nutrients (glucose and glutamine) within the central metabolism of BHK cells. The FluxAnalyzer software [3] was used to determine the EFMs of the BHK metabolic network, which is described by seven EFMs. Assuming the balanced growth condition, it is possible to eliminate the intermediate metabolites from each EFM, resulting in a set of simplified reactions connecting extracellular substrates (glucose and glutamine) with end-products (lactate, ammonia, alanine, carbon dioxide, purine and pyrimidine). Furthermore, some assumptions concerning the fluxes were made based on the literature, resulting in five EFMs. The following stoichiometric matrix was obtained:

Note that K also accounts for cell growth and product formation as completely independent fluxes, since the stoichiometry of these reactions is not accurately known.
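The reduction from EFMs to simplified reactions described above can be illustrated on a toy network. The sketch below is not the BHK network: the three reactions, the matrices `N_int` and `N_ext`, and the two flux modes in `E` are hypothetical and chosen only so that the arithmetic can be followed by hand. It shows how, under the balanced growth condition, each EFM collapses to a net reaction between extracellular species, and how these net reactions form the columns of an n × m matrix.

```python
import numpy as np

# Toy network (hypothetical, NOT the BHK network):
#   R1: S -> A,  R2: A -> P,  R3: A -> Q
# S is an extracellular substrate, P and Q are end-products,
# A is an intracellular intermediate.
N_int = np.array([[1, -1, -1]])   # rows: A;       columns: R1, R2, R3
N_ext = np.array([[-1, 0, 0],     # rows: S, P, Q; columns: R1, R2, R3
                  [0, 1, 0],
                  [0, 0, 1]])

# Elementary flux modes of the toy network, one per column:
# mode 1 uses R1 and R2, mode 2 uses R1 and R3.
E = np.array([[1, 1],
              [1, 0],
              [0, 1]])

# Balanced growth: every EFM leaves the intermediate A unchanged.
assert np.all(N_int @ E == 0)

# Net extracellular stoichiometry of each EFM (n x m, here 3 x 2).
K = N_ext @ E
print(K)
```

Each column of `K` is one simplified reaction: the first converts S into P, the second converts S into Q, and the intermediate A no longer appears.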

The state space vector is formed by the concentrations of the compounds in the final reaction set and, additionally, the concentrations of viable cells, Xv, and product, IgG:

$$c = [X_v,\ \mathrm{Glc},\ \mathrm{Gln},\ \mathrm{Lac},\ \mathrm{Amm},\ \mathrm{Ala},\ \mathrm{IgG}]^T \qquad (2)$$

Once a reaction mechanism has been established using the EFM method, the next step is the identification of the EFM kinetics from data. Here we adopted a hybrid parametric/nonparametric model structure, assuming that the reaction kinetics of the EFMs are partially known or even completely unknown. This model structure can be formulated mathematically by the following two equations [4]:

$$\frac{dc}{dt} = r(c, w) - D\,c + u \qquad (3a)$$

$$r(c, w) = K \left\langle \varphi_j(c) \times \rho_j(c, w) \right\rangle_{j = 1, \ldots, m} \qquad (3b)$$

where r is a vector of n volumetric reaction rates, K is the n × m coefficient matrix obtained from the elementary flux modes analysis, φ_j(c) are m kinetic functions established from mechanistic knowledge, ρ_j(c, w) are m unknown kinetic functions, w is a vector of parameters that must be estimated from data, D is the dilution rate and u is a vector of n volumetric input rates (control inputs).
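The structure of the hybrid model can be sketched in a few lines. The example assumes that the material balance combines the terms defined above as dc/dt = r − D c + u, and that the angle brackets in eq. (3b) denote the vector of element-wise products φ_j ρ_j. The function name `hybrid_rhs` and the one-reaction toy system are hypothetical and serve only to check the arithmetic.

```python
import numpy as np

def hybrid_rhs(c, K, phi, rho, D, u):
    """Right-hand side of the hybrid model, assuming dc/dt = K (phi * rho) - D c + u.

    c   : state vector (n,)
    K   : n x m coefficient matrix
    phi : callable returning the m known kinetic terms
    rho : callable returning the m unknown kinetic terms
    D   : dilution rate (scalar)
    u   : volumetric input rates (n,)
    """
    rates = K @ (phi(c) * rho(c))   # eq. (3b): element-wise product, then K
    return rates - D * c + u

# Toy system (hypothetical): one reaction producing species 0 from species 1.
K = np.array([[1.0], [-2.0]])
phi = lambda c: np.array([c[0]])    # known part: proportional to species 0
rho = lambda c: np.array([0.5])     # unknown part: constant specific rate
c = np.array([2.0, 10.0])
dcdt = hybrid_rhs(c, K, phi, rho, D=0.1, u=np.array([0.0, 1.0]))
print(dcdt)
```

Passing `phi` and `rho` as callables keeps the mechanistic part and the data-driven part separate, so the latter can be replaced by a trained network without touching the balance equation.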

For the system under study the vector of known kinetic functions is given by:

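$$\varphi(c) = [X_v,\ X_v\,\mathrm{Glc},\ X_v\,\mathrm{Glc},\ X_v\,\mathrm{Gln},\ X_v\,\mathrm{Gln},\ X_v\,\mathrm{Gln}\,\mathrm{Gln},\ X_v]^T \qquad (4)$$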

whereas the vector of unknown kinetics is given by:

$$\rho = [\mu - k_d,\ r_1,\ r_2,\ r_3,\ r_4,\ r_5,\ r_{\mathrm{IgG}}]^T = \rho(\mathrm{Glc}, \mathrm{Gln}, \mathrm{Amm}, w) \qquad (5)$$

A backpropagation neural network with a single hidden layer was used for the identification of ρ_j(c, w):

$$\rho(c, w) = \rho_{max}\, s(w_2\, s(w_1 c + b_1) + b_2) \qquad (6)$$

where ρ_max is a vector of scaling factors with dim(ρ_max) = m, w_1, b_1, w_2 and b_2 are parameter matrices associated with the connections between the nodes of the network, w is a vectorised form of w_1, b_1, w_2 and b_2, and s(·) is the sigmoid activation function defined as follows:

Finally, the last term in eq. (3a) is the control input vector u = [0, F_Glc, F_Gln, 0, 0, 0, 0]^T, with F_Glc and F_Gln the volumetric feeding rates of glucose and glutamine, respectively.
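The network of eq. (6) and the control input vector can be sketched as follows. The layer sizes (three inputs, five hidden nodes, seven outputs) follow the text, and the scaling by ρ_max is taken to be element-wise. The logistic form of s(·) is an assumption, since the definition is not reproduced here, and the random weights, the input values and the names `rho_network` and `feed_vector` are placeholders rather than the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: an ASSUMPTION, the source definition is not available.
    return 1.0 / (1.0 + np.exp(-x))

def rho_network(inputs, w1, b1, w2, b2, rho_max):
    """Single-hidden-layer network of eq. (6); rho_max scales each output."""
    hidden = sigmoid(w1 @ inputs + b1)
    return rho_max * sigmoid(w2 @ hidden + b2)

def feed_vector(f_glc, f_gln):
    """Control input u: only glucose and glutamine are fed."""
    return np.array([0.0, f_glc, f_gln, 0.0, 0.0, 0.0, 0.0])

n_in, n_hidden, n_out = 3, 5, 7            # Glc, Gln, Amm -> 7 specific kinetics
rng = np.random.default_rng(0)             # placeholder weights, not fitted values
w1 = rng.normal(size=(n_hidden, n_in))
b1 = rng.normal(size=n_hidden)
w2 = rng.normal(size=(n_out, n_hidden))
b2 = rng.normal(size=n_out)
rho_max = np.ones(n_out)

rho = rho_network(np.array([1.0, 0.5, 0.1]), w1, b1, w2, b2, rho_max)
n_params = w1.size + b1.size + w2.size + b2.size   # length of the vector w
u = feed_vector(0.02, 0.01)
```

With these layer sizes the parameter vector w has 62 entries (15 + 5 + 35 + 7), all of which must be identified from the measured state variables.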

Off-line measurements of the seven state variables from five experiments were used for model training and validation. The neural network had three inputs: glucose and glutamine, the main limiting nutrients, and ammonia, the main toxic by-product. The output vector was formed by the seven unknown specific kinetics: μ-kd, r1, r2, r3, r4, r5, rIgG. Training was stopped at the minimum modelling error on the validation data set. The best result was obtained with five hidden nodes. Figure 2 presents the hybrid modelling results for one of the training and one of the validation data sets. Notably, the hybrid model was able to describe all five batches simultaneously with high accuracy.
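The stopping criterion described above, keeping the parameters that give the lowest error on the validation data, can be sketched generically. The function `train_with_early_stopping` and the scalar toy problem are hypothetical: `step` stands for one training update and `val_error` for the modelling error on the validation set, neither of which is specified in the text.

```python
import copy

def train_with_early_stopping(params, step, val_error, max_iter=1000):
    """Run training updates and return the parameters with the lowest validation error."""
    best_params, best_err = copy.deepcopy(params), val_error(params)
    for _ in range(max_iter):
        params = step(params)            # one update driven by the training data
        err = val_error(params)          # error on data not used for the update
        if err < best_err:
            best_params, best_err = copy.deepcopy(params), err
    return best_params, best_err

# Toy problem (hypothetical): training pulls w towards 2.0,
# while the validation data are best described by w = 1.5.
step = lambda w: w + 0.1 * (2.0 - w)
val_error = lambda w: (w - 1.5) ** 2

best_w, best_err = train_with_early_stopping(0.0, step, val_error, max_iter=100)
final_w = 0.0
for _ in range(100):
    final_w = step(final_w)
```

In the toy problem the training updates eventually overshoot the validation optimum, so the retained parameter differs from the one at the end of training, which is the behaviour the criterion is meant to guard against.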

Figure 2. Hybrid model results for a training data set (a) and a validation data set (b).

Optimisation results

With the hybrid model just developed, the process performance (described as the glycoprotein quantity at the end of the bioreaction) is optimised with respect to the control inputs F_Glc and F_Gln, using a micro-genetic algorithm [5]:

The optimisation (8) is constrained by the hybrid dynamical model and by the risk of the ANN inputs lying outside the trust region. The optimisation results are presented in Figure 3, which shows the optimal trajectories of viable cell, glucose, glutamine and product concentrations. The final product titre is 25 mg/l, a 67% improvement over the performance obtained in the fed-batch experiments so far.
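A micro-genetic algorithm works with a very small population, keeps the best individual from one generation to the next, and restarts the rest of the population at random once it has converged. The sketch below is a generic real-coded variant written for illustration, not the authors' implementation: the function `micro_ga`, its default settings, the blend crossover and the toy objective are all assumptions. In particular, the two constant feed rates and the linear penalty are stand-ins for the feeding profiles and the trust-region risk constraint, whose exact form is not given here.

```python
import numpy as np

def micro_ga(objective, lower, upper, pop_size=5, generations=200,
             restart_tol=0.05, seed=0):
    """Maximise `objective` over a box using a small population with restarts."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    span = upper - lower
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    best_x, best_f = None, -np.inf
    for _gen in range(generations):
        fitness = np.array([objective(x) for x in pop])
        i_best = int(np.argmax(fitness))
        if fitness[i_best] > best_f:
            best_x, best_f = pop[i_best].copy(), float(fitness[i_best])
        # Restart: once the population has collapsed around the elite,
        # keep the elite and redraw all other individuals at random.
        if np.max(np.abs(pop - best_x) / span) < restart_tol:
            pop = rng.uniform(lower, upper, size=pop.shape)
            pop[0] = best_x
            continue
        # Otherwise breed: the elite survives, children come from binary
        # tournaments followed by a per-coordinate blend crossover.
        children = [best_x.copy()]
        while len(children) < pop_size:
            parents = []
            for _k in range(2):
                i, j = rng.integers(0, pop_size, size=2)
                parents.append(pop[i] if fitness[i] >= fitness[j] else pop[j])
            alpha = rng.random(lower.size)
            children.append(alpha * parents[0] + (1.0 - alpha) * parents[1])
        pop = np.array(children)
    return best_x, best_f

def toy_objective(feed):
    # Hypothetical performance surface with its optimum at feed = (0.3, 0.7).
    titre = -(feed[0] - 0.3) ** 2 - (feed[1] - 0.7) ** 2
    # Hypothetical stand-in for the risk of leaving the trust region.
    risk = max(0.0, feed[0] + feed[1] - 1.5)
    return titre - 10.0 * risk

best_feed, best_value = micro_ga(toy_objective, lower=[0.0, 0.0], upper=[1.0, 1.0])
```

Treating the risk as a penalty on the objective is only one way to handle the constraint; it keeps the search unconstrained inside the box, at the cost of having to choose a penalty weight.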

Figure 3

According to the iterative batch-to-batch optimisation scheme shown in Figure 1, the next step is to perform a new experiment to validate these optimisation results. If the measured data and the predicted optimal process trajectories deviate considerably, additional iterations are performed until the model and the process performance converge.


A bioinformatics tool was developed that integrates classical optimal control and elementary flux analysis tools. A hybrid parametric/nonparametric modelling framework was adopted that does not require detailed knowledge of intracellular kinetics. A dynamic optimisation method is employed, constrained by the risk of unreliability of the nonparametric components. The method was applied to a recombinant BHK-21 cell line expressing the fusion glycoprotein IgG1-IL2. The final hybrid model was then used to optimise conditions that favour product formation, showing that high productivity increments are likely for the process at hand.


  1. Provost A, Bastin G: Dynamic metabolic modeling under balanced growth condition. J Process Control. 2004, 14: 717-728. 10.1016/j.jprocont.2003.12.004.

  2. Mahadevan R, Burgard A, Famili I, Van Dien S, Schilling C: Applications of metabolic modeling to drive bioprocess development for the production of value-added chemicals. Biotechnol Bioprocess Eng. 2005, 10: 408-417.

  3. Klamt S, Stelling J, Ginkel M, Gilles E: FluxAnalyzer: exploring structure, pathways, and flux distributions in metabolic networks on interactive flux maps. Bioinformatics. 2003, 19: 261-269. 10.1093/bioinformatics/19.2.261.

  4. Teixeira A, Cunha A, Clemente J, Moreira J, Cruz H, Alves P, Carrondo M, Oliveira R: Modelling and optimisation of a recombinant BHK-21 cultivation process using hybrid grey-box systems. J Biotechnol. 2005, 118: 290-303. 10.1016/j.jbiotec.2005.04.024.

  5. Krishnakumar K: Micro-Genetic Algorithms for Stationary and Non-Stationary Function Optimization. SPIE: Intelligent Control and Adaptive Systems. 1989, 1196: Philadelphia, PA.



The authors acknowledge the financial support provided by the Fundação para a Ciência e Tecnologia through project POCTI/BIO/57927/2004 and PhD grant SFRH/BD/13712/2003.

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite this article

Teixeira, A., Alves, C., Alves, P. et al. HYBMFA: a bioinformatics' tool for batch-to-batch bioprocess optimisation supported by elementary flux analysis. Microb Cell Fact 5 (Suppl 1), P51 (2006).
