Versatile modeling and optimization of fed batch processes for the production of secreted heterologous proteins with Pichia pastoris
© Maurer et al; licensee BioMed Central Ltd. 2006
Received: 25 October 2006
Accepted: 11 December 2006
Published: 11 December 2006
Secretion of heterologous proteins depends both on biomass concentration and on the specific product secretion rate, which in turn is not constant at varying specific growth rates. As fed batch processes usually do not maintain a steady state throughout the feed phase, it is not trivial to model and optimize such a process by mathematical means.
We have developed a model for product accumulation in fed batch based on iterative calculation in Microsoft Excel spreadsheets, and used the Solver software to optimize the time course of the media feed in order to maximize the volumetric productivity. The optimum feed phase consisted of an exponential feed at maximum specific growth rate, followed by a phase with linearly increasing feed rate and consequently steadily decreasing specific growth rate. The latter phase could be modeled also by exact mathematical treatment by the calculus of variations, yielding the explicit shape of the growth function, however, with certain indeterminate parameters. To evaluate the latter, one needs a numerical optimum search algorithm. The explicit shape of the growth function provides additional evidence that the Excel model results in correct data. Experimental evaluation in two independent fed batch cultures resulted in a good correlation to the optimized model data, and a 2.2 fold improvement of the volumetric productivity.
The advantages of the procedure we describe here are the ease of use and the flexibility, applying software familiar to every scientist and engineer, and rapid calculation which makes predictions extremely easy, so that many options can be tested in silico quickly. Additional options like further biological and technological constraints or different functions for specific productivity and biomass yield can easily be integrated.
Modeling of bioprocesses has been pursued since the 1970s, with the aim to rationally optimize processes. While the mathematical description of processes like growth and product formation have been fairly well achieved, it is still not routine practice to design biotechnological production processes based on model prediction. An especially difficult case in this respect is fed-batch as a dynamic system usually not reaching steady state. General attempts to model fed-batch processes have been described (for an overview see ). Based on these modeling approaches, optimization of fed batch processes has been attempted using Pontryagin's Maximum Principle [2, 3], Green's Theorem , or Dynamic Programming . These approaches are rather complex, and they did not find their way in routine application.
A typical case of fed-batch process is the production of recombinant proteins with microorganisms or mammalian cells. While the description of product concentration in the cell mass is rather straight forward (in the case of an intracellular product), it is more complex to predict the kinetics of a secreted product. A typical case for secretion systems are recombinant yeasts . As the production of many proteins in yeasts is quite cost sensitive, it will be highly desirable to have a tool available that allows a simple yet reliable prediction of productivity, process time and product titers. Approaches to optimize fed batch processes for the methylotrophic yeast Pichia pastoris have been described [3, 5]. The latter employ dynamic programming by dividing the total process time into a discrete number of intervals, and assigning a value of the specific growth rate μ selected from a discrete set of values. The major drawback of this approach is that the process time is fixed and not an issue of optimization. The algorithm used by this group is complex, and not readily available to others. Zhang and coworkers  present an approach based on Pontryagin's Maximum Principle. A general applicability seems hampered by the complex calculation, complicating a simple recalculation with modified data or calculation procedures.
With this work we aimed at the development of an optimization tool for fed batch processes using calculation tools available for every PC. MS Excel allows the approximation of a model by numerically solving equations describing the system, and the optimization of an objective function by modifying defined fields (the decision variables), while different constraints can be defined which have to be complied with. The calculation is based on the generalized reduced gradient method described in . While the general concept of calculation is similar to the approaches above, the definition of the optimization objective – while obviously a crucial step – is not consistently resolved in the existing literature. The variable costs of a bioprocess correlate with the volumetric capacity of the required fermentation unit, and the process time this unit is required to produce a defined amount of the product . Thus the volumetric productivity Q P is the most plausible target for optimization. At a given process time point t, Q P is defined as:
Expanding this concept to total manufacturing costs is feasible but depends on a profound and reliable cost calculation. As outlined below, Q P can be calculated from the specific growth rate μ and the specific production rate q P . μ should be one of the decision variables of the optimization (defining the feed rate profile to be developed), while q P depends on μ. The exact function, q P = f(μ), of this dependence for secreted recombinant proteins has been subject to discussion . These authors provide some evidence that secreted protein productivity is saturated at high μ, but a clear experimental solution of this function and its biological basis has not been achieved yet. Zhang et al. approximate this relation by an empirical 3rd order function , while Ohya and coworkers model it by a two step linear function . Both groups base their model functions on rather few experimental data. To improve the accuracy of the model q P = f(μ), we examined the entire space of μ of a P. pastoris strain in chemostat cultures for the respective values of q P , as well as the observed biomass yield coefficient Y' XS in order to calculate the substrate needed for each increment of biomass increase. It has been discussed whether data derived from steady state are applicable to model transient situations as they usually occur in fed batch. The parameters determining the accuracy of a steady state model are the relaxation time constants of environmental changes and of biological processes based on the change in environment, which become critical at highly transient situations like a shift to growth limiting conditions at the end of batch . However, substrate limited fed batch cannot be considered as highly transient, so that the steady state model should be applicable.
A P. pastoris strain expressing the Fab fragment of the anti-HIV antibody 2F5  was employed as a model. As expression is based on the glyceraldehyde phosphate dehydrogenase (GAP) promoter, glucose is used as a substrate for growth. Modeling of the fed batch process and optimization of Q P was used to predict an optimal feed protocol, which was then evaluated experimentally. The model optimization was also solved analytically in order to prove the accuracy of the Excel approximation.
Results and discussion
Standard fed batch
Optimized fed batch – model and experimental
F L = 0.012 g·h-2·t L + 30.672 g·h-1 (2)
Modeling of the standard fed batch
Using the same equations as for optimization we also attempted to model the standard fed batch process. However, the predicted values of biomass and product concentrations deviated significantly from the experimental data. Therefore we reconsidered the data source used to obtain the functions of Y' XS and q P based on μ. The obvious difference between the standard and optimized fed batch protocol is that the standard culture is performed at very low μ from 0.05 h-1 decreasing to 0.005 h-1, while the optimized culture starts at μ = 0.2 h-1, decreasing to 0.05 h-1. As saturation functions like the Monod function are more susceptible to low values of the x-coordinate, and the majority of the data from chemostat were naturally obtained at higher μ values, we remodeled the function q P = f(μ) for low μ based on data derived from previous fed batch cultures. The best approximation in the range of μ ≤ 0.05 h-1 was a linear function
q P = 0.2051·μ + 0.002 (3)
Similarly, Y XS and m S were remodeled in this range of μ ≤ 0.05 h-1.
Apparently the model based on chemostat data fits well at higher μ, while it needed adjustment at values below μ = 0.05 h-1. Most importantly, it was valid for the optimized feed protocol derived from the model, which led to a 2.2 fold increase of volumetric productivity. The sensitivity of the model to the accuracy of the function q P = f(μ) stresses the importance of an accurate experimental determination of q P both at low and high μ. Importantly, this function can be refined in future utilizing additional data from fed batch (and chemostat) cultures, so that the model acquires features of a self learning model.
The calculus of variations yields a method to derive an analytic formula (containing indeterminate parameters) for our optimization problem. Let X = X(t) be the amount of biomass, P = P(t) the amount of product and μ = μ(t) the specific growth rate of biomass at the point of time t. The process to be controlled is described by the following equations:
The growth of biomass is modeled by the equation
X'(t) = μ(t)·X(t) (4)
with the initial value X(0) = X0.
The yield of product is modeled by the equation
P'(t) = q P (t)·X(t) (5)
with a Monod like formula
and the initial value P(0) = P0.
In a first step we maximize the cumulative yield of product in a fixed time interval [0, T]. Therefore, P(T) → Max is equivalent to P(T) → Max. We therefore consider
by formulas (5) and (6). Inserting X'(t)/X(t) for μ(t) results in maximizing the integral
This integral has an extremum only if the Euler-Lagrange differential equation
Evaluating the Euler Lagrange equation (9) yields
Inserting μ(t)·X(t) for X'(t) (from eq. 4) yields
Calculating the differential and reducing to a common denominator gives
2·k q ·μ'(t) + μ3(t) + k q ·μ2(t) = 0 (13)
We solve this differential equation and get the following equation for μ(t):
We have developed a modeling and optimization algorithm for fed batch cultures of secreted products based on MS Excel. The validity of this iterative calculation, which is highly flexible and versatile, was proven by analytic solution of the equations forming the basis of the fed batch model. While the analytic solution fits exactly to the phase of decreasing specific growth rate of the Excel Solver solution, it is not possible to calculate the duration of the initial μmax phase. As the optimum feed profiles obviously consist of an exponential phase followed by a phase of steadily decreasing μ, the analytic approach could only serve as evidence of the correct solution of the optimization problem obtained with Excel Solver. Both the Euler-Lagrange approach used here and Pontryagin's Maximum Principle depend on data fitting to obtain a numeric solution. Given the perfect match of the two approaches presented here, we consider it much more straight forward to apply a numeric data fitting approach directly to the equations of growth and product formation.
The advantages of the procedure we describe here are the ease of use, applying software familiar to every scientist and engineer, and rapid calculation which makes predictions extremely easy, so that many options can be tested in silico quickly. Additional options like further biological and technological constraints or different functions for specific productivity and biomass yield can easily be integrated.
We could prove that the experimental data basis for the functions behind the algorithm is very important. Different to previous work this was taken into account, and especially the sensitivity at very low specific growth rates needs to be highlighted.
The Excel file containing the model and optimization procedure is provided as accompanying file [see additional file 1].
Materials and methods
Unless stated otherwise, all chemicals were purchased from Merck Eurolab and all antisera were from Sigma.
A P. pastoris strain X33 (wild type strain) expressing extracellularly the Fab fragment of the anti-HIV antibody 2F5 under control of the GAP promoter was used in this study. The development of this strain has been described elsewhere . A cell bank of the strain was prepared, divided in 1.8 mL aliquots and stored at -80°C.
A shake flask containing 100 mL of YPG medium (per liter: 10 g yeast extract, 10 g peptone, 10 g glycerol) was inoculated with one cryovial from the P. pastoris cell bank, and incubated at 28°C for approximately 24 hours and agitated at 180 rpm.
This culture was used to inoculate the starting volume in the bioreactor to a starting optical density (OD600) of 1.0. Depending on the operation mode the starting volume was either 1.2 L for fed batch or 1.4 L for chemostat process.
Fermentations were carried out in a 2.0 L working volume bioreactor (MBR; Wetzikon, Switzerland) with a computer based process control (ISE; Vienna, Austria). Fermentation temperature was controlled at 25°C, pH was controlled at 5.0 with addition of 25% ammonium hydroxide and the dissolved oxygen concentration was maintained above 20% saturation by controlling the stirrer speed between 600 and 1200 rpm, whereas the airflow was kept constant at 100 L h-1.
The media were as follows:
Batch medium contained per liter:
2.0 g citric acid, 12.4 g (NH4)2HPO4, 0.022 g CaCl2·2H2O, 0.9 g KCl, 0.5 g MgSO4·7H2O, 40 g glycerol, 4.6 ml PTM1 trace salts stock solution. The pH was set to 5.0 with 25% HCl.
Glucose fed batch solution contained per liter:
550 g glucose·1H2O, 10 g KCl, 6.45 g MgSO4·7H2O, 0.35 g CaCl2·2H2O and 12 ml PTM1 trace salts stock solution.
Chemostat medium contained per liter:
55 g glucose·1H2O, 2.5 g KCl, 1.0 g MgSO4·7H2O, 0.035 g CaCl2·2H2O, 21.8 g (NH4)2HPO4 and 2.4 ml PTM1 trace salts stock solution, furthermore the pH was set to 5.0 with 25% HCl.
PTM1 trace salts stock solution contained per liter:
6.0 g CuSO4·5H2O, 0.08 g NaI, 3.0 g MnSO4· H2O, 0.2 g Na2MoO4·2H2O, 0.02 g H3BO3, 0.5 g CoCl2, 20.0 g ZnCl2, 65.0 g FeSO4·7H2O, 0.2 g biotin and 5.0 ml H2SO4 (95%–98%). All chemicals for PTM1 trace salts stock solution were from Riedel-de Haën (Seelze, Germany), except for biotin (Sigma, St. Louis, MO, USA), and H2SO4 (Merck Eurolab).
After approximately 24 hours the batch was finished and – depending on the fermentation strategy – the feed and if required the harvest was started.
The continuous fermentation was initiated at a D = 0.15 h-1 and performed at least for 5 resident times τ to reach steady state conditions.
Then the dilution rate was decreased stepwise, always achieving steady state conditions before the next change of the dilution rate. At D = 0.0086 h-1 the procedure was reversed and the dilution rate was increased stepwise up to the critical dilution rate Dcrit = 0.2 h-1. Samples were taken after 3 and 5 τ and analyzed as described below.
The standard fermentation strategy was a fed batch with a constant feed, this means that the batch phase was followed by the glucose fed batch with a feed rate F = 8.925 g h-1. The fermentations were terminated at appr. t = 120 h. Samples were taken frequently and processed as described below.
The optimized fermentation strategy consists of different phases to perform the calculated growth kinetic. The batch phase was followed by an exponential feed phase with a growth rate of 0.2 for 3.6 hours, followed by a linearly increasing feed rate calculated by equation (16), where k = 0.0144 g h-2 and d = 36.8064 g h-1 for 16.0 hours.
F L = k·t L + d (16)
The samples were diluted in ddH2O up to 1:500 to measure the OD at 600 nm.
2 × 5 ml culture were centrifuged and the supernatants frozen for further analysis. The pellets were resuspended in ddH2O, recentrifuged, and the pellets again resuspended in ddH2O, transferred to a weighed beaker, dried at 105°C until constant weight.
Product quantification (ELISA)
To determine the Fab content, 96 well microtiter plates (MaxiSorb, Nunc, Denmark) were coated with anti-hIgG (Fab specific) overnight at RT (1:1000 in PBS, pH 7.4), before serially diluted supernatants of P. pastoris cultures secreting 2F5 Fab (starting with a 1:100 dilution in PBS) were applied and incubated for 2 h at RT. Fab of normal IgG (Nordic) was used as a standard protein at a starting concentration of 200 ng/ml. After each incubation step the plates were washed four times with PBS containing 1% Tween 20 adjusted to pH 7.4. 100 μ l of anti-kappa light chain – AP conjugate as secondary antibody (1:1000 in PBS/Tween + 2% BSA) were added to each well, and incubated for 1 h at RT. After washing, the plates were stained with pNPP (1 mg/ml p-nitrophenyl phosphate in coating buffer, 0.1 N Na2CO3/NaHCO3; pH 9.6) and read at 405 nm (reference wavelength 620 nm).
Method of calculation
1. Setup of calculations
We divide the total feed period in equal intervals [t n , tn+1] (1 ≤ n ≤ N) of length dt. Therefore,
tn+1= t n + dt (17)
We start with an initial value dt = 1 [h]. The best value for dt is determined within the optimization process.
At every point of time t n we denote by X n = X(t n ) the amount of biomass and by P n = P(t n ) the amount of product in the bioreactor. At the beginning of the fed-batch process the initial values are X(0) = X0 and P(0) = P0, as achieved at the end of the batch phase.
First we have to describe the growth of the biomass. We use the simplest model, the exponential growth model,
Since the specific growth rate μ of the biomass depends on time, we calculate (eq. 18) in discrete time steps
where μ n is the specific growth rate during the interval [t n , tn+1]. The initial values for μ n are chosen arbitrarily, for instance μ n ≡ μmax. The optimal values for all of the μ n 's are determined within the optimization process.
Second we have to describe the accumulation of the product. We simply calculate the total product yield during the interval [t n , tn+1] by the following formula
Pn+1= P n + dP n (20)
dP n = q Pn · X n ·dt (21)
The relationship between the specific rate q P of product formation and the specific growth rate μ was experimentally determined in chemostat cultures. The dependence of q P on μ was described analogous to Monod equation:
The values for q Pmax and k q are derived from the experimental data by the method of least squares, i.e. the parameters qP maxand k q are chosen that the sum of the deviations from the experimental data squared is minimal.
Next we have to calculate the amount of substrate dS which we must feed in the time interval [t n , tn+1]. To do this, let S n be the amount of substrate added to the bioreactor until the time point t n . Then the substrate consumption rate depends on the amount and on the increase of biomass, i.e.
where m S is the maintenance coefficient and Y XS is the true yield coefficient of biomass from substrate. Inserting formula (18) in (23) the amount of substrate feed in the interval [t n , tn+1] calculates as
To calculate the parameters Y XS and m S from experimental data of chemostat cultures by the method of least squares, we use the observed biomass yield coefficient Y' XS depending on the specific growth rate μ. This is done by dX = -Y' XS ·dS and inserting formula (18) and the formula for the whole substrate consumption which implies
Formula (25) can be transformed to
From this double reciprocal plot Y XS and m S were determined by linear regression.
Last but not least we need the total volume for the calculation of the volumetric productivity. The model process starts with a batch volume of V0 = 1 L. The total volume at each time interval is then
with the substrate concentration in the feed medium s f and the density of the feed medium ρ f . Due to the high biomass concentrations achieved in P. pastoris fermentations, the cells occupy a significant fraction of the total volume, while the product is secreted to the liquid phase, the culture supernatant. In order to calculate the product concentration, the available liquid volume V l is calculated at each time interval with the specific volume of wet biomass, which is derived from dry biomass as the specific volume per dry biomass νYDM = 0.0033 L g-1.
Vln = V n - X n ·ν YDM (28)
Finally, we calculate the biomass and the product concentrations. The product concentration p at the time point t n is calculated as
and the biomass concentration x at the same time point is
The medium feed rate Fn at each time point is
These values are used to determine the feed rate profile of the optimized fed batch process.
The goal of our optimization problem is to find the best values for the specific growth rates μ n and the best value for dt (which implies that the total feed period undergoes the optimization process too) such that the volumetric productivity Q P calculated at the point of time tN+1as
is maximized under the following constraints:
μmin ≤ μ n ≤ μmax for (1 ≤ n ≤ N) (33)
X(tN+1) = Xmax (34)
Here μmax = 0.2 h-1 is the maximum specific growth rate at just below washout in chemostat cultures. Since below μ = 0.02 h-1 significant product degradation appeared, the lower boundary was set at μmin = 0.03 h-1. Also the biomass concentration needs to be limited. The upper limit is mainly defined by the cell separation step, which is practically limited with approximately 100 gL-1 dry mass.
Additional constraints may be entered, e.g. the final product concentration may be set at a minimum level.
In the Excel sheet we set N = 150. The values t n , μ n , X n , ... are organized in columns, with each time point t n ... a row. The values of X n , P n , V n , ... are calculated from the respective previous row using the equations provided above. The optimization process is performed by the Excel Solver as a black box. It maximizes the final Q P field by varying the μ fields within the boundaries and the dt field.
The Excel file used for this work is provided as an additional file.
To verify the Excel Solver solution, the exact solution of the optimization problem was determined with calculus of variation.
List of symbols
List of symbols
flow rate of linear feed
Monod constant for qP
specific product formation rate
mg g-1 h-1
mg L-1 h-1
maximum specific productivity
mg g-1 h-1
estimated standard deviation of qP
mg g-1 h-1
estimated standard deviation of Y'XS
time of linear feed
total feed time
volume of liquid supernatant
dry biomass concentration
observed biomass yield coefficient
theoretical biomass yield coefficient
specific growth rate
specific volume of biomass
density of feed medium
average residence time
double distilled water
Fab fragment of antibody 2F5
yeast dry mass
The authors wish to thank Prof. Werner Nowak, Institute of Mathematics, University of Natural Resources and Applied Life Sciences Vienna, for support and valuable discussions. This work was supported by the Austrian Research Promotion Agency (Program FHplus), and Polymun Scientific GmbH, Vienna, Austria. Part of the results presented here have been communicated at the 4th Recombinant Protein Production Meeting (Barcelona, 2006).
- Sinclair CG, Kristiansen B, Bu'Lock JD: Fermentation kinetics and modelling. 1987, Milton Keynes: Open University Press
- Modak JM, Lim HC, Tayeb YJ: General characteristics of optimal feed rate profiles for various fed batch fermentation processes. Biotechnol Bioeng. 1986, 28: 1396-1407. 10.1002/bit.260280914.View Article
- Zhang W, Sinha J, Smith LA, Inan M, Meagher MM: Maximization of production of secreted recombinant proteins in Pichia pastoris fed-batch fermentation. Biotechnol Prog. 2005, 21: 386-393. 10.1021/bp049811n.View Article
- Ohno H, Nakanishi E, Takamatsu T: Optimal control of a semibatch fermentation. Biotechnol Bioeng. 1976, 18: 847-864. 10.1002/bit.260180607.View Article
- Kobayashi K, Kuwae S, Ohya T, Ohda T, Ohyama M, Tomomitsu K: High level secretion of recombinant human serum albumin by fed-batch fermentation of the methylotrophic yeast, Pichia pastoris, based on optimal methanol feeding strategy. J Biosci Bioeng. 2000, 90: 280-288.View Article
- Porro D, Sauer M, Branduardi P, Mattanovich D: Recombinant protein production in yeasts. Mol Biotechnol. 2005, 31: 245-259. 10.1385/MB:31:3:245.View Article
- Lasdon LS, Waren AD, Jain A, Ratner M: Design and testing of a generalized reduced gradient code for nonlinear programming. ACM T Math Software. 1978, 4: 34-49. 10.1145/355769.355773.View Article
- Werner RG: Economic aspects of commercial manufacture of biopharmaceuticals. J Biotechnol. 2004, 113: 171-182. 10.1016/j.jbiotec.2004.04.036.View Article
- Hensing MCM, Rouwenhorst RJ, Heijnen JJ, van Dijken JP, Pronk JT: Physiological and technological aspects of large-scale heterologous-protein production with yeasts. Antonie van Leeuwenhoek. 1995, 67: 261-279. 10.1007/BF00873690.View Article
- Ohya T, Morita M, Miura M, Kuwae S, Kobayashi K: High-level production of prourokinase-annexin V chimeras in the methylotrophic yeast Pichia pastoris. J Biosci Bioeng. 2002, 94: 467-473.View Article
- Roels JA: Energetics and kinetics in biotechnology. 1983, Amsterdam, Elsevier Biomedical Press
- Gasser B, Maurer M, Gach J, Kunert R, Mattanovich D: Engineering of Pichia pastoris for improved production of antibody fragments. Biotechnol Bioeng. 2006, 94: 353-361. 10.1002/bit.20851.View Article
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.