Protein solubility and differential proteomic profiling of recombinant Escherichia coli overexpressing double-tagged fusion proteins
Microbial Cell Factories volume 9, Article number: 63 (2010)
Overexpression of recombinant proteins usually triggers the induction of heat shock proteins that regulate aggregation and solubility of the overexpressed protein. The two-dimensional gel electrophoresis (2-DE)-mass spectrometry approach was used to profile the proteome of Escherichia coli overexpressing N-acetyl-D-glucosamine 2-epimerase (GlcNAc 2-epimerase) and N-acetyl-D-neuraminic acid aldolase (Neu5Ac aldolase), both fused to glutathione S-transferase (GST) and polyionic peptide (5D or 5R).
Overexpression of fusion proteins by IPTG induction caused significant differential expression of numerous cellular proteins; most of these proteins were down-regulated, including enzymes connected to the pentose phosphate pathway and the enzyme LuxS that could lead to an inhibition of tRNA synthesis. Interestingly, when plasmid-harboring cells were cultured in LB medium, gluconeogenesis occurred mainly through MaeB, while in the host strain, gluconeogenesis occurred by a different pathway (by Mdh and PckA). Significant up-regulation of the chaperones ClpB, HslU and GroEL and high-level expression of two protective small heat shock proteins (IbpA and IbpB) were found in cells overexpressing GST-GlcNAc 2-epimerase-5D but not in GST-Neu5Ac aldolase-5R-expressing E. coli. Although most of the recombinant protein was present in insoluble aggregates, the soluble fraction of GST-GlcNAc 2-epimerase-5D was higher than that of GST-Neu5Ac aldolase-5R. Also, in cells overexpressing recombinant GST-GlcNAc 2-epimerase-5D, the expression of σ32 was maintained at a higher level following induction.
Differential expression of metabolically functional proteins, especially those in the gluconeogenesis pathway, was found between host and recombinant cells. Also, the expression patterns of chaperones/heat shock proteins differed among the plasmid-harboring bacteria in response to overproduction of recombinant proteins. In conclusion, the solubility of overexpressed recombinant proteins could be enhanced by maintaining the expression of σ32, a bacterial heat shock transcription factor, at higher levels during overproduction.
Under the regulation of strong promoters, as in numerous commercial plasmid-based vectors, heterologous proteins are typically expressed at high levels in Escherichia coli. The overexpression of plasmid-encoded genes can trigger transcription of heat-shock genes and other stress responses and often results in the aggregation of the encoded proteins as inclusion bodies. The formation of inclusion bodies offers distinct advantages for the separation of overexpressed protein, because the aggregates, which mostly contain the product at a high concentration, can be easily isolated. However, the recombinant proteins found in inclusion bodies are often in a misfolded state, so methods that avoid aggregation and yield a soluble and active product are sometimes very desirable. Introducing a fusion partner (tag) such as N-utilization substance A (NusA), maltose-binding protein (MBP), thioredoxin (TRX), or glutathione S-transferase (GST) to the recombinant protein is one of the most commonly used methods to increase solubility [3, 4]. We previously constructed two double-tagged gene fusions for overexpressing N-acetyl-D-glucosamine 2-epimerase (GlcNAc 2-epimerase) and N-acetyl-D-neuraminic acid aldolase (Neu5Ac aldolase), two sequential enzymes in the production of sialic acids. Both proteins were tagged with GST at the N-terminus, but at the C-terminus, one was tagged with five contiguous aspartate residues (5D) and the other with five contiguous arginine residues (5R). The fusions were designed to yield fusion proteins with charged surfaces at working pH, which allowed isolation and immobilization in a single step with either an anionic or a cationic exchanger that electrostatically bound the fusion proteins via the 5D or 5R tag. In contrast to overexpressed GST alone, which was totally soluble, most of the overexpressed fusion proteins were found in the insoluble fraction, although these fusion proteins were enzymatically active in both the soluble and insoluble (aggregate) fractions. The present paper thus delineates the proteomic profiles of the overproducing bacteria and presents results that could be useful for conceiving a strategy to improve the production of soluble recombinant proteins.
Recombinant protein overexpression has been known to induce significant physiological changes such as the stress response to heat-shock in E. coli. The presence of the inducer isopropyl-β-D-1-thiogalactopyranoside (IPTG) alone can even influence E. coli metabolism substantially, altering the synthesis of certain proteins . When a recombinant protein is expressed at high rates, the system of cytosolic chaperones and proteases in bacteria is presumably induced to express in an altered pattern, in comparison with the host cells without overexpressing recombinant proteins. In addition to facilitating the folding of nascent proteins, several molecular chaperones and heat shock proteins are induced to inhibit the formation of inclusion bodies by reducing aggregation and promoting proteolysis of misfolded proteins. The simultaneous overexpression of chaperone/heat shock protein encoding genes and recombinant target proteins proved effective in several instances . To increase the solubility of recombinant proteins, the co-overproduction of individual chaperones as well as the combined overproduction of the functionally cooperating chaperone network of the E. coli cytosol has been attempted . Based on experimental results, Garcia-Fruitos et al. suggested that the so-called E. coli quality control system (made up of chaperones and proteases) acts coordinately to promote solubility at the expense of conformational quality . A study of global changes in protein expression that occur in response to the rapid synthesis of a recombinant protein would therefore help to elucidate the mechanisms regulating recombinant protein solubility.
Proteomic analysis has been employed to compare changes in the expression levels of cellular proteins under particular genetic and environmental conditions. The conventional approach to proteomics is a combination of high-resolution two-dimensional gel electrophoresis (2-DE) to separate the proteins and mass spectrometry (MS) to identify each isolated protein. Proteomic studies have helped elucidate complex cellular responses such as starvation, temperature shock, and stress responses in E. coli and they have facilitated its use in a variety of biotechnological applications . Knowledge of basic cellular processes provides the basis for developing methods to better control heterologous protein expression . Proteomic analysis has been used to disclose cellular protein changes during the overexpression of heterologous proteins in E. coli under different fermentation conditions [13–16]. Strategies to increase the production of serine-rich proteins and enhance cytosolic or secretory protein production have been proposed based on the disclosed proteome profiles [17, 18].
Results and Discussion
Differential expression profiling of E. coli induced with IPTG for protein overexpression
The bacteria used in this study are the host E. coli BL21 and E. coli BL21 strains harboring pGEX-2TK, pGEX-2TK-nanA-5R and pGEX-2TK-2ep-5D, which encoded GST, GST-Neu5Ac-aldolase-(arginine)5 and GST-GlcNAc 2-epimerase-(aspartate)5, respectively. These four strains, cultivated in Luria-Bertani (LB) medium, showed a similar growth pattern, even throughout the expression of recombinant proteins induced by the addition of IPTG (Additional file 1). When the OD600 reached 0.8 (denoted as time zero or T0), a sample was taken for proteome analysis and IPTG was added to both host-strain and plasmid-bearing cell cultures. After a 3-h induction, the recombinant protein was overexpressed to a substantial level; at that time another sample (denoted T3) was taken and compared with sample T0 of the same strain. To optimize the resolution of the proteome, pH 4-7 IPG strips were employed to cover the range in which most proteins of E. coli focus isoelectrically. Overexpressed GST and double-tagged Neu5Ac-aldolase were seen in the gels, but double-tagged GlcNAc 2-epimerase was not detected because, for unknown reasons, it was not absorbed into the IEF strip. In total, about 900 protein spots were detected on 2-DE gels by MS-compatible silver staining. Three replicate runs were performed on the proteome of each E. coli strain. Figure 1 shows representative 2-DE images of the cellular proteome of E. coli cells harboring pGEX-2TK-nanA-5R and pGEX-2TK-2ep-5D before and after IPTG induction for 3 h. The representative 2-DE images for the host (E. coli BL21) and E. coli BL21 harboring pGEX-2TK before and after IPTG induction are shown in Additional file 2. The total number of differentially expressed spots appearing in at least one of the four strains was 293 at a significance level of p < 0.05; this number dropped to 136 when the significance level was set at p < 0.01.
Two small heat shock proteins, IbpA and IbpB, and six AmpC spots were excluded from these counts. Among the differentially expressed spots, 49 met both criteria (P < 0.05 and a fold-change of around two or greater) and correspond to 44 proteins, as shown in Table 1. Also, time courses of the expression levels of these differentially expressed proteins in the host strain and in E. coli BL21 harboring pGEX-2TK-2ep-5D are shown in Additional file 3.
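The two selection criteria can be sketched in code. This is only an illustration: the spot names and values are hypothetical, the cutoff of exactly 2.0 is an assumption standing in for "around two", and the statistical test that produced the p-values is not stated in the text, so the p-value is simply taken as an input.

```python
from dataclasses import dataclass


@dataclass
class Spot:
    spot_id: str       # hypothetical label, not a spot number from the gels
    p_value: float     # from whatever test was applied to the three replicates
    ratio: float       # normalized spot volume at T3 divided by that at T0


def fold_change(ratio: float) -> float:
    """Symmetric fold-change, so ratios of 0.5 and 2.0 both count as 2-fold."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1.0 else 1.0 / ratio


def classify(spot: Spot, alpha: float = 0.05, min_fold: float = 2.0) -> str:
    """Return 'up', 'down' or 'unchanged' for one spot.

    alpha and min_fold are assumed defaults mirroring P < 0.05 and
    a fold-change of about two.
    """
    if spot.p_value >= alpha or fold_change(spot.ratio) < min_fold:
        return "unchanged"
    return "up" if spot.ratio > 1.0 else "down"


# Hypothetical example spots
spots = [
    Spot("spot_a", 0.01, 2.5),
    Spot("spot_b", 0.01, 0.4),
    Spot("spot_c", 0.20, 3.0),
    Spot("spot_d", 0.01, 1.5),
]
calls = {s.spot_id: classify(s) for s in spots}
```

Treating the fold-change symmetrically keeps up- and down-regulated spots on the same footing, which matters here because most changes in the plasmid-harboring cells were decreases.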
Note that the leaky expression of recombinant proteins, i.e., protein expression without addition of IPTG, was minimal in the plasmid-harboring cells. Upon IPTG induction, the numbers of up- and down-regulated spots were comparable in the host strain. In the plasmid-harboring cells, however, the differentially expressed spots were overwhelmingly down-regulated by IPTG induction, suggesting that the overexpression of recombinant proteins caused an unusually low expression of cellular proteins. As shown in Table 1, most down-regulated proteins in the plasmid-harboring cells were also found to be down-regulated in the host strain. Proteins up-regulated by IPTG were mainly the chaperone GroEL, galactoside O-acetyltransferase (LacA) and α-galactosidase (MelA). These proteomic results are in good agreement with RNA-level gene expression data for E. coli cultured in LB medium and treated with IPTG, which showed a high-level induction of the lacZYA and melAB operons. In addition, the oligopeptide transport protein (OppA) and the dipeptide ABC transporter (DppA) were up-regulated, suggesting an unusual requirement for nutrients by the bacterium during IPTG induction. Similarly, strong up-regulation of the oligopeptide-binding protein OppA was found in a recombinant Bacillus megaterium strain overexpressing dextransucrase.
According to the study by Peng and Shimizu, protein abundance detected by 2-DE correlated well with enzyme activity in E. coli K12. Our proteomic analysis revealed that some enzymes, such as deoxyuridinetriphosphatase (Dut), 7-alpha-hydroxysteroid dehydrogenase (HdhA) and ketol-acid reductoisomerase (IlvC), showed reduced expression only in the plasmid-harboring cells, suggesting that cellular activities were blocked to some extent by the overproduction of plasmid-encoded proteins. The overproduction of recombinant proteins also caused a decrease in the level of a key enzyme (IlvC) involved in valine and isoleucine biosynthesis, similar to results from a microarray study of overproduction of the α-subunit of luciferase in E. coli. Furthermore, some proteins, such as the gluconeogenic enzyme NADP-dependent malic enzyme (MaeB) and the TCA cycle enzyme succinate dehydrogenase (SdhA), were significantly up-regulated after IPTG induction in plasmid-harboring cells. The up-regulation of the TCA cycle enzyme demonstrates the importance of the TCA cycle for the increased biosynthetic activity required by high-level protein synthesis. Elevated levels of SdhA were also found in prior reports on the overproduction of another recombinant protein (leptin).
For culturing E. coli, LB is a complex rich medium that likely contains glycolytic and gluconeogenic carbon sources. As LB is a gluconeogenic medium, its amino acids and other small metabolites feed directly into the Krebs (TCA) cycle. After IPTG induction, more gluconeogenesis takes place in order to generate glucose phosphate for the pentose phosphate pathway. The gluconeogenesis pathway, however, was quite different in host and plasmid-harboring cells. Figure 2 was drawn based on the differentially expressed proteins relating to glycolysis/gluconeogenesis, the TCA cycle, and the pentose phosphate pathway (PPP). In the host strain, gluconeogenesis most likely occurred via malate dehydrogenase (Mdh) and phosphoenolpyruvate carboxykinase (PckA) (gray arrows), while in the plasmid-harboring cells, gluconeogenesis proceeded mainly through MaeB (boldfaced arrows). The enzyme MaeB catalyzes the formation of pyruvate and CO2 from malate, with the concomitant generation of NADPH from NADP+. Pyruvate is further converted to phosphoenolpyruvate by the enzymatic action of phosphoenolpyruvate synthase (PpsA) in the gluconeogenesis pathway. In the alternative route, the formation of phosphoenolpyruvate from malate via Mdh and PckA is coupled to the reduction of NAD+ to NADH. These results suggest that plasmid-containing cells chose the gluconeogenesis pathway that generated more NADPH, which was needed by the cells for the overproduction of recombinant proteins. To our knowledge, this is the first report describing pathway alterations in recombinant E. coli grown on gluconeogenic media.
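For reference, the two routes from malate to phosphoenolpyruvate (PEP) can be written out as reaction equations. These are the standard textbook stoichiometries for the four enzymes, not equations taken from the paper; protons and water are shown only where they are conventionally written.

```latex
% Route favored in plasmid-harboring cells (MaeB, then PpsA)
\begin{align*}
\text{malate} + \text{NADP}^{+} &\xrightarrow{\text{MaeB}} \text{pyruvate} + \text{CO}_2 + \text{NADPH} \\
\text{pyruvate} + \text{ATP} + \text{H}_2\text{O} &\xrightarrow{\text{PpsA}} \text{PEP} + \text{AMP} + \text{P}_i
\end{align*}

% Route proposed for the host strain (Mdh, then PckA)
\begin{align*}
\text{malate} + \text{NAD}^{+} &\xrightarrow{\text{Mdh}} \text{oxaloacetate} + \text{NADH} + \text{H}^{+} \\
\text{oxaloacetate} + \text{ATP} &\xrightarrow{\text{PckA}} \text{PEP} + \text{CO}_2 + \text{ADP}
\end{align*}
```

Both routes release one CO2 per malate, but only the MaeB route yields NADPH. In the Mdh/PckA route NAD+ is reduced to NADH instead.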
Two proteins involved in steps connected to PPP, ribose-5-phosphate isomerase A (RpiA) and transketolase (TktA), were down-regulated in the plasmid-containing cells but up-regulated in the host. An RpiA level time course indicated that the PPP enzyme did increase initially in response to IPTG induction to provide NADPH for cell synthesis. However, after 1 h, the level of this PPP enzyme decreased with time and reached a low level after 3 h of induction. This behavior is similar to a previous finding on the expression of rpiA, a gene coding for RpiA, which was considerably lower in E. coli carrying a high copy number plasmid (like pGEX-2TK in the present work) relative to E. coli carrying a low copy number plasmid and plasmid-free E. coli. After a 3-h induction, the levels of the PPP-related enzymes TktA and phosphopentomutase (DeoB) decreased to a very low level in the recombinant protein overproduction strains. TktA is the key enzyme for the synthesis of aromatic amino acids and DeoB is in charge of the conversion between ribose-5-phosphate and ribose-1-phosphate. The decrease in the levels of these proteins reflects a slow-down of some cellular process.
Among proteins that were up-regulated in the host strain but down-regulated in plasmid-harboring cells, S-ribosylhomocysteinase (LuxS) is an enzyme indirectly related to protein synthesis. Through the action of LuxS on S-ribosylhomocysteine, the sulfur-containing amino acid homocysteine (Hcy) is produced in E. coli as the last intermediate in the methionine biosynthetic pathway. Hcy can compete with methionine and isoleucine for the binding sites of methionyl- and isoleucyl-tRNA synthetases. Higher expression of LuxS can thus lead to a higher concentration of Hcy and consequently become an obstacle to the synthesis of these tRNAs. In contrast to the increase in LuxS level in the host strain, the significant down-regulation of LuxS in all three plasmid-harboring bacteria relieved to some extent the inhibition of methionyl- and isoleucyl-tRNA synthesis and thereby favored protein overproduction.
Solubility of overexpressed recombinant proteins and up-regulation of chaperones/heat shock proteins
Two double-tagged fusion proteins, GST-Neu5Ac aldolase-5R (535 aa) and GST-GlcNAc 2-epimerase-5D (629 aa), with molecular masses of 59 kDa and 70 kDa, respectively, were overexpressed in E. coli BL21. Cell pellets were disrupted in a rather small volume of lysis buffer, and protein fractions were obtained by repeated extraction; the insoluble fraction was recovered by completely dissolving the aggregates in high concentrations of urea with added SDS and DTT. The expression profile was revealed by SDS-PAGE and ELISA (Figure 3). As shown in Figure 3, with a limited volume of lysis buffer, most soluble recombinant protein was recovered in the first extraction, and the protein concentration decreased gradually in the second and third extractions. The concentration of GST-Neu5Ac aldolase-5R in extractions 1, 2, 3, and P was 287.4, 81.5, 50.5, and 2289.7 μg/mL, respectively; the GST-GlcNAc 2-epimerase-5D concentration in those extractions was 538.5, 373.4, 190.5, and 2136.9 μg/mL, respectively. Fusion proteins collected in extractions 1, 2 and 3 were grouped together as the soluble fraction, whereas the fusion proteins recovered in extraction P were regarded as insoluble. The soluble percentage of GST-GlcNAc 2-epimerase-5D was 36.1 wt%, which was more than twice that of GST-Neu5Ac aldolase-5R (15.4 wt%). Our previous study showed that both GST-GlcNAc 2-epimerase-5D and GST-Neu5Ac aldolase-5R collected in P fractions were enzymatically active. Thus, a high proportion of recombinant protein possessing enzymatic activity was observed in the insoluble form in both bacteria.
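As a rough illustration of how a soluble percentage follows from such data, the sketch below converts the per-fraction concentrations into masses and takes the soluble share. The function name and the equal-volume default are assumptions: the volumes of the individual fractions are not given in the text, and the reported wt% values presumably reflect the actual volumes, so the equal-volume result need not reproduce them exactly.

```python
def soluble_percentage(soluble_concs, insoluble_conc,
                       soluble_vols=None, insoluble_vol=1.0):
    """Soluble share (wt%) of a recombinant protein.

    soluble_concs  : concentrations (ug/mL) in extractions 1, 2, 3
    insoluble_conc : concentration (ug/mL) in extraction P
    soluble_vols   : volume (mL) of each soluble extract; defaults to
                     1.0 each, i.e. equal volumes (an assumption)
    insoluble_vol  : volume (mL) of extraction P (assumed 1.0)
    """
    if soluble_vols is None:
        soluble_vols = [1.0] * len(soluble_concs)
    if len(soluble_vols) != len(soluble_concs):
        raise ValueError("one volume is needed per soluble extract")
    soluble_mass = sum(c * v for c, v in zip(soluble_concs, soluble_vols))
    insoluble_mass = insoluble_conc * insoluble_vol
    total_mass = soluble_mass + insoluble_mass
    if total_mass <= 0:
        raise ValueError("total protein mass must be positive")
    return 100.0 * soluble_mass / total_mass


# ELISA concentrations quoted in the text (ug/mL)
aldolase_5r = soluble_percentage([287.4, 81.5, 50.5], 2289.7)
epimerase_5d = soluble_percentage([538.5, 373.4, 190.5], 2136.9)
```

Passing the measured fraction volumes through `soluble_vols` and `insoluble_vol` would give the mass-based percentage directly.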
The solubility of overexpressed proteins correlated well with the expression level of cellular chaperones/heat shock proteins. Figure 4 shows the expression level of five chaperones/heat shock proteins, GroEL, ClpB, HslU, IbpA and IbpB, in the host strain and recombinant plasmid-harboring cells after a 3-h induction with IPTG. Comparing the 2-DE maps of cellular proteins for pGEX-2TK-nanA-5R-harboring E. coli BL21 and pGEX-2TK-2ep-5D-harboring E. coli BL21 harvested at 0 h and 3 h post-induction without IPTG (Figure 1 and Additional file 2), we found that the expression levels of these heat shock proteins (ClpB, GroEL, IbpA, IbpB and HslU) at T3 were almost identical to those at T0 in the recombinant plasmid-harboring cells. The changes in the expression level of heat shock proteins in the presence of IPTG were thus mainly due to the overexpression of recombinant proteins. According to a proposed mechanism, disaggregation of overexpressed protein in E. coli is carried out by a network of ATPase chaperones consisting of a DnaK core assisted by the cochaperones DnaJ, GrpE, ClpB and GroEL-GroES. ClpB plays a starting role in the sequential mechanism of disaggregation by interacting with aggregates. In the final step of the disaggregation process, GroEL and DnaK complete the refolding of solubilized polypeptide chains into native proteins. Our results indicate that after the induction of plasmid-encoded proteins for overexpression, three chaperone proteins (ClpB, GroEL and HslU) were up-regulated, while the change in these protein levels in the host strain was not significant. However, the degree of up-regulation of these proteins differed among the three plasmid-containing strains. The intensity of up-regulation followed this order: pGEX-2TK-2ep-5D- > pGEX-2TK- > pGEX-2TK-nanA-5R-harboring cells.
ClpB was significantly up-regulated in the pGEX-2TK-2ep-5D-harboring cells because of the formation of protein aggregates during the overproduction of recombinant protein, but the alteration of ClpB level in the cells harboring pGEX-2TK was insignificant since the plasmid-encoded protein (GST) was almost fully soluble. Similarly to ClpB expression, the level of GroEL in pGEX-2TK-2ep-5D-harboring cells was also significantly up-regulated. These results could well explain why the overexpressed fusion protein GST-GlcNAc 2-epimerase-5D was more soluble than the double-tagged fusion protein GST-Neu5Ac aldolase-5R. Likewise, the up-regulation of GroEL in pGEX-2TK cells could help the solubilization of plasmid-encoded protein GST.
The expression of HslU in recombinant bacteria in response to IPTG induction was very similar to that of ClpB. Both HslU (also named ClpY) and ClpB are members of the Clp/Hsp100 chaperone family. HslU is a chaperone subunit of a proteasome-like degradation complex. It typically forms a complex with HslV which functions as an ATP-dependent protease . The HslV portion functions as a protease and the HslU is an ATPase. The complex directs ATP-dependent deoligomerization and degradation of substrate proteins. Like ClpB, the promoter for HslU-HslV is also recognized by σ32. HslU was induced to express with ClpB for the disaggregation of inclusion bodies produced in plasmid-harboring cells, especially in the pGEX-2TK-2ep-5D harboring cells.
Two small heat shock proteins, IbpA and IbpB, were produced only with the overexpression of plasmid-encoded proteins, and their expression levels increased with induction time. IbpA/IbpB are tightly associated with inclusion bodies formed during heterologous protein production in E. coli cells. In protein aggregates, IbpA/IbpB bind partially folded proteins until the disaggregating chaperone ClpB becomes available. Since the pGEX-2TK-encoded protein GST was almost fully soluble, ClpB was not up-regulated by the overexpression of this heterologous protein. The expression of IbpA/IbpB in the GST-overexpressing cells was thus at a relatively low level in comparison with the cells overexpressing GST-GlcNAc 2-epimerase-5D. In the GST-overexpressing cells, IbpA and IbpB were induced to bind the nascent polypeptides and prevent them from aggregating.
Neu5Ac aldolase and GlcNAc 2-epimerase were expressed here as double-tagged proteins. The GST tag fused to the N-terminus of the protein of interest allows the fusion protein to be purified to near homogeneity by an affinity method. However, the tag may alter protein conformation or affect biologically important functions. Since a linker sequence and a protease cleavage site are built between the tag and the target protein within the expression vector, these shortcomings can be overcome and the tag can be removed after purification if necessary. In the present study, both GST and the polyionic tag (5D or 5R) were soluble, but the overexpressed double-tagged fusion proteins were only partially soluble, as shown in Figure 3. The addition of the polyionic tag did not influence the expression pattern of the fusion protein. As shown in Additional file 4, the solubility of GST-fused Neu5Ac aldolase was not changed by the presence of the 5R tag. Even with GST as a soluble partner, the insoluble fractions of these two recombinant proteins made up a large share (more than 60%) of the total recombinant protein. This result suggests that recombinant proteins exist in different conformational states ranging from insoluble forms to soluble forms, which is in accordance with previous observations. In particular, GST-Neu5Ac aldolase-5R had a very small amount of soluble recombinant protein in the second and third extractions. Compared with the expression profile of heat shock proteins in E. coli BL21 producing GST-Neu5Ac aldolase-5R, all of these proteins reached a higher level in the strain producing GST-GlcNAc 2-epimerase-5D, which also yielded more protein in the first, second, and third extractions. GST, the gene product of the blank plasmid (pGEX-2TK), was almost totally soluble (Figure 3), and just a trace of protein could be detected in the second, third and P fractions. The levels of heat shock proteins expressed in E. coli BL21 producing GST (Figure 4) show that GroEL was induced to an extent close to that in the GST-GlcNAc 2-epimerase-5D-producing cells, whereas ClpB and IbpA were expressed at a level as low as in the GST-Neu5Ac aldolase-5R-producing cells. Taken together, these results indicate that expressed GST (in pGEX-2TK-harboring cells), as a solubility enhancer, mainly exists in native states with the help of GroEL, and that triggering disaggregating chaperones and small heat shock proteins is not crucial when only a small amount of insoluble protein is present.
Expression level of heat shock proteins and sigma factor 32
Heat shock proteins IbpA and IbpB, HslU, and ClpB are all σ32-dependent heat shock proteins, whereas GroEL can bind σ32 to regulate its activity . The σ32 in E. coli, encoded by the rpoH gene, is a transcription factor enabling RNA polymerase to recognize the promoter of heat shock proteins. The expression levels of σ32 in the recombinant strains as determined by western blot are compared in Figure 5. The results indicate that the expression levels of σ32 just before induction were comparable in all strains. In the following three hours of IPTG induction, σ32 reached a maximum (1.67 AU) at 1 h and decreased to 1.08 AU at 3 h post-induction in the GST-Neu5Ac aldolase-5R-expressing strain. Because of this decrease in σ32 level, the accumulation of aggregated GST-Neu5Ac aldolase-5R seemed not to induce the heat shock proteins essential for the disaggregation process in this study.
In the GST-GlcNAc 2-epimerase-5D expressing cells, the σ32 level increased gradually to 1.8 AU (arbitrary unit) and remained at the higher value. This explains why the proteins IbpA, IbpB, HslU and ClpB were all up-regulated after a 3-h induction. The up-regulation of these heat shock proteins due to the sustained expression of σ32 promotes the solubility of recombinant proteins and helps rescue their active conformation. On the other hand, the response of σ32 was relatively lower in E. coli BL21 harboring pGEX-2TK, which was in agreement with the expression level of the heat shock proteins. The expression profile of σ32 in E. coli producing GST-Neu5Ac aldolase-5R was similar to that in cells carrying plasmid pGEX-2TK, exhibiting a tendency to decline after the first hour of induction, leading to a lower chaperone/heat shock protein expression level. Thus, the aggregated GST-Neu5Ac aldolase-5R proteins were still trapped in the insoluble fraction under conditions of low heat shock response.
Although amplification of the genes encoding IbpA and IbpB could enhance the production of recombinant proteins, overexpression of IbpA and IbpB resulted in more aggregate (inclusion bodies) than soluble protein. A previous study revealed that co-expression of heterologous proteins with a four-chaperone system (GroEL-GroES, DnaK-DnaJ-GrpE, ClpB and IbpA/IbpB) led to a remarkable increase in the solubility of various recombinant proteins. These results suggest that these σ32-regulated chaperones work together in a cooperative way. In many cases, bacterial inclusion bodies are formed by highly functional enzymatic forms and the solubility of the expressed proteins does not parallel conformational quality. This is also true in the present work with the overexpression of double-tagged fusion proteins of GlcNAc 2-epimerase and Neu5Ac aldolase. Because the protease-based heat shock protein HslU was up-regulated along with ClpB, we agree with the conclusion of Garcia-Fruitos et al. that the E. coli quality control system (composed of cytosolic chaperones and proteases) promotes protein solubility instead of conformational quality through over-committed proteolysis of aggregation-prone polypeptides, irrespective of their conformational status and biological properties. In summary, maintaining the expression of σ32 at a high level could be a useful strategy for promoting the solubility of overexpressed proteins and misfolded polypeptides that restore their biological activity to some extent via enhanced expression of cytosolic chaperones.
Proteome profiles in host and recombinant strains were altered in the presence of the gene expression inducer IPTG. The induction for overexpression of plasmid-encoded proteins caused generally low expression of cellular proteins and down-regulation of enzymes in charge of steroid and amino acid synthesis. Proteins related to the pentose phosphate pathway were also down-regulated in plasmid-harboring cells. Most interestingly, different expression patterns of proteins in the TCA cycle and the gluconeogenesis pathway were found between host and recombinant cells. When cultured in LB medium, host cells likely underwent gluconeogenesis via Mdh and PckA, whereas in the plasmid-harboring cells, gluconeogenesis occurred mainly through MaeB, coupling it to the generation of NADPH for cell biosynthesis. Also, the homocysteine-producing enzyme LuxS, which at higher levels can block tRNA synthesis, was down-regulated upon the overexpression of plasmid-encoded proteins. Even when the recombinant proteins were overexpressed with soluble tags (such as GST and the polyionic peptide 5R or 5D), most of the overexpressed protein was in the insoluble form. After IPTG induction, chaperones/heat shock proteins, including ClpB, HslU, GroEL, IbpA and IbpB, were up-regulated to varying extents among the strains. The disaggregation chaperones ClpB and HslU were significantly up-regulated for the dissolution of inclusion bodies produced in the pGEX-2TK-2ep-5D-harboring cells. The up-regulation of ClpB was insignificant in the pGEX-2TK-harboring cells since overexpressed GST was fully soluble. In contrast to the pGEX-2TK-2ep-5D-harboring strain, the up-regulation of ClpB and other chaperones/heat shock proteins in the pGEX-2TK-nanA-5R-harboring strain was relatively insignificant, and the soluble fraction of overexpressed protein was lower in the latter. The solubility of overexpressed protein thus correlated well with the expression of these chaperones/heat shock proteins.
Furthermore, the expression of soluble protein was enhanced by the up-regulation of ClpB, HslU, IbpA, and IbpB under control of σ32-recognized promoters in the plasmid-harboring E. coli strains. In the double-tagged GlcNAc 2-epimerase-expressing cells, the expression of σ32 remained at a higher level and even increased slightly with induction time, while the σ32 level in double-tagged Neu5Ac aldolase-expressing cells dropped after reaching a peak value at 1 h post-induction. Sustained expression of σ32 at a higher level during the overexpression of recombinant proteins could be crucial to promoting the solubility of overexpressed proteins.
Bacterial strains, plasmids and growth conditions
The bacteria used in this study are the host E. coli BL21 and three plasmid-harboring E. coli BL21 strains that respectively overexpress GST, GST-Neu5Ac-aldolase-(arginine)5, and GST-GlcNAc 2-epimerase-(aspartate)5. The genes for Neu5Ac aldolase and GlcNAc 2-epimerase were from E. coli K12 and Synechocystis sp. strain PCC6803, respectively. Flask cultures (400 mL of culture in each 1-L flask) were carried out at 28°C and 150 rpm in LB medium (1% tryptone, 1% NaCl and 0.5% yeast extract). Ampicillin was added at a final concentration of 100 μg/mL to cultures of strains harboring plasmids. Cell growth was monitored by measuring OD600 using a Beckman DU-640 spectrophotometer (Beckman, Fullerton, CA). Upon reaching OD600 ~ 0.8, induction of recombinant proteins was started by adding IPTG at a final concentration of 1 mM and cells were further cultivated for three hours. Cells were then harvested by centrifugation (5857 × g, 15 min, 4°C) and the resulting pellets were stored at -20°C until further use.
SDS-PAGE and solubility analysis
Collected cell pellets were resuspended in a lysis buffer containing 100 mM NaCl, 50 mM Na2HPO4, 0.1 mM EDTA, 10 mM β-mercaptoethanol, 0.2% Triton X-100, 25 μg/L PMSF and 40 μg/L lysozyme at a 1:33 volume ratio of lysis buffer to bacterial broth. Cells were lysed with an XL-2020 probe sonicator (Misonix) in an ice bath for 15 min and then centrifuged at 7650 × g for 15 min at 4°C. The supernatant was collected as the first protein extract and the precipitate was treated twice by the same procedure to obtain the second and third protein extracts. The remaining aggregates were dissolved in 8 M urea, 4% SDS and 1% DTT and centrifuged at 12,000 × g for 2 min at room temperature. The resulting supernatant was denoted as extraction P. All protein extracts were stored at -80°C in aliquots. SDS-PAGE was carried out on a 10% gel and bands were visualized by Coomassie blue staining. The concentration of GST-fused protein in each fraction was determined by using a One-Step ELISA™ GST Detection kit (GenScript, Piscataway, NJ).
The activity of GST-GlcNAc 2-epimerase-5D was assayed based on the formation of N-acetyl-D-mannosamine (ManNAc) from N-acetyl-D-glucosamine (GlcNAc). The protein preparation was incubated with 1 mL of assay solution containing 100 mM GlcNAc, 10 mM MgCl2 and 5 mM ATP in 100 mM Tris-HCl buffer (pH 7.5) for 10 min at 37°C. The reaction was then stopped by heating and the amount of ManNAc produced was determined by HPLC using the Aminex-87H column. The activity of GST-Neu5Ac aldolase-5R was determined from the decrease in its substrate, Neu5Ac. The protein preparation was incubated with 1 mL of Tris-HCl buffer (100 mM, pH 7.5) containing 20 mM Neu5Ac for 10 min at 37°C. The reaction was then stopped by heating and the amount of Neu5Ac consumed was determined by HPLC using the Aminex-87H column.
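As a minimal sketch of how such HPLC readings might be converted into an activity value, the code below expresses activity as micromoles of product formed (or substrate consumed) per minute. The unit definition, the function name and the example concentrations are assumptions for illustration, and the reaction volume is taken as 1 mL, ignoring the volume of the added protein preparation.

```python
def activity_umol_per_min(delta_conc_mM, reaction_volume_ml, minutes):
    """Activity as micromoles converted per minute.

    mM x mL equals micromoles, so no extra conversion factor is needed.
    """
    if minutes <= 0:
        raise ValueError("reaction time must be positive")
    return delta_conc_mM * reaction_volume_ml / minutes


# Hypothetical HPLC readings, not data from the study.
# Epimerase: 2.0 mM ManNAc formed in 1 mL over 10 min.
epimerase_activity = activity_umol_per_min(2.0, 1.0, 10.0)
# Aldolase: Neu5Ac fell from 20.0 mM to 17.5 mM in 1 mL over 10 min.
aldolase_activity = activity_umol_per_min(20.0 - 17.5, 1.0, 10.0)
print(epimerase_activity, aldolase_activity)
```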
2-DE and image analysis
Cell pellets were washed four times in low salt washing buffer (3 mM KCl, 1.5 mM KH2PO4, 68 mM NaCl and 9 mM NaH2PO4). Washed pellets were resuspended in lysis buffer consisting of 8 M urea, 4% CHAPS and 1% DTT and then disrupted by short bursts of sonication. The cell lysate was clarified by centrifugation (13,000 × g, 30 min, 15°C) and the clear supernatant was stored at -80°C in aliquots. Protein quantification was performed using BioRad protein assay reagent and bovine serum albumin was used as the protein standard.
A 40 μg aliquot of protein extract from host cells (or 48 μg of protein extract from recombinant protein expressing cells) was mixed with rehydration buffer (8 M urea, 2% CHAPS, 1% DTT and 0.2% BioLyte 3-10) to a final volume of 340 μL and then loaded onto a pH 4-7, 18-cm Immobiline DryStrip (GE Healthcare, Fairfield, CT). The loaded strip was rehydrated in-gel for 16 h at 20°C under 50 V. Proteins underwent isoelectric focusing in a Protean IEF Cell (BioRad, Hercules, CA) programmed as follows: 300 V for 1 h; 1000 V for 1 h; 1000 to 8000 V within 3 h; and then held at 8000 V until a total of 65 kVh was reached. After isoelectric focusing, strips were equilibrated sequentially in two equilibration buffers containing 2% (w/v) DTT and 2.5% (w/v) iodoacetamide, respectively, for 15 min. Electrophoresis in the second dimension was carried out on a 12.5% SDS-polyacrylamide gel in a Protean II xi Cell (BioRad, Hercules, CA). Three replicates were performed in this study. Gels were then stained as described, with the modification that the concentration of silver nitrate was reduced to 0.2% (w/v).
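The focusing program above can be checked with a short volt-hour calculation. This is only a sketch under two stated assumptions: the 1000 to 8000 V step is treated as a linear ramp, and the 65 kVh target is taken to count every focusing step (rehydration at 50 V is excluded). The function name is hypothetical.

```python
def ief_schedule(total_kvh=65.0):
    """Volt-hour bookkeeping for the focusing program.

    Assumptions: the 1000 -> 8000 V step is a linear ramp, and the total
    kVh target includes all focusing steps but not rehydration.
    """
    steps_kvh = [
        300 * 1 / 1000.0,                # 300 V for 1 h
        1000 * 1 / 1000.0,               # 1000 V for 1 h
        (1000 + 8000) / 2 * 3 / 1000.0,  # 3-h linear ramp at its mean voltage
    ]
    accumulated_kvh = sum(steps_kvh)
    remaining_kvh = total_kvh - accumulated_kvh
    hold_hours = remaining_kvh / 8.0     # final hold at 8000 V (8 kV)
    return accumulated_kvh, hold_hours


accumulated_kvh, hold_hours = ief_schedule()
total_hours = 1 + 1 + 3 + hold_hours
print(f"before hold: {accumulated_kvh:.1f} kVh, "
      f"hold: {hold_hours:.2f} h, total focusing: {total_hours:.2f} h")
```

Under these assumptions the first three steps contribute about 14.8 kVh, so the final hold at 8000 V would last a little over 6 h.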
Stained gels were scanned on an ImageScanner densitometer (GE Healthcare, Fairfield, CT) at 300 dpi resolution with a blue filter and images were analyzed with ImageMaster 2D Platinum software (version 5, GE Healthcare, Fairfield, CT). For each protein spot, the spot outline was determined by setting the following parameters in the spot detection function of the software: smooth, 4; min area, 9; saliency, 300. Once the outline had been determined, the software computed the spot feature area and spot volume automatically. The spot volume is defined as the integral of the image intensity over the feature area of one spot. The relative spot volume (% volume) was calculated as the volume of one spot divided by the sum of the volumes of all spots in a gel. Differences in % volume for each spot between groups were evaluated using t-tests. All statistical analyses were performed using SPSS software. Protein spots showing high differential expression and with p < 0.05 were given priority for identification.
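The % volume normalization and the between-group comparison can be sketched as follows. The analysis in this study was done in ImageMaster and SPSS; the code is only an illustration, the spot names and values are made up, and the choice of a pooled-variance (Student's) two-sample t statistic is an assumption, since the type of t-test is not specified.

```python
from math import sqrt
from statistics import mean, variance


def percent_volumes(spot_volumes):
    """Relative spot volume: each spot volume over the gel's total, in %."""
    total = sum(spot_volumes.values())
    return {spot: 100.0 * v / total for spot, v in spot_volumes.items()}


def student_t(group_a, group_b):
    """Two-sample t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical spot volumes for one gel.
gel = {"spot_1": 50.0, "spot_2": 30.0, "spot_3": 20.0}
pct = percent_volumes(gel)

# Hypothetical % volumes of one spot in three replicate gels per group.
host = [1.0, 1.2, 1.1]
recombinant = [0.5, 0.6, 0.4]
t_stat, dof = student_t(host, recombinant)

# Two-sided 5% critical value of the t distribution for 4 degrees of freedom.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4
print(pct, round(t_stat, 2), dof, significant)
```

With three replicate gels per group there are four degrees of freedom, so a spot is flagged at p < 0.05 only when the absolute t statistic exceeds 2.776.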
Protein spots of interest were excised from silver-stained gels and subjected to in-gel digestion as described previously. A home-made StageTip was used to remove salts from the extracted solution. The eluate was evaporated to dryness under vacuum and stored at -20°C for further analysis.
Protein identification by MALDI MS and MS/MS
The dried sample was resuspended in 50% acetonitrile and 0.1% formic acid and then mixed 1:1 with matrix solution consisting of 5 mg/mL α-cyano-4-hydroxycinnamic acid (CHCA) in 50% acetonitrile, 0.1% v/v TFA and 2% w/v ammonium citrate. The mixture was spotted onto the 96-well format MALDI sample plate. Data-directed acquisition on the Q-TOF Ultima™ MALDI instrument was fully automated. Within each well, all parent ions meeting the predefined criteria (any peak within the m/z 800-3000 range with intensity above 10 counts ± include/exclude list) were selected for CID MS/MS using argon as the collision gas and a mass-dependent ± 5 V rolling collision energy. The instrument was externally calibrated to less than 5 ppm accuracy over the mass range of m/z 800-3000 and further adjusted with Glu-Fibrinopeptide B as the near-point lock mass calibrant during data processing. MS and MS/MS survey data were processed using Micromass ProteinLynx™ Global Server (PGS) 2.0 data processing software. The output ".txt" files for peptide mass fingerprinting (PMF) and ".pkl" files for peptide fragment fingerprinting (PPF) were searched against the NCBI database using the Mascot program.
Western blot analysis
One mL of broth was collected every hour from the time of induction and centrifuged at 7267 × g to obtain the cell pellet. The pellet was resuspended in SDS sample buffer with a volume equivalent to the value of OD600 × 0.8 mL and the mixture was boiled for 10 min. Equal volumes of the clarified supernatants after centrifugation at 10,464 × g were loaded onto a 12.5% SDS-polyacrylamide gel. After SDS-PAGE, proteins were blotted onto PVDF membranes (GE Healthcare, Fairfield, CT) following the Complete Mini-Genie Blotter (Idea Scientific Company, Minneapolis, MN) instructions. The PVDF membrane was blocked in 5% nonfat dry milk in TBST (20 mM Tris-HCl, 500 mM NaCl and 0.05% Tween 20) for 2 h at room temperature. The washed blot was then incubated with a 1:1000 dilution of anti-E. coli σ32 factor monoclonal antibody (NeoClone, Madison, WI) overnight at 4°C. HRP-conjugated rabbit anti-mouse IgG polyclonal antibody (Abcam, Cambridge, MA) was used as the secondary antibody at a dilution of 1:3000 and incubated with the blot for 1 h at room temperature. Bands of σ32 factor were detected using Novex® ECL Chemiluminescent Reagent Kit (Invitrogen, Carlsbad, CA). Developed films were scanned and analyzed as described in the 2-DE and image analysis section. On every blotting membrane, the same protein mixture was loaded in a separate lane alongside the samples as the inter-membrane control. Image intensities of the anti-σ32 bands on each developed film were normalized to that of this identical control on the same film.
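The two normalization steps in this procedure, scaling the sample buffer volume to cell density and dividing each σ32 band intensity by the inter-membrane control on the same film, can be sketched as follows. The function names and the intensity values are hypothetical and serve only to illustrate the arithmetic.

```python
def sample_buffer_volume_ml(od600, factor_ml=0.8):
    """SDS sample buffer volume scaled to cell density (OD600 x 0.8 mL)."""
    return od600 * factor_ml


def normalize_bands(band_intensities, control_intensity):
    """Divide each sigma-32 band intensity by the same film's control band."""
    if control_intensity <= 0:
        raise ValueError("control intensity must be positive")
    return [b / control_intensity for b in band_intensities]


# Hypothetical densitometry readings: (sample bands, control band) per film.
films = {
    "film_1": ([1200.0, 1800.0], 600.0),
    "film_2": ([900.0, 450.0], 300.0),
}
normalized = {
    name: normalize_bands(bands, control)
    for name, (bands, control) in films.items()
}
buffer_ml = sample_buffer_volume_ml(1.5)
print(normalized, round(buffer_ml, 2))
```

Because every film carries the same control mixture, the normalized values can be compared across membranes even when exposure differs between films.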
Villaverde A, Mar Carrió M: Protein aggregation in recombinant bacteria: biological role of inclusion bodies. Biotechnol Lett. 2003, 25: 1385-1395. 10.1023/A:1025024104862.
Fahnert B, Lilie H, Neubauer P: Inclusion bodies: formation and utilisation. Adv Biochem Engin/Biotechnol. 2004, 89: 93-142.
Davis GD, Elisee C, Newham DM, Harrison RG: New fusion protein systems designed to give soluble expression in Escherichia coli. Biotechnol Bioeng. 1999, 65: 382-388. 10.1002/(SICI)1097-0290(19991120)65:4<382::AID-BIT2>3.0.CO;2-I.
Dümmler A, Lawrence A, de Marco A: Simplified screening for the detection of soluble fusion constructs expressed in E. coli using a modular set of vectors. Microb Cell Fact. 2005, 4: 34- 10.1186/1475-2859-4-34.
Wang T, Chen Y, Pan H, Wang F, Cheng C, Lee W: Production of N-acetyl-D-neuraminic acid using two sequential enzymes overexpressed as double-tagged fusion proteins. BMC Biotechnol. 2009, 9: 63- 10.1186/1472-6750-9-63.
Parsell DA, Sauer RT: Induction of a heat shock-like response by unfolded protein in Escherichia coli: dependence on protein level not protein degradation. Genes & Development. 1989, 3: 1226-1232.
Kosinski MJ, Rinas U, Bailey JE: Isopropyl-β-d-thiogalactopyranoside influences the metabolism of Escherichia coli. Appl Microbiol Biotechnol. 1992, 36: 782-784. 10.1007/BF00172194.
Sørensen HP, Mortensen KK: Advanced genetic strategies for recombinant protein expression in Escherichia coli. J Biotechnol. 2005, 115: 113-128. 10.1016/j.jbiotec.2004.08.004.
de Marco A, Deuerling E, Mogk A, Tomoyasu T, Bukau B: Chaperone-based procedure to increase yields of soluble recombinant proteins produced in E. coli. BMC Biotechnol. 2007, 7: 32- 10.1186/1472-6750-7-32.
García-Fruitós E, Martínez-Alonso M, Gonzàlez-Montalbán N, Valli M, Mattanovich D, Villaverde A: Divergent genetic control of protein solubility and conformational quality in Escherichia coli. J Mol Biol. 2007, 374: 195-205. 10.1016/j.jmb.2007.09.004.
Han M, Lee SY: The Escherichia coli proteome: past, present, and future prospects. Microbiol Mol Biol Rev. 2006, 70: 362-439. 10.1128/MMBR.00036-05.
Lee PS, Lee KH: Escherichia coli--a model system that benefits from and contributes to the evolution of proteomics. Biotechnol Bioeng. 2003, 84: 801-814. 10.1002/bit.10848.
Champion KM, Nishihara JC, Aldor IS, Moreno GT, Andersen D, Stults KL, Vanderlaan M: Comparison of the Escherichia coli proteomes for recombinant human growth hormone producing and nonproducing fermentations. Proteomics. 2003, 3: 1365-1373. 10.1002/pmic.200300430.
Jürgen B, Lin HY, Riemschneider S, Scharf C, Neubauer P, Schmid R, Hecker M, Schweder T: Monitoring of genes that respond to overproduction of an insoluble recombinant protein in Escherichia coli glucose-limited fed-batch fermentations. Biotechnol Bioeng. 2000, 70: 217-224. 10.1002/1097-0290(20001020)70:2<217::AID-BIT11>3.0.CO;2-W.
Wang Y, Wu S, Hancock WS, Trala R, Kessler M, Taylor AH, Patel PS, Aon JC: Proteomic profiling of Escherichia coli proteins under high cell density fed-batch cultivation with overexpression of phosphogluconolactonase. Biotechnol Prog. 2005, 21: 1401-1411. 10.1021/bp050048m.
Aldor IS, Krawitz DC, Forrest W, Chen C, Nishihara JC, Joly JC, Champion KM: Proteomic profiling of recombinant Escherichia coli in high-cell-density fermentations for improved production of an antibody fragment biopharmaceutical. Appl Environ Microbiol. 2005, 71: 1717-1728. 10.1128/AEM.71.4.1717-1728.2005.
Han M, Jeong KJ, Yoo J, Lee SY: Engineering Escherichia coli for increased productivity of serine-rich proteins based on proteome profiling. Appl Environ Microbiol. 2003, 69: 5772-5781. 10.1128/AEM.69.10.5772-5781.2003.
Han M, Park SJ, Park TJ, Lee SY: Roles and applications of small heat shock proteins in the production of recombinant proteins in Escherichia coli. Biotechnol Bioeng. 2004, 88: 426-436. 10.1002/bit.20227.
Richmond CS, Glasner JD, Mau R, Jin H, Blattner FR: Genome-wide expression profiling in Escherichia coli K-12. Nucleic Acids Res. 1999, 27: 3821-3835. 10.1093/nar/27.19.3821.
Wang W, Hollmann R, Fürch T, Nimtz M, Malten M, Jahn D, Deckwer WD: Proteome analysis of a recombinant Bacillus megaterium strain during heterologous production of a glucosyltransferase. Proteome Sci. 2005, 3: 4- 10.1186/1477-5956-3-4.
Peng L, Shimizu K: Global metabolic regulation analysis for Escherichia coli K12 based on protein expression by 2-dimensional electrophoresis and enzyme activity measurement. Appl Microbiol Biotechnol. 2003, 61: 163-178.
Oh M, Liao JC: DNA microarray detection of metabolic responses to protein overproduction in Escherichia coli. Metab Eng. 2000, 2: 201-209. 10.1006/mben.2000.0149.
Jannière L, Canceill D, Suski C, Kanga S, Dalmais B, Lestini R, Monnier A, Chapuis J, Bolotin A, Titok M, Le Chatelier E, Ehrlich SD: Genetic evidence for a link between glycolysis and DNA replication. PLoS ONE. 2007, 2: e447- 10.1371/journal.pone.0000447.
Tomenius H, Pernestig A, Jonas K, Georgellis D, Möllby R, Normark S, Melefors O: The Escherichia coli BarA-UvrY two-component system is a virulence determinant in the urinary tract. BMC Microbiol. 2006, 6: 27- 10.1186/1471-2180-6-27.
Wang Z, Xiang L, Shao J, Wegrzyn A, Wegrzyn G: Effects of the presence of ColE1 plasmid DNA in Escherichia coli on the host cell metabolism. Microb Cell Fact. 2006, 5: 34- 10.1186/1475-2859-5-34.
Tuite NL, Fraser KR, O'byrne CP: Homocysteine toxicity in Escherichia coli is caused by a perturbation of branched-chain amino acid biosynthesis. J Bacteriol. 2005, 187: 4362-4371. 10.1128/JB.187.13.4362-4371.2005.
Jakubowski H, Fersht AR: Alternative pathways for editing non-cognate amino acids by aminoacyl-tRNA synthetases. Nucleic Acids Res. 1981, 9: 3105-3117. 10.1093/nar/9.13.3105.
Ben-Zvi AP, Goloubinoff P: Review: mechanisms of disaggregation and refolding of stable protein aggregates by molecular chaperones. J Struct Biol. 2001, 135: 84-93. 10.1006/jsbi.2001.4352.
Rohrwild M, Coux O, Huang HC, Moerschell RP, Yoo SJ, Seol JH, Chung CH, Goldberg AL: HslV-HslU: A novel ATP-dependent protease complex in Escherichia coli related to the eukaryotic proteasome. Proc Natl Acad Sci USA. 1996, 93: 5808-5813. 10.1073/pnas.93.12.5808.
Allen SP, Polazzi JO, Gierse JK, Easton AM: Two novel heat shock genes encoding proteins produced in response to heterologous protein expression in Escherichia coli. J Bacteriol. 1992, 174: 6938-6947.
Moreno-Bruna B, Baroja-Fernández E, Muñoz FJ, Bastarrica-Berasategui A, Zandueta-Criado A, Rodríguez-López M, Lasa I, Akazawa T, Pozueta-Romero J: Adenosine diphosphate sugar pyrophosphatase prevents glycogen biosynthesis in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America. 2001, 98: 8128-8132. 10.1073/pnas.131214098.
Ventura S, Villaverde A: Protein quality in bacterial inclusion bodies. Trends Biotechnol. 2006, 24: 179-185. 10.1016/j.tibtech.2006.02.007.
Guisbert E, Herman C, Lu CZ, Gross CA: A chaperone network controls the heat shock response in E. coli. Genes Dev. 2004, 18: 2812-2821. 10.1101/gad.1219204.
Yan JX, Wait R, Berkelman T, Harry RA, Westbrook JA, Wheeler CH, Dunn MJ: A modified silver staining protocol for visualization of proteins compatible with matrix-assisted laser desorption/ionization and electrospray ionization-mass spectrometry. Electrophoresis. 2000, 21: 3666-3672. 10.1002/1522-2683(200011)21:17<3666::AID-ELPS3666>3.0.CO;2-6.
Mishra A, Cheng C, Lee W, Tsai L: Proteomic changes in the hypothalamus and retroperitoneal fat from male F344 rats subjected to repeated light-dark shifts. Proteomics. 2009, 9: 4017-4028. 10.1002/pmic.200800813.
Shevchenko A, Tomas H, Havlis J, Olsen JV, Mann M: In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat Protoc. 2006, 1: 2856-2860. 10.1038/nprot.2006.468.
Rappsilber J, Mann M, Ishihama Y: Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat Protoc. 2007, 2: 1896-1906. 10.1038/nprot.2007.261.
Proteomic mass spectrometry analyses were performed by the Core Facilities for Proteomics Research located at the Institute of Biological Chemistry, Academia Sinica (Taiwan). The authors acknowledge financial support received from the National Science Council (NSC 93-2214-E-194-002) and the Ministry of Economic Affairs (98-EC-17-A-13-S1-116) of the Republic of China (Taiwan).
The authors declare that they have no competing interests.
CHC carried out all experimental work. WCL was in charge of planning and conducting the study and writing the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Supplemental Figure 1: Growth curves of host and recombinant E. coli BL21. Bacteria were cultivated in LB medium at 28°C. The arrow indicates the addition of IPTG. (PDF 78 KB)
Additional file 2: Supplemental Figure 2: Representative 2-DE gels of E. coli BL21 host and plasmid-bearing strains. Two-dimensional electrophoresis (2-DE) of the lysate of E. coli BL21 cells (A, B) and pGEX-2TK-harboring (C, D) E. coli BL21 cells harvested before (A, C) and after 3-h IPTG induction (B, D). Subfigures E, F, and G are 2-DE of the lysate of E. coli BL21, pGEX-2TK-nanA-5R-harboring E. coli BL21, and pGEX-2TK-2ep-5D-harboring E. coli BL21, respectively, harvested at 3 h post-induction without IPTG. Spot indicated by an arrow in subfigure D was the recombinant GST protein. (PDF 1 MB)
Additional file 3: Supplemental Figure 3: Time courses of protein expression. Time courses of the expression levels of differentially expressed proteins in E. coli BL21 (solid lines and circles) and E. coli BL21 harboring pGEX-2TK-2ep-5D (dashed lines and open circles). (PDF 233 KB)
Additional file 4: Supplemental Figure 4: Assay of tagged proteins by SDS-PAGE. SDS-PAGE of protein extracts from E. coli BL21 overexpressing double-tagged GST-Neu5Ac aldolase-5R (A) and single-tagged GST-Neu5Ac aldolase (B) after 3 h of IPTG induction. Lane M: protein marker; lanes 1-3: the first, second and third extractions from cell pellets; lane P: extraction from aggregates as described in the Methods section. The double-tagged GST-Neu5Ac aldolase-5R was expressed in bacteria harboring pGEX-2TK-nanA-5R; while the single-tagged GST-Neu5Ac aldolase was expressed in bacteria harboring plasmid pGEX-1λT with an inserted sequence coding for Neu5Ac aldolase. Arrows indicate double-tagged GST-Neu5Ac aldolase-5R (in A) and single-tagged GST-Neu5Ac aldolase (in B). (PDF 388 KB)
About this article
Cite this article
Cheng, CH., Lee, WC. Protein solubility and differential proteomic profiling of recombinant Escherichia coli overexpressing double-tagged fusion proteins. Microb Cell Fact 9, 63 (2010). https://doi.org/10.1186/1475-2859-9-63
- Recombinant Protein
- Heat Shock Protein
- Pentose Phosphate Pathway
- Overexpressed Protein
- Small Heat Shock Protein