Refactoring in E. coli identifies pathway bottlenecks and genetic instability
Based on the available literature, we designed a synthetic raspberry ketone pathway for expression in E. coli from plant, bacterial and fungal sources (see Materials and methods). The pathway begins from L-tyrosine, an endogenous primary metabolite. The four-step pathway uses tyrosine ammonia lyase (TAL), p-coumaroyl-CoA ligase (PCL), benzalacetone synthase (BAS) and an NADPH-dependent raspberry ketone synthase (RKS). To complement this, we also selected a malonyl-CoA synthetase (MatB) for malonyl-CoA regeneration from malonate, CoA and ATP [17].
To begin, we built a plasmid with each gene under the control of the low-strength J23114 promoter (pJ23114-RK) and a strong RBS (PET-RBS), and tested it in a number of E. coli strains. We reasoned that a weak promoter should provide detectable levels of raspberry ketone, allowing us to then optimise the pathway by testing stronger promoters. Unexpectedly, the recA− E. coli DH10β cloning strain provided the highest yields of the common laboratory strains tested (Additional file 1: Figure S1). In addition, preliminary experiments in different media (M9, M63, LB, 2YT and TB) showed that the E. coli DH10β pJ23114-RK plasmid strain was only productive in rich media (data not shown). Raspberry ketone production in minimal media was below the limit of detection by HPLC–MS. Although minimal media are preferred for standardisation in synthetic biology, we decided to continue with rich medium (2YT) for all experiments, including promoter and enzyme characterisation. Poor raspberry ketone production in minimal media has also been reported previously [11, 13]. Finally, we also observed only trace levels of raspberry ketone in either batch or high cell density fermenter growth in the semi-defined SM6 medium (Additional file 1: Figure S2), which we performed with our final optimised strain described later.
Next, we continued with E. coli DH10β as the heterologous host and 2YT medium supplemented with 10 mM malonate (for MatB-catalysed malonyl-CoA regeneration) for screening. We then made a selection of pathway variants to identify potential bottlenecks. Using MoClo, we attempted to assemble five pathways (tal-pcl-matB-bas-rks) in parallel, each with one gene controlled by the strong J23100 promoter and the remaining four genes under the control of the weak J23114 promoter. Interestingly, when J23100-tal was assembled with J23114-pcl, J23114-bas, J23114-matB and J23114-rks, E. coli transformation gave a range of colony sizes. Despite several attempts, we were unable to isolate the desired pathway combination, suggesting this design was unstable due to overproduction of TAL. However, the remaining four pathway variants were obtained with the weak J23114 promoter for tal expression, while the pcl, bas, matB and rks genes were individually overexpressed with the strong J23100 promoter. Interestingly, with J23100-pcl (and J23114-tal, J23114-bas, J23114-matB and J23114-rks), the pathway was inactive: no p-coumarate, HBA or raspberry ketone was detected. However, since our LC–MS analysis does not include the intracellular intermediate p-coumaroyl-CoA (pCA-CoA), we are unsure why this variant disrupts pathway flux. To test this further, we also built a three-gene pathway (tal, pcl and bas) solely with the J23100 promoter, in an attempt to maximise HBA levels. Interestingly, this variant pathway produced only trace levels of p-coumarate (~ 0.5 μM) and HBA (~ 0.5 μM), while the level of the L-Tyr substrate remained unaltered in comparison to an empty vector control. Next, for the strong J23100-bas variant, HBA levels (9 μM) increased 4.7-fold in comparison to the low-strength promoter positive control strain (1.9 μM), confirming that BAS (expression and activity) is rate-limiting [11, 16].
At this level of HBA production, there was a significant (24.7%) decrease in the final OD600 of cultures grown for 48 h (p value = 0.02); we suspect this is due to HBA accumulation and/or gene expression toxicity from overproduction of BAS. Next, with J23100-matB (all other genes J23114), pathway activity remained unchanged. Finally, with J23100-rks, there was complete flux of HBA into raspberry ketone (2.7 μM), which confirmed that RKS strongly favours product formation. In summary, these findings suggested the pathway could be improved using EcoFlex. However, since the BAS enzyme represents a key bottleneck, we next sought to optimise this step.
A rapid pigment screening strategy for HBA/raspberry ketone production
A key problem with fine chemical pathways is the lack of a fast, high-throughput method to optimise pathway design; raspberry ketone is colourless and requires analytical methods such as HPLC for detection. Interestingly, a retro-aldol synthesis of the precursor HBA [18], revealed this to be a yellow-orange powder, which broadly absorbs between ~ 300–450 nm in neutral-alkaline conditions (> pH 7.5) but is colourless below pH 7. Importantly, despite some absorbance between 260–350 nm, none of the other pathway intermediates such as L-Tyr (colourless), p-coumarate (pale yellow) or p-coumaroyl-CoA (pale yellow) or raspberry ketone (colourless) shares this distinct visual spectral property. Based on this observation, we tested if we could monitor HBA production using EcoFlex, by first optimising the concentrations of only the first three genes, tal, pcl and bas to maximise HBA accumulation. To simplify this further, we also omitted the malonyl-CoA regeneration scheme since the E. coli malonyl-CoA pool is ~ 35 µM [15] and the BAS enzyme has a favourable KM (23.3 μM) for malonyl-CoA [16]. Therefore, at low (~ µM) levels of flux, we did not expect malonyl-CoA to be rate-limiting. Next, we initially built the pathway with a random promoter library containing ten low to high-strength EcoFlex σ70 promoters into a medium-copy (ColE1) origin of replication pTU2-A destination vector, while also adding a kanamycin marker as an additional selective pressure to maintain plasmid stability. After three days of incubation at 30 °C on 2YT plates, we observed a range of white-, yellow- and orange-coloured colonies (typically ~ 100–1000 per E. coli transformation). To continue, we picked 32 colonies that displayed strong yellow-orange pigmentation and grew in small-scale liquid culture (Figs. 1, 2). Interestingly, a number of the strains all but depleted the L-Tyr substrate (~ 1.6–2.3 μM) and had high levels of p-coumarate (154–180 μM), but no HBA or raspberry ketone. 
In addition, two strains had increased HBA levels of up to 68 μM, together with 14 μM raspberry ketone produced by an unidentified, non-specific E. coli alkene reductase [11]. Overall, all strains producing HBA had a weak promoter with the pcl gene (J23114, SJM914), while the bas gene had either a medium (SJM908, SJM910) or strong promoter (J23100). However, this library also contained a number of recombination events between the pcl and bas genes, but only where the J23114 promoter preceded the pcl gene. This is probably due to the frequent occurrence of J23114 within the library as a weak promoter. Consistent with recent literature [19], the recombination is likely due to the repeated homology between the promoters and the Bba_B0015 terminator in the Golden Gate assembly. Therefore, we aimed to increase the diversity of terminators used in the next round of assembly. In addition, to fine-tune individual enzyme levels, we also desired a wider selection of promoters. To achieve this, we built and characterised a new σ70 promoter library with degenerate bases to minimise homologous recombination.
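The reasoning given earlier in this subsection for omitting malonyl-CoA regeneration can be illustrated with a short calculation. The sketch below assumes simple single-substrate Michaelis–Menten behaviour for BAS with respect to malonyl-CoA, which is a simplification introduced here and not a model taken from the study; the function name is hypothetical. It uses only the two values quoted in the text (a ~ 35 µM intracellular pool and a KM of 23.3 μM).

```python
def fractional_saturation(substrate_um: float, km_um: float) -> float:
    """Return v/Vmax for a single substrate under Michaelis-Menten kinetics.

    Assumption: BAS is treated as a one-substrate enzyme with respect to
    malonyl-CoA; the second substrate (p-coumaroyl-CoA) is ignored.
    """
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("substrate must be >= 0 and KM must be > 0")
    return substrate_um / (km_um + substrate_um)


# Values quoted in the text
malonyl_coa_pool_um = 35.0   # approximate E. coli malonyl-CoA pool [15]
bas_km_um = 23.3             # BAS KM for malonyl-CoA [16]

saturation = fractional_saturation(malonyl_coa_pool_um, bas_km_um)
print(f"BAS fractional saturation: {saturation:.2f}")
```

Under these assumptions, BAS would operate at roughly 60% of its maximal rate with respect to malonyl-CoA, which is in line with the expectation that malonyl-CoA is not rate-limiting at low (~ µM) flux.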
Characterisation of a new EcoFlex promoter library
Ultimately, our final goal was to create an optimised pathway using constitutive promoters, rather than inducible promoter control. We were intrigued by the disturbances in growth for some pathway constructs in response to stronger promoters and whether this is a common observation between different pathway enzymes. Therefore, we explored potential design faults at the level of individual gene expression for the first three pathway enzymes (TAL, PCL and BAS), which are responsible for HBA biosynthesis. To provide new variable strength σ70 promoters with reduced homology, we also expanded the EcoFlex promoter library [10]. This is based on the original Anderson promoter library [20] with degeneracy within the − 35 and − 10 boxes. In addition, to reduce homologous recombination, we also included two separate degenerate regions within the promoter. To assess how the new degenerate σ70 promoters control E. coli growth and the synthesis of specific proteins within the raspberry ketone pathway, we characterised eight σ70 promoters, spanning from low (1% activity relative to J23100) to high (SJM935—113%) activity (Additional file 1: Table S1). To monitor relative protein synthesis, we created a C-terminal eGFP translational fusion with the TAL, PCL and BAS proteins and monitored growth and fluorescence (Fig. 3B). As expected, the strongest promoter (SJM935) gave the strongest levels of eGFP fluorescence for both TAL and BAS. However, over two independent measurements, we observed strong variability in both growth and fluorescence (Fig. 3B) between the datasets for the strongest promoters (SJM935 and J23100) in contrast to more consistent growth for the low-medium strength variants. For example, some biological repeats for both SJM921-tal-eGFP and SJM921-bas-eGFP spontaneously lost GFP fluorescence during growth and demonstrated major lag times (> 2–4 h), before late exponential growth and loss of fluorescence. 
In addition, colonies from these clones were variable in size and smaller on average. This suggested that strong promoter-gene combinations accumulate spontaneous mutations or lose the plasmid during growth. While we did not investigate this observation in detail, we acknowledge that E. coli DH10β has a high background mutation rate, 13.5-fold higher than MG1655 [21], which is the likely cause of genetic instability in response to toxicity. Intriguingly, these growth observations were also dependent on the gene studied. For example, for the PCL-eGFP fusions, while the final cell density decreased with increasing promoter strength, we did not observe any major variations in growth or fluorescence between biological repeats. This may reflect a gene-specific level of toxicity to the system, which in the case of TAL is likely due to depletion of L-tyrosine as a core metabolite. The cause of toxicity with single-gene overexpression of bas is less clear but may be linked to differences in translation elongation/ribosome occupancy. While there was a clear trend between eGFP-characterised promoter strength and fluorescence for the PCL-eGFP and BAS-eGFP fusion proteins, we were intrigued by the distinct and variable findings with TAL-eGFP (Fig. 3A). Therefore, we sought to investigate the gene-eGFP fusions separately using denaturing SDS-PAGE to assess fusion protein solubility. To do this, we selected plasmid strains with either a low, medium or high-strength promoter for the tal-gfp, pcl-gfp and bas-gfp fusions and analysed intracellular proteins by denaturing PAGE (Additional file 1: Figure S3). Firstly, all GFP fusions were synthesised at the expected molecular size. Under a high-strength promoter (SJM935, 113%), all three fusion proteins were predominantly located in the insoluble fraction rather than the soluble fraction. In comparison to TAL, however, the relative levels of insoluble protein for PCL and BAS from the SJM935 promoter plasmids were lower.
Non-specific protein aggregation is a common issue with strong heterologous protein production in E. coli [22]. In contrast, both the low-strength (SJM942, 18%) and medium-strength (SJM964, 56%) promoter combinations gave major bands on SDS-PAGE for the soluble enzyme-GFP fusions, and much less fusion protein was observed in the insoluble fractions (Additional file 1: Figure S4). Therefore, as expected, the strongest promoters favour formation of inclusion bodies and drain resources more quickly, resulting in decreased growth rates. In summary, these results confirmed that the characterisation of individual gene fusions differs in relation to enzyme function and the state of protein folding within the cell. This is a form of context dependency, which is an increasingly important factor in the design of genetic circuits and pathways in synthetic biology [23]. Next, with this new knowledge, we set about refactoring the pathway to optimise raspberry ketone production.
Refactoring an optimised raspberry ketone pathway
To reach a balanced pathway, we next optimised the pathway as two separate modules, HBA biosynthesis and HBA reduction (raspberry ketone synthesis), before joining them together in a final assembly. We started with a focused library to fine-tune HBA synthesis (Fig. 4), applying the following rationale. Firstly, high synthesis of TAL leads to toxicity and the formation of inclusion bodies. However, since TAL has a relatively low catalytic rate and a high KM for L-Tyr, we selected six medium–high strength σ70 promoters (SJM964, SJM926, SJM923, SJM936, SJM942, SJM933) for the tal gene. Secondly, PCL is a highly efficient enzyme, while too much inactivates the pathway. Therefore, six low-strength σ70 promoters (SJM940, SJM924, SJM956, SJM947, SJM952, SJM949) were paired with the pcl gene. Finally, we knew BAS was rate-limiting from all previous experiments, while high synthesis leads to formation of inclusion bodies and plasmid instability. Therefore, bas was paired with six medium–high strength σ70 promoters (SJM928, SJM931, SJM935, J23100, SJM937, SJM941) to provide a trade-off between growth and performance. The advantage of using a focused promoter strategy is that the library can be populated with only a finite range of desirable promoter strengths, so that the fastest growing clones (e.g., those with low-strength promoters) will not dominate within libraries. The rational selection of parts also reduced the library size from 10³ to 6³ (1000 to 216 combinations). Additionally, we also included different strong terminators (Bba_B0015, L3S1P51, L3S2P21) to minimise homologous recombination [9] and held the RBS position constant (PET-RBS). After assembly and E. coli transformation of the second library, we repeated the screening process by selecting 20 colonies displaying yellow-orange pigmentation.
After growth, LC–MS analysis revealed that the strains produced a wide range of p-coumarate levels (21–153 μM) with strong production of both HBA (36–98 μM) and raspberry ketone (31–96 μM) (Fig. 4). From sequencing, we found nine clones were duplicated within the library, but importantly, the levels of the pathway intermediates between these clones were very similar. Unexpectedly, we found that clones 6, 10, 14 and 18 contained a novel promoter sequence driving expression of the pcl gene that was not part of the characterised library. The promoter (SJM965, see Additional file 1) contained a single-base pair deletion and a novel randomised region between the -10 and -35 sequences. Sequencing data from the library were clean, suggesting this promoter must have been present at low levels, but was strongly selected for in our phenotypic screen. Interestingly, despite an average decrease in final OD600 (48 h) of 21% (range 3.7–4.7) in comparison to an empty vector control (OD600 = 5.03) (Additional file 1: Figure S4), the pathways in all clones were stable based on Sanger sequencing (Additional file 1: Figure S5). For example, we did not identify any homologous recombination events and there was a clear absence of mutations within the regions sequenced (e.g., promoter, RBS and terminator regions). A summary of the sequencing is provided in Additional file 1 (Figure S5 and Table S2). Satisfied with this outcome, we next looked at optimising the reductase module, which could then be added to complete the pathway.
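The reduction in design space achieved by the focused library can be made concrete by enumerating the promoter combinations. The following is a minimal sketch, assuming that each of the three genes (tal, pcl, bas) draws one promoter independently from its list and that terminators and the RBS are not counted as variables; the variable names are illustrative and not part of the study.

```python
from itertools import product

genes = ("tal", "pcl", "bas")

# First (random) library: ten promoters available to each of the three genes
first_library_promoters = 10
random_library_size = first_library_promoters ** len(genes)

# Focused library: six promoters per gene, as listed in the text
focused_promoters = {
    "tal": ["SJM964", "SJM926", "SJM923", "SJM936", "SJM942", "SJM933"],
    "pcl": ["SJM940", "SJM924", "SJM956", "SJM947", "SJM952", "SJM949"],
    "bas": ["SJM928", "SJM931", "SJM935", "J23100", "SJM937", "SJM941"],
}

# Enumerate every tal/pcl/bas promoter combination in the focused design
focused_designs = list(product(*(focused_promoters[g] for g in genes)))
focused_library_size = len(focused_designs)

print(random_library_size, focused_library_size)
```

With 216 possible promoter combinations, screening a few dozen colonies covers a far larger fraction of the focused library than the same effort would in the 1000-member random library, which may also help explain why duplicate clones were recovered.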
To assess the double bond reductase, we paired the rks gene with five low to strong σ70 promoters, PET-RBS and an L2U2H09 terminator in pTU1-D-lacZ. The variants were grown in microtitre plates at 30 °C for 24 h and assessed for growth rate, final OD600 and activity with 1 mM HBA substrate (Additional file 1: Figure S6). The addition of 1 mM HBA led to a 57% reduction in OD600 across the different rks promoter and empty-vector control strains (Fig. 5). In terms of HBA flux, four out of five of the promoter variants (high to low strength) gave 99.9% conversion of HBA into raspberry ketone (Fig. 5), with only the lowest strength promoter (SJM961, 1% activity relative to J23100) leading to low conversion into raspberry ketone: 21.3% vs 13.9% with an empty vector control. Interestingly, the final OD600 was stable with increasing RKS promoter strength (Fig. 5).
Next, we formed the final pathway design by combining two of the strongest HBA modules (B6 and B12 from Fig. 4), the optimised reductase module (from Fig. 5) and a separate malonyl-CoA regeneration module (Fig. 6). For the latter module, we aimed to increase the supply of malonate through heterologous expression of the Rhizobium trifolii matC (malonate transporter) gene in combination with matB (described earlier) under a low-strength promoter in the final plasmid design. This strategy has previously been employed successfully for other flavonoid natural products [24, 25]. Additionally, we introduced a superfolder GFP (sfGFP) fluorescence reporter to act as an indicator of genetic stability [5] and to measure the drain of the plasmid system on general protein synthesis. The reporter was placed under the control of the kasOp* promoter, a Streptomyces σ70 promoter that is strongly active in E. coli at low levels. Next, the final plasmid strains were assessed for growth and fluorescence to monitor plasmid stability (Fig. 6). The relative fluorescence of the strains was very similar between individual repeats, which suggested the strains were stable. However, relative fluorescence was reduced by 70% (Fig. 6) in comparison to the vector control (pTU1-A-kasOp*-sfGFP), which suggested that, regardless of pathway productivity, the extra genes within the plasmid design (eight in total) cause a major drain on net protein synthesis. This was reflected by a decrease in the sfGFP reporter fluorescence and a 4 h increase in lag time. Finally, we characterised the strains for relative protein levels (Fig. 6) and raspberry ketone production. SDS-PAGE revealed that TAL accounts for ~ 10–20% of all intracellular proteins, while BAS and RKS were also clearly overproduced (Additional file 1: Figure S7).
At the metabolite level, we observed a clear 67–81% decrease in the L-tyrosine substrate relative to the control, together with 54–67 μM p-coumaric acid, ~ 1 μM HBA and 63–78 μM (10.8–12.9 mg/L) raspberry ketone in small-scale (5 mL 2YT) batch conditions. Additionally, we performed separate experiments to test the effect of malonate on malonyl-CoA regeneration, but found that this was not significantly limiting for raspberry ketone production in E. coli (Additional file 1: Figure S8). This might be due to the kinetic properties of the BAS enzyme, which has a low KM (23.3 μM) for malonyl-CoA [16], so that the malonyl-CoA it consumes can be readily replaced by E. coli under homeostasis. Overall, starting from an initial design that produced only trace levels of raspberry ketone (0.2 mg/L), we have achieved a maximum 65-fold improvement in titre (12.9 mg/L) and have produced a stable pathway using only constitutive expression.
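The headline figures above can be cross-checked with a few lines of arithmetic. This sketch assumes a molar mass of 164.20 g/mol for raspberry ketone (C10H12O2); that value is supplied here for illustration and is not stated in the text, and the helper names are hypothetical.

```python
# Assumption: molar mass of raspberry ketone (C10H12O2), not given in the text
RASPBERRY_KETONE_G_PER_MOL = 164.20


def mg_per_l_to_um(mg_per_l: float, molar_mass: float) -> float:
    """Convert a mass concentration (mg/L) to a molar concentration (uM)."""
    return mg_per_l / molar_mass * 1000.0


def fold_change(final: float, initial: float) -> float:
    """Ratio of the final to the initial value (both in the same units)."""
    if initial <= 0:
        raise ValueError("initial value must be positive")
    return final / initial


initial_titre_mg_l = 0.2   # initial design, from the text
best_titre_mg_l = 12.9     # best optimised strain, from the text

improvement = fold_change(best_titre_mg_l, initial_titre_mg_l)
best_titre_um = mg_per_l_to_um(best_titre_mg_l, RASPBERRY_KETONE_G_PER_MOL)

print(f"fold improvement: {improvement:.1f}")
print(f"best titre: {best_titre_um:.1f} uM")
```

Under the stated molar mass, 12.9 mg/L corresponds to roughly 79 μM, close to the reported upper value of 78 μM, and the ratio of 12.9 to 0.2 mg/L is 64.5, which rounds to the reported 65-fold improvement.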