Designing next generation recombinant protein expression platforms by modulating the cellular stress response in Escherichia coli

Background: Recombinant protein synthesis triggers a cellular stress response (CSR), which acts as a global feedback regulator of protein expression. To remove this key regulatory bottleneck, we had previously proposed that genes that are up-regulated post induction could be part of the signaling pathways that activate the CSR. Knocking out some of these genes, which were non-essential and belonged to the bottom of the E. coli regulatory network, had provided higher expression of GFP and L-asparaginase.

Results: We chose the best-performing double knockout, E. coli BW25113ΔelaAΔcysW, and demonstrated its ability to enhance the expression of the toxic Rubella E1 glycoprotein by 2.5-fold; the protein was tagged with sfGFP at the C-terminal end to better quantify expression levels. Transcriptomic analysis of this hyper-expressing mutant showed that a significantly lower proportion of genes was down-regulated post induction, which included genes for transcription, translation, protein folding and sorting, ribosome biogenesis, carbon metabolism, and amino acid and ATP synthesis. This down-regulation, which is a typical feature of the CSR, was clearly blocked in the double knockout strain, leading to its enhanced expression capability. Finally, we supplemented the expression of the substrate uptake genes glpK and glpD, whose down-regulation was not prevented in the double knockout, thus ameliorating almost all the negative effects of the CSR, and obtained a further doubling in recombinant protein yields.

Conclusion: The study validated the hypothesis that these up-regulated genes act as signaling messengers which activate the CSR and that knocking them out, despite their having no causal connection with recombinant protein synthesis, can improve cellular health and protein expression capabilities. Combining gene knockouts with supplementing the expression of key down-regulated genes can counter the harmful effects of the CSR and help in the design of a truly superior host platform for recombinant protein expression.

: Confirmation of preserved functionality of L-asparaginase by estimating its specific activity.

Enzymatic activity measurements for L-asparaginase:
The enzymatic activity of L-asparaginase was quantified by measuring the amount of ammonia released during the reaction since it is directly proportional to the rate of hydrolysis of L-asparagine.
A calibration curve of the amount of ammonia released vs OD436 was determined with Nessler's reagent, using ammonium sulfate solution as the standard. The enzymatic activity of the supernatant was quantified by measuring the maximum rate of substrate conversion, where one unit (U) of L-asparaginase is defined as the amount of enzyme required to convert 1 μmol of asparagine to 1 μmol of ammonia per minute at 37°C and pH 8.6.
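The unit definition above can be turned into a short calculation. The following is a minimal sketch, not the authors' exact formula: it assumes a linear calibration between ammonia amount and OD436, and all function names, standard values and sample readings are hypothetical placeholders. Dilution factors and reaction volumes are deliberately left out.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ammonia_from_od(od436, slope, intercept):
    """Invert the calibration line to get ammonia (umol) from an OD436 reading."""
    return (od436 - intercept) / slope


def activity_units(ammonia_umol, minutes):
    """Units (U): umol of ammonia released per minute."""
    return ammonia_umol / minutes


def specific_activity(units, protein_mg):
    """Specific activity in U per mg of protein."""
    return units / protein_mg


# Hypothetical ammonium sulfate standards (umol ammonia) and their OD436 readings.
standards_umol = [0.0, 0.5, 1.0, 1.5, 2.0]
standards_od = [0.02, 0.27, 0.52, 0.77, 1.02]
slope, intercept = fit_line(standards_umol, standards_od)

# Hypothetical sample: OD436 of 0.62 after a 10 min reaction with 0.05 mg protein.
released = ammonia_from_od(0.62, slope, intercept)
units = activity_units(released, minutes=10.0)
print(units, specific_activity(units, protein_mg=0.05))
```

In practice the calibration should only be used within the OD range covered by the standards, since the Nessler reaction is not guaranteed to stay linear outside it.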

Calibration curve for calculating the amount of ammonia released:
The enzymatic activity was calculated using the formula:

Gel densitometry studies:
Our preliminary studies showed that L-asparaginase starts being secreted into the extracellular medium only after sufficient accumulation inside the cell. The expression of L-asparaginase in the cytoplasmic and periplasmic fractions has been measured in previous studies conducted by our lab (Khushoo et al., 2004; Amardeep Khushoo, 2005).

Table 1) and PCR confirmation of araProm-glpK insert (1.7 kb) by forward araP primer and reverse glpK primers.
Method S8: RNA-seq analysis procedure

A total of 3 µg of RNA was used for library construction. Paired-end runs were performed on the HiSeq 2500 platform (Illumina, Inc., USA), which provided 2 × 100 bp data of 30 million reads (3 GB) per sample. The raw reads were trimmed to remove adapter sequences using Trimmomatic v0.36, and read quality was then assessed for each sample using FastQC v0.10.1. After quality assessment, reads were aligned to the reference genome of E. coli BW25113, which is available in the Ensembl database as "Escherichia coli BW25113 ASM75055v1" in FASTA-GFF3 format. The annotation file was obtained from the Ensembl database in GFF format and the following steps were performed:
(a) The GFF annotation file was converted to GTF format using "gffread" from the Cufflinks suite.
(b) Reference indices were created using the Bowtie2 tool and BAM files were generated.
(c) After alignment, the abundance of mapped reads was calculated using RSEM.
(d) Normalization was done by RSEM to rule out the effects of library size and read length by estimating FPKM values for paired-end reads for each sample.
(e) Differential expression analysis was done using the edgeR software, which is based on the negative binomial distribution.
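The normalization in step (d) can be illustrated with the textbook FPKM definition. This is a simplified sketch and not a reimplementation of RSEM, which works with expected counts and effective transcript lengths; the function name, gene names, counts and lengths below are hypothetical.

```python
def fpkm(fragment_counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragment_counts: dict mapping gene -> number of mapped fragments
    lengths_bp:      dict mapping gene -> transcript length in base pairs
    """
    total_fragments = sum(fragment_counts.values())
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total_fragments)
        for gene, count in fragment_counts.items()
    }


# Hypothetical example: geneB is three times longer and collects three times
# as many fragments, so both genes end up with the same FPKM.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
print(fpkm(counts, lengths))
```

The factor of 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings, which is why the result is independent of both gene length and sequencing depth.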
Pathway enrichment was done using the KEGG database. The total reads obtained for each sample lay in the range of 10-30 million, with a mean read length of 100 bp. After trimming and filtering, the reads were mapped to the reference genome using the Bowtie2 tool with default settings. For both time-point samples of the control and double knockout strains, as well as for the respective single knockouts, the percentage of reads mapped ranged from 65% to 95%. The raw reads and the processed data file have been deposited in NCBI's GEO database and are accessible through GEO series accession number GSE108442.
Differential expression analysis was performed using the edgeR software package. The expression of each gene in both strains was calculated and normalized in terms of FPKM values (fragments per kilobase of transcript per million mapped reads). We selected genes having |log2(X_IN/X_UN)| > 1, i.e. a fold change of ≥2 in either direction, and a false-discovery-rate (FDR) corrected p value < 0.05 for further analysis.

was also set up during the same reaction. All reactions were performed in technical triplicates.
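The gene selection criterion used for the RNA-seq data (|log2 fold change| > 1 and FDR-corrected p value < 0.05) can be sketched as a simple filter. This is an illustration only: edgeR reports its own log fold changes and FDR values, and the function name, record layout and example values here are hypothetical. Genes with a non-positive expression value are skipped, which is an assumption made to avoid an undefined logarithm.

```python
import math


def select_de_genes(records, min_abs_log2fc=1.0, max_fdr=0.05):
    """Return (gene, log2 fold change) pairs that pass both thresholds.

    records: iterable of (gene, x_in, x_un, fdr) tuples, where x_in and x_un
             are the two expression values being compared (X_IN and X_UN).
    """
    selected = []
    for gene, x_in, x_un, fdr in records:
        if x_in <= 0 or x_un <= 0:
            continue  # assumption: the log ratio is undefined, so skip the gene
        log2fc = math.log2(x_in / x_un)
        if abs(log2fc) > min_abs_log2fc and fdr < max_fdr:
            selected.append((gene, log2fc))
    return selected


# Hypothetical records: (gene, X_IN, X_UN, FDR)
example = [
    ("g1", 40.0, 10.0, 0.001),  # 4-fold up, significant
    ("g2", 10.0, 40.0, 0.010),  # 4-fold down, significant
    ("g3", 15.0, 10.0, 0.001),  # 1.5-fold, below the fold-change cutoff
    ("g4", 80.0, 10.0, 0.200),  # 8-fold up, but fails the FDR cutoff
]
print(select_de_genes(example))
```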
Cycling conditions were: initial denaturation at 95°C for 10 minutes, followed by 40 cycles of denaturation at 95°C for 15 s and annealing at 60°C for 1 min. After the final cycle, reaction specificity was verified by determining melting profiles over a temperature range of 65°C to 95°C in 0.2°C increments.
The resulting cycle threshold (Ct) values were averaged over biological duplicates for all genes. The Ct values of the reference gene (rimL, ribosomal-protein-L12-serine acetyltransferase) were subtracted to normalize the Ct values of the genes in the control and test samples (ΔCt calculation). Fold changes were calculated using the 2^-ΔΔCt method and expressed as log2 fold change for relative quantification.
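The ΔΔCt calculation described above can be written out as follows. This is a minimal sketch that assumes roughly 100% amplification efficiency (the usual assumption behind the 2^-ΔΔCt method); the function names and Ct values are hypothetical.

```python
import math
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference-gene Ct for one sample group."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(target_test, ref_test, target_ctrl, ref_ctrl):
    """Return (fold change, log2 fold change) by the 2^-ddCt method."""
    ddct = delta_ct(target_test, ref_test) - delta_ct(target_ctrl, ref_ctrl)
    fold = 2.0 ** (-ddct)
    return fold, math.log2(fold)


# Hypothetical Ct values from biological duplicates.
fold, log2_fold = relative_expression(
    target_test=[22.0, 22.0],   # gene of interest, test sample
    ref_test=[18.0, 18.0],      # reference gene, test sample
    target_ctrl=[24.0, 24.0],   # gene of interest, control sample
    ref_ctrl=[18.0, 18.0],      # reference gene, control sample
)
print(fold, log2_fold)
```

Because the log2 fold change equals -ΔΔCt, a target gene that reaches threshold two cycles earlier in the test sample (after normalization) corresponds to a log2 fold change of 2.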