Table 1 Rare codon frequencies of the coding sequences used in this study

From: Rare codon content affects the solubility of recombinant proteins in a codon bias-adjusted Escherichia coli strain

| Coding sequence | Total codons | RIL codon content (a): Arg (AGA/AGG) | RIL codon content (a): Ile (ATA) | RIL codon content (a): Leu (CTA) | % of RIL codons | % of rare codons (b) | GC content (%) | Description (c) |
|---|---|---|---|---|---|---|---|---|
| fd | 97 | 1/1 | 0/4 | 1/6 | 2.1 | 24.7 | 40 | pea ferredoxin |
| clpt1 | 178 | 3/4 | 1/8 | 2/20 | 3.4 | 9.6 | 48 | 05-08-O17 – accessory protein of the chloroplastic ClpP protease complex |
| fnr | 308 | 7/8 | 2/12 | 2/21 | 3.6 | 19.8 | 42 | pea ferredoxin-NADP(H) reductase |
| clpp4 | 235 | 6/11 | 4/26 | 0/18 | 4.3 | 20.4 | 44 | 04-15-O12 – component of the chloroplastic ClpP protease complex |
| clpc2 | 834 | 32/64 | 10/61 | 9/82 | 6.1 | 12.6 | 46 | 09-19-G11 – chloroplastic chaperone |
| clpd | 865 | 33/56 | 16/62 | 8/85 | 6.4 | 15.1 | 46 | 05-05-I08 – chloroplastic chaperone |
| clpr2 | 224 | 10/18 | 4/13 | 4/19 | 8.0 | 21.9 | 47 | 06-10-N03 – component of the chloroplastic ClpP protease complex |
| dsRBD2 | 82 | 6/6 | 0/2 | 7/19 | 8.5 | 23.2 | 43 | Second double-stranded RNA-binding domain of DICER1 (Arabidopsis thaliana) |
| trx | 109 | 0/1 | 0/9 | 0/13 | 0.0 | 3.7 | 52 | E. coli thioredoxin |
| E. coli | - | - | - | - | - | - | 51 | Escherichia coli K12 NCBI RefSeq Accession NC000913 |
| A. thaliana | - | - | - | - | - | - | 36 (d) | Whole genome data at NCBI |
  a. Ratio of the number of RIL codon(s) for the indicated amino acid to the total number of codons encoding that amino acid. The RIL codons are indicated next to each amino acid in the table header.
  b. Percentage of codons previously defined as rare in E. coli because they occur at a frequency below 10% [8, 9].
  c. Numbers indicate the clone number in the RIKEN Arabidopsis full-length cDNA bank. All clp coding sequences are from A. thaliana.
  d. All genome data except mitochondrial DNA.
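
The RIL codon content ratios and the % of RIL codons column can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in the study: the function name `ril_codon_content` is hypothetical, the synonymous codon sets come from the standard genetic code, and it assumes an in-frame coding sequence in which every codon (including any stop codon present) counts toward the total. The table does not say whether stop codons were counted.

```python
from collections import Counter

# RIL codons as listed in the table header, keyed by amino acid.
RIL_CODONS = {"Arg": ("AGA", "AGG"), "Ile": ("ATA",), "Leu": ("CTA",)}

# All synonymous codons for the same amino acids (standard genetic code).
SYNONYMOUS = {
    "Arg": ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG"),
    "Ile": ("ATT", "ATC", "ATA"),
    "Leu": ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG"),
}


def ril_codon_content(cds: str) -> dict:
    """Hypothetical helper: count RIL codons in an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA notation
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")

    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    counts = Counter(codons)

    result = {"total_codons": len(codons)}
    ril_total = 0
    for amino_acid, ril_set in RIL_CODONS.items():
        ril = sum(counts[c] for c in ril_set)
        all_for_aa = sum(counts[c] for c in SYNONYMOUS[amino_acid])
        # Ratio as in footnote a: (RIL codons, all codons for this amino acid).
        result[amino_acid] = (ril, all_for_aa)
        ril_total += ril

    # Assumption: the percentage is taken over every codon in the sequence.
    result["percent_ril"] = (
        round(100 * ril_total / len(codons), 1) if codons else 0.0
    )
    return result


# Toy sequence for illustration only (not one of the sequences in the table).
example = ril_codon_content("ATGAGACGTATAATTCTACTGTAA")
print(example)
```

For the toy sequence, each of Arg, Ile and Leu is encoded twice, once by a RIL codon, so each ratio is 1/2 and the three RIL codons make up 37.5% of the eight codons.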