
Table 1 Rare codon frequencies of the coding sequences used in this study

From: Rare codon content affects the solubility of recombinant proteins in a codon bias-adjusted Escherichia coli strain

| Coding sequence | Total codons | RIL codon content^a: Arg (AGA/AGG) | RIL codon content^a: Ile (ATA) | RIL codon content^a: Leu (CTA) | % of RIL codons | % of Rare codons^b | GC Content (%) | Description^c |
|---|---|---|---|---|---|---|---|---|
| fd | 97 | 1/1 | 0/4 | 1/6 | 2.1 | 24.7 | 40 | pea ferredoxin |
| clpt1 | 178 | 3/4 | 1/8 | 2/20 | 3.4 | 9.6 | 48 | 05-08-O17 – accessory protein of the chloroplastic ClpP protease complex |
| fnr | 308 | 7/8 | 2/12 | 2/21 | 3.6 | 19.8 | 42 | pea ferredoxin-NADP(H) reductase |
| clpp4 | 235 | 6/11 | 4/26 | 0/18 | 4.3 | 20.4 | 44 | 04-15-O12 – component of the chloroplastic ClpP protease complex |
| clpc2 | 834 | 32/64 | 10/61 | 9/82 | 6.1 | 12.6 | 46 | 09-19-G11 – chloroplastic chaperone |
| clpd | 865 | 33/56 | 16/62 | 8/85 | 6.4 | 15.1 | 46 | 05-05-I08 – chloroplastic chaperone |
| clpr2 | 224 | 10/18 | 4/13 | 4/19 | 8.0 | 21.9 | 47 | 06-10-N03 – component of the chloroplastic ClpP protease complex |
| dsRBD2 | 82 | 6/6 | 0/2 | 7/19 | 8.5 | 23.2 | 43 | Second double-stranded RNA-binding domain of DICER1 (Arabidopsis thaliana) |
| trx | 109 | 0/1 | 0/9 | 0/13 | 0.0 | 3.7 | 52 | E. coli thioredoxin |
| E. coli | - | - | - | - | - | - | 51 | Escherichia coli K12, NCBI RefSeq Accession NC000913 |
| A. thaliana | - | - | - | - | - | - | 36^d | Whole genome data at NCBI |

  ^a Ratio of the number of RIL codon(s) for the indicated amino acid to the total number of codons encoding that amino acid. RIL codons are indicated in parentheses for each amino acid.
  ^b Percentage of codons previously defined as rare in E. coli because they occur at a frequency below 10% [8, 9].
  ^c Numbers indicate the clone number from the RIKEN Arabidopsis full-length cDNA bank. All clp coding sequences are from A. thaliana.
  ^d All genome data with the exception of mitochondrial DNA.
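
The per-sequence columns of the table (RIL codon ratios, % of RIL codons, GC content) can be recomputed from a coding sequence. The following is a minimal sketch and not the authors' procedure. It assumes the standard genetic code, an in-frame DNA sequence, and that the percentage of RIL codons is taken over all codons in the sequence. The function name `ril_codon_summary` and the toy sequence are hypothetical. The "% of Rare codons" column is not reproduced because the underlying rare-codon list comes from references [8, 9].

```python
from collections import Counter

# For each amino acid: (RIL codons from the table header,
# all synonymous codons in the standard genetic code).
RIL_CODONS = {
    "Arg": ({"AGA", "AGG"}, {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}),
    "Ile": ({"ATA"}, {"ATT", "ATC", "ATA"}),
    "Leu": ({"CTA"}, {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}),
}


def ril_codon_summary(cds):
    """Summarize RIL codon usage and GC content of an in-frame DNA sequence."""
    seq = cds.upper().replace("U", "T")
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 long")

    # Split into codons and count each one.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    counts = Counter(codons)
    total = len(codons)

    summary = {"total_codons": total}
    ril_total = 0
    for amino_acid, (ril, synonymous) in RIL_CODONS.items():
        n_ril = sum(counts[c] for c in ril)
        n_all = sum(counts[c] for c in synonymous)
        # Same "RIL codons / all codons for this amino acid" form as the table.
        summary[amino_acid] = f"{n_ril}/{n_all}"
        ril_total += n_ril

    # Assumption: percentage is relative to all codons in the sequence.
    summary["pct_ril"] = round(100 * ril_total / total, 1)
    summary["gc_pct"] = round(100 * (seq.count("G") + seq.count("C")) / len(seq))
    return summary


if __name__ == "__main__":
    # Toy sequence for illustration only, not one of the table entries.
    print(ril_codon_summary("ATGAGACGTATAATTCTACTGTAA"))
```

Whether the "Total codons" column counts the stop codon is not stated in the table, so results for real sequences may differ slightly depending on that choice.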