MARINE-EXPRESS: taking advantage of high throughput cloning and expression strategies for the post-genomic analysis of marine organisms
- Agnès Groisillier†1, 2,
- Cécile Hervé†1, 2,
- Alexandra Jeudy1, 2,
- Etienne Rebuffet1, 2,
- Pierre F Pluchon3,
- Yann Chevolot4,
- Didier Flament3,
- Claire Geslin3,
- Isabel M Morgado5,
- Déborah Power5,
- Margherita Branno6,
- Hervé Moreau7,
- Gurvan Michel1, 2,
- Catherine Boyen1, 2 and
- Mirjam Czjzek1, 2
© Groisillier et al; licensee BioMed Central Ltd. 2010
Received: 9 March 2010
Accepted: 14 June 2010
Published: 14 June 2010
The production of stable and soluble proteins is one of the most important steps prior to structural and functional studies of biological importance. We investigated the parallel expression, in a medium throughput strategy, of genes coding for proteins from various marine organisms, using protocols that involved recombinatorial cloning, protein expression screening and batch purification. This strategy was applied in order to respond to the need for post-genomic validation following the recent success of a large number of marine genomic projects. Indeed, the upcoming challenge is to go beyond the bioinformatic data, since the bias introduced through the genomes of the so-called model organisms leads to numerous proteins of unknown function in the still unexplored world of the oceanic organisms.
We present here the results of expression tests for 192 targets using a 96-well plate format. Genes were PCR amplified and cloned in parallel into expression vectors pFO4 and pGEX-4T-1, in order to express proteins N-terminally fused to a six-histidine-tag and to a GST-tag, respectively. Small-scale expression and purification permitted isolation of 84 soluble proteins and 34 insoluble proteins, which could also be used in refolding assays. Selected examples of proteins expressed and purified to a larger scale are presented.
The objective of this program was to get around the bottlenecks of soluble, active protein expression and crystallization for post-genomic validation of a number of proteins that come from various marine organisms. Multiplying the constructions, vectors and targets treated in parallel is important for the success of a medium throughput strategy and considerably increases the chances to get rapid access to pure and soluble protein samples, needed for the subsequent biochemical characterizations. Our set up of a medium throughput strategy applied to genes from marine organisms had a mean success rate of 44% soluble protein expression from marine bacteria, archaea as well as eukaryotic organisms. This success rate compares favorably with other protein screening projects, particularly for eukaryotic proteins. Several purified targets have already formed the base for experiments aimed at post-genomic validation.
The marine environment is highly complex and contains the vast majority of known and unknown biodiversity. It is also the last frontier to understand the control of the global climate and hides a wealth of biological resources still to be tapped for food, health and energy. Until very recently, few genomic data were available for oceanic organisms, but this panorama is rapidly changing with a number of genomic projects now underway, which focus on marine organisms, ranging from microbes [1, 2] to multicellular eukaryotes including vertebrates or macro-algae, as well as the generation of resources and access to genomes or EST libraries for various eukaryotic systems [4–7]. The wealth of sequence data arising from these projects means that researchers are confronted with a huge number of putative genes, the functions of which are, at best, so far only deduced from sequence comparisons (automatic annotation). The pressing question is how to analyze the genomic data with respect to original biological processes in diverse marine organisms (e.g. their development or stress response), their importance in adaptation to the particular habitat and how to identify new enzymes and/or metabolites of biotechnological interest. Thus, the availability of complete genome data has resulted in the development of transcriptomic and proteomic methods that can be used to study regulatory networks and interactions of thousands of genes in parallel, allowing an efficient global analysis of genomic information. However, these methods have clear drawbacks insofar as they are strongly dependent on the quality of the genome annotation, which at present assumes conserved functions across often widely distant taxa. Furthermore, these techniques give at most only an indication of the regulation/metabolic pathway the corresponding gene product belongs to, and little or no information on the precise biochemical function of unknown genes.
To understand the precise biological function of a single gene, the biochemical and physiological characterization of its product is essential and this is often greatly aided by the availability of 3-D structural information. Although the 3D structure does not always reveal the natural substrate, it has been shown repeatedly that it helps at least find the class of compounds among which the substrate will be found [8, 9]. Several bottlenecks exist in the analysis of individual proteins; generally the techniques utilized require systems for the efficient over-expression of the target gene in order to produce sufficient recombinant protein. Furthermore, to constitute an assessment for potential biotechnological applications of the discovered proteins/enzymes, the effective recombinant expression of biologically active proteins is essential. When aiming at the 3D-structure of the protein of interest, a second bottleneck is encountered at the step of crystallization of these proteins. Recent developments in the field of structural genomics have demonstrated that medium/high throughput strategies are most adapted to the production of large numbers of soluble and active gene products and/or protein crystals at a time [10, 11], since they allow simultaneous testing of numerous conditions with an optimized effort.
Availability of genomic data to the Marine Express partners.
Gene families of interest
State of genomic data
Proteins from the DNA replication system
Genome published 
Polysaccharide metabolism, sulfatases
SBR Roscoff, France
Stress related genes, carbohydrate active enzymes
14000 ESTs, Genome complete
SBR Roscoff, France
Genome published 
Arago Banyuls, France
Hox-genes, Ci-msx, Ci-RX
Genome published 
SZ A. Dohrn, Napoli, Italia
Hormones, calcium and musculo-skeletal development, stress related
30000 ESTs and 20 full length cDNAs
CCMAR Faro, Portugal
Archaeal virus (PAV1)
Not yet classified
Genome published 
UBO, LM2E, Brest, France
Selection and bioinformatics analysis of the target genes
Comparison of obtained results covering three domains of living organisms
number of soluble proteins
number of insoluble proteins
percentage of soluble proteins
percentage of soluble + insoluble proteins
Amplification and cloning of DNA fragments
To optimize the amplification steps through PCR, specific primers were designed which had the same theoretical Tm value for all targets.
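The text does not state which Tm formula was used. As a minimal sketch of how a hybridization region could be extended until it reaches a common theoretical Tm, the example below uses the Wallace rule (Tm ≈ 2(A+T) + 4(G+C)), a rough approximation for short oligonucleotides. The function names, the target Tm of 56°C and the minimum length of 15 nt are illustrative assumptions, not values from this study.

```python
def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature (deg C) of a short oligonucleotide."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hybridization_region(template: str, target_tm: int = 56, min_len: int = 15) -> str:
    """Return the shortest 5' prefix of `template` (at least `min_len` nt)
    whose Wallace Tm reaches `target_tm`.

    target_tm and min_len are assumed defaults, not taken from the paper.
    """
    template = template.upper()
    for length in range(min_len, len(template) + 1):
        candidate = template[:length]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    raise ValueError("template too short to reach the target Tm")
```

Applying the same `target_tm` to every target yields hybridization regions of different lengths but similar theoretical Tm, which is what allows one annealing temperature to be used across a whole plate.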
In the present study, two expression vectors were used to test recombinant protein expression and solubility. One vector (pFO4) contains a His6-tag at the N-terminus for affinity purification; the second (pGEX-4T-1) allows the production of a fusion protein with an N-terminal glutathione S-transferase (GST). This parallel PCR cloning procedure could easily be performed in 96-well format. The approach relies on the use of a single PCR product for each gene that is compatible for ligation into both expression vectors, pGEX-4T-1 and pFO4. Since the expression vectors were digested with BamHI and EcoRI, the upstream and downstream PCR primers introduced BamHI (or its isocaudomer BglII) and EcoRI (or its isocaudomer MfeI) restriction sites, respectively, upon PCR amplification. After transformation into E. coli DH5α, the plasmids were validated by PCR screening of colonies using primers specific for the expression vectors and flanking the cloning sites. In this way, we obtained 174 cloned target genes (Figure 1). The efficiency for direct cloning of these target genes from PCR products was 95% (174 out of 183).
Expression and purification of recombinant proteins in small-scale experiments
The validated plasmids were used to transform appropriate E. coli expression strains (Table 3). Fusion proteins in the pFO4 vector are under the control of a T7 promoter, so E. coli cells containing the chromosomally located defective prophage DE3 must be used for transformation. For the seventy-seven genes that were cloned from archaea, Rosetta or Rosetta (DE3) strains, which compensate for a number of codons that are rare in E. coli, were transformed. For cloned genes that contain several cysteines in their sequences (14 genes), Origami or Origami (DE3) strains were transformed. Indeed, these cells carry mutations in both the thioredoxin reductase (trxB) and glutathione reductase (gor) genes, which greatly enhance disulfide bond formation in the cytoplasm. For all other recombinant vectors, BL21 or BL21 (DE3) strains were transformed.
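The strain choice described above can be sketched as a small decision function. This is an illustration only: the cysteine threshold, the precedence of the archaeal rule over the cysteine rule, and the use of the non-DE3 strain for the tac-promoter vector pGEX-4T-1 are assumptions made for the example, since the text does not specify them.

```python
def choose_expression_strain(vector: str, from_archaea: bool, n_cysteines: int,
                             cysteine_threshold: int = 4) -> str:
    """Pick an E. coli expression strain for one construct.

    cysteine_threshold is a hypothetical cut-off for "several cysteines".
    """
    # pFO4 drives expression from a T7 promoter, so a DE3 lysogen is required.
    # Assumption: pGEX-4T-1 (tac promoter) is paired with the non-DE3 strain.
    needs_de3 = vector == "pFO4"

    if from_archaea:
        base = "Rosetta"      # compensates for codons that are rare in E. coli
    elif n_cysteines >= cysteine_threshold:
        base = "Origami"      # trxB/gor mutations favour disulfide bond formation
    else:
        base = "BL21"
    return f"{base} (DE3)" if needs_de3 else base
```

For example, `choose_expression_strain("pFO4", from_archaea=True, n_cysteines=0)` returns `"Rosetta (DE3)"`.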
For the 167 out of 183 cloned constructs that contained inserts of the expected size, soluble protein expression was screened in small-scale experiments. A key step in the automation of the small-scale experimental setup is the development of auto-induction media (ZYP5052 medium used in the present study), which contain differentially metabolized carbon sources that promote growth to relatively high cell densities and then auto-induce by the utilization of lactose. These media remove the need to monitor cell densities or to add an inducer such as IPTG in T7-based expression systems. To further optimize soluble protein expression, we used a culture temperature of 20°C for 3 days, since it has been established by the Structural Proteomics In Europe consortium and others that lower temperatures tend to be more effective. Moreover, in the present study 24-well plates were used instead of the usual 96-well plates to obtain better aeration of the cells. The final OD600 values that were reached in the small-scale cultivation vessels ranged between 10 and 16.
Scale-up expression and purification of targets from Pyrococcus abyssi
Scale-up, purification and crystallization of R-Z3597 from Zobellia galactanivorans
Scale-up, purification and crystallization of Staniocalcin1A from Sparus aurata
Crystallisation conditions were screened using three variants of commercial kits (PEGI, PACT and JCSG+). The first thin, plate-like crystals grew within one or two weeks at 292 K in the condition containing 25% PEG 4000, 0.1 M di-sodium citrate pH 5.6 and 0.2 M ammonium sulphate (Figure 5b). Further biochemical characterization and optimization of the crystallization conditions to produce crystals suitable for X-ray analysis are currently in progress.
Based on sequence analysis, a large proportion of genes from genomic data of marine organisms have unknown cellular and/or molecular functions. One major challenge is to assign biological function and to elucidate the mechanism of action of such genes. This challenge involves techniques to elucidate the structure and function of the gene products, interactions between proteins and/or global protein changes. For example, the three-dimensional structure of a protein can often provide functional clues, primarily by detecting structural similarity with a protein of known function even when sequence identity is low. Purified protein is generally required in these studies and is at the basis of the development of medium/high throughput strategies to produce a large number of soluble proteins [10, 11]. A key feature for the success of medium/high throughput cloning strategies is the optimization of an identical treatment of all targets. Moreover, multiplying constructs, vectors and targets increases the chances of obtaining pure, soluble protein samples to pursue biochemical analyses. This has also been demonstrated more recently with other expression systems, such as the ligation-independent cloning (LIC) method applied to Mycobacterium tuberculosis gene sequences.
The increasing number of genomic projects concerning marine organisms, including completed projects on prokaryotic organisms [1, 2] or eukaryotic organisms [3–7] as well as projects still in progress (Ectocarpus siliculosus, Zobellia galactanivorans), requires the application of medium/high throughput transcriptomic and proteomic methods.
Here, we show that a general scheme for bacterial expression of genes originating from marine organisms could be successfully implemented for the production of soluble proteins. Relatively few studies have been performed to assess medium/high throughput expression of soluble proteins from marine organisms, and in general efforts have been concentrated on individual organisms. For example, the Southeast Collaboratory for Structural Genomics has developed high throughput protein production and crystallization of genes originating from Pyrococcus furiosus. More recently, a series of diatom expression vectors based on the Invitrogen Gateway technology for high throughput protein tagging and overexpression in Phaeodactylum tricornutum has been described.
The next stage, analysis of protein expression, can be carried out without affinity purification using SDS-PAGE analysis, which is a very reliable method but not very sensitive. Indeed, for direct detection of the His-tagged product in the soluble fraction, a dot-blot procedure with an anti-His antibody is often applied [35–37]. Dot-blot is a fast method to screen expression and solubility of recombinant proteins using a convenient 96-well format. However, the reliability of this method is limited due to the lack of specificity of the detection method, and it does not give information about the size and the purity of the detected protein. One solution is to couple dot-blot with techniques providing information about the actual size, such as capillary electrophoresis, SDS-PAGE or Western blotting. In our study, the use of affinity mini-columns increased the percentage of detected soluble targets by 15% for His-tagged and 60% for GST-tagged targets. Moreover, this method allowed the expression level of recombinant proteins to be estimated more precisely and confirmed their correct molecular weight. In fact, the number of soluble proteins obtained is generally underestimated. Indeed, some targets that were judged negative in small-scale experiments could be expressed in soluble form when cultured in a larger volume of auto-inducible medium, such as 50 ml (data not shown). The triage based on the small-scale results reduces the number of targets that progress to large-scale culture preparation, but in rare cases misses potential soluble expression.
Previous studies have indicated that approximately 50% of full-length proteins from the Eubacteria or Archaea and only 10-20% of proteins from Eucarya can be expressed in E. coli in soluble form [38, 39]. This percentage has been significantly increased (to nearly 50%) for human target proteins using a multi-construct approach. In the present study, 44% of the targets were obtained as soluble proteins. The best results were obtained for marine bacteria with 67% of soluble proteins, then archaea with 45% and, as expected, Eucarya gave the smallest percentage with 31%. These differences decrease if we take both insoluble and soluble proteins into account (Figure 6). In summary, we used a parallel production approach for bacterial expression in medium throughput to yield 84 soluble proteins from a total of 192 marine targets (44%).
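The overall rates quoted in this article follow directly from the reported counts (192 targets, 84 soluble and 34 insoluble proteins, 174 clones out of 183 PCR products). The short sketch below only restates that arithmetic; the helper name `percent` is illustrative.

```python
def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, expressed in percent."""
    return 100.0 * part / total


TARGETS = 192    # targets screened
SOLUBLE = 84     # soluble proteins isolated
INSOLUBLE = 34   # insoluble proteins isolated

soluble_rate = percent(SOLUBLE, TARGETS)                # 43.75, reported as 44%
expressed_rate = percent(SOLUBLE + INSOLUBLE, TARGETS)  # soluble + insoluble
cloning_efficiency = percent(174, 183)                  # reported as 95%

print(f"soluble: {soluble_rate:.1f}%")
print(f"soluble + insoluble: {expressed_rate:.1f}%")
print(f"cloning efficiency: {cloning_efficiency:.1f}%")
```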
While expression or crystallization strategies can be generalized to a common factor like their marine bacterial or eukaryotic origin, the setup for the functional screening is intimately bound to the gene family of interest to the different consortium members (Table 1). This latter step, performing the functional/biochemical characterization of the soluble expressed proteins, will therefore be conducted by each partner and will focus on the family of genes of their interest.
In conclusion, the present project provided purified proteins that are key reagents for numerous assays that address fundamental questions about their structure, function and regulation. For the first time, our medium throughput project allowed the expression of various proteins of marine origin in parallel, independent of organism. A rapid and cost-effective small-scale screening method for soluble expression of proteins from marine organisms in E. coli has been established, allowing the different partners to access large quantities of purified protein and to choose among targets of their interest for subsequent functional and/or structural analysis, which is currently underway.
Strains, plasmids and culture conditions
Escherichia coli strains and plasmids used in this study.
Genotypes and relevant properties
Sources or references
F- endA1 glnV44 thi-1 recA1 relA1 gyrA96 deoR nupG ϕ80dlacZΔM15 Δ(lacZYA-argF)U169, hsdR17(rK- mK+), λ-
F- ompT gal dcm lon hsdSB(rB- mB-)
F- ompT gal dcm lon hsdSB(rB- mB-) λ(DE3 [lacI lacUV5-T7 gene 1 ind1 sam7 nin5])
F- ompT hsdSB(rB- mB-) gal dcm
F- ompT hsdSB(rB- mB-) gal dcm λ(DE3 [lacI lacUV5-T7 gene 1 ind1 sam7 nin5])
F- ompT hsdSB(rB- mB-) gal dcm lacY1 ahpC gor522::Tn10 (TcR) trxB::kan pAR5615 (ApR)
F- ompT hsdSB(rB- mB-) gal dcm lacY1 ahpC gor522::Tn10 (TcR) trxB::kan pAR5615 (ApR) λ(DE3 [lacI lacUV5-T7 gene 1 ind1 sam7 nin5])
AmpR, tac promoter, GST.Tag
GE HealthCare, USA
AmpR, T7lac promoter, His.Tag
Bioinformatics analysis of the target sequences
The potential signal peptides and transmembrane domains have been predicted using SignalP and TMHMM, respectively [15, 16]. The modularity of each target protein has been examined using Blast queries against the UniProt database, as well as domain searches with the InterPro server. The precise delineation of each module has been refined using Hydrophobic Cluster Analysis (HCA). For this study, 192 modules were chosen with predicted masses between 7 and 140 kDa and rearrayed into two 96-well plates. All procedures were performed where possible in this 96-well format.
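Rearraying 192 modules into two 96-well plates amounts to mapping a target index to a plate number and a well label. The sketch below assumes a row-major fill order (A1 to A12, then B1, and so on) on a standard 8 x 12 plate; the fill order actually used is not stated in the text, and the function name is hypothetical.

```python
ROWS = "ABCDEFGH"   # 8 rows of a standard 96-well plate
COLUMNS = 12        # 12 columns per row


def well_position(index: int) -> tuple[int, str]:
    """Map a 0-based target index to (plate number, well label).

    Assumes row-major filling; this ordering is an illustrative choice.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    plate, offset = divmod(index, len(ROWS) * COLUMNS)
    row, column = divmod(offset, COLUMNS)
    return plate + 1, f"{ROWS[row]}{column + 1}"
```

With this layout, targets 0 to 95 fill plate 1 and targets 96 to 191 fill plate 2.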
Primers design and cloning method
Expression vectors (pGEX-4T1 and pFO4) were digested with BamHI and EcoRI. Each target sequence was searched for the restriction sites recognized by BamHI, EcoRI or their isocaudomers (BglII and MfeI, respectively) using BioEdit Sequence Alignment Editor (Ibis Biosciences Inc., USA). The target genes were classified into four compatible cloning strategies (BamHI/EcoRI, BamHI/MfeI, BglII/EcoRI and BglII/MfeI) in order to design the correct oligonucleotide primers and to assign the targets to positions in the 96-well plates. The standard scheme for primer design was defined as: for the forward primers, 5'-[hexa-G tail]-[BamHI or BglII]-[Hybridization site]-3' and for the reverse primers, 5'-[hexa-C tail]-[EcoRI or MfeI]-[stop anticodon]-[Hybridization site]-3'. Oligonucleotides for PCR were purchased in 96-well plates from Operon Biotechnologies GmbH (Cologne, Germany). PCR amplification was performed on a GeneAmp® PCR System 2700 (Applied Biosystems, USA). The thermocycling program was: denaturation at 95°C for 5 min, followed by thirty cycles of denaturation at 95°C for 30 s, annealing at 50°C for 30 s and polymerization at 72°C for 4 min. Template amplification was performed with Pfu polymerase (PROMEGA, USA) under the conditions recommended by the supplier. PCR reactions were analyzed on 1% agarose gels using standard procedures. The resulting PCR products were purified using the QIAquick™ 96 PCR purification Kit (QIAGEN, USA), digested with the appropriate restriction enzymes and cloned in parallel into the pFO4 and pGEX-4T1 expression vectors using standard procedures. PCR screening was performed directly on the DH5α bacterial colonies to verify clones with inserts of the expected size, using PCR primers which annealed upstream and downstream of the insertion site of pGEX-4T1 and pFO4. Target fragments were amplified using 10 μl of PCR Master Mix (PROMEGA, USA) added to 0.2 μl of each primer (100 μM) with the same program as described above.
Plasmid extraction was performed using MiniPrep SV purification Kit (PROMEGA, USA) and recombinant plasmids were used to transform E. coli expression strains.
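The classification into cloning strategies and the primer scheme described in this section can be sketched as follows. The recognition sequences are the standard ones for BamHI (GGATCC), BglII (AGATCT), EcoRI (GAATTC) and MfeI (CAATTG). Everything else is an assumption made for illustration: the function names, the fixed hybridization length of 18 nt (the study matched primers by theoretical Tm instead), the choice of TAA as stop codon, and the reading of "stop anticodon" as the reverse complement of the stop codon.

```python
# Standard recognition sites; all four are palindromic, so scanning one strand suffices.
SITES = {"BamHI": "GGATCC", "BglII": "AGATCT", "EcoRI": "GAATTC", "MfeI": "CAATTG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def pick_enzyme(target: str, preferred: str, isocaudomer: str) -> str:
    """Use the preferred enzyme unless its site occurs inside the target."""
    for enzyme in (preferred, isocaudomer):
        if SITES[enzyme] not in target.upper():
            return enzyme
    raise ValueError(f"target contains both {preferred} and {isocaudomer} sites")


def design_primers(target: str, hyb_len: int = 18, stop_codon: str = "TAA"):
    """Return (strategy, forward primer, reverse primer), all written 5'->3'.

    hyb_len and stop_codon are illustrative defaults, not values from the paper.
    """
    target = target.upper()
    five_prime = pick_enzyme(target, "BamHI", "BglII")
    three_prime = pick_enzyme(target, "EcoRI", "MfeI")
    forward = "GGGGGG" + SITES[five_prime] + target[:hyb_len]
    reverse = ("CCCCCC" + SITES[three_prime]
               + reverse_complement(stop_codon)
               + reverse_complement(target[-hyb_len:]))
    return f"{five_prime}/{three_prime}", forward, reverse
```

A target with an internal BamHI site but no BglII, EcoRI or MfeI site is assigned to the BglII/EcoRI strategy, and the same PCR product can still be ligated into the BamHI/EcoRI-digested vectors because the isocaudomers produce compatible overhangs.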
Screening for protein expression using 2 ml cultures
E. coli clones, for which the presence of the target gene had been verified by colony PCR as described above, were tested for the expression of the desired protein. Screening was done using 2 ml cultures in 24-deep well plates. Cultivation was performed in two phases. First, transformed colonies were grown at 37°C overnight in LB medium containing 100 μg ml-1 ampicillin. Then, cultures were diluted 1:100 with auto-inducible ZYP5052 medium containing 100 μg ml-1 ampicillin and subjected to further incubation at 20°C until the desired density was reached.
Lysis of cells and detection of proteins
For the solubility assay, cell pellets from small-scale expression cultures were resuspended in 500 μl of lysis buffer (Tris-HCl 50 mM, pH 7.5; NaCl 250 mM; EDTA 1 mM; lysozyme 1 mg ml-1; DNase 0.1 mg ml-1) and incubated at 18°C for 1 hour, and the soluble and insoluble fractions were separated by centrifugation (12000 g, 20 min, 4°C). Insoluble pellets were resuspended in 200 μl of lysis buffer supplemented with 6 M urea. Samples from soluble and insoluble fractions were separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) using 12% Criterion precast Bis-Tris gels with 26 wells. Targets were scored as positive for expression and solubility if a detectable fusion protein of the correct molecular weight was observed after Coomassie staining. In parallel, soluble fractions were purified using His or GST Microspin columns (GE Healthcare Life Science, USA) according to the protocol recommended by the supplier. The results were also analyzed by 12% SDS-PAGE.
This work has been funded by the network of excellence (GOCE-CT-2004-505403) 'Marine Genomics Europe' through a Flagship program entitled 'Marine-express' (2006-2008). DF is supported by grant 4852-REPAR/CREATE from the Brittany Regional Council. This work was also supported by the ANR program ARCREP (contract number: ANR-07-BLAN-0371-01) to DF and MC.
- Cohen G, Barbe V, Flament D, Galperin M, Heilig R, Lecompte O, Poch O, Prieur D, Quérellou J, Ripp R, Thierry JC, Van der Oost J, Weissenbach J, Zivanovic Y, Forterre P: An integrated analysis of the genome of the hyperthermophilic archaeon Pyrococcus abyssi. Mol Microbiol. 2003, 47: 1495-1512. 10.1046/j.1365-2958.2003.03381.x.
- Glöckner FO, Kube M, Bauer M, Teeling H, Lombardot T, Ludwig W, Gade D, Beck A, Borzym K, Heitmann K, Rabus R, Schlesner H, Amann R, Reinhardt R: Complete genome sequence of the marine planctomycete Pirellula sp. strain 1. Proc Natl Acad Sci USA. 2003, 100: 8298-8303. 10.1073/pnas.1431443100.
- Peters AF, Marie D, Scornet D, Kloareg B, Cock JM: Proposal of Ectocarpus siliculosus (Ectocarpales, Phaeophyceae) as a model organism for brown algal genetics and genomics. J Phycol. 2004, 40: 1079-1088. 10.1111/j.1529-8817.2004.04058.x.
- Derelle E, Ferraz C, Rombauts S, Rouze P, Worden AZ, Robbens S, Partensky F, Degroeve S, Echeynie S, Cooke R, Saeys Y, Wuyts J, Jabbari K, Bowler C, Panaud O, Piegu B, Ball SG, Ral JP, Bouget FY, Piganeau G, De Baets B, Picard A, Delseny M, Demaille J, Van de Peer Y, Moreau H: Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features. Proc Natl Acad Sci USA. 2006, 103: 11647-11652. 10.1073/pnas.0604795103.
- Dehal P, Satou Y, Campbell RK, Chapman J, Degnan B, De Tomaso A, Davidson B, Di Gregorio A, Gelpke M, Goodstein DM, Harafuji N, Hastings KE, Ho I, Hotta K, Huang W, Kawashima T, Lemaire P, Martinez D, Meinertzhagen IA, Necula S, Nonaka M, Putnam N, Rash S, Saiga H, Satake M, Terry A, Yamada L, Wang HG, Awazu S, Azumi K, Branno M, Chin-Bow S, DeSantis R, Doyle S, Francino P, Keys DN, Haga S, Hayashi H, Hino K, Imai KS, Inaba K, Kano S, Kobayashi K, Kobayashi M, Lee BI, Makabe KW, Manohar C, Matassi G, Medina M, Mochizuki Y, Mount S, Morishita T, Miura S, Nakayama A, Nishizaka S, Nomoto H, Ohta F, Oishi K, Sano M, Sasaki A, Sasakura Y, Shoguchi E, Shin-i T, Spagnuolo A, Stainier D, Suzuki MM, Tassy O, Takatori N, Tokuoka M, Yagi K, Yoshizaki F, Wada S, Zhang C, Hyatt PD, Larimer F, Detter C, Doggett N, Glavina T, Hawkins T, Richardson P, Lucas S, Kohara Y, Levine M, Satoh N, Rokhsar DS: The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science. 2002, 298: 2157-2167. 10.1126/science.1080049.
- Collén J, Roeder V, Rousvoal S, Collin O, Kloareg B, Boyen C: An expressed sequence tag analysis of thallus and regenerating protoplasts of Chondrus crispus (Gigartinales, Rhodophyceae). J Phycol. 2006, 42: 104-112. 10.1111/j.1529-8817.2006.00171.x.
- Roeder V, Collén J, Rousvoal S, Corre E, Leblanc L, Boyen C: Identification of stress gene transcripts in Laminaria digitata (Phaeophyceae) protoplast cultures by expressed sequence tag analysis. J Phycol. 2005, 41: 1227-1235. 10.1111/j.1529-8817.2005.00150.x.
- Thornton JM, Todd AE, Milburn D, Borkakoti N, Orengo CA: From structure to function: Approaches and limitations. Nat Struct Biol Structural Genomics Supplement. 2000, 991-994.
- Watson JD, Laskowski RA, Thornton JM: Predicting protein function from sequence and structural data. Curr Opin Struct Biol. 2005, 15: 275-284. 10.1016/j.sbi.2005.04.003.
- Sugar FJ, Jenney FE, Poole FL, Brereton PS, Izumi M, Shah C, Adams MW: Comparison of small- and large-scale expression of selected Pyrococcus furiosus genes as an aid to high-throughput protein production. J Struct Funct Genomics. 2005, 6: 149-158. 10.1007/s10969-005-3341-3.
- Busso D, Poussin-Courmontagne P, Rose D, Ripp R, Litt A, Thierry JC, Moras D: Structural genomics of eukaryotic targets at a laboratory scale. J Struct Funct Genomics. 2005, 6: 81-88. 10.1007/s10969-005-1909-6.
- Baneyx F: Recombinant protein expression in Escherichia coli. Curr Opin Biotechnol. 1999, 10: 411-421. 10.1016/S0958-1669(99)00003-8.
- Fox BG, Goulding C, Malkowski MG, Stewart L, Deacon A: Structural genomics: from genes to structures with valuable materials and many questions in between. Nat Methods. 2008, 5 (2): 129-132. 10.1038/nmeth0208-129.
- Gräslund S, Nordlund P, Weigelt J, Hallberg BM, Bray J, Gileadi O, Knapp S, Oppermann U, Arrowsmith C, Hui R, Ming J, dhe-Paganon S, Park HW, Savchenko A, Yee A, Edwards A, Vincentelli R, Cambillau C, Kim R, Kim SH, Rao Z, Shi Y, Terwilliger TC, Kim CY, Hung LW, Waldo GS, Peleg Y, Albeck S, Unger T, Dym O, Prilusky J, Sussman JL, Stevens RC, Lesley SA, Wilson IA, Joachimiak A, Collart F, Dementieva I, Donnelly MI, Eschenfeldt WH, Kim Y, Stols L, Wu R, Zhou M, Burley SK, Emtage JS, Sauder JM, Thompson D, Bain K, Luz J, Gheyi T, Zhang F, Atwell S, Almo SC, Bonanno JB, Fiser A, Swaminathan S, Studier FW, Chance MR, Sali A, Acton TB, Xiao R, Zhao L, Ma LC, Hunt JF, Tong L, Cunningham K, Inouye M, Anderson S, Janjua H, Shastry R, Ho CK, Wang D, Wang H, Jiang M, Montelione GT, Stuart DI, Owens RJ, Daenke S, Schütz A, Heinemann U, Yokoyama S, Büssow K, Gunsalus KC: Protein production and purification. Nat Methods. 2008, 5: 135-146. Review. 10.1038/nmeth.f.202.
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004, 340: 783-795. 10.1016/j.jmb.2004.05.028.
- Krogh A, Larsson B, von Heijne G, Sonnhammer ELL: Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001, 305: 567-580. 10.1006/jmbi.2000.4315.
- Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R: InterProScan: protein domains identifier. Nucleic Acids Res. 2005, 33: 116-120. 10.1093/nar/gki442.
- Lemesle-Varloot L, Henrissat B, Gaboriaud C, Bissery V, Morgat A, Mornon JP: Hydrophobic cluster analysis: procedures to derive structural and functional information from 2-D-representation of protein sequences. Biochimie. 1990, 72: 555-574. 10.1016/0300-9084(90)90120-6.
- Studier FW: Protein production by auto-induction in high density shaking cultures. Protein Expr Purif. 2005, 41: 207-234. 10.1016/j.pep.2005.01.016.
- Berrow NS, Büssow K, Coutard B, Diprose J, Ekberg M, Folkers GE, Levy N, Lieu V, Owens RJ, Peleg Y, Pinaglia C, Quevillon-Cheruel S, Salim L, Scheich C, Vincentelli R, Busso D: Recombinant protein expression and solubility screening in Escherichia coli: a comparative study. Acta Crystallogr D Biol Crystallogr. 2006, 62: 1218-1226. 10.1107/S0907444906031337.
- Schein CD: Production of soluble recombinant proteins in bacteria. Biotechnol. 1989, 7: 1141-1148.
- Nguyen H, Martinez B, Oganesyan N, Kim R: An automated small-scale protein expression and purification screening provides beneficial information for protein production. J Struct Funct Genomics. 2004, 5: 23-27. 10.1023/B:JSFG.0000029195.73810.86.
- Ren B, Kuhn J, Meslet-Cladiere L, Briffotaux J, Norais C, Lavigne R, Flament D, Ladenstein R, Myllykallio H: Structure and function of a novel endonuclease acting on branched DNA substrates. Embo J. 2009, 28 (16): 2479-2489. 10.1038/emboj.2009.192.
- Collins BK, Tomanicek SJ, Lyamicheva N, Kaiser MW, Mueser TC: A preliminary solubility screen used to improve crystallization trials: crystallization and preliminary X-ray structure determination of Aeropyrum pernix flap endonuclease-1. Acta Cryst D-Biol Cryst. 2004, 60: 1674-1678. 10.1107/S090744490401844X.
- Qin H, Hu J, Hua Y, Challa SV, Cross TA, Gao FP: Construction of a series of vectors for high throughput cloning and expression screening of membrane proteins from Mycobacterium tuberculosis. BMC Biotechnol. 2008, 16: 8-51.
- Wang BC, Adams MW, Dailey H, DeLucas L, Luo M, Rose J, Bunzel R, Dailey T, Habel J, Horanyi P, Jenney FE, Kataeva I, Lee HS, Li S, Li T, Lin D, Liu ZJ, Luan CH, Mayer M, Nagy L, Newton MG, Ng J, Poole FL, Shah A, Shah C, Sugar FJ, Xu H: Protein production and crystallization at SECSG -- an overview. J Struct Funct Genomics. 2005, 6 (23): 233-243. 10.1007/s10969-005-2462-z.
- Siaut M, Heijde M, Mangogna M, Montsant A, Coesel S, Allen A, Manfredonia A, Falciatore A, Bowler C: Molecular toolbox for studying diatom biology in Phaeodactylum tricornutum. Gene. 2007, 406 (1-2): 23-35.
- Geslin C, Gaillard M, Flament D, Rouault K, Le Romancer M, Prieur D, Erauso G: Analysis of the first genome of a hyperthermophilic marine virus-like particle, PAV1, isolated from Pyrococcus abyssi. J Bacteriol. 2007, 189: 4510-4519. 10.1128/JB.01896-06.
- Stevens RC: Design of high-throughput methods of protein production for structural biology. Structure. 2000, 8: R177-185. 10.1016/S0969-2126(00)00193-3.
- Terpe K: Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems. Appl Microbiol Biotechnol. 2003, 60: 523-533.
- Smith DB: Generating fusions to glutathione S-transferase for protein studies. Methods Enzymol. 2000, 326: 54-70.
- Hochuli E: Purification of recombinant proteins with metal chelate adsorbent. Genet Eng (N Y). 1990, 12: 87-98.
- Lesley SA: High-throughput proteomics: protein expression and purification in the postgenomic world. Prot Expr Purif. 2001, 22: 159-164. 10.1006/prep.2001.1465.
- Cowieson NP, Listwan P, Kurz M, Aagaard A, Ravasi T, Wells C, Huber T, Hume DA, Kobe B, Martin JL: Pilot studies on the parallel production of soluble mouse proteins in a bacterial expression system. J Struct Funct Genomics. 2005, 6: 13-20. 10.1007/s10969-005-0462-7.
- Knaust RK, Nordlund P: Screening for soluble expression of recombinant proteins in a 96-well format. Anal Biochem. 2001, 297: 79-85. 10.1006/abio.2001.5331.
- Busso D, Kim R, Kim SH: Expression of soluble recombinant proteins in a cell-free system using a 96-well format. J Biochem Biophys Methods. 2003, 55: 233-240. 10.1016/S0165-022X(03)00049-6.
- Vincentelli R, Canaan S, Offant J, Cambillau C, Bignon C: Automated expression and solubility screening of His-tagged proteins in 96-well format. Anal Biochem. 2005, 346: 77-84. 10.1016/j.ab.2005.07.039.
- Braun P, LaBaer J: High throughput protein production for functional proteomics. Trends Biotechnol. 2003, 21: 383-388. 10.1016/S0167-7799(03)00189-6.
- Büssow K, Scheich C, Sievert V, Harttig U, Schultz J, Simon B, Bork P, Lehrach H, Heinemann U: Structural genomics of human proteins--target selection and generation of a public catalogue of expression clones. Microb Cell Fact. 2005, 5: 21. 10.1186/1475-2859-4-21.
- Gräslund S, Sagemark J, Berglund H, Dahlgren LG, Flores A, Hammarström M, Johansson I, Kotenyova T, Nilsson M, Nordlund P, Weigelt J: The use of systematic N- and C-terminal deletions to promote production and structural studies of recombinant proteins. Protein Expr Purif. 2008, 58: 210-221. 10.1016/j.pep.2007.11.008.
- Sambrook J, Fritsch EF, Maniatis T: Molecular cloning: A Laboratory Manual. 1989, Cold Spring Harbor, Cold Spring Harbor Press
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.