Expression efficiency and sequence-based factors: a comparative view on 79 human genes in Pichia pastoris

High-yield expression of heterologous proteins is usually a matter of "trial and error". In the search for parameters with a major impact on expression, we have applied a comparative analysis to 79 homologous strains of Pichia pastoris harbouring different human cDNAs. Recombinant protein expression was monitored in a standardized procedure and classified with respect to the expression level. More than ten sequence-based parameters with a possible influence on the expression level were analysed. Three factors proved to have a statistically significant association with the expression level. First, a low abundance of AT-rich regions in the cDNA is associated with a high expression level, indicating that correct transcript processing is a major determinant of expression success in this yeast. Second, a comparatively high isoelectric point of the recombinant protein is associated with failure of expression. Third, the occurrence of a protein homologue in yeast is associated with detectable protein expression. Interestingly, some often-discussed factors such as codon usage or GC content did not show a significant impact on protein yield.
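As an illustration of what a sequence-based parameter can look like in practice, the following minimal sketch computes two of the quantities named above for a cDNA string: the overall GC content and the start positions of AT-rich windows. The function names, the window length and the AT-fraction threshold are assumptions chosen for illustration only; the abstract does not state how AT-rich regions were defined in the study.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def at_rich_windows(seq: str, window: int = 20, min_at_fraction: float = 0.8) -> list[int]:
    """Return start indices of sliding windows whose A+T fraction meets the threshold.

    The window length and threshold are hypothetical defaults,
    not values taken from the study.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        if at_count / window >= min_at_fraction:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy sequence, not a real cDNA.
    example = "ATATATATATGCGCGCGCGC"
    print("GC content:", gc_content(example))
    print("AT-rich window starts:", at_rich_windows(example, window=10, min_at_fraction=0.85))
```

Counting or summing such windows per gene gives one number per cDNA that could then be compared across expression classes; which statistic and test the authors used is not specified here.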

These results could provide a basis for a knowledge-oriented optimisation of gene sequences both to increase protein yields and to help target selection and the design of high-throughput expression approaches.

Lang, C., Böttner, M. Expression efficiency and sequence-based factors: a comparative view on 79 human genes in Pichia pastoris. Microb Cell Fact 5, S34 (2006).

