Homogeneous production and characterization of recombinant N-GlcNAc-protein in Pichia pastoris

Background Therapeutic glycoproteins occupy an extremely important position in the biopharmaceutical market. N-Glycosylation helps protein drugs maintain their optimal conformations and affects their structural stability, serum half-life and biological efficacy. Homogeneous N-glycoproteins with defined N-glycans are therefore essential for clinical applications. However, several obstacles to obtaining homogeneous N-glycans remain, such as the high production costs associated with the widespread use of mammalian cell expression systems, non-humanized N-glycan structures and N-glycosylation microheterogeneity between batches. Results In this study, we constructed a Pichia pastoris (Komagataella phaffii) expression system that produces truncated N-GlcNAc-modified recombinant proteins by introducing an ENGase isoform (Endo-T) with strong hydrolytic activity towards high-mannose type N-glycans. The results showed that the subcellular location of Endo-T, such as the endoplasmic reticulum (ER), Golgi or cell membrane, affected its hydrolytic efficiency. When Endo-T was expressed in the Golgi, the secreted IgG1-Fc region was efficiently produced with almost completely truncated N-glycans, and the N-GlcNAc modification at the glycosite Asn297 was confirmed by mass spectrometry. Conclusion This strategy provides a simple glycoengineered yeast expression system for producing N-GlcNAc-modified proteins, which could be further extended to different N-glycan structures. This system would provide a prospective platform for mass production of a growing number of novel glycoprotein drugs.

At present, therapeutic glycoproteins occupy an increasing proportion of the biopharmaceutical market. Glycoprotein drugs are widely used against diverse diseases, such as invasive diseases caused by pathogenic microbes, autoimmune disorders and cancers. It has been shown that N-glycosylation and N-glycan structures can affect the biophysical and pharmacokinetic properties of therapeutic glycoproteins [9][10][11]. Several novel approaches have been developed to engineer the N-glycosylation pathway and decrease the microheterogeneity of therapeutic proteins, using either in vitro chemoenzymatic methods or in vivo engineered expression systems [11][12][13][14][15][16][17][18].
Pichia pastoris, which was reassigned to the genus Komagataella in 1995 [34], is an organism commonly employed to produce a variety of active proteins [35][36][37] with N- and/or O-linked glycans [38][39][40]. The N-linked glycans of P. pastoris-produced proteins are of the high-mannose type without core fucose [41], which leads to reduced in vivo half-life and therapeutic function. Engineered P. pastoris strains have been constructed to produce glycoproteins with human-like N-glycosylation profiles [39,42], but the products are still heterogeneous and are obtained at lower yield [39,40,43].
In this study, we constructed a P. pastoris system expressing truncated N-GlcNAc-modified recombinant proteins by introducing an ENGase isoform (Endo-T), which possesses strong hydrolytic activity towards high-mannose type N-glycans in the intracellular environment, into different subcellular fractions. We believe that this easy and low-cost method of glycoprotein synthesis would provide a prospective platform for mass production of a growing number of novel glycoprotein drugs with diverse homogeneous N-glycan structures.

Expression of Endo-T on the surface of Pichia pastoris
Endo-T, secreted by Hypocrea jecorina (Trichoderma reesei), is the first fungal member of glycoside hydrolase family 18 with ENGase-type activity [44]. In the GlycoDelete glycoengineering strategy, Endo-T has been successfully expressed in the Golgi of mammalian cells and plants to produce recombinant proteins with homogeneous N-glycan structures [17,18], and it has been used in P. pastoris to enhance the expression of an integral membrane protein carrying homogeneous N-GlcNAc [45]. Here, we first expressed Endo-T on the surface of P. pastoris using the Pir1-based surface display system [46]. To detect surface expression of Endo-T, immunofluorescence staining with anti-Flag antibody was performed. P. pastoris cells anchored with Endo-T were clearly labeled, while no immunofluorescence was observed in cells transformed with an empty plasmid (Fig. 1a). This result indicated that Endo-T was successfully expressed on the cell surface. The human IgG1-Fc region and GalNAc-T1, both recombinantly expressed in P. pastoris, and ribonuclease B (RNase B, Sigma) were used as substrates to test the deglycosylation activity of the immobilized Endo-T. Endo-T on the cell surface removed high-mannose type N-glycans from the different glycoproteins (Fig. 1b, Additional file 1: Figure S1). Compared with commercial PNGase F, the surface-displayed Endo-T showed lower deglycosylation efficiency (Fig. 1b, Additional file 1: Figure S1): PNGase F released most of the glycans from the IgG Fc domain within 1 h, whereas approximately 40% of the glycoprotein remained glycosylated after treatment with surface-displayed Endo-T. We also co-expressed the human IgG1-Fc region in P. pastoris with surface-displayed Endo-T and found that most of the protein still retained its N-glycans (data not shown).

Expression of ENGase in the ER or Golgi of Pichia pastoris
Endo-T has been expressed in the Golgi to produce recombinant proteins with homogeneous N-glycan structures [17]. Here, we fused Endo-T with the transmembrane region of S. cerevisiae MNN9 (mannosyltransferase) [47] or MNS1 (endoplasmic reticulum mannosyl-oligosaccharide 1,2-alpha-mannosidase) [48,49] to localize Endo-T to the Golgi or the endoplasmic reticulum (ER), respectively. The fusion proteins were expressed in P. pastoris to create a platform for producing proteins with homogeneous N-GlcNAc modification instead of heterogeneous high-mannose type N-glycans (Fig. 2a, b). In this study, human polypeptide N-acetylgalactosaminyltransferase 1 (GalNAc-T1), which contains two N-glycans, was selected to characterize the engineered yeast strains. The reporter protein construct, built on the plasmid pPIC9K (Invitrogen), included the Saccharomyces cerevisiae α-mating factor signal at the N-terminus to direct the protein to the ER membrane and a hexa-histidine tag at the C-terminus. When human GalNAc-T1 was expressed in the GS115 background, the protein appeared as a single band of approximately 70 kDa (Fig. 2c). When it was expressed in the engineered host strains, which express the ENGase (Endo-T) in the ER or Golgi, the target protein was produced with a similar yield but appeared as three bands on SDS-PAGE and Western blot (Fig. 2c). After in vitro treatment with PNGase F, all samples showed a single band of similar MW (Fig. 2d), providing evidence that the lower bands in the samples from the engineered strains were proteins from which Endo-T had removed one or two N-glycans, although the deglycosylation efficiency was not high enough to remove all the N-glycans.
Different fermentation conditions, such as the pH of the culture medium (BMMY), methanol concentration and incubation temperature, were tested for the production of total and deglycosylated GalNAc-T1 (Additional file 1: Figures S2, S3, S4). The culture temperature had a great influence on the stability of the GalNAc-T1 protein, and a low temperature (20 °C) was preferred. More deglycosylated GalNAc-T1 protein was produced in P. pastoris MNN9-EndoT strains cultured in BMMY (pH 6.0) for 4-5 days at 20 °C with 0.5% methanol (v/v) added to the culture every 24 h.
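The three-band pattern described above lends itself to a simple densitometric estimate of how far deglycosylation has proceeded. The following is a minimal sketch only: the band intensities are made-up placeholder values, not measurements from this study, and the function and key names are hypothetical.

```python
def glycoform_fractions(intensities):
    """Return each band's share of the total lane intensity."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {band: value / total for band, value in intensities.items()}


# Placeholder densitometry readings (arbitrary units) for the three bands:
# protein with two, one or no intact high-mannose N-glycans remaining.
example_lane = {"two_intact": 30.0, "one_intact": 45.0, "none_intact": 25.0}

fractions = glycoform_fractions(example_lane)
for band, share in fractions.items():
    print(f"{band}: {share:.0%}")
```

With real gel scans, the same calculation would be applied to background-corrected band intensities from each lane.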

Characterization of IgG1-Fc region with N-GlcNAc
The IgG1-Fc region, which harbors an N-glycan moiety at Asn-297 [50], was selected for expression in the engineered strains. The full-length human IgG1-Fc including the hinge region was cloned into the pPIC9K vector (Invitrogen) and the resulting recombinant plasmid was transformed into the engineered P. pastoris expression strain. After 4 or 5 days of induction with 0.5% methanol, the proteins in the culture supernatant were precipitated with acetone and analyzed by SDS-PAGE. The IgG1-Fc produced by wild-type P. pastoris appeared as a protein band at ~38 kDa (Fig. 3a), which was in agreement with the calculated mass of heterogeneously glycosylated monomeric IgG1-Fc (33-34 kDa). When we expressed IgG1-Fc in the engineered yeast strains, however, it appeared at a slightly smaller molecular weight (Fig. 3a). We therefore inferred that the IgG1-Fc region expressed in the Endo-T-harboring strains was deglycosylated. Moreover, more than 95% of the IgG1-Fc in P. pastoris MNN9-EndoT strains was deglycosylated, while approximately 10% of the IgG1-Fc in P. pastoris MNS1-EndoT still carried N-glycans (Fig. 3a). The recombinant protein harvested from the P. pastoris MNN9-EndoT strains was then purified by affinity chromatography on a protein G column, and approximately 200-250 mg of recombinant IgG1-Fc was obtained from 1 L of fermentation medium (Fig. 3b, Additional file 1: Figure S5), which was higher than in previous reports (10 to 100 mg/L) [51][52][53]. The purified IgG1-Fc from the WT and MNN9-EndoT strains was analyzed by ConA blot (Additional file 1: Figure S6), which suggested that the N-glycan was truncated in the engineered strain. To determine whether the N-glycan structure was a single GlcNAc moiety, IgG1-Fc region proteins produced in E. coli and in the P. pastoris MNN9-EndoT strain were digested with endoproteinase Glu-C and analyzed by MALDI-TOF MS (Fig. 3c) and LCMS-IT-TOF (Additional file 1: Figure S7). The protein from P. pastoris WT, with its large heterogeneous N-glycans, was not easy to detect or to compare with the protein from the engineered strain (Table S2). On the other hand, N-GlcNAc-IgG1-Fc from the P. pastoris MNN9-EndoT strain gave a peak at m/z 3053.68, indicating the addition of one HexNAc (a mass increase of 203 Da) to this peptide (Fig. 3c).
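The mass-shift argument can be checked with simple arithmetic. The sketch below assumes a singly charged ion (so that the m/z shift equals the mass shift) and uses the monoisotopic residue mass of HexNAc (C8H13NO5, 203.0794 Da); the function name is hypothetical, and the computed value is only what the unmodified peptide would be expected to give under these assumptions, not a measurement reported here.

```python
# Monoisotopic mass of one HexNAc residue (C8H13NO5) in Da.
HEXNAC_RESIDUE_MASS = 203.0794


def mz_without_hexnac(observed_mz, n_hexnac=1):
    """Expected m/z of the same singly charged peptide lacking the HexNAc(s)."""
    return observed_mz - n_hexnac * HEXNAC_RESIDUE_MASS


observed_glycopeptide_mz = 3053.68  # value reported for the MNN9-EndoT sample
expected_unmodified_mz = mz_without_hexnac(observed_glycopeptide_mz)
print(f"expected unmodified peptide: m/z {expected_unmodified_mz:.2f}")
```

Under these assumptions the corresponding peptide without the sugar would appear near m/z 2850.60, which is the kind of value one would compare against the aglycosylated E. coli control.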

Structural conformation of N-GlcNAc IgG1-Fc
The hinge-containing IgG1-Fc region should be covalently linked as a homodimer through disulfide bond formation [54]. SDS-PAGE with or without reduction was used to assess dimer formation.
On the SDS-PAGE gel, the IgG1-Fc appeared as a protein band at ~38 kDa (from the WT strain) or ~34 kDa (from the engineered strain) under reducing conditions (with DTT treatment), and at ~60 kDa (from the WT strain) or ~55 kDa (from the engineered strain) under non-reducing conditions (without DTT treatment) (Fig. 4a). These results were consistent with previous observations [28], including the finding that the dimer appears smaller on SDS-PAGE than its calculated molecular weight [28]. They indicate that the recombinant IgG1-Fc proteins from P. pastoris, both with and without the N-glycans, were obtained as homodimers.
The secondary structures of the IgG1-Fc regions expressed in P. pastoris were determined using far-UV circular dichroism (CD) spectroscopy (Fig. 4b). The IgG1-Fc regions purified from the P. pastoris WT strain and from the engineered P. pastoris strain were tested and compared. The secondary structure of the Fc fragment at 25 °C consists primarily of beta-strands, and a wavelength of 218 nm has been used to follow unfolding by CD [53]. For the WT-Fc, the spectrum obtained at 25 °C showed a maximum negative peak at 218 nm, similar to previous reports [53]. Moreover, the CD spectrum of N-GlcNAc-Fc showed only minor differences from the WT spectrum (Fig. 4b), consistent with results for deglycosylated IgG [55] and aglycosylated Fc [56]. The Fc fragments with truncated glycans thus have intact secondary and tertiary structures that are very similar to those of the wild-type Fc fragment, with a characteristic minimum at 218 nm.
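The comparison of the spectral minima can be illustrated with a short sketch that subtracts a buffer blank from a CD trace and reports the wavelength of the most negative signal. The wavelengths and ellipticity values below are synthetic placeholders chosen for illustration, not data from Fig. 4b, and the function names are hypothetical.

```python
def subtract_blank(sample, blank):
    """Point-by-point subtraction of the buffer blank from the sample signal."""
    if len(sample) != len(blank):
        raise ValueError("sample and blank must cover the same wavelengths")
    return [s - b for s, b in zip(sample, blank)]


def wavelength_of_minimum(wavelengths, signal):
    """Wavelength at which the (blank-corrected) CD signal is most negative."""
    index = min(range(len(signal)), key=signal.__getitem__)
    return wavelengths[index]


# Synthetic example values (nm and mdeg), for illustration only.
wavelengths = [210, 214, 218, 222, 226]
sample_signal = [-3.0, -5.5, -7.2, -5.0, -2.4]
blank_signal = [0.2, 0.1, 0.1, 0.0, 0.1]

corrected = subtract_blank(sample_signal, blank_signal)
print(wavelength_of_minimum(wavelengths, corrected))
```

Running the same two steps on the WT-Fc and N-GlcNAc-Fc traces would give a quick numerical check that both minima fall at the same wavelength.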

Discussion
Glycoproteins are an important class of biomolecules involved in many physiological and pathological processes. Several strategies have been developed to produce glycoproteins with homogeneous glycan structures [11][12][13][14], of which ENGase-mediated N-glycan remodeling is a powerful approach for preparing defined glycoconjugates. The major limitation of this method is the difficulty of obtaining N-GlcNAc proteins in large quantities. In this study, we constructed a P. pastoris expression system that localizes recombinant ENGases to the cell membrane, ER or Golgi in order to produce secreted N-GlcNAc-modified proteins. Our results showed that the subcellular location of the ENGase affected its hydrolytic efficiency.
Pichia pastoris is an expression host widely used to produce functional N-glycoproteins [35][36][37] with high yields [57]. Expression levels of recombinant proteins in P. pastoris of up to 10 g/L have been reported [58]. The N-linked glycans from P. pastoris are of the high-mannose type without core fucose, which are preferred substrates for a variety of ENGase isoforms. We set out to build an expression system that localizes the recombinant ENGase to the cell surface membrane, ER or Golgi. As an enzyme immobilized on the cell surface, the ENGase could hydrolyze glycans from N-glycoproteins in an in vitro reaction system, but few deglycosylated proteins were found in the methanol-containing culture medium. When the ENGase was expressed in the Golgi or ER, the secreted target glycoprotein was efficiently deglycosylated. The hydrolytic activity of the ENGase against the IgG Fc domain and GalNAc-T1 was higher when it was fused with MNN9 than when it was fused with MNS1. We assume that Endo-T prefers the microenvironment of the yeast Golgi, such as its intracellular pH and the glycan structures present there.
Human IgG1 carries a conserved N-glycan at Asn-297 of its Fc region. The presence and precise structure of this N-glycan play an important role in determining the antibody's structure and effector functions. For example, deglycosylated IgG1 is highly flexible and more prone to aggregation [59,60]; removal of the core fucose from the N-glycans increases the affinity of the Fc for FcγRIIIA [14,[61][62][63]; and terminal α2,6-sialylation is critical for its anti-inflammatory activity [64][65][66]. Fc region-containing fusion proteins are also influenced by the structure of their N-glycans [67][68][69]. Both full-length human IgG1 and the IgG1-Fc region have been expressed in P. pastoris for glycan remodeling, in which the N-glycans need to be removed by in vitro reactions [14,28]. When IgG1-Fc was expressed in our engineered strain (MNN9-EndoT), > 95% of the secreted IgG1-Fc harbored only one GlcNAc moiety. Our results also showed that the total yield, the secondary structure and the protein conformation were not affected by the removal of the N-glycans. As the secreted proteins have already folded to the native state in the ER, deglycosylation in the Golgi should affect the secretion of glycoproteins only slightly. Thus, N-GlcNAc IgG1-Fc protein produced in engineered P. pastoris should have the same properties as the in vitro deglycosylated proteins used for further N-glycan remodeling [14,27,30]. With our strategy, N-GlcNAc proteins can be obtained in high yield from the culture medium via a simple purification step.
Combined with in vitro glycan remodeling or enzymatic elongation methods, this engineered P. pastoris system provides a prospective platform for efficient production of recombinant glycoprotein drugs. On the other hand, the system was not efficient enough to remove all the N-glycans when more than one oligosaccharide was attached to the target protein. Several factors might be responsible for the decreased hydrolytic activity of the ENGase: (1) steric hindrance caused by the localized expression; (2) an intracellular pH in the Golgi that is not optimal for Endo-T; and (3) a culture temperature (20-25 °C) that was too low. However, the lower pH of the medium (pH 6.0) and the lower culture temperature (20-25 °C) were important for higher yields of secreted recombinant proteins. The precise optimum pH of an ENGase generally corresponds with the catalytic carboxylic acid residues in the enzyme active site [70][71][72] and depends on the individual ENGase isoform [27]. The hydrolytic activity of ENGases is pH-dependent and drops rapidly when the pH is either higher or lower than the optimum [70]. Temperature is another factor that affects the hydrolytic activity of ENGases. Most of the novel ENGase isoforms are derived from microbes, so their optimum temperature is 30-37 °C and a lower temperature would decrease the hydrolytic activity. We suppose that temperature was the major reason for the lower deglycosylation efficiency of the fungal ENGase (Endo-T) in P. pastoris than in mammalian or plant cells. In future work, we will screen for and apply novel ENGase isoforms that possess strong hydrolytic activity towards high-mannose type N-glycans under the culture conditions of P. pastoris, such as pH 6.0 and 20-25 °C.

Conclusions
In this work, we developed a simple glycoengineered yeast expression system to efficiently produce homogeneous N-GlcNAc modified glycoproteins which could be further elongated to different N-glycan structures. We believe the application of this easy and low-cost glycoprotein synthetic method would provide a prospective platform to efficiently produce a growing number of novel glycoprotein drugs.

Bacterial strains, media and chemicals
Pichia pastoris GS115 (his4−), pGAPZa and pPIC9K used for protein expression were obtained from Invitrogen (Thermo Fisher Scientific). Escherichia coli TOP10 or DH5α was used as the host for recombinant DNA construction. E. coli was grown in Luria-Bertani (LB) medium at 37 °C with 100 μg/mL ampicillin or 50 μg/mL zeocin where necessary. Buffered minimal glycerol (BMGY) medium, buffered minimal methanol (BMMY) medium and minimal dextrose (MD) medium were prepared following the P. pastoris expression manual (Invitrogen). Mouse anti-His monoclonal antibody and mouse anti-Flag monoclonal antibody were purchased from Genscript Bio-Technologies (Nanjing, China). Con A-Biotin was purchased from Vector Laboratories. HRP-conjugated secondary antibody and HRP-conjugated streptavidin were purchased from ZSGB-Bio (Beijing, China). All other chemicals and solvents were bought from Sangon-Biotech (Shanghai, China).

Plasmid construction and transformation
The genes (sequences in Additional file 2: Table S1) and primers (Table 1) used in this study were synthesized by Genscript Bio-Technologies. PCR was performed using the relevant primer pairs listed in Table 1. The EndoT gene was cloned into pPIC9K-Pir1 with EcoRI and MluI to make the construct pPIC9K-Pir1-EndoT, which was introduced into P. pastoris GS115 as previously reported [46]. The DNA encoding the transmembrane region of S. cerevisiae MNN9 (mannosyltransferase) or MNS1 (endoplasmic reticulum mannosyl-oligosaccharide 1,2-alpha-mannosidase) was fused with the EndoT gene and cloned into pGAPZa with EcoRI and NotI to make the constructs pGAPZa-MNN9-EndoT and pGAPZa-MNS1-EndoT, respectively. The plasmids were linearized with BspHI and introduced into P. pastoris GS115 via the Gene Pulser Xcell Electroporation System (Bio-Rad). Multicopy-insert transformants were selected on YPD plates containing 1 mg/mL Zeocin. The Zeocin-resistant clones were confirmed by PCR with pGAP-F and EndoT-R. The cDNAs encoding human GalNAc-T1 and the IgG1-Fc region were each subcloned into the pPIC9K vector. The resultant clones, named pPIC9K-GALNT1 and pPIC9K-Fc, were selected and confirmed by DNA sequencing. The plasmids pPIC9K-GALNT1 and pPIC9K-Fc were linearized with SacI and introduced into P. pastoris GS115 WT and into the pGAPZa-MNN9-EndoT and pGAPZa-MNS1-EndoT strains obtained above. Multicopy-insert transformants were selected on MD plates and subsequently on YPD plates containing different concentrations of G418 (0.5 mg/mL, 1 mg/mL, 2 mg/mL or 4 mg/mL). The G418-resistant clones were confirmed by PCR with GalNAc-T1-F or Fc-F and 3′-AOXI primers. The PCR-positive clones from the 4 mg/mL G418 plates were selected for expression. In addition, pET28a-IgG1-Fc was transformed into E. coli BL21 (DE3) as a control.
Table 1 The primers used in this study

Analysis of engineered P. pastoris strains
The engineered P. pastoris Pir1-EndoT strains were cultured in BMMY medium with 0.5% methanol (v/v) for 12 h and washed with PBS. For immunofluorescence staining, the P. pastoris WT and Pir1-EndoT strains were incubated with anti-Flag antibody and subsequently with FITC-conjugated rabbit antibody against mouse Ig for 45 min, and were mounted with antifade reagent (BBI Life Sciences). Fluorescence microscopy was performed using a Zeiss Axioskop 2 plus with an AxioCam MR3. Bit depth and pixel dimensions were 36 bits and 1388 × 1040 pixels, respectively. For Western blot analysis, the P. pastoris strains were lysed with glass beads and the lysates were probed with anti-Flag antibody.

Expression and purification of recombinant proteins
Recombinant yeast clones were grown at 30 °C in 50 mL BMGY until the OD600 reached 2-6. For the fermentation condition screen, cells were harvested and cultured in BMMY (pH 6.0, 6.5 or 7.0) for 4-5 days at different temperatures (20 °C or 25 °C), and 0.5% or 1% methanol (v/v) was added to the culture every 24 h. Samples of the fermentation culture taken after 2-5 days were precipitated with cold acetone, and Coomassie-stained SDS-PAGE was used to assess the production of total and glycosylated proteins.
After fermentation, secreted recombinant proteins were purified using Ni-NTA agarose (for GalNAc-T1) or a Protein G column (for the IgG1-Fc region). For GalNAc-T1, the cell-free supernatant was loaded onto the Ni-NTA column pre-equilibrated with binding buffer (20 mM Tris, pH 8.0, 150 mM NaCl, 20 mM imidazole). After washing with 30 mL of binding buffer, the purified proteins were eluted with binding buffer containing 250 mM imidazole. For the IgG1-Fc region, the cell-free supernatant was diluted fivefold with PBS buffer and loaded onto the Protein G column pre-equilibrated with PBS buffer. After washing with 30 mL of PBS buffer, the purified proteins were eluted with 0.1 M glycine buffer, pH 2.7. The eluted protein was neutralized immediately with 1 M Tris-HCl (pH 7.0). The positive fractions (determined by SDS-PAGE) were desalted and stored at −20 °C. Recombinant IgG1-Fc region produced in E. coli was purified following the same Ni-NTA protocol.
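The screened variables can be laid out as an explicit condition table. The sketch below enumerates every combination of the pH, temperature and methanol levels named above; it assumes a full factorial layout purely for illustration, since the text does not state whether every combination was actually run, and the variable names are hypothetical.

```python
from itertools import product

# Levels taken from the fermentation condition screen described above.
PH_VALUES = (6.0, 6.5, 7.0)
TEMPERATURES_C = (20, 25)
METHANOL_PERCENT_V_V = (0.5, 1.0)

# Assumption: full factorial design (every level combined with every other).
conditions = [
    {"pH": ph, "temperature_C": temp, "methanol_percent_v_v": methanol}
    for ph, temp, methanol in product(
        PH_VALUES, TEMPERATURES_C, METHANOL_PERCENT_V_V
    )
]

for number, condition in enumerate(conditions, start=1):
    print(number, condition)
```

A table like this makes it easy to label flasks and to record, per condition, the yields of total and deglycosylated protein read from the gels.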
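As a worked example of preparing the Ni-NTA binding buffer, the sketch below converts the stated millimolar concentrations into grams per litre. It assumes the buffer is made from Tris base (with the pH then adjusted to 8.0 using HCl, which is not specified in the text) and uses standard molar masses; the function and dictionary names are hypothetical.

```python
# Standard molar masses in g/mol.
MOLAR_MASS_G_PER_MOL = {"Tris base": 121.14, "NaCl": 58.44, "imidazole": 68.08}

# Binding buffer composition from the protocol, in mM.
BINDING_BUFFER_MM = {"Tris base": 20, "NaCl": 150, "imidazole": 20}


def grams_needed(concentration_mM, molar_mass_g_per_mol, volume_L):
    """Mass of solid required for the given concentration and final volume."""
    return concentration_mM / 1000 * molar_mass_g_per_mol * volume_L


volume_L = 1.0
recipe = {
    name: grams_needed(conc, MOLAR_MASS_G_PER_MOL[name], volume_L)
    for name, conc in BINDING_BUFFER_MM.items()
}

for name, grams in recipe.items():
    print(f"{name}: {grams:.2f} g per {volume_L:g} L")
```

The elution buffer differs only in its imidazole concentration (250 mM), so the same function can be reused by changing that one entry.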

SDS-PAGE and western blot
Purified IgG1-Fc region and GalNAc-T1 proteins were treated with peptide N-glycosidase F (PNGase F, New England Biolabs) following the manufacturer's protocol. Samples were run on 12% SDS-PAGE gels with or without DTT reduction and transferred onto polyvinylidene fluoride membranes for 90 min. After blocking in 5% BSA or 1% polyvinylpyrrolidone (Sigma), the membranes were incubated with His-tag antibody or ConA-B, respectively, at 4 °C overnight. Blots were developed with a DAB Substrate kit (Solarbio, China) following incubation with HRP-conjugated secondary antibody for 1 h at room temperature.

Mass spectrometric analysis of IgG1-Fc protein
Approximately 20 μg of Fc protein was reduced with 10 mM DTT in 50 mM ammonium bicarbonate (AmBic) for 45 min at 60 °C and alkylated with 20 mM iodoacetamide at room temperature for 30 min. Then, 10 mM DTT was added to terminate alkylation before the protein was subjected to proteolysis by Glu-C (Promega). The digestion was terminated by boiling, and the digested peptides were desalted via a standard C18 Zip-Tip procedure and analyzed by MALDI-TOF MS (Shimadzu, Tokyo, Japan) or an LCMS-IT-TOF system (Shimadzu, Tokyo, Japan) operated in the positive linear mode.
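To predict which peptides a Glu-C digest should yield, an in silico cleavage can be sketched as follows. This is a simplified illustration that assumes cleavage only on the C-terminal side of glutamate (the main specificity of Glu-C in ammonium bicarbonate) and ignores missed cleavages; the sequence used is a toy example, not the IgG1-Fc sequence, and the function name is hypothetical.

```python
def gluc_digest(sequence):
    """Split a protein sequence after every glutamate (E) residue."""
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue == "E":
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        # Keep the C-terminal fragment that does not end in Glu.
        peptides.append(sequence[start:])
    return peptides


toy_sequence = "AKEGNSTLEMR"  # illustrative only
print(gluc_digest(toy_sequence))
```

Applied to the real Fc sequence, the peptide containing the N-S-T sequon around Asn297 would be the one whose mass is compared with and without the HexNAc addition.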

Circular dichroism spectroscopy
The secondary structure of the IgG1-Fc domain (from the P. pastoris WT and MNN9-EndoT strains) was determined by circular dichroism using a J-815 Jasco spectropolarimeter (Jasco Co., Tokyo, Japan) equipped with a PTC-348 WI thermostat under a constant nitrogen flow. A 0.1-cm path length cell was used to collect data in the far-ultraviolet region (200-250 nm) at a scan speed of 20 nm/min and a response time of 1 s. Spectra were acquired at 25 °C in PBS buffer. The spectrum of a blank containing buffer alone was subtracted from all spectra. The CD data were analyzed using the CDtoolX