Two novel deep-sea sediment metagenome-derived esterases: residue 199 is the determinant of substrate specificity and preference

Background The deep-sea environment harbors a vast pool of novel enzymes. Owing to the limitations of cultivation, cultivation-independent has become an effective method for mining novel enzymes from the environment. Based on a deep-sea sediment metagenomics library, lipolytic-positive clones were obtained by activity-based screening methods. Results Two novel esterases, DMWf18-543 and DMWf18-558, were obtained from a deep-sea metagenomic library through activity-based screening and high-throughput sequencing methods. These esterases shared 80.7% amino acid identity with each other and were determined to be new members of bacterial lipolytic enzyme family IV. The two enzymes showed the highest activities toward p-nitrophenyl (p-NP) butyrate at pH 7.0 and 35–40 °C and were found to be resistant to some metal ions (Ba2+, Mg2+, and Sr2+) and detergents (Triton X-100, Tween 20, and Tween 80). DMWf18-543 and DMWf18-558 exhibited distinct substrate specificities and preferences. DMWf18-543 showed a catalytic range for substrates of C2–C8, whereas DMWf18-558 presented a wider range of C2–C14. Additionally, DMWf18-543 preferred p-NP butyrate, whereas DMWf18-558 preferred both p-NP butyrate and p-NP hexanoate. To investigate the mechanism underlying the phenotypic differences between the esterases, their three-dimensional structures were compared by using homology modeling. The results suggested that residue Leu199 of DMWf18-543 shortens and blocks the substrate-binding pocket. This hypothesis was confirmed by the finding that the DMWf18-558-A199L mutant showed a similar substrate specificity profile to that of DMWf18-543. Conclusions This study characterized two novel homologous esterases obtained from a deep-sea sediment metagenomic library. The structural modeling and mutagenesis analysis provided insight into the determinants of their substrate specificity and preference. The characterization and mechanistic analyses of these two novel enzymes should provide a basis for further exploration of their potential biotechnological applications. Electronic supplementary material The online version of this article (10.1186/s12934-018-0864-4) contains supplementary material, which is available to authorized users.

Pairs of enzymes showing high sequence identity but some phenotypic differences are good materials for studying the mechanisms of catalysis. Their amino acid sequences and crystal structures or structural models can be compared to reveal the sequence or structural differences that cause phenotypic differences. For example, a comparison of the structures of the esterases RmEstA (preferring longer-chain esters) and RmEstB (preferring shorter-chain esters) has revealed that residues Phe222 and Trp92 of RmEstB are related to the shape of the substrate-binding pocket, thereby affecting substrate specificity [17]. Additionally, structural and biochemical analyses of the esterases EstFa_R (pH optimum of 5.0) and SshEstI (pH optimum of 8.0) have been carried out to investigate the mechanism underlying low-pH adaptation, and the unique extended hydrogen bond network around the catalytic triad and the negatively charged surface around the active site have been found to result in a low-pH optimum of EstFa_R [18].
A metagenomic library has previously been constructed from Pacific Ocean deep-sea sediment, from which several lipolytic enzymes have been screened [10]. In the present study, further activity-based screening for lipolytic enzymes was performed; two novel esterase genes (dmwf18-543 and dmwf18-558) were obtained using high-throughput sequencing method. The encoded enzymes, DMWf18-543 and DMWf18-558, showed high amino acid sequence identities but different enzymatic characteristics compared with esterases from the same metagenomic library previously obtained by Jiang et al. [10]. On the basis of the high sequence identity and the differences between the two enzymes in terms of substrate specificity and preference, their sequences and structural models were compared, and insights into the determinants of their phenotypic differences were obtained.

Metagenome sequencing and analysis
A deep-sea sediment sample (depth of 5886 m) was collected from the skirt of a seamount in the central Pacific Ocean and used to construct a fosmid metagenomic library, as described previously [10]. In this study, Escherichia coli EPI300 cells harboring fosmids were spread on Luria-Bertani (LB) agar medium supplemented with 12.5 µg/ml chloramphenicol and 1% tributyrin, and lipolytic-positive clones were screened by using the clear zone method [10]. The LB medium contained 10 g/L NaCl, 10 g/L tryptone, and 5 g/L yeast extract (BD, USA), at pH 7.0. To harvest large amounts of pCC2FOS fosmid DNA for sequencing, the E. coli EPI300 cells were cultivated with 2 μl/mL CopyControl ™ Fosmid Autoinduction Solution (Epicentre Biotechnologies, USA) for 17 h at 250 rpm. The fosmid DNA was extracted using Axygen Plasmid Miniprep Kit (Corning, USA) and treated with plasmid-safe ATP-dependent DNase (Epicentre Biotechnologies, USA) to remove chromosomal DNA contamination from the host strain.
High-throughput sequencing of the fosmid vectors was carried out on the Illumina HiSeq 2000 platform by Novogene Bioinformatics Technology Co. Ltd. (Beijing). One paired-end (2 × 125 bp) library was constructed, with an approximate insert size of 500 bp. Reads were assembled de novo into contigs by using ABySS version 1.5.2 [19]. Assembly k-values from k = 48 to 64 were tested to identify the optimal value, of k = 62, by using the abyss-pe script. Vector contamination was filtered out via BLAST searches against the pCC2FOS2 ™ fosmid sequence (NCBI Accession Number EU140752). Open reading frame (ORF) prediction was performed by using MetaGeneMark [20]. The functional annotation of amino acid sequences of ORFs was performed with the BLASTP program (from the NCBI-Blast-2.2.29 + command-line package) [21] against the GenBank nr [22] and Cluster of Orthologous Groups (COG) [23] databases as well as with the PROKKA package [24].

Sequence analysis of esterase genes
Two highly related putative esterase genes designated dmwf18-543 and dmwf18-558 were screened, and their deduced amino acid sequences were analyzed using the BLASTP program (https://blast.ncbi.nlm.nih.gov) [25]. Multiple sequence alignment of their amino acid sequences was performed using Clustal X version 2 [26] and ESPript 3.0 [27]. The corresponding phylogenetic tree was constructed using the neighbor-joining method [28] with MEGA version 7.0 software [29].

Cloning, mutation, expression, and purification
The genes were amplified using PrimeSTAR HS DNA polymerase (TaKaRa, China). The primers 5ʹ-TCGCGGA TCCATGGCCAGCCCACAGCT-3ʹ (BamHI site underlined) and 5ʹ-TCCGCTCGAGCTAGCGTGCGGCGG CGG-3ʹ (XhoI site underlined) were used to amplify the full-length dmwf18-543 gene; and the primers 5ʹ-TCG CGGATCCATGGCGAGTCCACAGCTCC-3ʹ (BamHI site underlined) and 5ʹ-ATTTGCGGCCGCCTAGCG TGCGGCGGC-3ʹ (NotI site underlined) were used to amplify dmwf18-558. The products were digested by corresponding restriction endonucleases (New England Biolabs, USA) and cloned into pSMT3 expression vector that had been digested with the same enzymes using T4 DNA ligase (New England Biolabs, USA). The recombinant plasmids were then transformed into E. coli Rosetta (DE3) cells for protein expression. Transformants harboring the recombinant plasmids were identified via PCR and further confirmed through DNA sequencing. Point mutants were generated by site-directed mutagenesis using the Fast Mutagenesis System (Transgene Biotech, China), with wild-type plasmids as templates. The cells were cultivated at 37 °C in LB medium with 50 µg/mL kanamycin and 200 rpm until the optical density (OD 600 ) reached approximately 0.6 and were then induced with 0.5 mM of isopropyl-β-d-thiogalactopyranoside (IPTG) at 20 °C and 200 rpm. After cultivation for 16 h, the cells were collected via centrifugation at 12,000 rpm and 4 °C and then washed with phosphate-buffered saline (0.8% NaCl, 0.02% KCl, 0.142% Na 2 HPO 4 , 0.027% KH 2 PO 4 , pH 7.4). Next, the cells were suspended in 20 mM imidazole buffer (500 mM NaCl, 20 mM Tris-HCl, pH 8.0) and subjected to ultrasonic disruption (350 W). After centrifugation at 12,000 rpm and 4 °C for 30 min, the supernatant was purified with Ni Sepharose (GE, USA) to obtain the N-terminal His-tagged small ubiquitin-related modifier (SUMO) fusion. The fusion protein was cleaved with ubiquitin-like specific protease 1 (ULP1) with overnight dialysis at 4 °C. Subsequently, the products were passed through the Ni Sepharose column again to capture the His-tagged SUMO. The recombinant protein in the eluate was verified via sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) using 12% polyacrylamide gels.

Enzyme activity assay
Esterase activity assays were performed using a spectrophotometric method. The standard reaction mixture contained 10 µL of 100 mM p-NP butyrate, 980 µL of phosphate buffer (100 mM, pH 7.5), and the purified enzyme in a final volume of 1 mL. The activity of the enzyme was quantified at 35 °C, and the release of p-nitrophenol was measured at 405 nm over 2 min with a DU800 ultraviolet-visible spectrophotometer (Beckman, USA). All values were determined in triplicate, and reactions with the added thermally inactivated enzyme were used as controls. One unit of enzyme activity was defined as the amount of esterase required to release 1 µmol of p-nitrophenol per minute from the p-NP ester.
The optimum temperature for enzyme activity was assessed over a range of 15-60 °C at intervals of 5 °C. The optimum pH for enzyme activity was determined over a pH range from 3.0 to 10.0. The release of p-nitrophenol at different pH values was measured at 348 nm, the pHindependent isosbestic wavelength of p-nitrophenol and p-nitrophenolate. The buffers used were 100 mM citrate buffer (pH 3.0-6.0), 100 mM phosphate buffer (pH 6.0-7.5), 100 mM tricine buffer (pH 7.5-9.0), and 50 mM CHES buffer (pH 9.0-10.0).
All values were determined in triplicate. Data were presented as mean ± SD. Statistical analyses were performed with two-tailed unpaired Student's t-tests. P values less than 0.05 were considered statistically significant.

Nucleotide sequence accession numbers
The nucleotide sequences of contigs S36 and S37 have been deposited into the GenBank database under Accession Numbers MF770983 and MF770984. The locus_tags of dmwf18-543 and dmwf18-558 esterase genes are S36_0543 and S37_0558.

Sequence analysis of fosmid clones
Eighteen lipolytic-positive clones were screened through activity-based screening and identified on the basis of clear zones around the colonies on LB agar medium supplemented with 1% tributyrin [10]. The fosmid clones were then mixed and used in high-throughput sequencing. In total, 40 contigs were assembled and annotated (data not shown). Among these contigs, S36 and S37 were 11,479 bp and 11,464 bp long and showed high sequence similarity with each other (97.3%, 11,176/11,488). Both of the contigs contained 14 ORFs with the same gene arrangement (Additional file 1: Table S1). The ORFs of the two contigs exhibited high sequence identities: 72.2-100.0% for their nucleotide sequences and 83.7-100.0% for their deduced amino acid sequences (Additional file 1: Table S1). The deduced amino acid sequences indicated that 8 out of the 14 ORFs could be assigned by using the COG classification system. Of these 8 ORFs, three encoded functions related to energy production and conversion, two related to lipid transport and metabolism, and one related to replication, recombination and repair.
Phylogenetic analysis was carried out to reveal the relationships between the two esterases and other known bacterial lipolytic enzymes. A phylogenetic tree constructed with 17 bacterial lipolytic enzyme families showed that DMWf18-543 and DMWf18-558 belong to family IV (Fig. 1). Multiple sequence alignment of DMWf18-543 and DMWf18-558 with highly related homologs indicated that their catalytic triad is composed of residues Ser144, Glu238, and His268 (Fig. 2). The catalytic residue Ser144 and the oxyanion hole residues Gly75 and Gly76 are located in the GXSXG (GDSAG, position 142-146) and HGGG (positions 74-77) motifs, which are conserved in the family IV lipolytic enzymes.

Expression and characterization
The gene fragments encoding the enzymes were cloned into the pSMT3 vector and expressed in E. coli Rosetta (DE3). After induction of expression with IPTG at 20 °C for 12 h, the recombinant proteins were purified using His-tag affinity chromatography. The calculated molecular weights (MW) of DMWf18-543 and DMWf18-558 were 32.32 and 32.12 kDa. SDS-PAGE analysis of the purified proteins revealed an approximate MW of 32 kDa, which was consistent with the calculated value of DMWf18-543 and DMWf18-558 (Additional file 2: Figure S1). The kinetic parameters of purified DMWf18-543 and DMWf18-558 were determined by using p-NP butyrate and p-NP hexanoate as substrates, and the results are shown in Table 2. When p-NP butyrate was used as a substrate, the V max and K m of DMWf18-543 and DMWf18-558 were 5957 ± 112 µmol/mg/min and 0.33 ± 0.02 mM and 5523 ± 228 µmol/mg/min and 0.60 ± 0.06 mM, respectively.   The substrate specificities of the enzymes were determined by using p-NP esters with various acyl chain lengths (C2-C16) (Fig. 3a). Among the esters tested, DMWf18-543 exhibited the highest activity toward p-NP butyrate (4757 U/mg) and only weak activity toward p-NP acetate, p-NP hexanoate, and p-NP caprylate (10, 6, and 1% relative activities). Additionally, no activity was detected toward p-NP esters with side chains longer than C10. Nevertheless, DMWf18-558 exhibited high activity toward p-NP butyrate (4520 U/mg) and p-NP hexanoate (4054 U/mg). DMWf18-558 also retained over 14% and 27% relative activity toward p-NP acetate and p-NP caprylate, respectively, in addition to showing weak activity toward p-NP decanoate, p-NP laurate, and p-NP myristate (10, 4, and 3%) (Fig. 3a). The optimum activities of the enzymes were measured over a pH range of 3.0-10.0 and a temperature range of 15-60 °C, by using p-NP butyrate as the substrate. DMWf18-543 showed the highest activity at pH 7.0 and 35-40 °C and DMWf18-558 at pH 7.0 and 35 °C (Fig. 3b, c).

Comparison of substrate-binding pockets
Superposition of the surfaces of DMWf18-543 and DMWf18-558 showed differences in their substratebinding pockets (Fig. 5). The substrate-binding pocket of DMWf18-543 exhibited a short channel, whereas that of DMWf18-558 had a longer channel (~ 8 Å longer) (Fig. 5a, b). The difference in the shape of the substrate-binding pockets of the two esterases results in differences in the favored carbon chain lengths of the ester substrates (Fig. 3a). Superposition of the surface of DMWf18-558 and sticks of DMWf18-543 showed that Fig. 3 Characterization of DMWf18-543 and DMWf18-558. a Substrate specificity was determined using the p-NP esters, including p-NP acetate (C2), p-NP butyrate (C4), p-NP caprylate (C8), p-NP decanoate (C10), p-NP laurate (C12), p-NP myristate (C14), and p-NP palmitate (C16). All of the tests were performed at 35 °C and pH 7.5. b Effects of pH on the activity were determined in different buffers: 100 mM citrate buffer (pH 3.0-6.0), 100 mM phosphate buffer (pH 6.0-7.5), 100 mM tricine buffer (pH 7.5-9.0), and 50 mM CHES buffer (pH 9.0-10.0). All of the tests were performed at 35 °C using p-NP butyrate as the substrate. c Effects of temperature on the activity were determined at various temperatures at pH 7.5 using p-NP butyrate as the substrate. The highest activity was taken as 100%. Data are presented as the mean ± SD (n = 3) residue Leu199 of DMWf18-543 might shorten and block the substrate-binding pocket (Fig. 5c), thus potentially leading to little DMWf18-543 activity toward p-NP esters with side chains longer than C6. Additionally, DMWf18-558, with relatively high activity toward p-NP hexanoate and p-NP caprylate, exhibited the relatively small sidechain residue Ala199 at the same location.
To verify the above hypothesis, two mutants (DMWf18-543-L199A and DMWf18-558-A199L) were designed, and only DMWf18-558-A199L was successfully expressed. The substrate specificity of the mutant was determined. Compared with the wild-type enzymes, the activity of the DMWf18-558-A199L mutant toward p-NP hexanoate and p-NP caprylate decreased markedly to 18 and 7% (Fig. 6). Additionally, little activity was detected toward p-NP esters with side chains longer than C10 (Fig. 6). The results suggested that residue 199 is the determinant of substrate specificity and preference.

Discussion
In this study, we identified two novel esterase genes through activity-based screening and high-throughput fosmid sequencing from a metagenomic library from Pacific Ocean deep-sea sediment. We analyzed and compared the biochemical characteristics and structural models of the two esterases.
Forty contigs were obtained after 18 lipolytic-positive fosmid clones were sequenced and assembled. The two novel esterase genes were located on two contigs showing high sequence similarity and the same gene arrangement. Each contig contained 14 ORFs, and half of the ORFs (7 out of 14) presented the most significant BLAST hits to members of Chloroflexi (Additional file 1: Table S1), thus indicating that the two genome fragments might belong to uncultured Chloroflexi bacteria. Chloroflexi are quite frequently found in deep-sea surface sediment environments [35][36][37]. Additionally, a Chloroflexi genome fragment has previously been found and sequenced in the same metagenomic library [38].
The two novel esterases, DMWf18-543 and DMWf18-558, shared high amino acid identity with each other (80.7%) and showed the highest sequence identity to Est6, Est2 and Est4 among members of the database (55.0-82.4%), all of which were identified from the same metagenomic library through function-driven screening and subcloning [10]. The enzymatic characteristics of these esterases, which exhibited the same origin and shared high sequence identity, were compared. Est6 is a cold-activated esterase and exhibited its highest activity at 20 °C [10], whereas Est2 and Est4 exhibited their highest activity at higher temperatures (40-45 °C,

Table 3 Effects of various metal ions and chelation on the activity of DMWf18-543 and DMWf18-558
The activity observed without metal ion or chelation was taken as 100%. * p < 0.05, representing a significant difference between DMWf18-543 and DMWf18-558 (Student's t test)

Table 4 Effects of various organic solvents and detergents on the activity of DMWf18-543 and DMWf18-558
The activity observed without an organic solvent or detergent was taken as 100%. and retained 27% activity toward p-NP caprylate and even some activity toward p-NP esters with side chains of C10-C14.
To investigate the structural basis of the different substrate specificities and preferences of the enzymes, we attempted to obtain their crystal structures. Unfortunately, crystals did not appear and grow (data not shown). Thus, structural models of DMWf18-543 and DMWf18-558 were constructed and compared. The model comparison results suggested that DMWf18-543 has a short substrate-binding channel, whereas DMWf18-558 has a relatively long substrate-binding channel (Fig. 5a, b). The substrate-binding channel of DMWf18-543 was shortened by residue Leu199 (Fig. 5c), which might result in low or no activity toward p-NP esters with longer side chains. Residues Leu199 of DMWf18-543 and Ala199 of DMWf18-558 were selected for site-directed mutagenesis to investigate their roles in the substrate accommodation of substrate-binding pockets. Finally, DMWf18-558-A199L was induced and expressed successfully. The substitution of leucine for alanine caused the substratebinding pocket to shorten, and the substrate specificity profile of DMWf18-558-A199L was similar to that of DMWf18-543 (Figs. 3a, 6). Furthermore, the DMWf18-558-A199L mutant retained little activity toward p-NP esters with side chains longer than C6.
The substrate specificity or preference of esterases toward esters with different acyl chain lengths is usually attributed to the shape of the substrate-binding pocket. A previous comparison of the crystal structures of two family IV esterases (RmEstA and RmEstB) from Rhizomucor miehei [17] has shown that two aromatic amino acids (Phe222 and Trp92) in the substrate-binding pocket of RmEstB narrow its substrate-binding pocket as well as its substrate specificity. Phe222 and Trp92 of RmEstB correspond to Tyr78 and Ala203 of DMWf18-543 and DMWf18-558, which are not located in their substratebinding pockets. Thus, these two residues were not considered to be determinants of substrate specificity or preference on the basis of present study.
The mutation of residue Ala199 also affected the catalytic activity and substrate affinity of the enzyme. A decrease in the K m value and an increase in the kcat value toward p-NP butyrate were observed (from 0.60 ± 0.06 mM and 2957 ± 122/s to 0.16 ± 0.01 mM and 1126 ± 17/s, respectively), thus suggesting that the ability to bind substrate was enhanced and that the turnover rate of the enzyme-substrate complex to product and enzyme increased. Therefore, the catalytic efficiency of DMWf18-558-A199L against p-NP butyrate was higher than that of wild-type DMWf18-558, as indicated by a kcat/K m value of 7239/mM/s ( Table 2).
The stronger substrate affinity and higher catalytic efficiency of DMWf18-558-A199L might make it more advantageous in industrial applications. In addition, the metal ion and detergent tolerant properties make DMWf18-543 and DMWf18-558 good candidates for biocatalytic processes requiring or containing metal ions or detergents, such as laundry detergent, textile, bioremediation and waste treatment.
In conclusion, two novel esterase genes with high similarity were screened, cloned and expressed from a deepsea sediment metagenomic library. The enzymes were characterized and showed different hydrolysis abilities toward esters with different acyl chain lengths. Additionally, the structural modeling and mutagenesis of The substrate-binding pockets are indicated with red dashed lines. Residues of the catalytic triad are shown as stick models in yellow. c Superposition of the surfaces of DMWf18-543 (green) and DMWf18-558 (gray). Leu199 of DMWf18-543 and Ala199 of DMWf18-558 are shown as stick models Fig. 6 Substrate specificity of DMWf18-558-A199L. The esterase activity of the mutant was determined using p-NP esters. The highest activity was taken as 100%. Data are presented as the mean ± SD (n = 3)