Genomics insights into different cellobiose hydrolysis activities in two Trichoderma hamatum strains
- Peng Cheng1,3 (corresponding author),
- Bo Liu†2,
- Yi Su†1,
- Yao Hu2,
- Yahui Hong1,
- Xinxin Yi2,
- Lei Chen1,
- Shengying Su1,
- Jeffrey S. C. Chu4 (corresponding author),
- Nansheng Chen2,5 (corresponding author) and
- Xingyao Xiong1
© The Author(s) 2017
Received: 1 December 2016
Accepted: 9 April 2017
Published: 19 April 2017
Efficient biomass bioconversion is a promising solution to alternative energy resources and environmental issues associated with lignocellulosic wastes. The Trichoderma species of cellulolytic fungi have strong cellulose-degrading capability, and their cellulase systems have been extensively studied. Currently, a major limitation of Trichoderma strains is their low production of β-glucosidases.
We isolated two Trichoderma hamatum strains, YYH13 and YYH16, with drastically different cellulose-degrading efficiencies; YYH13 has the higher cellobiose-hydrolyzing efficiency. To understand the mechanisms underlying these differences, we sequenced the genomes of YYH13 and YYH16, which are essentially identical in size (38.93 and 38.92 Mb, respectively) and are similar to that of the T. hamatum strain GD12. Using GeneMark-ES, we annotated 11,316 and 11,755 protein-coding genes in YYH13 and YYH16, respectively. Comparative analysis identified 13 functionally important genes in YYH13 under positive selection. By examining orthologous relationships, we identified 172, 655, and 320 genome-specific genes in YYH13, YYH16, and GD12, respectively. We found 15 protease families that differ between YYH13 and YYH16. Enzymatic tests showed that exoglucanase, endoglucanase, and β-glucosidase activities were higher in YYH13 than in YYH16. Additionally, YYH13 contains 10 families of carbohydrate-active enzymes, including the GH1, GH3, GH18, GH35, and GH55 families of chitinases, glucosidases, galactosidases, and glucanases, which are subject to stronger positive selection pressure. Furthermore, the β-glucosidase gene (YYH1311079) and the pGEX-KG/YYH1311079 bacterial expression vector may provide valuable insight for designing β-glucosidases with higher cellobiose-hydrolyzing efficiencies.
This study suggests that the YYH13 strain of T. hamatum has the potential to serve as a model organism for producing cellulase because of its strong ability to efficiently degrade cellulosic biomass. The genome sequences of YYH13 and YYH16 represent a valuable resource for studying efficient production of biofuels.
Keywords: Trichoderma hamatum, Comparative genomics, Genetic diversity, β-Glucosidase, Cellobiose
The growing worldwide demand for energy and the desire to reduce dependency on fossil fuels have triggered increased interest in identifying alternative energy resources, especially liquid biofuels such as bioethanol and biodiesel. Because renewable lignocellulosic biomass is generally considered a cheaper and cleaner raw material for ethanol production than oil-based fuels, and does not compete with agricultural production, efforts have been made to generate liquid biofuels from it.
Biodegradation of lignocellulosic residues is a process that is primarily performed by microorganisms that can enzymatically digest polymeric sugars to capture soluble monosaccharides and disaccharides as carbon sources for energy production. This ability is exploited by biotechnological industries to obtain large quantities of active, stable, and specific enzymes using agricultural waste solids as raw materials. The global market for industrial enzymes was expected to reach more than 4 billion dollars in 2015. The industrial enzymes market prefers microbial enzymes because they are more stable than enzymes from plants and animals. Fungi are particularly preferred for enzyme production because their enzymes are secreted as complexes that function in a synergistic manner, and their production is relatively easy and inexpensive.
Currently, most kinds of commercial cellulase (including β-glucosidase) are derived from fungi, e.g. Trichoderma, Aspergillus, Phanerochaete, Schizophyllum and Penicillium. Aspergillus niger is used to produce many pectinases [6, 7] and hemicellulases in industry. Trichoderma reesei QM6a was found to be a good producer of cellulase. Due to their efficiency in producing and secreting a broad range of cellulases and hemicellulases, both of these fungi have been the focus of extensive studies on glycoside hydrolase (GH) discovery, and there is a marked effort to understand the regulation of the expression of the genes encoding them.
Trichoderma spp. are widely distributed saprophytic ascomycetes that are well known for their biocontrol and lignocellulose degradation abilities. Recent genome sequencing projects have targeted eight species: T. reesei, Trichoderma virens, Trichoderma atroviride, Trichoderma harzianum, Trichoderma longibrachiatum, Trichoderma asperellum, Trichoderma hamatum, and Trichoderma citrinoviride. It was observed that the tropical species T. reesei enhances the induction of its entire cellulolytic and hemicellulolytic arsenal when facing the temperate species R. solani, which is a very unlikely prey/host for this species in nature, whereas such a response is not observed for T. atroviride or T. virens. The presence of a basidiomycete fungus may thus signal the availability of predigested plant biomass to T. reesei, consistent with the hypothesis that this species became a saprotroph by following basidiomycetes into their habitat.
A striking weakness of the Trichoderma system is that many Trichoderma strains isolated from the wild lack the lignocellulolytic enzymes necessary for efficient bioconversion processes, especially β-glucosidases, which are considered key rate-limiting enzymes in the process of cellulose degradation. For example, under cellulase-inducing conditions, secreted β-glucosidase comprises only about 1% of the total T. reesei cellulase, indicating that the hydrolysis of cellobiose constitutes a rate-limiting step during the enzymatic processing of cellulose [15, 16]. Although commercial cellulase is available, many of the most well-known biomass-degrading fungi display low β-glucosidase (cellobiose-hydrolyzing) activity; thus, the initial bioconversion of biomass to sugars remains a key bottleneck in the process of biofuel production. Searching for Trichoderma strains with strong β-glucosidase activities is therefore of primary importance.
β-Glucosidases (EC 3.2.1.21) are found in all domains of living organisms, where they play essential roles in the removal of nonreducing terminal glucosyl residues from saccharides and glycosides. β-Glucosidases function in glycolipid and exogenous glycoside metabolism in animals; in defense, cell wall lignification, cell wall β-glucan turnover, phytohormone activation, and release of aromatic compounds in plants; and in biomass conversion in microorganisms. We identified T. hamatum strains from cultivated soil in HeJiaqiao, LiLing, Hunan province, China, among which YYH13 exhibited much higher antimicrobial activity against the bacterial wilt pathogen because of its higher expression of specific β-glucanase and chitinases, which play important roles as hydrolytic enzymes during cell wall degradation.
In this study, we carried out a genome-wide comparative analysis of T. hamatum and other model organisms with publicly available genomes, including T. atroviride, T. harzianum, T. reesei, and T. virens, which may help explain the genomic differences between YYH13 and YYH16. To examine whether YYH13 has higher cellobiose-hydrolyzing efficiency, we subjected YYH13 and YYH16 to exoglucanase, endoglucanase, and β-glucosidase activity tests and to an expression assay of GH1 genes. Overall, our results provide a valuable resource, and T. hamatum YYH13 represents a new strain whose genome sequence can be used for further studies on the genetic basis of efficient degradation of cellulosic biomass for biofuel production by Trichoderma species.
Results and discussion
YYH13 and YYH16 are two strains of T. hamatum with different cellulose degradation activity
Many alternative mechanisms can cause microorganism growth inhibition, including mycoparasitism, bacteriolysis, and competition for nutrients and space. Trichoderma produces many hydrolases that degrade the cell wall, including chitinases, cellulases, xylanases, glucanases and proteases. These enzymes are usually extracellular, of low molecular weight and highly stable. They may be produced in multiple forms or isozymes that differ in size, regulation, and ability. This trait has often been utilized as a means of in vitro screening for biocontrol candidates. Various cell wall degrading enzymes play a very important role in the process of hyperparasitism. Some Trichoderma species have strong cellulose-degrading properties because they can secrete an enzyme system capable of degrading crystalline cellulose. For example, the T. reesei QM6a strain possesses a remarkable set of genes encoding hydrolytic enzymes.
We performed a cellulose degradation test using filter paper as the substrate. As shown in Fig. 1e, the filter paper degradation efficiency of YYH13 at 96 h was 37.14% higher than that of YYH16 at the same time point (P < 0.05), indicating that YYH13 has a much stronger capability for cellulose degradation. The filter paper degradation analysis indicated that the enzymes of YYH13 are highly active on insoluble cellulosic substrates. Because filter paper has a crystalline structure, its degradation implies multiple cellulase activities, including exoglucanase activities, because these enzymes work in crystalline regions. We also observed that both YYH13 and YYH16 had rapid growth rates, similar colony morphologies, and similar growth curves. Despite these similarities, the two T. hamatum strains show significant differences in cellulose degradation activity.
Genome sequencing and assembly of YYH13 and YYH16
Sequencing data size and output quality in YYH13 and YYH16
Columns: YYH13 raw data; YYH13 clean data; YYH16 raw data; YYH16 clean data
Rows: number of reads; N of fq1; N of fq2; low-quality bases (Q ≤ 5) of fq1; low-quality bases (Q ≤ 5) of fq2; Q20 of fq1; Q20 of fq2; Q30 of fq1; Q30 of fq2; GC of fq1; GC of fq2; error of fq1; error of fq2; discarded reads related to N and low quality
Genome assembly and annotation statistics of YYH13, YYH16, and GD12
Rows: number of scaffolds; total size (scaffolds); total size (contigs); number of large scaffolds (>1 kb); number of large contigs (>1 kb); G + C content; number of CEGs identified; total protein-coding genes; total gene length (exons and introns); total exon count; average exon length; average exon count per gene; average intron length; average introns per gene; average peptide length
17-mer analysis using YYH13, YYH16, and T. reesei sequencing data
Rows: genome size (M); revised genome size (M); heterozygosity rate (%)
Genome annotations of YYH13 and YYH16
To evaluate the completeness of the assembled genomes, we performed a CEGMA analysis (Parra et al.), which showed that more than 97% of all CEGs (complete and partial) were identified in both YYH13 and YYH16 (Table 2), higher than the 95.97% identified in the published T. hamatum GD12 genome. Thus, our genome assemblies of the two T. hamatum strains YYH13 and YYH16 are of good quality. We annotated the genomes for protein-coding genes using GeneMark-ES and identified 11,316 and 11,755 genes in YYH13 and YYH16, respectively. The previous T. hamatum genome (GD12) was annotated using FgeneSH. To assess the comparability of the two methods, we also annotated the GD12 assembly using GeneMark-ES and compared the annotation results. FgeneSH predicted 10,760 genes and GeneMark-ES predicted 11,031 genes, of which 10,169 (94% of the FgeneSH annotation) had identical annotation structures, and 46 genes were found only by FgeneSH. Thus, we conclude that the FgeneSH gene predictions were comparable to the GeneMark-ES predictions. For consistency, we used the GeneMark-ES annotations for all of the subsequent analyses. The average gene structures (gene size, exon size, and intron size) were also similar among the T. hamatum strains (Table 2).
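The comparison of two annotation sets on the same assembly can be illustrated with a short sketch. It assumes that each gene model is reduced to a scaffold name, a strand, and a tuple of exon coordinates; this representation, the function name, and the toy data are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, Set, Tuple

# A gene model keyed by (scaffold, strand, exon coordinates).
# This representation is an assumption made for illustration only.
GeneModel = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def compare_annotations(set_a: Set[GeneModel], set_b: Set[GeneModel]) -> Dict[str, float]:
    """Count gene models with identical structures in two annotation sets."""
    identical = set_a & set_b
    return {
        "identical": len(identical),
        "only_a": len(set_a - set_b),
        "only_b": len(set_b - set_a),
        "fraction_of_a": len(identical) / len(set_a) if set_a else 0.0,
    }


# Toy data standing in for two predictions on one assembly.
fgenesh = {
    ("scaf1", "+", ((100, 200), (300, 450))),
    ("scaf1", "-", ((900, 1200),)),
    ("scaf2", "+", ((50, 400),)),
}
genemark = {
    ("scaf1", "+", ((100, 200), (300, 450))),
    ("scaf2", "+", ((50, 400),)),
    ("scaf3", "+", ((10, 90),)),
}
summary = compare_annotations(fgenesh, genemark)
print(summary)
```

With the counts reported above, the same ratio is 10,169 / 10,760, or roughly 0.945. Note that a model counted under `only_a` here merely lacks an identical structure in the other set, which is a looser category than a gene found by only one predictor.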
Repetitive element annotation in YYH13, YYH16, and GD12
Length and % of genome, one repeat class per row, in the order YYH13 | YYH16 | GD12
70,804 bp (0.18%) | 27,383 bp (0.07%) | 54,399 bp (0.15%)
3710 bp (0.01%) | 17,140 bp (0.04%) | 7052 bp (0.02%)
0 bp (0%) | 0 bp (0%) | 0 bp (0%)
43,448 bp (0.11%) | 44,856 bp (0.12%) | 38,856 bp (0.11%)
7770 bp (0.02%) | 7550 bp (0.02%) | 6257 bp (0.02%)
402,017 bp (1.03%) | 432,358 bp (1.11%) | 324,187 bp (0.88%)
78,531 bp (0.20%) | 83,574 bp (0.21%) | 66,153 bp (0.18%)
Non-coding RNA annotation in YYH13, YYH16, and GD12
Identifying functionally important genes through selection pressure analysis
Selection pressure is an important source of genetic differences that may confer phenotypic differences. By examining the nonsynonymous and synonymous substitution rates of all of the one-to-one ortholog pairs of YYH13, YYH16, and GD12, we found that the majority of the genes (>98%) exhibited Ka/Ks <1, suggesting that most of the orthologs are highly conserved in evolution (Additional file 1). Nevertheless, we found that 131 genes between YYH13 and YYH16, 146 genes between GD12 and YYH16, and 154 genes between GD12 and YYH13 had Ka/Ks values greater than 1. To screen for YYH13 genes that underwent positive selection, we selected genes that satisfied positive selection criteria between YYH13 and each of the other two genomes but showed neutral or purifying selection between YYH16 and GD12. Interestingly, we found 13 genes that satisfied this condition, one of which (GB7226_YYH13) encodes a putative subtilisin protease, which has been shown to be an exoprotease during cellulose metabolism [23, 24].
Notably, the difference between the percentages of non-synonymous mutations was retained among the three strains, which may be due to the different physiological conditions used for the selection of the strains. The selection could have been stronger for YYH13, resulting in positive selection and thus preferential retention of non-synonymous SNVs. Moreover, Darwinian selection was tested, and the results showed that positive selection drove the evolution of sequences leading to well-known β-glucosidases involved in lignocellulose degradation. Indeed, this study found that YYH13 has 13 genes with Ka/Ks >1, suggesting that selection pressure may have contributed to the diversity and the genetic and functional differences of the β-glucosidase gene (YYH1311079).
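The screening rule described above can be sketched as follows. The sketch assumes that pairwise Ka/Ks values are available as dictionaries keyed by ortholog identifier; the function name, the identifiers, and the values are hypothetical, and the threshold of 1 follows the criterion stated in the text.

```python
from typing import Dict, List


def yyh13_positive_selection(
    kaks_13_vs_16: Dict[str, float],
    kaks_13_vs_gd12: Dict[str, float],
    kaks_16_vs_gd12: Dict[str, float],
    threshold: float = 1.0,
) -> List[str]:
    """Return orthologs with Ka/Ks > threshold in both YYH13 comparisons
    and Ka/Ks <= threshold between YYH16 and GD12."""
    shared = kaks_13_vs_16.keys() & kaks_13_vs_gd12.keys() & kaks_16_vs_gd12.keys()
    return sorted(
        gene
        for gene in shared
        if kaks_13_vs_16[gene] > threshold
        and kaks_13_vs_gd12[gene] > threshold
        and kaks_16_vs_gd12[gene] <= threshold
    )


# Hypothetical Ka/Ks values for four one-to-one orthologs.
pair_13_16 = {"orth_A": 1.8, "orth_B": 0.2, "orth_C": 1.3, "orth_D": 1.5}
pair_13_gd = {"orth_A": 1.4, "orth_B": 0.3, "orth_C": 0.9, "orth_D": 1.2}
pair_16_gd = {"orth_A": 0.4, "orth_B": 0.1, "orth_C": 0.5, "orth_D": 1.6}

candidates = yyh13_positive_selection(pair_13_16, pair_13_gd, pair_16_gd)
print(candidates)
```

Only `orth_A` passes: `orth_C` fails one YYH13 comparison, and `orth_D` is also above the threshold between YYH16 and GD12, so its signal is not specific to YYH13.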
Synteny analysis of T. hamatum strains
As previously argued, high synteny between organisms indicates evolutionary relatedness. Therefore, we expect to find more syntenic genes between closely related strains than between more distant pairs of species. However, Berlin argued that gene transposition, insertions, deletions, duplications, and rearrangements of chromosome fragments destroy synteny. We found that although some characteristics of the tri/TRI cluster have been conserved during the evolution of YYH13 and YYH16, the cluster has undergone marked changes, including gene loss or gain, gene rearrangement, and divergence of gene function. In comparison, previous studies have indicated that syntenic gaps in other genomes are enriched in genes that are important for species-specific attributes. Although the mechanism and specific biological functions of YYH13 gene duplication have not been clarified, Ambro showed that gene evolution is accelerated to derive new functional genes after gene duplication.
Strain-specific genes in YYH13, YYH16 and GD12
In this study, the rate of synonymous substitutions in YYH13 genes was found to be very low. Low rates of this kind are generally associated with large-scale genome duplication events during evolution, indicating that recent duplication has shaped the pattern of synonymous substitutions. At the same time, purifying selection after duplication of YYH13-specific genes was weaker than that after duplication of genes shared among strains, which suggests that divergence of the strain-specific genes is more likely to produce functional variation. In contrast, the YYH16-specific genes may reflect functional redundancy and therefore contribute less to lignocellulose degradation. Similarly, the instability of transposable elements may lead to gene rearrangement in YYH16, and imbalances in the distribution of insertion sequences may also affect its evolution, leading to strain-specific differences in expression.
Proteases gene family comparison
Protease family showing difference between YYH13 and YYH16
Location of activity
However, no carboxypeptidases have been characterized in Trichoderma species; thus, the roles of the carboxypeptidases of T. hamatum in these interactions are still unknown. The proteases of Trichoderma spp. and their biocontrol roles have been previously reported. This work describes a protease gene family analysis of T. hamatum focusing on biomass-degrading activity. Proteases have evolved to utilize different mechanisms for proteolysis [36, 37]. Further studies are needed to understand what causes T. hamatum to produce primarily protein-degrading enzymes when grown in the presence of cellulose.
CAZyme gene family comparison
CAZyme families that show difference between YYH13 and YYH16
Copper-dependent lytic polysaccharide monooxygenases
Glycogen or starch phosphorylase
Chitinase, xylanase inhibitor
Exo-acting β-N-acetylglucosaminidases, β-N-acetylgalactosaminidase, β-6-SO3-N-acetylglucosaminidases
Exo-acting β-d-glucosidases, α-l-arabinofuranosidases, β-d-xylopyranosidases, N-acetyl-β-d-glucosaminidases
l-arabinofuranosidases, endo-α-l-arabinanases, β-d-xylosidases, exo α-1,3-galactanase
d-4,5-Unsaturated β-glucuronyl hydrolase
Galactose oxidase, glyoxal oxidase
Acetyl xylan esterase
Acetyl xylan esterase, cutinase
Cellulose synthase, chitin synthase
Sucrose synthase, α-glucosyltransferase
The saprotrophic species T. reesei is a model for studying Trichoderma physiology. Comparative genomics showed that YYH13 has a bigger genome than T. reesei, suggesting that gene expansion events have occurred in an ancestor of YYH13. YYH16 is a close relative of YYH13, but YYH13 has more genes related to lignocellulose degradation, including CAZymes, than YYH16, suggesting that additional saprotrophic gene expansion events occurred in YYH13 after divergence from YYH16. T. reesei is an efficient producer of cellulases and hemicellulases and is used as the major industrial source of these enzymes. YYH13 is also an efficient cellulase producer. Furthermore, a comparison of cellulolytic and hemicellulolytic enzymes indicates that the number of these genes was not reduced but increased in YYH13. The increase in lignocellulose-degrading ability is associated with the increase in the number of genes related to lignocellulose degradation. Saprotrophy on plant biomass and the highly efficient production of cellulolytic and hemicellulolytic enzymes suggest that these enzymes may have been optimized to improve specific activities or expression levels in YYH13. In addition, chitinases, glucosidases, galactosidases, and glucanases are subject to stronger positive selection pressure in YYH13, implying that these enzymes may also play crucial roles in lignocellulose degradation.
The omics data analysis and experimental results suggest that YYH13 genome expansion is affected by environmental conditions. To adapt to the specific requirements of the host environment, more genes of YYH13 have differentiated and formed multiple gene families. The Red Queen hypothesis [43, 44] holds that microorganisms constantly face a tension between evolution and adaptation in their biological environment, such that their genomes must be modified and transformed to overcome it. Phylogenetic analysis revealed that the effect of mutations in YYH13 was significantly stronger than that of homologous recombination and that the classification characteristics and genealogy of YYH13 and YYH16 were shaped by these mutations. Consequently, given the genomic differences among strains isolated from the same area and the phylogenetic classifications among different geographical regions, factors other than environment and geographic distance may be driving the evolution of the YYH13, YYH16, and GD12 genomes and the differences among their populations.
Expression assay of GH1 genes in YYH13 and YYH16
In contrast, we found that the activity of β-glucosidase in YYH13 was significantly higher than that in QM6A (Fig. 7c). β-Glucosidase is an important component of the cellulase enzyme system that not only participates in cellulose degradation but also plays a key role in hydrolyzing cellulose to fermentable glucose by relieving the inhibition of exoglucanase and endoglucanase by cellobiose. However, it is difficult for T. reesei to efficiently convert cellobiose to glucose due to its lack of β-glucosidase, although it is a good producer of cellulase.
Cellobiose, which is an intermediate product, is also a strong inhibitor of endoglucanase and exoglucanase and is one of the key bottlenecks in enzymatic hydrolysis. To prevent this inhibition, cellobiose must be removed immediately. β-Glucosidase reduces cellobiose inhibition by hydrolyzing the disaccharide to glucose, allowing cellulolytic enzymes to function more efficiently. Therefore, homologous production and evolutionary studies of the β-glucosidase gene (YYH1311079) from the biomass-degrading fungus T. hamatum give new insights into the physicochemical parameters and biodiversity of this family.
Cloning of the YYH1311079 gene and construction of the pGEX-KG/YYH1311079 expression vector
The YYH1311079 cDNA clone was inserted into the expression vector pGEX-KG at the BamHI and HindIII sites. Following transformation into Escherichia coli BL21 (DE3) cells, the recombinant clone was selected and propagated. The recombinant plasmid carrying the YYH1311079 insert was confirmed by digestion with BamHI and HindIII, which released a fragment of the expected size of 1575 bp. The ampicillin resistance gene and the ColE1 origin provide for selection and maintenance of the recombinant plasmid in E. coli.
The pGEX-KG/YYH1311079 engineered bacterial strain was verified by antibiotic resistance, colony PCR, and sequencing analysis, which indicated that the expression plasmid was constructed correctly. Overall, our results provide a valuable gene that will help explain whether β-glucosidase is a key rate-limiting enzyme in the process of cellulose degradation. The YYH13 strain displayed better cellulose degradation characteristics and showed great potential for application in ethanol production through the degradation of renewable lignocellulosic biomass, although the underlying mechanisms still need further exploration.
The isolates were cultured on PDA (Potato Dextrose Agar, Difco) and incubated in normal light for 3 days at 28 °C. For morphological characterization of T. hamatum, the morphology of the mycelium, spores, and colonies was observed using the Microscopic Imaging System-MVC2000.
Strains growth conditions and mycelium dry weights determinations
Mature spores of YYH13 and YYH16 strains were collected and re-suspended in sterile distilled water containing 0.05% Tween 20 (Sigma, USA). Spores were counted with a haemacytometer. For each strain, 5 × 10^5 spores were added to 50 mL of PDA liquid medium and cultured at 28 °C in the dark with continuous shaking. For the determination of fungal dry weights, mycelia were collected on two layers of filter paper (Whatman GF-C) after 12, 24, 36, 48, 60, 72, 84 and 96 h of culture. Mycelia were rinsed with distilled water three times and then dried in an oven at 60 °C.
Strains liquid fermentation culture conditions
YYH13 (T. hamatum), YYH16 (T. hamatum) and QM6A (T. reesei) strains at the same growth state were cultured in liquid fermentation medium (NH4NO3 2 g, KH2PO4 4 g, MgSO4·7H2O 0.3 g, CaCl2·2H2O 0.3 g, MnSO4·7H2O 0.007 g, FeSO4·7H2O 0.005 g, NaCl 0.1 g, 1% rice straw, 1000 mL H2O, pH 6.0) at 28 °C with shaking at 120 rpm for 0, 24, 48, 72, or 96 h. Crude enzyme extract was obtained via centrifugation at 13,000 × g for 10 min at 4 °C, and the supernatants were used for enzyme activity assays.
YYH13 and YYH16 strains at the same growth state were cultured in minimal medium (NH4NO3 2 g, KH2PO4 4 g, MgSO4·7H2O 0.3 g, CaCl2·2H2O 0.3 g, MnSO4·7H2O 0.007 g, FeSO4·7H2O 0.005 g, NaCl 0.1 g, peptone 3 g, 1000 mL H2O, pH 6.0). After 48 h of cultivation at 28 °C with shaking at 120 rpm, the mycelia were harvested and transferred to the same medium without peptone, and 1% d-cellobiose was added. The cultures were then incubated at 28 °C with shaking at 120 rpm for 0, 4, 8, or 12 h. All of the assays were performed in triplicate.
All enzyme activities were presented as specific activities using international units (IU) per mL supernatant. The FPase (FPA) activity and endoglucanase (EG) activity were measured by the DNS method with glucose as a standard, as described in [47, 48]. The β-glucosidase activity was determined using p-Nitrophenyl-β-d-glucopyranoside (pNPG) as a substrate based on the reported method by Takashima . The exo-1,4-β-glucanase (CBH) activity was measured as reported by Deshpande .
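As a worked illustration of expressing activity in IU per mL, the sketch below converts an amount of glucose released in a reducing-sugar assay into micromoles per minute per mL of supernatant, using the standard definition of 1 IU as 1 µmol of product per minute. The function name, the reaction time, the volumes, and the dilution factor are hypothetical; the actual assay conditions are those of the cited methods, not of this sketch.

```python
GLUCOSE_MOLAR_MASS = 180.16  # g/mol


def activity_iu_per_ml(
    glucose_mg: float,
    minutes: float,
    enzyme_ml: float,
    dilution: float = 1.0,
) -> float:
    """Convert glucose released (mg) into IU per mL of enzyme solution.

    1 IU is taken as 1 micromole of glucose equivalents released per minute.
    """
    if minutes <= 0 or enzyme_ml <= 0:
        raise ValueError("reaction time and enzyme volume must be positive")
    micromoles = glucose_mg / GLUCOSE_MOLAR_MASS * 1000.0
    return micromoles / minutes / enzyme_ml * dilution


# Hypothetical reading: 1.8016 mg glucose released in 10 min by 1 mL supernatant.
example = activity_iu_per_ml(glucose_mg=1.8016, minutes=10.0, enzyme_ml=1.0)
print(round(example, 3))
```

The glucose amount would normally be read from a DNS standard curve; that step is omitted here because its parameters are not given in the text.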
Sequencing and assembly
The sequenced reads were filtered by removing reads with adaptor sequences, reads with >10% Ns, and reads in which >50% of the nucleotides had a quality (Q) ≤5. The final output was the clean reads. A genome survey was performed with the clean reads by counting the frequency of 17-mers from 3.8 Gb of data from YYH13 and YYH16. The K-mer frequency was plotted using R.
The assembly was performed using SOAPdenovo with K-mer sizes ranging from 21 to 111, and the assembly with the largest N50 was chosen. Scaffolds shorter than 500 bp were removed from the final assembly.
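The read filter and the k-mer survey can be sketched in a few lines. The thresholds on Ns and low-quality bases follow the text; adaptor detection is omitted because the adaptor sequences are not given. The genome size estimate (total k-mers divided by the depth at the histogram peak) is a commonly used approximation and is an assumption here, not a method confirmed by the text; all function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def keep_read(seq: str, quals: List[int], max_n_frac: float = 0.10,
              low_q: int = 5, max_low_frac: float = 0.50) -> bool:
    """Keep a read unless it has >10% Ns or >50% of bases with Q <= 5."""
    if not seq or len(seq) != len(quals):
        raise ValueError("sequence and qualities must be non-empty and equal in length")
    n_frac = seq.upper().count("N") / len(seq)
    low_frac = sum(q <= low_q for q in quals) / len(quals)
    return n_frac <= max_n_frac and low_frac <= max_low_frac


def count_kmers(reads: Iterable[str], k: int = 17) -> Counter:
    """Count every k-mer that contains no N."""
    counts: Counter = Counter()
    for seq in reads:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def estimate_genome_size(counts: Counter) -> float:
    """Total k-mer count divided by the most common k-mer depth.

    Naive peak picking: real data would first exclude the low-depth
    error peak, which this sketch does not attempt.
    """
    if not counts:
        raise ValueError("no k-mers counted")
    hist = Counter(counts.values())  # depth -> number of distinct k-mers
    peak_depth = max(hist, key=lambda depth: (hist[depth], depth))
    return sum(counts.values()) / peak_depth


toy_reads = ["ACGTACGTAC"] * 3
toy_counts = count_kmers(toy_reads, k=4)
print(toy_counts["ACGT"], estimate_genome_size(toy_counts))
```

The toy reads use k = 4 only so that the counts are easy to follow by hand; the survey in the text used k = 17.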
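The selection rule for the assembly can be sketched as follows, assuming that the scaffold lengths of each candidate assembly are available as a list keyed by its K-mer size. The function names and the toy lengths are hypothetical; the N50 definition is the standard one, and the 500 bp cutoff follows the text.

```python
from typing import Dict, List, Tuple


def n50(lengths: List[int]) -> int:
    """Length L such that scaffolds of length >= L cover at least half the assembly."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def choose_assembly(assemblies: Dict[int, List[int]],
                    min_len: int = 500) -> Tuple[int, List[int]]:
    """Pick the K-mer size with the largest N50, then drop short scaffolds."""
    best_k = max(assemblies, key=lambda k: n50(assemblies[k]))
    kept = [length for length in assemblies[best_k] if length >= min_len]
    return best_k, kept


# Hypothetical scaffold lengths for two K-mer settings.
candidates_by_k = {
    21: [1000, 800, 400, 300],
    31: [2000, 600, 450],
}
best_k, final_scaffolds = choose_assembly(candidates_by_k)
print(best_k, final_scaffolds)
```

Here the N50 is computed before the length filter, matching the order in which the two steps are described.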
Phylogenetic analysis of YYH13 and YYH16
Genomic sequences spanning ITS1 and ITS2 were extracted from a previous sequencing study. A total of 54 ITS1-ITS2 sequences were used as queries in BLAST searches against the YYH13 and YYH16 assemblies. A phylogenetic tree with the ITS1-ITS2 sequences from YYH13 and YYH16 was built by first aligning them with 92 other sequences from the JGI database (http://jgi.doe.gov/) and GenBank (https://www.ncbi.nlm.nih.gov/genbank/). Species recognition in Trichoderma is usually based on the genealogical concordance phylogenetic species recognition concept, applied to partial gene sequences of translation elongation factor 1α (Tef-1), calmodulin (cal1-1), and chitinase 18-5 (chi18-5). To further confirm that YYH13 and YYH16 are strains of T. hamatum, the concatenated sequences of the Tef-1, cal1-1, and chi18-5 genes were used to construct a phylogenetic tree as described. A consensus tree was inferred using the neighbour-joining method. Bootstrap analysis was conducted using MEGA 5.1 (http://www.megasoftware.net/) with 1000 replications to obtain confidence values for the aligned sequence dataset. A phylogenetic tree was also constructed via maximum parsimony.
Gene annotation was performed using GeneMark-ES 2.3.e on the YYH13, YYH16, and GD12 assemblies. The GD12 assembly and the FgeneSH annotation of GD12 were downloaded from JGI. Each gene was annotated for its putative function using GO, the NCBI-nr database, KOG, and KEGG. Putative functional domains were annotated using Pfam (Protein families). Genes with putative CAZyme functions were annotated using dbCAN with version 4 of the database. A valid annotation required a database alignment >80 aa, E-value < 1e−5, and percent alignment coverage >30%. Genes with putative protease functions were annotated using the MEROPS database Release 9.13 (Rawlings et al.) with BlastP PID >35%, E-value < 1e−5, and bit score >30. Repetitive element annotation was performed using RepeatModeler and RepeatMasker (www.repeatmasker.org) under the default settings.
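The acceptance criteria for CAZyme and protease annotations can be written as two simple predicates. The thresholds are those stated above; the `Hit` record, its field names, and the example hits are hypothetical and do not reflect the output format of dbCAN or BlastP.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One database hit for a gene (illustrative fields only)."""
    gene: str
    family: str
    aln_len: int      # alignment length in amino acids
    evalue: float
    coverage: float   # percent alignment coverage
    pid: float        # percent identity
    bitscore: float


def passes_cazyme(hit: Hit) -> bool:
    """Alignment > 80 aa, E-value < 1e-5, coverage > 30%."""
    return hit.aln_len > 80 and hit.evalue < 1e-5 and hit.coverage > 30.0


def passes_protease(hit: Hit) -> bool:
    """PID > 35%, E-value < 1e-5, bit score > 30."""
    return hit.pid > 35.0 and hit.evalue < 1e-5 and hit.bitscore > 30.0


# Hypothetical hits for two genes.
good = Hit("gene_1", "GH1", aln_len=420, evalue=1e-50, coverage=85.0, pid=62.0, bitscore=300.0)
short = Hit("gene_2", "GH3", aln_len=60, evalue=1e-8, coverage=40.0, pid=30.0, bitscore=45.0)

print(passes_cazyme(good), passes_cazyme(short))
print(passes_protease(good), passes_protease(short))
```

All comparisons are strict, following the ">" and "<" signs used in the stated criteria.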
Genome sequence analysis
Orthologous relationships were determined first using Inparanoid  under the default settings. Each one-to-one orthologous relationship was examined for possible gene model improvement. Gene model improvement was performed via reciprocal genBlastG  comparisons between YYH13 and YYH16. Thus, genBlastG was performed with YYH13 genes as the query and the YYH16 genome as the target, and vice versa. The genBlastG model must lie within the same coordinates as the original gene model. The revised model is from the highest global PID among the three gene pairs (I: Original YYH13 gene and original YYH16 gene; II: Original YYH13 gene and genBlastG model in YYH16 genome; III: Original YYH16 gene and genBlastG model in YYH13 genome).
The gene model revision in GD12 was performed first using the revised YYH13 gene set as the query and further improved using YYH16. If a gene model was improved by both YYH13 and YYH16, only the revision from YYH13 was kept. Finally, the mean PID and standard deviation were calculated based on the revised one-to-one relationships for each pair of genomes.
Genome-specific genes were identified using Tribe-MCL (inflation value = 1.6) with the original gene set. Each genome-specific gene was then examined using genBlastG against the two other genomes under the default settings. If the genBlastG model and the query showed a global PID ≥ mean PID − 2 standard deviations, the genome-specific gene was considered a false positive and filtered out.
The synteny blocks between two genomes were analyzed using orthocluster with parameters “-f -rs”. The perfect synteny blocks did not allow for any mismatches. Imperfect synteny blocks were obtained with additional “-i 5 -o 5” parameters. The orthologous relationships used as input were the one-to-one relationships based on the Inparanoid results. The Circos diagram was constructed by including only the scaffolds containing gene models. The genome on the right was considered the reference, and the genome on the left was reordered.
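The false-positive filter can be sketched as follows. The cutoff of the mean PID minus two standard deviations follows the text; whether the sample or the population standard deviation was used is not stated, so the sample form (`statistics.stdev`) is an assumption, and the function names and values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Optional


def pid_cutoff(ortholog_pids: List[float]) -> float:
    """Mean PID of one-to-one orthologs minus two standard deviations."""
    return mean(ortholog_pids) - 2 * stdev(ortholog_pids)


def retain_specific(best_pid: Dict[str, Optional[float]], cutoff: float) -> List[str]:
    """Keep candidates with no model in the other genomes or with PID below the cutoff.

    Candidates whose best model reaches the cutoff are treated as false positives.
    """
    return sorted(
        gene for gene, pid in best_pid.items()
        if pid is None or pid < cutoff
    )


# Hypothetical PIDs of revised one-to-one orthologs between two genomes.
pids = [90.0, 92.0, 94.0, 96.0, 98.0]
cutoff = pid_cutoff(pids)

# Hypothetical best PID in the other genomes for each candidate (None = no model).
candidate_pids = {"cand_1": 95.0, "cand_2": 60.0, "cand_3": None}
specific = retain_specific(candidate_pids, cutoff)
print(round(cutoff, 2), specific)
```

In this toy case `cand_1` is dropped because a near-identical model exists in another genome, which suggests that it was missed by annotation rather than truly absent.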
Gene family comparison
The gene family annotations for CAZymes and proteases were mapped onto the orthologous relationships from Inparanoid. Genomes that were missing orthologous genes in a family were examined using genBlastG revision to ensure that the difference observed was not due to misassembly or misannotation. First, if genBlastG was able to produce a gene model in the target genome with percent identity (PID) > mean PID − 2 standard deviations, then the model was considered a valid homologous gene. Otherwise, it was considered a low-PID model and was filtered out. If gene family expansion had occurred, a valid genBlastG model may overlap with an existing gene annotation. Thus, if a valid genBlastG model overlapped with an existing gene annotation that already had an ortholog, the genBlastG model was filtered out. The genBlastG models were also annotated using dbCAN and MEROPS, as previously described, and marked as “No annotated function” if the sequence did not pass the annotation criteria.
Synonymous and non-synonymous mutations were determined from pair-wise alignments of revised one-to-one relationships. Ka/Ks ratio was calculated using Ka/Ks Calculator 2.0 using the MYN algorithm.
Real-time polymerase chain reaction
Mycelia were harvested, frozen and ground in liquid nitrogen. Total RNA was extracted from the mycelia using TRIzol (Invitrogen, USA), and polyA mRNAs were purified using a PolyATract mRNA Isolation System (Promega, Madison, WI) according to the manufacturer’s instructions. All cDNAs were synthesized via a reverse transcription reaction performed using ReverTra Ace (Toyobo, Japan) at 42 °C for 1 h and then 85 °C for 15 min to stop the reaction. The standard cycling protocol was 95 °C for 10 min followed by 40 cycles of 95 °C for 10 s and 59 °C for 50 s. All reactions were performed in triplicate. GAPDH was used as the internal reference gene. GH1 family β-glucosidase genes (YYH137902, YYH13952, YYH134611, YYH1311079, YYH1612, YYH1611163, YYH164503) were classified using the dbCAN analysis system: YYH1311079 was a strain-specific gene, whereas YYH137902 and YYH1612, YYH134611 and YYH164503, and YYH13952 and YYH1611163 were pairs of homologous genes. qRT-PCR was performed using PikoReal 96-well thermal cyclers (Thermo, USA) with primers and temperatures as described in Additional file 4.
Cloning and construction of recombinant plasmid expression vector
Escherichia coli strain BL21 (DE3) (Invitrogen, Carlsbad, CA, USA) was used for cloning and expression experiments. E. coli cells were grown in Luria–Bertani (LB) broth or on agar plates at 37 °C. Ampicillin (Sangon Biotech, Shanghai, China) was added to the growth media when required. The vector pGEX-KG (Takara, China) was used for polymerase chain reaction (PCR) cloning (Additional file 5). The coding sequence of YYH1311079 was amplified by PCR using a sense primer (5′-CGCGGATCCATGTCCAAAGAGGCGTCGATGTTC-3′) and an antisense primer (5′-CCCAAGCTTCTATATCCCTCTGCGCCTGGCAAAAG-3′) carrying BamHI (GGATCC) and HindIII (AAGCTT) restriction enzyme sites, respectively. The PCR protocol consisted of an initial denaturation at 95 °C for 1 min, followed by 30 cycles of amplification (95 °C for 10 s, 58 °C for 50 s, and 72 °C for 2 min) and a final extension step at 72 °C for 10 min. Two white single colonies were selected, inoculated into 5 mL of LB medium containing 5 µL of 100 μg/mL ampicillin, and cultured with shaking at 37 °C overnight. Plasmid DNA was extracted by alkaline lysis and subjected to two single digestions, with BamHI and HindIII respectively; the digests were then separated by electrophoresis on a 1% agarose gel to identify positive clones. Some of the constructed pGEX-KG/YYH1311079 expression plasmids were sent to Shen Zhen HuiDa an corp in China for sequencing.
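As a quick sanity check on the primer design, the short Python sketch below locates the BamHI and HindIII recognition sequences in the two primers and inspects the three bases that follow each site. The primer sequences are taken from the text with the internal spaces removed; the helper names are hypothetical.

```python
# Recognition sequences of the two restriction enzymes used for cloning.
SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT"}

SENSE = "CGCGGATCCATGTCCAAAGAGGCGTCGATGTTC"
ANTISENSE = "CCCAAGCTTCTATATCCCTCTGCGCCTGGCAAAAG"

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def site_position(primer, enzyme):
    """0-based index of the enzyme's recognition site, or -1 if absent."""
    return primer.find(SITES[enzyme])

def codon_after_site(primer, enzyme):
    """The three bases immediately downstream of the recognition site."""
    start = site_position(primer, enzyme) + len(SITES[enzyme])
    return primer[start:start + 3]
```

In the sense primer the bases after the BamHI site are the ATG start codon, and in the antisense primer the bases after the HindIII site are CTA, whose reverse complement is the TAG stop codon on the coding strand.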
the β-glucosidase activity
the CBH activity
the CMC activity
the filter paper activity
potato dextrose agar
the pretreated corn stover
CP, JC, and NC designed research. CP, BL, YH, and XY performed research. XX, LC, and SS contributed reagents and analytic tools. CP, BL, YS, and YH analyzed data. CP, NC, and YS wrote the paper. All authors read and approved the final manuscript.
The present study was supported by the National Natural Science Foundation of China (31570372), the Scientific Research Outstanding Youth Project of Hunan Provincial Education Department (15B112), the Science and Technology Project of Hunan Province (2015NK3005), the Natural Science Foundation of Hunan Province (2016JJ3074), the Science and Technology Project of Changsha City (kh1601179), and the Open Science Foundation of Hunan Provincial Key Laboratory for Germplasm Innovation and Utilization of Crop (15KFXM13).
The authors declare that they have no competing interests.
Availability of data and materials
Data supporting the conclusions of this article are included within the article.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.