Enhancement of cellulase production in Trichoderma reesei RUT-C30 by comparative genomic screening

Background Cellulolytic enzymes produced by the filamentous fungus Trichoderma reesei are commonly used in biomass conversion. The high cost of cellulase is still a significant challenge to commercial biofuel production. Improving cellulase production in T. reesei for application in the cellulosic biorefinery setting is an urgent priority. Results Trichoderma reesei hyper-cellulolytic mutant SS-II derived from the T. reesei NG14 strain exhibited faster growth rate and more efficient lignocellulosic biomass degradation than those of RUT-C30, another hyper-cellulolytic strain derived from NG14. To identify any genetic changes that occurred in SS-II, we sequenced its genome using Illumina MiSeq. In total, 184 single nucleotide polymorphisms and 40 insertions and deletions were identified. SS-II sequencing revealed 107 novel mutations and a full-length wild-type carbon catabolite repressor 1 gene (cre1). To combine the mutations of RUT-C30 and SS-II, the sequence of one confirmed beneficial mutation in RUT-C30, cre196, was introduced in SS-II to replace full-length cre1, forming the mutant SS-II-cre196. The total cellulase production of SS-II-cre196 was decreased owing to the limited growth of SS-II-cre196. In contrast, 57 genes mutated only in SS-II were selected and knocked out in RUT-C30. Of these, 31 were involved in T. reesei growth or cellulase production. Cellulase activity was significantly increased in five deletion strains compared with that in two starter strains, RUT-C30 and SS-II. Cellulase production of T. reesei Δ108642 and Δ56839 was significantly increased by 83.7% and 70.1%, respectively, compared with that of RUT-C30. The amount of glucose released from pretreated corn stover hydrolyzed by the crude enzyme from Δ108642 increased by 11.9%. Conclusions The positive attribute confirmed in one cellulase hyper-producing strain does not always work efficiently in another cellulase hyper-producing strain, owing to the differences in genetic background. Genome re-sequencing revealed novel mutations that might affect cellulase production and other pathways indirectly related to cellulase formation. Our strategy of combining the mutations of two strains successfully identified a number of interesting phenotypes associated with cellulase production. These findings will contribute to the creation of a gene library that can be used to investigate the involvement of various genes in the regulation of cellulase production. Electronic supplementary material The online version of this article (10.1186/s12934-019-1131-z) contains supplementary material, which is available to authorized users.


Background
Lignocellulosic biomass, which consists of cellulose, hemicellulose, and lignin, is a renewable resource that is abundantly available for the production of biofuels and chemicals. Biological conversion of lignocellulosic biomass into fermentable sugars by cellulosic enzymes is an environment-friendly and promising approach [1,2]. However, the production cost of biomass-degrading enzymes is still a significant challenge for commercial biofuel production [2]. Trichoderma reesei (an anamorph of Hypocrea jecorina) has been widely used to produce commercial cellulase required for the complete hydrolysis of lignocellulose [3]. The cellulase produced by T. reesei mainly comprises two cellobiohydrolases (CBHI and CBHII), two endoglucanases (EGI and EGII), and β-glucosidase I (BGLI) that synergistically hydrolyze lignocellulosic materials, in concert with related xylanases and auxiliary proteins [1,3].
In general, the regulation of cellulase expression in T. reesei depends on several transcription factors [4]. Xylanase regulator 1 (XYR1) is essential for the expression of most cellulase and xylanase genes [4,5]. Moreover, expression of cellulase and xylanase genes is subject to carbon catabolite repression (CCR) [6], regulated by carbon catabolite repressor 1 (CRE1) [7]. CCR facilitates preferential assimilation of easily metabolized carbon sources by inhibiting the expression of enzymes involved in the catabolism of other carbon sources. This is essential for the adaptation and survival of T. reesei [4,6].
Classical mutagenesis techniques have been used to generate many T. reesei hyper-cellulolytic strains that exhibit increased production of cellulases compared to that in the progenitor strain QM6a [8][9][10][11][12]. There are two distinct pedigree lineages of T. reesei mutant strains [8,11]. One was developed at Rutgers University (Fig. 1a). It includes the NG14 strain, which was derived from strain M7 (no longer available, shown in gray in Fig. 1a) through chemical mutagenesis using N-nitrosoguanidine (NTG) [13]. T. reesei RUT-C30, a carbon catabolite-repression mutant, was isolated from NG14 using ultraviolet (UV) mutagenesis. RUT-C30 is one of the best cellulase hyperproducers available in the public domain. RUT-C30 produces twice the amount of extracellular protein as that in the parental strain NG14 [8] and has diverse applications in research and industry [2]. Improving cellulase production in T. reesei RUT-C30 for application in the cellulosic biorefinery setting is increasingly becoming a focus of research [2,14,15].
Genetic changes can influence protein synthesis and secretion in T. reesei. Recently, the genomes of several T. reesei mutants were analyzed using a variety of techniques and several mutation sites were reported [11,12,[16][17][18]. Genome sequencing of RUT-C30 and its parental strain NG14 revealed 126 single nucleotide polymorphisms (SNPs) and 22 insertions and deletions (indels) between the original strain QM6a and NG14 mutant strain, and 97 SNPs and 11 indels between NG14 and RUT-C30 [11]. To date, two mutations involved in cellulase production in RUT-C30 have been identified. One is a truncated form of the cre1 gene (cre1 96 ), which exerts a positive regulatory influence on the expression of target genes [19][20][21]. The other is a frameshift mutation in the glucosidase II alpha subunit gene [22]. Interestingly, a cre1 mutation was also found in another T. reesei hyper-cellulolytic mutant, PC-3-7 [23]. Mutations of cre1 have been consistently linked to the hyper-production of cellulase. Many other mutations have been found in other cellulase hyper-producers, usually accompanied by the identification of novel genes that are involved in cellulase production. A comparative genomic analysis of PC-3-7 and its parent KDG-12 revealed a novel beta-glucosidase regulator BglR with a missense mutation. The deletion of BglR contributed to improved cellulase production [24]. Pei et al. [25] analyzed the hypersecretion-related mutant genes in T. reesei RUT-C30 and systematically deleted their orthologs in the fungus Neurospora crassa to identify several genes related to cellulase production, particularly Ncap3m. The deletion of Ncap3m increased the filter paper activity (FPase) and lignocellulose secretion by approximately 44% and 50%, respectively [25]. The novel transcription factor vib1 was found by genome sequencing analysis of T. reesei QM9978 [17]. Zhang et al. improved cellulase production of T. reesei RUT-C30 by overexpressing vib1 [14]. Therefore, genome-wide analysis of mutations is an efficient approach for the identification of novel genes involved in cellulase production.
In this study, we sequenced a T. reesei hyper-cellulolytic mutant (donated by Sunson Industry Group Co., Ltd, China) designated SS-II, which has been used for industrial production and has not been characterized through academic research. It exhibits a remarkable increase in the growth rate and carboxymethyl cellulase (CMCase) and p-nitrophenyl-β-d-glucosidase (pNPGase) activity compared with those of RUT-C30. SS-II also shows improved ability to degrade lignocellulosic biomass. To Biomass dry weight of T. reesei strains were measured in MA medium containing 2% (w/v) glucose (b), 2% (w/v) lactose (c), or 2% (w/v) Avicel (D) as the sole carbon source. FPase (e), CMCase (f), pNPCase (g), and pNPGase (h) activities and total secreted protein (i) of T. reesei strains were measured using Avicel as the carbon source. j Hydrolysis of pretreated corn stover using the crude enzyme from T. reesei strains at 20 FPU/g dry biomass. Values are the mean ± SD of the results from three independent experiments. Asterisks indicate significant differences (*p < 0.05, **p < 0.01, ***p < 0.001, Student's t test) Liu et al. Microb Cell Fact (2019) 18:81 identify novel genes related to cellulase production and to acquire more efficient cellulase-producing strains, 57 genes that were mutated only in SS-II and which have been demonstrated to be involved in transport, secretion, protein metabolism, and transcription were deleted in T. reesei RUT-C30 to construct mutants to screen their effects on cellulase production. Genome sequencing provided insight into the influence of genetic changes in cellulase production, ultimately leading to the production of more efficient cellulase-producing strains. The findings indicate that introducing specific mutations from one hyper-cellulolytic strain into another is a feasible strategy to improve cellulase production.

Phenotypic characteristics of T. reesei hyper-cellulolytic mutant SS-II
We obtained the T. reesei hyper-cellulolytic mutant strain, SS-II, which was used for cellulase production, from Sunson Industry between 2002 and 2010. was derived from strain NG14 through NTG mutagenesis. The well-known hyper-cellulolytic mutant RUT-C30 (ATCC 56765) was also isolated from NG14 using UV mutagenesis (Fig. 1a).
To characterize the differences in growth rates between SS-II and RUT-C30, T. reesei strains were grown in MA medium with 2% (w/v) glucose, 2% (w/v) lactose, or 2% (w/v) Avicel as the sole carbon source. SS-II grew significantly faster than RUT-C30 on all three carbon sources ( Fig. 1b-d). The genetic basis underlying the high growth rate of the SS-II requires further study.
To characterize the cellulase productivity of SS-II, T. reesei strains were grown containing 2% (w/v) Avicel as the carbon source. SS-II exhibited levels of FPase activity similar to that of the hyper-cellulolytic mutant RUT-C30 (Fig. 1e). SS-II exhibited lower pNPCase activity than RUT-C30. However, CMCase activity, pNPGase activity, and extracellular protein production in SS-II were higher than those in RUT-C30 ( Fig. 1f-i). SS-II, as a T. reesei hyper-cellulolytic mutant strain, also exhibited considerably enhanced CMCase, pNPCase, and pNPGase activities compared with those of NG14 ( Fig. 1e-i).
When the same FPase loading (20 filter paper cellulase units [FPU]/g dry biomass) was used to hydrolyze pretreated corn stover (PCS), SS-II showed an increase of 30.9% in glucose concentration, reaching 35.2 g/L, compared to that of RUT-C30 (Fig. 1j). The enhanced pNPGase and CMCase activities in the cellulase set of SS-II accounted for the higher glucose yield than that of RUT-C30 when the same FPase loading was used (Fig. 1j). Cellulase produced by SS-II was more effective than that from RUT-C30 for the degradation of lignocellulosic biomass to produce glucose. The difference in the ratios of the four kinds of hydrolase activities in the cellulase set of SS-II compared with those of RUT-C30 indicated again that the genetic basis of SS-II was quite different from that of RUT-C30, although SS-II and RUT-C30 were both derived from the same parental strain NG14.

Genomic analysis of T. reesei SS-II
The SS-II genome was sequenced to identify the genetic differences in SS-II and explore new genetic features that may be involved in the hyper-production of cellulase. The sequence determined in the Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession number SIJN00000000. The version described in this paper is version SIJN01000000.
In total, 8,232,344 paired-end reads (98.05% of the total) were mapped to the wild-type (QM6a) reference genome, representing an average coverage depth of 52×. This resulted in 97.02% coverage of the QM6a reference genome. Using the quality filter described in the methods, 184 single nucleotide polymorphisms (SNPs) and 40 insertions/deletions (indels) were identified in the SS-II genome (Additional file 1). The majority of the SNPs (71%) were in non-coding regions, largely in the promoter region. Forty-one (1.3%) of the SNPs were in the exons. Of these, 26 SNPs caused a change in the amino acid sequence. Most indels (80.8%) were also in the noncoding regions, with only three occurring in the exons (Table 1).
Of the 184 SNPs detected in the genome of SS-II, 77 were unique to this strain. They were not inherited from the parental strain NG14 and did not exist in the genome of the RUT-C30 offspring of NG14 [11]. Of the 40 indels detected in SS-II, 30 were unique to SS-II. Among the mutated genes affected by the 77 SNPs and 30 indels, 57

Table 1 Number of genes affected by different SNPs and indels
a The difference in the sum is due to mutations affecting the same gene simultaneously at the two regions-promoter and terminator. Differences in genes affected are due to several single nucleotide exchanges occurring in the same gene b SNPs and indels that exist in SS-II, but not in NG14 and RUT-C30

Total SNPs 184 (77) b Total indels 40 (30) b
In promoters 60 (18) In promoters 8 (5) In terminators 33 (22) In terminators 13 (11) In introns 13 ( (Table 2) were obtained based on whether the SNPs and indels were located in the exon, promoter (within 1 kb upstream of the start codon), terminator (within 0.5 kb downstream of the stop codon), or intron regions. These 57 selected genes were affected by the unique SNPs and indels, which were present in SS-II, but not in NG14 or RUT-C30. The manner in which the SNPs and indels affected the encoding proteins is summarized in Table 2. The data indicate the types of changes, such as missense mutations or amino acid exchanges. For example, tre108914 (annotated as methyltransferase type II) was affected by the SNP in the exon, where Cys was changed to Arg owing to the mutation of a T to a C. tre74622 (annotated as P-loop containing nucleoside triphosphate hydrolase protein) was affected by a deletion of 16 bases "GAT GAC GAT GAT TTTC" in the terminator region. Genes were also categorized based on gene ontology (GO) annotation. These 57 genes were probably involved in transport and secretion (9 genes), signal transduction (3 genes), transcription (6 genes), and protein metabolism (18 genes). Mutations in the proteins of SS-II could potentially lead to changes in growth rate and metabolite sensing. Interestingly, several genes containing multiple mutations were found in SS-II. They included tre76453 with two SNPs and tre104898 with four SNPs.

Introduction of truncated form cre1 96 sequence from RUT-C30 in SS-II to replace full-length cre1
SS-II and RUT-C30 were derived from the same parental strain NG14 (Fig. 1a). Unlike NG14 and the wild-type QM6a, RUT-C30 displays a catabolite-derepressed phenotype in the presence of glycerol or glucose, because of the partial lack of the cre1 gene ( Fig. 2a) encoding the carbon catabolite repressor 1 (CREI) [6,19,21]. RUT-C30 harbors a truncated form, cre1 96 (Fig. 2a), which has a positive regulatory influence on the production of cellulase [6,19,20]. Genome sequencing revealed that unlike SS-II, which displayed higher cellulase production compared with that of RUT-C30, as measured by FPase, SS-II possesses wild-type cre1 (Fig. 2a). To address whether the truncated cre1 96 could further improve cellulase production by SS-II and acquire a more efficient cellulaseproducing strain, the cre1 96 sequence from RUT-C30 was introduced into SS-II to replace the full-length cre1, which generated the mutant SS-II-cre1 96 (Fig. 2a).
The cellulase production of T. reesei SS-II-cre1 96 was measured upon cultivation in MA medium with lactose as the carbon source. As shown in Fig. 2b-e, the FPase and pNPCase activities of T. reesei SS-II-cre1 96 and SS-II were independently measured as U/mL fermentation broth and U/mg biomass. When the enzyme activities were measured as U/mL fermentation broth (Fig. 2b, d), FPase and pNPCase of T. reesei SS-II-cre1 96 were significantly decreased in comparison with those of SS-II. However, when the enzyme activities were measured as U/mg biomass, FPase and pNPCase activities of the SS-II-cre1 96 strain were remarkably increased after 5 days of cultivation compared with those of SS-II (Fig. 2c, e). The obvious difference between U/mL fermentation broth and U/mg biomass of the SS-II-cre1 96 strain was the substantially lower growth rate of SS-II-cre1 96 (Additional file 2: Fig. S1). Nakari-Setälä et al. [20] reported that the deletion of cre1 increased the quantity of cellulases produced by the wild-type T. reesei QM6a strain. Therefore, we mutated cre1 to obtain a better cellulase-producing strain from SS-II. However, truncation or deletion (Additional file 2: Fig. S2) of cre1 did not result in higher total cellulase production in SS-II, which was associated with the remarkably lower growth rate. The expression of cellulases in individual hyphae of SS-II-cre1 96 was increased, but total cellulase production of SS-II-cre1 96 was decreased because of the lower growth rate in SS-II-cre1 96 . This demonstrated that cre1 greatly affected the growth of T. reesei SS-II and that the full-length cre1 might be necessary for the rapid growth of SS-II. Mutation of cre1 is not always the key step to increase total cellulase production in T. reesei because of the effect on the growth of T. reesei.

RUT-C30
The 57 genes ( Table 2) with specific mutations in SS-II might be involved in cellulase hyper-production of SS-II. To address whether the 57-gene knockout (KO) could enhance cellulase production in T. reesei RUT-C30, we tested the cellulase production capacity of 57 KO mutants of RUT-C30. Furthermore, screening of the KO mutants of RUT-C30 was expected to combine the mutations of both RUT-C30 and SS-II, to acquire more efficient cellulase-producing strains from T. reesei RUT-C30.
The 57 genes were knocked out in RUT-C30 (Table 2). Only 46 mutants produced viable homokaryotic colonies, proving that these genes were not essential for survival. Homokaryotic transformants of 11 genes could not be obtained upon deleting them in RUT-C30, suggesting that these were essential genes in RUT-C30 (Table 2). These gene KO mutants died during the spore germination period. Compared with the original strain of RUT-C30, six deletion strains showed severely retarded growth (Additional file 2: Fig. S3).
We further screened these 46 mutants by determining their cellulase production capacity. The measured FPase activities of the mutants 5 days from cultivation in Avicel, which reflected total extracellular cellulase activity [38],  Table 2. The differences in FPase activities of the mutants compared with those of the parental strain RUT-C30 were statistically tested ( Table 2). FPase activities were significantly reduced in 15 strains compared with those in RUT-C30 (Table 2). Among these 15 mutants with reduced FPase activities, deletion of tre55868 (encoding a serine/threonine phosphatase), tre81043 (encoding a zinc finger, TFIIS-type), tre111216 (encoding a histone H3 methyltransferase), and tre80339 (encoding an apolipophorin-III and similar insect proteins) significantly reduced the FPase activity to 66.3%, 54.5%, 65.2%, and 56.5%, respectively. These 15 genes were speculated to be involved in cellulase hyper-production. However, further studies involving experiments, such as overexpression of these genes in RUT-C30, are required to verify this hypothesis. Extracellular cellulase production significantly increased in five mutants compared with that of RUT-C30 (Table 2). Of these five mutants, the Δ108642 strain (tre108642 encodes an unknown protein) displayed an 83.7% increase in FPase activity, which was the highest extracellular cellulase activity observed among the tested strains. The Δ56839 strain (tre56839 is annotated as a zinc-dependent alcohol dehydrogenase-like protein) demonstrated an approximately 70.1% increase in FPase activity ( Table 2). The Δ108642 and Δ56839 mutants displayed over 70% increase in their FPase activities.
To further analyze the effects of deletion of the tre108642 and tre56839 genes on the production of the cellulase set, the corresponding activities of various cellulase components and secreted protein concentrations were assayed in Δ108642 and Δ56839. The FPase, CMCase, pNPCase, and pNPGase activities of Δ108642 and Δ56839 were greater than those of the parental strain RUT-C30 (Fig. 3). In agreement with the noticeable increase in cellulase activities, 55% to 70% more secreted proteins were detected in the culture supernatant of Δ108642 and Δ56839 than those of RUT-C30 (Fig. 3).  Δ108642 and Δ56839 displayed increased FPase activities and secreted more protein than both the parent strain, RUT-C30 and strain SS-II, which harbored specific mutations ( Table 3). The crude enzyme produced by Δ108642 and Δ56839 was used for the hydrolysis of PCS to produce glucose, to evaluate the efficiency of cellulase activity in the T. reesei mutants (Table 3) with the same FPase loading (20 FPU/g dry biomass). The glucose yields after enzymatic hydrolysis from T. reesei SS-II, Δ108642, and Δ56839 were statistically compared with the yield from T. reesei RUT-C30 (Table 3). Compared with that of T.
reesei RUT-C30, the enzyme set of the Δ108642 mutant showed an 11.9% increase in glucose concentration at 48 h in the PCS hydrolysate. The Δ56839 strain showed a similar glucose concentration in the PCS hydrolysate. Although the PCS hydrolysate efficiency of Δ108642 was lower than that of SS-II, the total cellulase production of Δ108642 was superior to that of SS-II and RUT-C30.

Analysis of Δ108642 and Δ56839 strains with improved cellulase production
The mutants Δ108642 and Δ56839 were also re-complemented by transformation with the vectors R108642 and R56839, respectively. The complementation strains (R108642 and R56839) displayed restorations of the enzyme activities and secreted protein concentration, similar to those of the parent strain RUT-C30 (Additional file 2: Fig. S4). This proved that gene KO of tre108640 and tre56839 contributed to the improved cellulase production in Δ108642 and Δ56839.
To further confirm the effects of the mutations on cellulase expression, the transcriptional levels (expression as fold-change) of cellulase related genes, including the five main cellulase genes (cbh1, cbh2, egl1, egl2, and bgl1), and the key transcriptional activator gene xyr1, were analyzed at 48 h, 72 h, and 96 h using RT-qPCR. The mutant Δ108642 exhibited higher expression levels of all five genes compared to RUT-C30 (Fig. 4), consistent with the results of the enhanced cellulase activities. The improved transcription of xyr1 may be the main reason for the enhanced cellulase expression in T. reesei Δ108642 (Fig. 4). The transcript levels of the five main cellulase genes and xyr1 showed no significant difference between Δ56839 mutant and RUT-C30 (Fig. 4). In addition, the expression level of the activator of chaperone genes (hac1) [28] was monitored by RT-qPCR in Δ56839 (Fig. 4). In Δ56839, the hac1 expression level significantly increased compared with that in RUT-C30 at all time points. The increased expression level of hac1 activates many genes encoding endoplasmic reticulum chaperones and foldases [28], which might explain the enhanced production of cellulase by Δ56839.
tre108642 is annotated as an unknown protein, whose function needs further research. Moreover, tre56839 encodes an alcohol dehydrogenase. To further characterize the function of gene tre56839, it was expressed in Escherichia coli using the T7 expression vector pET22b. The purified TRE56839 protein was resolved using SDS-PAGE (Additional file 2, Fig. S5). TRE56839 displayed enzyme activity against cinnamyl alcohol. The activity was approximately 13.26/nmol/min/mg using NADP + as the coenzyme (the method of protein expression and enzyme activity assay is shown in Additional file 2). However, the role of tre56839 in cellulase expression remains unclear. Therefore, the reasons for increased cellulase production by Δ108642 and Δ56839 are different. Scientifically combining the effects of these KOs into one strain needs to be further researched for the construction of more hyper-cellulolytic strains.

Discussion
In this study, the genetic background of an industrial hyper-cellulase-producing strain of Trichoderma reesei, SS-II, was analyzed. T. reesei SS-II is derived from T. reesei NG-14, the parental strain of the industrial precursor strain RUT-C30. The SS-II strain exhibited higher CMCase activity and increased growth compared with those of the RUT-C30 strain. Cellulase produced by SS-II was more efficient than that from RUT-C30 for the degradation of lignocellulosic biomass to produce glucose. However, the SS-II strain showed similar levels of FPase activity, and lower pNPCase activity compared with those of the RUT-C30 strain. Thus, we sought to combine the mutations of RUT-C30 and SS-II to acquire more efficient cellulase-producing strains.
We sequenced the SS-II strain and identified a number of new mutations, mainly SNPs and indels. Fiftyseven genes that were uniquely mutated in SS-II, but not in NG14 and RUT-C30, were identified. The impact of the 57 genes on cellulase production has not been studied before. The 57 genes were knocked out in RUT-C30. Compared with those in RUT-C30, cellulase activity was significantly increased in five deletion strains. KO of 15 genes in RUT-C30 decreased the FPase activities in RUT-C30. Eleven genes were proven to be essential genes in RUT-C30.
Deletion of tre108642 in RUT-C30 produced the highest extracellular cellulase activity among the tested strains. The improved transcription of xyr1 is proposed to be the main reason for the enhanced cellulase expression in T. reesei Δ108642. The components of cellulase affected the efficiency of hydrolysis in the pretreated biomass [14,39]. Cellulase from Δ108642 can produce more glucose through the degradation of PCS than that of T. reesei RUT-C30 with the same FPase loading. Strain Δ108642 can produce more cellulase than that of T. reesei RUT-C30 or SS-II. T. reesei Δ108642 may have potential value in industrial cellulase production. A BLAST search of tre108642 indicated it encodes an unknown protein ( Table 2). The function and mechanism of this protein on cellulase production require further investigation.
Only one alcohol dehydrogenase (glucose-ribitol dehydrogenase 1, GRD1) has been studied in detail in T. reesei [40]. The activity of GRD1 on cellobiose was reported. Deletion of grd1 in T. reesei leads to decreased cellulase activity. Presently, we found two alcohol dehydrogenase genes, tre56839 and tre108784, which affected cellulase production in T. reesei (Table 2). After examing some substrates (methanol, ethanol, cinnamyl alcohol, butanol, isopropyl alcohol, benzyl alcohol, and ribitol) for TRE56839 and TRE108784, the TRE56839 protein has enzyme activity for cinnamyl alcohol using NADP + as the coenzyme (Additional file 2). However, the catalytic substrates for tre108784 are still unknown. Cinnamyl alcohol dehydrogenase is a key enzyme in lignin biosynthesis and is involved in the final step of the monolignol synthesis in plants [41,42]. Genetic evidence indicates that cinnamyl alcohol dehydrogenase deficiency in plants decreases overall lignin and alters cell wall structure [41,42]. Cinnamyl alcohol dehydrogenases in Saccharomyces cerevisiae participate in the degradation of lignin and NADP(H) homeostasis [43]. However, cinnamyl alcohol dehydrogenases in filamentous fungi have not been studied. TRE56839 is the first identified cinnamyl alcohol dehydrogenase in filamentous fungi. We predict that tre56839, which encodes the fungal cinnamyl alcohol dehydrogenase, may participate in the degradation of lignin and NADP(H) homeostasis. This ligninolytic activity will be studied. The deletion of tre56839 may alters the NADP(H) homeostasis in T. reesei cells. However, its relationship with increasing protein secretion (Fig. 3) remains unknown. The present data indicate that three NAD(P) related protein genes (tre56839, tre108784, and tre66256) influence cellulase production, as their deletion led to the significant increase of cellulase production ( Table 2). The precise mechanism(s) relating NADP(H) homeostasis and cellulase production warrant further investigation.
Among the tested strains, 15 deletion strains showed decreased FPase activities in RUT-C30. Overexpression of the 15 genes may potentially yield cellulase hyperproduction strains. Deletion of two putative methyltransferase genes, tre111216 and tre59381, in RUT-C30 significantly decreased cellulase activities ( Table 2). Methyltransferases are a large group of enzymes that methylate their substrates [44]. As the most common class of methyltransferases, class I methyltransferase LAE1 (gene ID: tre41617, S-adenosyl-methioninedependent) positively regulates secondary metabolite gene expression in T. reesei [35]. Presently, tre59381 was also BLAST searched and was revealed to encode a class I S-adenosyl-l-methionine-dependent methyltransferase. Therefore, we predict that tre59381 has a positive effect on the cellulase production, like LAE1. This remains to be confirmed. In addition, tre111216 was BLAST searched and revealed to encode a histone H3 methyltransferase belonging to class II methyltransferases (Table 2). Saccharomyces cerevisiae Set1p is a histone H3 methyltransferase that has a central role in transcriptional regulation and efficient gene expression [45]. Therefore, the tre111216 deletion in T. reesei might affect the transcriptional regulation and reduce cellulase production ( Table 2). The deletion strain Δ81043 (tre81043 is annotated as zinc finger, TFIIStype) displayed significantly reduced cellulase activity ( Table 2). Cellulase expression is tightly regulated by transcription factors in T. reesei [26,27,29,30]. Therefore, whether tre81043 is a positive transcription factor for cellulase expression requires further study. Other deleted strains with reduced cellulase production are also meaningful for the understanding of cellulase regulation. These strains include Δ112034 (tre112034 is annotated as MFS general substrate transporter); tre112034 might be an important transporter for carbon sources. The present data helped to identify a number of interesting target genes associated with cellulase production.
Further, we combined the beneficial mutation of RUT-C30 into SS-II to acquire a more efficient cellulase-producing strain. One confirmed beneficial mutation is the lack of the full version of CRE1 (CRE1 96 ) in RUT-C30, which exerts a positive regulatory influence on the expression of cellulase genes [20,46]. However, the cre1 96 replacement did not improve the total cellulase production of SS-II because of the remarkably decreased growth rate (Fig. 2). Unlike the wild-type T. reesei QM6a, T. reesei hyper-cellulolytic strain RUT-C30 displayed a catabolite-derepressed phenotype in the presence of glycerol or glucose, because of the partial lack of the cre1 gene (cre1 96 ) encoding the truncated carbon catabolite repressor 1 (CRE1 96 ). However, T. reesei SS-II possesses wild-type cre1. The cre1 96 sequence from RUT-C30 was introduced into SS-II to replace the full-length cre1, forming the mutant SS-II-cre1 96 (Fig. 2a). Additionally, we deleted cre1 in SS-II to form the mutant SS-II-Δcre1 (Additional file 2: Fig. S6). Both SS-II-cre1 96 and SS-II-Δcre1 strains showed remarkedly decreased growth rates compared with T. reesei SS-II. This result is consistent with the description by Nakari-Setälä et al. [20] reporting that deleting or truncating cre1 in wild-type T. reesei QM6a causes a growth delay. Therefore, cre1 truncation is a possible explanation for the slow growth rate of RUT-C30. Possession of wild-type cre1 is a possible explanation for the faster growth rate of SS-II. The results highlight that a positive attribute confirmed in one cellulase hyper-producing strain might not always work well in other cellulase hyper-producing strains because of different genetic backgrounds.
Our strategy of combining the mutations of RUT-C30 and SS-II successfully helped to identify a number of interesting phenotypes with regard to cellulase production. Novel genes involved in cellulase production will gradually be revealed, in part with the contribution of our data. Our research contributes to the creation of a library of genes, which can be investigated for their involvement in the regulation of cellulase production. These data will be used to develop new, improved, and more efficient cellulase-producing strains by combining the advantages of hyper-cellulolytic mutants.

Conclusions
T. reesei hyper-cellulolytic strain SS-II derived from NG14 grew faster and more effectively degraded lignocellulosic biomass to produce more glucose than the RUT-C30 strain, which was also isolated from NG14. SS-II genome sequencing revealed a full-length cre1. Using comparative genomic analysis, we identified multiple mutations in the SS-II strain. Fifty-seven genes mutated only in SS-II but not in NG14 and RUT-C30 were identified. Among these mutated SS-II genes, the tre108642 and tre56839 deletions in T. reesei RUT-C30 significantly improved cellulase production and secretion. T. reesei Δ108642 produced a higher yield of cellulase and was more effective for the degradation of lignocellulosic biomass to produce glucose than RUT-C30. Therefore, combining mutated genes of two hyper-cellulolytic strains to enhance the cellulase production in T. reesei might be an effective strategy for enhancing cellulase production.

Microorganism strains and culture conditions
Escherichia coli DH5α strain was used for vector construction and propagation. Agrobacterium tumefaciens AGL1 strain was used for the transformation of genes into T. reesei strains. The strains used in this research included the original strain T. reesei QM6a (ATCC 13631), NG14 (ATCC 56767), RUT-C30 (ATCC 56765), and SS-II. Spores were harvested by cultivating fungus on potato dextrose agar plates (PDA) at 28 °C for 6 days. For genomic DNA extraction, strains were cultivated on MA medium [47] containing 2% glucose as a sole carbon source at 28 °C for 48 h.
To analyze enzyme production, conidia (final concentration 10 6 /mL) of T. reesei strains were inoculated into 100 mL of MA medium containing 2% (w/v) Avicel (PH-101; Sigma-Aldrich, St. Louis, MO, USA) in a 500 mL shake flask incubated at 28 °C and 200 rpm for 5 to 7 days [48]. Mycelia were collected at different time intervals and kept frozen at − 80 °C for RNA extraction. The supernatant was used for enzyme assays.

Fungal growth and biomass assay
For T. reesei plate growth assays, the conidia were diluted to 10 7 /mL with sterile water. An equal volume of conidia solution (2 μL) was cultured onto the center of PDA plates for approximately 3 to 7 days at 28 °C.
For T. reesei biomass assays, conidia (final concentration 10 6 /mL) of T. reesei strains were inoculated into 100 mL of MA medium containing 2% (w/v) Avicel, 2% (w/v) lactose, or 2% (w/v) glucose at 28 °C and 200 rpm for 72 h. The biomass dry weight was measured as previously described [48]. Each experiment was performed in three biological replicates.

Genomic DNA sequencing and bioinformatics analysis
Genomic DNA was prepared with the TIANamp Yeast DNA Kit (Tiangen, Beijing, China). The integrity of the DNA and the absence of RNA contamination were analyzed using an Agilent 2100 Bioanalyzer and 1% agarose gel electrophoresis. Chromosomal DNA of T. reesei SS-II was sequenced using a model 251 PE Illumina MiSeq apparatus at the Shanghai Personalbio Biotechnology facility (Shanghai, China). The genome sequence of the wild-type strain QM6a (version 2.0) was downloaded from the Joint Genome Institute of Department of Energy (USDOE-JGI) website (http://genom e.jgi-psf.org/ Trire 2/Trire 2.home.html) and was used as the reference for comparative genomic analysis. Sequence reads were mapped onto the reference sequence using bwa (bwa-0.7.5a) [49] with default parameters. PCR duplicates were removed from the bam files using SAMtools (samtools-0.1.19) [50]. High-quality bam files were produced using GATK (GenomeAnalysisTK-2.2-15) and CountCovariates, TableRecalibration, RealignerTargetCreator, and IndelRealigner [51].
SNP detection was performed by comparing mapped sequences between the reference and the samples. Loci with heterozygous and homozygous genotypes in a sample different from the reference base were defined as SNPs. We used GATK to detect SNPs and indels. The filtering conditions to remove incorrectly identified SNPs and indels were "QD < 2.0, " "MQ < 40.0, " "FS > 60.0, " "Hap-lotypeScore > 13.0, " "MQRankSum < -12.5, " "ReadPos-RankSum < -8.0, " "DP < 4 or DP > 200, " "10 bp containing 3 or more SNPs;" "QD < 2.0, " "ReadPosRankSum < -20.0, " and "FS > 200.0. " The total number of SNPs and indels was counted and categorized as being located in the promoter region (within 1 kb upstream of the start codon), terminator region (within 500 bp downstream of the stop codon), intron, or exon. They were also characterized as to whether they caused a nonsynonymous amino acid substitution within an exon by using the T. reesei filtered gene models from the Joint Genome Institute.

Mutation confirmation of SS-II
To validate that the newly identified SNPs and indels truly exist in the SS-II genome, 0.5-to 1-kb fragments surrounding each mutation were amplified from the SS-II genomic DNA. These fragments were sequenced directly in an ABI 3730 XL sequencer (Majorbio, Shanghai, China). The primers used for amplification and sequencing are listed in Additional file 3.

Vector construction and transformation
Trichoderma reesei RUT-C30 lacking tku70 [52] was used as a recipient for all targeted gene knockouts. Deletion cassettes for selected genes were constructed by ligating 0.9 to 1 kb 5′-and 3′-flanks of each gene to the hygromycin resistant plasmid LML2.1 [52], according to our previous study [53]. As shown in Additional file 2: Fig. S6, the upstream fragment was ligated into the PacI-and XbaIlinearized LML2.1 using the ClonExpress ™ II One Step Cloning Kit (Vazyme, Nanjing, China). Subsequently, the downstream fragment was inserted into the SwaI site to form the deletion cassettes. Primers for the construction of all gene deletion vectors are presented in Additional file 3. The re-complementation cassettes of the genes were constructed by ligating the whole gene sequences (including the 1.5 kb promoter, gene coding sequence, and 1 kb terminator) to LML2.1. Re-complementation cassettes were transformed into the corresponding gene knockout mutants as previously described [53]. Primers for the construction of re-complementation cassettes are shown in Additional file 3. The vector pSS-II-cre1 96 was constructed by ligating upstream and downstream homologous arms to the hygromycin resistant plasmid LML2.1, as shown in Fig. 2a. The vector pSS-II-cre1 96 was introduced into SS-II, to replace full-length cre1, forming the mutant SS-II-cre1 96 .
All the constructed cassettes were transformed into T. reesei RUT-C30 by Agrobacterium-mediated transformation (AMT) [52]. Strains were selected using hygromycin B and cefotaxime on Mandel medium. The hygromycin resistant cassette was self-excised by xyloseinduced Cre recombinase [52]. The putative gene disruption mutants generated by double crossover were verified by diagnostic PCR using the primers XX-CF and XX-CR (XX represents the gene name) (Primers shown in Additional file 3). The fragments generated from the genome of transformants by PCR using the primers XX-CF and XX-CR were sequenced to verify the correct transformants.

Real-time quantitative PCR (RT-qPCR)
RNA extraction, reverse-transcription, and RT-qPCR assay were performed following previously described protocols [53]. The forward and reverse primers are listed in Additional file 3.

Enzyme activity assays
Filter paper hydrolase (FPase) activity, representing total extracellular cellulase activity, was determined using the 3,5-dinitrosalicylic acid method [54]. Endoglucanase activity (CMCase) was measured using 1% carboxymethylcellulose (CMC, Sigma-Aldrich) in 50 mM sodium acetate buffer (pH 5.0) at 50 °C for 30 min. One unit of activity was defined as the amount of enzyme that produces 1 μmol of reducing sugar per min.
Cellobiohydrolase activity (pNPCase) was determined using p-nitrophenol-d-cellobioside (pNPC) as a substrate [53]. β-glucosidase activity (pNPGase) was determined using p-nitrophenol-d-glucopyranoside (pNPG) as a substrate [14]. The release of p-nitrophenol was assessed by measuring absorbance at 405 nm. One unit of enzymatic activity was defined as 1 μmol of p-nitrophenol released from the substrate per min.