From: Increasing the production of the bioactive compounds in medicinal mushrooms: an omics perspective
Genomics techniques | Approaches and tools | Description | Advantages | Constraints | References |
---|---|---|---|---|---|
Genome mining | Southern blots | A classical approach: probes are designed from conserved sequences and then used to screen for BGCs | Can be used for screening novel BGCs | Time-consuming | [150] |
Genome mining | In silico nucleotide/amino acid sequence alignment tools (e.g. BLAST, Diamond, and HMMer) | A conserved sequence is used as a query to mine databases (such as ClustScan and NP.searcher) and genome sequences for BGCs | Can be used for screening novel BGCs | Mostly restricted to particular classes of BGCs, such as those encoding polyketide synthases and non-ribosomal peptide synthetases | |
Genome mining | In silico software (e.g. PRISM and antiSMASH) | Profile hidden Markov models, built from sequence alignments of genes specific to particular types of BGCs, are used to predict BGC types | User-friendly | Only BGCs similar to previously recognized pathways can be identified | |
Genome mining | Machine learning-based mining tools | Machine learning tools such as ClusterFinder and DeepBGC are employed. They are trained on a set of known clusters or on a set of clusters predicted by in silico software | Can be used for discovering unknown BGCs | The false-positive rate is much higher than that of in silico software; identifying completely novel BGCs is difficult | |
[150] | |||||
Genome sequencing | Shotgun sequencing | Sanger sequencing is used to obtain reads with overlapping ends from smaller fragments of DNA. These overlapping ends then guide the assembly of the reads into the original target sequence | Routinely utilized for short reads | Generates large amounts of data containing sequencing errors; assembly of complex genomes is challenging; needs numerous overlapping reads for each fragment of the original DNA | |
Genome sequencing | Whole-genome shotgun sequencing | Genomic DNA is cut into random fragments, size-selected, and cloned into a suitable vector. The Sanger method is used to sequence clones containing short inserts from both ends (paired-end sequencing). Sequence assembly software then reconstructs the original sequence from these reads | May be used to improve the accuracy of existing sequence information; may help revise or close gaps left by other DNA sequencing technologies | Greater risk of long-range misassemblies and sequencing errors; requires sequence assembly software | |
Genome sequencing | Next-generation sequencing (e.g. Roche 454 and Illumina) | Delivers large amounts of high-throughput information from multiple samples per run in parallel | Fast; less expensive; delivers better coverage | Needs intricate bioinformatics interpretation; requires considerable data storage | [156] |
Genome annotation | Ab initio | Depends on statistical models and gene signals; uses programs such as AUGUSTUS and FGENESH | Identifies novel genes quickly and easily | Less accurate than homology-based approaches | [149] |
Genome annotation | Homology-based | Depends on sequence alignments; collects data from coding sequences, proteins, expressed sequence tags, and cDNA; uses programs and tools such as Exonerate and DIALIGN | More accurate than ab initio approaches; beneficial for functional annotation | Genes that differ from the reference genes are difficult to identify; accuracy decreases with increasing evolutionary distance | |
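
The shotgun sequencing rows describe reads whose overlapping ends guide assembly back into the original sequence. The following is a minimal sketch of that idea, assuming error-free reads from a single strand and a simple greedy merge; it is not the algorithm of any assembler cited in the source. The names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all hypothetical and chosen only for illustration.

```python
from itertools import permutations


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that is a prefix of `b`.

    Overlaps shorter than `min_len` (an illustrative threshold) count as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two reads with the largest suffix/prefix overlap.

    Assumes error-free, same-strand reads; real data needs far more care.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_pair = 0, None
        for a, b in permutations(contigs, 2):
            n = overlap(a, b, min_len)
            if n > best_len:
                best_len, best_pair = n, (a, b)
        if best_pair is None:
            break  # nothing left to merge: return several contigs
        a, b = best_pair
        contigs.remove(a)
        contigs.remove(b)
        contigs.append(a + b[best_len:])  # append only the non-overlapping tail
    return contigs


# Hypothetical reads covering a 15-base toy sequence.
reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
print(greedy_assemble(reads))
```

Even in this toy form, the sketch shows why the table lists "numerous overlapping reads" as a requirement: reads that share no overlap of at least `min_len` bases cannot be joined and remain separate contigs.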