
Table 3 Main techniques in genomics investigations, their different approaches, advantages, and limitations

From: Increasing the production of the bioactive compounds in medicinal mushrooms: an omics perspective

Genomics techniques

Approaches and tools

Description

Advantages

Constraints

References

Genome mining

Southern blots

Is considered a classical approach

Probes designed from conserved sequences are used to screen for BGCs

Can be used for screening novel BGCs

Time-consuming

[150]

In silico nucleotide/amino acid sequence alignment tools (e.g. BLAST, DIAMOND, and HMMER)

A conserved sequence is used as a query to mine databases (such as ClustScan and NP.searcher) and genome sequences for BGCs

Can be used for screening novel BGCs

Is mostly restricted to identifying particular classes of biosynthetic enzymes, such as polyketide synthases and non-ribosomal peptide synthetases

In silico software (e.g. PRISM and antiSMASH)

Profile hidden Markov models, built from sequence alignments of genes specific to particular BGC types, are used to predict BGC types

User-friendly

Only BGCs similar to previously characterized pathways can be identified

Machine learning-based mining tools

Machine learning tools such as ClusterFinder and DeepBGC are employed

These tools are trained on a set of known clusters or on a set of clusters predicted by in silico software

Can be used for discovering unknown BGCs

The false-positive rate is much higher than with in silico software

Identifying completely novel BGCs is difficult

[150]

Genome sequencing

Shotgun sequencing

Sanger sequencing is used to obtain reads with overlapping ends from smaller fragments of DNA. These overlaps then guide the assembly of the reads into the original target sequence

Routinely utilized for short reads

Relies on large amounts of data that contain sequencing errors

Assembly of complex genomes is challenging

Needs numerous overlapping reads for each fragment of the original DNA

[151,152]

Whole-genome shotgun sequencing

Genomic DNA is cut into random fragments, size-selected, and cloned into a suitable vector. Clones containing short inserts are sequenced from both ends by the Sanger method (paired-end sequencing). Sequence assembly software then reconstructs the original sequence from these reads

May be used for enhancing the accuracy of pre-existing sequencing information

May help correct or close gaps left by other DNA sequencing technologies

Higher risk of long-range misassemblies and sequencing errors

Requires software

[153,154,155]

Next-generation sequencing (e.g. Roche 454 and Illumina)

Sequences multiple samples in parallel per run, delivering large amounts of data at high throughput

Fast

Less expensive

Delivers better coverage

Requires complex bioinformatic analysis

Requires considerable data storage

[156]

Genome annotation

Ab initio

Depends on statistical models and gene signals

Uses programs such as AUGUSTUS and FGENESH

Identifies novel genes quickly and easily

Less accurate than homology-based approaches

[149]

Homology-based

Depends on sequence alignments

Collects data from coding sequences, proteins, expressed sequence tags, and cDNA

Uses programs and tools such as Exonerate and DIALIGN

More accurate than ab initio approaches

Beneficial for functional annotations

Identifying genes that differ from the reference genes is difficult

Accuracy declines as evolutionary distance from the reference increases

Abbreviations: BGC, biosynthetic gene cluster; BLAST, basic local alignment search tool; PRISM, prediction informatics for secondary metabolomes
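The shotgun sequencing entry in the table describes reads whose overlapping ends guide assembly back into the original sequence. The toy sketch below illustrates that idea with a greedy overlap merge. It is an illustration only, not a method from the cited references: the function names (overlap, greedy_assemble), the min_len threshold, and the example reads are all assumptions made here, and the sketch ignores sequencing errors, reverse complements, and repeats, which is exactly where the table notes that real assembly becomes difficult.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that matches a prefix
    of `b`, or 0 if no such overlap of at least `min_len` bases exists."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when a single contig remains or when no pair overlaps by at
    least `min_len` bases. Returns the remaining contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing left to merge
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical error-free reads covering one short target sequence.
reads = ["AGACGTCA", "ATGGCCTT", "CCTTAGAC"]
print(greedy_assemble(reads))  # ['ATGGCCTTAGACGTCA']
```

The greedy choice keeps the sketch short, but it can merge the wrong pair when a genome contains repeated regions, which is one reason the table lists the assembly of complex genomes as a constraint.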