Robert J. Gifford edited this page Jun 23, 2024 · 7 revisions

Sequence similarity search tools, such as the Basic Local Alignment Search Tool (BLAST), are essential for biological sequence analysis. These tools detect regions of local similarity between molecular sequences. One common use is to characterize a locus in detail by identifying the coordinates of specific sequence features at the protein or nucleic acid level (e.g., conserved protein motifs, oligonucleotide primer sites).
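As a minimal sketch of how a search result yields feature coordinates, the Python below parses BLAST's default 12-column tabular output (`-outfmt 6`) and reports where each hit lies on the subject sequence. The `Hit` type, the `parse_tabular_hits` function, and the two example rows are illustrative inventions, not output from a real search.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One row of BLAST tabular output, reduced to the fields used here."""
    query: str
    subject: str
    pident: float
    s_start: int   # leftmost subject coordinate (1-based)
    s_end: int     # rightmost subject coordinate (1-based)
    strand: str    # "+" or "-", inferred from the order of sstart/send
    evalue: float
    bitscore: float


def parse_tabular_hits(text: str) -> List[Hit]:
    """Parse default -outfmt 6 rows: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore (tab-separated)."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 columns, got {len(fields)}")
        sstart, send = int(fields[8]), int(fields[9])
        # BLAST reports sstart > send when the hit is on the reverse strand.
        strand = "+" if sstart <= send else "-"
        hits.append(Hit(
            query=fields[0],
            subject=fields[1],
            pident=float(fields[2]),
            s_start=min(sstart, send),
            s_end=max(sstart, send),
            strand=strand,
            evalue=float(fields[10]),
            bitscore=float(fields[11]),
        ))
    return hits


# Invented example rows, for illustration only.
EXAMPLE = (
    "probe1\tchr1\t98.5\t200\t3\t0\t1\t200\t1500\t1699\t1e-100\t360.0\n"
    "probe1\tchr2\t91.0\t200\t18\t0\t1\t200\t5200\t5001\t2e-70\t250.0\n"
)

for hit in parse_tabular_hits(EXAMPLE):
    print(hit.subject, hit.s_start, hit.s_end, hit.strand)
```

Normalizing the coordinates so that `s_start` is always the smaller value, with orientation stored separately, is a design choice that makes later interval comparisons simpler.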

The basic functions of BLAST can be expanded into comprehensive investigative strategies for comparative analysis of genes and genomes. This might involve using different combinations of probe sequences and target databases or integrating BLAST searches with other sequence analysis methods (e.g., phylogenetic or statistical analysis).
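One way to picture "different combinations of probe sequences and target databases" is as a grid of searches, one per probe/database pair. The sketch below only builds BLAST+ command lines for such a grid and does not run them. The function name `build_blast_commands` and all file and database names are hypothetical, and the target databases are assumed to have been created beforehand with `makeblastdb`.

```python
from itertools import product
from typing import List, Sequence


def build_blast_commands(
    program: str,
    probe_files: Sequence[str],
    databases: Sequence[str],
    evalue: float = 1e-5,
) -> List[List[str]]:
    """Return one BLAST+ argument list per (probe file, database) pair."""
    commands = []
    for probe, db in product(probe_files, databases):
        commands.append([
            program,            # e.g. "blastn" or "tblastn"
            "-query", probe,    # FASTA file of probe sequences
            "-db", db,          # name of a pre-built BLAST database
            "-evalue", str(evalue),
            "-outfmt", "6",     # tabular output
        ])
    return commands


# Hypothetical inputs: two protein probe sets against two genome databases.
for cmd in build_blast_commands(
    "tblastn",
    ["probes_a.faa", "probes_b.faa"],
    ["genome_1", "genome_2"],
):
    print(" ".join(cmd))
```

Each argument list could then be passed to `subprocess.run` on a machine with BLAST+ installed.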

Similarity searches enable researchers to selectively recover similar, and thus potentially related, sequences from the vast quantity of data held in sequence databases. In effect, they serve as a 'search engine' for these databases, and the similarities they reveal may indicate evolutionary relationships. This function is particularly useful for comparative and evolutionary studies, especially given the rapid accumulation of sequence data.

BLAST-based approaches are particularly useful for investigating genomic features that are poorly annotated in public databases, such as small RNAs, pseudogenes, transposable elements, highly duplicated gene families, and endogenous viral elements (EVEs). More broadly, BLAST searches can underpin heuristic in silico investigations, where the overall strategy is loosely defined and requires multiple iterations of trial and error, using new information from each iteration to refine the approach.
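The iterative character of such investigations can be illustrated with a small loop in which the hits from one round become the probes for the next. This is only one possible reading of "using new information from each iteration", sketched under stated assumptions: `iterative_screen` is a hypothetical name, and `search` stands in for whatever similarity search is actually used (here it is any callable that maps a probe identifier to the identifiers of its hits).

```python
from typing import Callable, Iterable, Set


def iterative_screen(
    seed_probes: Iterable[str],
    search: Callable[[str], Iterable[str]],
    max_rounds: int = 5,
) -> Set[str]:
    """Repeat a search, feeding newly found sequences back in as probes.

    Stops when a round finds nothing new or after max_rounds rounds.
    """
    known = set(seed_probes)      # everything seen so far
    frontier = set(seed_probes)   # probes to use in the next round
    for _ in range(max_rounds):
        new_hits: Set[str] = set()
        for probe in frontier:
            new_hits.update(search(probe))
        new_hits -= known         # keep only sequences not seen before
        if not new_hits:
            break                 # converged: no new information this round
        known |= new_hits
        frontier = new_hits
    return known


# Toy stand-in for a real search: a fixed lookup table of invented names.
toy_hits = {"seed": ["hit1"], "hit1": ["hit2"], "hit2": []}
print(sorted(iterative_screen(["seed"], lambda p: toy_hits.get(p, []))))
```

In practice each round would usually include a manual or automated review step before hits are promoted to probes, which this sketch omits.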

While systematic BLAST screens of genome databases are crucial for many comparative genomics investigations, efficiently implementing these procedures and integrating them into bioinformatics workflows can be technically challenging.