Introduction to BLAST
- BLAST stands for Basic Local Alignment Search Tool. It is an algorithm for comparing primary biological sequence information, such as the amino acid sequences of proteins or the nucleotide sequences of DNA and RNA.
- BLAST allows rapid comparison of a query sequence (nucleotides or amino acids) against a database of sequences.
- BLAST is used to find similarity between sequences by searching sequence databases for matches to a query sequence. It helps researchers identify genes and genetic features and explore evolutionary relationships.
- BLAST was developed in 1990 by Stephen Altschul, Warren Gish, Webb Miller, Eugene Myers, and David J. Lipman, in work centered at the National Center for Biotechnology Information (NCBI), part of the National Institutes of Health (NIH). It has since become a widely used bioinformatics tool.
Objectives of BLAST
- It is one of the most popular programs for sequence analysis.
- Enable a researcher to compare a query sequence with a library or database of sequences.
- Identify library sequences that resemble the query sequence above a certain threshold.
- Find high-scoring ungapped segments among related sequences.
- Report alignments of the high-scoring segment pairs (HSPs), showing identical and similar residues.
- Report a complete list of the parameter settings used for the search.
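The objective of finding high-scoring ungapped segments can be illustrated with a small sketch. This is not BLAST's actual heuristic (which starts from short word matches and extends them); it is an exhaustive scan of every diagonal between two sequences, shown only to make the idea of an ungapped segment pair concrete. The function name `best_ungapped_segment` and the +1/-1 scoring are assumptions chosen for illustration.

```python
def best_ungapped_segment(query, subject, match=1, mismatch=-1):
    """Return (score, query_start, subject_start, length) of the
    highest-scoring ungapped segment pair between two sequences.

    Illustrative only: scoring values are assumed, and every diagonal
    is scanned exhaustively, which real BLAST avoids.
    """
    best = (0, 0, 0, 0)
    n, m = len(query), len(subject)
    # A diagonal d pairs query[i] with subject[i + d]; no gaps are possible.
    for d in range(-(n - 1), m):
        i = max(0, -d)
        j = i + d
        run_score = 0
        run_start = i
        while i < n and j < m:
            if run_score == 0:
                run_start = i  # a new candidate segment begins here
            run_score += match if query[i] == subject[j] else mismatch
            if run_score <= 0:
                run_score = 0  # drop segments that stop paying off
            elif run_score > best[0]:
                best = (run_score, run_start, run_start + d, i - run_start + 1)
            i += 1
            j += 1
    return best


# The shared segment "ACGT" starts at index 2 in both sequences.
print(best_ungapped_segment("GGACGTCC", "TTACGTAA"))  # (4, 2, 2, 4)
```

A library sequence would be reported as a hit only if its best segment score exceeds the chosen threshold.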
Why BLAST?
- BLAST is essential for many bioinformatics analyses, as it helps us discover evolutionary relationships, infer protein function, and identify conserved domains.
- It accelerates sequence similarity searching, making it practical for large-scale genomic and proteomic studies.
- Comparing the query sequence to known sequences in databases is fundamental to understanding the relatedness of any query sequence to other known proteins or DNA sequences.
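- BLAST is often cited as roughly 50 times faster than full dynamic programming methods such as Smith-Waterman, because its heuristic avoids computing a complete alignment matrix for every database sequence.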
Applications include
- Identifying shared similarities with sequences already deposited in the databanks (for example, orthologs and paralogs)
- Discovering new genes or proteins (ascertaining existence of a putative ORF)
- Discovering variants of genes or proteins
- Identifying functional motifs shared with other proteins.
- Investigating expressed sequence tags (ESTs)
- Exploring protein structure and function
Why use local alignment for database searches?
Local alignment is a useful approach to database searching because many query sequences have domains, active sites or other motifs that have local but not global regions of similarity to other sequences.
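A minimal sketch of the difference, assuming simple scores (match +2, mismatch -1, gap -2) that are chosen here only for illustration: the first function computes a Smith-Waterman local alignment score, the second a Needleman-Wunsch global alignment score. The two example sequences are made up; they share one short motif ("WKLMNP") embedded in unrelated flanking residues.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman score: best-scoring pair of subsequences."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor of 0 lets an alignment start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def global_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch score: both sequences aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


query = "AAAAAWKLMNPAAAAA"   # hypothetical sequence with a shared motif
subject = "CCCWKLMNPCCC"     # same motif, unrelated flanks

print(local_alignment_score(query, subject))   # 12: six matched motif residues
print(global_alignment_score(query, subject))  # much lower: flanks are penalized
```

The local score reflects the shared motif alone, while the global score is dragged down by the unrelated flanking regions, which is why a local method suits database searches for domains and motifs.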
Uses of BLAST
- Species identification: BLAST aids in accurately identifying unknown species by comparing DNA sequences with a comprehensive database. It provides insights into phylogenetic relationships and potential matches.
- Domain analysis: BLAST detects known domains within protein or nucleotide sequences, revealing conserved regions, functional motifs, and structural features. This enhances understanding of biological function.
- Phylogenetic analysis: The BLAST web interface can generate phylogenetic trees from search results, showing evolutionary relationships. These trees visually represent the query sequence’s evolutionary history and its relatedness to known species.
- Chromosomal mapping: BLAST helps map query sequences whose chromosomal location is unknown onto the genome of a known species. Aligning the query with database hits indicates its genomic position.
- Comparative genomics and annotations: BLAST enables mapping of annotations between different organisms and searching for common genes in related species. This comparative approach uncovers gene functions, regulatory elements, and evolutionary conservation, facilitating cross-species comparative studies.
BLAST Versions
- Over time, various versions of BLAST have been developed, such as BLASTn (nucleotide vs. nucleotide), BLASTp (protein vs. protein), BLASTx (translated nucleotide vs. protein), and tBLASTn (protein vs. translated nucleotide).
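The pairing of query type and database type listed above can be written as a small lookup. This is only a sketch for choosing among the four variants named here; the function name `choose_blast_program` is hypothetical and is not part of any BLAST distribution.

```python
# (query type, database type) -> BLAST variant, as listed in the text above.
BLAST_VARIANTS = {
    ("nucleotide", "nucleotide"): "BLASTn",
    ("protein", "protein"): "BLASTp",
    ("nucleotide", "protein"): "BLASTx",   # query is translated
    ("protein", "nucleotide"): "tBLASTn",  # database is translated
}


def choose_blast_program(query_type, database_type):
    """Return the variant matching the query and database types."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_VARIANTS:
        raise ValueError(f"unsupported combination: {key}")
    return BLAST_VARIANTS[key]


print(choose_blast_program("nucleotide", "protein"))  # BLASTx
```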