BLAST (biotechnology)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 168... (talk | contribs) at 07:20, 5 January 2004. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

This article is about the computer program. For the Vorticist journal, see BLAST (journal).


BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing biological sequences, such as the amino-acid sequences of different proteins or the DNA sequences of different genes. Given a library or database of sequences, a BLAST search enables a researcher to look for sequences that either duplicate or resemble a sequence of interest. For example, following the discovery of a previously unknown gene in a non-human animal, a scientist will typically perform a BLAST search of the human genome to see whether humans carry a similar gene, with similarity judged from the sequence. The BLAST algorithm and a computer program that implements it were developed by Stephen Altschul at the U.S. National Center for Biotechnology Information. It is available on the web at [1].

Other questions that researchers use BLAST to answer include:

  • Which bacterial species have a protein that is related in lineage to a certain protein whose amino-acid sequence I know?
  • Where does the DNA that I've just sequenced come from?
  • What other genes encode proteins that exhibit structures or motifs such as the one I've just determined?

Algorithm

BLAST is designed to take a query sequence and compare it pairwise with every sequence in a large (multi-gigabyte) library, finding the most similar sequences. Because it compares the query with so many other sequences, the BLAST algorithm must be extremely fast. The algorithm works by searching for short regions that are exactly the same in the two sequences and then attempting to extend the alignment on either side for as long as the alignment score stays within a certain threshold of the best score reached so far.
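The seed-and-extend idea described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the NCBI implementation: the function names, the word length `k`, the match/mismatch scores and the `drop_off` threshold are all hypothetical choices made for illustration, extension is ungapped, and only one library sequence is compared at a time.

```python
def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where a k-letter word matches exactly."""
    index = {}
    for i in range(len(query) - k + 1):
        index.setdefault(query[i:i + k], []).append(i)
    seeds = []
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], []):
            seeds.append((i, j))
    return seeds


def extend_one_direction(query, subject, i, j, step, match, mismatch, drop_off):
    """Extend without gaps from (i, j); return (best score gain, length at that best)."""
    score = best = 0
    length = best_len = 0
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > drop_off:
            break  # score has fallen too far below the best seen: stop extending
        i += step
        j += step
    return best, best_len


def seed_and_extend(query, subject, k=3, match=1, mismatch=-1, drop_off=2):
    """Return (score, query_start, subject_start, length) of the best hit, or None."""
    best_hit = None
    for i, j in find_seeds(query, subject, k):
        right_gain, right_len = extend_one_direction(
            query, subject, i + k, j + k, 1, match, mismatch, drop_off)
        left_gain, left_len = extend_one_direction(
            query, subject, i - 1, j - 1, -1, match, mismatch, drop_off)
        score = k * match + left_gain + right_gain
        hit = (score, i - left_len, j - left_len, left_len + k + right_len)
        if best_hit is None or hit[0] > best_hit[0]:
            best_hit = hit
    return best_hit
```

For instance, `seed_and_extend("GATTACA", "TTGATTACATT")` returns `(7, 0, 2, 7)`: a seven-letter exact match that starts at position 0 of the query and position 2 of the library sequence. Indexing the query's words first is what keeps the search fast, since each library sequence is then scanned only once.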

Program

The BLAST web server, hosted by the NCBI, allows anyone with a web browser to perform similarity searches against constantly updated databases of protein and DNA sequences, which include sequences from most newly sequenced organisms. The server offers many programs, of which the most important are the following:

Nucleotide-nucleotide BLAST (blastn)

This program, given a DNA query, returns the most similar DNA sequences from the DNA database that the user specifies.

Protein-protein BLAST (blastp)

This program, given a protein query, returns the most similar protein sequences from the protein database that the user specifies.


PSI-BLAST

One of the more recent BLAST programs, PSI-BLAST is used for finding distant relatives of a protein. First, a list of all closely related proteins is created. These proteins are then combined into a "profile" that is a sort of average sequence. A query against the protein database is run using this profile, and a larger group of proteins is found. This larger group is used to construct another profile, and the process is repeated.

By including related proteins in the search, PSI-BLAST is much more sensitive in picking up distant evolutionary relationships than the standard protein-protein BLAST.
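The iterative profile search described above can be sketched in a few lines of Python. This is a simplified illustration, not PSI-BLAST itself: it assumes all sequences have the same length and are compared without gaps, the "profile" is a plain table of per-position letter frequencies, and the function names and the `min_score` cutoff are hypothetical.

```python
from collections import Counter


def build_profile(sequences):
    """Per-position letter frequencies for equal-length, ungapped sequences."""
    profile = []
    for column in zip(*sequences):
        counts = Counter(column)
        profile.append({letter: n / len(column) for letter, n in counts.items()})
    return profile


def profile_score(profile, sequence):
    """Sum, over positions, the profile frequency of the letter the sequence has there."""
    return sum(column.get(letter, 0.0) for column, letter in zip(profile, sequence))


def iterative_profile_search(query, database, min_score, max_rounds=5):
    """Repeat: build a profile from the query plus current hits, then search again."""
    hits = []
    for _ in range(max_rounds):
        profile = build_profile([query] + hits)
        new_hits = [s for s in database if profile_score(profile, s) >= min_score]
        if new_hits == hits:
            break  # no change since the previous round: the search has converged
        hits = new_hits
    return hits
```

With the query `"AAAA"`, the database `["AAAT", "AATT", "ATTT", "CCCC"]` and `min_score=2.4`, a single round (`max_rounds=1`) finds only `"AAAT"`. Once that hit is folded into the profile, the second round also picks up `"AATT"`, which is the sense in which the profile makes the search more sensitive to more distant relatives.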

  • The main BLAST page is here.
  • If you are new to BLAST, try the tutorial.