BLAST

The BLAST programs are widely used tools for searching DNA and protein databases for sequence similarity to identify homologs to a query sequence. Although often referred to simply as "BLAST", it is really a set of programs: blastp, blastn, blastx, tblastn, and tblastx.

Availability & Restrictions

Versions

The following versions of BLAST are available on OSC systems: 

Version    Oakley    Owens
2.2.24+    X
2.2.25+    X
2.2.26     X
2.2.31+    X
2.4.0+               X*
2.6.0+     X*

 

* Current Default Version

You can use module spider blast to view available modules for a given machine. Feel free to contact OSC Help if you need other versions for your work.

If you need to use blastx, load one of the C++ implementation modules of BLAST (any version with a "+").

Access

BLAST is available without restriction to all OSC users.

Publisher/Vendor/Repository and License Type

National Institutes of Health, Open source

Usage

Set-up

To load BLAST, type the following into the command line:

module load blast

Then create a resource file .ncbirc, and put it under your home directory.
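The page does not show what the resource file should contain. As a minimal sketch, assuming one of the BLAST+ modules is in use: .ncbirc is an INI-style file in which a [BLAST] section can set BLASTDB to the directory that holds the databases. The helper name write_ncbirc and the database path passed to it are hypothetical; substitute the directory that applies to your account.

```shell
#!/bin/bash
# Hypothetical helper: write a minimal .ncbirc into the given directory.
# NOTE: this overwrites any existing .ncbirc in that directory.
write_ncbirc() {
    local target_dir="$1"   # where .ncbirc should live (normally "$HOME")
    local db_dir="$2"       # assumed database directory, not an OSC-confirmed path
    printf '[BLAST]\nBLASTDB=%s\n' "$db_dir" > "$target_dir/.ncbirc"
}
```

For example, write_ncbirc "$HOME" /path/to/blast/db creates ~/.ncbirc with a [BLAST] section pointing at that placeholder directory.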

Using BLAST

The five flavors of BLAST mentioned above perform the following tasks:

blastp: compares an amino acid query sequence against a protein sequence database
blastn: compares a nucleotide query sequence against a nucleotide sequence database
blastx: compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database
tblastn: compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands).
tblastx: compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. (Due to the nature of tblastx, gapped alignments are not available with this option)
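The list above amounts to a lookup on the type of the query and the type of the database. The sketch below restates it as a small shell function; pick_blast_program and its argument values ("nucleotide", "protein", "translated") are hypothetical names introduced for illustration and are not part of BLAST.

```shell
#!/bin/bash
# Hypothetical helper: print the BLAST program for a query/database pair.
# Arguments 1 and 2 are "nucleotide" or "protein". An optional third
# argument "translated" selects tblastx for nucleotide-vs-nucleotide.
pick_blast_program() {
    local query="$1" db="$2" mode="${3:-}"
    case "$query:$db" in
        protein:protein)    echo blastp ;;
        nucleotide:protein) echo blastx ;;
        protein:nucleotide) echo tblastn ;;
        nucleotide:nucleotide)
            if [ "$mode" = "translated" ]; then
                echo tblastx
            else
                echo blastn
            fi
            ;;
        *)
            echo "unknown sequence types: $query, $db" >&2
            return 1
            ;;
    esac
}
```

Calling pick_blast_program nucleotide protein prints blastx, matching the description of blastx as a translated nucleotide query searched against a protein database.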

NCBI BLAST Database

Information on the NCBI BLAST database is available at https://www.osc.edu/resources/available_software/scientific_database_list/blast_database

We provide local access to the nt and refseq_protein databases. You can access them by loading the desired blast-database module. If you need other databases, please send a request by email to OSC Help.

Batch Usage

A sample batch script is below:

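#PBS -l nodes=1:ppn=1
#PBS -l walltime=10:00
#PBS -N Blast
#PBS -S /bin/bash
#PBS -j oe
# Load BLAST and the local NCBI databases
module load blast
module load blast-database
# Echo each command to the job log
set -x
# Create a per-job results directory under the submission directory
cd "$PBS_O_WORKDIR"
mkdir -p "$PBS_JOBID"
# Stage the query in the job's temporary directory and run from there
cp 100.fasta "$TMPDIR"
cd "$TMPDIR"
# Search the nt database with blastn, timing the run
/usr/bin/time blastn -db nt -query 100.fasta -out test.out
# Copy the results back to the per-job directory
cp test.out "$PBS_O_WORKDIR/$PBS_JOBID"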
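A job like the one above only fails on a missing or empty query file after it has started. As an optional sketch, not part of the original script, a preflight check can count the FASTA records (lines beginning with ">") before blastn is called. The function name count_fasta_records is hypothetical.

```shell
#!/bin/bash
# Hypothetical preflight helper: print the number of FASTA records
# (lines starting with ">") in a file. Returns non-zero if the file
# cannot be read or contains no records.
count_fasta_records() {
    local fasta="$1"
    if [ ! -r "$fasta" ]; then
        echo "cannot read query file: $fasta" >&2
        return 1
    fi
    # grep -c still prints 0 when nothing matches but exits non-zero,
    # so its exit status is ignored and the printed count is used.
    local n
    n=$(grep -c '^>' "$fasta" || true)
    echo "$n"
    [ "$n" -gt 0 ]
}
```

In a batch script this could be used as count_fasta_records 100.fasta || exit 1 on the line before the blastn command, so the job stops early when the query is unusable.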

 
