clasp - A fast local fragment chainer using sum-of-pair gap costs
Introduction
clasp is a fast and flexible fragment chainer that supports linear and sum-of-pair gap costs and uses highly time-efficient index structures, i.e., Johnson priority queues and range trees padded with Johnson priority queues. Chaining short match fragments helps to quickly and accurately identify regions of synteny that may not be found using common local alignment heuristics alone. Further details on the algorithm and the gap cost models are provided in Abouelhoda and Ohlebusch (2003).
clasp reads tab-separated fragment files that give, for each fragment, its start and end positions on the query and database sequences as well as a score. It executes a local chaining algorithm using either the linear (parameter -L, --lin) or the sum-of-pair gap cost model (default). It produces tab-separated output of chain data, optionally including the fragments of each chain. Note that the algorithm is optimized for short queries and large database sequences using a novel clustering approach.
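To make the idea of local chaining more concrete, here is a minimal Python sketch of a naive quadratic-time chainer with a linear gap cost. It is not clasp's implementation: clasp relies on the index structures described above, and the names Fragment, linear_gap, and best_local_chain as well as the exact gap cost definition (the number of uncovered positions between two fragments, summed over both sequences) are assumptions made purely for illustration.

```python
from typing import List, NamedTuple, Tuple


class Fragment(NamedTuple):
    # Hypothetical fragment record: positions on query and database plus a score.
    q_start: int
    q_end: int
    db_start: int
    db_end: int
    score: float


def linear_gap(prev: Fragment, cur: Fragment) -> float:
    # Assumed linear gap cost: uncovered positions between the two
    # fragments on the query plus those on the database sequence.
    return (cur.q_start - prev.q_end - 1) + (cur.db_start - prev.db_end - 1)


def best_local_chain(fragments: List[Fragment]) -> Tuple[float, List[Fragment]]:
    # Naive O(n^2) dynamic program; a sketch, not clasp's algorithm.
    if not fragments:
        return 0.0, []
    # Any valid predecessor starts earlier on the query, so sorting by
    # query start guarantees predecessors are processed first.
    frags = sorted(fragments, key=lambda f: (f.q_start, f.db_start))
    best = [f.score for f in frags]  # a chain may start at any fragment
    back = [-1] * len(frags)
    for i, cur in enumerate(frags):
        for j in range(i):
            prev = frags[j]
            # prev may precede cur only if it ends before cur starts
            # on both the query and the database sequence.
            if prev.q_end < cur.q_start and prev.db_end < cur.db_start:
                cand = best[j] - linear_gap(prev, cur) + cur.score
                if cand > best[i]:
                    best[i] = cand
                    back[i] = j
    # Trace back from the best-scoring chain end.
    k = max(range(len(frags)), key=best.__getitem__)
    top = best[k]
    chain = []
    while k != -1:
        chain.append(frags[k])
        k = back[k]
    chain.reverse()
    return top, chain


# Made-up fragments: the first two are collinear, the third is far away on the database.
frags = [
    Fragment(1, 10, 100, 109, 10),
    Fragment(13, 20, 112, 119, 8),
    Fragment(5, 12, 500, 507, 8),
]
score, chain = best_local_chain(frags)
print(score, chain)
```

In this toy input the first two fragments are joined at a gap cost of 4, while the third fragment cannot be chained with either of them and remains a lower-scoring chain of its own.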
Download
For detailed instructions type
./clasp.x --help
or see the man page.
Installation
Download the latest release and extract the archive using
tar -xvzf clasp_v*.tar.gz
Then change to the new directory and type
make
Run clasp by typing
./clasp.x
Example (with BLAST)
Run BLAST
Download chr5 of Mus musculus from UCSC and uncompress:
gunzip chr5.fa.gz
Create a BLAST database with
formatdb -i chr5.fa -p F
Run BLAST with Human H/ACA snoRNA ACA42 (from snoRNABase) and its reverse complement (options -m 8 and -S 1 are required):
blastall -p blastn -d chr5.fa -i ACA42.fa -m 8 -S 1 -W 8 -e 1e5 -o ACA42.blast
blastall -p blastn -d chr5.fa -i ACA42_revcomp.fa -m 8 -S 1 -W 8 -e 1e5 -o ACA42_revcomp.blast
Chain BLAST output
Run clasp on the BLAST output (with the sum-of-pair gap cost model, epsilon=0, lambda=0.5, and a minimum required chain score of 35):
./clasp.x -i ACA42.blast -c 7 8 9 10 4 -C 1 2 -e 0 -l 0.5 -S 35 -o ACA42.chn
./clasp.x -i ACA42_revcomp.blast -c 7 8 9 10 4 -C 1 2 -e 0 -l 0.5 -S 35 -o ACA42_revcomp.chn
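The column numbers passed to -c and -C refer to the BLAST tabular format produced by -m 8. The following Python sketch shows which BLAST fields those numbers select. It assumes that -c lists the columns in the order query start, query end, database start, database end, score, and that -C lists the query and database identifier columns; this reading is inferred from the input description above, so check ./clasp.x --help for the authoritative definition. The field names, the helper functions, and the sample line are made up for illustration.

```python
# Field order of BLAST tabular output (-m 8), 1-based in the command line.
BLAST_M8_COLUMNS = [
    "query_id", "subject_id", "percent_identity", "alignment_length",
    "mismatches", "gap_openings", "q_start", "q_end",
    "s_start", "s_end", "evalue", "bit_score",
]


def describe(columns):
    # Map 1-based column numbers to BLAST -m 8 field names.
    return [BLAST_M8_COLUMNS[c - 1] for c in columns]


def select_columns(line, columns):
    # Pick the given 1-based columns from one tab-separated line.
    fields = line.rstrip("\n").split("\t")
    return [fields[c - 1] for c in columns]


# Hypothetical BLAST hit, not taken from a real run.
line = "ACA42\tchr5\t92.31\t26\t2\t0\t3\t28\t1000\t1025\t0.002\t36.2"

print(describe([7, 8, 9, 10, 4]))
print(select_columns(line, [7, 8, 9, 10, 4]))
print(describe([1, 2]))
print(select_columns(line, [1, 2]))
```

Under these assumptions, the commands above use the BLAST alignment length (column 4) as the fragment score.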
Contact
If you have any further questions, complaints, or bug reports, please send an email to christian (at) bioinf (dot) uni-leipzig (dot) de.