Package org.biojava.bio.program.ssaha

SSAHA sequence searching API.

Overview

SSAHA stands for Sequence Search and Alignment by Hashing Algorithm. The idea is to take a sequence database, such as EMBL, and walk over every sequence with a fixed window size and step size. Each of these same-sized fragments is represented as a bit-string, and the bit-string is used as an index into a hash table. The hash table stores the location of every window (the sequence and the position within it). Search sequences are encoded as bit-patterns in the same manner, and each pattern is used as an index into the table to fetch all hits. Finally, these hits are sorted and potentially merged to produce high-scoring segment pairs (HSPs).
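
The indexing and lookup steps described above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the API of this package: the class SsahaSketch and its methods encode, add and search are hypothetical names. The sketch assumes a DNA alphabet packed at 2 bits per symbol, skips windows that contain any other character, and uses an in-memory HashMap in place of whatever storage a real implementation would use. Hits are sorted by sequence and diagonal; merging neighbouring hits into HSPs is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of org.biojava.bio.program.ssaha.
class SsahaSketch {
    private final int wordSize; // window length in symbols
    private final int stepSize; // distance between successive database windows
    // Packed word -> list of {sequence id, window start offset}.
    private final Map<Integer, List<int[]>> table = new HashMap<>();

    SsahaSketch(int wordSize, int stepSize) {
        // 15 symbols * 2 bits = 30 bits, which fits in a non-negative int.
        if (wordSize < 1 || wordSize > 15 || stepSize < 1) {
            throw new IllegalArgumentException("wordSize must be 1..15, stepSize >= 1");
        }
        this.wordSize = wordSize;
        this.stepSize = stepSize;
    }

    /** Packs len symbols starting at start into 2 bits each; returns -1 on a non-ACGT symbol. */
    static int encode(String seq, int start, int len) {
        int bits = 0;
        for (int i = start; i < start + len; i++) {
            int code;
            switch (seq.charAt(i)) {
                case 'A': code = 0; break;
                case 'C': code = 1; break;
                case 'G': code = 2; break;
                case 'T': code = 3; break;
                default: return -1;
            }
            bits = (bits << 2) | code;
        }
        return bits;
    }

    /** Indexes one database sequence: every stepSize-th window is stored under its packed word. */
    void add(int seqId, String seq) {
        for (int i = 0; i + wordSize <= seq.length(); i += stepSize) {
            int key = encode(seq, i, wordSize);
            if (key < 0) {
                continue; // window contains an unsupported symbol
            }
            table.computeIfAbsent(key, k -> new ArrayList<>()).add(new int[] {seqId, i});
        }
    }

    /**
     * Looks up every window of the query (step 1) and returns hits as
     * {sequence id, diagonal, query offset}, where diagonal = database offset - query offset.
     * Hits are sorted by sequence id, then diagonal, then query offset.
     */
    List<int[]> search(String query) {
        List<int[]> hits = new ArrayList<>();
        for (int q = 0; q + wordSize <= query.length(); q++) {
            int key = encode(query, q, wordSize);
            if (key < 0) {
                continue;
            }
            for (int[] loc : table.getOrDefault(key, Collections.<int[]>emptyList())) {
                hits.add(new int[] {loc[0], loc[1] - q, q});
            }
        }
        hits.sort(Comparator.comparingInt((int[] h) -> h[0])
                .thenComparingInt(h -> h[1])
                .thenComparingInt(h -> h[2]));
        return hits;
    }

    public static void main(String[] args) {
        SsahaSketch index = new SsahaSketch(4, 1);
        index.add(0, "ACGTACGT");
        for (int[] hit : index.search("GTAC")) {
            System.out.println("seq " + hit[0] + " diagonal " + hit[1] + " query offset " + hit[2]);
        }
    }
}
```

Sorting by diagonal places hits that belong to the same ungapped alignment next to each other, so a merging pass could join runs of hits that share a sequence id and diagonal into a single candidate HSP.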