Class BlastLikeSearchBuilder

  • All Implemented Interfaces:
    SearchBuilder, SearchContentHandler

    public class BlastLikeSearchBuilder
    extends Object
    implements SearchBuilder

    BlastLikeSearchBuilder creates SeqSimilaritySearchResults from SAX events via a SeqSimilarityAdapter. The SAX events should describe elements conforming to the BioJava BlastLikeDataSetCollection DTD. Suitable sources are BlastLikeSAXParser or FastaSearchSAXParser. The result objects are placed in the List supplied to the constructor.

    The start/end/strand of SeqSimilaritySearchHits are calculated from their constituent SeqSimilaritySearchSubHits as follows:

    • The query start is the lowest query start coordinate of its sub-hits, regardless of strand
    • The query end is the highest query end coordinate of its sub-hits, regardless of strand
    • The hit start is the lowest hit start coordinate of its sub-hits, regardless of strand
    • The hit end is the highest hit end coordinate of its sub-hits, regardless of strand
    • The query strand is null for protein sequences. Otherwise it is equal to the query strand of its sub-hits if they are all on the same strand, or StrandedFeature.UNKNOWN if the sub-hits have mixed query strands
    • The hit strand is null for protein sequences. Otherwise it is equal to the hit strand of its sub-hits if they are all on the same strand, or StrandedFeature.UNKNOWN if the sub-hits have mixed hit strands
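The rules above can be illustrated with a minimal, self-contained sketch. It is not the BioJava implementation: the `HitSummarySketch` class, its `SubHit` type and its `Strand` enum are hypothetical stand-ins (the enum loosely mirrors the strand constants of StrandedFeature), and each `SubHit` models only one side of an alignment. The same helpers would be applied once to the query coordinates and once to the hit coordinates.

```java
import java.util.Arrays;
import java.util.List;

final class HitSummarySketch {
    // Hypothetical stand-in for the strand constants of StrandedFeature.
    enum Strand { POSITIVE, NEGATIVE, UNKNOWN }

    // Hypothetical, simplified sub-hit: coordinates on one sequence (query or hit)
    // plus the strand on that sequence (null for protein).
    static final class SubHit {
        final int start;
        final int end;
        final Strand strand;

        SubHit(int start, int end, Strand strand) {
            this.start = start;
            this.end = end;
            this.strand = strand;
        }
    }

    // Lowest start coordinate of the sub-hits, regardless of strand.
    static int lowestStart(List<SubHit> subHits) {
        requireNonEmpty(subHits);
        int min = Integer.MAX_VALUE;
        for (SubHit s : subHits) {
            min = Math.min(min, s.start);
        }
        return min;
    }

    // Highest end coordinate of the sub-hits, regardless of strand.
    static int highestEnd(List<SubHit> subHits) {
        requireNonEmpty(subHits);
        int max = Integer.MIN_VALUE;
        for (SubHit s : subHits) {
            max = Math.max(max, s.end);
        }
        return max;
    }

    // null for protein; the shared strand if all sub-hits agree; UNKNOWN if mixed.
    static Strand commonStrand(List<SubHit> subHits, boolean protein) {
        requireNonEmpty(subHits);
        if (protein) {
            return null;
        }
        Strand common = subHits.get(0).strand;
        for (SubHit s : subHits) {
            if (s.strand != common) {
                return Strand.UNKNOWN;
            }
        }
        return common;
    }

    private static void requireNonEmpty(List<SubHit> subHits) {
        if (subHits.isEmpty()) {
            throw new IllegalArgumentException("a hit needs at least one sub-hit");
        }
    }

    public static void main(String[] args) {
        List<SubHit> subHits = Arrays.asList(
            new SubHit(120, 180, Strand.POSITIVE),
            new SubHit(10, 60, Strand.NEGATIVE));
        System.out.println(lowestStart(subHits));
        System.out.println(highestEnd(subHits));
        System.out.println(commonStrand(subHits, false));
    }
}
```

With the two sub-hits in `main`, the summary spans 10 to 180 and the strand is UNKNOWN, because the sub-hits lie on different strands.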

    This class assigns special meanings to particular keys; keep them in mind if you adapt the class for another parser. The keys originate from, and are fully described in, the BlastLikeDataSetCollection DTD.

    Key                   Meaning
    program               Either this value or the subjectSequenceType value must be set. Takes values acceptable to AlphabetResolver: BLASTN, BLASTP, BLASTX, TBLASTN, TBLASTX, DNA and PROTEIN.
    databaseId            Identifier of the database searched (in SequenceDBInstallation).
    subjectSequenceType   Type of the sequence that is hit. Can be DNA or PROTEIN.
    subjectId             Id of the sequence that is hit.
    subjectDescription    Description of the sequence that is hit.
    queryStrand           Strandedness of the query in the alignment. Takes values of "plus" and "minus".
    subjectStrand         Strandedness of the subject in the alignment. Takes values of "plus" and "minus".
    queryFrame            self-evident
    subjectFrame          self-evident
    querySequenceStart    self-evident
    querySequenceEnd      self-evident
    subjectSequenceStart  self-evident
    subjectSequenceEnd    self-evident
    score                 self-evident
    expectValue           self-evident
    pValue                self-evident
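For anyone adapting another parser, the constraints in the key table can be checked before the values are handed on. The sketch below is only an illustration under stated assumptions: `SearchKeySketch` and its methods are hypothetical helpers, not part of BioJava, and it assumes the values are matched case-sensitively, exactly as written in the table.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SearchKeySketch {
    // Values listed in the key table for "program".
    static final Set<String> PROGRAM_VALUES = new HashSet<>(Arrays.asList(
        "BLASTN", "BLASTP", "BLASTX", "TBLASTN", "TBLASTX", "DNA", "PROTEIN"));

    // Values listed in the key table for "subjectSequenceType".
    static final Set<String> SEQUENCE_TYPES = new HashSet<>(Arrays.asList(
        "DNA", "PROTEIN"));

    // True if at least one of "program" or "subjectSequenceType" is set
    // to a value listed in the table.
    static boolean hasUsableSequenceType(Map<String, String> props) {
        String program = props.get("program");
        String type = props.get("subjectSequenceType");
        return (program != null && PROGRAM_VALUES.contains(program))
            || (type != null && SEQUENCE_TYPES.contains(type));
    }

    // Maps the "plus"/"minus" values of queryStrand and subjectStrand to +1/-1.
    // The numeric encoding is an assumption made for this sketch only.
    static int strandSign(String value) {
        if ("plus".equals(value)) {
            return 1;
        }
        if ("minus".equals(value)) {
            return -1;
        }
        throw new IllegalArgumentException("Unrecognised strand value: " + value);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("program", "BLASTN");
        props.put("queryStrand", "plus");
        props.put("subjectStrand", "minus");

        System.out.println(hasUsableSequenceType(props));
        System.out.println(strandSign(props.get("queryStrand")));
        System.out.println(strandSign(props.get("subjectStrand")));
    }
}
```

Rejecting an unrecognised strand value with an exception, rather than silently defaulting, is a design choice of this sketch; a real adapter might prefer to treat it as an unknown strand.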
    Since:
    1.2
    Author:
    Keith James, Greg Cox