Class IndexedSequenceDB

  • All Implemented Interfaces:
    Serializable, SequenceDB, SequenceDBLite, Changeable

    public final class IndexedSequenceDB
    extends AbstractSequenceDB
    implements SequenceDB, Serializable

    This class implements SequenceDB on top of a set of sequence files and sequence offsets within these files.

    This class is primarily responsible for managing the sequence IO, such as calculating the sequence file offsets and parsing individual sequences based upon those offsets. The actual persistent storage of all this information is delegated to an instance of IndexStore, such as TabIndexStore.

     // create a new index store and populate it
     // this may take some time
     TabIndexStore indexStore = new TabIndexStore(
       storeFile, indexFile, dbName,
       format, sbFactory, symbolParser );
     IndexedSequenceDB seqDB = new IndexedSequenceDB(indexStore);
    
     for(int i = 0; i < files.length; i++) {
       seqDB.addFile(files[i]);
     }
    
     // separately (e.g. in a later run): load an existing index store
     // and fetch a sequence; this should be quite quick
     TabIndexStore indexStore = TabIndexStore.open(storeFile);
     SequenceDB seqDB = new IndexedSequenceDB(indexStore);
     Sequence seq = seqDB.getSequence(id);
     

    Note: We may be able to improve the indexing speed further by discarding all feature creation & annotation requests during index parsing.
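
    To illustrate the general idea of resolving sequences through stored file offsets, here is a minimal, self-contained sketch. It is not the BioJava implementation: the class OffsetIndexSketch and its methods buildIndex, fetch and demo are hypothetical, it assumes a simple FASTA-like layout, and it keeps the offsets in an in-memory map rather than in an IndexStore.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of BioJava. */
final class OffsetIndexSketch {

    /** Scans the file once and records the byte offset of every '>' header line, keyed by ID. */
    static Map<String, Long> buildIndex(Path fasta) {
        Map<String, Long> offsets = new LinkedHashMap<>();
        try (RandomAccessFile raf = new RandomAccessFile(fasta.toFile(), "r")) {
            long pos = raf.getFilePointer();
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith(">")) {
                    // Assumption: the ID is the first whitespace-delimited token of the header.
                    String id = line.substring(1).trim().split("\\s+")[0];
                    offsets.put(id, pos);
                }
                pos = raf.getFilePointer();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return offsets;
    }

    /** Seeks directly to the stored offset and parses only that one record. */
    static String fetch(Path fasta, Map<String, Long> offsets, String id) {
        Long offset = offsets.get(id);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown ID: " + id);
        }
        try (RandomAccessFile raf = new RandomAccessFile(fasta.toFile(), "r")) {
            raf.seek(offset);
            raf.readLine(); // skip the header line
            StringBuilder residues = new StringBuilder();
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                residues.append(line.trim());
            }
            return residues.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a tiny two-record file, indexes it, and fetches the second record. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("sketch", ".fa");
            try {
                Files.write(tmp, ">seq1 first\nACGT\nAC\n>seq2 second\nGGCC\nTT\n"
                        .getBytes(StandardCharsets.US_ASCII));
                return fetch(tmp, buildIndex(tmp), "seq2");
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    The expensive step (buildIndex) runs once per file, while each later fetch costs one seek plus the parse of a single record, which is why loading an existing index and fetching a sequence can be quick.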

    Author:
    Matthew Pocock, Thomas Down, Keith James
    See Also:
    TabIndexStore, Serialized Form
    • Constructor Detail

      • IndexedSequenceDB

        public IndexedSequenceDB​(IDMaker idMaker,
                                 IndexStore indexStore)
        Create an IndexedSequenceDB by specifying both the IDMaker and IndexStore used.

        The IDMaker will be used to calculate the ID for each Sequence. The database will delegate the storage and retrieval of the sequence offsets to the IndexStore.

        Parameters:
        idMaker - the IDMaker used to calculate Sequence IDs
        indexStore - the IndexStore delegate
      • IndexedSequenceDB

        public IndexedSequenceDB​(IndexStore indexStore)
        Create an IndexedSequenceDB by specifying the IndexStore used.

        IDMaker.byName will be used to calculate the ID for each Sequence. The database will delegate the storage and retrieval of the sequence offsets to the IndexStore.

        Parameters:
        indexStore - the IndexStore delegate
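
        The relationship between the two constructors can be modelled as a pluggable ID strategy with a name-based default. The sketch below is a stand-alone model under stated assumptions, not BioJava code: Seq, Db, BY_NAME and BY_URN are hypothetical stand-ins (a java.util.function.Function plays the role of the IDMaker, and a HashMap plays the role of the IndexStore).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration only; not part of BioJava. */
final class IdMakerSketch {

    /** Minimal stand-in for a sequence: just a name and a URN. */
    static final class Seq {
        final String name;
        final String urn;

        Seq(String name, String urn) {
            this.name = name;
            this.urn = urn;
        }
    }

    /** Stand-in ID strategies; BY_NAME mirrors the role the text gives to IDMaker.byName. */
    static final Function<Seq, String> BY_NAME = s -> s.name;
    static final Function<Seq, String> BY_URN = s -> s.urn;

    /** Stand-in database: computes IDs with the strategy, stores offsets in a map. */
    static final class Db {
        private final Function<Seq, String> idMaker;
        private final Map<String, Long> store = new HashMap<>();

        Db(Function<Seq, String> idMaker) {
            this.idMaker = idMaker;
        }

        /** One-argument form falls back to the name-based strategy. */
        Db() {
            this(BY_NAME);
        }

        /** Computes the ID, records the offset under it, and returns the ID. */
        String add(Seq seq, long offset) {
            String id = idMaker.apply(seq);
            store.put(id, offset);
            return id;
        }
    }

    /** Indexes the same sequence under both strategies and reports the two IDs. */
    static String demo() {
        Seq seq = new Seq("chr1", "urn:example:chr1");
        Db byName = new Db();       // default strategy
        Db byUrn = new Db(BY_URN);  // explicit strategy
        return byName.add(seq, 0L) + "|" + byUrn.add(seq, 0L);
    }
}
```

        Keeping ID calculation separate from offset storage means either part can be swapped without touching the other.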
    • Method Detail

      • addFile

        public void addFile​(File seqFile)
                     throws IllegalIDException,
                            BioException,
                            ChangeVetoException
        Add sequences from a file to the sequence database. Multiple files may be indexed into a single database. This method works on an "all or nothing" principle: if it can successfully interpret the entire file, all the sequences will be read in; if it encounters any problems, it will abandon the whole file and throw one of the exceptions declared below. A BioException will be thrown if it has problems understanding the sequences.
        Parameters:
        seqFile - the file containing the sequence or set of sequences
        Throws:
        BioException - if for any reason the sequences can't be read correctly
        ChangeVetoException - if there is a listener that vetoes adding the files
        IllegalIDException
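
        One common way to obtain "all or nothing" behaviour is to stage the entries for a file and commit them only once the whole file has been accepted. The sketch below shows that pattern in isolation. It is an assumption about the general technique, not a description of how addFile is implemented: AllOrNothingSketch and addRecords are hypothetical, and an unchecked IllegalArgumentException stands in for the checked exceptions listed above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the BioJava implementation. */
final class AllOrNothingSketch {
    private final Map<String, Long> committed = new LinkedHashMap<>();

    /**
     * Stages one batch of header lines (standing in for one file) and
     * commits the staged entries only if every record is acceptable.
     */
    void addRecords(List<String> headers) {
        Map<String, Long> staged = new LinkedHashMap<>();
        long offset = 0;
        for (String header : headers) {
            if (!header.startsWith(">") || header.length() == 1) {
                throw new IllegalArgumentException("Malformed header: " + header);
            }
            String id = header.substring(1);
            if (committed.containsKey(id) || staged.containsKey(id)) {
                throw new IllegalArgumentException("Duplicate ID: " + id);
            }
            staged.put(id, offset);
            offset += header.length() + 1; // header plus line terminator
        }
        committed.putAll(staged); // reached only if the whole batch was valid
    }

    int size() {
        return committed.size();
    }

    /** Returns the database size before and after a batch that fails part-way through. */
    static int[] demo() {
        AllOrNothingSketch db = new AllOrNothingSketch();
        db.addRecords(Arrays.asList(">a", ">b"));
        int before = db.size();
        try {
            db.addRecords(Arrays.asList(">c", "broken", ">d"));
        } catch (IllegalArgumentException expected) {
            // the whole second batch is discarded, including ">c"
        }
        return new int[] { before, db.size() };
    }
}
```

        In the demo the second batch fails on its second record, so the entry for ">c" never becomes visible and the database size is unchanged.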
      • getName

        public String getName()
        Get the name of this sequence database. The name is retrieved from the IndexStore delegate.
        Specified by:
        getName in interface SequenceDBLite
        Returns:
        the name of the sequence database, which may be null.
      • ids

        public Set ids()
        Description copied from interface: SequenceDB
        Get an immutable set of all of the IDs in the database. The ids are legal arguments to getSequence.
        Specified by:
        ids in interface SequenceDB
        Returns:
        a Set of ids - at the moment, strings
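
        A typical use of such an ID set is to iterate over it and pass each ID to getSequence, while treating the set itself as read-only. The stand-alone sketch below models that contract with plain collections. IdsSketch is hypothetical, and the use of Collections.unmodifiableSet is an assumption made for illustration; the text above does not say how the immutability is achieved.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not part of BioJava. */
final class IdsSketch {
    private final Map<String, String> sequences = new LinkedHashMap<>();

    void put(String id, String residues) {
        sequences.put(id, residues);
    }

    /** Read-only view of the IDs; every element is a legal argument to getSequence. */
    Set<String> ids() {
        return Collections.unmodifiableSet(sequences.keySet());
    }

    String getSequence(String id) {
        String residues = sequences.get(id);
        if (residues == null) {
            throw new IllegalArgumentException("Unknown ID: " + id);
        }
        return residues;
    }

    /** Iterates over the IDs, fetches each sequence, then tries to modify the ID set. */
    static String demo() {
        IdsSketch db = new IdsSketch();
        db.put("seq1", "ACGT");
        db.put("seq2", "GGCC");
        StringBuilder out = new StringBuilder();
        for (String id : db.ids()) {
            out.append(id).append('=').append(db.getSequence(id)).append(';');
        }
        try {
            db.ids().add("seq3");
            out.append("mutable");
        } catch (UnsupportedOperationException e) {
            out.append("read-only");
        }
        return out.toString();
    }
}
```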