Class IndexedSequenceDB
- java.lang.Object
  - org.biojava.utils.AbstractChangeable
    - org.biojava.bio.seq.db.AbstractSequenceDB
      - org.biojava.bio.seq.db.IndexedSequenceDB
- All Implemented Interfaces: Serializable, SequenceDB, SequenceDBLite, Changeable
public final class IndexedSequenceDB extends AbstractSequenceDB implements SequenceDB, Serializable
This class implements SequenceDB on top of a set of sequence files and sequence offsets within these files.
This class is primarily responsible for managing the sequence IO, such as calculating the sequence file offsets and parsing individual sequences based upon those offsets. The actual persistent storage of all this information is delegated to an instance of IndexStore, such as TabIndexStore.

    // create a new index store and populate it
    // this may take some time
    TabIndexStore indexStore = new TabIndexStore(
        storeFile, indexFile, dbName,
        format, sbFactory, symbolParser
    );
    IndexedSequenceDB seqDB = new IndexedSequenceDB(indexStore);

    for (int i = 0; i < files.length; i++) {
        seqDB.addFile(files[i]);
    }

    // load an existing index store and fetch a sequence
    // this should be quite quick
    TabIndexStore indexStore = TabIndexStore.open(storeFile);
    SequenceDB seqDB = new IndexedSequenceDB(indexStore);
    Sequence seq = seqDB.getSequence(id);
Note: We may be able to improve the indexing speed further by discarding all feature creation & annotation requests during index parsing.
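The offset-based approach described above can be illustrated with a small sketch that uses only the standard library. It is not the BioJava implementation: the class `OffsetIndexSketch`, its methods `buildIndex` and `fetch`, and the assumption of a simple FASTA-style file with ASCII content are all hypothetical and chosen for illustration. The sketch scans a file once to record where each record starts, then seeks directly to a record on request.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for an offset index; not the BioJava implementation. */
final class OffsetIndexSketch {

    /** Scans the file once and records the byte offset of each '>' header line. */
    static Map<String, Long> buildIndex(Path fasta) throws IOException {
        Map<String, Long> offsets = new LinkedHashMap<>();
        try (RandomAccessFile raf = new RandomAccessFile(fasta.toFile(), "r")) {
            long lineStart = raf.getFilePointer();
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith(">")) {
                    // Assumption: the ID is the first whitespace-delimited token after '>'.
                    String id = line.substring(1).trim().split("\\s+")[0];
                    offsets.put(id, lineStart);
                }
                lineStart = raf.getFilePointer();
            }
        }
        return offsets;
    }

    /** Seeks straight to one record and reads residues until the next header or EOF. */
    static String fetch(Path fasta, long offset) throws IOException {
        StringBuilder residues = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(fasta.toFile(), "r")) {
            raf.seek(offset);
            raf.readLine(); // skip the header line itself
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                residues.append(line.trim());
            }
        }
        return residues.toString();
    }

    public static void main(String[] args) throws IOException {
        Path fasta = Files.createTempFile("sketch", ".fa");
        Files.write(fasta, ">seq1 first\nACGT\nTTGA\n>seq2\nGGCC\n".getBytes(StandardCharsets.US_ASCII));
        Map<String, Long> index = buildIndex(fasta);
        System.out.println(fetch(fasta, index.get("seq2"))); // prints GGCC
        Files.delete(fasta);
    }
}
```

Building the index requires a full pass over the file, while a later fetch only reads the bytes of the requested record. This mirrors the slow-to-build, quick-to-load contrast in the usage example above.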
- Author: Matthew Pocock, Thomas Down, Keith James
- See Also: TabIndexStore, Serialized Form
Field Summary
-
Fields inherited from interface org.biojava.bio.seq.db.SequenceDBLite
SEQUENCES
-
-
Constructor Summary
Constructors
- IndexedSequenceDB(IDMaker idMaker, IndexStore indexStore): Create an IndexedSequenceDB by specifying both the IDMaker and IndexStore used.
- IndexedSequenceDB(IndexStore indexStore): Create an IndexedSequenceDB by specifying the IndexStore used.
-
Method Summary
Modifier and Type, Method: Description
- void addFile(File seqFile): Add sequences from a file to the sequence database.
- IndexStore getIndexStore(): Retrieve the IndexStore.
- String getName(): Get the name of this sequence database.
- Sequence getSequence(String id): Retrieve a single sequence by its id.
- Set ids(): Get an immutable set of all of the IDs in the database.
- SequenceIterator sequenceIterator(): Returns a SequenceIterator over all sequences in the database.
-
Methods inherited from class org.biojava.bio.seq.db.AbstractSequenceDB
addSequence, filter, removeSequence
-
Methods inherited from class org.biojava.utils.AbstractChangeable
addChangeListener, addChangeListener, generateChangeSupport, getChangeSupport, hasListeners, hasListeners, isUnchanging, removeChangeListener, removeChangeListener
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
-
Methods inherited from interface org.biojava.bio.seq.db.SequenceDB
filter
-
Methods inherited from interface org.biojava.bio.seq.db.SequenceDBLite
addSequence, removeSequence
-
-
-
-
Constructor Detail
-
IndexedSequenceDB
public IndexedSequenceDB(IDMaker idMaker, IndexStore indexStore)
Create an IndexedSequenceDB by specifying both the IDMaker and IndexStore used. The IDMaker will be used to calculate the ID for each Sequence. The database delegates the storage and retrieval of the sequence offsets to the IndexStore.
- Parameters:
idMaker - the IDMaker used to calculate Sequence IDs
indexStore - the IndexStore delegate
-
IndexedSequenceDB
public IndexedSequenceDB(IndexStore indexStore)
Create an IndexedSequenceDB by specifying the IndexStore used. IDMaker.byName will be used to calculate the ID for each Sequence. The database delegates the storage and retrieval of the sequence offsets to the IndexStore.
- Parameters:
indexStore - the IndexStore delegate
-
-
Method Detail
-
getIndexStore
public IndexStore getIndexStore()
Retrieve the IndexStore.
- Returns: the IndexStore delegate
-
addFile
public void addFile(File seqFile) throws IllegalIDException, BioException, ChangeVetoException
Add sequences from a file to the sequence database. This method works on an "all or nothing" principle: if it can successfully interpret the entire file, all the sequences will be read in; if it encounters any problem, it will abandon the whole file and throw an exception. Multiple files may be indexed into a single database. A BioException will be thrown if it has problems understanding the sequences.
- Parameters:
seqFile - the file containing the sequence or set of sequences
- Throws:
BioException - if for any reason the sequences can't be read correctly
ChangeVetoException - if there is a listener that vetoes adding the files
IllegalIDException
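One common way to obtain "all or nothing" behaviour is to stage a file's entries separately and merge them into the live index only once the whole file has been accepted. The sketch below illustrates that idea under stated assumptions; it is not taken from the BioJava source, and the class `AllOrNothingSketch`, its `addBatch` method and the use of `IllegalArgumentException` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of all-or-nothing indexing; not the BioJava source. */
final class AllOrNothingSketch {
    private final Map<String, Long> committed = new LinkedHashMap<>();

    /**
     * Stages every (id, offset) entry parsed from one file and commits only if all are valid.
     * A blank or already-known ID aborts the whole batch and leaves the index unchanged.
     */
    void addBatch(Map<String, Long> parsedOffsets) {
        Map<String, Long> staged = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : parsedOffsets.entrySet()) {
            String id = entry.getKey();
            if (id == null || id.isEmpty() || committed.containsKey(id)) {
                // Nothing has touched 'committed' yet, so throwing here discards the whole file.
                throw new IllegalArgumentException("Rejected batch, bad or duplicate id: " + id);
            }
            staged.put(id, entry.getValue());
        }
        committed.putAll(staged); // reached only when every entry was accepted
    }

    int size() {
        return committed.size();
    }

    boolean contains(String id) {
        return committed.containsKey(id);
    }
}
```

Because the live map is modified in a single final step, a failure part-way through a file cannot leave a partially indexed file behind.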
-
getName
public String getName()
Get the name of this sequence database. The name is retrieved from the IndexStore delegate.
- Specified by: getName in interface SequenceDBLite
- Returns: the name of the sequence database, which may be null.
-
getSequence
public Sequence getSequence(String id) throws IllegalIDException, BioException
Description copied from interface: SequenceDBLite
Retrieve a single sequence by its id.
- Specified by: getSequence in interface SequenceDBLite
- Parameters:
id - the id to retrieve by
- Returns: the Sequence with that id
- Throws:
IllegalIDException - if the database doesn't know about the id
BioException - if there was a failure in retrieving the sequence
-
sequenceIterator
public SequenceIterator sequenceIterator()
Description copied from interface: SequenceDB
Returns a SequenceIterator over all sequences in the database. The order of retrieval is undefined.
- Specified by: sequenceIterator in interface SequenceDB
- Overrides: sequenceIterator in class AbstractSequenceDB
- Returns: a SequenceIterator over all sequences
-
ids
public Set ids()
Description copied from interface: SequenceDB
Get an immutable set of all of the IDs in the database. The ids are legal arguments to getSequence.
- Specified by: ids in interface SequenceDB
- Returns: a Set of ids - at the moment, strings
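What an immutable ID set means for callers can be shown with a small standard-library sketch. It makes no claim about how IndexedSequenceDB builds its set: the class `ImmutableIdsSketch` and its `register` method are hypothetical, and the use of `Collections.unmodifiableSet` is an assumption made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration of a read-only ID view; not the BioJava source. */
final class ImmutableIdsSketch {
    private final Set<String> ids = new LinkedHashSet<>();

    /** Internal registration path, standing in for indexing a new sequence. */
    void register(String id) {
        ids.add(id);
    }

    /** Returns a read-only view; callers cannot add or remove IDs through it. */
    Set<String> ids() {
        return Collections.unmodifiableSet(ids);
    }
}
```

A caller can iterate over the returned set and pass each element to a lookup, but an attempt to modify it, for example `ids().add("x")`, throws `UnsupportedOperationException`.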
-
-