Class EmblCDROMIndexStore
- java.lang.Object
-
- org.biojava.bio.seq.db.EmblCDROMIndexStore
-
- All Implemented Interfaces:
IndexStore
public class EmblCDROMIndexStore extends Object implements IndexStore
EmblCDROMIndexStores implement a read-only IndexStore backed by EMBL CD-ROM format binary indices. The required index files are typically named "division.lkp" and "entrynam.idx". As an IndexStore performs lookups by sequence ID, the index files "acnum.trg" and "acnum.hit" (which store additional accession number data) are not used.
The sequence IDs are found using a binary search via a pointer into the index file. The whole file is not read unless a request for all the IDs is made using the getIDs() method. The set of IDs is then cached after the first pass. This class also has a close() method to free resources associated with the underlying RandomAccessFile.
The binary index files may be created using the EMBOSS programs dbifasta, dbiblast, dbiflat or dbigcg. The least useful from the BioJava perspective is dbigcg because we do not have a SequenceFormat implementation for GCG format files.
The Index instances returned by this class do not have the record length set because this information is not available in the binary index. The value -1 is used instead, as described in the Index interface.
- Since:
- 1.2
- Author:
- Keith James
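The binary-search lookup described in the class overview can be illustrated with a minimal, self-contained sketch. It does not read real EMBL CD-ROM indices: the fixed 8-byte, space-padded ASCII record layout, the class name FixedWidthIdSearch and the helper writeDemoIndex are assumptions made purely for illustration, and the actual "entrynam.idx" layout is not modelled here.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class FixedWidthIdSearch {
    // Hypothetical record width in bytes; NOT the real entrynam.idx layout.
    static final int RECORD_LENGTH = 8;

    /** Returns the record number holding id, or -1 if it is absent. */
    static long find(RandomAccessFile raf, String id) throws IOException {
        long low = 0;
        long high = raf.length() / RECORD_LENGTH - 1;
        byte[] buf = new byte[RECORD_LENGTH];
        while (low <= high) {
            long mid = (low + high) >>> 1;
            raf.seek(mid * RECORD_LENGTH); // jump straight to the candidate record
            raf.readFully(buf);
            String candidate = new String(buf, StandardCharsets.US_ASCII).trim();
            int cmp = candidate.compareTo(id);
            if (cmp == 0) {
                return mid;
            }
            if (cmp < 0) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    /** Writes already-sorted, space-padded IDs to a temporary demo file. */
    static File writeDemoIndex(String... sortedIds) throws IOException {
        File f = File.createTempFile("demo", ".idx");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            for (String id : sortedIds) {
                String padded = String.format("%-" + RECORD_LENGTH + "s", id);
                raf.write(padded.getBytes(StandardCharsets.US_ASCII));
            }
        }
        return f;
    }

    public static void main(String[] args) throws IOException {
        File f = writeDemoIndex("AAA1", "BBB2", "CCC3", "DDD4");
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            System.out.println(find(raf, "CCC3")); // present
            System.out.println(find(raf, "ZZZ9")); // absent
        }
    }
}
```

Only the records probed by the search are read, which is why a lookup of this kind does not need to load the whole file.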
-
-
Constructor Summary
Constructors (constructor, then description):
EmblCDROMIndexStore(File pathPrefix, File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser)
Creates a new EmblCDROMIndexStore backed by a random access binary index.
EmblCDROMIndexStore(File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser)
Creates a new EmblCDROMIndexStore backed by a random access binary index.
-
Method Summary
Methods (modifier, type and method, then description):
void close()
close closes the underlying EntryNamRandomAccess, which in turn closes the lower level RandomAccessFile.
void commit()
commit commits changes.
Index fetch(String id)
Fetch an Index based upon an ID.
Set getFiles()
Retrieve the Set of files that are currently indexed.
SequenceFormat getFormat()
Retrieve the format of the index file.
Set getIDs()
Retrieve the set of all current IDs.
String getName()
getName returns the database name as defined within the EMBL CD-ROM index.
File getPathPrefix()
getPathPrefix returns the abstract path currently being prepended to the raw sequence database filenames extracted from the binary index.
SequenceBuilderFactory getSBFactory()
Retrieve the SequenceBuilderFactory used to build Sequence instances.
SymbolTokenization getSymbolParser()
Retrieve the symbol parser used to turn the sequence characters into Symbol objects.
void rollback()
rollback rolls back changes made since the last commit.
void setPathPrefix(File pathPrefix)
setPathPrefix sets the abstract path to be prepended to sequence database filenames retrieved from the binary index.
void store(Index index)
store adds an Index to the store.
-
-
-
Constructor Detail
-
EmblCDROMIndexStore
public EmblCDROMIndexStore(File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser) throws IOException
Creates a new EmblCDROMIndexStore backed by a random access binary index.
- Parameters:
divisionLkp - a File containing the master index.
entryNamIdx - a File containing the sequence IDs and offsets.
format - a SequenceFormat.
factory - a SequenceBuilderFactory.
parser - a SymbolTokenization.
- Throws:
IOException - if an error occurs.
-
EmblCDROMIndexStore
public EmblCDROMIndexStore(File pathPrefix, File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser) throws IOException
Creates a new EmblCDROMIndexStore backed by a random access binary index.
- Parameters:
pathPrefix - a File containing the abstract path to be prepended to sequence database filenames retrieved from the binary index.
divisionLkp - a File containing the master index.
entryNamIdx - a File containing the sequence IDs and offsets.
format - a SequenceFormat.
factory - a SequenceBuilderFactory.
parser - a SymbolTokenization.
- Throws:
IOException - if an error occurs.
-
-
Method Detail
-
getPathPrefix
public File getPathPrefix()
getPathPrefix returns the abstract path currently being prepended to the raw sequence database filenames extracted from the binary index. This value defaults to the empty abstract path.
- Returns:
- a File.
-
setPathPrefix
public void setPathPrefix(File pathPrefix)
setPathPrefix sets the abstract path to be prepended to sequence database filenames retrieved from the binary index. E.g. if the binary index refers to the database as 'SWALL' and the pathPrefix is set to "/usr/local/share/data/seq/", then the IndexStore will know the database path as "/usr/local/share/data/seq/swall" and any Index instances produced by the store will return the latter path when their getFile() method is called. This value defaults to the empty abstract path.
- Parameters:
pathPrefix - a File prefix specifying the abstract path to prepend.
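The prefixing behaviour described for setPathPrefix can be sketched with plain java.io.File. This is an illustration under stated assumptions, not the BioJava implementation: the class PathPrefixSketch and the helper resolve are hypothetical, the filename is passed in already lower-cased as in the example above, and treating an empty prefix as "no prefix" is a design choice of this sketch.

```java
import java.io.File;

public class PathPrefixSketch {
    /**
     * Resolves a database filename against a prefix.
     * Hypothetical helper, not part of BioJava.
     */
    static File resolve(File pathPrefix, String dbFileName) {
        // An empty abstract path is treated here as "no prefix".
        if (pathPrefix.getPath().isEmpty()) {
            return new File(dbFileName);
        }
        // The two-argument constructor joins parent and child with the
        // platform separator.
        return new File(pathPrefix, dbFileName);
    }

    public static void main(String[] args) {
        File prefix = new File("/usr/local/share/data/seq/");
        System.out.println(resolve(prefix, "swall"));
        System.out.println(resolve(new File(""), "swall"));
    }
}
```

The empty-prefix check matters because java.io.File resolves a child against a system-dependent default directory when the parent is the empty abstract pathname, which would silently turn a relative name into a different path.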
-
getName
public String getName()
getName returns the database name as defined within the EMBL CD-ROM index.
- Specified by:
getName in interface IndexStore
- Returns:
- a String value.
-
store
public void store(Index index) throws IllegalIDException, BioException
store adds an Index to the store. As EMBL CD-ROM indices are read-only, this implementation throws a BioException.
- Specified by:
store in interface IndexStore
- Parameters:
index - an Index.
- Throws:
IllegalIDException - if an error occurs.
BioException - if an error occurs.
-
commit
public void commit() throws BioException
commit commits changes. As EMBL CD-ROM indices are read-only, this implementation throws a BioException.
- Specified by:
commit in interface IndexStore
- Throws:
BioException - if an error occurs.
-
rollback
public void rollback()
rollback rolls back changes made since the last commit. As EMBL CD-ROM indices are read-only, this implementation does nothing.
- Specified by:
rollback in interface IndexStore
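The read-only behaviour of store, commit and rollback described above follows a common pattern, sketched below without any BioJava dependency. Every name here (ReadOnlyStoreSketch, StoreException, WritableStore, ReadOnlyStore) is hypothetical; StoreException merely stands in for a checked library exception such as BioException.

```java
public class ReadOnlyStoreSketch {
    /** Stand-in for a checked library exception; hypothetical. */
    static class StoreException extends Exception {
        StoreException(String message) {
            super(message);
        }
    }

    /** Minimal hypothetical write-side contract. */
    interface WritableStore {
        void store(String id) throws StoreException;
        void commit() throws StoreException;
        void rollback();
    }

    /** A store whose backing data cannot be modified. */
    static class ReadOnlyStore implements WritableStore {
        @Override
        public void store(String id) throws StoreException {
            throw new StoreException("store is read-only: cannot add " + id);
        }

        @Override
        public void commit() throws StoreException {
            throw new StoreException("store is read-only: nothing to commit");
        }

        @Override
        public void rollback() {
            // Nothing was ever changed, so there is nothing to undo.
        }
    }

    /** Returns true if the store rejected an attempted write. */
    static boolean rejectsWrites(WritableStore s) {
        try {
            s.store("X12345");
            return false;
        } catch (StoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ReadOnlyStore s = new ReadOnlyStore();
        System.out.println(rejectsWrites(s));
        s.rollback(); // completes silently
    }
}
```

Callers that may be handed either a writable or a read-only store should therefore be prepared for the checked exception on every write, while rollback is always safe to call.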
-
fetch
public Index fetch(String id) throws IllegalIDException, BioException
Description copied from interface: IndexStore
Fetch an Index based upon an ID.
- Specified by:
fetch in interface IndexStore
- Parameters:
id - The ID of the sequence Index to retrieve
- Throws:
IllegalIDException - if the ID couldn't be found
BioException - if the fetch fails in the underlying storage mechanism
-
getIDs
public Set getIDs()
Description copied from interface: IndexStore
Retrieve the set of all current IDs. This set should either be immutable, or modifiable totally separately from the IndexStore.
- Specified by:
getIDs in interface IndexStore
- Returns:
- a Set of all legal IDs
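The contract above (an immutable set), combined with the class overview's note that the IDs are cached after the first full pass, can be sketched as a lazily filled, unmodifiable set. The class CachedIdSetSketch, its scans counter and the Supplier standing in for the full file scan are assumptions of this sketch and do not reflect the BioJava source.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

public class CachedIdSetSketch {
    private final Supplier<Set<String>> fullScan; // hypothetical expensive pass
    private Set<String> cached;                   // null until first request
    int scans = 0;                                // counts full passes, for the demo

    CachedIdSetSketch(Supplier<Set<String>> fullScan) {
        this.fullScan = fullScan;
    }

    /** Returns all IDs, scanning the source only on the first call. */
    synchronized Set<String> getIDs() {
        if (cached == null) {
            scans++;
            // Copy, then wrap, so callers cannot modify the cached set.
            cached = Collections.unmodifiableSet(
                    new LinkedHashSet<>(fullScan.get()));
        }
        return cached;
    }

    public static void main(String[] args) {
        CachedIdSetSketch store = new CachedIdSetSketch(
                () -> new LinkedHashSet<>(Arrays.asList("AAA1", "BBB2")));
        store.getIDs();
        store.getIDs();
        System.out.println(store.scans); // the scan ran once
    }
}
```

Returning an unmodifiable view means a caller's attempt to add or remove IDs fails with UnsupportedOperationException instead of corrupting the cache.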
-
getFiles
public Set getFiles()
Description copied from interface: IndexStore
Retrieve the Set of files that are currently indexed.
- Specified by:
getFiles in interface IndexStore
-
getFormat
public SequenceFormat getFormat()
Description copied from interface: IndexStore
Retrieve the format of the index file.
- Specified by:
getFormat in interface IndexStore
- Returns:
- the SequenceFormat of the indexed files
-
getSBFactory
public SequenceBuilderFactory getSBFactory()
Description copied from interface: IndexStore
Retrieve the SequenceBuilderFactory used to build Sequence instances.
- Specified by:
getSBFactory in interface IndexStore
- Returns:
- the associated SequenceBuilderFactory
-
getSymbolParser
public SymbolTokenization getSymbolParser()
Description copied from interface: IndexStore
Retrieve the symbol parser used to turn the sequence characters into Symbol objects.
- Specified by:
getSymbolParser in interface IndexStore
- Returns:
- the associated SymbolTokenization
-
close
public void close() throws IOException
close closes the underlying EntryNamRandomAccess, which in turn closes the lower level RandomAccessFile. This frees the resources associated with the file.
- Throws:
IOException - if an error occurs.
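The chained close described above, where a store delegates to a reader that owns the RandomAccessFile, might look like the following sketch. The classes CloseChainSketch, RecordReader and Store are hypothetical stand-ins; unlike them, EmblCDROMIndexStore is listed on this page as implementing only IndexStore, so callers of the real class would use try/finally as in readFailsAfterClose.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class CloseChainSketch {
    /** Hypothetical low-level reader that owns the file handle. */
    static class RecordReader implements Closeable {
        private final RandomAccessFile raf;

        RecordReader(File f) throws IOException {
            raf = new RandomAccessFile(f, "r");
        }

        int firstByte() throws IOException {
            raf.seek(0);
            return raf.read(); // -1 for an empty file
        }

        @Override
        public void close() throws IOException {
            raf.close();
        }
    }

    /** Hypothetical store that delegates close() to its reader. */
    static class Store implements Closeable {
        private final RecordReader reader;

        Store(File f) throws IOException {
            reader = new RecordReader(f);
        }

        int firstByte() throws IOException {
            return reader.firstByte();
        }

        @Override
        public void close() throws IOException {
            reader.close();
        }
    }

    /** Uses the store, closes it in finally, then checks that reads fail. */
    static boolean readFailsAfterClose(File f) throws IOException {
        Store store = new Store(f);
        try {
            store.firstByte();
        } finally {
            store.close(); // releases the file handle even if the read threw
        }
        try {
            store.firstByte();
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("demo", ".idx");
        f.deleteOnExit();
        System.out.println(readFailsAfterClose(f));
    }
}
```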
-
-