public class EmblCDROMIndexStore extends Object implements IndexStore
EmblCDROMIndexStores implement a read-only IndexStore
backed by EMBL CD-ROM format binary
indices. The required index files are typically named
"division.lkp" and "entrynam.idx". As an IndexStore
performs lookups by sequence ID, the index files "acnum.trg" and
"acnum.hit" (which store additional accession number data) are not
used.
The sequence IDs are found using a binary search via a pointer
into the index file. The whole file is not read unless a request
for all the IDs is made using the getIDs() method. The set of IDs
is then cached after the first pass. This class also has a
close() method to free resources associated with the
underlying RandomAccessFile.
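The lookup strategy described above (seek into a sorted index and halve the search range, rather than reading the whole file) can be illustrated with a minimal sketch. The record layout here is a hypothetical simplification, not the real "entrynam.idx" format: each record is assumed to be an 8-byte space-padded ASCII ID followed by an 8-byte offset, with no file header. The class and method names are likewise invented for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Sketch: binary search over sorted, fixed-width records in a random access file. */
final class FixedWidthIndexSearch {
    // Hypothetical layout, NOT the real entrynam.idx format:
    // an 8-byte space-padded ASCII ID followed by an 8-byte offset.
    static final int ID_WIDTH = 8;
    static final int RECORD_WIDTH = ID_WIDTH + Long.BYTES;

    /** Returns the offset stored for id, or -1 if the ID is absent. */
    static long find(RandomAccessFile raf, String id) throws IOException {
        long low = 0;
        long high = raf.length() / RECORD_WIDTH - 1;
        byte[] buf = new byte[ID_WIDTH];
        while (low <= high) {
            long mid = (low + high) >>> 1;
            raf.seek(mid * RECORD_WIDTH);   // jump straight to record 'mid'
            raf.readFully(buf);
            String candidate = new String(buf, StandardCharsets.US_ASCII).trim();
            int cmp = candidate.compareTo(id);
            if (cmp == 0) {
                return raf.readLong();      // the offset follows the ID
            } else if (cmp < 0) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    /** Writes a tiny sorted index to a temporary file and looks up one ID. */
    static long demo(String id) {
        String[] ids = {"AB000001", "AB000002", "X51234", "Z99999"};
        long[] offsets = {0L, 1200L, 2450L, 9000L};
        try {
            File tmp = File.createTempFile("index", ".idx");
            tmp.deleteOnExit();
            // try-with-resources closes the file, mirroring the close() method
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                for (int i = 0; i < ids.length; i++) {
                    raf.write(String.format("%-" + ID_WIDTH + "s", ids[i])
                            .getBytes(StandardCharsets.US_ASCII));
                    raf.writeLong(offsets[i]);
                }
                return find(raf, id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("X51234"));
        System.out.println(demo("NOPE"));
    }
}
```

Each probe reads a single record, so a lookup touches roughly log2(n) records rather than all n.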
The binary index files may be created using the EMBOSS programs
dbifasta, dbiblast, dbiflat or dbigcg. The least useful from the
BioJava perspective is dbigcg because we do not have a
SequenceFormat implementation for GCG format files.
The Index instances returned by this class do not
have the record length set because this information is not
available in the binary index. The value -1 is used instead, as
described in the Index interface.
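One way a consumer might cope with a record length of -1 is to read from the record's start until a format-specific terminator. The sketch below assumes EMBL-style flat-file text, where an entry ends with a "//" line. The class, constant, and method names are hypothetical and are not part of BioJava, and the search is simplified (it does not verify that "//" begins a line).

```java
/** Sketch: extracting a record whose length is reported as -1 (unknown). */
final class UnknownLengthRecord {
    static final int UNKNOWN_LENGTH = -1;
    // Assumption: entries end with a "//" line, as in EMBL flat files.
    static final String TERMINATOR = "//\n";

    static final String SAMPLE =
            "ID   A\nSQ   acgt\n//\nID   B\nSQ   ttga\n//\n";

    /** Returns the record starting at 'start'; scans for the terminator if length is unknown. */
    static String read(String data, int start, int length) {
        if (length != UNKNOWN_LENGTH) {
            return data.substring(start, start + length);
        }
        int end = data.indexOf(TERMINATOR, start);
        return end < 0
                ? data.substring(start)                           // no terminator: take the rest
                : data.substring(start, end + TERMINATOR.length());
    }

    public static void main(String[] args) {
        int startOfB = SAMPLE.indexOf("ID   B");
        System.out.print(read(SAMPLE, startOfB, UNKNOWN_LENGTH));
    }
}
```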
Constructor and Description |
---|
EmblCDROMIndexStore(File pathPrefix, File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser): Creates a new EmblCDROMIndexStore backed by a random access binary index. |
EmblCDROMIndexStore(File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser): Creates a new EmblCDROMIndexStore backed by a random access binary index. |
Modifier and Type | Method and Description |
---|---|
void | close(): Closes the underlying EntryNamRandomAccess, which in turn closes the lower-level RandomAccessFile. |
void | commit(): Commits changes. |
Index | fetch(String id): Fetch an Index based upon an ID. |
Set | getFiles(): Retrieve the Set of files that are currently indexed. |
SequenceFormat | getFormat(): Retrieve the format of the index file. |
Set | getIDs(): Retrieve the set of all current IDs. |
String | getName(): Returns the database name as defined within the EMBL CD-ROM index. |
File | getPathPrefix(): Returns the abstract path currently being prepended to the raw sequence database filenames extracted from the binary index. |
SequenceBuilderFactory | getSBFactory(): Retrieve the SequenceBuilderFactory used to build Sequence instances. |
SymbolTokenization | getSymbolParser(): Retrieve the symbol parser used to turn the sequence characters into Symbol objects. |
void | rollback(): Rolls back changes made since the last commit. |
void | setPathPrefix(File pathPrefix): Sets the abstract path to be prepended to sequence database filenames retrieved from the binary index. |
void | store(Index index): Adds an Index to the store. |
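The path-prefix accessors summarised above combine a directory prefix with a filename taken from the binary index. A minimal sketch of that combination with plain java.io.File follows. The class and method names are hypothetical, and this is not claimed to be how EmblCDROMIndexStore does it internally. The empty prefix is handled explicitly because new File(parent, child) resolves the child against a system-dependent default directory when the parent is the empty abstract path.

```java
import java.io.File;

/** Sketch: resolving a database filename against an optional path prefix. */
final class PathPrefixSketch {
    /** Places pathPrefix in front of fileName; an empty prefix leaves fileName unchanged. */
    static File resolve(File pathPrefix, String fileName) {
        if (pathPrefix.getPath().isEmpty()) {
            // Avoid new File(emptyParent, child), which would resolve the child
            // against a system-dependent default directory.
            return new File(fileName);
        }
        return new File(pathPrefix, fileName);
    }

    public static void main(String[] args) {
        File prefix = new File("/usr/local/share/data/seq/");
        System.out.println(resolve(prefix, "swall").getPath());
        System.out.println(resolve(new File(""), "swall").getPath());
    }
}
```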
public EmblCDROMIndexStore(File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser) throws IOException
Creates a new EmblCDROMIndexStore backed by a
random access binary index.
Parameters:
divisionLkp - a File containing the master index.
entryNamIdx - a File containing the sequence IDs and offsets.
format - a SequenceFormat.
factory - a SequenceBuilderFactory.
parser - a SymbolTokenization.
Throws:
IOException - if an error occurs.
public EmblCDROMIndexStore(File pathPrefix, File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser) throws IOException
Creates a new EmblCDROMIndexStore backed by a
random access binary index.
Parameters:
pathPrefix - a File containing the abstract path to be prepended
to sequence database filenames retrieved from the binary index.
divisionLkp - a File containing the master index.
entryNamIdx - a File containing the sequence IDs and offsets.
format - a SequenceFormat.
factory - a SequenceBuilderFactory.
parser - a SymbolTokenization.
Throws:
IOException - if an error occurs.
public File getPathPrefix()
getPathPrefix returns the abstract path currently
being prepended to the raw sequence database filenames extracted
from the binary index. This value defaults to the empty
abstract path.
Returns:
a File.
public void setPathPrefix(File pathPrefix)
setPathPrefix sets the abstract path to be
prepended to sequence database filenames retrieved from the
binary index. E.g. if the binary index refers to the database
as 'SWALL' and the pathPrefix is set to
"/usr/local/share/data/seq/", then the IndexStore
will know the database path as
"/usr/local/share/data/seq/swall" and any Index
instances produced by the store will return the latter path
when their getFile() method is called. This value defaults to
the empty abstract path.
Parameters:
pathPrefix - a File prefix specifying the abstract path to prepend.
public String getName()
getName returns the database name as defined
within the EMBL CD-ROM index.
Specified by:
getName in interface IndexStore
Returns:
a String value.
public void store(Index index) throws IllegalIDException, BioException
store adds an Index to the store. As
EMBL CD-ROM indices are read-only, this implementation throws a
BioException.
Specified by:
store in interface IndexStore
Parameters:
index - an Index.
Throws:
IllegalIDException - if an error occurs.
BioException - if an error occurs.
public void commit() throws BioException
commit commits changes. As EMBL CD-ROM indices are
read-only, this implementation throws a BioException.
Specified by:
commit in interface IndexStore
Throws:
BioException - if an error occurs.
public void rollback()
rollback rolls back changes made since the last
commit. As EMBL CD-ROM indices are read-only, this
implementation does nothing.
Specified by:
rollback in interface IndexStore
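The read-only contract described for store, commit and rollback (the two mutating calls fail with a checked exception, while rollback silently does nothing) can be sketched as follows. The interface and exception types here are pared-down, hypothetical stand-ins for IndexStore and BioException, defined locally so the example runs without BioJava.

```java
/** Sketch: a read-only implementation of a mutable-store contract. */
final class ReadOnlyStoreSketch {
    /** Hypothetical stand-in for BioException. */
    static class StoreException extends Exception {
        StoreException(String message) { super(message); }
    }

    /** Hypothetical stand-in for the mutating part of IndexStore. */
    interface MutableStore {
        void store(String id) throws StoreException;
        void commit() throws StoreException;
        void rollback();
    }

    static final class ReadOnlyStore implements MutableStore {
        @Override public void store(String id) throws StoreException {
            throw new StoreException("store is read-only: cannot add " + id);
        }
        @Override public void commit() throws StoreException {
            throw new StoreException("store is read-only: nothing to commit");
        }
        @Override public void rollback() {
            // Nothing can have been written, so there is nothing to undo.
        }
    }

    /** Returns true if store() is rejected with a StoreException. */
    static boolean storeRejected() {
        try {
            new ReadOnlyStore().store("X51234");
            return false;
        } catch (StoreException e) {
            return true;
        }
    }

    /** Returns true if commit() is rejected with a StoreException. */
    static boolean commitRejected() {
        try {
            new ReadOnlyStore().commit();
            return false;
        } catch (StoreException e) {
            return true;
        }
    }

    /** Returns true once rollback() has completed without throwing. */
    static boolean rollbackIsSilent() {
        new ReadOnlyStore().rollback();
        return true;
    }

    public static void main(String[] args) {
        System.out.println(storeRejected() + " " + commitRejected() + " " + rollbackIsSilent());
    }
}
```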
public Index fetch(String id) throws IllegalIDException, BioException
Description copied from interface: IndexStore
Fetch an Index based upon an ID.
Specified by:
fetch in interface IndexStore
Parameters:
id - The ID of the sequence Index to retrieve
Throws:
IllegalIDException - if the ID couldn't be found
BioException - if the fetch fails in the underlying storage mechanism
public Set getIDs()
Description copied from interface: IndexStore
Retrieve the set of all current IDs.
This set should either be immutable, or modifiable totally separately from the IndexStore.
Specified by:
getIDs in interface IndexStore
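The two properties attached to getIDs (the full set is built once and then cached, and callers must not be able to alter the store through it) can be sketched with an unmodifiable, lazily built set. The class below is hypothetical; a List stands in for the index file, and a counter records how many full passes were made.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: a lazily built, cached, unmodifiable set of IDs. */
final class CachedIdSet {
    private final List<String> source;  // stands in for the index file
    private Set<String> cache;          // built on the first getIDs() call
    private int fullScans;              // number of full passes over the source

    CachedIdSet(List<String> source) {
        this.source = source;
    }

    synchronized Set<String> getIDs() {
        if (cache == null) {
            fullScans++;
            // Copy, then wrap, so callers cannot modify the cached set.
            cache = Collections.unmodifiableSet(new TreeSet<>(source));
        }
        return cache;
    }

    synchronized int fullScans() {
        return fullScans;
    }

    /** Calls getIDs() twice and reports how many full scans were needed. */
    static int scansAfterTwoCalls() {
        CachedIdSet ids = new CachedIdSet(Arrays.asList("X51234", "AB000001"));
        ids.getIDs();
        ids.getIDs();
        return ids.fullScans();
    }

    /** Returns true if the returned set refuses modification. */
    static boolean rejectsModification() {
        Set<String> ids = new CachedIdSet(Arrays.asList("X51234")).getIDs();
        try {
            ids.add("Z99999");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(scansAfterTwoCalls());
        System.out.println(rejectsModification());
    }
}
```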
public Set getFiles()
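Description copied from interface: IndexStore
Retrieve the Set of files that are currently indexed.
Specified by:
getFiles in interface IndexStore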
public SequenceFormat getFormat()
Description copied from interface: IndexStore
Retrieve the format of the index file.
Specified by:
getFormat in interface IndexStore
public SequenceBuilderFactory getSBFactory()
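Description copied from interface: IndexStore
Retrieve the SequenceBuilderFactory used to build Sequence instances.
Specified by:
getSBFactory in interface IndexStore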
public SymbolTokenization getSymbolParser()
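Description copied from interface: IndexStore
Retrieve the symbol parser used to turn the sequence characters
into Symbol objects.
Specified by:
getSymbolParser in interface IndexStore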
public void close() throws IOException
close closes the underlying EntryNamRandomAccess
which in turn closes the lower level RandomAccessFile.
This frees the resources associated with the file.
Throws:
IOException - if an error occurs.
Copyright © 2020 BioJava. All rights reserved.