Interface | Description |
---|---|
BoundaryFinder | |
ChangeTable.Changer | Callback used to produce a new value from an old one. |
ChangeTable.Splitter | Callback used to produce a list of values from a single old one. |
PropertyChanger | Interface for objects that change tag names or properties systematically. |
StateMachine.ExitNotification | Interface implemented by State listeners that want notification when a transition leaves the State. |
StateMachine.State | Interface for a State within this StateMachine. |
TagValueContext | Communication interface between Parser and a TagValueListener that allows listeners to request that a parser/listener pair be pushed onto the stack to handle the current tag. |
TagValueListener | An object that wishes to be informed of events during the parsing of a file. |
TagValueParser | Tokenize single records (lines of text, objects) into a tag and a value. |
TagValueWrapper | Interface for TagValueListeners that wrap other TagValueListeners. Implementations will tend to intercept the tags or values as they stream through and modify them in some manner before forwarding them to the delegate listener. |
Class | Description |
---|---|
AbstractWrapper | An abstract TagValueWrapper that does nothing! |
Aggregator | Joins multiple values into single values. |
AnnotationBuilder | Builds an Annotation tree from TagValue events using an AnnotationType to work out which fields are of what type. |
ChangeTable | A mapping between keys and actions to turn old values into new values. |
ChangeTable.ChainedChanger | An implementation of Changer that applies a list of Changer instances to the value in turn. |
Echo | A simple listener that just echoes events back to the console. |
Formats | This is intended as a repository for tag-value and AnnotationType information about common file formats. |
Index2Model | |
Indexer | Listens to tag-value events and passes on indexing events to an IndexStore. |
Indexer2 | Listens to tag-value events and passes on indexing events to an IndexStore. |
LineSplitParser | A parser that splits a line into tag/value at a given column number. |
MultiTagger | Partition multiple values for a tag into their own tag groups. |
Parser | Encapsulate the parsing of lines from a buffered reader into tag-value events. |
ParserListener | ParserListener is an immutable pairing of a parser and listener. |
RegexChanger | A ChangeTable.Changer that returns a specific match value using a regex Pattern. |
RegexFieldFinder | |
RegexParser | A TagValueParser that splits a line based upon a regular expression. |
RegexSplitter | A ChangeTable.Splitter that splits a line of text using a regular expression, returning one value per match. |
SimpleTagValueWrapper | Helper class to wrap one TagValueListener inside another one. |
StateMachine | This class implements a state machine for parsing events from the Parser class. |
TagDelegator | Pushes a new parser and listener, or delegates to a listener, depending on the tag. |
TagDropper | Silently drop all tags except those specified, and pass the rest onto a delegate. |
TagMapper | TagMapper maps arbitrary object keys to new keys. |
TagRenamer | Rename tags using a TagMapper. |
TagValue | Utility class for representing tag-value pairs for TagValueParser implementors. |
ValueChanger | Intercept the values associated with some tags and change them systematically. |
Process files as streams of records, each with tags with values.
Many files in biology are structured as multiple records, each of which can be broken down into lines composed of a tag and an associated value. For example, EMBL files have two-letter tags, and values extend from column 5 to the end of the line. There is a vast array of file formats with broadly similar structures, and this package aims to provide a framework within which parsing strategies and data consumers can be reused as much as possible.
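As a rough illustration of the column-based layout described above, the following self-contained sketch splits a line into a tag and a value at a fixed offset. The class `ColumnSplitSketch`, its `split` method, and the sample line are all hypothetical; this is not the LineSplitParser API, and it assumes "column 5" means a 0-based character offset.

```java
// Hypothetical sketch; not part of BioJava.
class ColumnSplitSketch {

    /**
     * Splits a line into {tag, value}. Everything before valueStart
     * (a 0-based offset, assumed here) is the tag; the rest is the value.
     */
    static String[] split(String line, int valueStart) {
        if (line.length() <= valueStart) {
            // Short lines (for example a record separator) carry no value.
            return new String[] { line.trim(), "" };
        }
        String tag = line.substring(0, valueStart).trim();
        String value = line.substring(valueStart);
        return new String[] { tag, value };
    }

    public static void main(String[] args) {
        // Made-up EMBL-style line, for illustration only.
        String[] tv = split("AC   X00001;", 5);
        System.out.println("tag=" + tv[0] + " value=" + tv[1]);
    }
}
```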
The data associated with each record is represented by a stream of events encapsulated by callbacks on the TagValueListener interface. It is up to the user to provide implementations of this interface that build static representations of the data if they so wish.
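To make the idea of building a static representation from an event stream concrete, here is a minimal sketch of a listener that collects each record into a map from tag to values. The nested `Listener` interface is a simplified stand-in whose method names follow the callbacks mentioned in this description; its signatures are assumptions and may differ from the real TagValueListener.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the interface below is NOT the BioJava one.
class RecordCollectorSketch {

    /** Simplified event callbacks (assumed signatures). */
    interface Listener {
        void startRecord();
        void startTag(String tag);
        void value(String value);
        void endTag();
        void endRecord();
    }

    /** Builds one tag -> values map per record as events arrive. */
    static class Collector implements Listener {
        final List<Map<String, List<String>>> records = new ArrayList<>();
        private Map<String, List<String>> current;
        private String tag;

        public void startRecord() { current = new LinkedHashMap<>(); }
        public void startTag(String tag) { this.tag = tag; }
        public void value(String value) {
            // A tag may carry several values, so keep them in a list.
            current.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
        }
        public void endTag() { tag = null; }
        public void endRecord() { records.add(current); current = null; }
    }

    public static void main(String[] args) {
        Collector c = new Collector();
        c.startRecord();
        c.startTag("AC"); c.value("X00001;"); c.endTag();
        c.startTag("DE"); c.value("first line"); c.value("second line"); c.endTag();
        c.endRecord();
        System.out.println(c.records);
    }
}
```

Keeping the collector separate from the code that fires the events is the point of the design: the same collector can be fed by any parser that produces these callbacks.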
Parser and Pushing Sub-Documents
Often file formats have embedded sub-documents. For example, in EMBL format files the feature table area is identical to that in GENBANK files if the first five columns are ignored. In ACeDB files, every time an ace-tag is found, it causes a new sub-document to be induced with its own structure and set of allowed tags and values. Python code uses indent depth to represent code blocks.
Parser allows TagValueListener objects to request that all of the values associated with the current tag should be handled by a new TagValueParser and TagValueListener pair. The Parser instance will use the original TagValueParser to process the line as before, and then take the value that would have been handed to the listener's value method, and present it to the newly registered TagValueParser to tokenize into tag and value portions. That tag and value will then be passed onto the new TagValueListener. The new TagValueListener can itself choose to push a new pair of parser and listener to start a new sub-sub-document. This can be repeated to arbitrary depth. As soon as a parser and listener pair are registered, the pushed listener receives a startRecord() message. Once the entire containing record ends (due to a record separator line such as "//", or because the end of file has been reached), or if the tag that caused the delegation ends, the pushed listener will receive the appropriate endTag() message and also endRecord().
TagDelegator is a useful helper class that always delegates to a given parser and listener pair on a given tag.
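The pushing mechanism can be sketched without the library as a chain of frames, each holding one parser/listener pair. Everything here (`PushSketch`, `Frame`, `Context`, the simplified `Listener`, the two splitting functions and the sample lines) is hypothetical and only mirrors the behaviour described above; the real Parser and TagValueContext may be organised differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of parser/listener pushing; not the BioJava API.
class PushSketch {
    interface Context { void push(Function<String, String[]> parser, Listener listener); }

    interface Listener {
        void startRecord();
        void startTag(String tag, Context ctx);
        void value(String value);
        void endTag();
        void endRecord();
    }

    /** One parser/listener pair plus the pair pushed for the current tag, if any. */
    static class Frame {
        final Function<String, String[]> parser;
        final Listener listener;
        String currentTag;
        Frame child;

        Frame(Function<String, String[]> parser, Listener listener) {
            this.parser = parser;
            this.listener = listener;
            listener.startRecord(); // a registered pair sees startRecord() at once
        }

        void line(String text) {
            String[] tv = parser.apply(text);
            if (!tv[0].equals(currentTag)) {
                closeTag();
                currentTag = tv[0];
                listener.startTag(currentTag, (p, l) -> child = new Frame(p, l));
            }
            // A pushed pair re-tokenizes the value; otherwise deliver it directly.
            if (child != null) child.line(tv[1]); else listener.value(tv[1]);
        }

        void closeTag() {
            if (currentTag == null) return;
            if (child != null) { child.end(); child = null; }
            listener.endTag();
            currentTag = null;
        }

        void end() { closeTag(); listener.endRecord(); }
    }

    /** Logs events; the outer instance pushes a sub-parser on the "FT" tag. */
    static class Log implements Listener {
        final String name; final List<String> out; final boolean pushOnFt;
        Log(String name, List<String> out, boolean pushOnFt) {
            this.name = name; this.out = out; this.pushOnFt = pushOnFt;
        }
        public void startRecord() { out.add(name + ".startRecord"); }
        public void startTag(String tag, Context ctx) {
            out.add(name + ".startTag(" + tag + ")");
            if (pushOnFt && tag.equals("FT")) ctx.push(PushSketch::atSpace, new Log("inner", out, false));
        }
        public void value(String value) { out.add(name + ".value(" + value + ")"); }
        public void endTag() { out.add(name + ".endTag"); }
        public void endRecord() { out.add(name + ".endRecord"); }
    }

    static String[] atColumn(String s) {
        return s.length() <= 5 ? new String[] { s.trim(), "" }
                               : new String[] { s.substring(0, 5).trim(), s.substring(5) };
    }

    static String[] atSpace(String s) {
        int i = s.indexOf(' ');
        return i < 0 ? new String[] { s, "" }
                     : new String[] { s.substring(0, i), s.substring(i + 1).trim() };
    }

    static List<String> run(List<String> lines) {
        List<String> out = new ArrayList<>();
        Frame root = null;
        for (String line : lines) {
            if (line.equals("//")) { if (root != null) { root.end(); root = null; } continue; }
            if (root == null) root = new Frame(PushSketch::atColumn, new Log("outer", out, true));
            root.line(line);
        }
        if (root != null) root.end(); // end of input also ends the record
        return out;
    }

    public static void main(String[] args) {
        run(Arrays.asList("ID   rec1", "FT   gene 1..10", "FT   cds 2..9", "//"))
            .forEach(System.out::println);
    }
}
```

Because each frame can itself receive a push, the same code handles sub-sub-documents to any depth. Ending a tag or a record unwinds the chain from the innermost frame outwards, so the pushed listener sees its endTag() and endRecord() before the outer listener sees its own.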
Often while parsing, you will need to change tag names or modify values. In the simple case, all the tags and values will be String instances. You will probably want different types, such as the numeric objects (Double, Integer and their friends), or to instantiate your own objects from these Strings. Additionally, some values are themselves better represented as lists of more fundamental items. There are several TagValueListener helper classes implementing TagValueWrapper that allow you to configure a chain of event transducers while writing a minimal amount of code.
TagMapper remaps a subset of the tags it sees. For example, it could be configured to replace all "FOO_ID" tags with "accession.number".
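The renaming idea can be illustrated with a small wrapper that rewrites tags from a lookup table before forwarding events. This is a sketch under assumptions: `RenameSketch`, `Renamer`, `Recorder` and the two-method `Listener` interface are hypothetical and much smaller than the real TagMapper and TagRenamer classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of tag renaming; not the BioJava API.
class RenameSketch {

    /** Reduced event interface (assumed), enough to show tag renaming. */
    interface Listener {
        void startTag(String tag);
        void value(String value);
    }

    /** Records events as strings so the effect of the wrapper is visible. */
    static class Recorder implements Listener {
        final List<String> events = new ArrayList<>();
        public void startTag(String tag) { events.add("tag:" + tag); }
        public void value(String value) { events.add("value:" + value); }
    }

    /** Wrapper that renames mapped tags and forwards everything else untouched. */
    static class Renamer implements Listener {
        private final Map<String, String> mapping;
        private final Listener delegate;

        Renamer(Map<String, String> mapping, Listener delegate) {
            this.mapping = mapping;
            this.delegate = delegate;
        }

        public void startTag(String tag) {
            // Tags without an entry in the mapping pass through unchanged.
            delegate.startTag(mapping.getOrDefault(tag, tag));
        }

        public void value(String value) { delegate.value(value); }
    }

    public static void main(String[] args) {
        Recorder rec = new Recorder();
        Renamer r = new Renamer(Collections.singletonMap("FOO_ID", "accession.number"), rec);
        r.startTag("FOO_ID");
        r.value("X00001");
        r.startTag("DE");
        r.value("some description");
        System.out.println(rec.events);
    }
}
```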
ValueChanger intercepts the value() calls for specific tags and uses either a ChangeTable.Changer or ChangeTable.Splitter instance to replace or subdivide the value before passing it on to another listener.
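A minimal sketch of value changing and splitting follows, assuming a per-tag table of functions. The names `ValueChangeSketch`, `Changing`, `Recorder`, the tags "length" and "keywords", and the use of `java.util.function.Function` in place of the library's Changer and Splitter callbacks are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of value changing and splitting; not the BioJava API.
class ValueChangeSketch {

    /** Reduced event interface (assumed); values are Objects so types can change. */
    interface Listener {
        void startTag(String tag);
        void value(Object value);
    }

    /** Collects the values it receives. */
    static class Recorder implements Listener {
        final List<Object> values = new ArrayList<>();
        public void startTag(String tag) { }
        public void value(Object value) { values.add(value); }
    }

    /** Wrapper that replaces or subdivides values for configured tags. */
    static class Changing implements Listener {
        final Map<String, Function<String, Object>> changers = new HashMap<>();
        final Map<String, Function<String, List<?>>> splitters = new HashMap<>();
        private final Listener delegate;
        private String tag;

        Changing(Listener delegate) { this.delegate = delegate; }

        public void startTag(String tag) {
            this.tag = tag;
            delegate.startTag(tag);
        }

        public void value(Object value) {
            String text = String.valueOf(value);
            if (changers.containsKey(tag)) {
                // One old value becomes one new value.
                delegate.value(changers.get(tag).apply(text));
            } else if (splitters.containsKey(tag)) {
                // One old value becomes several values, delivered in order.
                for (Object part : splitters.get(tag).apply(text)) {
                    delegate.value(part);
                }
            } else {
                delegate.value(value);
            }
        }
    }

    public static void main(String[] args) {
        Recorder rec = new Recorder();
        Changing c = new Changing(rec);
        c.changers.put("length", Integer::valueOf);
        c.splitters.put("keywords", s -> Arrays.asList(s.split("\\s*;\\s*")));

        c.startTag("length");
        c.value("1859");
        c.startTag("keywords");
        c.value("kinase; membrane");
        System.out.println(rec.values);
    }
}
```

Because the wrapper is itself a listener, several such wrappers can be stacked in front of a final consumer, which is the chaining style this section describes.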
Copyright © 2014 BioJava. All rights reserved.