Package org.biojava.bio.seq.io
Interface SymbolTokenization
All Superinterfaces:
Annotatable, Changeable
All Known Implementing Classes:
AlternateTokenization, CharacterTokenization, CrossProductTokenization, DoubleTokenization, IntegerTokenization, NameTokenization, SoftMaskedAlphabet.CaseSensitiveTokenization, SubIntegerTokenization, WordTokenization
public interface SymbolTokenization extends Annotatable
Encapsulate a mapping between BioJava Symbol objects and some string representation.
Since:
1.2
Author:
Thomas Down
Nested Class Summary
Nested Classes
Modifier and Type    Interface
static class         SymbolTokenization.TokenType
Nested classes/interfaces inherited from interface org.biojava.bio.Annotatable
Annotatable.AnnotationForwarder
Field Summary
Fields
Modifier and Type                      Field
static SymbolTokenization.TokenType    CHARACTER
static SymbolTokenization.TokenType    FIXEDWIDTH
static SymbolTokenization.TokenType    SEPARATED
static SymbolTokenization.TokenType    UNKNOWN
Fields inherited from interface org.biojava.bio.Annotatable
ANNOTATION
Method Summary
Modifier and Type, Method, Description
Alphabet getAlphabet()
    The alphabet to which this tokenization applies.
SymbolTokenization.TokenType getTokenType()
    Determine the style of tokenization represented by this object.
StreamParser parseStream(SeqIOListener listener)
    Return an object which can parse an arbitrary character stream into symbols.
Symbol parseToken(String token)
    Returns the symbol for a single token.
String tokenizeSymbol(Symbol sym)
    Return a token representing a single symbol.
String tokenizeSymbolList(SymbolList symList)
    Return a string representation of a list of symbols.
Methods inherited from interface org.biojava.bio.Annotatable
getAnnotation
-
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
Field Detail
-
CHARACTER
static final SymbolTokenization.TokenType CHARACTER
-
FIXEDWIDTH
static final SymbolTokenization.TokenType FIXEDWIDTH
-
SEPARATED
static final SymbolTokenization.TokenType SEPARATED
-
UNKNOWN
static final SymbolTokenization.TokenType UNKNOWN
Method Detail
-
getAlphabet
Alphabet getAlphabet()
The alphabet to which this tokenization applies.
-
getTokenType
SymbolTokenization.TokenType getTokenType()
Determine the style of tokenization represented by this object.
-
parseToken
Symbol parseToken(String token) throws IllegalSymbolException
Returns the symbol for a single token.
The Symbol will be a member of the alphabet. If the token is not recognized as mapping to a symbol, an exception will be thrown.
Parameters:
token - the token to retrieve a Symbol for
Returns:
the Symbol for that token
Throws:
IllegalSymbolException - if there is no Symbol for the token
-
parseStream
StreamParser parseStream(SeqIOListener listener)
Return an object which can parse an arbitrary character stream into symbols.
Parameters:
listener - The listener which gets notified of parsed symbols.
-
tokenizeSymbol
String tokenizeSymbol(Symbol sym) throws IllegalSymbolException
Return a token representing a single symbol.
Parameters:
sym - The symbol
Throws:
IllegalSymbolException - if the symbol isn't recognized.
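The relationship between parseToken and tokenizeSymbol can be illustrated with a small, self-contained sketch. It does not use the BioJava classes: Sym, UnknownSymbolException, CharTokenization and the dna() helper are hypothetical stand-ins invented for this example, and the stand-in exception is unchecked purely to keep the code short (BioJava's IllegalSymbolException is a checked exception). The sketch only assumes a one-character-per-symbol mapping, roughly the situation the CHARACTER token type suggests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TokenizationSketch {

    /** Hypothetical stand-in for a BioJava Symbol: just a named object. */
    static final class Sym {
        final String name;
        Sym(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for IllegalSymbolException (unchecked here for brevity). */
    static final class UnknownSymbolException extends RuntimeException {
        UnknownSymbolException(String message) { super(message); }
    }

    /** Two-way mapping between symbols and single-character tokens. */
    static final class CharTokenization {
        private final Map<String, Sym> symbolsByToken = new HashMap<>();
        private final Map<Sym, String> tokensBySymbol = new HashMap<>();

        /** Register a symbol together with the character that represents it. */
        void bind(Sym sym, char c) {
            String token = String.valueOf(c);
            symbolsByToken.put(token, sym);
            tokensBySymbol.put(sym, token);
        }

        /** Token to symbol; fails if the token maps to no symbol. */
        Sym parseToken(String token) {
            Sym sym = symbolsByToken.get(token);
            if (sym == null) {
                throw new UnknownSymbolException("No symbol for token: " + token);
            }
            return sym;
        }

        /** Symbol to token; fails if the symbol was never registered. */
        String tokenizeSymbol(Sym sym) {
            String token = tokensBySymbol.get(sym);
            if (token == null) {
                throw new UnknownSymbolException("Unrecognized symbol: " + sym.name);
            }
            return token;
        }

        /** Concatenate the tokens of each symbol in order. */
        String tokenizeSymbolList(List<Sym> syms) {
            StringBuilder sb = new StringBuilder();
            for (Sym s : syms) {
                sb.append(tokenizeSymbol(s));
            }
            return sb.toString();
        }
    }

    /** Build a toy four-letter DNA tokenization (names chosen for illustration). */
    static CharTokenization dna() {
        CharTokenization tok = new CharTokenization();
        tok.bind(new Sym("adenine"), 'a');
        tok.bind(new Sym("cytosine"), 'c');
        tok.bind(new Sym("guanine"), 'g');
        tok.bind(new Sym("thymine"), 't');
        return tok;
    }

    public static void main(String[] args) {
        CharTokenization tok = dna();
        List<Sym> syms = new ArrayList<>();
        for (char c : "gatc".toCharArray()) {
            syms.add(tok.parseToken(String.valueOf(c)));
        }
        // Parsing each token and tokenizing the result reproduces the input.
        System.out.println(tok.tokenizeSymbolList(syms)); // prints "gatc"
    }
}
```

In the sketch, parsing a token and then tokenizing the resulting symbol returns the original token, and an unknown token or symbol is reported by an exception rather than a null return, mirroring the contract described for the two methods above.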
-
tokenizeSymbolList
String tokenizeSymbolList(SymbolList symList) throws IllegalAlphabetException, IllegalSymbolException
Return a string representation of a list of symbols.
Parameters:
symList - A SymbolList
Throws:
IllegalAlphabetException - if alphabets don't match
IllegalSymbolException