public class AlternateTokenization extends Unchangeable implements SymbolTokenization, Serializable
Implementation of SymbolTokenization which binds symbols to strings of characters. These tokenizations are intended to provide an alternate way of writing sequences into Strings. Therefore, they cannot be used for parsing files.
As of this release, alternate tokenizations are available for the built-in DNA alphabet (symbols are written as capital letters) and the PROTEIN-TERM alphabet (symbols are written as triplets of characters, the first one being a capital letter, as in "Glu").
By convention, instances of AlternateTokenization should have an associated token starting with the word 'alternate'.
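The binding idea described above can be illustrated with a minimal, self-contained sketch. It is not the BioJava implementation: the class `FixedWidthWriterSketch` is hypothetical, symbols are modelled as plain strings so that the example runs without the BioJava jar, and the rule that the first binding fixes the token width is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the BioJava class: symbols are plain strings
// so that the example compiles and runs on its own.
class FixedWidthWriterSketch {
    private final Map<String, String> symbolToToken = new LinkedHashMap<>();
    private int width = 0; // 0 until the first binding (assumed behaviour)

    // Bind a symbol name to its written form; all tokens must share one width.
    void bindSymbol(String symbol, String token) {
        if (symbolToToken.isEmpty()) {
            width = token.length();
        } else if (token.length() != width) {
            throw new IllegalArgumentException(
                "token '" + token + "' does not have width " + width);
        }
        symbolToToken.put(symbol, token);
    }

    int getWidth() {
        return width;
    }

    // Return the token bound to a single symbol.
    String tokenizeSymbol(String symbol) {
        String token = symbolToToken.get(symbol);
        if (token == null) {
            throw new IllegalArgumentException("no token bound to " + symbol);
        }
        return token;
    }

    // Concatenate the tokens of the symbols, in list order.
    String tokenizeSymbolList(List<String> symbols) {
        StringBuilder out = new StringBuilder();
        for (String s : symbols) {
            out.append(tokenizeSymbol(s));
        }
        return out.toString();
    }

    // Small demonstration with triplet tokens in the style of "Glu".
    static String demo() {
        FixedWidthWriterSketch t = new FixedWidthWriterSketch();
        t.bindSymbol("glutamate", "Glu");
        t.bindSymbol("alanine", "Ala");
        t.bindSymbol("termination", "Ter");
        return t.tokenizeSymbolList(
            Arrays.asList("glutamate", "alanine", "termination"));
    }
}
```

Calling `FixedWidthWriterSketch.demo()` returns the string `GluAlaTer`. The symbol names and the "Ter" token are illustrative choices, not taken from the PROTEIN-TERM alphabet.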
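Nested classes/interfaces inherited from interface SymbolTokenization: SymbolTokenization.TokenType
Nested classes/interfaces inherited from interface Annotatable: Annotatable.AnnotationForwarder
Fields inherited from interface SymbolTokenization: CHARACTER, FIXEDWIDTH, SEPARATED, UNKNOWN
Fields inherited from interface Annotatable: ANNOTATION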
Constructor and Description |
---|
AlternateTokenization(Alphabet alpha, boolean caseSensitive) |
Modifier and Type | Method and Description |
---|---|
void | bindSymbol(Symbol s, String str): Bind a Symbol to a string. |
Alphabet | getAlphabet(): The alphabet to which this tokenization applies. |
Annotation | getAnnotation(): Should return the associated annotation object. |
SymbolTokenization.TokenType | getTokenType(): Tokens have fixed size. |
int | getWidth(): Get the width of the tokens. |
StreamParser | parseStream(SeqIOListener listener): Will throw an exception. |
Symbol | parseToken(String token): Will throw an exception. |
String | tokenizeSymbol(Symbol s): Return a token representing a single symbol. |
String | tokenizeSymbolList(SymbolList sl): Return a string representation of a list of symbols. |
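The method summary says that parseToken and parseStream throw an exception, which follows from these tokenizations being meant for writing only. The sketch below shows how calling code might deal with such a write-only contract. It is a hypothetical stand-in rather than BioJava code: the class `WriteOnlyTokenizationSketch`, the helper `supportsParsing`, and the choice of `UnsupportedOperationException` are all assumptions, since this page does not say which exception type is thrown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, NOT the BioJava class: symbols are plain strings.
class WriteOnlyTokenizationSketch {
    private final Map<String, String> symbolToToken = new HashMap<>();

    void bindSymbol(String symbol, String token) {
        symbolToToken.put(symbol, token);
    }

    // Writing works for any bound symbol and fails loudly for unbound ones.
    String tokenizeSymbol(String symbol) {
        String token = symbolToToken.get(symbol);
        if (token == null) {
            throw new IllegalArgumentException("symbol not recognized: " + symbol);
        }
        return token;
    }

    // Parsing is rejected unconditionally. The exception type is an assumption.
    String parseToken(String token) {
        throw new UnsupportedOperationException(
            "this tokenization is for writing only");
    }

    // Caller-side guard (hypothetical helper): report whether parsing is usable.
    static boolean supportsParsing(WriteOnlyTokenizationSketch t, String probe) {
        try {
            t.parseToken(probe);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }
}
```

In this sketch a caller that needs to read sequences back would check `supportsParsing` and fall back to a tokenization that supports parsing.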
Methods inherited from class Unchangeable: addChangeListener, addChangeListener, addForwarder, getForwarders, getListeners, isUnchanging, removeChangeListener, removeChangeListener, removeForwarder
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Changeable: addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
public AlternateTokenization(Alphabet alpha, boolean caseSensitive)
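public Alphabet getAlphabet()
Description copied from interface SymbolTokenization: The alphabet to which this tokenization applies.
Specified by: getAlphabet in interface SymbolTokenization
public SymbolTokenization.TokenType getTokenType()
Specified by: getTokenType in interface SymbolTokenization
public Annotation getAnnotation()
Description copied from interface Annotatable: Should return the associated annotation object.
Specified by: getAnnotation in interface Annotatable
public int getWidth()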
public void bindSymbol(Symbol s, String str)
Parameters:
s - the Symbol to bind
str - the string to bind it to
public Symbol parseToken(String token) throws IllegalSymbolException
Specified by: parseToken in interface SymbolTokenization
Parameters:
token - the token to retrieve a Symbol for
Throws:
IllegalSymbolException - if there is no Symbol for the token
public String tokenizeSymbol(Symbol s) throws IllegalSymbolException
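Description copied from interface SymbolTokenization: Return a token representing a single symbol.
Specified by: tokenizeSymbol in interface SymbolTokenization
Parameters:
s - The symbol
Throws:
IllegalSymbolException - if the symbol isn't recognized.
public String tokenizeSymbolList(SymbolList sl) throws IllegalAlphabetException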
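Description copied from interface SymbolTokenization: Return a string representation of a list of symbols.
Specified by: tokenizeSymbolList in interface SymbolTokenization
Parameters:
sl - A SymbolList
Throws:
IllegalAlphabetException - if alphabets don't match
public StreamParser parseStream(SeqIOListener listener)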
Specified by: parseStream in interface SymbolTokenization
Parameters:
listener - The listener which gets notified of parsed symbols.
Copyright © 2020 BioJava. All rights reserved.