Interface Symbol
-
- All Superinterfaces:
Annotatable, Changeable
- All Known Subinterfaces:
AtomicSymbol, BasisSymbol, DotState, EmissionState, ModelInState, State
- All Known Implementing Classes:
AbstractSymbol, DoubleAlphabet.DoubleRange, DoubleAlphabet.DoubleSymbol, FundamentalAtomicSymbol, IntegerAlphabet.IntegerSymbol, MagicalState, ProfileEmissionState, SimpleAtomicSymbol, SimpleDotState, SimpleEmissionState, SimpleModelInState
public interface Symbol extends Annotatable
A single symbol. This is the atomic unit of a SymbolList, or a sequence. It allows for fine-grained flyweighting, so that there can be one instance of each symbol that is referenced multiple times.
Symbols from finite alphabets can be compared using the == operator. Symbols from infinite alphabets may have some specific API to test for equality, but should really override the equals() method.
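The difference between the two comparison styles can be sketched with a toy model. The types below (FlyweightSketch, FiniteSymbol, RealSymbol) are hypothetical and exist only for illustration; they are not BioJava classes. The sketch assumes that a finite alphabet hands out one shared instance per token, while an infinite alphabet creates instances on demand.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy types for illustration; these are not BioJava classes.
final class FlyweightSketch {

    /** A symbol from a finite alphabet: one shared instance per token. */
    static final class FiniteSymbol {
        private static final Map<Character, FiniteSymbol> POOL = new HashMap<>();
        final char token;

        private FiniteSymbol(char token) {
            this.token = token;
        }

        /** Returns the single shared instance for the token, creating it on first use. */
        static synchronized FiniteSymbol of(char token) {
            return POOL.computeIfAbsent(token, t -> new FiniteSymbol(t));
        }
    }

    /** A symbol from an infinite alphabet: created on demand, so equals() is overridden. */
    static final class RealSymbol {
        final double value;

        RealSymbol(double value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof RealSymbol
                    && Double.compare(((RealSymbol) o).value, value) == 0;
        }

        @Override
        public int hashCode() {
            return Double.hashCode(value);
        }
    }

    public static void main(String[] args) {
        // Flyweighted symbols: identity comparison is enough.
        System.out.println(FiniteSymbol.of('a') == FiniteSymbol.of('a'));     // true
        // On-demand symbols: identity fails, equals() succeeds.
        System.out.println(new RealSymbol(1.5) == new RealSymbol(1.5));       // false
        System.out.println(new RealSymbol(1.5).equals(new RealSymbol(1.5)));  // true
    }
}
```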
Some symbols represent a single token in the sequence. For example, there is a Symbol instance for adenine in DNA, and another one for cytosine. Symbols can potentially represent sets of Symbols. For example, n represents any DNA Symbol, and X any protein Symbol. Gap represents the knowledge that there is no Symbol. In addition, some symbols represent ordered lists of other Symbols. For example, the codon agt can be represented by a single Symbol from the Alphabet DNAxDNAxDNA. Symbols can represent ambiguity over these complex symbols. For example, you could construct a Symbol instance that represents the codons atn. This matches the codons {ata, att, atg, atc}. It is also possible to build a Symbol instance that represents all stop codons {taa, tag, tga}, which cannot be represented in terms of a single ambiguous n-tuple.
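The two codon examples can be checked with a small sketch. It models an ambiguous tuple as one string of allowed bases per position and expands it into the cross product of those positions. The class and method names (CodonExpansionSketch, expand, isSingleTuple) are hypothetical helpers and not part of the BioJava API. The brute-force search tries every combination of non-empty base sets per position to see whether any single tuple matches exactly the target set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the BioJava API.
final class CodonExpansionSketch {

    /** Expands one string of allowed bases per position into every concrete string. */
    static Set<String> expand(List<String> allowedPerPosition) {
        Set<String> results = new TreeSet<>();
        results.add("");
        for (String allowed : allowedPerPosition) {
            Set<String> next = new TreeSet<>();
            for (String prefix : results) {
                for (char base : allowed.toCharArray()) {
                    next.add(prefix + base);
                }
            }
            results = next;
        }
        return results;
    }

    /** Lists every non-empty subset of the given bases, each as a string. */
    static List<String> nonEmptySubsets(String bases) {
        List<String> out = new ArrayList<>();
        for (int mask = 1; mask < (1 << bases.length()); mask++) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < bases.length(); i++) {
                if ((mask & (1 << i)) != 0) {
                    sb.append(bases.charAt(i));
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    /** True if some single 3-tuple of per-position base sets expands to exactly the target. */
    static boolean isSingleTuple(Set<String> target) {
        List<String> subsets = nonEmptySubsets("acgt");
        for (String p1 : subsets) {
            for (String p2 : subsets) {
                for (String p3 : subsets) {
                    if (expand(Arrays.asList(p1, p2, p3)).equals(target)) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // atn: 'a', then 't', then any of the four bases.
        System.out.println(expand(Arrays.asList("a", "t", "acgt"))); // [ata, atc, atg, att]
        // The stop codons are not the expansion of any single tuple.
        Set<String> stops = new TreeSet<>(Arrays.asList("taa", "tag", "tga"));
        System.out.println(isSingleTuple(stops)); // false
    }
}
```

The closest single tuple to the stop codons is t, {a, g}, {a, g}, which also matches tgg, so a symbol for exactly the three stop codons has to be defined as a set rather than as a tuple.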
There are three Symbol interfaces. Symbol is the most generic. It has the methods getToken and getName so that the Symbol can be textually represented. In addition, it defines getMatches, which returns an Alphabet over all the AtomicSymbol instances that match the Symbol (N would return an Alphabet containing {A, G, C, T}, and Gap would return {}).
BasisSymbol instances can always be represented by an n-tuple of BasisSymbol instances. BasisSymbol adds the method getSymbols so that you can retrieve this list. For example, the tuple [ant] is a BasisSymbol, as it is uniquely specified by the three BasisSymbol instances a, n and t. n is a BasisSymbol instance as it is uniquely represented by itself.
AtomicSymbol instances specialize BasisSymbol by guaranteeing that getMatches returns a set containing only that instance. That is, they are indivisible. The DNA nucleotides are instances of AtomicSymbol, as are individual codons. The stop codon {tag} will have a getMatches method that returns {tag}, a getBases method that also returns {tag} and a getSymbols method that returns the List [t, a, g]. {tna} is a BasisSymbol but not an AtomicSymbol, as it matches four AtomicSymbol instances {taa, tga, tca, tta}. It follows that each symbol in getSymbols for an AtomicSymbol instance is itself an AtomicSymbol instance.
- Author:
- Matthew Pocock
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.biojava.bio.Annotatable
Annotatable.AnnotationForwarder
-
-
Field Summary
-
Fields inherited from interface org.biojava.bio.Annotatable
ANNOTATION
-
-
Method Summary
Modifier and Type    Method          Description
Alphabet             getMatches()    The alphabet containing the symbols matched by this ambiguity symbol.
String               getName()       The long name for the symbol.
-
Methods inherited from interface org.biojava.bio.Annotatable
getAnnotation
-
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
-
-
Method Detail
-
getMatches
Alphabet getMatches()
The alphabet containing the symbols matched by this ambiguity symbol. This alphabet contains all of, and only, the symbols matched by this symbol. For example, the alphabet returned for the symbol representing the DNA ambiguity code W would contain the symbols for A and T from the DNA alphabet.
- Returns:
- the Alphabet of symbols matched by this symbol
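As a rough illustration of the contract, the sketch below maps a few DNA codes to the bases they match, following the standard IUPAC nucleotide codes. AmbiguitySketch and its matches method are hypothetical and do not use the BioJava API. Representing the matched set as a string of base letters and the gap as '-' are both assumptions made to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table for illustration; not part of the BioJava API.
final class AmbiguitySketch {
    private static final Map<Character, String> MATCHES = new HashMap<>();

    static {
        // Unambiguous bases match only themselves.
        MATCHES.put('a', "a");
        MATCHES.put('c', "c");
        MATCHES.put('g', "g");
        MATCHES.put('t', "t");
        // A few IUPAC ambiguity codes.
        MATCHES.put('w', "at");   // weak: A or T
        MATCHES.put('s', "cg");   // strong: C or G
        MATCHES.put('n', "acgt"); // any base
        // Assumption: '-' stands for the gap, which matches nothing.
        MATCHES.put('-', "");
    }

    /** Returns the bases matched by the code, as a string of base letters. */
    static String matches(char code) {
        String matched = MATCHES.get(Character.toLowerCase(code));
        if (matched == null) {
            throw new IllegalArgumentException("Unknown code: " + code);
        }
        return matched;
    }

    public static void main(String[] args) {
        System.out.println(matches('W'));           // at
        System.out.println(matches('n'));           // acgt
        System.out.println(matches('-').isEmpty()); // true
    }
}
```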
-
-