Package org.biojava.bio.symbol
Class SuffixTree
- java.lang.Object
-
- org.biojava.bio.symbol.SuffixTree
-
- All Implemented Interfaces:
Serializable
public class SuffixTree extends Object implements Serializable
Suffix tree implementation. The interface is a bit strange, as it needed to be as space-efficient as possible. More work could be done on the space issue.A suffix tree is an efficient method for encoding the frequencies of motifs in a sequence. They are sometimes used to quickly screen for similar sequences. For instance, all motifs of length up to 2 in the sequence
AAGT
could be encoded as:root(4) | A(2)--------G(1)-----T(1) | | A(1)--G(1) T(1)
A possible method of comparing SuffixTrees is provided as a kernel function as
org.biojava.stats.svm.tools.SuffixTreeKernel
.- Author:
- Matthew Pocock, Thomas Down (documentation and other updates)
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
SuffixTree.SuffixNode
A node in the suffix tree.
-
Constructor Summary
Constructors Constructor Description SuffixTree(FiniteAlphabet alphabet)
Construct a new SuffixTree to contain motifs over the specified alphabet.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description void
addSymbols(SymbolList sList, int window)
Add a count for all motifs with length of up towindow
to this tree.int
frequency(int length)
Return the number of motifs of a given length encoded in this SuffixTree.FiniteAlphabet
getAlphabet()
Return the Alphabet containing all Symbols which might be found in this SuffixTree.SuffixTree.SuffixNode
getChild(SuffixTree.SuffixNode node, int i)
Get the n'th child of a node.SuffixTree.SuffixNode
getChild(SuffixTree.SuffixNode node, Symbol s)
Get a child of a SuffixTree.SuffixNode, constructing a new one if need be.SuffixTree.SuffixNode
getRoot()
Return the node object which is the root of this suffix tree.protected void
incCounts(int i, int c)
int
maxLength()
Return the length of the longest motif represented in this SuffixTree
-
-
-
Constructor Detail
-
SuffixTree
public SuffixTree(FiniteAlphabet alphabet)
Construct a new SuffixTree to contain motifs over the specified alphabet.- Parameters:
alphabet
- The alphabet of this SuffixTree (must be finite).
-
-
Method Detail
-
getAlphabet
public FiniteAlphabet getAlphabet()
Return the Alphabet containing all Symbols which might be found in this SuffixTree.
-
getRoot
public SuffixTree.SuffixNode getRoot()
Return the node object which is the root of this suffix tree. This represents the set of all motifs found in this tree.
-
getChild
public SuffixTree.SuffixNode getChild(SuffixTree.SuffixNode node, Symbol s) throws IllegalSymbolException
Get a child of a SuffixTree.SuffixNode, constructing a new one if need be. This method is here due to memory optimisations.- Throws:
IllegalSymbolException
-
getChild
public SuffixTree.SuffixNode getChild(SuffixTree.SuffixNode node, int i)
Get the n'th child of a node.
-
addSymbols
public void addSymbols(SymbolList sList, int window) throws IllegalSymbolException
Add a count for all motifs with length of up towindow
to this tree.- Parameters:
sList
- a SymbolList whose motifs should be added to the tree.window
- The maximum motif length to count.- Throws:
IllegalSymbolException
-
incCounts
protected void incCounts(int i, int c)
-
maxLength
public int maxLength()
Return the length of the longest motif represented in this SuffixTree
-
frequency
public int frequency(int length)
Return the number of motifs of a given length encoded in this SuffixTree.
-
-