public class MotifTools extends Object
MotifTools contains utility methods for sequence motifs.

| Constructor and Description |
|---|
| MotifTools() |

| Modifier and Type | Method and Description |
|---|---|
| static String | createRegex(SymbolList motif) createRegex creates a regular expression which matches the SymbolList. |

public MotifTools()
public static String createRegex(SymbolList motif)
createRegex creates a regular expression which
matches the SymbolList. Ambiguous
Symbols are simply transformed into character
classes. For example, the nucleotide sequence "AAGCTT" becomes
"A{2}GCT{2}" (repeated symbols are written as counted
repetitions) and "CTNNG" is expanded to
"CT[ABCDGHKMNRSTVWY]{2}G". The character class is generated
by calling the getMatches method of an ambiguity symbol
to obtain the alphabet of AtomicSymbols it
matches, calling getAllSymbols on this
alphabet, removing any gap symbols and then tokenizing
the remainder. The tokens in a character class are sorted
in ascending order of their character values, as determined
by Arrays.sort(char []).
The Alphabet of the SymbolList
must be finite and must have a character token type. Regular
expressions may be generated for any such
SymbolList, not just DNA, RNA and protein.
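The transformation described above can be sketched without BioJava, using only the standard library. This is an illustration of the idea, not the library's implementation: the class MotifRegexSketch, the method toRegex and the AMBIGUITY map are hypothetical names, motifs are plain strings rather than SymbolLists, and the hand-written map stands in for the getMatches and getAllSymbols lookup. Its single entry for N copies the character class from the "CTNNG" example.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative stand-in, not BioJava code: motifs are plain strings here.
class MotifRegexSketch {

    // Hypothetical lookup table: ambiguity token -> tokens it should match.
    // The entry for 'N' is copied from the "CTNNG" example in the text.
    static final Map<Character, String> AMBIGUITY = Map.of('N', "ABCDGHKMNRSTVWY");

    static String toRegex(String motif, Map<Character, String> ambiguity) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < motif.length()) {
            char token = motif.charAt(i);
            // Measure the run of identical tokens starting at position i.
            int run = 1;
            while (i + run < motif.length() && motif.charAt(i + run) == token) {
                run++;
            }
            String expansion = ambiguity.get(token);
            if (expansion == null) {
                // Unambiguous token: emitted literally (no regex escaping here).
                regex.append(token);
            } else {
                // Ambiguous token: character class, sorted by char value.
                char[] members = expansion.toCharArray();
                Arrays.sort(members);
                regex.append('[').append(members).append(']');
            }
            // A run longer than one becomes a counted repetition.
            if (run > 1) {
                regex.append('{').append(run).append('}');
            }
            i += run;
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String plain = toRegex("AAGCTT", AMBIGUITY);
        String gapped = toRegex("CTNNG", AMBIGUITY);
        System.out.println(plain);   // A{2}GCT{2}
        System.out.println(gapped);  // CT[ABCDGHKMNRSTVWY]{2}G
        // The resulting string can be compiled with java.util.regex.
        System.out.println(Pattern.compile(gapped).matcher("TTCTACGAA").find()); // true
    }
}
```

The sketch assumes single-character tokens that are not regex metacharacters, which holds for the nucleotide tokens used here. An alphabet with other tokens would need escaping, and the sketch makes no claim about how createRegex handles that case.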
Parameters: motif - a SymbolList.
Returns: a String regular expression.
Copyright © 2020 BioJava. All rights reserved.