public interface RichSequenceHandler
An operation on a RichSequence could be optimized so that it is performed more efficiently than dragging the whole sequence into memory and then doing the operation.
Implementations of RichSequence should generally delegate symbolAt(int index), subStr(int start, int end), subList(int start, int end) and subSequence(int start, int end) to some implementation of this interface.

Modifier and Type | Method and Description |
---|---|
void | edit(RichSequence seq, Edit edit): Apply an edit to the Sequence as specified by the edit object. |
Iterator<Symbol> | iterator(RichSequence seq): An Iterator over all Symbols in this SymbolList. |
String | seqString(RichSequence seq): Stringify this Sequence. |
SymbolList | subList(RichSequence seq, int start, int end): Return a new SymbolList for the symbols start to end inclusive. |
String | subStr(RichSequence seq, int start, int end): Return a region of this sequence as a String. |
Symbol | symbolAt(RichSequence seq, int index): Return the symbol at index, counting from 1. |
List<Symbol> | toList(RichSequence seq): Returns a List of symbols. |
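The delegation pattern described above the method summary can be illustrated with a minimal, self-contained sketch. It does not use the BioJava classes: `SketchHandler`, `SketchSequence` and `InMemoryHandler` are hypothetical stand-ins, symbols are modelled as plain characters, and only two of the methods are shown.

```java
// Hypothetical stand-in for RichSequenceHandler, reduced to two methods.
interface SketchHandler {
    char symbolAt(SketchSequence seq, int index);
    String subStr(SketchSequence seq, int start, int end);
}

// Hypothetical stand-in for a RichSequence implementation that delegates.
final class SketchSequence {
    private final String residues;      // backing store (assumed to be in memory here)
    private final SketchHandler handler;

    SketchSequence(String residues, SketchHandler handler) {
        this.residues = residues;
        this.handler = handler;
    }

    String rawResidues() { return residues; }
    int length() { return residues.length(); }

    // The sequence passes itself to the handler instead of doing the work.
    char symbolAt(int index) { return handler.symbolAt(this, index); }
    String subStr(int start, int end) { return handler.subStr(this, start, end); }
}

// A simple handler that reads from memory; a database-backed handler
// could fetch only the requested region instead.
final class InMemoryHandler implements SketchHandler {
    @Override
    public char symbolAt(SketchSequence seq, int index) {
        if (index < 1 || index > seq.length()) {
            throw new IndexOutOfBoundsException("index " + index + " is outside 1.." + seq.length());
        }
        return seq.rawResidues().charAt(index - 1); // positions count from 1
    }

    @Override
    public String subStr(SketchSequence seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IndexOutOfBoundsException("range " + start + ".." + end + " is not valid");
        }
        return seq.rawResidues().substring(start - 1, end); // end is inclusive
    }
}
```

Because the sequence only forwards calls, swapping the handler changes how a region is retrieved without touching the sequence class itself.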
void edit(RichSequence seq, Edit edit) throws IndexOutOfBoundsException, IllegalAlphabetException, ChangeVetoException
All edits can be broken down into a series of operations that change contiguous blocks of the sequence. An Edit represents one of those operations.
When applied, this Edit will replace 'length' symbols starting at position 'pos' with the SymbolList 'replacement'. This allows insertions (length=0), deletions (replacement=SymbolList.EMPTY_LIST) and replacements (length>=1 and replacement.length()>=1).
The pos and pos+length should always be valid positions on the SymbolList to be edited. For example:
RichSequence seq = //code to initialize RichSequence
System.out.println(seq.seqString());
// delete 5 bases from position 4
Edit ed = new Edit(4, 5, SymbolList.EMPTY_LIST);
seq.edit(ed);
System.out.println(seq.seqString());
// delete one base from the start
ed = new Edit(1, 1, SymbolList.EMPTY_LIST);
seq.edit(ed);
// delete one base from the end
ed = new Edit(seq.length(), 1, SymbolList.EMPTY_LIST);
seq.edit(ed);
System.out.println(seq.seqString());
// overwrite 2 bases from position 3 with "tt"
ed = new Edit(3, 2, DNATools.createDNA("tt"));
seq.edit(ed);
System.out.println(seq.seqString());
// add 6 bases to the start
ed = new Edit(1, 0, DNATools.createDNA("aattgg"));
seq.edit(ed);
System.out.println(seq.seqString());
// add 4 bases to the end
ed = new Edit(seq.length() + 1, 0, DNATools.createDNA("tttt"));
seq.edit(ed);
System.out.println(seq.seqString());
// full edit
ed = new Edit(3, 2, DNATools.createDNA("aatagaa"));
seq.edit(ed);
System.out.println(seq.seqString());
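The replacement rule described above (replace 'length' symbols starting at 1-based position 'pos' with 'replacement') can be modelled on plain strings. This is a sketch for illustration only: `EditSketch.apply` is a hypothetical helper, not part of BioJava, and the bounds check is an assumption about what counts as a valid position.

```java
final class EditSketch {
    // Replace `length` characters starting at 1-based `pos` with `replacement`.
    // length == 0 inserts, an empty replacement deletes.
    static String apply(String seq, int pos, int length, String replacement) {
        // Assumed rule: pos may be at most seq.length() + 1 (append),
        // and the replaced block must end inside the sequence.
        if (pos < 1 || length < 0 || pos + length > seq.length() + 1) {
            throw new IndexOutOfBoundsException(
                "edit at " + pos + " of length " + length + " does not fit");
        }
        StringBuilder sb = new StringBuilder(seq);
        sb.replace(pos - 1, pos - 1 + length, replacement); // convert to 0-based, end exclusive
        return sb.toString();
    }
}
```

With this helper, `EditSketch.apply("acgt", 1, 0, "tt")` prepends, `EditSketch.apply("acgt", 5, 0, "tt")` appends, and `EditSketch.apply("acgt", 3, 2, "")` deletes the last two characters.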
edit - the Edit to perform
IndexOutOfBoundsException - if the edit does not lie within the SymbolList
IllegalAlphabetException - if the SymbolList to insert has an incompatible alphabet
ChangeVetoException - if either the SymbolList does not support the edit, or if the change was vetoed
Symbol symbolAt(RichSequence seq, int index) throws IndexOutOfBoundsException
index - the offset into this SymbolList
IndexOutOfBoundsException - if index is less than 1, or greater than the length of the symbol list
List<Symbol> toList(RichSequence seq)
This should be an immutable list of symbols or a copy.
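The two options named above, an immutable list or a copy, can be sketched with the standard collections API. `ToListSketch` is a hypothetical helper and symbols are modelled as `Character` values; a real handler would return `List<Symbol>`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ToListSketch {
    // Option 1: a read-only view; writes through the view fail,
    // but later changes to the backing list remain visible.
    static List<Character> readOnlyView(List<Character> symbols) {
        return Collections.unmodifiableList(symbols);
    }

    // Option 2: an independent copy; callers may modify it freely
    // without affecting the backing list.
    static List<Character> copy(List<Character> symbols) {
        return new ArrayList<>(symbols);
    }
}
```

Either choice keeps callers from changing the sequence through the returned list; the view avoids copying, while the copy is isolated from later changes.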
String subStr(RichSequence seq, int start, int end) throws IndexOutOfBoundsException
This should use the same rules as seqString.
start - the first symbol to include
end - the last symbol to include
IndexOutOfBoundsException - if either start or end are not within the SymbolList
SymbolList subList(RichSequence seq, int start, int end) throws IndexOutOfBoundsException
The resulting SymbolList will count from 1 to (end-start + 1) inclusive, and refer to the symbols start to end of the original sequence.
start - the first symbol of the new SymbolList
end - the last symbol (inclusive) of the new SymbolList
IndexOutOfBoundsException
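The 1-based, inclusive indexing of subList differs from `java.util.List.subList`, which is 0-based with an exclusive end. The following sketch shows the conversion on an ordinary list; `SubListSketch` is a hypothetical helper, and rejecting start > end is an assumption of the sketch rather than documented behaviour.

```java
import java.util.List;

final class SubListSketch {
    // Return the symbols start..end (1-based, both inclusive).
    static <T> List<T> subList(List<T> symbols, int start, int end) {
        if (start < 1 || end > symbols.size() || start > end) {
            throw new IndexOutOfBoundsException(
                "range " + start + ".." + end + " is outside 1.." + symbols.size());
        }
        // 1-based inclusive [start, end] maps to 0-based half-open [start - 1, end).
        return symbols.subList(start - 1, end);
    }

    // 1-based lookup, so position 1 of a sub-list is `start` of the original.
    static <T> T symbolAt(List<T> symbols, int index) {
        return symbols.get(index - 1);
    }
}
```

The result has end - start + 1 elements, and its position 1 corresponds to position start of the original list.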
String seqString(RichSequence seq)
It is expected that this will use the symbol's token to render each symbol. It should be parsable back into a SymbolList using the default token parser for this alphabet.
Iterator<Symbol> iterator(RichSequence seq)
This is an ordered iterator over the Symbols. It cannot be used to edit the underlying symbols.
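An ordered iterator that cannot edit the underlying symbols can be sketched by wrapping a list iterator and rejecting `remove()`. `ReadOnlyIteratorSketch` is a hypothetical helper, not BioJava code, and it works on any element type.

```java
import java.util.Iterator;
import java.util.List;

final class ReadOnlyIteratorSketch {
    // Wrap the list's iterator so elements come back in order
    // but cannot be removed through the iterator.
    static <T> Iterator<T> readOnly(List<T> symbols) {
        final Iterator<T> inner = symbols.iterator();
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return inner.hasNext();
            }

            @Override
            public T next() {
                return inner.next();
            }

            @Override
            public void remove() {
                // Made explicit here; this is also the default behaviour of Iterator.remove().
                throw new UnsupportedOperationException("read-only iterator");
            }
        };
    }
}
```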
Copyright © 2020 BioJava. All rights reserved.