Class ChunkedSymbolList

  • All Implemented Interfaces:
    Serializable, SymbolList, Changeable

    public class ChunkedSymbolList
    extends AbstractSymbolList
    implements Serializable
    SymbolList implementation using constant-size chunks. Each chunk provides the same number of symbols (except the last one, which may be shorter). When a request for symbols comes in, the appropriate chunk is located first, and then the symbols are extracted. This implementation is used in the IO package to avoid allocating and re-allocating memory when the total length of the symbol list is unknown. It can also be used when chunks are to be lazily fetched from some high-latency storage, by implementing a single lazy-fetch SymbolList class and populating a ChunkedSymbolList with a complete tile-path of them.
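    The chunk lookup described above can be illustrated with a minimal, self-contained sketch. It is not the BioJava implementation: the class name ChunkLookupSketch, the use of char arrays in place of Symbol objects, and the chunk size of 4 are all assumptions made for illustration. It only shows how a 1-based position might be mapped to a chunk index and an offset within that chunk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of fixed-size chunk lookup; not BioJava code. */
public class ChunkLookupSketch {
    private final List<char[]> chunks = new ArrayList<>();
    private final int chunkSize;
    private final int length;

    public ChunkLookupSketch(String symbols, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.length = symbols.length();
        // Every chunk holds chunkSize symbols, except possibly the last one.
        for (int i = 0; i < length; i += chunkSize) {
            int end = Math.min(i + chunkSize, length);
            chunks.add(symbols.substring(i, end).toCharArray());
        }
    }

    public int length() {
        return length;
    }

    public int chunkCount() {
        return chunks.size();
    }

    /** Returns the symbol at a 1-based position. */
    public char symbolAt(int pos) {
        if (pos < 1 || pos > length) {
            throw new IndexOutOfBoundsException("pos out of range: " + pos);
        }
        int zeroBased = pos - 1;
        int chunkIndex = zeroBased / chunkSize; // first locate the chunk
        int offset = zeroBased % chunkSize;     // then the symbol inside it
        return chunks.get(chunkIndex)[offset];
    }

    public static void main(String[] args) {
        ChunkLookupSketch list = new ChunkLookupSketch("atcaaaaacgctagc", 4);
        System.out.println(list.chunkCount()); // prints 4
        System.out.println(list.symbolAt(9));  // prints c
    }
}
```

    Because every chunk except the last has the same size, the lookup is a single division and remainder, with no search over chunk boundaries.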
    Author:
    David Huen, Matthew Pocock, George Waldon
    See Also:
    Serialized Form
    • Method Detail

      • getAlphabet

        public Alphabet getAlphabet()
        Description copied from interface: SymbolList
        The alphabet that this SymbolList is over.

        Every symbol within this SymbolList is a member of this alphabet. alphabet.contains(symbol) == true for each symbol that is within this sequence.

        Specified by:
        getAlphabet in interface SymbolList
        Returns:
        the alphabet
      • length

        public int length()
        Description copied from interface: SymbolList
        The number of symbols in this SymbolList.
        Specified by:
        length in interface SymbolList
        Returns:
        the length
      • symbolAt

        public Symbol symbolAt(int pos)
        Description copied from interface: SymbolList
        Return the symbol at index, counting from 1.
        Specified by:
        symbolAt in interface SymbolList
        Parameters:
        pos - the offset into this SymbolList
        Returns:
        the Symbol at that index
      • subList

        public SymbolList subList(int start,
                                  int end)
        Description copied from interface: SymbolList
        Return a new SymbolList for the symbols start to end inclusive.

        The resulting SymbolList will count from 1 to (end-start + 1) inclusive, and refer to the symbols start to end of the original sequence.
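        The coordinate mapping described above can be written out as a small sketch. The class and method names below are hypothetical and do not exist in BioJava. The sketch only restates the arithmetic: position i of the sublist corresponds to position start + i - 1 of the original list.

```java
/** Hypothetical sketch of subList coordinate mapping; not BioJava code. */
public class SubListMappingSketch {

    /** Maps a 1-based sublist position back to the 1-based parent position. */
    static int toParentPosition(int start, int end, int subPos) {
        int subLength = end - start + 1; // both ends are inclusive
        if (subPos < 1 || subPos > subLength) {
            throw new IndexOutOfBoundsException("subPos out of range: " + subPos);
        }
        return start + subPos - 1;
    }

    public static void main(String[] args) {
        // A sublist taken with start=4 and end=8 has length 5.
        System.out.println(toParentPosition(4, 8, 1)); // prints 4
        System.out.println(toParentPosition(4, 8, 5)); // prints 8
    }
}
```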

        Specified by:
        subList in interface SymbolList
        Overrides:
        subList in class AbstractSymbolList
        Parameters:
        start - the first symbol of the new SymbolList
        end - the last symbol (inclusive) of the new SymbolList
      • edit

        public void edit(Edit edit)
                  throws IllegalAlphabetException,
                         ChangeVetoException
        Description copied from interface: SymbolList
        Apply an edit to the SymbolList as specified by the edit object.

        Description

        All edits can be broken down into a series of operations that change contiguous blocks of the sequence. This represents one of those operations.

        When applied, this Edit will replace 'length' symbols starting at position 'pos' with the SymbolList 'replacement'. This allows insertions (length=0), deletions (replacement=SymbolList.EMPTY_LIST) and replacements (length>=1 and replacement.length()>=1).

        The pos and pos+length should always be valid positions on the SymbolList to be edited (between 0 and symL.length()+1).

        • To append to a sequence, set pos=symL.length()+1 and length=0.
        • To insert something at the beginning of the sequence, set pos=1 and length=0.

        Examples

         SymbolList seq = DNATools.createDNA("atcaaaaacgctagc");
         System.out.println(seq.seqString());
        
         // delete 5 bases from position 4
         Edit ed = new Edit(4, 5, SymbolList.EMPTY_LIST);
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // delete one base from the start
         ed = new Edit(1, 1, SymbolList.EMPTY_LIST);
         seq.edit(ed);
        
         // delete one base from the end
         ed = new Edit(seq.length(), 1, SymbolList.EMPTY_LIST);
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // overwrite 2 bases from position 3 with "tt"
         ed = new Edit(3, 2, DNATools.createDNA("tt"));
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // add 6 bases to the start
         ed = new Edit(1, 0, DNATools.createDNA("aattgg"));
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // add 4 bases to the end
         ed = new Edit(seq.length() + 1, 0, DNATools.createDNA("tttt"));
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // full edit: replace 2 bases from position 3 with 7 bases
         ed = new Edit(3, 2, DNATools.createDNA("aatagaa"));
         seq.edit(ed);
         System.out.println(seq.seqString());
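
        To follow the positions in the examples without a BioJava installation, the same sequence of edits can be traced on a plain String. This is only a model of the semantics described above, not the library's code: the class EditModelSketch and the method applyEdit are hypothetical, and the model assumes that pos is 1-based and that pos+length-1 must not exceed the current length.

```java
/** Hypothetical String-based model of the Edit semantics; not BioJava code. */
public class EditModelSketch {

    /** Replaces 'length' characters starting at 1-based 'pos' with 'replacement'. */
    static String applyEdit(String seq, int pos, int length, String replacement) {
        if (pos < 1 || pos > seq.length() + 1) {
            throw new IndexOutOfBoundsException("pos out of range: " + pos);
        }
        if (length < 0 || pos - 1 + length > seq.length()) {
            throw new IndexOutOfBoundsException("length out of range: " + length);
        }
        return seq.substring(0, pos - 1) + replacement + seq.substring(pos - 1 + length);
    }

    public static void main(String[] args) {
        String seq = "atcaaaaacgctagc";
        seq = applyEdit(seq, 4, 5, "");                   // delete 5 bases from position 4
        System.out.println(seq);                          // prints atccgctagc
        seq = applyEdit(seq, 1, 1, "");                   // delete one base from the start
        seq = applyEdit(seq, seq.length(), 1, "");        // delete one base from the end
        System.out.println(seq);                          // prints tccgctag
        seq = applyEdit(seq, 3, 2, "tt");                 // overwrite 2 bases from position 3
        System.out.println(seq);                          // prints tcttctag
        seq = applyEdit(seq, 1, 0, "aattgg");             // add 6 bases to the start
        System.out.println(seq);                          // prints aattggtcttctag
        seq = applyEdit(seq, seq.length() + 1, 0, "tttt"); // add 4 bases to the end
        System.out.println(seq);                          // prints aattggtcttctagtttt
        seq = applyEdit(seq, 3, 2, "aatagaa");            // replace 2 bases with 7 bases
        System.out.println(seq);                          // prints aaaatagaaggtcttctagtttt
    }
}
```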
         
        Specified by:
        edit in interface SymbolList
        Overrides:
        edit in class AbstractSymbolList
        Parameters:
        edit - the Edit to perform
        Throws:
        IllegalAlphabetException - if the SymbolList to insert has an incompatible alphabet
        ChangeVetoException - if either the SymbolList does not support the edit, or if the change was vetoed