Class PackedSymbolList

  • All Implemented Interfaces:
    Serializable, SymbolList, Changeable

    public class PackedSymbolList
    extends AbstractSymbolList
    implements Serializable

    A SymbolList that stores symbols as bit-patterns in an array of longs.

    Bit-packed symbol lists are space-efficient compared with the pointer-per-symbol storage model used by implementations such as SimpleSymbolList. The cost is that symbols must be encoded when they are stored and decoded when they are read. In practice, for large sequences the smaller memory footprint often makes applications run faster, because effects such as page swapping are reduced.
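
    As a rough illustration of the saving, the sketch below compares one reference per symbol with a bit-packed layout. The reference size and the bits per symbol are inputs you supply; the figures quoted afterwards assume 4-byte references and 2 bits per symbol. Object headers and the Symbol objects themselves are ignored. The class and method names are hypothetical and are not part of this API.

```java
/** Back-of-envelope memory comparison; an illustration, not BioJava code. */
final class MemoryEstimateSketch {

    /** Bytes used by an array holding one reference per symbol (array header ignored). */
    static long referenceBytes(long symbols, int bytesPerReference) {
        return symbols * bytesPerReference;
    }

    /** Bytes used by a long[] holding the same symbols at bitsPerSymbol bits each. */
    static long packedBytes(long symbols, int bitsPerSymbol) {
        long symbolsPerLong = Long.SIZE / bitsPerSymbol;               // assumes symbols never straddle two longs
        long words = (symbols + symbolsPerLong - 1) / symbolsPerLong;  // round up to whole longs
        return words * Long.BYTES;
    }

    public static void main(String[] args) {
        long n = 100_000_000L;
        System.out.println("references: " + referenceBytes(n, 4) + " bytes");
        System.out.println("packed:     " + packedBytes(n, 2) + " bytes");
    }
}
```

    Under these assumptions, 100 million symbols need 400,000,000 bytes of references but 25,000,000 bytes of packed words.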

    Symbols can be mapped to and from bit-patterns; the Packing interface encapsulates this mapping. A SymbolList can then be stored by writing these bit-patterns into memory. This implementation stores the bits in the long elements of an array. The first symbol is packed into bits 0 through packing.wordLength()-1 of the long at index 0.
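
    The layout described above can be sketched without any library code. This is a minimal illustration, not the class's actual implementation: the 2-bit word length, the a/c/g/t code table, and the rule that later symbols follow at increasing bit offsets without straddling two longs are all assumptions made for the example. A real Packing defines its own codes.

```java
/** Illustrative 2-bit packing of DNA letters into a long[]; not BioJava code. */
final class BitPackingSketch {
    static final int WORD_LENGTH = 2;                      // bits per symbol (assumed)
    static final int PER_LONG = Long.SIZE / WORD_LENGTH;   // 32 symbols fit in one long
    static final long MASK = (1L << WORD_LENGTH) - 1;      // selects one symbol's bits
    static final String LETTERS = "acgt";                  // hypothetical codes: a=0, c=1, g=2, t=3

    /** Packs a string of a/c/g/t letters; the first letter lands in the lowest bits of word 0. */
    static long[] pack(String seq) {
        long[] words = new long[(seq.length() + PER_LONG - 1) / PER_LONG];
        for (int i = 0; i < seq.length(); i++) {
            long code = LETTERS.indexOf(seq.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException("Not in alphabet: " + seq.charAt(i));
            }
            // Word index is i / PER_LONG; bit offset inside that word is (i % PER_LONG) * WORD_LENGTH.
            words[i / PER_LONG] |= code << ((i % PER_LONG) * WORD_LENGTH);
        }
        return words;
    }

    /** Reads the letter at a 1-based position, mirroring how symbolAt counts. */
    static char letterAt(long[] words, int position) {
        int i = position - 1;
        long code = (words[i / PER_LONG] >>> ((i % PER_LONG) * WORD_LENGTH)) & MASK;
        return LETTERS.charAt((int) code);
    }

    public static void main(String[] args) {
        long[] words = pack("gattaca");
        StringBuilder decoded = new StringBuilder();
        for (int pos = 1; pos <= 7; pos++) {
            decoded.append(letterAt(words, pos));
        }
        System.out.println(decoded);
    }
}
```

    Decoding costs a shift and a mask on each access, which is the encoding/decoding overhead mentioned above.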

    Example Usage

     SymbolList symL = ...;
     SymbolList packed = new PackedSymbolList(
       PackingFactory.getPacking(symL.getAlphabet(), true),
       symL
     );
     
    Author:
    Matthew Pocock, David Huen (new constructor for Symbol arrays and some speedups)
    See Also:
    Serialized Form
    • Constructor Detail

      • PackedSymbolList

        public PackedSymbolList(Packing packing,
                                long[] syms,
                                int length)

        Create a new PackedSymbolList directly from a bit pattern.

        Warning: This is a risky developer method. You must be sure that the syms array is packed in a way that is consistent with the given packing. It is also your responsibility to ensure that length is sensible, for example that it does not exceed the number of symbols syms can hold.

        Parameters:
        packing - the Packing used
        syms - a long array containing already packed symbols
        length - the number of symbols packed in syms
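
        Because this constructor trusts its arguments, a caller may want to check them first. The sketch below is one hedged way to do so; it assumes each long holds a whole number of symbols (64 divided by the word length, rounded down), and the helper names are hypothetical rather than part of this API.

```java
/** Pre-flight checks for an already packed long[]; an illustration, not BioJava code. */
final class PackedLengthCheck {

    /** Number of longs needed to hold `length` symbols of `wordLength` bits each. */
    static int wordsNeeded(int length, int wordLength) {
        int perLong = Long.SIZE / wordLength;
        // Compute in long arithmetic so a very large length cannot overflow int.
        return (int) (((long) length + perLong - 1) / perLong);
    }

    /** Throws if the array is too short for the claimed number of symbols. */
    static void requireConsistent(long[] syms, int length, int wordLength) {
        if (wordLength < 1 || wordLength > Long.SIZE) {
            throw new IllegalArgumentException("Word length out of range: " + wordLength);
        }
        if (length < 0) {
            throw new IllegalArgumentException("Negative length: " + length);
        }
        int needed = wordsNeeded(length, wordLength);
        if (syms.length < needed) {
            throw new IllegalArgumentException(
                length + " symbols need " + needed + " longs, but only " + syms.length + " were supplied");
        }
    }
}
```

        A check like this catches only size mismatches; it cannot tell whether the bits were written with the same packing.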
      • PackedSymbolList

        public PackedSymbolList(Packing packing,
                                SymbolList symList)
                         throws IllegalAlphabetException

        Create a new PackedSymbolList as a packed copy of another symbol list.

        This will create a new and independent symbol list that is a copy of the symbols in symList. Both lists can be modified independently.

        Parameters:
        packing - the way to bit-pack symbols
        symList - the SymbolList to copy
        Throws:
        IllegalAlphabetException
    • Method Detail

      • getAlphabet

        public Alphabet getAlphabet()
        Description copied from interface: SymbolList
        The alphabet that this SymbolList is over.

        Every symbol within this SymbolList is a member of this alphabet. alphabet.contains(symbol) == true for each symbol that is within this sequence.

        Specified by:
        getAlphabet in interface SymbolList
        Returns:
        the alphabet
      • length

        public int length()
        Description copied from interface: SymbolList
        The number of symbols in this SymbolList.
        Specified by:
        length in interface SymbolList
        Returns:
        the length
      • symbolAt

        public Symbol symbolAt(int indx)
        Description copied from interface: SymbolList
        Return the symbol at index, counting from 1.
        Specified by:
        symbolAt in interface SymbolList
        Parameters:
        indx - the offset into this SymbolList
        Returns:
        the Symbol at that index
      • getSyms

        public long[] getSyms()

        Return the long array within which the symbols are bit-packed.

        Warning: This is a risky developer method. This is the actual array that this object uses to store the bits representing symbols. You should not modify it in any way; if you do, you will change the symbols returned by symbolAt(). This method is provided primarily as an easy way for developers to extract the bit pattern for storage, so that it can be fetched later and fed into the appropriate constructor.

        Returns:
        the actual long array used to store bit-packed symbols
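
        The storage round trip mentioned above could look roughly like the following sketch, which writes the packed words and the symbol count to a byte array and reads them back. The byte layout (length, word count, then the words) and the class names are assumptions made for this example. The restored array and length would then be passed, together with the same Packing, to the PackedSymbolList(Packing, long[], int) constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Saves and restores a packed long[] plus its symbol count; an illustration, not BioJava code. */
final class PackedStorageSketch {

    /** What the (Packing, long[], int) constructor needs besides the packing itself. */
    static final class Stored {
        final long[] syms;
        final int length;

        Stored(long[] syms, int length) {
            this.syms = syms;
            this.length = length;
        }
    }

    /** Reads the array without modifying it and serialises it in a simple, assumed layout. */
    static byte[] toBytes(long[] syms, int length) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(length);        // number of symbols
            out.writeInt(syms.length);   // number of packed words
            for (long word : syms) {
                out.writeLong(word);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds a fresh array, so the restored copy shares nothing with the original. */
    static Stored fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int length = in.readInt();
            long[] syms = new long[in.readInt()];
            for (int i = 0; i < syms.length; i++) {
                syms[i] = in.readLong();
            }
            return new Stored(syms, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        The packing is not stored in this layout, so the caller has to record which Packing produced the bits.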