Class SimpleSequence

  • All Implemented Interfaces:
    Serializable, Annotatable, FeatureHolder, RealizingFeatureHolder, Sequence, SymbolList, Changeable
    Direct Known Subclasses:
    PhredSequence, RevCompSequence

    public class SimpleSequence
    extends AbstractChangeable
    implements Sequence, RealizingFeatureHolder, Serializable
    A basic implementation of the Sequence interface.

    This class now implements all methods in the SymbolList interface by delegating to another SymbolList object. This avoids unnecessary copying, but means that any changes in the underlying SymbolList will be silently reflected in the SimpleSequence. In general, SimpleSequences should only be constructed from SymbolLists which are known to be immutable.
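The consequence of delegating without copying can be illustrated with plain JDK collections. The following is a minimal sketch, not BioJava code: `DelegationSketch`, `wrap` and `snapshot` are hypothetical names, and a `List<Character>` stands in for a SymbolList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in: a List<Character> plays the role of a SymbolList.
class DelegationSketch {

    // Wraps the backing list without copying; reads are delegated to it.
    static List<Character> wrap(List<Character> backing) {
        return Collections.unmodifiableList(backing);
    }

    // Copies the backing list first, so later changes to it are not seen.
    static List<Character> snapshot(List<Character> backing) {
        return Collections.unmodifiableList(new ArrayList<>(backing));
    }

    public static void main(String[] args) {
        List<Character> backing = new ArrayList<>(Arrays.asList('a', 'c', 'g', 't'));
        List<Character> view = wrap(backing);
        List<Character> copy = snapshot(backing);

        backing.set(0, 't'); // mutate the underlying data after wrapping

        System.out.println(view); // the delegating view reflects the change
        System.out.println(copy); // the snapshot still holds the old content
    }
}
```

A delegating wrapper is cheap but only safe when the wrapped object never changes, which is why immutable SymbolLists are recommended above.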

    By default, features attached to a SimpleSequence are realized as simple in-memory implementations using SimpleFeatureRealizer.DEFAULT. If you need alternative feature realization behaviour, any FeatureRealizer implementation may be supplied at construction-time.

    More functionality and better persistence to BioSQL are offered by SimpleRichSequence.
    Author:
    Matthew Pocock, Thomas Down, Mark Schreiber
    See Also:
    Serialized Form
    • Constructor Detail

      • SimpleSequence

        public SimpleSequence​(SymbolList sym,
                              String urn,
                              String name,
                              Annotation annotation)
        Create a SimpleSequence with the symbols and alphabet of sym, and the sequence properties listed.
        Parameters:
        sym - the SymbolList to wrap as a sequence
        urn - the URN
        name - the name - should be unique if practical
        annotation - the annotation object to use or null
      • SimpleSequence

        public SimpleSequence​(SymbolList sym,
                              String urn,
                              String name,
                              Annotation annotation,
                              FeatureRealizer realizer)
        Create a SimpleSequence using a specified FeatureRealizer.
        Parameters:
        sym - the SymbolList to wrap as a sequence
        urn - the URN
        name - the name - should be unique if practical
        annotation - the annotation object to use or null
        realizer - the FeatureRealizer implementation to use when adding features
    • Method Detail

      • countFeatures

        public int countFeatures()
        Description copied from interface: FeatureHolder
        Count how many features are contained.
        Specified by:
        countFeatures in interface FeatureHolder
        Returns:
        a positive integer or zero, equal to the number of features contained
      • edit

        public void edit​(Edit edit)
                  throws ChangeVetoException
        Description copied from interface: SymbolList
        Apply an edit to the SymbolList as specified by the edit object.

        Description

        All edits can be broken down into a series of operations that change contiguous blocks of the sequence. An Edit represents one of those operations.

        When applied, this Edit will replace 'length' symbols starting at position 'pos' with the SymbolList 'replacement'. This allows insertions (length=0), deletions (replacement=SymbolList.EMPTY_LIST) and replacements (length>=1 and replacement.length()>=1).

        The pos and pos+length should always be valid positions on the SymbolList to be edited (between 0 and symL.length()+1).

        • To append to a sequence, set pos=symL.length()+1 and length=0.
        • To insert something at the beginning of the sequence, set pos=1 and length=0.

        Examples

         SymbolList seq = DNATools.createDNA("atcaaaaacgctagc");
         System.out.println(seq.seqString());
        
         // delete 5 bases from position 4
         Edit ed = new Edit(4, 5, SymbolList.EMPTY_LIST);
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // delete one base from the start
         ed = new Edit(1, 1, SymbolList.EMPTY_LIST);
         seq.edit(ed);
        
         // delete one base from the end
         ed = new Edit(seq.length(), 1, SymbolList.EMPTY_LIST);
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // overwrite 2 bases from position 3 with "tt"
         ed = new Edit(3, 2, DNATools.createDNA("tt"));
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // add 6 bases to the start
         ed = new Edit(1, 0, DNATools.createDNA("aattgg"));
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // add 4 bases to the end
         ed = new Edit(seq.length() + 1, 0, DNATools.createDNA("tttt"));
         seq.edit(ed);
         System.out.println(seq.seqString());
        
         // full edit
         ed = new Edit(3, 2, DNATools.createDNA("aatagaa"));
         seq.edit(ed);
         System.out.println(seq.seqString());
         
        Specified by:
        edit in interface SymbolList
        Parameters:
        edit - the Edit to perform
        Throws:
        ChangeVetoException - if either the SymbolList does not support the edit, or if the change was vetoed
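The position arithmetic behind an edit can be checked without BioJava. The sketch below models the documented rule (replace `length` symbols starting at 1-based `pos` with `replacement`) on a plain String. `EditSketch` and `applyEdit` are hypothetical names, the bounds check is an assumption based on the description above, and the real Edit class may validate differently.

```java
// Hypothetical model of the edit rule on a String; not the BioJava Edit class.
class EditSketch {

    // Replaces `length` characters starting at 1-based `pos` with `replacement`.
    static String applyEdit(String seq, int pos, int length, String replacement) {
        // Assumed validity rule: pos and pos+length must lie within 1..seq.length()+1.
        if (pos < 1 || length < 0 || pos + length > seq.length() + 1) {
            throw new IndexOutOfBoundsException(
                "pos=" + pos + ", length=" + length + ", seqLength=" + seq.length());
        }
        int from = pos - 1; // convert the 1-based position to a 0-based index
        return seq.substring(0, from) + replacement + seq.substring(from + length);
    }

    public static void main(String[] args) {
        String seq = "atcaaaaacgctagc";
        seq = applyEdit(seq, 4, 5, "");                 // delete 5 bases from position 4
        System.out.println(seq);                        // atccgctagc
        seq = applyEdit(seq, 1, 0, "aattgg");           // insert at the start
        System.out.println(seq);                        // aattggatccgctagc
        seq = applyEdit(seq, seq.length() + 1, 0, "tttt"); // append at the end
        System.out.println(seq);                        // aattggatccgctagctttt
    }
}
```

Insertion, deletion and replacement are the same operation here, distinguished only by whether `length` is zero and whether `replacement` is empty.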
      • filter

        public FeatureHolder filter​(FeatureFilter filter)
        Description copied from interface: FeatureHolder
        Query this set of features using a supplied FeatureFilter.
        Specified by:
        filter in interface FeatureHolder
        Parameters:
        filter - the FeatureFilter to apply.
        Returns:
        all features in this container which match filter.
      • filter

        public FeatureHolder filter​(FeatureFilter ff,
                                    boolean recurse)
        Description copied from interface: FeatureHolder
        Return a new FeatureHolder that contains all of the children of this one that passed the filter ff. This method is scheduled for deprecation. Use the 1-arg filter instead.
        Specified by:
        filter in interface FeatureHolder
        Parameters:
        ff - the FeatureFilter to apply
        recurse - true if all features-of-features should be scanned, and a single flat collection of features returned, or false if just immediate children should be filtered.
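The effect of the recurse flag can be sketched with a small tree of plain Java objects. This is an illustration of the behaviour described above, not the BioJava implementation: `FeatureNode` is a hypothetical class, and a `java.util.function.Predicate` stands in for a FeatureFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical feature tree; a Predicate plays the role of a FeatureFilter.
class FeatureNode {
    final String type;
    final List<FeatureNode> children = new ArrayList<>();

    FeatureNode(String type) {
        this.type = type;
    }

    FeatureNode add(FeatureNode child) {
        children.add(child);
        return this;
    }

    // Returns matching immediate children, or, when recurse is true,
    // matching features at every depth as a single flat list.
    List<FeatureNode> filter(Predicate<FeatureNode> ff, boolean recurse) {
        List<FeatureNode> out = new ArrayList<>();
        collect(ff, recurse, out);
        return out;
    }

    private void collect(Predicate<FeatureNode> ff, boolean recurse, List<FeatureNode> out) {
        for (FeatureNode child : children) {
            if (ff.test(child)) {
                out.add(child);
            }
            if (recurse) {
                child.collect(ff, true, out); // descend into features-of-features
            }
        }
    }

    public static void main(String[] args) {
        FeatureNode gene = new FeatureNode("gene")
            .add(new FeatureNode("exon"))
            .add(new FeatureNode("exon"));
        FeatureNode sequence = new FeatureNode("sequence").add(gene);

        Predicate<FeatureNode> isExon = f -> f.type.equals("exon");
        System.out.println(sequence.filter(isExon, false).size()); // 0
        System.out.println(sequence.filter(isExon, true).size());  // 2
    }
}
```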
      • getAlphabet

        public Alphabet getAlphabet()
        Description copied from interface: SymbolList
        The alphabet that this SymbolList is over.

        Every symbol within this SymbolList is a member of this alphabet. alphabet.contains(symbol) == true for each symbol that is within this sequence.

        Specified by:
        getAlphabet in interface SymbolList
        Returns:
        the alphabet
      • getChangeSupport

        protected ChangeSupport getChangeSupport​(ChangeType ct)
        Description copied from class: AbstractChangeable
        Called to retrieve the ChangeSupport for this object.

        Your implementation of this method should have the following structure:

         ChangeSupport cs = super.getChangeSupport(ct);
        
         if(someForwarder == null && ct.isMatching(SomeInterface.SomeChangeType)) {
           someForwarder = new ChangeForwarder(...
        
           this.stateVariable.addChangeListener(someForwarder, VariableInterface.AChange);
         }
        
         return cs;
         
        It is usual for the forwarding listeners (someForwarder in this example) to be transient and lazily instantiated. Be sure to register & unregister the forwarder in the code that does the ChangeEvent handling in setter methods.
        Overrides:
        getChangeSupport in class AbstractChangeable
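The lazy-forwarder idea can be illustrated with the JDK's own `java.beans.PropertyChangeSupport`, which is swapped in here in place of BioJava's ChangeSupport and ChangeForwarder. This is only an analogy under that assumption: `WrappedState` and `LazyForwardingHolder` are hypothetical classes, and nothing below reflects how SimpleSequence itself is implemented.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hypothetical inner object whose changes should be re-reported by its holder.
class WrappedState {
    private final PropertyChangeSupport support = new PropertyChangeSupport(this);
    private String value = "";

    void addListener(PropertyChangeListener listener) {
        support.addPropertyChangeListener(listener);
    }

    void setValue(String newValue) {
        String old = value;
        value = newValue;
        support.firePropertyChange("value", old, newValue);
    }
}

// Hypothetical holder that creates its forwarder only when someone listens.
class LazyForwardingHolder {
    private final WrappedState state;
    private final PropertyChangeSupport support = new PropertyChangeSupport(this);
    private transient PropertyChangeListener forwarder; // created on demand

    LazyForwardingHolder(WrappedState state) {
        this.state = state;
    }

    boolean hasForwarder() {
        return forwarder != null;
    }

    void addListener(PropertyChangeListener listener) {
        if (forwarder == null) {
            // Re-fire events from the wrapped state as events of this holder.
            forwarder = evt -> support.firePropertyChange(
                evt.getPropertyName(), evt.getOldValue(), evt.getNewValue());
            state.addListener(forwarder);
        }
        support.addPropertyChangeListener(listener);
    }

    public static void main(String[] args) {
        WrappedState state = new WrappedState();
        LazyForwardingHolder holder = new LazyForwardingHolder(state);
        System.out.println(holder.hasForwarder()); // false
        holder.addListener(evt -> System.out.println("forwarded: " + evt.getNewValue()));
        state.setValue("acgt"); // forwarded: acgt
    }
}
```

Creating the forwarder lazily means objects that nobody listens to never register on their state variables, and marking the field transient keeps it out of the serialized form.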
      • getName

        public String getName()
        Description copied from interface: Sequence
        The name of this sequence.

        The name may contain spaces or odd characters.

        Specified by:
        getName in interface Sequence
        Returns:
        the name as a String
      • getSchema

        public FeatureFilter getSchema()
        Description copied from interface: FeatureHolder
        Return a schema-filter for this FeatureHolder. This is a filter which all Features immediately contained by this FeatureHolder will match. It need not directly match their child features, but it can (and should!) provide information about them using FeatureFilter.OnlyChildren filters. In cases where there is no feature hierarchy, this can be indicated by including FeatureFilter.leaf in the schema filter.

        For the truly non-informative case, it is possible to return FeatureFilter.all. However, it is almost always possible to provide slightly more information than this. For example, Sequence objects should, at a minimum, return FeatureFilter.top_level. Feature objects should, at a minimum, return FeatureFilter.ByParent(new FeatureFilter.ByFeature(this)).

        Specified by:
        getSchema in interface FeatureHolder
        Returns:
        the schema filter
      • getURN

        public String getURN()
        Description copied from interface: Sequence
        A Uniform Resource Identifier (URI) which identifies the sequence represented by this object. For sequences in well-known databases, this may be a URN, e.g.
         urn:sequence/embl:AL121903
         
        It may also be a URL identifying a specific resource, either locally or over the network, e.g.
         file:///home/thomas/myseq.fa|seq22
         http://www.mysequences.net/chr22.seq
         
        Specified by:
        getURN in interface Sequence
        Returns:
        the URI as a String
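Since getURN returns the identifier as a plain String, a caller that needs to tell a URN from a URL has to inspect it. A minimal sketch using the JDK's `java.net.URI` follows; `UrnSketch` and `isUrn` are hypothetical names, and only two of the identifiers shown above are used.

```java
import java.net.URI;

// Hypothetical helper for inspecting a sequence identifier string.
class UrnSketch {

    // True when the identifier uses the "urn" scheme rather than a URL scheme.
    static boolean isUrn(String identifier) {
        URI uri = URI.create(identifier);
        return "urn".equalsIgnoreCase(uri.getScheme());
    }

    public static void main(String[] args) {
        URI urn = URI.create("urn:sequence/embl:AL121903");
        System.out.println(urn.getScheme());             // urn
        System.out.println(urn.getSchemeSpecificPart()); // sequence/embl:AL121903

        URI url = URI.create("http://www.mysequences.net/chr22.seq");
        System.out.println(url.getHost());               // www.mysequences.net
        System.out.println(url.getPath());               // /chr22.seq
    }
}
```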
      • iterator

        public Iterator iterator()
        Description copied from interface: SymbolList
        An Iterator over all Symbols in this SymbolList.

        This is an ordered iterator over the Symbols. It cannot be used to edit the underlying symbols.

        Specified by:
        iterator in interface SymbolList
        Returns:
        an iterator
      • length

        public int length()
        Description copied from interface: SymbolList
        The number of symbols in this SymbolList.
        Specified by:
        length in interface SymbolList
        Returns:
        the length
      • seqString

        public String seqString()
        Description copied from interface: SymbolList
        Stringify this symbol list.

        It is expected that this will use the symbol's token to render each symbol. It should be parsable back into a SymbolList using the default token parser for this alphabet.

        Specified by:
        seqString in interface SymbolList
        Returns:
        a string representation of the symbol list
      • setName

        public void setName​(String name)
        Assign a name to this sequence.
      • setURN

        public void setURN​(String urn)
        Provide the URN for this sequence.
      • subList

        public SymbolList subList​(int start,
                                  int end)
        Description copied from interface: SymbolList
        Return a new SymbolList for the symbols start to end inclusive.

        The resulting SymbolList will count from 1 to (end-start + 1) inclusive, and refer to the symbols start to end of the original sequence.

        Specified by:
        subList in interface SymbolList
        Parameters:
        start - the first symbol of the new SymbolList
        end - the last symbol (inclusive) of the new SymbolList
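The 1-based, end-inclusive coordinates used by subList (and likewise by subStr and symbolAt) differ from Java's 0-based, end-exclusive String indices. The sketch below shows the conversion on a plain String; `OneBasedSketch` and its methods are hypothetical stand-ins, not BioJava calls.

```java
// Hypothetical 1-based, inclusive accessors over a plain String.
class OneBasedSketch {

    // Symbol at a 1-based index.
    static char symbolAt(String seq, int index) {
        return seq.charAt(index - 1);
    }

    // Region from start to end, both 1-based and inclusive.
    static String subStr(String seq, int start, int end) {
        return seq.substring(start - 1, end);
    }

    public static void main(String[] args) {
        String seq = "atcaaaaacg";
        String sub = subStr(seq, 3, 6);
        System.out.println(sub);              // caaa
        System.out.println(sub.length());     // 4, i.e. end - start + 1
        // Position 1 of the sub-region is position 3 of the original.
        System.out.println(symbolAt(sub, 1) == symbolAt(seq, 3)); // true
    }
}
```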
      • subStr

        public String subStr​(int start,
                             int end)
        Description copied from interface: SymbolList
        Return a region of this symbol list as a String.

        This should use the same rules as seqString.

        Specified by:
        subStr in interface SymbolList
        Parameters:
        start - the first symbol to include
        end - the last symbol to include
        Returns:
        the string representation
      • symbolAt

        public Symbol symbolAt​(int index)
        Description copied from interface: SymbolList
        Return the symbol at index, counting from 1.
        Specified by:
        symbolAt in interface SymbolList
        Parameters:
        index - the offset into this SymbolList
        Returns:
        the Symbol at that index
      • toList

        public List toList()
        Description copied from interface: SymbolList
        Returns a List of symbols.

        This is an immutable list of symbols. Do not edit it.

        Specified by:
        toList in interface SymbolList
        Returns:
        a List of Symbols
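What a read-only list looks like to calling code can be shown with the JDK's unmodifiable wrappers. This is a sketch under the assumption that the returned list behaves like `Collections.unmodifiableList`; the documentation above only states that the list must not be edited, so the exact failure mode in BioJava may differ. `ReadOnlySketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in: characters play the role of Symbols.
class ReadOnlySketch {

    // Builds a read-only list of the characters in `tokens`.
    static List<Character> readOnlySymbols(String tokens) {
        List<Character> symbols = new ArrayList<>();
        for (char c : tokens.toCharArray()) {
            symbols.add(c);
        }
        return Collections.unmodifiableList(symbols);
    }

    // True if adding to the list is refused.
    static boolean rejectsAdd(List<Character> list) {
        try {
            list.add('n');
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // True if removing through the iterator is refused (list must be non-empty).
    static boolean rejectsIteratorRemove(List<Character> list) {
        Iterator<Character> it = list.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Character> symbols = readOnlySymbols("acgt");
        System.out.println(rejectsAdd(symbols));            // true
        System.out.println(rejectsIteratorRemove(symbols)); // true
        System.out.println(symbols);                        // [a, c, g, t]
    }
}
```

To change the sequence itself, use an Edit through the edit method rather than attempting to modify the list or its iterator.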