Class AlphabetManager

  • public final class AlphabetManager
    extends Object
    Utility methods for working with Alphabets. Also acts as a registry for well-known alphabets.

    The alphabet interfaces themselves don't give you a lot of help in actually getting an alphabet instance. This is where the AlphabetManager comes in handy. It helps out in serialization, generating derived alphabets and building CrossProductAlphabet instances. It also contains limited support for parsing complex alphabet names back into the alphabets.

    Matthew Pocock, Thomas Down, Mark Schreiber, George Waldon (alternate tokenization)
    • Method Detail

      • instance

        public static AlphabetManager instance()
        Deprecated. All AlphabetManager methods have become static.
        Retrieve the singleton instance.
        the AlphabetManager instance
      • getAllAmbiguitySymbol

        public static Symbol getAllAmbiguitySymbol​(FiniteAlphabet alpha)
        Return the ambiguity symbol which matches all symbols in a given alphabet.
        alpha - The alphabet
        the ambiguity symbol
      • getAllSymbols

        public static Set getAllSymbols​(FiniteAlphabet alpha)
        Return a set containing all possible symbols which can be considered members of a given alphabet, including ambiguous symbols. Warning, this method can return large sets!
        alpha - The alphabet
        The set of symbols that are members of alpha
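        To see why the returned set can get large, here is a minimal sketch. It assumes (the text above does not say so) that each distinct subset of an alphabet's atomic symbols corresponds to one candidate symbol, so n atomic symbols give up to 2^n of them. SymbolCountSketch and allSubsets are hypothetical names, and plain strings stand in for Symbol objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SymbolCountSketch {
    /** Enumerates every subset of the given atomic symbols, one bit mask per subset. */
    static List<Set<String>> allSubsets(List<String> atoms) {
        int n = atoms.size();
        if (n > 20) {
            throw new IllegalArgumentException("too many atomic symbols to enumerate");
        }
        List<Set<String>> result = new ArrayList<>();
        for (int mask = 0; mask < (1 << n); mask++) {
            Set<String> subset = new LinkedHashSet<>();
            for (int i = 0; i < n; i++) {
                // Bit i of the mask decides whether atom i is in this subset.
                if ((mask & (1 << i)) != 0) {
                    subset.add(atoms.get(i));
                }
            }
            result.add(subset);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dna = Arrays.asList("a", "c", "g", "t");
        // 2^4 = 16 subsets: the empty set, 4 singletons and 11 larger combinations.
        System.out.println(allSubsets(dna).size());
    }
}
```

        Under that assumption a 20-letter alphabet would already yield about a million candidate sets, which is why the sketch refuses larger inputs.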
      • registerAlphabet

        public static void registerAlphabet​(String name,
                                            Alphabet alphabet)
        Register an alphabet by name.
        name - the name by which it can be retrieved
        alphabet - the Alphabet to store
      • registerAlphabet

        public static void registerAlphabet​(String[] names,
                                            Alphabet alphabet)
        Register an Alphabet under more than one name. This allows aliasing of an alphabet with two or more names. It is equivalent to calling registerAlphabet(String name, Alphabet alphabet) once for each name.
        names - the names by which it can be retrieved
        alphabet - the Alphabet to store
      • registrations

        public static Set registrations()
        A set of names under which Alphabets have been registered.
        a Set of Strings
      • registered

        public static boolean registered​(String name)
        Check whether an Alphabet has been registered under the given name.
        name - the name of the alphabet
        true if it has or false otherwise
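        The registry behaviour described for registerAlphabet, registrations and registered can be modelled with a plain map. The following is only a sketch of that idea, not the AlphabetManager implementation: AliasRegistrySketch, its type parameter and its forName method are hypothetical, and any object can stand in for an Alphabet.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class AliasRegistrySketch<A> {
    private final Map<String, A> byName = new LinkedHashMap<>();

    /** Registers one alphabet under one name. */
    void register(String name, A alphabet) {
        byName.put(name, alphabet);
    }

    /** Aliasing: registering under several names is repeated single-name registration. */
    void register(String[] names, A alphabet) {
        for (String name : names) {
            register(name, alphabet);
        }
    }

    /** The names that have been registered so far, as a read-only view. */
    Set<String> registrations() {
        return Collections.unmodifiableSet(byName.keySet());
    }

    boolean registered(String name) {
        return byName.containsKey(name);
    }

    /** Hypothetical lookup used here to show that aliases share one object. */
    A forName(String name) {
        return byName.get(name);
    }

    public static void main(String[] args) {
        AliasRegistrySketch<Object> registry = new AliasRegistrySketch<>();
        registry.register(new String[] {"DNA", "dna"}, new Object());
        System.out.println(registry.registered("dna"));
        System.out.println(registry.forName("DNA") == registry.forName("dna"));
    }
}
```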
      • alphabets

        public static Iterator alphabets()
        Get an iterator over all alphabets known.
        an Iterator over Alphabet objects
      • getGapSymbol

        public static Symbol getGapSymbol()

        Get the special 'gap' Symbol.

        The gap symbol is a Symbol that has an empty alphabet of matches. As such, every alphabet contains gap: gap matches no symbols, so there is never a symbol matched by gap that an alphabet fails to contain.

        Gap can be thought of as an empty sub-space within the space of all possible symbols. If you are working in a cross-product alphabet, you should choose whether to use gap to represent 'no symbol', or a basis symbol of the appropriate size built entirely of gaps to represent 'no symbol in each of the slots'. For a two-slot alphabet, for example, that is the difference between a single gap and the pair (gap, gap).

        the system-wide symbol that represents a gap
      • getGapSymbol

        public static Symbol getGapSymbol​(List alphas)

        Get the gap symbol appropriate to this list of alphabets.

        The gap symbol will have the same shape as the alphabet list. It will be as long as the list, and if any of the alphabets in the list has a dimension greater than 1, the appropriate multi-slot gap is inserted at that position.

        alphas - List of alphabets
        the appropriate gap symbol for the alphabet list
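        The 'same shape' rule can be sketched recursively. This is an illustrative model only: GapShapeSketch and its Alpha class are hypothetical stand-ins, a one-dimensional gap is rendered as "-", and a multi-dimensional gap as a parenthesised tuple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GapShapeSketch {
    /** Hypothetical stand-in for an alphabet: its component alphabets, empty if one-dimensional. */
    static final class Alpha {
        final List<Alpha> parts;

        Alpha(Alpha... parts) {
            this.parts = Arrays.asList(parts);
        }
    }

    /** Gap for a single alphabet: "-" if one-dimensional, otherwise a nested tuple of gaps. */
    static String gapForAlphabet(Alpha alpha) {
        return alpha.parts.isEmpty() ? "-" : gapForList(alpha.parts);
    }

    /** Gap for a list of alphabets: one slot per entry, recursing into multi-dimensional entries. */
    static String gapForList(List<Alpha> alphas) {
        List<String> slots = new ArrayList<>();
        for (Alpha alpha : alphas) {
            slots.add(gapForAlphabet(alpha));
        }
        return "(" + String.join(" ", slots) + ")";
    }

    public static void main(String[] args) {
        Alpha dna = new Alpha();
        Alpha dnaPair = new Alpha(dna, dna);
        System.out.println(gapForList(Arrays.asList(dna, dna)));     // prints (- -)
        System.out.println(gapForList(Arrays.asList(dna, dnaPair))); // prints (- (- -))
    }
}
```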
      • createSymbol

        public static AtomicSymbol createSymbol​(String name,
                                                Annotation annotation)

        Generate a new AtomicSymbol instance with a name and Annotation.

        Use this method if you wish to create an AtomicSymbol instance. Initially it will not be a member of any alphabet.

        name - the String returned by getName()
        annotation - the Annotation returned by getAnnotation()
        a new AtomicSymbol instance
      • createSymbol

        public static AtomicSymbol createSymbol​(String name)

        Generate a new AtomicSymbol instance with a name and an empty Annotation.

        Use this method if you wish to create an AtomicSymbol instance. Initially it will not be a member of any alphabet.

        name - the String returned by getName()
        a new AtomicSymbol instance
      • createSymbol

        public static AtomicSymbol createSymbol​(char token,
                                                String name,
                                                Annotation annotation)
        Deprecated. Use the two-arg version of this method instead.

        Generate a new AtomicSymbol instance with a token, name and Annotation.

        Use this method if you wish to create an AtomicSymbol instance. Initially it will not be a member of any alphabet.

        token - the char token returned by getToken() (ignored as of BioJava 1.2)
        name - the String returned by getName()
        annotation - the Annotation returned by getAnnotation()
        a new AtomicSymbol instance
      • createSymbol

        public static Symbol createSymbol​(char token,
                                          Annotation annotation,
                                          List symList,
                                          Alphabet alpha)
                                   throws IllegalSymbolException
        Deprecated. Use the new version, without the token argument.

        Generates a new Symbol instance that represents the tuple of Symbols in symList.

        This method is most useful for writing Alphabet implementations. It should not be invoked by casual users. Use alphabet.getSymbol(List) instead.

        token - the Symbol's token [ignored since 1.2]
        annotation - the annotation bundle for the Symbol
        symList - a list of Symbol objects
        alpha - the Alphabet that this Symbol will reside in
        a Symbol that encapsulates that List
        IllegalSymbolException - If the Symbol cannot be made
      • createSymbol

        public static Symbol createSymbol​(Annotation annotation,
                                          List symList,
                                          Alphabet alpha)
                                   throws IllegalSymbolException

        Generates a new Symbol instance that represents the tuple of Symbols in symList. This will attempt to return the same symbol for the same list.

        This method is most useful for writing Alphabet implementations. It should not be invoked by casual users. Use alphabet.getSymbol(List) instead.

        annotation - The annotation bundle for the Symbol
        symList - a list of Symbol objects
        alpha - the Alphabet that this Symbol will reside in
        a Symbol that encapsulates that List
        IllegalSymbolException - If the Symbol cannot be made
      • createSymbol

        public static Symbol createSymbol​(char token,
                                          Annotation annotation,
                                          Set symSet,
                                          Alphabet alpha)
                                   throws IllegalSymbolException
        Deprecated. Use the three-arg version of this method instead.

        Generates a new Symbol instance that represents the set of Symbols in symSet.

        This method is most useful for writing Alphabet implementations. It should not be invoked by users. Use alphabet.getAmbiguity(Set) instead.

        token - the Symbol's token [ignored since 1.2]
        annotation - the Symbol's Annotation
        symSet - a Set of Symbol objects
        alpha - the Alphabet that this Symbol will reside in
        a Symbol that encapsulates that Set
        IllegalSymbolException - If the Symbol cannot be made
      • createSymbol

        public static Symbol createSymbol​(Annotation annotation,
                                          Set symSet,
                                          Alphabet alpha)
                                   throws IllegalSymbolException

        Generates a new Symbol instance that represents the set of Symbols in symSet.

        This method is most useful for writing Alphabet implementations. It should not be invoked by users. Use alphabet.getAmbiguity(Set) instead.

        annotation - the Symbol's Annotation
        symSet - a Set of Symbol objects
        alpha - the Alphabet that this Symbol will reside in
        a Symbol that encapsulates that Set
        IllegalSymbolException - If the Symbol cannot be made
      • generateCrossProductAlphaFromName

        public static Alphabet generateCrossProductAlphaFromName​(String name)
        Generates a new CrossProductAlphabet from the given name.
        name - the name to parse
        the associated Alphabet
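        Parsing a cross-product name amounts to splitting it into component names while respecting nesting. The sketch below assumes a name format of the form "(A x B)", with nested components in their own parentheses. That format, the class CrossProductNameSketch and the method components are assumptions made for illustration and are not taken from the text above.

```java
import java.util.ArrayList;
import java.util.List;

final class CrossProductNameSketch {
    /**
     * Splits a name such as "(DNA x (DNA x PROTEIN))" into its top-level component names.
     * A name without enclosing parentheses is treated as a single simple alphabet name.
     */
    static List<String> components(String name) {
        List<String> parts = new ArrayList<>();
        String s = name.trim();
        if (!s.startsWith("(") || !s.endsWith(")")) {
            parts.add(s);
            return parts;
        }
        String body = s.substring(1, s.length() - 1);
        int depth = 0;
        int start = 0;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            }
            if (depth < 0) {
                throw new IllegalArgumentException("unbalanced parentheses: " + name);
            }
            // Only a separator at nesting depth 0 divides the top-level components.
            if (depth == 0 && body.startsWith(" x ", i)) {
                parts.add(body.substring(start, i).trim());
                start = i + 3;
                i += 2; // skip the rest of the separator
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("unbalanced parentheses: " + name);
        }
        parts.add(body.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(components("(DNA x (DNA x PROTEIN))"));
    }
}
```

        Each returned component could then be looked up by name, or split again if it is itself parenthesised.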
      • getCrossProductAlphabet

        public static Alphabet getCrossProductAlphabet​(List aList)

        Retrieve a CrossProductAlphabet instance over the alphabets in aList.

        If all of the alphabets in aList implement FiniteAlphabet then the method will return a FiniteAlphabet. Otherwise, it returns a non-finite alphabet.

        If you call this method twice with a list containing the same alphabets, it will return the same alphabet. This promotes the re-use of alphabets and helps to maintain the 'flyweight' principle for finite alphabet symbols.

        The resulting alphabet cpa will be retrievable via AlphabetManager.alphabetForName(cpa.getName()).

        aList - a list of Alphabet objects
        a CrossProductAlphabet that is over the alphabets in aList
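        The 'same list, same alphabet' guarantee is a flyweight cache keyed by the component list. The following is a minimal sketch of that pattern rather than the AlphabetManager implementation: CrossProductCacheSketch and its Alpha class are hypothetical, and the "(A x B)" name format is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CrossProductCacheSketch {
    /** Hypothetical stand-in for an alphabet: a name and whether it is finite. */
    static final class Alpha {
        final String name;
        final boolean finite;

        Alpha(String name, boolean finite) {
            this.name = name;
            this.finite = finite;
        }
    }

    // Keyed by the component list. List.equals compares elements in order and
    // Alpha uses identity equality, so the same components always give the same key.
    private static final Map<List<Alpha>, Alpha> CACHE = new HashMap<>();

    static synchronized Alpha crossProduct(List<Alpha> components) {
        List<Alpha> key = new ArrayList<>(components); // copy so later caller edits cannot change the key
        Alpha cached = CACHE.get(key);
        if (cached == null) {
            List<String> names = new ArrayList<>();
            boolean finite = true;
            for (Alpha component : key) {
                names.add(component.name);
                finite &= component.finite; // finite only if every component is finite
            }
            cached = new Alpha("(" + String.join(" x ", names) + ")", finite);
            CACHE.put(key, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        Alpha dna = new Alpha("DNA", true);
        Alpha first = crossProduct(Arrays.asList(dna, dna));
        Alpha second = crossProduct(Arrays.asList(dna, dna));
        System.out.println(first == second); // prints true
        System.out.println(first.name);      // prints (DNA x DNA)
    }
}
```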
      • getCrossProductAlphabet

        public static Alphabet getCrossProductAlphabet​(List aList,
                                                       String name)
                                                throws IllegalAlphabetException
        Attempts to create a cross product alphabet and register it under a name.
        aList - A list of alphabets
        name - The name which the new alphabet will be registered under.
        The CrossProductAlphabet
        IllegalAlphabetException - If the Alphabet cannot be made or a different alphabet is already registered under this name.
      • getCrossProductAlphabet

        public static Alphabet getCrossProductAlphabet​(List aList,
                                                       Alphabet parent)

        Retrieve a CrossProductAlphabet instance over the alphabets in aList.

        This method is most useful for implementors of cross-product alphabets, allowing them to safely build the matches alphabets for ambiguity symbols.

        If all of the alphabets in aList implement FiniteAlphabet then the method will return a FiniteAlphabet. Otherwise, it returns a non-finite alphabet.

        If you call this method twice with a list containing the same alphabets, it will return the same alphabet. This promotes the re-use of alphabets and helps to maintain the 'flyweight' principle for finite alphabet symbols.

        The resulting alphabet cpa will be retrievable via AlphabetManager.alphabetForName(cpa.getName()).

        aList - a list of Alphabet objects
        parent - a parent alphabet
        a CrossProductAlphabet that is over the alphabets in aList
      • factorize

        public static List factorize​(Alphabet alpha,
                                     Set symSet)
                              throws IllegalSymbolException

        Return a list of BasisSymbol instances that uniquely sum up all AtomicSymbol instances in symSet. If the set can't be represented by a single list of BasisSymbol instances, return null.

        This method is most useful for implementers of Alphabet and Symbol. It probably should not be invoked by users.

        symSet - the Set of AtomicSymbol instances
        alpha - the Alphabet instance that the Symbols are from
        a List of BasisSymbols
        IllegalSymbolException - In practice this should not be thrown. If it is, it probably indicates a subtle bug somewhere in AlphabetManager.
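        One way to picture factorization: a set of tuples can be summed up by a single list of per-position symbol sets exactly when it equals the cross product of its own per-position projections. The sketch below applies that test to tuples of strings. FactorizeSketch is a hypothetical illustration of the idea, not the algorithm AlphabetManager uses.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.Arrays;

final class FactorizeSketch {
    /**
     * Returns one set of symbols per position if tuples is exactly the cross product
     * of those sets, or null if no single list of per-position sets describes it.
     */
    static List<Set<String>> factorize(Set<List<String>> tuples) {
        if (tuples.isEmpty()) {
            throw new IllegalArgumentException("nothing to factorize");
        }
        int width = tuples.iterator().next().size();
        List<Set<String>> basis = new ArrayList<>();
        for (int i = 0; i < width; i++) {
            basis.add(new TreeSet<String>());
        }
        // Project every tuple onto each position.
        for (List<String> tuple : tuples) {
            if (tuple.size() != width) {
                throw new IllegalArgumentException("tuples differ in length");
            }
            for (int i = 0; i < width; i++) {
                basis.get(i).add(tuple.get(i));
            }
        }
        // The input is always a subset of the product of its projections,
        // so the two are equal exactly when their sizes match.
        long product = 1;
        for (Set<String> slot : basis) {
            product *= slot.size();
        }
        return product == tuples.size() ? basis : null;
    }

    public static void main(String[] args) {
        Set<List<String>> tuples = new HashSet<>();
        tuples.add(Arrays.asList("a", "c"));
        tuples.add(Arrays.asList("a", "g"));
        tuples.add(Arrays.asList("t", "c"));
        tuples.add(Arrays.asList("t", "g"));
        System.out.println(factorize(tuples)); // prints [[a, t], [c, g]]
    }
}
```

        A set such as {(a, g), (t, c)} fails the test: its projections are {a, t} and {c, g}, whose product has four tuples, so the sketch returns null.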
      • loadAlphabets

        public static void loadAlphabets(InputSource is)
                                  throws SAXException,
                                         IOException,
                                         BioException
        Load additional Alphabets, defined in XML format, into the AlphabetManager's registry. These can then be retrieved by calling alphabetForName.
        is - an InputSource encapsulating the document to be parsed
        IOException - if there is an error accessing the stream
        SAXException - if there is an error while parsing the document
        BioException - if a problem occurs when creating the new Alphabets.
      • getAlphabetIndex

        public static AlphabetIndex getAlphabetIndex​(FiniteAlphabet alpha)
        Get an indexer for a specified alphabet.
        alpha - The alphabet to index
        an AlphabetIndex instance
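        An indexer of this kind is essentially a two-way mapping between the symbols of a finite alphabet and the integers 0 to size-1, which lets symbols be used as array indices. The sketch below shows that idea with strings in place of Symbol objects. AlphabetIndexSketch and its method names are chosen for this illustration, and the sorted ordering is an arbitrary choice that makes the numbering reproducible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class AlphabetIndexSketch {
    private final List<String> byIndex;
    private final Map<String, Integer> bySymbol = new HashMap<>();

    AlphabetIndexSketch(Collection<String> symbols) {
        // Sort and de-duplicate so every symbol gets one stable index.
        byIndex = new ArrayList<>(new TreeSet<>(symbols));
        for (int i = 0; i < byIndex.size(); i++) {
            bySymbol.put(byIndex.get(i), i);
        }
    }

    int indexForSymbol(String symbol) {
        Integer index = bySymbol.get(symbol);
        if (index == null) {
            throw new IllegalArgumentException("not in alphabet: " + symbol);
        }
        return index;
    }

    String symbolForIndex(int index) {
        return byIndex.get(index);
    }

    public static void main(String[] args) {
        AlphabetIndexSketch index = new AlphabetIndexSketch(Arrays.asList("t", "g", "c", "a"));
        int[] counts = new int[4];
        for (String s : Arrays.asList("a", "c", "a", "t")) {
            counts[index.indexForSymbol(s)]++; // symbols become array positions
        }
        System.out.println(Arrays.toString(counts)); // prints [2, 1, 0, 1]
    }
}
```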