Package org.biojava.nbio.data.sequence
Class SequenceUtil
- java.lang.Object
- org.biojava.nbio.data.sequence.SequenceUtil

public final class SequenceUtil extends Object
Utility class for operations on sequences
- Since:
- 3.0.2
- Version:
- 1.0
- Author:
- Peter Troshin
 
Field Summary
Modifier and Type | Field | Description
static Pattern | AA | Valid amino acids
static Pattern | AMBIGUOUS_AA | Same as the AA pattern but with one additional letter: X
static Pattern | AMBIGUOUS_NUCLEOTIDE | Ambiguous nucleotide
static Pattern | DIGIT | A digit
static Pattern | NON_AA | Inversion of the AA pattern
static Pattern | NON_NUCLEOTIDE | Non-nucleotide
static Pattern | NONWORD | Non-word
static Pattern | NUCLEOTIDE | Nucleotides a, t, g, c, u
static Pattern | WHITE_SPACE | A whitespace character: [\t\n\x0B\f\r]

Method Summary
Modifier and Type | Method | Description
static String | cleanSequence(String sequence) | Removes all whitespace chars in the sequence string
static String | deepCleanSequence(String sequence) | Removes all special characters and digits as well as whitespace chars from the sequence
static boolean | isAmbiguosProtein(String sequence) | Checks whether the sequence conforms to an ambiguous protein sequence
static boolean | isNonAmbNucleotideSequence(String sequence) | Ambiguous DNA chars: AGTCRYMKSWHBVDN; differs from protein in only one (!) character, B
static boolean | isNucleotideSequence(FastaSequence s) |
static boolean | isProteinSequence(String sequence) |
static List<FastaSequence> | readFasta(InputStream inStream) | Reads FASTA sequences from inStream into a list of FastaSequence objects
static void | writeFasta(OutputStream os, List<FastaSequence> sequences) | Writes FastaSequence objects to the output stream; each sequence takes one line only
static void | writeFasta(OutputStream outstream, List<FastaSequence> sequences, int width) | Writes a list of FastaSequence objects to outstream, formatting each sequence so that it contains at most width chars on each line
 
Field Detail

WHITE_SPACE
public static final Pattern WHITE_SPACE
A whitespace character: [\t\n\x0B\f\r]

AMBIGUOUS_AA
public static final Pattern AMBIGUOUS_AA
Same as the AA pattern but with one additional letter: X

NUCLEOTIDE
public static final Pattern NUCLEOTIDE
Nucleotides a, t, g, c, u

AMBIGUOUS_NUCLEOTIDE
public static final Pattern AMBIGUOUS_NUCLEOTIDE
Ambiguous nucleotide

NON_NUCLEOTIDE
public static final Pattern NON_NUCLEOTIDE
Non-nucleotide
 
Method Detail

isNucleotideSequence
public static boolean isNucleotideSequence(FastaSequence s)
- Returns:
- true if the sequence contains only the letters a, c, t, g, u
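
As an illustration of the check described above, here is a minimal self-contained sketch. It is not the BioJava implementation: the class name NucleotideCheckSketch, the method looksLikeNucleotides, and the pattern are hypothetical, the sketch works on a plain String rather than a FastaSequence, and it assumes matching is case-insensitive and that an empty string does not count as a nucleotide sequence.

```java
import java.util.regex.Pattern;

final class NucleotideCheckSketch {
    // Assumption: upper- and lower-case letters are both accepted.
    private static final Pattern ONLY_NUCLEOTIDES =
            Pattern.compile("[acgtu]+", Pattern.CASE_INSENSITIVE);

    // Returns true if every character is one of a, c, g, t, u.
    // Assumption: the empty string is rejected (the "+" requires one letter).
    static boolean looksLikeNucleotides(String sequence) {
        return ONLY_NUCLEOTIDES.matcher(sequence).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeNucleotides("ACGTU"));
        System.out.println(looksLikeNucleotides("ACGTN"));
    }
}
```

With the real class, the equivalent call would be SequenceUtil.isNucleotideSequence(s) on a FastaSequence; how it treats case and empty input should be checked against the library itself.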
 

isNonAmbNucleotideSequence
public static boolean isNonAmbNucleotideSequence(String sequence)
Ambiguous DNA chars: AGTCRYMKSWHBVDN. This set differs from the protein alphabet in only one (!) character, B.

cleanSequence
public static String cleanSequence(String sequence)
Removes all whitespace chars in the sequence string
- Parameters:
- sequence -
- Returns:
- cleaned up sequence
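
Whitespace removal of this kind can be sketched with a single regular expression. The class WhitespaceCleanSketch and its method are hypothetical names used only for illustration, and the sketch assumes Java's predefined \s class (which also matches the space character) is an acceptable stand-in for the WHITE_SPACE pattern.

```java
final class WhitespaceCleanSketch {
    // Removes every whitespace character matched by \s.
    // Assumption: \s approximates the library's WHITE_SPACE pattern.
    static String stripWhitespace(String sequence) {
        return sequence.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWhitespace("AC GT\n\tAA"));
    }
}
```

This is useful before validation, since sequences pasted from files or web forms often carry line breaks and padding.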
 

deepCleanSequence
public static String deepCleanSequence(String sequence)
Removes all special characters and digits as well as whitespace chars from the sequence
- Parameters:
- sequence -
- Returns:
- cleaned up sequence
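
A stricter clean-up along these lines might look like the sketch below. It is an approximation under a stated assumption: anything that is not an ASCII letter is treated as a special character, digit, or whitespace and dropped. The library may define "special characters" differently, and the class and method names are hypothetical.

```java
final class DeepCleanSketch {
    // Keeps ASCII letters only.
    // Assumption: digits, whitespace, punctuation and every other
    // non-letter character count as removable.
    static String lettersOnly(String sequence) {
        return sequence.replaceAll("[^A-Za-z]", "");
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("1 ACG-T*\n22 gg"));
    }
}
```

This form is handy for sequences copied from numbered listings, where position counters and gap symbols are mixed into the residues.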
 

isProteinSequence
public static boolean isProteinSequence(String sequence)
- Parameters:
- sequence -
- Returns:
- true if the sequence is a protein sequence, false otherwise
 

isAmbiguosProtein
public static boolean isAmbiguosProtein(String sequence)
Checks whether the sequence conforms to an ambiguous protein sequence
- Parameters:
- sequence -
- Returns:
- true only if the sequence is an ambiguous protein sequence; false otherwise, e.g. if the sequence is a non-ambiguous protein or DNA
 

writeFasta
public static void writeFasta(OutputStream outstream, List<FastaSequence> sequences, int width) throws IOException
Writes a list of FastaSequence objects to outstream, formatting each sequence so that it contains at most width chars on each line
- Parameters:
- outstream -
- sequences -
- width - the maximum number of characters to write in one line
- Throws:
- IOException
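
To make the effect of width concrete, the following sketch wraps one sequence at a fixed number of characters per line. It is an illustration, not the library code: FastaWrapSketch and its methods are hypothetical, the header format (">" followed by the id) and the "\n" line separator are assumptions, and a plain id/sequence pair stands in for FastaSequence.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class FastaWrapSketch {
    // Writes one record: a ">id" header, then the sequence split into
    // lines of at most `width` characters (assumed "\n" separator).
    static void writeWrapped(OutputStream out, String id, String sequence, int width)
            throws IOException {
        if (width <= 0) {
            throw new IllegalArgumentException("width must be positive");
        }
        StringBuilder sb = new StringBuilder();
        sb.append('>').append(id).append('\n');
        for (int start = 0; start < sequence.length(); start += width) {
            int end = Math.min(start + width, sequence.length());
            sb.append(sequence, start, end).append('\n');
        }
        out.write(sb.toString().getBytes(StandardCharsets.UTF_8));
    }

    // Convenience wrapper that renders the record into a String.
    static String render(String id, String sequence, int width) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeWrapped(buffer, id, sequence, width);
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(render("seq1", "ACGTACGTAC", 4));
    }
}
```

In this sketch a ten-letter sequence with width 4 yields two full lines and a final line of two characters; the last line is simply shorter, never padded.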
 

readFasta
public static List<FastaSequence> readFasta(InputStream inStream) throws IOException
Reads FASTA sequences from inStream into a list of FastaSequence objects
- Parameters:
- inStream - from
- Returns:
- list of FastaSequence objects
- Throws:
- IOException
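
The general shape of such a reader can be sketched as follows. This is a simplified stand-in, not the BioJava parser: FastaReadSketch and its Entry type are hypothetical (Entry replaces FastaSequence), UTF-8 input is assumed, blank lines are skipped, and any text before the first ">" header is ignored.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class FastaReadSketch {
    // Hypothetical record holder standing in for FastaSequence.
    static final class Entry {
        final String id;
        final String sequence;

        Entry(String id, String sequence) {
            this.id = id;
            this.sequence = sequence;
        }
    }

    // A ">" line starts a new record; the lines that follow are
    // concatenated into that record's sequence.
    static List<Entry> read(InputStream in) throws IOException {
        List<Entry> entries = new ArrayList<>();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String id = null;
        StringBuilder body = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // assumption: blank lines carry no data
            }
            if (line.startsWith(">")) {
                if (id != null) {
                    entries.add(new Entry(id, body.toString()));
                }
                id = line.substring(1).trim();
                body.setLength(0);
            } else if (id != null) {
                body.append(line);
            }
        }
        if (id != null) {
            entries.add(new Entry(id, body.toString()));
        }
        return entries;
    }

    // Convenience wrapper for in-memory text.
    static List<Entry> readString(String text) {
        try {
            return read(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Multi-line sequences are joined into one string per record, which is why a reader like this pairs naturally with a writer that re-wraps the sequence at a chosen width.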
 

writeFasta
public static void writeFasta(OutputStream os, List<FastaSequence> sequences) throws IOException
Writes FastaSequence objects to the output stream; each sequence takes one line only
- Parameters:
- os -
- sequences -
- Throws:
- IOException
 
 