public final class SeqIOTools extends Object
Modifier and Type | Method and Description |
---|---|
static void | biojavaToFile(int fileType, OutputStream os, Object biojava) Deprecated. Converts a Biojava object to the given file type. |
static void | biojavaToFile(String formatName, String alphabetName, OutputStream os, Object biojava) Deprecated. Writes a Biojava SequenceIterator, SequenceDB, Sequence or Alignment to an OutputStream. |
static Object | fileToBiojava(int fileType, BufferedReader br) Deprecated. Reads a file and returns the corresponding Biojava object. |
static Object | fileToBiojava(String formatName, String alphabetName, BufferedReader br) Deprecated. Reads a file with the specified format and alphabet. |
static SequenceBuilderFactory | formatToFactory(SequenceFormat format, Alphabet alpha) Deprecated, as this essentially duplicates the operation available in the method identifyBuilderFactory. |
static FiniteAlphabet | getAlphabet(int identifier) Deprecated. getAlphabet accepts a value which represents a sequence format and returns the relevant FiniteAlphabet object. |
static SequenceBuilderFactory | getBuilderFactory(int identifier) Deprecated. getBuilderFactory accepts a value which represents a sequence format and returns the relevant SequenceBuilderFactory object. |
static SequenceBuilderFactory | getEmblBuilderFactory() Deprecated. Get a default SequenceBuilderFactory for handling EMBL files. |
static SequenceBuilderFactory | getFastaBuilderFactory() Deprecated. Get a default SequenceBuilderFactory for handling FASTA files. |
static SequenceBuilderFactory | getGenbankBuilderFactory() Deprecated. Get a default SequenceBuilderFactory for handling GenBank files. |
static SequenceBuilderFactory | getGenpeptBuilderFactory() Deprecated. Get a default SequenceBuilderFactory for handling Genpept files. |
static SequenceFormat | getSequenceFormat(int identifier) Deprecated. getSequenceFormat accepts a value which represents a sequence format and returns the relevant SequenceFormat object. |
static SequenceBuilderFactory | getSwissprotBuilderFactory() Deprecated. Get a default SequenceBuilderFactory for handling Swissprot files. |
static int | guessFileType(File seqFile) Deprecated, because there is no standard file naming convention and guessing by file name is inherently error prone and bad. |
static int | identifyFormat(String formatName, String alphabetName) Deprecated. identifyFormat performs a case-insensitive mapping of a pair of names, a common sequence format name (such as 'embl', 'genbank' or 'fasta') and an alphabet name (such as 'dna', 'rna', 'protein', 'aa'), to an integer. |
static SequenceIterator | readEmbl(BufferedReader br) Deprecated. Iterate over the sequences in an EMBL-format stream. |
static SequenceIterator | readEmblNucleotide(BufferedReader br) Deprecated. Iterate over the sequences in an EMBL-format stream. |
static SequenceIterator | readEmblRNA(BufferedReader br) Deprecated. Iterate over the sequences in an EMBL-format stream, but for RNA. |
static SequenceIterator | readFasta(BufferedReader br, SymbolTokenization sTok) Deprecated. Read a fasta file. |
static SequenceIterator | readFasta(BufferedReader br, SymbolTokenization sTok, SequenceBuilderFactory seqFactory) Deprecated. Read a fasta file using a custom type of SymbolList. |
static SequenceDB | readFasta(InputStream seqFile, Alphabet alpha) Deprecated. Create a sequence database from a fasta file provided as an input stream. |
static SequenceIterator | readFastaDNA(BufferedReader br) Deprecated. Iterate over the sequences in a FASTA-format stream of DNA sequences. |
static SequenceIterator | readFastaProtein(BufferedReader br) Deprecated. Iterate over the sequences in a FASTA-format stream of Protein sequences. |
static SequenceIterator | readFastaRNA(BufferedReader br) Deprecated. Iterate over the sequences in a FASTA-format stream of RNA sequences. |
static SequenceIterator | readGenbank(BufferedReader br) Deprecated. Iterate over the sequences in a Genbank-format stream. |
static SequenceIterator | readGenbankXml(BufferedReader br) Deprecated. Iterate over the sequences in a GenbankXML-format stream. |
static SequenceIterator | readGenpept(BufferedReader br) Deprecated. Iterate over the sequences in a Genpept-format stream. |
static SequenceIterator | readSwissprot(BufferedReader br) Deprecated. Iterate over the sequences in a Swissprot-format stream. |
static void | writeEmbl(OutputStream os, Sequence seq) Deprecated. Writes a single Sequence to an OutputStream in EMBL format. |
static void | writeEmbl(OutputStream os, SequenceIterator in) Deprecated. Writes a stream of Sequences to an OutputStream in EMBL format. |
static void | writeFasta(OutputStream os, Sequence seq) Deprecated. Writes a single Sequence to an OutputStream in Fasta format. |
static void | writeFasta(OutputStream os, SequenceDB db) Deprecated. Write a SequenceDB to an output stream in fasta format. |
static void | writeFasta(OutputStream os, SequenceIterator in) Deprecated. Writes sequences from a SequenceIterator to an OutputStream in Fasta format. |
static void | writeGenbank(OutputStream os, Sequence seq) Deprecated. Writes a single Sequence to an OutputStream in Genbank format. |
static void | writeGenbank(OutputStream os, SequenceIterator in) Deprecated. Writes a stream of Sequences to an OutputStream in Genbank format. |
static void | writeGenpept(OutputStream os, Sequence seq) Deprecated. Writes a single Sequence to an OutputStream in Genpept format. |
static void | writeGenpept(OutputStream os, SequenceIterator in) Deprecated. Writes a stream of Sequences to an OutputStream in Genpept format. |
static void | writeSwissprot(OutputStream os, Sequence seq) Deprecated. Writes a single Sequence to an OutputStream in SwissProt format. |
static void | writeSwissprot(OutputStream os, SequenceIterator in) Deprecated. Writes a stream of Sequences to an OutputStream in SwissProt format. |
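
The read methods listed above hand back an iterator, so records are pulled from the stream one at a time, and the write methods consume such an iterator. The following is a minimal stand-alone sketch of that pattern for FASTA using only the Java standard library. It is not BioJava's implementation: `FastaSketch`, `Record` and `RecordIterator` are hypothetical names introduced for illustration, and the parsing rules (a `>` header line followed by residue lines) are a simplifying assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Stand-alone sketch; all names here are hypothetical and are not BioJava types.
class FastaSketch {

    // Minimal stand-in for a sequence: a name and its residues.
    static final class Record {
        final String name;
        final String residues;

        Record(String name, String residues) {
            this.name = name;
            this.residues = residues;
        }
    }

    // Parses one FASTA record per call to next(), reading lazily from the reader.
    static final class RecordIterator implements Iterator<Record> {
        private final BufferedReader br;
        private String header; // next unread '>' line, or null at end of input

        RecordIterator(BufferedReader br) {
            this.br = br;
            this.header = readLine();
            while (header != null && header.trim().isEmpty()) {
                header = readLine(); // skip leading blank lines
            }
        }

        private String readLine() {
            try {
                return br.readLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override
        public boolean hasNext() {
            return header != null;
        }

        @Override
        public Record next() {
            if (header == null) {
                throw new NoSuchElementException();
            }
            if (!header.startsWith(">")) {
                throw new IllegalStateException("Expected '>' header but found: " + header);
            }
            String name = header.substring(1).trim();
            StringBuilder residues = new StringBuilder();
            String line = readLine();
            while (line != null && !line.startsWith(">")) {
                residues.append(line.trim()); // join wrapped residue lines
                line = readLine();
            }
            header = line; // the next header, or null when the input is exhausted
            return new Record(name, residues.toString());
        }
    }

    // Writes each record as a '>' header line followed by one line of residues.
    static void write(OutputStream os, Iterator<Record> in) throws IOException {
        while (in.hasNext()) {
            Record r = in.next();
            String text = ">" + r.name + "\n" + r.residues + "\n";
            os.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        String input = ">seq1 first\nACGT\nTTGA\n>seq2\nGGCC\n";
        Iterator<Record> it = new RecordIterator(new BufferedReader(new StringReader(input)));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, it); // round trip: read lazily, write as we go
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because the iterator holds only the current record, a large file can be converted without loading every sequence into memory, which is presumably the reason an iterator-based signature is used for the stream methods.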
public static SequenceBuilderFactory getEmblBuilderFactory()
Returns:
a SmartSequenceBuilder.FACTORY

public static SequenceIterator readEmbl(BufferedReader br)
Parameters:
br - A reader for the EMBL source or file
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceIterator readEmblRNA(BufferedReader br)
Parameters:
br - A reader for the EMBL source or file
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceIterator readEmblNucleotide(BufferedReader br)
Parameters:
br - A reader for the EMBL source or file
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceBuilderFactory getGenbankBuilderFactory()
Returns:
a SmartSequenceBuilder.FACTORY

public static SequenceIterator readGenbank(BufferedReader br)
Parameters:
br - A reader for the Genbank source or file
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceIterator readGenbankXml(BufferedReader br)
Parameters:
br - A reader for the GenbankXML source or file
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceBuilderFactory getGenpeptBuilderFactory()
Returns:
a SmartSequenceBuilder.FACTORY

public static SequenceIterator readGenpept(BufferedReader br)
Parameters:
br - A reader for the Genpept source or file
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceBuilderFactory getSwissprotBuilderFactory()
Returns:
a SmartSequenceBuilder.FACTORY

public static SequenceIterator readSwissprot(BufferedReader br)
Parameters:
br - A reader for the Swissprot source or file
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceBuilderFactory getFastaBuilderFactory()
Returns:
a SmartSequenceBuilder.FACTORY

public static SequenceIterator readFasta(BufferedReader br, SymbolTokenization sTok)
Parameters:
br - the BufferedReader to read data from
sTok - a SymbolTokenization that understands the sequences

public static SequenceIterator readFasta(BufferedReader br, SymbolTokenization sTok, SequenceBuilderFactory seqFactory)
Parameters:
br - the BufferedReader to read data from
sTok - a SymbolTokenization that understands the sequences
seqFactory - a factory used to build a SymbolList
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceIterator readFastaDNA(BufferedReader br)
Parameters:
br - the BufferedReader to read data from
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceIterator readFastaRNA(BufferedReader br)
Parameters:
br - the BufferedReader to read data from
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceIterator readFastaProtein(BufferedReader br)
Parameters:
br - the BufferedReader to read data from
Returns:
a SequenceIterator that iterates over each Sequence in the file

public static SequenceDB readFasta(InputStream seqFile, Alphabet alpha) throws BioException
Parameters:
seqFile - The file containing the fasta formatted sequences
alpha - The Alphabet of the sequence, i.e. DNA, RNA, etc.
Returns:
a SequenceDB containing all the Sequences in the file.
Throws:
BioException - if problems occur during reading of the stream.

public static void writeFasta(OutputStream os, SequenceDB db) throws IOException
Parameters:
os - the stream to write the fasta formatted data to.
db - the database of Sequences to write
Throws:
IOException - if there was an error while writing.

public static void writeFasta(OutputStream os, SequenceIterator in) throws IOException
Parameters:
os - The stream to write fasta formatted data to
in - The source of input Sequences
Throws:
IOException - if there was an error while writing.

public static void writeFasta(OutputStream os, Sequence seq) throws IOException
Parameters:
os - the OutputStream.
seq - the Sequence.
Throws:
IOException - if there was an error while writing.

public static void writeEmbl(OutputStream os, SequenceIterator in) throws IOException
Parameters:
os - the OutputStream.
in - a SequenceIterator.
Throws:
IOException - if there was an error while writing.

public static void writeEmbl(OutputStream os, Sequence seq) throws IOException
Parameters:
os - the OutputStream.
seq - the Sequence.
Throws:
IOException - if there was an error while writing.

public static void writeSwissprot(OutputStream os, SequenceIterator in) throws IOException, BioException
Parameters:
os - the OutputStream.
in - a SequenceIterator.
Throws:
BioException - if the Sequence cannot be converted to SwissProt format
IOException - if there was an error while writing.

public static void writeSwissprot(OutputStream os, Sequence seq) throws IOException, BioException
Parameters:
os - the OutputStream.
seq - the Sequence.
Throws:
BioException - if the Sequence cannot be written to SwissProt format
IOException - if there was an error while writing.

public static void writeGenpept(OutputStream os, SequenceIterator in) throws IOException, BioException
Parameters:
os - the OutputStream.
in - a SequenceIterator.
Throws:
BioException - if the Sequence cannot be written to Genpept format
IOException - if there was an error while writing.

public static void writeGenpept(OutputStream os, Sequence seq) throws IOException, BioException
Parameters:
os - the OutputStream.
seq - the Sequence.
Throws:
BioException - if the Sequence cannot be written to Genpept format
IOException - if there was an error while writing.

public static void writeGenbank(OutputStream os, SequenceIterator in) throws IOException
Parameters:
os - the OutputStream.
in - a SequenceIterator.
Throws:
IOException - if there was an error while writing.

public static void writeGenbank(OutputStream os, Sequence seq) throws IOException
Parameters:
os - the OutputStream.
seq - the Sequence.
Throws:
IOException - if there was an error while writing.

public static int identifyFormat(String formatName, String alphabetName)
identifyFormat performs a case-insensitive mapping of a pair of names, a common sequence format name (such as 'embl', 'genbank' or 'fasta') and an alphabet name (such as 'dna', 'rna', 'protein', 'aa'), to an integer. The value returned will be one of the public static final fields in SeqIOConstants, or a bitwise-or combination of them. The method will reject known illegal combinations of format and alphabet (such as swissprot + dna) by throwing an IllegalArgumentException. It will return the SeqIOConstants.UNKNOWN value when either format or alphabet is unknown.
Parameters:
formatName - a String.
alphabetName - a String.
Returns:
an int
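
The mapping that identifyFormat describes (case-insensitive names, a bitwise-or of a format flag and an alphabet flag, an exception for known illegal pairs, and an UNKNOWN fallback) can be sketched with plain Java as follows. This is an illustration only: the class `FormatFlagsSketch`, the bit layout and every constant value are assumptions made for the example and are not the real values in SeqIOConstants, and only the swissprot + dna style of rejection mentioned above is modelled.

```java
import java.util.Locale;

// Stand-alone sketch; the class name and all constant values are hypothetical
// and do not reproduce the real SeqIOConstants fields.
class FormatFlagsSketch {
    static final int UNKNOWN = 0;

    // Assumed layout: format flags in the low bits, alphabet flags in higher bits.
    static final int FASTA = 1;
    static final int EMBL = 1 << 1;
    static final int GENBANK = 1 << 2;
    static final int SWISSPROT = 1 << 3;

    static final int DNA = 1 << 8;
    static final int RNA = 1 << 9;
    static final int AA = 1 << 10;

    // Maps a (format name, alphabet name) pair to a combined integer identifier.
    static int identify(String formatName, String alphabetName) {
        int format = formatFlag(formatName.toLowerCase(Locale.ROOT));
        int alphabet = alphabetFlag(alphabetName.toLowerCase(Locale.ROOT));
        if (format == UNKNOWN || alphabet == UNKNOWN) {
            return UNKNOWN; // either name was not recognised
        }
        if (format == SWISSPROT && alphabet != AA) {
            // A protein-only format paired with a nucleotide alphabet is rejected.
            throw new IllegalArgumentException(
                    "Illegal combination: " + formatName + " + " + alphabetName);
        }
        return format | alphabet; // bitwise-or combination of the two flags
    }

    private static int formatFlag(String name) {
        switch (name) {
            case "fasta": return FASTA;
            case "embl": return EMBL;
            case "genbank": return GENBANK;
            case "swissprot": return SWISSPROT;
            default: return UNKNOWN;
        }
    }

    private static int alphabetFlag(String name) {
        switch (name) {
            case "dna": return DNA;
            case "rna": return RNA;
            case "protein":
            case "aa": return AA;
            default: return UNKNOWN;
        }
    }

    public static void main(String[] args) {
        int id = identify("FASTA", "dna");
        // Individual flags can be recovered from the combined value with a bit mask.
        System.out.println("is fasta: " + ((id & FASTA) != 0));
        System.out.println("is dna: " + ((id & DNA) != 0));
    }
}
```

Packing both choices into one int is what lets methods such as getSequenceFormat, getBuilderFactory and getAlphabet each take a single identifier and read only the bits they care about.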

public static SequenceFormat getSequenceFormat(int identifier) throws BioException
getSequenceFormat accepts a value which represents a sequence format and returns the relevant SequenceFormat object.
Parameters:
identifier - an int which represents a binary value with bits set according to the scheme described in SeqIOConstants.
Returns:
a SequenceFormat.
Throws:
BioException - if an error occurs.

public static SequenceBuilderFactory getBuilderFactory(int identifier) throws BioException
getBuilderFactory accepts a value which represents a sequence format and returns the relevant SequenceBuilderFactory object.
Parameters:
identifier - an int which represents a binary value with bits set according to the scheme described in SeqIOConstants.
Returns:
a SequenceBuilderFactory.
Throws:
BioException - if an error occurs.

public static FiniteAlphabet getAlphabet(int identifier) throws BioException
getAlphabet accepts a value which represents a sequence format and returns the relevant FiniteAlphabet object.
Parameters:
identifier - an int which represents a binary value with bits set according to the scheme described in SeqIOConstants.
Returns:
a FiniteAlphabet.
Throws:
BioException - if an error occurs.

public static int guessFileType(File seqFile) throws IOException, FileNotFoundException
Parameters:
seqFile - the File to read from.
Throws:
IOException - if seqFile cannot be read
FileNotFoundException - if seqFile cannot be found

public static SequenceBuilderFactory formatToFactory(SequenceFormat format, Alphabet alpha) throws BioException
Deprecated, as this essentially duplicates the operation available in the method identifyBuilderFactory.
SequenceBuilder object for some combination of Alphabet and SequenceFormat
Parameters:
format - currently supports FastaFormat, GenbankFormat, EmblLikeFormat
alpha - currently only supports the DNA and Protein alphabets
Returns:
a SequenceBuilderFactory
Throws:
BioException - if the combination of alpha and format is unrecognized.

public static Object fileToBiojava(String formatName, String alphabetName, BufferedReader br) throws BioException
Parameters:
formatName - the name of the format, e.g. genbank or swissprot (case insensitive)
alphabetName - the name of the alphabet, e.g. dna or rna or protein (case insensitive)
br - a BufferedReader for the input
Throws:
BioException - if an error occurs while reading or an unrecognized format and alphabet combination is used (e.g. swissprot and DNA).

public static Object fileToBiojava(int fileType, BufferedReader br) throws BioException
Parameters:
fileType - a value that describes the file type
br - the reader for the input
Returns:
a SequenceIterator if the file type is a sequence file, or an Alignment if the file is a sequence alignment.
Throws:
BioException - if the file cannot be parsed

public static void biojavaToFile(String formatName, String alphabetName, OutputStream os, Object biojava) throws BioException, IOException, IllegalSymbolException
Writes a Biojava SequenceIterator, SequenceDB, Sequence or Alignment to an OutputStream
Parameters:
formatName - e.g. fasta, GenBank (case insensitive)
alphabetName - e.g. DNA, RNA (case insensitive)
os - where to write to
biojava - the object to write
Throws:
BioException - problems getting data from the biojava object.
IOException - if there are IO problems
IllegalSymbolException - a Symbol cannot be parsed

public static void biojavaToFile(int fileType, OutputStream os, Object biojava) throws BioException, IOException, IllegalSymbolException
Parameters:
fileType - a value that describes the type of sequence file
os - the stream to write the formatted results to
biojava - a SequenceIterator, SequenceDB, Sequence, or Alignment
Throws:
BioException - if biojava cannot be converted to that format.
IOException - if the output cannot be written to os
IllegalSymbolException - if biojava contains a Symbol that cannot be understood by the parser.

Copyright © 2020 BioJava. All rights reserved.