Package org.biojava.bio.seq.io
Class SeqIOConstants
- java.lang.Object
  - org.biojava.bio.seq.io.SeqIOConstants
public final class SeqIOConstants extends Object
SeqIOConstants contains constants used to identify sequence formats, alphabets etc., in the context of reading and writing sequences. An int used to specify symbol alphabet and sequence format type is derived thus:
- The two least significant bytes are reserved for format types such as RAW, FASTA, EMBL etc.
- The two most significant bytes are reserved for alphabet and symbol information such as AMBIGUOUS, DNA, RNA, AA etc.
Bitwise OR combinations of each component int are used to specify combinations of format type and symbol information. To derive an int
identifier for DNA with ambiguity codes in Fasta format, bitwise OR the AMBIGUOUS, DNA and FASTA values.
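The derivation described above can be sketched in plain Java. This is only an illustration: the class `SeqIdSketch`, its helper methods, and all numeric values are hypothetical stand-ins chosen to match the described layout (alphabet flags in the two most significant bytes, format type in the two least significant bytes). The real values are listed under Constant Field Values.

```java
// Sketch only: hypothetical stand-in values, not the real SeqIOConstants values.
final class SeqIdSketch {
    // Alphabet and symbol flags live in the two most significant bytes.
    static final int AMBIGUOUS = 1 << 16; // assumed bit position
    static final int DNA       = 1 << 17; // assumed bit position
    // Format types live in the two least significant bytes.
    static final int FASTA     = 102;     // assumed value

    static final int FORMAT_MASK   = 0x0000FFFF;
    static final int ALPHABET_MASK = 0xFFFF0000;

    /** Combines alphabet flags and a format type into one identifier. */
    static int compose(int alphabetFlags, int format) {
        return (alphabetFlags & ALPHABET_MASK) | (format & FORMAT_MASK);
    }

    /** Extracts the format part (lower two bytes) of an identifier. */
    static int formatOf(int id) {
        return id & FORMAT_MASK;
    }

    /** Extracts the alphabet part (upper two bytes) of an identifier. */
    static int alphabetOf(int id) {
        return id & ALPHABET_MASK;
    }

    public static void main(String[] args) {
        // DNA with ambiguity codes in Fasta format.
        int id = AMBIGUOUS | DNA | FASTA;
        System.out.println(formatOf(id) == FASTA);
        System.out.println(alphabetOf(id) == (AMBIGUOUS | DNA));
    }
}
```

Keeping the two halves in separate byte ranges is what allows a single int to be split back into its format and alphabet parts with a mask.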
Author: Keith James
-
-
Field Summary
Fields (modifier and type, field: description)
- static int AA: indicates that a sequence contains AA (amino acid) symbols.
- static int AMBIGUOUS: indicates that a sequence contains ambiguity symbols.
- static int DNA: indicates that a sequence contains DNA (deoxyribonucleic acid) symbols.
- static int EMBL: indicates that the sequence format is EMBL.
- static int EMBL_AA: premade EMBL | AA.
- static int EMBL_DNA: premade EMBL | DNA.
- static int EMBL_RNA: premade EMBL | RNA.
- static int FASTA: indicates that the sequence format is Fasta.
- static int FASTA_AA: premade FASTA | AA.
- static int FASTA_DNA: premade FASTA | DNA.
- static int FASTA_RNA: premade FASTA | RNA.
- static int GCG: indicates that the sequence format is GCG.
- static int GENBANK: indicates that the sequence format is GENBANK.
- static int GENBANK_AA: premade GENBANK | AA.
- static int GENBANK_DNA: premade GENBANK | DNA.
- static int GENBANK_RNA: premade GENBANK | RNA.
- static int GENPEPT: indicates that the sequence format is GENPEPT.
- static int GFF: indicates that the sequence format is GFF.
- static int IG: indicates that the sequence format is IG.
- static int INTEGER: indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data.
- static LifeScienceIdentifier LSID_EMBL_AA: sequence format LSID for EMBL AA.
- static LifeScienceIdentifier LSID_EMBL_DNA: sequence format LSID for EMBL DNA.
- static LifeScienceIdentifier LSID_EMBL_RNA: sequence format LSID for EMBL RNA.
- static LifeScienceIdentifier LSID_FASTA_AA: sequence format LSID for Fasta AA.
- static LifeScienceIdentifier LSID_FASTA_DNA: sequence format LSID for Fasta DNA.
- static LifeScienceIdentifier LSID_FASTA_RNA: sequence format LSID for Fasta RNA.
- static LifeScienceIdentifier LSID_GENBANK_AA: sequence format LSID for Genbank AA.
- static LifeScienceIdentifier LSID_GENBANK_DNA: sequence format LSID for Genbank DNA.
- static LifeScienceIdentifier LSID_GENBANK_RNA: sequence format LSID for Genbank RNA.
- static LifeScienceIdentifier LSID_SWISSPROT: sequence format LSID for Swissprot.
- static int NBRF: indicates that the sequence format is NBRF.
- static int PDB: indicates that the sequence format is PDB.
- static int PHRED: indicates that the sequence format is PHRED.
- static int RAW: indicates that the sequence format is raw (symbols only).
- static int REFSEQ: indicates that the sequence format is REFSEQ.
- static int REFSEQ_AA: premade REFSEQ | AA.
- static int REFSEQ_DNA: premade REFSEQ | DNA.
- static int REFSEQ_RNA: premade REFSEQ | RNA.
- static int RNA: indicates that a sequence contains RNA (ribonucleic acid) symbols.
- static int SWISSPROT: indicates that the sequence format is SWISSPROT.
- static int UNKNOWN: indicates that the sequence format is unknown.
-
Constructor Summary
Constructors
- SeqIOConstants()
-
-
-
Field Detail
-
AMBIGUOUS
public static final int AMBIGUOUS
AMBIGUOUS indicates that a sequence contains ambiguity symbols. The first bit of the most significant word of the int is set.
See Also: Constant Field Values
-
DNA
public static final int DNA
DNA indicates that a sequence contains DNA (deoxyribonucleic acid) symbols. The second bit of the most significant word of the int is set.
See Also: Constant Field Values
-
RNA
public static final int RNA
RNA indicates that a sequence contains RNA (ribonucleic acid) symbols. The third bit of the most significant word of the int is set.
See Also: Constant Field Values
-
AA
public static final int AA
AA indicates that a sequence contains AA (amino acid) symbols. The fourth bit of the most significant word of the int is set.
See Also: Constant Field Values
-
INTEGER
public static final int INTEGER
INTEGER indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data. The fifth bit of the most significant word of the int is set.
See Also: Constant Field Values
-
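As a rough illustration of the five alphabet flags described above, the following sketch tests which flags are set in an identifier. It assumes that "the nth bit of the most significant word" means bit 16 + (n - 1) of the int; that reading, the class `AlphabetFlagsSketch`, its methods, and the format value 102 are assumptions made for this example, not values taken from the library.

```java
// Sketch only: bit positions are an assumed reading of the field descriptions.
final class AlphabetFlagsSketch {
    static final int AMBIGUOUS = 1 << 16; // first bit of the upper word (assumed)
    static final int DNA       = 1 << 17; // second bit (assumed)
    static final int RNA       = 1 << 18; // third bit (assumed)
    static final int AA        = 1 << 19; // fourth bit (assumed)
    static final int INTEGER   = 1 << 20; // fifth bit (assumed)

    /** Returns true if the given flag bit is set in the identifier. */
    static boolean has(int id, int flag) {
        return (id & flag) != 0;
    }

    /** Lists the names of the alphabet flags set in the identifier. */
    static String describe(int id) {
        StringBuilder sb = new StringBuilder();
        if (has(id, AMBIGUOUS)) sb.append("AMBIGUOUS ");
        if (has(id, DNA))       sb.append("DNA ");
        if (has(id, RNA))       sb.append("RNA ");
        if (has(id, AA))        sb.append("AA ");
        if (has(id, INTEGER))   sb.append("INTEGER ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int id = AMBIGUOUS | DNA | 102; // 102 is a hypothetical format value
        System.out.println(describe(id));
    }
}
```

Because each flag occupies its own bit, several flags can be set at once and tested independently of the format value in the lower bytes.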
UNKNOWN
public static final int UNKNOWN
UNKNOWN indicates that the sequence format is unknown.
See Also: Constant Field Values
-
RAW
public static final int RAW
RAW indicates that the sequence format is raw (symbols only).
See Also: Constant Field Values
-
FASTA
public static final int FASTA
FASTA indicates that the sequence format is Fasta.
See Also: Constant Field Values
-
NBRF
public static final int NBRF
NBRF indicates that the sequence format is NBRF.
See Also: Constant Field Values
-
IG
public static final int IG
IG indicates that the sequence format is IG.
See Also: Constant Field Values
-
EMBL
public static final int EMBL
EMBL indicates that the sequence format is EMBL.
See Also: Constant Field Values
-
SWISSPROT
public static final int SWISSPROT
SWISSPROT indicates that the sequence format is SWISSPROT. Always protein, so already has the AA bit set.
See Also: Constant Field Values
-
GENBANK
public static final int GENBANK
GENBANK indicates that the sequence format is GENBANK.
See Also: Constant Field Values
-
GENPEPT
public static final int GENPEPT
GENPEPT indicates that the sequence format is GENPEPT. Always protein, so already has the AA bit set.
See Also: Constant Field Values
-
REFSEQ
public static final int REFSEQ
REFSEQ indicates that the sequence format is REFSEQ.
See Also: Constant Field Values
-
GCG
public static final int GCG
GCG indicates that the sequence format is GCG.
See Also: Constant Field Values
-
GFF
public static final int GFF
GFF indicates that the sequence format is GFF.
See Also: Constant Field Values
-
PDB
public static final int PDB
PDB indicates that the sequence format is PDB. Always protein, so already has the AA bit set.
See Also: Constant Field Values
-
PHRED
public static final int PHRED
PHRED indicates that the sequence format is PHRED. Always DNA, so already has the DNA bit set. Also has INTEGER bit set for quality data.
See Also: Constant Field Values
-
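Some of the format constants above (SWISSPROT, GENPEPT, PDB, PHRED) are described as already carrying alphabet bits. The sketch below illustrates one consequence under assumed values: masking an identifier down to its lower two bytes no longer equals such a constant, so format comparisons are safer when both sides are masked. The class `PresetBitsSketch`, its methods, and every numeric value are hypothetical.

```java
// Sketch only: hypothetical stand-in values, not the real SeqIOConstants values.
final class PresetBitsSketch {
    static final int DNA     = 1 << 17; // assumed bit position
    static final int AA      = 1 << 19; // assumed bit position
    static final int INTEGER = 1 << 20; // assumed bit position

    // Formats that already include their alphabet bits (assumed base values).
    static final int SWISSPROT = 111 | AA;
    static final int PHRED     = 130 | DNA | INTEGER;

    static final int FORMAT_MASK = 0x0000FFFF;

    /** Returns true if every bit of the flag is set in the identifier. */
    static boolean has(int id, int flag) {
        return (id & flag) == flag;
    }

    /** Compares only the format part (lower two bytes) of two identifiers. */
    static boolean sameFormat(int id, int format) {
        return (id & FORMAT_MASK) == (format & FORMAT_MASK);
    }

    public static void main(String[] args) {
        System.out.println(has(SWISSPROT, AA));            // AA bit is preset
        System.out.println((PHRED | DNA) == PHRED);        // OR-ing DNA again changes nothing
        System.out.println((PHRED & FORMAT_MASK) == PHRED); // masked value differs from the constant
        System.out.println(sameFormat(PHRED, PHRED));
    }
}
```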
EMBL_DNA
public static final int EMBL_DNA
EMBL_DNA premade EMBL | DNA.
See Also: Constant Field Values
-
EMBL_RNA
public static final int EMBL_RNA
EMBL_RNA premade EMBL | RNA.
See Also: Constant Field Values
-
EMBL_AA
public static final int EMBL_AA
EMBL_AA premade EMBL | AA.
See Also: Constant Field Values
-
GENBANK_DNA
public static final int GENBANK_DNA
GENBANK_DNA premade GENBANK | DNA.
See Also: Constant Field Values
-
GENBANK_RNA
public static final int GENBANK_RNA
GENBANK_RNA premade GENBANK | RNA.
See Also: Constant Field Values
-
GENBANK_AA
public static final int GENBANK_AA
GENBANK_AA premade GENBANK | AA.
See Also: Constant Field Values
-
REFSEQ_DNA
public static final int REFSEQ_DNA
REFSEQ_DNA premade REFSEQ | DNA.
See Also: Constant Field Values
-
REFSEQ_RNA
public static final int REFSEQ_RNA
REFSEQ_RNA premade REFSEQ | RNA.
See Also: Constant Field Values
-
REFSEQ_AA
public static final int REFSEQ_AA
REFSEQ_AA premade REFSEQ | AA.
See Also: Constant Field Values
-
FASTA_DNA
public static final int FASTA_DNA
FASTA_DNA premade FASTA | DNA.
See Also: Constant Field Values
-
FASTA_RNA
public static final int FASTA_RNA
FASTA_RNA premade FASTA | RNA.
See Also: Constant Field Values
-
FASTA_AA
public static final int FASTA_AA
FASTA_AA premade FASTA | AA.
See Also: Constant Field Values
-
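Because the premade combinations above are compile-time int constants, they can serve as `switch` case labels. The following sketch shows that pattern with hypothetical stand-in values; the class `PremadeSketch`, the method `label`, and all numbers are assumptions made for illustration and are not part of the documented API.

```java
// Sketch only: hypothetical stand-in values, not the real SeqIOConstants values.
final class PremadeSketch {
    static final int AMBIGUOUS = 1 << 16; // assumed bit position
    static final int DNA       = 1 << 17; // assumed bit position
    static final int AA        = 1 << 19; // assumed bit position

    static final int FASTA = 102; // assumed value
    static final int EMBL  = 110; // assumed value

    // Premade combinations, built the same way the field descriptions state.
    static final int FASTA_DNA = FASTA | DNA;
    static final int FASTA_AA  = FASTA | AA;
    static final int EMBL_DNA  = EMBL | DNA;

    /** Maps an exact premade identifier to a readable label. */
    static String label(int id) {
        switch (id) {
            case FASTA_DNA: return "Fasta DNA";
            case FASTA_AA:  return "Fasta AA";
            case EMBL_DNA:  return "EMBL DNA";
            default:        return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(label(FASTA | DNA));
        System.out.println(label(FASTA_DNA | AMBIGUOUS));
    }
}
```

An exact-match `switch` treats an identifier with an extra flag, such as AMBIGUOUS, as a different value, so code that must tolerate extra flags would need to mask them off first.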
LSID_FASTA_DNA
public static final LifeScienceIdentifier LSID_FASTA_DNA
LSID_FASTA_DNA sequence format LSID for Fasta DNA.
-
LSID_FASTA_RNA
public static final LifeScienceIdentifier LSID_FASTA_RNA
LSID_FASTA_RNA sequence format LSID for Fasta RNA.
-
LSID_FASTA_AA
public static final LifeScienceIdentifier LSID_FASTA_AA
LSID_FASTA_AA sequence format LSID for Fasta AA.
-
LSID_EMBL_DNA
public static final LifeScienceIdentifier LSID_EMBL_DNA
LSID_EMBL_DNA sequence format LSID for EMBL DNA.
-
LSID_EMBL_RNA
public static final LifeScienceIdentifier LSID_EMBL_RNA
LSID_EMBL_RNA sequence format LSID for EMBL RNA.
-
LSID_EMBL_AA
public static final LifeScienceIdentifier LSID_EMBL_AA
LSID_EMBL_AA sequence format LSID for EMBL AA.
-
LSID_GENBANK_DNA
public static final LifeScienceIdentifier LSID_GENBANK_DNA
LSID_GENBANK_DNA sequence format LSID for Genbank DNA.
-
LSID_GENBANK_RNA
public static final LifeScienceIdentifier LSID_GENBANK_RNA
LSID_GENBANK_RNA sequence format LSID for Genbank RNA.
-
LSID_GENBANK_AA
public static final LifeScienceIdentifier LSID_GENBANK_AA
LSID_GENBANK_AA sequence format LSID for Genbank AA.
-
LSID_SWISSPROT
public static final LifeScienceIdentifier LSID_SWISSPROT
LSID_SWISSPROT sequence format LSID for Swissprot.
-
-
Constructor Detail
-
SeqIOConstants
public SeqIOConstants()
-
-