Package org.biojava.bio.seq.io
Class SeqIOConstants
- java.lang.Object
-
- org.biojava.bio.seq.io.SeqIOConstants
-
public final class SeqIOConstants extends Object
SeqIOConstants contains constants used to identify sequence formats, alphabets etc., in the context of reading and writing sequences.
An int used to specify symbol alphabet and sequence format type is derived thus:
- The two least significant bytes are reserved for format types such as RAW, FASTA, EMBL etc.
- The two most significant bytes are reserved for alphabet and symbol information such as AMBIGUOUS, DNA, RNA, AA etc.
Bitwise OR combinations of each component int are used to specify combinations of format type and symbol information. To derive an int identifier for DNA with ambiguity codes in Fasta format, bitwise OR the AMBIGUOUS, DNA and FASTA values.
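The derivation above can be sketched as follows. This is a minimal illustration, not the library's source: the class SeqIdSketch, its helper methods and the numeric values are assumptions made so the example compiles without BioJava. Only the byte layout (format in the two least significant bytes, alphabet information in the two most significant bytes) is taken from the description.

```java
// Sketch only: in real code use the fields of org.biojava.bio.seq.io.SeqIOConstants.
final class SeqIdSketch {
    // Assumed stand-in values, chosen to follow the layout described above.
    static final int AMBIGUOUS = 1 << 16; // alphabet flag, upper two bytes
    static final int DNA       = 1 << 17; // alphabet flag, upper two bytes
    static final int FASTA     = 102;     // placeholder format code, lower two bytes

    static final int FORMAT_MASK   = 0x0000FFFF; // two least significant bytes
    static final int ALPHABET_MASK = 0xFFFF0000; // two most significant bytes

    /** Returns the format part of an identifier. */
    static int formatOf(int id) {
        return id & FORMAT_MASK;
    }

    /** Returns the alphabet and symbol part of an identifier. */
    static int alphabetOf(int id) {
        return id & ALPHABET_MASK;
    }

    public static void main(String[] args) {
        // DNA with ambiguity codes in Fasta format.
        int id = AMBIGUOUS | DNA | FASTA;
        System.out.println(formatOf(id) == FASTA);               // true
        System.out.println(alphabetOf(id) == (AMBIGUOUS | DNA)); // true
    }
}
```

With the library on the classpath, the equivalent expression would be SeqIOConstants.AMBIGUOUS | SeqIOConstants.DNA | SeqIOConstants.FASTA.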
- Author:
- Keith James
-
-
Field Summary
Fields (modifier and type, field name, description):
- static int AA: AA indicates that a sequence contains AA (amino acid) symbols.
- static int AMBIGUOUS: AMBIGUOUS indicates that a sequence contains ambiguity symbols.
- static int DNA: DNA indicates that a sequence contains DNA (deoxyribonucleic acid) symbols.
- static int EMBL: EMBL indicates that the sequence format is EMBL.
- static int EMBL_AA: EMBL_AA premade EMBL | AA.
- static int EMBL_DNA: EMBL_DNA premade EMBL | DNA.
- static int EMBL_RNA: EMBL_RNA premade EMBL | RNA.
- static int FASTA: FASTA indicates that the sequence format is Fasta.
- static int FASTA_AA: FASTA_AA premade FASTA | AA.
- static int FASTA_DNA: FASTA_DNA premade FASTA | DNA.
- static int FASTA_RNA: FASTA_RNA premade FASTA | RNA.
- static int GCG: GCG indicates that the sequence format is GCG.
- static int GENBANK: GENBANK indicates that the sequence format is GENBANK.
- static int GENBANK_AA: GENBANK_AA premade GENBANK | AA.
- static int GENBANK_DNA: GENBANK_DNA premade GENBANK | DNA.
- static int GENBANK_RNA: GENBANK_RNA premade GENBANK | RNA.
- static int GENPEPT: GENPEPT indicates that the sequence format is GENPEPT.
- static int GFF: GFF indicates that the sequence format is GFF.
- static int IG: IG indicates that the sequence format is IG.
- static int INTEGER: INTEGER indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data.
- static LifeScienceIdentifier LSID_EMBL_AA: LSID_EMBL_AA sequence format LSID for EMBL AA.
- static LifeScienceIdentifier LSID_EMBL_DNA: LSID_EMBL_DNA sequence format LSID for EMBL DNA.
- static LifeScienceIdentifier LSID_EMBL_RNA: LSID_EMBL_RNA sequence format LSID for EMBL RNA.
- static LifeScienceIdentifier LSID_FASTA_AA: LSID_FASTA_AA sequence format LSID for Fasta AA.
- static LifeScienceIdentifier LSID_FASTA_DNA: LSID_FASTA_DNA sequence format LSID for Fasta DNA.
- static LifeScienceIdentifier LSID_FASTA_RNA: LSID_FASTA_RNA sequence format LSID for Fasta RNA.
- static LifeScienceIdentifier LSID_GENBANK_AA: LSID_GENBANK_AA sequence format LSID for Genbank AA.
- static LifeScienceIdentifier LSID_GENBANK_DNA: LSID_GENBANK_DNA sequence format LSID for Genbank DNA.
- static LifeScienceIdentifier LSID_GENBANK_RNA: LSID_GENBANK_RNA sequence format LSID for Genbank RNA.
- static LifeScienceIdentifier LSID_SWISSPROT: LSID_SWISSPROT sequence format LSID for Swissprot.
- static int NBRF: NBRF indicates that the sequence format is NBRF.
- static int PDB: PDB indicates that the sequence format is PDB.
- static int PHRED: PHRED indicates that the sequence format is PHRED.
- static int RAW: RAW indicates that the sequence format is raw (symbols only).
- static int REFSEQ: REFSEQ indicates that the sequence format is REFSEQ.
- static int REFSEQ_AA: REFSEQ_AA premade REFSEQ | AA.
- static int REFSEQ_DNA: REFSEQ_DNA premade REFSEQ | DNA.
- static int REFSEQ_RNA: REFSEQ_RNA premade REFSEQ | RNA.
- static int RNA: RNA indicates that a sequence contains RNA (ribonucleic acid) symbols.
- static int SWISSPROT: SWISSPROT indicates that the sequence format is SWISSPROT.
- static int UNKNOWN: UNKNOWN indicates that the sequence format is unknown.
-
Constructor Summary
- SeqIOConstants()
-
-
-
Field Detail
-
AMBIGUOUS
public static final int AMBIGUOUS
AMBIGUOUS indicates that a sequence contains ambiguity symbols. The first bit of the most significant word of the int is set.
- See Also:
- Constant Field Values
-
DNA
public static final int DNA
DNA indicates that a sequence contains DNA (deoxyribonucleic acid) symbols. The second bit of the most significant word of the int is set.
- See Also:
- Constant Field Values
-
RNA
public static final int RNA
RNA indicates that a sequence contains RNA (ribonucleic acid) symbols. The third bit of the most significant word of the int is set.
- See Also:
- Constant Field Values
-
AA
public static final int AA
AA indicates that a sequence contains AA (amino acid) symbols. The fourth bit of the most significant word of the int is set.
- See Also:
- Constant Field Values
-
INTEGER
public static final int INTEGER
INTEGER indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data. The fifth bit of the most significant word of the int is set.
- See Also:
- Constant Field Values
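The five alphabet flags above each occupy one bit, so an identifier can be tested for any of them with a bitwise AND. The sketch below assumes that "the first bit of the most significant word" means bit 16 of the int (a 16-bit word, counting from its lowest bit); the class AlphabetBitsSketch and its methods are hypothetical helpers, and real code should read the values from SeqIOConstants.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: flag values are assumed from the bit positions described above.
final class AlphabetBitsSketch {
    static final int AMBIGUOUS = 1 << 16; // first bit of the upper word (assumed)
    static final int DNA       = 1 << 17; // second bit
    static final int RNA       = 1 << 18; // third bit
    static final int AA        = 1 << 19; // fourth bit
    static final int INTEGER   = 1 << 20; // fifth bit

    /** True if the given flag bit is set in the identifier. */
    static boolean has(int id, int flag) {
        return (id & flag) != 0;
    }

    /** Lists the names of all alphabet flags set in the identifier. */
    static List<String> alphabetNames(int id) {
        List<String> names = new ArrayList<>();
        if (has(id, AMBIGUOUS)) names.add("AMBIGUOUS");
        if (has(id, DNA))       names.add("DNA");
        if (has(id, RNA))       names.add("RNA");
        if (has(id, AA))        names.add("AA");
        if (has(id, INTEGER))   names.add("INTEGER");
        return names;
    }

    public static void main(String[] args) {
        // Format bits in the lower two bytes do not affect the flag tests.
        System.out.println(alphabetNames(AMBIGUOUS | DNA | 102));
    }
}
```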
-
UNKNOWN
public static final int UNKNOWN
UNKNOWN indicates that the sequence format is unknown.
- See Also:
- Constant Field Values
-
RAW
public static final int RAW
RAW indicates that the sequence format is raw (symbols only).
- See Also:
- Constant Field Values
-
FASTA
public static final int FASTA
FASTA indicates that the sequence format is Fasta.
- See Also:
- Constant Field Values
-
NBRF
public static final int NBRF
NBRF indicates that the sequence format is NBRF.
- See Also:
- Constant Field Values
-
IG
public static final int IG
IG indicates that the sequence format is IG.
- See Also:
- Constant Field Values
-
EMBL
public static final int EMBL
EMBL indicates that the sequence format is EMBL.
- See Also:
- Constant Field Values
-
SWISSPROT
public static final int SWISSPROT
SWISSPROT indicates that the sequence format is SWISSPROT. This format is always protein, so the value already has the AA bit set.
- See Also:
- Constant Field Values
-
GENBANK
public static final int GENBANK
GENBANK indicates that the sequence format is GENBANK.
- See Also:
- Constant Field Values
-
GENPEPT
public static final int GENPEPT
GENPEPT indicates that the sequence format is GENPEPT. This format is always protein, so the value already has the AA bit set.
- See Also:
- Constant Field Values
-
REFSEQ
public static final int REFSEQ
REFSEQ indicates that the sequence format is REFSEQ.
- See Also:
- Constant Field Values
-
GCG
public static final int GCG
GCG indicates that the sequence format is GCG.
- See Also:
- Constant Field Values
-
GFF
public static final int GFF
GFF indicates that the sequence format is GFF.
- See Also:
- Constant Field Values
-
PDB
public static final int PDB
PDB indicates that the sequence format is PDB. This format is always protein, so the value already has the AA bit set.
- See Also:
- Constant Field Values
-
PHRED
public static final int PHRED
PHRED indicates that the sequence format is PHRED. This format is always DNA, so the value already has the DNA bit set. It also has the INTEGER bit set for quality data.
- See Also:
- Constant Field Values
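Several format constants above (SWISSPROT, GENPEPT, PDB, PHRED) are described as already carrying alphabet bits. A rough sketch of what that implies for callers follows. The class ImpliedAlphabetSketch, the format codes 111 and 130, and the flag values are illustrative assumptions rather than the library's actual numbers.

```java
// Sketch only: values are placeholders that mimic the behaviour described above.
final class ImpliedAlphabetSketch {
    static final int DNA     = 1 << 17; // assumed flag value
    static final int AA      = 1 << 19; // assumed flag value
    static final int INTEGER = 1 << 20; // assumed flag value

    // Hypothetical format codes with their alphabet bits folded in.
    static final int SWISSPROT = 111 | AA;
    static final int PHRED     = 130 | DNA | INTEGER;

    /** True if the identifier describes protein (AA) symbols. */
    static boolean isProtein(int id) {
        return (id & AA) != 0;
    }

    /** True if the identifier carries integer quality data. */
    static boolean hasQualityData(int id) {
        return (id & INTEGER) != 0;
    }

    public static void main(String[] args) {
        // ORing in a bit that is already set changes nothing.
        System.out.println((SWISSPROT | AA) == SWISSPROT); // true
        System.out.println(isProtein(SWISSPROT));          // true
        System.out.println(hasQualityData(PHRED));         // true
        System.out.println(isProtein(PHRED));              // false
    }
}
```

One consequence of this design is that such a format constant can be used on its own as a complete identifier, without ORing in an alphabet flag.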
-
EMBL_DNA
public static final int EMBL_DNA
EMBL_DNA premade EMBL | DNA.
- See Also:
- Constant Field Values
-
EMBL_RNA
public static final int EMBL_RNA
EMBL_RNA premade EMBL | RNA.
- See Also:
- Constant Field Values
-
EMBL_AA
public static final int EMBL_AA
EMBL_AA premade EMBL | AA.
- See Also:
- Constant Field Values
-
GENBANK_DNA
public static final int GENBANK_DNA
GENBANK_DNA premade GENBANK | DNA.
- See Also:
- Constant Field Values
-
GENBANK_RNA
public static final int GENBANK_RNA
GENBANK_RNA premade GENBANK | RNA.
- See Also:
- Constant Field Values
-
GENBANK_AA
public static final int GENBANK_AA
GENBANK_AA premade GENBANK | AA.
- See Also:
- Constant Field Values
-
REFSEQ_DNA
public static final int REFSEQ_DNA
REFSEQ_DNA premade REFSEQ | DNA.
- See Also:
- Constant Field Values
-
REFSEQ_RNA
public static final int REFSEQ_RNA
REFSEQ_RNA premade REFSEQ | RNA.
- See Also:
- Constant Field Values
-
REFSEQ_AA
public static final int REFSEQ_AA
REFSEQ_AA premade REFSEQ | AA.
- See Also:
- Constant Field Values
-
FASTA_DNA
public static final int FASTA_DNA
FASTA_DNA premade FASTA | DNA.
- See Also:
- Constant Field Values
-
FASTA_RNA
public static final int FASTA_RNA
FASTA_RNA premade FASTA | RNA.
- See Also:
- Constant Field Values
-
FASTA_AA
public static final int FASTA_AA
FASTA_AA premade FASTA | AA.
- See Also:
- Constant Field Values
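Because the premade fields above are plain int constants, they can serve as case labels when dispatching on an identifier. The sketch below is one possible way to do that, not something the library prescribes: the class PremadeSketch, the describe method, the numeric values, and the choice to clear the AMBIGUOUS bit before matching are all assumptions made for illustration.

```java
// Sketch only: stand-in values; real code would use the SeqIOConstants fields.
final class PremadeSketch {
    static final int AMBIGUOUS = 1 << 16; // assumed flag value
    static final int DNA       = 1 << 17; // assumed flag value
    static final int AA        = 1 << 19; // assumed flag value
    static final int FASTA     = 102;     // placeholder format code
    static final int EMBL      = 110;     // placeholder format code

    // Premade combinations, built the same way the descriptions above state.
    static final int FASTA_DNA = FASTA | DNA;
    static final int FASTA_AA  = FASTA | AA;
    static final int EMBL_DNA  = EMBL | DNA;

    /** Maps an identifier to a short label, ignoring the ambiguity flag. */
    static String describe(int id) {
        // Clear AMBIGUOUS so that ambiguous and unambiguous variants match the same case.
        switch (id & ~AMBIGUOUS) {
            case FASTA_DNA: return "Fasta, DNA";
            case FASTA_AA:  return "Fasta, protein";
            case EMBL_DNA:  return "EMBL, DNA";
            default:        return "unrecognised";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(FASTA | DNA));             // Fasta, DNA
        System.out.println(describe(AMBIGUOUS | EMBL_DNA));    // EMBL, DNA
    }
}
```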
-
LSID_FASTA_DNA
public static final LifeScienceIdentifier LSID_FASTA_DNA
LSID_FASTA_DNA sequence format LSID for Fasta DNA.
-
LSID_FASTA_RNA
public static final LifeScienceIdentifier LSID_FASTA_RNA
LSID_FASTA_RNA sequence format LSID for Fasta RNA.
-
LSID_FASTA_AA
public static final LifeScienceIdentifier LSID_FASTA_AA
LSID_FASTA_AA sequence format LSID for Fasta AA.
-
LSID_EMBL_DNA
public static final LifeScienceIdentifier LSID_EMBL_DNA
LSID_EMBL_DNA sequence format LSID for EMBL DNA.
-
LSID_EMBL_RNA
public static final LifeScienceIdentifier LSID_EMBL_RNA
LSID_EMBL_RNA sequence format LSID for EMBL RNA.
-
LSID_EMBL_AA
public static final LifeScienceIdentifier LSID_EMBL_AA
LSID_EMBL_AA sequence format LSID for EMBL AA.
-
LSID_GENBANK_DNA
public static final LifeScienceIdentifier LSID_GENBANK_DNA
LSID_GENBANK_DNA sequence format LSID for Genbank DNA.
-
LSID_GENBANK_RNA
public static final LifeScienceIdentifier LSID_GENBANK_RNA
LSID_GENBANK_RNA sequence format LSID for Genbank RNA.
-
LSID_GENBANK_AA
public static final LifeScienceIdentifier LSID_GENBANK_AA
LSID_GENBANK_AA sequence format LSID for Genbank AA.
-
LSID_SWISSPROT
public static final LifeScienceIdentifier LSID_SWISSPROT
LSID_SWISSPROT sequence format LSID for Swissprot.
-
-
Constructor Detail
-
SeqIOConstants
public SeqIOConstants()
-
-