Package org.biojava.nbio.aaproperties
Class PeptideProperties
- java.lang.Object
-
- org.biojava.nbio.aaproperties.PeptideProperties
-
public class PeptideProperties extends Object
This adaptor class makes it easy to generate protein properties. At least one adaptor method is provided for each property available in IPeptideProperties.
- Since:
- 3.0.2
- Version:
- 2011.08.22
- Author:
- kohchuanhock
- See Also:
IPeptideProperties, PeptidePropertiesImpl
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class PeptideProperties.SingleLetterAACode
Enumeration of the 20 standard amino acid codes
-
Field Summary
Fields
Modifier and Type, Field, Description
static Set<Character> standardAASet
A set containing the 20 standard amino acid codes
-
Constructor Summary
Constructors
Constructor, Description
PeptideProperties()
-
Method Summary
Modifier and Type, Method, Description
static Map<AminoAcidCompound,Double> getAAComposition(String sequence)
An adaptor method to return the composition of the 20 standard amino acids in the sequence.
static Map<Character,Double> getAACompositionChar(String sequence)
An adaptor method to return the composition of the 20 standard amino acids in the sequence.
static Map<String,Double> getAACompositionString(String sequence)
An adaptor method to return the composition of the 20 standard amino acids in the sequence.
static double getAbsorbance(String sequence, boolean assumeCysReduced)
An adaptor method to return the absorbance (optical density) of sequence.
static double getApliphaticIndex(String sequence)
An adaptor method to return the aliphatic index of sequence.
static double getAromaticity(String sequence)
An adaptor method to return the aromaticity value of sequence.
static double getAvgHydropathy(String sequence)
An adaptor method to return the average hydropathy value of sequence.
static int[] getChargesOfAminoAcids(String sequence)
Returns the array of charges of each amino acid in a protein.
static double getEnrichment(String sequence, char aminoAcidCode)
An adaptor method to return the composition of the specified amino acid in the sequence.
static double getEnrichment(String sequence, String aminoAcidCode)
An adaptor method to return the composition of the specified amino acid in the sequence.
static double getEnrichment(String sequence, PeptideProperties.SingleLetterAACode aminoAcidCode)
An adaptor method to return the composition of the specified amino acid in the sequence.
static double getExtinctionCoefficient(String sequence, boolean assumeCysReduced)
An adaptor method to return the extinction coefficient of sequence.
static double getInstabilityIndex(String sequence)
An adaptor method to return the instability index of sequence.
static double getIsoelectricPoint(String sequence)
static double getIsoelectricPoint(String sequence, boolean useExpasyValues)
An adaptor method to return the isoelectric point of sequence.
static double getMolecularWeight(String sequence)
An adaptor method to return the molecular weight of sequence.
static double getMolecularWeight(String sequence, File aminoAcidCompositionFile)
An adaptor method to return the molecular weight of sequence.
static double getMolecularWeight(String sequence, File elementMassFile, File aminoAcidCompositionFile)
An adaptor method to return the molecular weight of sequence.
static double getMolecularWeightBasedOnXML(String sequence, AminoAcidCompositionTable aminoAcidCompositionTable)
An adaptor method that returns the molecular weight of sequence.
static double getNetCharge(String sequence)
static double getNetCharge(String sequence, boolean useExpasyValues)
static double getNetCharge(String sequence, boolean useExpasyValues, double pHPoint)
An adaptor method to return the net charge of sequence at the given pH (pHPoint, 7 by default).
static int[] getPolarityOfAminoAcids(String sequence)
Returns the array of polarity values of each amino acid in a protein sequence.
static AminoAcidCompositionTable obtainAminoAcidCompositionTable(File aminoAcidCompositionFile)
An adaptor method that initializes the amino acid composition table from the input XML files and stores the table for use in future calls to IPeptideProperties.getMolecularWeightBasedOnXML(ProteinSequence, AminoAcidCompositionTable).
static AminoAcidCompositionTable obtainAminoAcidCompositionTable(File elementMassFile, File aminoAcidCompositionFile)
An adaptor method that initializes the amino acid composition table from the input XML files and stores the table for use in future calls to IPeptideProperties.getMolecularWeightBasedOnXML(ProteinSequence, AminoAcidCompositionTable).
-
-
-
Field Detail
-
standardAASet
public static Set<Character> standardAASet
A set containing the 20 standard amino acid codes
-
-
Constructor Detail
-
PeptideProperties
public PeptideProperties()
-
-
Method Detail
-
getMolecularWeight
public static final double getMolecularWeight(String sequence)
An adaptor method to return the molecular weight of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. This method sums the molecular weight of each amino acid in the sequence. Molecular weights are based on the reference linked here.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the total molecular weight of sequence + weight of water molecule
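The summation described above can be sketched without the library. This is only an illustration, not the BioJava implementation: the class name MolecularWeightSketch, the caller-supplied mass table, and the choice to return 0 for an empty sequence are all assumptions. The sketch adds residue masses (amino acid minus one water) and then one water molecule for the free termini; the two example masses are approximate average values for glycine and alanine residues.

```java
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class MolecularWeightSketch {
    // Approximate average mass of one water molecule, in daltons.
    static final double WATER = 18.01528;

    // Sums caller-supplied residue masses and adds one water for the free termini.
    static double molecularWeight(String sequence, Map<Character, Double> residueMass) {
        if (sequence.isEmpty()) {
            return 0.0; // sketch choice: an empty sequence has no weight
        }
        double total = 0.0;
        for (char c : sequence.toUpperCase().toCharArray()) {
            Double mass = residueMass.get(c);
            if (mass == null) {
                throw new IllegalArgumentException("No mass for residue: " + c);
            }
            total += mass;
        }
        return total + WATER;
    }

    public static void main(String[] args) {
        // Illustrative table with only two residues (Gly and Ala).
        Map<Character, Double> masses = Map.of('G', 57.0513, 'A', 71.0779);
        System.out.println(molecularWeight("GAG", masses));
    }
}
```

A real table would need an entry for every residue in the sequence; the sketch rejects residues it has no mass for rather than guessing.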
-
getMolecularWeight
public static final double getMolecularWeight(String sequence, File elementMassFile, File aminoAcidCompositionFile) throws FileNotFoundException, JAXBException
An adaptor method to return the molecular weight of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. This method sums the molecular weight of each amino acid in the sequence. Molecular weights are based on the input XML files.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
elementMassFile
- XML file that details the mass of each element and isotope
aminoAcidCompositionFile
- XML file that details the composition of amino acids
- Returns:
- the total molecular weight of sequence + weight of water molecule
- Throws:
FileNotFoundException
- thrown if either elementMassFile or aminoAcidCompositionFile is not found
JAXBException
- thrown if either elementMassFile or aminoAcidCompositionFile cannot be parsed properly
-
getMolecularWeight
public static final double getMolecularWeight(String sequence, File aminoAcidCompositionFile) throws FileNotFoundException, JAXBException
An adaptor method to return the molecular weight of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. This method sums the molecular weight of each amino acid in the sequence. Molecular weights are based on the input files, which must be XML using the defined schema. Note that the ElementMass.xml file is assumed to be available in the default location.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
aminoAcidCompositionFile
- XML file that details the composition of amino acids
- Returns:
- the total molecular weight of sequence + weight of water molecule
- Throws:
JAXBException
- thrown if either elementMassFile or aminoAcidCompositionFile cannot be parsed properly
FileNotFoundException
- thrown if either elementMassFile or aminoAcidCompositionFile is not found
-
obtainAminoAcidCompositionTable
public static final AminoAcidCompositionTable obtainAminoAcidCompositionTable(File aminoAcidCompositionFile) throws JAXBException, FileNotFoundException
An adaptor method that initializes the amino acid composition table from the input XML files and stores the table for use in future calls to IPeptideProperties.getMolecularWeightBasedOnXML(ProteinSequence, AminoAcidCompositionTable). Note that ElementMass.xml is assumed to be available in the default location.
- Parameters:
aminoAcidCompositionFile
- XML file that details the composition of amino acids
- Returns:
- the initialized amino acid composition table
- Throws:
JAXBException
- thrown if either elementMassFile or aminoAcidCompositionFile cannot be parsed properly
FileNotFoundException
- thrown if either elementMassFile or aminoAcidCompositionFile is not found
-
obtainAminoAcidCompositionTable
public static final AminoAcidCompositionTable obtainAminoAcidCompositionTable(File elementMassFile, File aminoAcidCompositionFile) throws JAXBException, FileNotFoundException
An adaptor method that initializes the amino acid composition table from the input XML files and stores the table for use in future calls to IPeptideProperties.getMolecularWeightBasedOnXML(ProteinSequence, AminoAcidCompositionTable).
- Parameters:
elementMassFile
- XML file that details the mass of each element and isotope
aminoAcidCompositionFile
- XML file that details the composition of amino acids
- Returns:
- the initialized amino acid composition table
- Throws:
JAXBException
- thrown if either elementMassFile or aminoAcidCompositionFile cannot be parsed properly
FileNotFoundException
- thrown if either elementMassFile or aminoAcidCompositionFile is not found
-
getMolecularWeightBasedOnXML
public static double getMolecularWeightBasedOnXML(String sequence, AminoAcidCompositionTable aminoAcidCompositionTable)
An adaptor method that returns the molecular weight of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. This method sums the molecular weight of each amino acid in the sequence. Molecular weights are based on the AminoAcidCompositionTable, whose input files must be XML using the defined schema.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
aminoAcidCompositionTable
- an amino acid composition table obtained by calling IPeptideProperties.obtainAminoAcidCompositionTable
- Returns:
- the total molecular weight of sequence + weight of water molecule
- Throws:
- thrown if the method IPeptideProperties.setMolecularWeightXML(File, File) is not successfully called before calling this method
-
getAbsorbance
public static final double getAbsorbance(String sequence, boolean assumeCysReduced)
An adaptor method to return the absorbance (optical density) of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The computation of absorbance (optical density) follows the documentation linked here.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
assumeCysReduced
- true if Cys are assumed to be reduced and false if Cys are assumed to form cystines
- Returns:
- the absorbance (optical density) of sequence
-
getExtinctionCoefficient
public static final double getExtinctionCoefficient(String sequence, boolean assumeCysReduced)
An adaptor method to return the extinction coefficient of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The extinction coefficient indicates how much light a protein absorbs at a certain wavelength. An estimate of this coefficient is useful for following a protein with a spectrophotometer when purifying it. The computation of the extinction coefficient follows the documentation linked here.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
assumeCysReduced
- true if Cys are assumed to be reduced and false if Cys are assumed to form cystines
- Returns:
- the extinction coefficient of sequence
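To make the role of assumeCysReduced concrete, here is a minimal sketch of one widely used 280 nm estimate. It assumes the coefficients commonly quoted for this calculation (Tyr 1490, Trp 5500, cystine 125, in M^-1 cm^-1); whether PeptideProperties uses exactly these values is an assumption, and the class name ExtinctionSketch is hypothetical.

```java
// Hypothetical helper, not part of BioJava.
final class ExtinctionSketch {
    // Estimates the molar extinction coefficient at 280 nm from Tyr, Trp and cystine counts.
    static double extinctionCoefficient(String sequence, boolean assumeCysReduced) {
        int tyr = 0;
        int trp = 0;
        int cys = 0;
        for (char c : sequence.toUpperCase().toCharArray()) {
            if (c == 'Y') {
                tyr++;
            } else if (c == 'W') {
                trp++;
            } else if (c == 'C') {
                cys++;
            }
        }
        // Reduced Cys contribute nothing; otherwise every pair of Cys counts as one cystine.
        int cystines = assumeCysReduced ? 0 : cys / 2;
        return tyr * 1490.0 + trp * 5500.0 + cystines * 125.0;
    }

    public static void main(String[] args) {
        System.out.println(extinctionCoefficient("YWCC", false));
        System.out.println(extinctionCoefficient("YWCC", true));
    }
}
```

The only difference between the two modes is the cystine term, so sequences with fewer than two Cys give the same value either way.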
-
getInstabilityIndex
public static final double getInstabilityIndex(String sequence)
An adaptor method to return the instability index of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The instability index provides an estimate of the stability of your protein in a test tube. The computation of the instability index follows the documentation linked here.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the instability index of sequence
-
getApliphaticIndex
public static final double getApliphaticIndex(String sequence)
An adaptor method to return the aliphatic index of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The aliphatic index of a protein is defined as the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine, and leucine). It may be regarded as a positive factor for the increase of thermostability of globular proteins. The computation of the aliphatic index follows the documentation linked here. Note that the following threshold refers to the instability index (see getInstabilityIndex), not the aliphatic index: a protein whose instability index is smaller than 40 is predicted to be stable, while a value above 40 predicts that the protein may be unstable.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the aliphatic index of sequence
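The definition above is usually written as X(Ala) + a * X(Val) + b * (X(Ile) + X(Leu)), where each X is a mole percentage and a = 2.9 and b = 3.9 weight the side-chain volumes relative to alanine. The following sketch applies that formula directly; it assumes this standard form is what the linked documentation describes, and the class name AliphaticIndexSketch is hypothetical.

```java
// Hypothetical helper, not part of BioJava.
final class AliphaticIndexSketch {
    // Computes X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)) with X in mole percent.
    static double aliphaticIndex(String sequence) {
        if (sequence.isEmpty()) {
            return 0.0; // sketch choice: avoid dividing by zero
        }
        int ala = 0;
        int val = 0;
        int ileLeu = 0;
        for (char c : sequence.toUpperCase().toCharArray()) {
            switch (c) {
                case 'A': ala++; break;
                case 'V': val++; break;
                case 'I':
                case 'L': ileLeu++; break;
                default: break;
            }
        }
        double n = sequence.length();
        double xAla = 100.0 * ala / n;
        double xVal = 100.0 * val / n;
        double xIleLeu = 100.0 * ileLeu / n;
        return xAla + 2.9 * xVal + 3.9 * xIleLeu;
    }

    public static void main(String[] args) {
        System.out.println(aliphaticIndex("AVIL"));
    }
}
```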
-
getAvgHydropathy
public static final double getAvgHydropathy(String sequence)
An adaptor method to return the average hydropathy value of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The average value for a sequence is calculated as the sum of the hydropathy values of all the amino acids, divided by the number of residues in the sequence. Hydropathy values are based on (Kyte, J. and Doolittle, R.F. (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105-132).
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the average hydropathy value of sequence
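The averaging step can be illustrated with the Kyte-Doolittle scale cited above. This is a sketch under stated assumptions rather than the library code: the class name HydropathySketch is hypothetical, and rejecting unknown characters or returning 0 for an empty sequence are choices made here.

```java
// Hypothetical helper, not part of BioJava.
final class HydropathySketch {
    // One-letter codes and their Kyte-Doolittle hydropathy values, in matching order.
    private static final String CODES = "ARNDCQEGHILKMFPSTWYV";
    private static final double[] VALUES = {
        1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
        3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2
    };

    // Sums the hydropathy of every residue and divides by the sequence length.
    static double averageHydropathy(String sequence) {
        if (sequence.isEmpty()) {
            return 0.0; // sketch choice: avoid dividing by zero
        }
        double total = 0.0;
        for (char c : sequence.toUpperCase().toCharArray()) {
            int index = CODES.indexOf(c);
            if (index < 0) {
                throw new IllegalArgumentException("Not a standard amino acid: " + c);
            }
            total += VALUES[index];
        }
        return total / sequence.length();
    }

    public static void main(String[] args) {
        System.out.println(averageHydropathy("IV"));
    }
}
```

Positive averages indicate a predominantly hydrophobic sequence and negative averages a predominantly hydrophilic one.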
-
getIsoelectricPoint
public static final double getIsoelectricPoint(String sequence, boolean useExpasyValues)
An adaptor method to return the isoelectric point of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The isoelectric point is the pH at which the protein carries no net electrical charge. The isoelectric point is computed using the approach linked here. The pKa values used are either those used by Expasy, which reference "Electrophoresis 1994, 15, 529-539", or those from A. Lehninger, Principles of Biochemistry, 4th Edition (2005), Chapter 3, page 78, Table 3-1.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
useExpasyValues
- whether to use Expasy values (Default) or Innovagen values
- Returns:
- the isoelectric point of sequence
-
getIsoelectricPoint
public static final double getIsoelectricPoint(String sequence)
-
getNetCharge
public static final double getNetCharge(String sequence, boolean useExpasyValues, double pHPoint)
An adaptor method to return the net charge of sequence at the given pH (pHPoint). The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The net charge is computed using the approach linked here. The pKa values used are either those used by Expasy, which reference "Electrophoresis 1994, 15, 529-539", or those from A. Lehninger, Principles of Biochemistry, 4th Edition (2005), Chapter 3, page 78, Table 3-1.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
useExpasyValues
- whether to use Expasy values (Default) or Innovagen values
pHPoint
- the pH value to use for computation of the net charge. The default is 7.
- Returns:
- the net charge of sequence at given pHPoint
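As an illustration of how a pH-dependent net charge can be computed from pKa values, the sketch below uses the Henderson-Hasselbalch relation for each ionizable group. This technique is an assumption: the approach linked in the original documentation is not reproduced here and may differ. The class name NetChargeSketch, the caller-supplied pKa table, and the pKa numbers in main are placeholders, not the Expasy or Lehninger values.

```java
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class NetChargeSketch {
    // Fraction of a basic group that is protonated (charge +1) at the given pH.
    static double positive(double pKa, double pH) {
        return 1.0 / (1.0 + Math.pow(10.0, pH - pKa));
    }

    // Fraction of an acidic group that is deprotonated (charge -1) at the given pH.
    static double negative(double pKa, double pH) {
        return 1.0 / (1.0 + Math.pow(10.0, pKa - pH));
    }

    // Adds the partial charges of both termini and of every ionizable side chain.
    static double netCharge(String sequence, double pH, Map<Character, Double> sideChainPKa,
                            double nTermPKa, double cTermPKa) {
        if (sequence.isEmpty()) {
            return 0.0;
        }
        double charge = positive(nTermPKa, pH) - negative(cTermPKa, pH);
        for (char c : sequence.toUpperCase().toCharArray()) {
            Double pKa = sideChainPKa.get(c);
            if (pKa == null) {
                continue; // side chain treated as non-ionizable in this model
            }
            if ("KRH".indexOf(c) >= 0) {
                charge += positive(pKa, pH);
            } else if ("DECY".indexOf(c) >= 0) {
                charge -= negative(pKa, pH);
            }
        }
        return charge;
    }

    public static void main(String[] args) {
        // Placeholder pKa values for illustration only.
        Map<Character, Double> pKa = Map.of('K', 10.5, 'D', 3.9);
        System.out.println(netCharge("KD", 7.0, pKa, 9.0, 2.0));
    }
}
```

An isoelectric point could then be located by searching for the pH at which this function crosses zero, since the net charge decreases monotonically as pH rises.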
-
getNetCharge
public static final double getNetCharge(String sequence, boolean useExpasyValues)
-
getNetCharge
public static final double getNetCharge(String sequence)
-
getEnrichment
public static final double getEnrichment(String sequence, PeptideProperties.SingleLetterAACode aminoAcidCode)
An adaptor method to return the composition of the specified amino acid in the sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The aminoAcidCode must be a non-ambiguous character. The composition of an amino acid is the number of its occurrences divided by the total length of the sequence.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
aminoAcidCode
- the code of the amino acid to compute
- Returns:
- the composition of the specified amino acid in the sequence
- See Also:
PeptideProperties.SingleLetterAACode
-
getEnrichment
public static final double getEnrichment(String sequence, char aminoAcidCode)
An adaptor method to return the composition of the specified amino acid in the sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The aminoAcidCode must be a non-ambiguous character. The composition of an amino acid is the number of its occurrences divided by the total length of the sequence.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
aminoAcidCode
- the code of the amino acid to compute
- Returns:
- the composition of the specified amino acid in the sequence
-
getEnrichment
public static final double getEnrichment(String sequence, String aminoAcidCode)
An adaptor method to return the composition of the specified amino acid in the sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The aminoAcidCode must be a non-ambiguous character. The composition of an amino acid is the number of its occurrences divided by the total length of the sequence.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
aminoAcidCode
- the code of the amino acid to compute
- Returns:
- the composition of the specified amino acid in the sequence
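The three getEnrichment overloads share one definition: occurrences of the amino acid divided by sequence length. A minimal stand-alone sketch of that ratio follows; the class name EnrichmentSketch and the handling of an empty sequence are assumptions made for illustration.

```java
// Hypothetical helper, not part of BioJava.
final class EnrichmentSketch {
    // Returns the fraction of positions in sequence occupied by aminoAcidCode.
    static double enrichment(String sequence, char aminoAcidCode) {
        if (sequence.isEmpty()) {
            return 0.0; // sketch choice: avoid dividing by zero
        }
        char target = Character.toUpperCase(aminoAcidCode);
        int count = 0;
        for (char c : sequence.toUpperCase().toCharArray()) {
            if (c == target) {
                count++;
            }
        }
        return (double) count / sequence.length();
    }

    public static void main(String[] args) {
        System.out.println(enrichment("AACD", 'A'));
    }
}
```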
-
getAAComposition
public static final Map<AminoAcidCompound,Double> getAAComposition(String sequence)
An adaptor method to return the composition of the 20 standard amino acids in the sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The composition of an amino acid is the number of its occurrences divided by the total length of the sequence.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the composition of the 20 standard amino acids in the sequence
- See Also:
AminoAcidCompound
-
getAACompositionString
public static final Map<String,Double> getAACompositionString(String sequence)
An adaptor method to return the composition of the 20 standard amino acids in the sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The composition of an amino acid is the number of its occurrences divided by the total length of the sequence.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the composition of the 20 standard amino acids in the sequence
-
getAACompositionChar
public static final Map<Character,Double> getAACompositionChar(String sequence)
An adaptor method to return the composition of the 20 standard amino acids in the sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. The composition of an amino acid is the number of its occurrences divided by the total length of the sequence.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the composition of the 20 standard amino acids in the sequence
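The composition maps returned by the three methods above differ only in key type. A stand-alone sketch of the Character-keyed variant might look as follows; the class name CompositionSketch is hypothetical, and including every standard code with a value of 0 when it is absent is an assumption about the output shape.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class CompositionSketch {
    private static final String STANDARD_CODES = "ACDEFGHIKLMNPQRSTVWY";

    // Maps each of the 20 standard codes to its count divided by the sequence length.
    static Map<Character, Double> composition(String sequence) {
        Map<Character, Double> result = new LinkedHashMap<>();
        for (char code : STANDARD_CODES.toCharArray()) {
            result.put(code, 0.0);
        }
        if (sequence.isEmpty()) {
            return result; // sketch choice: all fractions stay at zero
        }
        double step = 1.0 / sequence.length();
        for (char c : sequence.toUpperCase().toCharArray()) {
            // Only standard codes are tallied; other characters are skipped.
            result.computeIfPresent(c, (key, value) -> value + step);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(composition("AAC"));
    }
}
```

For a sequence made only of standard residues, the 20 fractions sum to 1 up to floating-point rounding.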
-
getChargesOfAminoAcids
public static final int[] getChargesOfAminoAcids(String sequence)
Returns the array of charges of each amino acid in a protein. At pH=7, two are negatively charged: aspartic acid (Asp, D) and glutamic acid (Glu, E) (acidic side chains), and three are positively charged: lysine (Lys, K), arginine (Arg, R) and histidine (His, H) (basic side chains).
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the array of charges of amino acids in the protein (1 if amino acid is positively charged, -1 if negatively charged, 0 if not charged)
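The mapping described above (D and E to -1; K, R and H to 1; everything else to 0) can be sketched in a few lines. The class name ChargeArraySketch is hypothetical and the code is an illustration of the rule, not the library source.

```java
import java.util.Arrays;

// Hypothetical helper, not part of BioJava.
final class ChargeArraySketch {
    // Returns 1 for K, R, H; -1 for D, E; and 0 for every other residue, position by position.
    static int[] chargesOfAminoAcids(String sequence) {
        int[] charges = new int[sequence.length()];
        for (int i = 0; i < sequence.length(); i++) {
            char c = Character.toUpperCase(sequence.charAt(i));
            if (c == 'K' || c == 'R' || c == 'H') {
                charges[i] = 1;
            } else if (c == 'D' || c == 'E') {
                charges[i] = -1;
            } else {
                charges[i] = 0;
            }
        }
        return charges;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(chargesOfAminoAcids("KDEGH")));
    }
}
```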
-
getPolarityOfAminoAcids
public static final int[] getPolarityOfAminoAcids(String sequence)
Returns the array of polarity values of each amino acid in a protein sequence.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the array of polarity of amino acids in the protein (1 if amino acid is polar, 0 if not)
-
getAromaticity
public static final double getAromaticity(String sequence)
An adaptor method to return the aromaticity value of sequence. The sequence argument must be a protein sequence consisting of only non-ambiguous characters. Calculates the aromaticity value of a protein according to Lobry, 1994. It is simply the relative frequency of Phe+Trp+Tyr.
- Parameters:
sequence
- a protein sequence consisting of non-ambiguous characters only
- Returns:
- the aromaticity value of sequence
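Since aromaticity is described as the relative frequency of Phe, Trp and Tyr, it reduces to a count divided by the sequence length. The sketch below shows that calculation; the class name AromaticitySketch and the empty-sequence behaviour are assumptions.

```java
// Hypothetical helper, not part of BioJava.
final class AromaticitySketch {
    // Returns (count of F + W + Y) divided by the sequence length.
    static double aromaticity(String sequence) {
        if (sequence.isEmpty()) {
            return 0.0; // sketch choice: avoid dividing by zero
        }
        int aromatic = 0;
        for (char c : sequence.toUpperCase().toCharArray()) {
            if (c == 'F' || c == 'W' || c == 'Y') {
                aromatic++;
            }
        }
        return (double) aromatic / sequence.length();
    }

    public static void main(String[] args) {
        System.out.println(aromaticity("FWYA"));
    }
}
```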
-
-