Class StructureIO

java.lang.Object
org.biojava.nbio.structure.StructureIO

public class StructureIO extends Object
A class that provides static access methods for easy lookup of protein-structure-related components.
Since:
3.0.5
Author:
Andreas Prlic
  • Constructor Details

  • Method Details

    • getStructure

      Loads a structure based on a name. The formal specification of the supported naming conventions is:

                      name    := pdbID
                               | pdbID '.' chainID
                               | pdbID '.' range
                               | scopID
                               | biol
                               | pdp
                      range   := '('? range (',' range)? ')'?
                               | chainID
                               | chainID '_' resNum '-' resNum
                      pdbID   := [1-9][a-zA-Z0-9]{3}
                               | PDB_[a-zA-Z0-9]{8}
                      chainID := [a-zA-Z0-9]
                      scopID  := 'd' pdbID [a-z_][0-9_]
                      biol    := 'BIO:' pdbID [:]? [0-9]+
                      resNum  := [-+]?[0-9]+[A-Za-z]?
      
      
                      Example structures:
                      1TIM                # whole structure - asym unit (short format)
                      4HHB.C              # single chain
                      4GCR.A_1-83         # one domain, by residue number
                      3AA0.A,B            # two chains treated as one structure
                      PDB_00001TIM        # whole structure - asym unit (extended format)
                      PDB_00004HHB.C      # single chain
                      PDB_00004GCR.A_1-83 # one domain, by residue number
                      PDB_00003AA0.A,B    # two chains treated as one structure
                      d2bq6a1             # SCOP domain
                      BIO:1fah            # biological assembly nr 1 for 1fah
                      BIO:1fah:0          # asym unit for 1fah
                      BIO:1fah:1          # biological assembly nr 1 for 1fah
                      BIO:1fah:2          # biological assembly nr 2 for 1fah
      
       
      With the additional set of rules:
      • If only a PDB code is provided, the whole structure will be returned, including ligands, but only the first model (for NMR).
      • Chain IDs are case-sensitive; PDB IDs are not. To specify a particular chain, write e.g. 4hhb.A or 4HHB.A.
      • To specify a SCOP domain, write a scopID, e.g. d2bq6a1.
      • URLs are accepted as well.
      Parameters:
      name -
      Returns:
      a Structure object, or null if name appears improperly formatted (e.g. too short)
      Throws:
      IOException - The PDB file cannot be cached due to IO errors
      StructureException - The name appeared valid but did not correspond to a structure. Also thrown by some submethods upon errors, e.g. for poorly formatted subranges.
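
      The name grammar above can be approximated with regular expressions. The following is only a sketch, not the parser StructureIO uses: the class StructureNameSketch and its classify method are hypothetical, pdp names and URLs are not handled, and the assembly number after BIO: is treated as optional so that the listed example BIO:1fah is accepted.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: classifies a name using the grammar from the getStructure docs. */
final class StructureNameSketch {
    // pdbID := [1-9][a-zA-Z0-9]{3} | PDB_[a-zA-Z0-9]{8}
    private static final String PDB_ID = "(?:[1-9][a-zA-Z0-9]{3}|PDB_[a-zA-Z0-9]{8})";
    // resNum := [-+]?[0-9]+[A-Za-z]?
    private static final String RES_NUM = "[-+]?[0-9]+[A-Za-z]?";
    // chainID, optionally followed by '_' resNum '-' resNum
    private static final String CHAIN_RANGE =
            "[a-zA-Z0-9](?:_" + RES_NUM + "-" + RES_NUM + ")?";

    private static final Pattern WHOLE = Pattern.compile(PDB_ID);
    // Assumption: any number of comma-separated ranges, optionally parenthesised.
    private static final Pattern SUBSTRUCTURE = Pattern.compile(
            PDB_ID + "\\.\\(?" + CHAIN_RANGE + "(?:," + CHAIN_RANGE + ")*\\)?");
    private static final Pattern SCOP = Pattern.compile("d" + PDB_ID + "[a-z_][0-9_]");
    // Assumption: the assembly number is optional, to accept "BIO:1fah".
    private static final Pattern BIO = Pattern.compile("BIO:" + PDB_ID + "(?::?[0-9]+)?");

    /** Returns "pdbID", "substructure", "scopID", "biol", or "unrecognized". */
    static String classify(String name) {
        if (WHOLE.matcher(name).matches()) return "pdbID";
        if (SUBSTRUCTURE.matcher(name).matches()) return "substructure";
        if (SCOP.matcher(name).matches()) return "scopID";
        if (BIO.matcher(name).matches()) return "biol";
        return "unrecognized";
    }
}
```

      For instance, StructureNameSketch.classify("4GCR.A_1-83") falls into the substructure branch, while a string such as "xyz" is reported as unrecognized.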
    • setAtomCache

      public static void setAtomCache(AtomCache c)
    • getAtomCache

      public static AtomCache getAtomCache()
    • getBiologicalAssembly

      public static Structure getBiologicalAssembly(String pdbId, boolean multiModel) throws IOException, StructureException
      Returns the first biological assembly that is available for the given PDB id.

      The output Structure will be different depending on the multiModel parameter:

      • multiModel = true: the symmetry-expanded chains are added as new models, one per transformId. All original models but the first one are discarded.
      • multiModel = false: as the original, with symmetry-expanded chains added with renamed chain IDs and names (in the form originalAsymId_transformId and originalAuthId_transformId).

      For more documentation on quaternary structures see: http://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/biological-assemblies

      Parameters:
      pdbId -
      multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original, with added chains with renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
      Returns:
      a Structure object or null if that assembly is not available
      Throws:
      StructureException
      IOException
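
      To make the two output layouts concrete, here is a minimal sketch that works on plain chain-ID strings rather than on real Structure objects. The class AssemblyLayoutSketch and both methods are hypothetical; the ordering of the chains and the assumption that chain IDs stay unchanged in the multi-model layout are illustrative choices, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the multiModel=true and multiModel=false layouts. */
final class AssemblyLayoutSketch {

    /** multiModel=false: one flat chain list, each ID renamed to originalId_transformId. */
    static List<String> flatLayout(List<String> chainIds, List<String> transformIds) {
        List<String> renamed = new ArrayList<>();
        for (String transformId : transformIds) {
            for (String chainId : chainIds) {
                renamed.add(chainId + "_" + transformId);
            }
        }
        return renamed;
    }

    /** multiModel=true: one model per transformId (assumed here to keep the original IDs). */
    static Map<String, List<String>> multiModelLayout(List<String> chainIds,
                                                      List<String> transformIds) {
        Map<String, List<String>> models = new LinkedHashMap<>();
        for (String transformId : transformIds) {
            models.put(transformId, new ArrayList<>(chainIds));
        }
        return models;
    }
}
```

      With chains A and B and transforms 1 and 2, the flat layout yields four renamed chains in a single model, whereas the multi-model layout yields two models of two chains each.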
    • getBiologicalAssembly

      Returns the first biological assembly that is available for the given PDB id, using multiModel=false

      For more documentation on quaternary structures see: http://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/biological-assemblies

      Parameters:
      pdbId -
      Returns:
      a Structure object or null if that assembly is not available
      Throws:
      StructureException
      IOException
    • getBiologicalAssembly

      public static Structure getBiologicalAssembly(String pdbId, int biolAssemblyNr, boolean multiModel) throws IOException, StructureException
      Returns the biological assembly for the given PDB id and bioassembly identifier.

      The output Structure will be different depending on the multiModel parameter:

      • multiModel = true: the symmetry-expanded chains are added as new models, one per transformId. All original models but the first one are discarded.
      • multiModel = false: as the original, with symmetry-expanded chains added with renamed chain IDs and names (in the form originalAsymId_transformId and originalAuthId_transformId).
      Parameters:
      pdbId -
      biolAssemblyNr - the ith biological assembly that is available for a PDB ID (counting starts at 1; 0 represents the asym unit).
      multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original, with added chains with renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
      Returns:
      a Structure object or null if that assembly is not available
      Throws:
      StructureException - if there is no bioassembly available for given biolAssemblyNr or some other problems encountered while loading it
      IOException
    • getBiologicalAssembly

      public static Structure getBiologicalAssembly(String pdbId, int biolAssemblyNr) throws IOException, StructureException
      Returns the biological assembly for the given PDB id and bioassembly identifier, using multiModel=false
      Parameters:
      pdbId -
      biolAssemblyNr - the ith biological assembly that is available for a PDB ID (counting starts at 1; 0 represents the asym unit).
      Returns:
      a Structure object or null if that assembly is not available
      Throws:
      StructureException - if there is no bioassembly available for given biolAssemblyNr or some other problems encountered while loading it
      IOException
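
      The numbering convention for biolAssemblyNr (0 for the asym unit, 1 and above for biological assemblies) mirrors the BIO: names accepted by getStructure. The sketch below spells this out; BioAssemblyNameSketch and its methods are hypothetical helpers, and rejecting negative numbers is an assumption of this sketch rather than documented StructureIO behaviour.

```java
/** Hypothetical helper illustrating the biolAssemblyNr convention. */
final class BioAssemblyNameSketch {

    /** Builds a name of the form BIO:pdbId:nr, e.g. BIO:1fah:2. */
    static String bioName(String pdbId, int biolAssemblyNr) {
        if (biolAssemblyNr < 0) {
            // Assumption: negative assembly numbers are treated as invalid here.
            throw new IllegalArgumentException("biolAssemblyNr must be >= 0");
        }
        return "BIO:" + pdbId + ":" + biolAssemblyNr;
    }

    /** Describes what a given assembly number refers to. */
    static String describe(int biolAssemblyNr) {
        if (biolAssemblyNr < 0) {
            throw new IllegalArgumentException("biolAssemblyNr must be >= 0");
        }
        return biolAssemblyNr == 0
                ? "asymmetric unit"
                : "biological assembly " + biolAssemblyNr;
    }
}
```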
    • getBiologicalAssemblies

      public static List<Structure> getBiologicalAssemblies(String pdbId, boolean multiModel) throws IOException, StructureException
      Returns all biological assemblies for the given PDB id.

      The output Structure will be different depending on the multiModel parameter:

      • multiModel = true: the symmetry-expanded chains are added as new models, one per transformId. All original models but the first one are discarded.
      • multiModel = false: as the original, with symmetry-expanded chains added with renamed chain IDs and names (in the form originalAsymId_transformId and originalAuthId_transformId).
      If only one biological assembly is required, use getBiologicalAssembly(String) or getBiologicalAssembly(String, int) instead.
      Parameters:
      pdbId -
      multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original, with added chains with renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
      Returns:
      Throws:
      IOException
      StructureException
      Since:
      5.0
    • getBiologicalAssemblies

      Returns all biological assemblies for the given PDB id, using multiModel=false

      If only one biological assembly is required, use getBiologicalAssembly(String) or getBiologicalAssembly(String, int) instead.

      Parameters:
      pdbId -
      Returns:
      Throws:
      IOException
      StructureException
      Since:
      5.0
    • guessFiletype

      public static StructureFiletype guessFiletype(String filename)
      Attempts to guess the type of a structure file based on the extension
      Parameters:
      filename -
      Returns:
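
      Extension-based guessing of the kind described for guessFiletype could look roughly like the sketch below. It is not the library's implementation: the class FiletypeGuessSketch, the enum Kind, the extension list and the handling of a trailing .gz are all assumptions made for illustration.

```java
import java.util.Locale;

/** Hypothetical sketch of guessing a structure file type from its extension. */
final class FiletypeGuessSketch {

    /** Illustrative file kinds; not the values of the real StructureFiletype enum. */
    enum Kind { PDB, CIF, BCIF, MMTF, UNKNOWN }

    static Kind guess(String filename) {
        String lower = filename.toLowerCase(Locale.ROOT);
        // Assumption: a trailing ".gz" is ignored so compressed files are recognised.
        if (lower.endsWith(".gz")) {
            lower = lower.substring(0, lower.length() - 3);
        }
        int dot = lower.lastIndexOf('.');
        if (dot < 0) {
            return Kind.UNKNOWN; // no extension to inspect
        }
        switch (lower.substring(dot + 1)) {
            case "pdb":
            case "ent":
                return Kind.PDB;
            case "cif":
                return Kind.CIF;
            case "bcif":
                return Kind.BCIF;
            case "mmtf":
                return Kind.MMTF;
            default:
                return Kind.UNKNOWN;
        }
    }
}
```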