Class StructureIO


  • public class StructureIO
    extends Object
    A class that provides static access methods for easy lookup of protein-structure-related components.
    Since:
    3.0.5
    Author:
    Andreas Prlic
    • Method Detail

      • getStructure

        public static Structure getStructure(String name)
                                      throws IOException,
                                             StructureException
        Loads a structure based on a name. The supported naming conventions follow this formal specification:

                        name     := pdbID
                                  | pdbID '.' chainID
                                  | pdbID '.' range
                                  | scopID
                                  | biol
                                  | pdp
                        range    := '('? range (',' range)? ')'?
                                  | chainID
                                  | chainID '_' resNum '-' resNum
                        pdbID    := [0-9][a-zA-Z0-9]{3}
                        chainID  := [a-zA-Z0-9]
                        scopID   := 'd' pdbID [a-z_][0-9_]
                        biol     := 'BIO:' pdbID [:]? [0-9]+
                        pdp      := 'PDP:' pdbID[A-Za-z0-9_]+
                        resNum   := [-+]?[0-9]+[A-Za-z]?
        
        
                        Example structures:
                        1TIM         # whole structure (asymmetric unit)
                        4HHB.C       # single chain
                        4GCR.A_1-83  # one domain, by residue number
                        3AA0.A,B     # two chains treated as one structure
                        d2bq6a1      # SCOP domain
                        BIO:1fah     # biological assembly nr 1 for 1fah
                        BIO:1fah:0   # asymmetric unit for 1fah
                        BIO:1fah:1   # biological assembly nr 1 for 1fah
                        BIO:1fah:2   # biological assembly nr 2 for 1fah
        
         
        With the additional set of rules:
        • If only a PDB code is provided, the whole structure is returned, including ligands, but only the first model (for NMR structures).
        • Chain IDs are case sensitive; PDB IDs are not. To specify a particular chain, write e.g. 4hhb.A or 4HHB.A.
        • To specify a SCOP domain, write a SCOP ID, e.g. d2bq6a1. Some flexibility can be allowed in SCOP domain names; see #setStrictSCOP(boolean).
        • URLs are accepted as well.
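
The grammar and examples above can be turned into a rough name classifier. The following is a minimal sketch using only java.util.regex; the class StructureNameSketch and its Kind enum are hypothetical names invented for illustration and are not part of BioJava, and the patterns are one reading of the grammar rather than the library's actual parser. Two assumptions are made: a range may list any number of comma-separated items, and the assembly number after 'BIO:' is optional, because the example BIO:1fah carries none. URLs are not handled.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of BioJava: classifies a name using the grammar above. */
final class StructureNameSketch {

    enum Kind { PDB_ID, PDB_CHAIN, PDB_RANGE, SCOP_ID, BIO_ASSEMBLY, PDP_DOMAIN, UNKNOWN }

    // Building blocks copied from the grammar.
    private static final String PDB_ID = "[0-9][a-zA-Z0-9]{3}";
    private static final String CHAIN_ID = "[a-zA-Z0-9]";
    private static final String RES_NUM = "[-+]?[0-9]+[A-Za-z]?";

    // One range item: a chain, optionally followed by '_' resNum '-' resNum.
    private static final String RANGE_ITEM =
            CHAIN_ID + "(?:_" + RES_NUM + "-" + RES_NUM + ")?";
    // Assumption: any number of comma-separated items, optionally in parentheses.
    private static final String RANGE =
            "\\(?" + RANGE_ITEM + "(?:," + RANGE_ITEM + ")*\\)?";

    private static final Pattern P_PDB = Pattern.compile(PDB_ID);
    private static final Pattern P_CHAIN = Pattern.compile(PDB_ID + "\\." + CHAIN_ID);
    private static final Pattern P_RANGE = Pattern.compile(PDB_ID + "\\." + RANGE);
    private static final Pattern P_SCOP = Pattern.compile("d" + PDB_ID + "[a-z_][0-9_]");
    // Assumption: the number is optional, since the examples include "BIO:1fah".
    private static final Pattern P_BIO = Pattern.compile("BIO:" + PDB_ID + "(?::?[0-9]+)?");
    private static final Pattern P_PDP = Pattern.compile("PDP:" + PDB_ID + "[A-Za-z0-9_]+");

    /** Returns the first grammar alternative that matches the whole name. */
    static Kind classify(String name) {
        if (P_BIO.matcher(name).matches()) return Kind.BIO_ASSEMBLY;
        if (P_PDP.matcher(name).matches()) return Kind.PDP_DOMAIN;
        if (P_SCOP.matcher(name).matches()) return Kind.SCOP_ID;
        if (P_PDB.matcher(name).matches()) return Kind.PDB_ID;
        // A bare chain is checked before the more general range form.
        if (P_CHAIN.matcher(name).matches()) return Kind.PDB_CHAIN;
        if (P_RANGE.matcher(name).matches()) return Kind.PDB_RANGE;
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        String[] names = {"1TIM", "4HHB.C", "4GCR.A_1-83", "3AA0.A,B", "d2bq6a1", "BIO:1fah:2"};
        for (String n : names) {
            System.out.println(n + " -> " + classify(n));
        }
    }
}
```

Because resNum allows a leading sign, a range such as A_-5-10 is read as residues -5 to 10; the regular expression resolves this by matching the first residue number greedily before the separating hyphen.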
        Parameters:
        name - the structure name, following the naming conventions described above
        Returns:
        a Structure object, or null if name appears improperly formatted (e.g. too short)
        Throws:
        IOException - if the PDB file cannot be cached due to IO errors
        StructureException - if the name appeared valid but did not correspond to a structure. Also thrown by some submethods upon errors, e.g. for poorly formatted subranges.
      • getBiologicalAssembly

        public static Structure getBiologicalAssembly(String pdbId,
                                                      boolean multiModel)
                                               throws IOException,
                                                      StructureException
        Returns the first biological assembly that is available for the given PDB id.

        The output Structure depends on the multiModel parameter:

        • multiModel is true: the symmetry-expanded chains are added as new models, one per transformId. All original models but the first one are discarded.
        • multiModel is false: the output is as the original, with the symmetry-expanded chains added under renamed chain ids and names (in the form originalAsymId_transformId and originalAuthId_transformId).

        For more documentation on quaternary structures see: http://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/biological-assemblies
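
The two output layouts can be pictured with a toy model in which a chain is just its id string and a model is a list of chain ids. This is only a sketch: AssemblyLayoutSketch and layout are hypothetical names, nothing here calls BioJava, and two details are assumptions made for illustration (chain ids stay unchanged inside each model in the multi-model case, and renamed chains are ordered by transformId first).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not BioJava code: a model is represented as a list of chain ids. */
final class AssemblyLayoutSketch {

    /**
     * @param chainIds     chain ids in the first model of the original structure
     * @param transformIds ids of the symmetry operations that build the assembly
     * @param multiModel   true: one model per transformId; false: one model, renamed chains
     * @return the resulting models, each given as a list of chain ids
     */
    static List<List<String>> layout(List<String> chainIds,
                                     List<String> transformIds,
                                     boolean multiModel) {
        List<List<String>> models = new ArrayList<>();
        if (multiModel) {
            // One model per transformId.
            // Assumption: chain ids are kept unchanged inside each model.
            for (int i = 0; i < transformIds.size(); i++) {
                models.add(new ArrayList<>(chainIds));
            }
        } else {
            // A single model; every copy is renamed to originalId_transformId.
            List<String> single = new ArrayList<>();
            for (String transformId : transformIds) {
                for (String chainId : chainIds) {
                    single.add(chainId + "_" + transformId);
                }
            }
            models.add(single);
        }
        return models;
    }
}
```

With chains A and B and transformIds 1 and 2, the multi-model layout holds two models of two chains each, while the single-model layout holds one model with four renamed chains.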

        Parameters:
        pdbId - the PDB id of the structure
        multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original, with added chains under renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
        Returns:
        a Structure object, or null if that assembly is not available
        Throws:
        StructureException
        IOException
        Parameters (overload that additionally takes biolAssemblyNr):
        pdbId - the PDB id of the structure
        biolAssemblyNr - the i-th biological assembly that is available for a PDB ID (counting starts at 1; 0 represents the asymmetric unit).
        multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original, with added chains under renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
        Returns:
        a Structure object, or null if that assembly is not available
        Throws:
        StructureException - if there is no bioassembly available for the given biolAssemblyNr, or if some other problem is encountered while loading it
        IOException
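
The biolAssemblyNr convention (counting from 1, with 0 for the asymmetric unit) matches the numbers in the BIO: names accepted by getStructure. As a minimal sketch of that correspondence, the hypothetical helper below splits a BIO: name into a PDB id and an assembly number. BioNameSketch is not part of BioJava; the default of 1 for a missing number is taken from the BIO:1fah example, and returning -1 for a non-matching name is an arbitrary choice made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of BioJava: splits names of the form BIO:pdbID[:nr]. */
final class BioNameSketch {

    // Group 1: PDB id. Group 2: optional assembly number (the colon before it is optional).
    private static final Pattern BIO =
            Pattern.compile("BIO:([0-9][a-zA-Z0-9]{3})(?::?([0-9]+))?");

    /** Returns the PDB id inside a BIO: name, or null if the name has another form. */
    static String pdbId(String name) {
        Matcher m = BIO.matcher(name);
        return m.matches() ? m.group(1) : null;
    }

    /**
     * Returns the assembly number: 0 means the asymmetric unit, 1 is the default
     * when no number is given, and -1 signals that the name is not a BIO: name.
     * Digit strings too long for an int make Integer.parseInt throw NumberFormatException.
     */
    static int assemblyNr(String name) {
        Matcher m = BIO.matcher(name);
        if (!m.matches()) {
            return -1;
        }
        return m.group(2) == null ? 1 : Integer.parseInt(m.group(2));
    }
}
```

Under these assumptions, the pair returned for BIO:1fah:2 would be the values to pass as pdbId and biolAssemblyNr.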