public class StructureTools extends Object
Modifier and Type | Field and Description |
---|---|
static String |
C_ATOM_NAME
The atom name for the backbone carbonyl
|
static String |
C1_ATOM_NAME
The atom name of the backbone C1' in RNA
|
static String |
C2_ATOM_NAME
The atom name of the backbone C2' in RNA
|
static String |
C3_ATOM_NAME
The atom name of the backbone C3' in RNA
|
static String |
C4_ATOM_NAME
The atom name of the backbone C4' in RNA
|
static String |
CA_ATOM_NAME
The atom name of the backbone C-alpha atom.
|
static String |
CB_ATOM_NAME
The atom name of the side-chain C-beta atom
|
static String |
N_ATOM_NAME
The atom name for the backbone amide nitrogen
|
static String |
NUCLEOTIDE_REPRESENTATIVE
The atom used as representative for nucleotides, equivalent to
CA_ATOM_NAME for proteins |
static String |
O_ATOM_NAME
The atom name for the backbone carbonyl oxygen
|
static String |
O2_ATOM_NAME
The atom name of the backbone O2' in RNA
|
static String |
O3_ATOM_NAME
The atom name of the backbone O3' in RNA
|
static String |
O4_ATOM_NAME
The atom name of the backbone O4' in RNA
|
static String |
O5_ATOM_NAME
The atom name of the backbone O4' in RNA
|
static String |
OP1_ATOM_NAME
The atom name of the backbone O4' in RNA
|
static String |
OP2_ATOM_NAME
The atom name of the backbone O4' in RNA
|
static String |
P_ATOM_NAME
The atom name of the backbone phosphate in RNA
|
static double |
RATIO_RESIDUES_TO_TOTAL
Below this ratio of aminoacid/nucleotide residues to the sequence total,
we use simple majority of aminoacid/nucleotide residues to decide the
character of the chain (protein/nucleotide)
|
static char |
UNKNOWN_GROUP_LABEL
The character to use for unknown compounds in sequence strings
|
Constructor and Description |
---|
StructureTools() |
Modifier and Type | Method and Description |
---|---|
static Atom[] |
cloneAtomArray(Atom[] ca)
Provides an equivalent copy of Atoms in a new array.
|
static Atom[] |
cloneCAArray(Atom[] ca)
Deprecated.
Use the better-named
cloneAtomArray(Atom[]) instead |
static Group[] |
cloneGroups(Atom[] ca)
Clone a set of representative Atoms, but returns the parent groups
|
static Character |
convert_3code_1code(String code3)
Deprecated.
Use
get1LetterCodeAmino(String) instead |
static String |
convertAtomsToSeq(Atom[] atoms) |
static Atom[] |
duplicateCA2(Atom[] ca2)
Utility method for working with circular permutations.
|
static List<Group> |
filterLigands(List<Group> allGroups)
Removes all polymeric and solvent groups from a list of groups
|
static Character |
get1LetterCode(String groupCode3)
Convert a three letter amino acid or nucleotide code into a single
character code.
|
static Character |
get1LetterCodeAmino(String groupCode3)
Convert three character amino acid codes into single character e.g.
|
static Atom[] |
getAllAtomArray(Chain c)
Returns and array of all atoms of the chain (first model), including
Hydrogens (if present) and all HETATOMs.
|
static Atom[] |
getAllAtomArray(Structure s)
Convert all atoms of the structure (first model) into an Atom array
|
static Atom[] |
getAllNonHAtomArray(Chain c,
boolean hetAtoms)
Returns and array of all non-Hydrogen atoms in the given Chain,
optionally including HET atoms or not Waters are not included.
|
static Atom[] |
getAllNonHAtomArray(Structure s,
boolean hetAtoms)
Returns and array of all non-Hydrogen atoms in the given Structure,
optionally including HET atoms or not.
|
static Atom[] |
getAtomArray(Chain c,
String[] atomNames)
Returns an array of the requested Atoms from the Chain object.
|
static Atom[] |
getAtomArray(Structure s,
String[] atomNames)
Returns an array of the requested Atoms from the Structure object.
|
static Atom[] |
getAtomArrayAllModels(Structure s,
String[] atomNames)
Returns an array of the requested Atoms from the Structure object.
|
static Atom[] |
getAtomCAArray(Chain c)
Returns an Atom array of the C-alpha atoms.
|
static Atom[] |
getAtomCAArray(Structure s)
Return an Atom array of the C-alpha atoms.
|
static AtomContactSet |
getAtomsCAInContact(Chain chain,
double cutoff)
Returns the set of intra-chain contacts for the given chain for C-alpha
atoms (including non-standard aminoacids appearing as HETATM groups),
i.e.
|
static AtomContactSet |
getAtomsInContact(Chain chain1,
Chain chain2,
double cutoff,
boolean hetAtoms)
Returns the set of inter-chain contacts between the two given chains for
all non-H atoms.
|
static AtomContactSet |
getAtomsInContact(Chain chain1,
Chain chain2,
String[] atomNames,
double cutoff,
boolean hetAtoms)
Returns the set of inter-chain contacts between the two given chains for
the given atom names.
|
static AtomContactSet |
getAtomsInContact(Chain chain,
double cutoff)
Returns the set of intra-chain contacts for the given chain for all non-H
atoms of non-hetatoms, i.e.
|
static AtomContactSet |
getAtomsInContact(Chain chain,
String[] atomNames,
double cutoff)
Returns the set of intra-chain contacts for the given chain for given
atom names, i.e.
|
static Atom[] |
getBackboneAtomArray(Structure s)
Return an Atom array of the main chain atoms: CA, C, N, O Any group that
contains those atoms will be included, be it a standard aminoacid or not
|
static Group |
getGroupByPDBResidueNumber(Structure struc,
ResidueNumber pdbResNum)
Get a group represented by a ResidueNumber.
|
static Map<Group,Double> |
getGroupDistancesWithinShell(Structure structure,
Atom centroid,
Set<ResidueNumber> excludeResidues,
double radius,
boolean includeWater,
boolean useAverageDistance)
Finds Groups in
structure that contain at least one Atom that is
within radius Angstroms of centroid . |
static Set<Group> |
getGroupsWithinShell(Structure structure,
Atom atom,
Set<ResidueNumber> excludeResidues,
double distance,
boolean includeWater) |
static Set<Group> |
getGroupsWithinShell(Structure structure,
Group group,
double distance,
boolean includeWater)
Returns a Set of Groups in a structure within the distance specified of a
given group.
|
static int |
getNrAtoms(Structure s)
Count how many Atoms are contained within a Structure object.
|
static int |
getNrGroups(Structure s)
Count how many groups are contained within a structure object.
|
static GroupType |
getPredominantGroupType(Chain c)
Get the predominant
GroupType for a given Chain, following these
rules: if the ratio of number of residues of a certain
GroupType to total non-water residues is above the threshold
0.95, then that GroupType is
returned if there is no GroupType that is above the
threshold then the GroupType with most members is chosen, logging
it |
static Structure |
getReducedStructure(Structure s,
int chainNr)
Deprecated.
Use
StructureIdentifier.reduce(Structure) instead (v. 4.2.0) |
static Structure |
getReducedStructure(Structure s,
String chainId)
Deprecated.
Use
StructureIdentifier.reduce(Structure) instead (v. 4.2.0) |
static Atom[] |
getRepresentativeAtomArray(Chain c)
Gets a representative atom for each group that is part of the chain
backbone.
|
static Atom[] |
getRepresentativeAtomArray(Structure s)
Gets a representative atom for each group that is part of the chain
backbone.
|
static AtomContactSet |
getRepresentativeAtomsInContact(Chain chain,
double cutoff)
Returns the set of intra-chain contacts for the given chain for C-alpha
or C3' atoms (including non-standard aminoacids appearing as HETATM
groups), i.e.
|
static Structure |
getStructure(String name)
Short version of
getStructure(String, PDBFileParser, AtomCache)
which creates new parsers when needed |
static Structure |
getStructure(String name,
PDBFileParser parser,
AtomCache cache)
Flexibly get a structure from an input String.
|
static Structure |
getSubRanges(Structure s,
String ranges)
Deprecated.
Use
StructureIdentifier instead (4.2.0) |
static List<Group> |
getUnalignedGroups(Atom[] ca)
List of groups from the structure not included in ca (e.g.
|
static boolean |
isChainPureNonPolymer(Chain c)
Returns true if the given chain is composed of non-polymeric groups only
|
static boolean |
isChainWaterOnly(Chain c)
Returns true if the given chain is composed of water molecules only
|
static boolean |
isNucleicAcid(Chain c)
Tell whether given chain is DNA or RNA
|
static boolean |
isNucleotide(String groupCode3)
Test if the three-letter code of an ATOM entry corresponds to a
nucleotide or to an aminoacid.
|
static boolean |
isProtein(Chain c)
Tell whether given chain is a protein chain
|
static Structure |
removeModels(Structure s)
Remove all models from a Structure and keep only the first
|
public static final String CA_ATOM_NAME
public static final String N_ATOM_NAME
public static final String C_ATOM_NAME
public static final String O_ATOM_NAME
public static final String CB_ATOM_NAME
public static final String C1_ATOM_NAME
public static final String C2_ATOM_NAME
public static final String C3_ATOM_NAME
public static final String C4_ATOM_NAME
public static final String O2_ATOM_NAME
public static final String O3_ATOM_NAME
public static final String O4_ATOM_NAME
public static final String O5_ATOM_NAME
public static final String OP1_ATOM_NAME
public static final String OP2_ATOM_NAME
public static final String P_ATOM_NAME
public static final String NUCLEOTIDE_REPRESENTATIVE
CA_ATOM_NAME
for proteinspublic static final char UNKNOWN_GROUP_LABEL
public static final double RATIO_RESIDUES_TO_TOTAL
public static final int getNrAtoms(Structure s)
s
- the structure objectpublic static final int getNrGroups(Structure s)
s
- the structure objectpublic static final Atom[] getAtomArray(Structure s, String[] atomNames)
AminoAcid
or HetatomImpl
group. If the group does not contain all requested atoms then no atoms
are added for that group. For structures with more than one model, only
model 0 will be used.s
- the structure to get the atoms fromatomNames
- contains the atom names to be used.public static final Atom[] getAtomArrayAllModels(Structure s, String[] atomNames)
getAtomArray(Structure, String[])
this method
iterates over all chains. Iterates over all chains and groups and checks
if the requested atoms are in this group, no matter if this is a
AminoAcid
or HetatomImpl
group. If the group does not
contain all requested atoms then no atoms are added for that group. For
structures with more than one model, only model 0 will be used.s
- the structure to get the atoms fromatomNames
- contains the atom names to be used.public static final Atom[] getAllAtomArray(Structure s)
s
- input structurepublic static final Atom[] getAllAtomArray(Chain c)
c
- input chainpublic static List<Group> getUnalignedGroups(Atom[] ca)
ca
- an array of atomspublic static final Atom[] getAllNonHAtomArray(Structure s, boolean hetAtoms)
s
- hetAtoms
- if true HET atoms are included in array, if false they are notpublic static final Atom[] getAllNonHAtomArray(Chain c, boolean hetAtoms)
c
- hetAtoms
- if true HET atoms are included in array, if false they are notpublic static final Atom[] getAtomArray(Chain c, String[] atomNames)
c
- the Chain to get the atoms fromatomNames
- contains the atom names to be used.public static final Atom[] getAtomCAArray(Chain c)
c
- the structure objectgetRepresentativeAtomArray(Chain)
public static final Atom[] getRepresentativeAtomArray(Chain c)
ReducedChemCompProvider
was used to load the
structure.
For amino acids, the representative is a CA carbon. For nucleotides, the
representative is the "P". Other group
types will be ignored.c
- @Deprecated public static final Atom[] cloneCAArray(Atom[] ca)
cloneAtomArray(Atom[])
insteadca
- array of representative atoms, e.g. CA atomspublic static final Atom[] cloneAtomArray(Atom[] ca)
ca
- array of representative atoms, e.g. CA atomspublic static Group[] cloneGroups(Atom[] ca)
ca
- Atom arraypublic static Atom[] duplicateCA2(Atom[] ca2)
ca2
- atom arraypublic static Atom[] getAtomCAArray(Structure s)
s
- the structure objectgetRepresentativeAtomArray(Structure)
public static Atom[] getRepresentativeAtomArray(Structure s)
ReducedChemCompProvider
was used to load the
structure.
For amino acids, the representative is a CA carbon. For nucleotides, the
representative is the "P". Other group
types will be ignored.s
- Input structurepublic static Atom[] getBackboneAtomArray(Structure s)
s
- the structure objectpublic static final Character get1LetterCodeAmino(String groupCode3)
aminoAcids
map)groupCode3
- a three character amino acid representation String#get1LetterCode(String)}
@Deprecated public static final Character convert_3code_1code(String code3)
get1LetterCodeAmino(String)
insteadcode3
- public static final Character get1LetterCode(String groupCode3)
UNKNOWN_GROUP_LABEL
.
Returned null for nucleotides prior to version 4.0.1.groupCode3
- three letter representationpublic static final boolean isNucleotide(String groupCode3)
groupCode3
- 3-character code for a group.@Deprecated public static final Structure getReducedStructure(Structure s, String chainId) throws StructureException
StructureIdentifier.reduce(Structure)
instead (v. 4.2.0)s
- chainId
- StructureException
@Deprecated public static final Structure getReducedStructure(Structure s, int chainNr) throws StructureException
StructureIdentifier.reduce(Structure)
instead (v. 4.2.0)s
- chainNr
- can be -1 to request all chains of model 0, otherwise will
only add chain at this positionStructureException
@Deprecated public static final Structure getSubRanges(Structure s, String ranges) throws StructureException
StructureIdentifier
instead (4.2.0)getReducedStructure(Structure, int)
and
getReducedStructure(Structure, String)
, also provides a way to
specify sub-regions of a structure with the following specification:
4GCR (A:1-83) 1CDG (A:407-495,A:582-686) 1CDG (A_407-495,A_582-686)
s
- The full structureranges
- A comma-separated list of ranges, optionally surrounded by
parenthesesIllegalArgumentException
- for malformed range stringsStructureException
- for errors when reducing the Structurepublic static final Group getGroupByPDBResidueNumber(Structure struc, ResidueNumber pdbResNum) throws StructureException
struc
- a Structure
pdbResNum
- a ResidueNumber
StructureException
- if the group cannot be found.public static AtomContactSet getAtomsInContact(Chain chain, String[] atomNames, double cutoff)
FileParsingParameters.setAlignSeqRes(boolean)
needs
to be set to true for this to work.chain
- atomNames
- the array with atom names to be used. Beware: CA will do both
C-alphas an Calciums if null all non-H atoms of non-hetatoms
will be usedcutoff
- public static AtomContactSet getAtomsInContact(Chain chain, double cutoff)
FileParsingParameters.setAlignSeqRes(boolean)
needs to be set to
true for this to work.chain
- cutoff
- public static AtomContactSet getAtomsCAInContact(Chain chain, double cutoff)
FileParsingParameters.setAlignSeqRes(boolean)
needs to be set to
true for this to work.chain
- cutoff
- #getRepresentativeAtomsInContact(Chain, double)}
public static AtomContactSet getRepresentativeAtomsInContact(Chain chain, double cutoff)
chain
- cutoff
- public static AtomContactSet getAtomsInContact(Chain chain1, Chain chain2, String[] atomNames, double cutoff, boolean hetAtoms)
FileParsingParameters.setAlignSeqRes(boolean)
needs to be set to
true for this to work.chain1
- chain2
- atomNames
- the array with atom names to be used. For Calphas use {"CA"},
if null all non-H atoms will be used. Note HET atoms are
ignored unless this parameter is null.cutoff
- hetAtoms
- if true HET atoms are included, if false they are notpublic static AtomContactSet getAtomsInContact(Chain chain1, Chain chain2, double cutoff, boolean hetAtoms)
FileParsingParameters.setAlignSeqRes(boolean)
needs to be set to
true for this to work.chain1
- chain2
- cutoff
- hetAtoms
- if true HET atoms are included, if false they are notpublic static Map<Group,Double> getGroupDistancesWithinShell(Structure structure, Atom centroid, Set<ResidueNumber> excludeResidues, double radius, boolean includeWater, boolean useAverageDistance)
structure
that contain at least one Atom that is
within radius
Angstroms of centroid
.structure
- The structure from which to find Groupscentroid
- The centroid of the shellexcludeResidues
- A list of ResidueNumbers to excluderadius
- The radius from centroid
, in AngstromsincludeWater
- Whether to include Groups whose only atoms are wateruseAverageDistance
- When set to true, distances are the arithmetic mean (1-norm)
of the distances of atoms that belong to the group and that
are within the shell; otherwise, distances are the minimum of
these valuespublic static Set<Group> getGroupsWithinShell(Structure structure, Atom atom, Set<ResidueNumber> excludeResidues, double distance, boolean includeWater)
public static Set<Group> getGroupsWithinShell(Structure structure, Group group, double distance, boolean includeWater)
Returns a Set of Groups in a structure within the distance specified of a given group.
Updated 18-Sep-2015 sroughley to return a Set so only a unique set of Groups returned
structure
- The structure to work withgroup
- The 'query' groupdistance
- The cutoff distanceincludeWater
- Should water residues be included in the output?LinkedHashSet
of Group
s within at least one atom
with distance
of at least one atom in group
public static Structure removeModels(Structure s)
s
- original Structurepublic static List<Group> filterLigands(List<Group> allGroups)
public static Structure getStructure(String name) throws IOException, StructureException
getStructure(String, PDBFileParser, AtomCache)
which creates new parsers when neededname
- IOException
StructureException
public static Structure getStructure(String name, PDBFileParser parser, AtomCache cache) throws IOException, StructureException
name
- Some reference to the protein structureparser
- A clean PDBFileParser to use if it is a file. If null, a
PDBFileParser will be instantiated if needed.cache
- An AtomCache to use if the structure can be fetched from the
PDB. If null, a AtomCache will be instantiated if needed.IOException
- if name is an existing file, but doesn't parse correctlyStructureException
- if the format is unknown, or if AtomCache throws an
exception.public static boolean isProtein(Chain c)
c
- getPredominantGroupType(Chain)
public static boolean isNucleicAcid(Chain c)
c
- getPredominantGroupType(Chain)
public static GroupType getPredominantGroupType(Chain c)
GroupType
for a given Chain, following these
rules: GroupType
to total non-water residues is above the threshold
0.95, then that GroupType
is
returnedGroupType
that is above the
threshold then the GroupType
with most members is chosen, logging
it
See also ChemComp.getPolymerType()
and
ChemComp.getResidueType()
which follow the PDB chemical component
dictionary and provide a much more accurate description of groups and
their linking.
c
- public static boolean isChainWaterOnly(Chain c)
c
- public static boolean isChainPureNonPolymer(Chain c)
c
- Copyright © 2000–2016 BioJava. All rights reserved.