Class SubunitClustererParameters
- java.lang.Object
-
- org.biojava.nbio.structure.cluster.SubunitClustererParameters
-
- All Implemented Interfaces:
Serializable
public class SubunitClustererParameters extends Object implements Serializable
The SubunitClustererParameters specifies the options used for clustering the subunits of a structure with the SubunitClusterer.
- Since:
- 5.0.0
- Author:
- Peter Rose, Aleix Lafita
- See Also:
- Serialized Form
-
-
Constructor Summary
Constructors
Constructor Description
SubunitClustererParameters()
Initialize with "local" metrics by default.
SubunitClustererParameters(boolean useGlobalMetrics)
"Local" metrics are scored as follows. SubunitClustererMethod.SEQUENCE: sequence identity of a local alignment (normalised by the number of aligned residues) and sequence coverage of the alignment (normalised by the length of the longer sequence). SubunitClustererMethod.STRUCTURE: RMSD of the aligned substructures and structure coverage of the alignment (normalised by the length of the larger structure). Two thresholds for each method are required.
-
Method Summary
All Methods Instance Methods Concrete Methods
Modifier and Type Method Description
int
getAbsoluteMinimumSequenceLength()
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
SubunitClustererMethod
getClustererMethod()
Method to cluster subunits.
int
getMinimumSequenceLength()
Get the minimum number of residues of a subunit to be considered in the clusters.
double
getMinimumSequenceLengthFraction()
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
double
getRMSDThreshold()
Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
double
getSequenceCoverageThreshold()
The minimum coverage of the sequence alignment between two subunits to be clustered together.
double
getSequenceIdentityThreshold()
Sequence identity threshold to consider for the sequence subunit clustering.
double
getStructureCoverageThreshold()
The minimum coverage of the structure alignment between two subunits to be clustered together.
String
getSuperpositionAlgorithm()
Method to superpose subunits (i.e., structural aligner).
double
getTMThreshold()
Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
boolean
isHighConfidenceScores(double sequenceIdentity, double sequenceCoverage)
Whether the subunits can be considered "identical" by sequence alignment.
boolean
isInternalSymmetry()
The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.
boolean
isOptimizeAlignment()
Whether the alignment algorithm should optimize the alignment as far as it can, or settle for a quick, approximate result.
boolean
isUseEntityIdForSeqIdentityDetermination()
Whether to use the entity id of subunits to infer that sequences are identical.
boolean
isUseGlobalMetrics()
Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only.
boolean
isUseRMSD()
Use RMSD for evaluating structure similarity.
boolean
isUseSequenceCoverage()
Use sequence coverage for evaluating sequence similarity.
boolean
isUseStructureCoverage()
Use structure coverage for evaluating structure similarity.
boolean
isUseTMScore()
Use TMScore for evaluating structure similarity.
void
setAbsoluteMinimumSequenceLength(int absoluteMinimumSequenceLength)
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
void
setClustererMethod(SubunitClustererMethod method)
Method to cluster subunits.
void
setInternalSymmetry(boolean internalSymmetry)
The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.
void
setMinimumSequenceLength(int minimumSequenceLength)
Set the minimum number of residues of a subunit to be considered in the clusters.
void
setMinimumSequenceLengthFraction(double minimumSequenceLengthFraction)
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
void
setOptimizeAlignment(boolean optimizeAlignment)
Whether the alignment algorithm should optimize the alignment as far as it can, or settle for a quick, approximate result.
void
setRMSDThreshold(double rmsdThreshold)
Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
void
setSequenceCoverageThreshold(double sequenceCoverageThreshold)
The minimum coverage of the sequence alignment between two subunits to be clustered together.
void
setSequenceIdentityThreshold(double sequenceIdentityThreshold)
Sequence identity threshold to consider for the sequence subunit clustering.
void
setStructureCoverageThreshold(double structureCoverageThreshold)
The minimum coverage of the structure alignment between two subunits to be clustered together.
void
setSuperpositionAlgorithm(String superpositionAlgorithm)
Method to superpose subunits (i.e., structural aligner).
void
setTMThreshold(double tmThreshold)
Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
void
setUseEntityIdForSeqIdentityDetermination(boolean useEntityIdForSeqIdentityDetermination)
Whether to use the entity id of subunits to infer that sequences are identical.
void
setUseGlobalMetrics(boolean useGlobalMetrics)
Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only.
void
setUseRMSD(boolean useRMSD)
Use RMSD for evaluating structure similarity.
void
setUseSequenceCoverage(boolean useSequenceCoverage)
Use sequence coverage for evaluating sequence similarity.
void
setUseStructureCoverage(boolean useStructureCoverage)
Use structure coverage for evaluating structure similarity.
void
setUseTMScore(boolean useTMScore)
Use TMScore for evaluating structure similarity.
String
toString()
-
-
-
Constructor Detail
-
SubunitClustererParameters
public SubunitClustererParameters(boolean useGlobalMetrics)
"Local" metrics are scored as follows:
- SubunitClustererMethod.SEQUENCE: sequence identity of a local alignment (normalised by the number of aligned residues) and sequence coverage of the alignment (normalised by the length of the longer sequence)
- SubunitClustererMethod.STRUCTURE: RMSD of the aligned substructures and structure coverage of the alignment (normalised by the length of the larger structure)
Two thresholds for each method are required.
"Global" metrics are scored as follows:
- SubunitClustererMethod.SEQUENCE: sequence identity of a global alignment (normalised by the length of the alignment)
- SubunitClustererMethod.STRUCTURE: TMScore of the aligned structures (normalised by the length of the larger structure)
One threshold for each method is required.
-
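The difference between the two metric types for SubunitClustererMethod.SEQUENCE can be sketched as below. This is a minimal illustration and not BioJava code: the class SequenceMetricSketch, the method clusterBySequence and the threshold values are hypothetical, and the inclusive comparison is taken from the "equal or higher" wording of the sequence identity threshold.

```java
// Illustrative sketch only; not part of BioJava. All names are hypothetical.
class SequenceMetricSketch {

    /**
     * Decide whether two subunits would be clustered by sequence.
     * identity and coverage are fractions in the range 0..1, normalised as
     * described for the chosen metric type.
     */
    static boolean clusterBySequence(boolean useGlobalMetrics,
                                     double identity, double coverage,
                                     double identityThreshold, double coverageThreshold) {
        if (useGlobalMetrics) {
            // Global metrics: a single threshold on the identity of the global alignment.
            return identity >= identityThreshold;
        }
        // Local metrics: two thresholds, identity of the aligned part AND coverage.
        return identity >= identityThreshold && coverage >= coverageThreshold;
    }

    public static void main(String[] args) {
        // Example thresholds chosen for illustration, not library defaults.
        double idThr = 0.95, covThr = 0.75;
        // Near-identical local match that covers only 40% of the longer chain.
        System.out.println(clusterBySequence(false, 0.98, 0.40, idThr, covThr)); // false
        // Same identity with 90% coverage.
        System.out.println(clusterBySequence(false, 0.98, 0.90, idThr, covThr)); // true
        // Global identity of 0.96; coverage is not consulted.
        System.out.println(clusterBySequence(true, 0.96, 0.0, idThr, covThr));   // true
    }
}
```

The local variant needs the coverage check because a short, perfectly matching fragment would otherwise pass the identity threshold on its own.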
SubunitClustererParameters
public SubunitClustererParameters()
Initialize with "local" metrics by default.
-
-
Method Detail
-
getMinimumSequenceLength
public int getMinimumSequenceLength()
Get the minimum number of residues of a subunit to be considered in the clusters.
- Returns:
- minimumSequenceLength
-
setMinimumSequenceLength
public void setMinimumSequenceLength(int minimumSequenceLength)
Set the minimum number of residues of a subunit to be considered in the clusters.
- Parameters:
minimumSequenceLength
-
-
getAbsoluteMinimumSequenceLength
public int getAbsoluteMinimumSequenceLength()
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
This adaptive feature allows the consideration of structures mainly constructed by very short chains, such as collagen (1A3I).
- Returns:
- the absoluteMinimumSequenceLength
-
setAbsoluteMinimumSequenceLength
public void setAbsoluteMinimumSequenceLength(int absoluteMinimumSequenceLength)
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
This adaptive feature allows the consideration of structures mainly constructed by very short chains, such as collagen (1A3I).
- Parameters:
absoluteMinimumSequenceLength
-
-
getMinimumSequenceLengthFraction
public double getMinimumSequenceLengthFraction()
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
This adaptive feature allows the consideration of structures mainly constructed by very short chains, such as collagen (1A3I).
- Returns:
- the minimumSequenceLengthFraction
-
setMinimumSequenceLengthFraction
public void setMinimumSequenceLengthFraction(double minimumSequenceLengthFraction)
If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
This adaptive feature allows the consideration of structures mainly constructed by very short chains, such as collagen (1A3I).
- Parameters:
minimumSequenceLengthFraction
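The adaptive rule shared by minimumSequenceLength, absoluteMinimumSequenceLength and minimumSequenceLengthFraction can be sketched as follows. This is a literal reading of the description above and not the library's implementation, which may differ in details. The class and method names are hypothetical, and the handling of an even number of subunits in the median and of an empty input are assumptions.

```java
import java.util.Arrays;

// Illustrative sketch only; not part of BioJava. All names are hypothetical.
class AdaptiveMinLengthSketch {

    /** Median length; for an even count, the mean of the two middle values (an assumption). */
    static double median(int[] lengths) {
        int[] sorted = lengths.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /** Minimum length to apply, following the documented rule literally. */
    static int effectiveMinimumLength(int[] lengths,
                                      int minimumSequenceLength,
                                      int absoluteMinimumSequenceLength,
                                      double minimumSequenceLengthFraction) {
        if (lengths.length == 0) {
            return minimumSequenceLength; // nothing to adapt to (assumption)
        }
        int shortest = Arrays.stream(lengths).min().getAsInt();
        if (shortest >= minimumSequenceLengthFraction * median(lengths)) {
            // All chains are of comparable length: accept the shortest one,
            // but never go below the absolute minimum.
            return Math.max(shortest, absoluteMinimumSequenceLength);
        }
        // The shortest chain is an outlier: keep the configured minimum.
        return minimumSequenceLength;
    }

    public static void main(String[] args) {
        // Three short chains of similar length, as in a collagen-like structure.
        System.out.println(effectiveMinimumLength(new int[] {10, 10, 12}, 20, 5, 0.75)); // 10
        // One short peptide next to two long chains.
        System.out.println(effectiveMinimumLength(new int[] {8, 100, 100}, 20, 5, 0.75)); // 20
    }
}
```

In the first call every chain is short, so the minimum drops to 10 and all chains remain eligible; in the second the 8-residue chain is far below the median, so the configured minimum of 20 stays in force.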
-
-
getSequenceIdentityThreshold
public double getSequenceIdentityThreshold()
Sequence identity threshold to consider for the sequence subunit clustering.
Two subunits with sequence identity equal to or higher than the threshold will be clustered together.
- Returns:
- sequenceIdentityThreshold
-
setSequenceIdentityThreshold
public void setSequenceIdentityThreshold(double sequenceIdentityThreshold)
Sequence identity threshold to consider for the sequence subunit clustering.
Two subunits with sequence identity equal to or higher than the threshold will be clustered together.
- Parameters:
sequenceIdentityThreshold
-
-
getSequenceCoverageThreshold
public double getSequenceCoverageThreshold()
The minimum coverage of the sequence alignment between two subunits to be clustered together.
- Returns:
- sequenceCoverageThreshold
-
setSequenceCoverageThreshold
public void setSequenceCoverageThreshold(double sequenceCoverageThreshold)
The minimum coverage of the sequence alignment between two subunits to be clustered together.
- Parameters:
sequenceCoverageThreshold
-
-
getRMSDThreshold
public double getRMSDThreshold()
Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
- Returns:
- rmsdThreshold
-
setRMSDThreshold
public void setRMSDThreshold(double rmsdThreshold)
Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
- Parameters:
rmsdThreshold
-
-
getTMThreshold
public double getTMThreshold()
Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
- Returns:
- tmThreshold
-
setTMThreshold
public void setTMThreshold(double tmThreshold)
Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
- Parameters:
tmThreshold
-
-
getStructureCoverageThreshold
public double getStructureCoverageThreshold()
The minimum coverage of the structure alignment between two subunits to be clustered together.
- Returns:
- structureCoverageThreshold
-
setStructureCoverageThreshold
public void setStructureCoverageThreshold(double structureCoverageThreshold)
The minimum coverage of the structure alignment between two subunits to be clustered together.
- Parameters:
structureCoverageThreshold
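How the structural thresholds might combine with the useRMSD, useTMScore and useStructureCoverage flags is sketched below. This is an assumption-laden illustration and not BioJava code: the class and method names are hypothetical, and both the combination of checks and the inclusive boundaries are guesses. The only facts relied on are that a lower RMSD and a higher TMScore both indicate greater structural similarity.

```java
// Illustrative sketch only; not part of BioJava. All names are hypothetical.
class StructureMetricSketch {

    /** Returns true if no enabled check rejects the pair. */
    static boolean clusterByStructure(boolean useRMSD, boolean useTMScore,
                                      boolean useStructureCoverage,
                                      double rmsd, double tmScore, double coverage,
                                      double rmsdThreshold, double tmThreshold,
                                      double coverageThreshold) {
        // Coverage is a minimum: too little of the larger structure aligned.
        if (useStructureCoverage && coverage < coverageThreshold) {
            return false;
        }
        // RMSD is a maximum: larger deviation means less similar.
        if (useRMSD && rmsd > rmsdThreshold) {
            return false;
        }
        // TMScore is a minimum: smaller score means less similar.
        if (useTMScore && tmScore < tmThreshold) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Example thresholds chosen for illustration, not library defaults.
        // "Local"-style check: RMSD plus coverage.
        System.out.println(clusterByStructure(true, false, true, 2.0, 0.0, 0.90, 3.0, 0.5, 0.75));
        // "Global"-style check: TMScore only.
        System.out.println(clusterByStructure(false, true, false, 0.0, 0.40, 0.0, 3.0, 0.5, 0.75));
    }
}
```

A disabled flag means the corresponding value is never inspected, which is why the unused arguments in the two calls can be arbitrary.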
-
-
getClustererMethod
public SubunitClustererMethod getClustererMethod()
Method to cluster subunits.
- Returns:
- clustererMethod
-
setClustererMethod
public void setClustererMethod(SubunitClustererMethod method)
Method to cluster subunits.
- Parameters:
method
-
-
isInternalSymmetry
public boolean isInternalSymmetry()
The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.
SubunitClustererMethod.STRUCTURE must be chosen to consider internal symmetry; otherwise this parameter is ignored.
- Returns:
- true if internal symmetry is considered, false otherwise
-
setInternalSymmetry
public void setInternalSymmetry(boolean internalSymmetry)
The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.
SubunitClustererMethod.STRUCTURE must be chosen to consider internal symmetry; otherwise this parameter is ignored.
- Parameters:
internalSymmetry - true if internal symmetry is considered, false otherwise
-
getSuperpositionAlgorithm
public String getSuperpositionAlgorithm()
Method to superpose subunits (i.e., structural aligner).
- Returns:
- superpositionAlgorithm
-
setSuperpositionAlgorithm
public void setSuperpositionAlgorithm(String superpositionAlgorithm)
Method to superpose subunits (i.e., structural aligner).
- Parameters:
superpositionAlgorithm
-
-
isOptimizeAlignment
public boolean isOptimizeAlignment()
Whether the alignment algorithm should optimize the alignment as far as it can, or settle for a quick, approximate result. The effect depends on how the specific algorithm is implemented.
- Returns:
- optimizeAlignment
-
setOptimizeAlignment
public void setOptimizeAlignment(boolean optimizeAlignment)
Whether the alignment algorithm should optimize the alignment as far as it can, or settle for a quick, approximate result. The effect depends on how the specific algorithm is implemented.
- Parameters:
optimizeAlignment
-
-
isUseRMSD
public boolean isUseRMSD()
Use RMSD for evaluating structure similarity.
- Returns:
- useRMSD
-
setUseRMSD
public void setUseRMSD(boolean useRMSD)
Use RMSD for evaluating structure similarity.
- Parameters:
useRMSD
-
-
isUseTMScore
public boolean isUseTMScore()
Use TMScore for evaluating structure similarity.
- Returns:
- useTMScore
-
setUseTMScore
public void setUseTMScore(boolean useTMScore)
Use TMScore for evaluating structure similarity.
- Parameters:
useTMScore
-
-
isUseSequenceCoverage
public boolean isUseSequenceCoverage()
Use sequence coverage for evaluating sequence similarity.
- Returns:
- useSequenceCoverage
-
setUseSequenceCoverage
public void setUseSequenceCoverage(boolean useSequenceCoverage)
Use sequence coverage for evaluating sequence similarity.
- Parameters:
useSequenceCoverage
-
-
isUseStructureCoverage
public boolean isUseStructureCoverage()
Use structure coverage for evaluating structure similarity.
- Returns:
- useStructureCoverage
-
setUseStructureCoverage
public void setUseStructureCoverage(boolean useStructureCoverage)
Use structure coverage for evaluating structure similarity.
- Parameters:
useStructureCoverage
-
-
isUseGlobalMetrics
public boolean isUseGlobalMetrics()
Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only.
- Returns:
- useGlobalMetrics
-
setUseGlobalMetrics
public void setUseGlobalMetrics(boolean useGlobalMetrics)
Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only.
- Parameters:
useGlobalMetrics
-
-
isHighConfidenceScores
public boolean isHighConfidenceScores(double sequenceIdentity, double sequenceCoverage)
Whether the subunits can be considered "identical" by sequence alignment. For local sequence alignment (normalised by the number of aligned pairs) this means 0.95 or higher identity and 0.75 or higher coverage. For global sequence alignment (normalised by the alignment length) this means 0.85 or higher sequence identity.
- Parameters:
sequenceIdentity -
sequenceCoverage -
- Returns:
- true if the sequence alignment scores are equal to or better than the "high confidence" scores, false otherwise.
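The "high confidence" rule described above can be restated as a small self-contained check. This is a sketch based only on the cut-offs quoted in the description (0.95 and 0.75 for local metrics, 0.85 for global metrics). It is not the BioJava source, and the class name and the explicit useGlobalMetrics argument are hypothetical; in the real class the metric type is part of the parameter object's state.

```java
// Illustrative sketch only; not part of BioJava. All names are hypothetical.
class HighConfidenceSketch {

    // Cut-offs quoted in the method description.
    static final double LOCAL_IDENTITY = 0.95;
    static final double LOCAL_COVERAGE = 0.75;
    static final double GLOBAL_IDENTITY = 0.85;

    static boolean isHighConfidence(boolean useGlobalMetrics,
                                    double sequenceIdentity, double sequenceCoverage) {
        if (useGlobalMetrics) {
            // Global alignment: identity alone decides.
            return sequenceIdentity >= GLOBAL_IDENTITY;
        }
        // Local alignment: both identity and coverage must reach their cut-offs.
        return sequenceIdentity >= LOCAL_IDENTITY && sequenceCoverage >= LOCAL_COVERAGE;
    }

    public static void main(String[] args) {
        System.out.println(isHighConfidence(false, 0.97, 0.80)); // true
        System.out.println(isHighConfidence(false, 0.97, 0.60)); // false
        System.out.println(isHighConfidence(true, 0.90, 0.10));  // true
    }
}
```

The lower identity cut-off for global metrics is consistent with the normalisation: dividing by the full alignment length already penalises unaligned regions, which the local variant captures through the separate coverage check.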
-
isUseEntityIdForSeqIdentityDetermination
public boolean isUseEntityIdForSeqIdentityDetermination()
Whether to use the entity id of subunits to infer that sequences are identical. Only applies if the SubunitClustererMethod is a sequence-based one.
- Returns:
- the flag
- Since:
- 5.4.0
-
setUseEntityIdForSeqIdentityDetermination
public void setUseEntityIdForSeqIdentityDetermination(boolean useEntityIdForSeqIdentityDetermination)
Whether to use the entity id of subunits to infer that sequences are identical. Only applies if the SubunitClustererMethod is a sequence-based one. Note that this requires FileParsingParameters.setAlignSeqRes(boolean) to be set to true.
- Parameters:
useEntityIdForSeqIdentityDetermination - the flag to be set
- Since:
- 5.4.0
-
-