Class SubunitClustererParameters

java.lang.Object
org.biojava.nbio.structure.cluster.SubunitClustererParameters
All Implemented Interfaces:
Serializable

public class SubunitClustererParameters extends Object implements Serializable
The SubunitClustererParameters specifies the options used for the clustering of the subunits in structures using the SubunitClusterer.
Since:
5.0.0
Author:
Peter Rose, Aleix Lafita
  • Constructor Summary

    Constructors
    Constructor
    Description
    SubunitClustererParameters()
    Initialize with "local" metrics by default.
    SubunitClustererParameters(boolean useGlobalMetrics)
    "Local" metrics score SubunitClustererMethod.SEQUENCE by the sequence identity of a local alignment (normalised by the number of aligned residues) and the sequence coverage of the alignment (normalised by the length of the longer sequence), and SubunitClustererMethod.STRUCTURE by the RMSD of the aligned substructures and the structure coverage of the alignment (normalised by the length of the larger structure). Two thresholds for each method are required.
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    getAbsoluteMinimumSequenceLength()
    If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
    SubunitClustererMethod
    getClustererMethod()
    Method to cluster subunits.
    int
    getMinimumSequenceLength()
    Get the minimum number of residues of a subunit to be considered in the clusters.
    double
    getMinimumSequenceLengthFraction()
    If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
    double
    getRMSDThreshold()
    Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
    double
    getSequenceCoverageThreshold()
    The minimum coverage of the sequence alignment between two subunits to be clustered together.
    double
    getSequenceIdentityThreshold()
    Sequence identity threshold to consider for the sequence subunit clustering.
    double
    getStructureCoverageThreshold()
    The minimum coverage of the structure alignment between two subunits to be clustered together.
    String
    getSuperpositionAlgorithm()
    Method to superpose subunits (i.e., structural aligner).
    double
    getTMThreshold()
    Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
    boolean
    isHighConfidenceScores(double sequenceIdentity, double sequenceCoverage)
    Whether the subunits can be considered "identical" by sequence alignment.
    boolean
    isInternalSymmetry()
    The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.
    boolean
    isOptimizeAlignment()
    Whether the alignment algorithm should try its best to optimize the alignment, or we are happy with a quick and dirty result.
    boolean
    isUseEntityIdForSeqIdentityDetermination()
    Whether to use the entity id of subunits to infer that sequences are identical.
    boolean
    isUseGlobalMetrics()
    Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only.
    boolean
    isUseRMSD()
    Use RMSD for evaluating structure similarity.
    boolean
    isUseSequenceCoverage()
    Use sequence coverage for evaluating sequence similarity.
    boolean
    isUseStructureCoverage()
    Use structure coverage for evaluating structure similarity.
    boolean
    isUseTMScore()
    Use TMScore for evaluating structure similarity.
    void
    setAbsoluteMinimumSequenceLength(int absoluteMinimumSequenceLength)
    If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
    void
    setClustererMethod(SubunitClustererMethod method)
    Method to cluster subunits.
    void
    setInternalSymmetry(boolean internalSymmetry)
    The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.
    void
    setMinimumSequenceLength(int minimumSequenceLength)
    Set the minimum number of residues of a subunit to be considered in the clusters.
    void
    setMinimumSequenceLengthFraction(double minimumSequenceLengthFraction)
    If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.
    void
    setOptimizeAlignment(boolean optimizeAlignment)
    Whether the alignment algorithm should try its best to optimize the alignment, or we are happy with a quick and dirty result.
    void
    setRMSDThreshold(double rmsdThreshold)
    Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
    void
    setSequenceCoverageThreshold(double sequenceCoverageThreshold)
    The minimum coverage of the sequence alignment between two subunits to be clustered together.
    void
    setSequenceIdentityThreshold(double sequenceIdentityThreshold)
    Sequence identity threshold to consider for the sequence subunit clustering.
    void
    setStructureCoverageThreshold(double structureCoverageThreshold)
    The minimum coverage of the structure alignment between two subunits to be clustered together.
    void
    setSuperpositionAlgorithm(String superpositionAlgorithm)
    Method to superpose subunits (i.e., structural aligner).
    void
    setTMThreshold(double tmThreshold)
    Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
    void
    setUseEntityIdForSeqIdentityDetermination(boolean useEntityIdForSeqIdentityDetermination)
    Whether to use the entity id of subunits to infer that sequences are identical.
    void
    setUseGlobalMetrics(boolean useGlobalMetrics)
    Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only
    void
    setUseRMSD(boolean useRMSD)
    Use RMSD for evaluating structure similarity
    void
    setUseSequenceCoverage(boolean useSequenceCoverage)
    Use sequence coverage for evaluating sequence similarity
    void
    setUseStructureCoverage(boolean useStructureCoverage)
    Use structure coverage for evaluating structure similarity.
    void
    setUseTMScore(boolean useTMScore)
    Use TMScore for evaluating structure similarity
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • SubunitClustererParameters

      public SubunitClustererParameters(boolean useGlobalMetrics)
      "Local" metrics score:
        • SubunitClustererMethod.SEQUENCE: sequence identity of a local alignment (normalised by the number of aligned residues) and sequence coverage of the alignment (normalised by the length of the longer sequence).
        • SubunitClustererMethod.STRUCTURE: RMSD of the aligned substructures and structure coverage of the alignment (normalised by the length of the larger structure).
      Two thresholds for each method are required.

      "Global" metrics score:
        • SubunitClustererMethod.SEQUENCE: sequence identity of a global alignment (normalised by the length of the alignment).
        • SubunitClustererMethod.STRUCTURE: TMScore of the aligned structures (normalised by the length of the larger structure).
      One threshold for each method is required.
    • SubunitClustererParameters

      public SubunitClustererParameters()
      Initialize with "local" metrics by default.
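      The normalisations named for the "local" and "global" sequence metrics can be illustrated with a small worked example. The following is a minimal sketch in plain Java, not BioJava code: the class AlignmentMetricsSketch and its methods are hypothetical, the alignment is given as two gapped rows of equal length, and "length of the alignment" is assumed to mean the number of alignment columns including gaps.

```java
public class AlignmentMetricsSketch {

    private static final char GAP = '-';

    private static void checkSameLength(String rowA, String rowB) {
        if (rowA.length() != rowB.length()) {
            throw new IllegalArgumentException("Alignment rows must have equal length");
        }
    }

    /** Columns in which both rows hold a residue (no gap on either side). */
    static int alignedPairs(String rowA, String rowB) {
        checkSameLength(rowA, rowB);
        int pairs = 0;
        for (int i = 0; i < rowA.length(); i++) {
            if (rowA.charAt(i) != GAP && rowB.charAt(i) != GAP) pairs++;
        }
        return pairs;
    }

    /** Aligned columns in which both rows hold the same residue. */
    static int identicalPairs(String rowA, String rowB) {
        checkSameLength(rowA, rowB);
        int identical = 0;
        for (int i = 0; i < rowA.length(); i++) {
            char a = rowA.charAt(i);
            if (a != GAP && a == rowB.charAt(i)) identical++;
        }
        return identical;
    }

    /** Number of residues (non-gap characters) in one row. */
    static int residueCount(String row) {
        int count = 0;
        for (int i = 0; i < row.length(); i++) {
            if (row.charAt(i) != GAP) count++;
        }
        return count;
    }

    /** "Local" identity: identical pairs divided by the number of aligned pairs. */
    static double localIdentity(String rowA, String rowB) {
        int aligned = alignedPairs(rowA, rowB);
        return aligned == 0 ? 0.0 : (double) identicalPairs(rowA, rowB) / aligned;
    }

    /** Coverage: aligned pairs divided by the length of the longer sequence. */
    static double coverage(String rowA, String rowB) {
        int longer = Math.max(residueCount(rowA), residueCount(rowB));
        return longer == 0 ? 0.0 : (double) alignedPairs(rowA, rowB) / longer;
    }

    /** "Global" identity: identical pairs divided by the number of alignment columns (assumption). */
    static double globalIdentity(String rowA, String rowB) {
        checkSameLength(rowA, rowB);
        return rowA.isEmpty() ? 0.0 : (double) identicalPairs(rowA, rowB) / rowA.length();
    }

    public static void main(String[] args) {
        String rowA = "ACDEFGHIKL"; // 10 residues
        String rowB = "ACDEF-----"; // 5 residues, all aligned and identical
        System.out.println(localIdentity(rowA, rowB));  // 1.0
        System.out.println(coverage(rowA, rowB));       // 0.5
        System.out.println(globalIdentity(rowA, rowB)); // 0.5
    }
}
```

      In this example the short chain matches perfectly over the aligned part, so the local identity alone would look ideal; the coverage value is what reveals that only half of the longer sequence is aligned, which is why two thresholds are needed under "local" metrics.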
  • Method Details

    • getMinimumSequenceLength

      public int getMinimumSequenceLength()
      Get the minimum number of residues of a subunit to be considered in the clusters.
      Returns:
      minimumSequenceLength
    • setMinimumSequenceLength

      public void setMinimumSequenceLength(int minimumSequenceLength)
      Set the minimum number of residues of a subunit to be considered in the clusters.
      Parameters:
      minimumSequenceLength -
    • getAbsoluteMinimumSequenceLength

      public int getAbsoluteMinimumSequenceLength()
      If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.

      This adaptive feature makes it possible to handle structures composed mainly of very short chains, such as collagen (1A3I).

      Returns:
      the absoluteMinimumSequenceLength
    • setAbsoluteMinimumSequenceLength

      public void setAbsoluteMinimumSequenceLength(int absoluteMinimumSequenceLength)
      If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.

      This adaptive feature makes it possible to handle structures composed mainly of very short chains, such as collagen (1A3I).

      Parameters:
      absoluteMinimumSequenceLength -
    • getMinimumSequenceLengthFraction

      public double getMinimumSequenceLengthFraction()
      If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.

      This adaptive feature makes it possible to handle structures composed mainly of very short chains, such as collagen (1A3I).

      Returns:
      the minimumSequenceLengthFraction
    • setMinimumSequenceLengthFraction

      public void setMinimumSequenceLengthFraction(double minimumSequenceLengthFraction)
      If the shortest subunit sequence length is greater than or equal to the minimumSequenceLengthFraction times the median subunit sequence length, then the minimumSequenceLength is set to the shortest subunit sequence length, but not shorter than the absoluteMinimumSequenceLength.

      This adaptive feature makes it possible to handle structures composed mainly of very short chains, such as collagen (1A3I).

      Parameters:
      minimumSequenceLengthFraction -
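      The adaptive rule described for minimumSequenceLengthFraction and absoluteMinimumSequenceLength can be sketched as follows. This is an illustration in plain Java, not the library's implementation: the class AdaptiveMinimumLengthSketch and the method effectiveMinimumLength are hypothetical, the median is assumed to be the usual middle value (mean of the two middle values for an even count), and the sketch assumes that the configured minimumSequenceLength is kept unchanged when the condition does not hold. The numeric values are illustrative and are not claimed to be the defaults.

```java
import java.util.Arrays;

public class AdaptiveMinimumLengthSketch {

    /**
     * Returns the minimum sequence length to apply, following the rule in the text:
     * if shortest >= fraction * median, use the shortest length, but never less
     * than the absolute minimum. Otherwise (assumption) keep the configured minimum.
     */
    static int effectiveMinimumLength(int[] subunitLengths,
                                      int minimumSequenceLength,
                                      int absoluteMinimumSequenceLength,
                                      double minimumSequenceLengthFraction) {
        if (subunitLengths.length == 0) {
            throw new IllegalArgumentException("At least one subunit length is required");
        }
        int[] sorted = subunitLengths.clone();
        Arrays.sort(sorted);

        int shortest = sorted[0];
        int n = sorted.length;
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

        if (shortest >= minimumSequenceLengthFraction * median) {
            return Math.max(shortest, absoluteMinimumSequenceLength);
        }
        return minimumSequenceLength;
    }

    public static void main(String[] args) {
        // Chains of similar length: the shortest chain sets the minimum.
        System.out.println(effectiveMinimumLength(new int[] {30, 30, 30, 29}, 20, 5, 0.75)); // 29
        // One short outlier among long chains: the configured minimum is kept.
        System.out.println(effectiveMinimumLength(new int[] {10, 200, 210, 220}, 20, 5, 0.75)); // 20
        // Very short chains only: clamped to the absolute minimum.
        System.out.println(effectiveMinimumLength(new int[] {3, 3, 3}, 20, 5, 0.75)); // 5
    }
}
```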
    • getSequenceIdentityThreshold

      public double getSequenceIdentityThreshold()
      Sequence identity threshold to consider for the sequence subunit clustering.

      Two subunits with sequence identity equal to or higher than the threshold will be clustered together.

      Returns:
      sequenceIdentityThreshold
    • setSequenceIdentityThreshold

      public void setSequenceIdentityThreshold(double sequenceIdentityThreshold)
      Sequence identity threshold to consider for the sequence subunit clustering.

      Two subunits with sequence identity equal to or higher than the threshold will be clustered together.

      Parameters:
      sequenceIdentityThreshold -
    • getSequenceCoverageThreshold

      public double getSequenceCoverageThreshold()
      The minimum coverage of the sequence alignment between two subunits to be clustered together.
      Returns:
      sequenceCoverageThreshold
    • setSequenceCoverageThreshold

      public void setSequenceCoverageThreshold(double sequenceCoverageThreshold)
      The minimum coverage of the sequence alignment between two subunits to be clustered together.
      Parameters:
      sequenceCoverageThreshold -
    • getRMSDThreshold

      public double getRMSDThreshold()
      Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
      Returns:
      rmsdThreshold
    • setRMSDThreshold

      public void setRMSDThreshold(double rmsdThreshold)
      Structure similarity threshold (measured with RMSD) to consider for the structural subunit clustering.
      Parameters:
      rmsdThreshold -
    • getTMThreshold

      public double getTMThreshold()
      Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
      Returns:
      tmThreshold
    • setTMThreshold

      public void setTMThreshold(double tmThreshold)
      Structure similarity threshold (measured with TMScore) to consider for the structural subunit clustering.
      Parameters:
      tmThreshold -
    • getStructureCoverageThreshold

      public double getStructureCoverageThreshold()
      The minimum coverage of the structure alignment between two subunits to be clustered together.
      Returns:
      structureCoverageThreshold
    • setStructureCoverageThreshold

      public void setStructureCoverageThreshold(double structureCoverageThreshold)
      The minimum coverage of the structure alignment between two subunits to be clustered together.
      Parameters:
      structureCoverageThreshold -
    • getClustererMethod

      public SubunitClustererMethod getClustererMethod()
      Method to cluster subunits.
      Returns:
      clustererMethod
    • setClustererMethod

      public void setClustererMethod(SubunitClustererMethod method)
      Method to cluster subunits.
      Parameters:
      method -
    • isInternalSymmetry

      public boolean isInternalSymmetry()
      The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.

      The SubunitClustererMethod.STRUCTURE must be chosen to consider internal symmetry, otherwise this parameter will be ignored.

      Returns:
      true if internal symmetry is considered, false otherwise
    • setInternalSymmetry

      public void setInternalSymmetry(boolean internalSymmetry)
      The internal symmetry option divides each Subunit of each SubunitCluster into its internally symmetric repeats.

      The SubunitClustererMethod.STRUCTURE must be chosen to consider internal symmetry, otherwise this parameter will be ignored.

      Parameters:
      internalSymmetry - true if internal symmetry is considered, false otherwise
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • getSuperpositionAlgorithm

      public String getSuperpositionAlgorithm()
      Method to superpose subunits (i.e., structural aligner).
      Returns:
      superpositionAlgorithm
    • setSuperpositionAlgorithm

      public void setSuperpositionAlgorithm(String superpositionAlgorithm)
      Method to superpose subunits (i.e., structural aligner).
      Parameters:
      superpositionAlgorithm -
    • isOptimizeAlignment

      public boolean isOptimizeAlignment()
      Whether the alignment algorithm should try its best to optimize the alignment, or whether a quick and dirty result is acceptable. The effect depends on the implementation of the specific algorithm's method.
      Returns:
      optimizeAlignment
    • setOptimizeAlignment

      public void setOptimizeAlignment(boolean optimizeAlignment)
      Whether the alignment algorithm should try its best to optimize the alignment, or whether a quick and dirty result is acceptable. The effect depends on the implementation of the specific algorithm's method.
      Parameters:
      optimizeAlignment -
    • isUseRMSD

      public boolean isUseRMSD()
      Use RMSD for evaluating structure similarity
      Returns:
      useRMSD
    • setUseRMSD

      public void setUseRMSD(boolean useRMSD)
      Use RMSD for evaluating structure similarity
      Parameters:
      useRMSD -
    • isUseTMScore

      public boolean isUseTMScore()
      Use TMScore for evaluating structure similarity
      Returns:
      useTMScore
    • setUseTMScore

      public void setUseTMScore(boolean useTMScore)
      Use TMScore for evaluating structure similarity
      Parameters:
      useTMScore -
    • isUseSequenceCoverage

      public boolean isUseSequenceCoverage()
      Use sequence coverage for evaluating sequence similarity
      Returns:
      useSequenceCoverage
    • setUseSequenceCoverage

      public void setUseSequenceCoverage(boolean useSequenceCoverage)
      Use sequence coverage for evaluating sequence similarity
      Parameters:
      useSequenceCoverage -
    • isUseStructureCoverage

      public boolean isUseStructureCoverage()
      Use structure coverage for evaluating structure similarity.
      Returns:
      useStructureCoverage
    • setUseStructureCoverage

      public void setUseStructureCoverage(boolean useStructureCoverage)
      Use structure coverage for evaluating structure similarity.
      Parameters:
      useStructureCoverage -
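      One way to picture how the useRMSD, useTMScore and useStructureCoverage flags could combine with their thresholds is sketched below. This is a hypothetical plain-Java illustration, not the clusterer's actual logic: the class StructureMatchSketch is invented for this example, and it assumes that a lower RMSD and a higher TMScore or coverage indicate a better match, and that every enabled criterion must pass. The threshold values are illustrative only.

```java
public class StructureMatchSketch {

    private final boolean useRMSD;
    private final boolean useTMScore;
    private final boolean useStructureCoverage;
    private final double rmsdThreshold;
    private final double tmThreshold;
    private final double structureCoverageThreshold;

    StructureMatchSketch(boolean useRMSD, boolean useTMScore, boolean useStructureCoverage,
                         double rmsdThreshold, double tmThreshold,
                         double structureCoverageThreshold) {
        this.useRMSD = useRMSD;
        this.useTMScore = useTMScore;
        this.useStructureCoverage = useStructureCoverage;
        this.rmsdThreshold = rmsdThreshold;
        this.tmThreshold = tmThreshold;
        this.structureCoverageThreshold = structureCoverageThreshold;
    }

    /** Returns true if every enabled criterion is satisfied (assumed combination rule). */
    boolean accepts(double rmsd, double tmScore, double structureCoverage) {
        if (useStructureCoverage && structureCoverage < structureCoverageThreshold) {
            return false; // too little of the larger structure is aligned
        }
        if (useRMSD && rmsd > rmsdThreshold) {
            return false; // aligned substructures deviate too much
        }
        if (useTMScore && tmScore < tmThreshold) {
            return false; // overall structural similarity is too low
        }
        return true;
    }

    public static void main(String[] args) {
        // "Local" style: RMSD plus structure coverage, TMScore ignored.
        StructureMatchSketch local = new StructureMatchSketch(true, false, true, 3.0, 0.5, 0.75);
        System.out.println(local.accepts(2.1, 0.3, 0.9)); // true
        System.out.println(local.accepts(2.1, 0.3, 0.6)); // false

        // "Global" style: TMScore only.
        StructureMatchSketch global = new StructureMatchSketch(false, true, false, 3.0, 0.5, 0.75);
        System.out.println(global.accepts(9.0, 0.6, 0.1)); // true
    }
}
```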
    • isUseGlobalMetrics

      public boolean isUseGlobalMetrics()
      Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only
      Returns:
      useGlobalMetrics
    • setUseGlobalMetrics

      public void setUseGlobalMetrics(boolean useGlobalMetrics)
      Use metrics calculated relative to the whole sequence or structure, rather than the aligned part only
      Parameters:
      useGlobalMetrics -
    • isHighConfidenceScores

      public boolean isHighConfidenceScores(double sequenceIdentity, double sequenceCoverage)
      Whether the subunits can be considered "identical" by sequence alignment. For local sequence alignment (normalized by the number of aligned pairs) this means 0.95 or higher identity and 0.75 or higher coverage. For global sequence alignment (normalised by the alignment length) this means 0.85 or higher sequence identity.
      Parameters:
      sequenceIdentity -
      sequenceCoverage -
      Returns:
      true if the sequence alignment scores are equal to or better than the "high confidence" scores, false otherwise.
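      The "high confidence" rule stated above can be written out as a short check. The following is a minimal sketch in plain Java that uses only the cut-offs given in the description (0.95 identity and 0.75 coverage for local alignments, 0.85 identity for global alignments). The class HighConfidenceSketch and the explicit useGlobalMetrics argument are hypothetical and are not part of the documented method signature.

```java
public class HighConfidenceSketch {

    /**
     * Applies the documented cut-offs: global metrics need identity >= 0.85;
     * local metrics need identity >= 0.95 and coverage >= 0.75.
     */
    static boolean isHighConfidence(boolean useGlobalMetrics,
                                    double sequenceIdentity,
                                    double sequenceCoverage) {
        if (useGlobalMetrics) {
            return sequenceIdentity >= 0.85;
        }
        return sequenceIdentity >= 0.95 && sequenceCoverage >= 0.75;
    }

    public static void main(String[] args) {
        System.out.println(isHighConfidence(false, 0.97, 0.80)); // true
        System.out.println(isHighConfidence(false, 0.97, 0.50)); // false
        System.out.println(isHighConfidence(true, 0.90, 0.10));  // true
    }
}
```

      Coverage is ignored in the global case because a global alignment is already normalised by the alignment length, so a single identity value reflects both how well and how much of the sequences match.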
    • isUseEntityIdForSeqIdentityDetermination

      public boolean isUseEntityIdForSeqIdentityDetermination()
      Whether to use the entity id of subunits to infer that sequences are identical. Only applies if the SubunitClustererMethod is a sequence based one.
      Returns:
      the flag
      Since:
      5.4.0
    • setUseEntityIdForSeqIdentityDetermination

      public void setUseEntityIdForSeqIdentityDetermination(boolean useEntityIdForSeqIdentityDetermination)
      Whether to use the entity id of subunits to infer that sequences are identical. Only applies if the SubunitClustererMethod is a sequence based one. Note this requires FileParsingParameters.setAlignSeqRes(boolean) to be set to true.
      Parameters:
      useEntityIdForSeqIdentityDetermination - the flag to be set
      Since:
      5.4.0