public class StructureSequenceMatcher extends Object
Constructor and Description |
---|
StructureSequenceMatcher() |
Modifier and Type | Method and Description |
---|---|
static ProteinSequence | getProteinSequenceForStructure(Structure struct, Map<Integer,Group> groupIndexPosition) Generates a ProteinSequence corresponding to the sequence of struct, and maintains a mapping from the sequence back to the original groups. |
static Structure | getSubstructureMatchingProteinSequence(ProteinSequence sequence, Structure wholeStructure) |
static ResidueNumber[] | matchSequenceToStructure(ProteinSequence seq, Structure struct) Given a sequence and the corresponding Structure, get the ResidueNumber for each residue in the sequence. |
static ProteinSequence | removeGaps(ProteinSequence gapped) Removes all gaps ('-') from a protein sequence. |
static <T> T[][] | removeGaps(T[][] gapped) Creates a new array consisting of all columns of gapped in which no row contains a null value. |
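The two removeGaps overloads in the summary above can be illustrated with a minimal, self-contained sketch that does not call BioJava. It works on a plain String and a plain object matrix instead of a ProteinSequence. The class name GapRemovalSketch is hypothetical, and the column-dropping logic is an assumption based only on the one-line descriptions above, not BioJava's actual implementation.

```java
import java.util.Arrays;

public class GapRemovalSketch {

    // Drops every '-' character from a gapped sequence string.
    public static String removeGaps(String gapped) {
        StringBuilder ungapped = new StringBuilder(gapped.length());
        for (char c : gapped.toCharArray()) {
            if (c != '-') {
                ungapped.append(c);
            }
        }
        return ungapped.toString();
    }

    // Keeps only the columns in which no row holds null.
    // Assumes a rectangular matrix (all rows have the same length).
    public static <T> T[][] removeGaps(T[][] gapped) {
        if (gapped.length == 0) {
            return gapped;
        }
        int columns = gapped[0].length;
        boolean[] keep = new boolean[columns];
        int kept = 0;
        for (int col = 0; col < columns; col++) {
            keep[col] = true;
            for (T[] row : gapped) {
                if (row[col] == null) {
                    keep[col] = false;
                    break;
                }
            }
            if (keep[col]) {
                kept++;
            }
        }
        // Arrays.copyOf preserves the runtime array types, so no unchecked casts are needed.
        T[][] result = Arrays.copyOf(gapped, gapped.length);
        for (int r = 0; r < gapped.length; r++) {
            T[] row = Arrays.copyOf(gapped[r], kept);
            int out = 0;
            for (int col = 0; col < columns; col++) {
                if (keep[col]) {
                    row[out++] = gapped[r][col];
                }
            }
            result[r] = row;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(removeGaps("AC-DE--F")); // ACDEF
        Character[][] aligned = {
            {'A', null, 'C', 'D'},
            {'A', 'B', 'C', null},
        };
        // Columns 1 and 3 each contain a null, so only columns 0 and 2 survive.
        System.out.println(Arrays.deepToString(removeGaps(aligned))); // [[A, C], [A, C]]
    }
}
```

The input matrix is left unmodified; a new outer array and new rows are returned.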
public StructureSequenceMatcher()
public static Structure getSubstructureMatchingProteinSequence(ProteinSequence sequence, Structure wholeStructure)
Gets a substructure of wholeStructure containing only the Groups that are included in sequence. The resulting structure will contain only ATOM residues; the SEQ-RES will be empty. The Chains of the Structure will be new instances (cloned), but the Groups will not.
sequence
- The input protein sequence
wholeStructure
- The structure from which to take a substructure
Throws StructureException.
See also matchSequenceToStructure(ProteinSequence, Structure).
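The "new Chains, same Groups" behaviour described for getSubstructureMatchingProteinSequence can be pictured with a small stand-alone sketch. It does not use BioJava: Residue is a hypothetical stand-in for a Group, a List plays the role of a Chain, and the matched array is assumed to come from some earlier sequence-to-structure matching step. It only illustrates the idea of a new container holding the original objects.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstructureSketch {

    // Hypothetical stand-in for a residue (a BioJava Group); not a BioJava type.
    public static final class Residue {
        public final String name;

        public Residue(String name) {
            this.name = name;
        }
    }

    // Returns a new chain holding only the residues at the matched positions.
    // matched[i] is the chain index aligned to sequence position i, or null if none.
    public static List<Residue> subChain(List<Residue> chain, Integer[] matched) {
        List<Residue> result = new ArrayList<>(); // new container, like a cloned Chain
        for (Integer index : matched) {
            if (index != null) {
                result.add(chain.get(index)); // same Residue object, not a copy
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Residue> chain = new ArrayList<>();
        for (String name : new String[] {"MET", "ALA", "GLY", "SER"}) {
            chain.add(new Residue(name));
        }
        Integer[] matched = {0, null, 2}; // second sequence residue has no match
        List<Residue> sub = subChain(chain, matched);
        System.out.println(sub.size());                 // 2
        System.out.println(sub.get(1).name);            // GLY
        System.out.println(sub.get(1) == chain.get(2)); // true
    }
}
```

Because the residues are shared rather than copied, a change made to a residue through the substructure is also visible through the original chain.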
public static ProteinSequence getProteinSequenceForStructure(Structure struct, Map<Integer,Group> groupIndexPosition)
struct
- Input structure
groupIndexPosition
- An empty map, which will be populated with (residue index in returned ProteinSequence) -> (Group within struct)
See also SeqRes2AtomAligner#getFullAtomSequence(List, Map), which does the heavy lifting.
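The contract of the groupIndexPosition parameter (an empty map that is filled with sequence index -> originating group) can be sketched without BioJava as follows. Residue and SequenceMappingSketch are hypothetical names, and concatenating the chains in input order is an assumption of this sketch rather than something stated above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SequenceMappingSketch {

    // Hypothetical stand-in for a residue (a BioJava Group); not a BioJava type.
    public static final class Residue {
        public final String id;
        public final char code;

        public Residue(String id, char code) {
            this.id = id;
            this.code = code;
        }
    }

    // Builds the sequence string and fills indexToResidue with
    // (index in the returned string) -> (residue that produced that character).
    // Assumption: chains are simply concatenated in input order.
    public static String sequenceOf(List<List<Residue>> chains,
                                    Map<Integer, Residue> indexToResidue) {
        StringBuilder sequence = new StringBuilder();
        for (List<Residue> chain : chains) {
            for (Residue residue : chain) {
                indexToResidue.put(sequence.length(), residue);
                sequence.append(residue.code);
            }
        }
        return sequence.toString();
    }

    public static void main(String[] args) {
        List<Residue> chainA = Arrays.asList(new Residue("A:1", 'M'), new Residue("A:2", 'A'));
        List<Residue> chainB = Arrays.asList(new Residue("B:1", 'G'));
        Map<Integer, Residue> indexToResidue = new LinkedHashMap<>(); // starts empty
        String sequence = sequenceOf(Arrays.asList(chainA, chainB), indexToResidue);
        System.out.println(sequence);                     // MAG
        System.out.println(indexToResidue.get(2).id);     // B:1
        System.out.println(indexToResidue.size());        // 3
    }
}
```

Passing the map in as an output parameter lets the caller choose the map implementation while the method returns the sequence itself.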
public static ResidueNumber[] matchSequenceToStructure(ProteinSequence seq, Structure struct)
Smith-Waterman alignment is used to match the sequences. Residues in the sequence but not the structure or mismatched between sequence and structure will have a null atom, while residues in the structure but not the sequence are ignored with a warning.
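As a rough illustration of the behaviour just described, the following stand-alone sketch runs a basic Smith-Waterman local alignment between two plain strings and reports, for each sequence position, the aligned structure index or null. It is not BioJava's implementation: the class name LocalAlignmentSketch, the use of strings and integer indices instead of ProteinSequence and ResidueNumber, and the scores (match +2, mismatch -1, linear gap -2) are all assumptions made for this example.

```java
import java.util.Arrays;

public class LocalAlignmentSketch {

    // Assumed scores; not BioJava's substitution matrix or gap penalties.
    private static final int MATCH = 2;
    private static final int MISMATCH = -1;
    private static final int GAP = -2;

    // Returns, for each position of seq, the aligned index in structureSeq,
    // or null when the residue is unaligned or aligned to a different letter.
    public static Integer[] match(String seq, String structureSeq) {
        int n = seq.length();
        int m = structureSeq.length();
        int[][] score = new int[n + 1][m + 1];
        int best = 0;
        int bestI = 0;
        int bestJ = 0;
        // Fill the local-alignment score matrix, remembering the best cell.
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int sub = seq.charAt(i - 1) == structureSeq.charAt(j - 1) ? MATCH : MISMATCH;
                int value = Math.max(0, score[i - 1][j - 1] + sub);
                value = Math.max(value, score[i - 1][j] + GAP);
                value = Math.max(value, score[i][j - 1] + GAP);
                score[i][j] = value;
                if (value > best) {
                    best = value;
                    bestI = i;
                    bestJ = j;
                }
            }
        }
        // Trace back from the best cell until the score drops to zero.
        Integer[] mapping = new Integer[n]; // every entry starts as null
        int i = bestI;
        int j = bestJ;
        while (i > 0 && j > 0 && score[i][j] > 0) {
            int sub = seq.charAt(i - 1) == structureSeq.charAt(j - 1) ? MATCH : MISMATCH;
            if (score[i][j] == score[i - 1][j - 1] + sub) {
                if (sub == MATCH) {
                    mapping[i - 1] = j - 1; // identical residues: record the pairing
                }
                i--;
                j--;
            } else if (score[i][j] == score[i - 1][j] + GAP) {
                i--; // residue in seq with no counterpart in the structure
            } else {
                j--; // residue in the structure with no counterpart in seq
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // 'D' is absent from the structure sequence, so it maps to null.
        System.out.println(Arrays.toString(match("ACDEFG", "ACEFG"))); // [0, 1, null, 2, 3, 4]
    }
}
```

Positions outside the best local alignment also stay null, which mirrors the documented handling of residues that are present in the sequence but not in the structure.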
seq
- The protein sequence. Should match the sequence of struct very closely.
struct
- The corresponding protein structure
public static ProteinSequence removeGaps(ProteinSequence gapped)
gapped
- The protein sequence from which gaps ('-') are removed
public static <T> T[][] removeGaps(T[][] gapped)
gapped
- A rectangular matrix containing null to mark gaps
Copyright © 2000–2019 BioJava. All rights reserved.