BioJava:CookBook:DP:WeightMatrix
How do I use a WeightMatrix to find a motif?
A Weight Matrix is a useful way of representing an alignment or a motif. It can also be used as a scoring matrix to detect a similar motif in a sequence. BioJava contains a class called WeightMatrix in the org.biojava.bio.dp package. There is also a WeightMatrixAnnotator, which uses the WeightMatrix to add a Feature to every portion of the searched sequence that exceeds the scoring threshold.
The following program generates a WeightMatrix from an alignment and uses that matrix to annotate a Sequence with a threshold of 0.1.
```java
import java.util.*;

import org.biojava.bio.dist.*;
import org.biojava.bio.dp.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;

public class WeightMatrixDemo {
  public static void main(String[] args) throws Exception {
    //make an Alignment of a motif.
    Map map = new HashMap();
    map.put("seq0", DNATools.createDNA("aggag"));
    map.put("seq1", DNATools.createDNA("aggaa"));
    map.put("seq2", DNATools.createDNA("aggag"));
    map.put("seq3", DNATools.createDNA("aagag"));
    Alignment align = new SimpleAlignment(map);

    //make a Distribution[] of the motif
    Distribution[] dists =
        DistributionTools.distOverAlignment(align, false, 0.01);

    //make a Weight Matrix
    WeightMatrix matrix = new SimpleWeightMatrix(dists);

    //the sequence to score against
    Sequence seq = DNATools.createDNASequence("aaagcctaggaagaggagctgat", "seq");

    //annotate the sequence with the weight matrix using a low threshold (0.1)
    WeightMatrixAnnotator wma = new WeightMatrixAnnotator(matrix, 0.1);
    seq = wma.annotate(seq);

    //output match information
    for (Iterator it = seq.features(); it.hasNext(); ) {
      Feature f = (Feature) it.next();
      Location loc = f.getLocation();
      System.out.println("Match at " + loc.getMin() + "-" + loc.getMax());
      System.out.println("\tscore : " + f.getAnnotation().getProperty("score"));
    }
  }
}
```
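To make the idea of scoring against a threshold more concrete, here is a minimal, library-free sketch of the same workflow in plain Java. It is not BioJava's implementation. The class and method names (WeightMatrixSketch, buildMatrix, scoreWindow, scan) are hypothetical, and two things are assumptions: that the 0.01 weight acts as a simple pseudocount added to every symbol in every column, and that a window's score is the product of its per-column probabilities. BioJava's own weighting and scoring may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

public class WeightMatrixSketch {
    // Only unambiguous lowercase DNA symbols are handled in this sketch.
    private static final String ALPHABET = "acgt";

    /** Builds one probability row per alignment column (assumed pseudocount scheme). */
    public static double[][] buildMatrix(String[] alignment, double pseudo) {
        int cols = alignment[0].length();
        double[][] matrix = new double[cols][ALPHABET.length()];
        double total = alignment.length + pseudo * ALPHABET.length();
        for (int c = 0; c < cols; c++) {
            // start every symbol at the pseudocount so no probability is zero
            for (int s = 0; s < ALPHABET.length(); s++) {
                matrix[c][s] = pseudo;
            }
            // count the symbol each aligned sequence has in this column
            for (String row : alignment) {
                matrix[c][ALPHABET.indexOf(row.charAt(c))] += 1.0;
            }
            // normalise counts to probabilities
            for (int s = 0; s < ALPHABET.length(); s++) {
                matrix[c][s] /= total;
            }
        }
        return matrix;
    }

    /** Scores the window at a 0-based offset as a product of column probabilities. */
    public static double scoreWindow(double[][] matrix, String seq, int offset) {
        double score = 1.0;
        for (int c = 0; c < matrix.length; c++) {
            score *= matrix[c][ALPHABET.indexOf(seq.charAt(offset + c))];
        }
        return score;
    }

    /** Returns the 1-based start of every window scoring at or above the threshold. */
    public static List<Integer> scan(double[][] matrix, String seq, double threshold) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + matrix.length <= seq.length(); i++) {
            if (scoreWindow(matrix, seq, i) >= threshold) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] motif = {"aggag", "aggaa", "aggag", "aagag"};
        double[][] matrix = buildMatrix(motif, 0.01);
        String seq = "aaagcctaggaagaggagctgat";
        for (int start : scan(matrix, seq, 0.1)) {
            int end = start + matrix.length - 1;
            System.out.printf("Match at %d-%d, score %.4f%n",
                    start, end, scoreWindow(matrix, seq, start - 1));
        }
    }
}
```

Under these assumptions the sketch reports windows starting at positions 8, 11 and 14 of the example sequence. Raising the threshold keeps only the windows that agree with the most common symbol in every column, which is the trade-off the threshold argument of WeightMatrixAnnotator controls.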