User talk:Seeker

I’ve noticed a mistake in the BioJavaX documentation and some odd behaviour in the source code.

The BioJavaXDocs page says that the GenBank FEATURES field is output as follows:

“…For the source feature, the db_xref and organism fields are added to the output by calling getNCBITaxon().getNCBITaxID() and getNCBITaxon().getDisplayName() on the sequence (the latter is chopped before the first bracket if necessary)….”

If I understand correctly, “the sequence” here means the RichSequence object. But the RichSequence class has no getNCBITaxon() method; it has a getTaxon() method instead. So the documentation should refer to getTaxon() rather than getNCBITaxon().

While working with a sequence file in GenBank format, I also noticed something odd.
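To make the quoted passage concrete, here is a minimal sketch of how the two source qualifiers could be built from the taxon values. It is not BioJavaX code: the class and method names are hypothetical, and it assumes that “bracket” means an opening parenthesis. For a RichSequence seq, the two arguments would come from seq.getTaxon().getNCBITaxID() and seq.getTaxon().getDisplayName().

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of BioJavaX. */
final class SourceQualifiers {

    /** Drops everything from the first '(' onward (assumed meaning of "bracket"). */
    static String chopBeforeBracket(String displayName) {
        int i = displayName.indexOf('(');
        return (i < 0 ? displayName : displayName.substring(0, i)).trim();
    }

    /** Builds the /organism and /db_xref qualifier lines of a source feature. */
    static List<String> format(int ncbiTaxId, String displayName) {
        return Arrays.asList(
            "/organism=\"" + chopBeforeBracket(displayName) + "\"",
            "/db_xref=\"taxon:" + ncbiTaxId + "\"");
    }

    public static void main(String[] args) {
        // Values taken from the Bacillus subtilis example in this post.
        for (String line : format(224308, "Bacillus subtilis subsp. subtilis str. 168")) {
            System.out.println(line);
        }
    }
}
```

A display name without a bracket, such as the one in main, passes through unchanged.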

That file contained the following text fragment:

    ...
    FEATURES             Location/Qualifiers
         source          1..4214630
                         /organism="Bacillus subtilis subsp. subtilis str. 168"
                         /mol_type="genomic DNA"
                         /strain="168"
                         /db_xref="taxon:224308"
         gene            4866..6782
                         /gene="gyrB"
                         /locus_tag="BSU00060"
                         /note="synonym: novA"
                         /db_xref="GeneID:939456"
         CDS             4866..6782
                         /gene="gyrB"
                         /locus_tag="BSU00060"
                         /EC_number=""
                         /function="initation of replication cycle and DNA
                         elongation"
                         /note="decatenates newly replicated chromosomal DNA and
                         relaxes positive and negative DNA supercoiling"
                         /codon_start=1
                         /transl_table=11
                         /product="DNA topoisomerase IV subunit B"
                         /protein_id="NP_387887.1"
                         /db_xref="GI:16077074"
                         /db_xref="GOA:P05652"
                         /db_xref="UniProtKB/Swiss-Prot:P05652"
                         /db_xref="GeneID:939456"
                         /translation="MEQQQNSYDENQIQVLEGLEAVRKRPGMYIGSTNSKGLHHLVWE
                         IVDNSIDEALAGYCTDINIQIEKDNSITVVDNGRGIPVGIHEKMGRPAVEVIMT"
    ...

I used the following code to read the values of the /function, /note and /translation qualifiers of the CDS feature:


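```java
import java.util.Iterator;
import java.util.Set;
import org.biojavax.SimpleNote;
import org.biojavax.bio.seq.RichFeature;
import org.biojavax.bio.seq.RichSequence;
import org.biojavax.bio.seq.RichSequenceIterator;

// br is a BufferedReader over the GenBank file; ns is the Namespace to load into.
RichSequenceIterator seqs = RichSequence.IOTools.readGenbankDNA(br, ns);
RichSequence seq = seqs.nextRichSequence();

String function = "", note = "", translation = "";

// Walk the features of the sequence and inspect the notes of each CDS.
Iterator fsit = seq.getFeatureSet().iterator();
while (fsit.hasNext()) {
    RichFeature rf = (RichFeature) fsit.next();
    // fType is taken to be the feature key, e.g. "source", "gene" or "CDS".
    String fType = rf.getType();
    if (!fType.equals("CDS")) continue;

    Set noteSet = rf.getNoteSet();
    Iterator nit = noteSet.iterator();
    while (nit.hasNext()) {
        SimpleNote sn = (SimpleNote) nit.next();
        String snTermName = sn.getTerm().getName();

        if (snTermName.equals("function")) {
            function = sn.getValue();
            System.out.println("Function:\n" + function);
        } else if (snTermName.equals("note")) {
            note = sn.getValue();
            System.out.println("Note:\n" + note);
        } else if (snTermName.equals("translation")) {
            translation = sn.getValue();
            System.out.println("Translation:\n" + translation);
        }
    }
}
```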

The output was as follows:

    Function:
    initation of replication cycle and DNA
    elongation
    Note:
    decatenates newly replicated chromosomal DNA and
    relaxes positive and negative DNA supercoiling
    Translation:
    MEQQQNSYDENQIQVLEGLEAVRKRPGMYIGSTNSKGLHHLVWEIVDNSIDEALAGYCTDINIQIEKDNSITVVDNGRGIPVGIHEKMGRPAVEVIMT

As the output shows, the getValue() method of the SimpleNote class returns strings that contain newline characters for the function and note qualifiers. This looks wrong to me: the line breaks appear to come only from the way long values are wrapped in the file, so they should not be part of the value. The translation value, by contrast, contains no newline characters, which is the behaviour I would expect.
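For code that has to cope with values that still contain such line breaks, a minimal workaround is to join the wrapped lines after reading the value. The helper below is a sketch with a hypothetical name, not part of BioJavaX. It assumes that every line break in a free-text value is a wrap point and should become a single space. It is not suitable for /translation, where wrapped lines are joined without a space.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of BioJavaX. */
final class NoteValues {

    // A line break (\n, \r\n, ...) together with any spaces or tabs around it.
    private static final Pattern WRAP = Pattern.compile("[ \\t]*\\R[ \\t]*");

    /** Joins a wrapped free-text value into one line, one space per line break. */
    static String unwrapText(String value) {
        return WRAP.matcher(value).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String function = "initation of replication cycle and DNA\nelongation";
        // Prints: initation of replication cycle and DNA elongation
        System.out.println(unwrapText(function));
    }
}
```

In the loop above, this would be applied as unwrapText(sn.getValue()) for the function and note terms only.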

I’ve fixed both the above problems today (4th Sept 2006). Richard

Thank you, Richard. This really works. –Seeker 06:08, 21 September 2006 (EDT)