BioJava CookBook
BioJava Cookbook for release 4.*
BioJava 3+ is a major rewrite of BioJava 1, so many things work differently. This cookbook provides examples of how to work with the new codebase.
This page was inspired by various programming cookbooks and follows a “How do I…?” approach. Each “How do I…?” links to example code that does what you want, and sometimes more. If you find the code you need and copy and paste it into your program, you should be up and running quickly. I have deliberately over-documented the code to make it more obvious what I am doing, so some of it may look a bit bloated.
If you have any suggestions, questions or comments, contact the BioJava mailing list. To subscribe to this list go here.
Tutorial
Many topics are also covered in the BioJava tutorial.
How Do I…?
Core Module - Working with Sequences
Required modules: biojava-core
- Overview of biojava-core
- How are sequences created?
- How do I compare two DNA Sequences and create a consensus sequence?
- How do I read or write Fasta files?
- How do I read Genbank files?
- How do I view Features on a sequence?
Protein Structure
Required modules: biojava-structure, biojava-alignment
Optional module: biojava-structure-gui for the 3D visualisation
Optional external library: JmolApplet.jar for the 3D visualisation
- How can I parse a PDB file?
- How can I parse a .mmcif file?
- What is the BioJava structure datamodel?
- How can I do calculations on atoms?
- How can I access the header information of a PDB file?
- How does BioJava deal with SEQRES and ATOM groups?
- How can I mutate a residue?
- How can I calculate a structure alignment?
- How can I use a simple GUI to calculate an alignment?
- How can I interact with Jmol?
- How can I serialize to a database?
- How can I load data from the SCOP classification?
- How can I work with the Berkeley version of SCOP?
- How can I find residues binding a ligand?
- How can I work with biological assemblies of proteins?
- How can I get information using RCSB’s RESTful services?
- How do I calculate the true length of a structure?
Pairwise and Multiple Sequence Alignment
Required modules: biojava-alignment, biojava-core, biojava-phylo
Required external library: forester.jar
- How can I read a Sequence Alignment in Stockholm format? (Pfam, Rfam)
- How can I calculate a Pairwise Sequence Alignment? (Smith Waterman, Needleman Wunsch)
- How can I calculate a Pairwise Sequence Alignment with DNA sequences?
- How can I create a Multiple Sequence Alignment?
- How can I profile the time and memory requirements of a Multiple Sequence Alignment?
Genome
Required modules: biojava-genome
Sequencing
Required modules: biojava-core, biojava-sequencing
Required external library: guava-11.0.1.jar
Phylogenetic tree
Required modules: biojava-core
Required external library: forester.jar
Physico-Chemical Properties Computation
Required modules: biojava-aa-prop, biojava-structure and biojava-core
- How can I compute physico-chemical properties via APIs?
- How can I compute physico-chemical properties from the command line?
- How can I compute PROFEAT properties via APIs?
Protein Disorder
Required modules: biojava-protein-disorder
- How can I predict disordered regions of the protein using its sequence?
- Can I use the predictor from the command line?
Protein Modification Identification
Required modules: biojava-modfinder, biojava-structure
- How can I identify protein modifications in a 3D structure?
- How can I get the list of supported protein modifications?
- How can I define and register a new protein modification?
Remote Web Service Calls
Required modules: biojava-core, biojava-ws
- How can I use NCBI’s QBlast service?
- How can I use Blast XML Output in my program?
- How can I get Pfam annotations for a protein sequence using the Hmmer3 service?
Legacy 1.8.x CookBook
The CookBook for the legacy 1.8.x code base is available from here.