BioJava:BioJavaInside
If you use BioJava in an application or publication please cite:
BioJava: an Open-Source Framework for Bioinformatics
R.C.G. Holland; T. Down; M. Pocock; A. Prlić; D. Huen; K. James; S. Foisy; A. Dräger; A. Yates; M. Heuer; M.J. Schreiber
Bioinformatics 2008; doi: 10.1093/bioinformatics/btn397
Projects
The following projects make use of BioJava. If you know of other projects please add them to the list.
- DengueInfo: a Dengue genome information portal that uses BioJava in the middleware and talks to a BioSQL database.
- Dazzle: A BioJava-based DAS server.
- Biosense: A commercial informatics offering from InforSense that uses BioJava under the hood.
- Bioclipse: A free, open source, workbench for chemo- and bioinformatics with powerful editing and visualization capabilities for molecules, sequences, proteins, spectra etc.
- PROMPT: A free, open source framework and application for the comparison and mapping of protein sets. Uses BioJava for handling most input data formats.
- Cytoscape: An open source bioinformatics software platform for visualizing molecular interaction networks.
- BioWeka: An open source biological data mining application.
- Geneious: A molecular biology toolkit.
- MassSieve: An open source application to analyze mass spec proteomics data.
- Strap: A tool for multiple sequence alignment and sequence-based structure alignment.
Publications
BioJava has been used in the following publications. If you know of other publications please add them.
- Hidalgo E, Leautaud V, and Demple B. The redox-regulated SoxR protein acts from a single DNA site as a repressor and an allosteric activator. EMBO J 1998 May 1; 17(9) 2629-36. doi:10.1093/emboj/17.9.2629 pmid:9564045.
- Jacobs GH, Stockwell PA, Schrieber MJ, Tate WP, and Brown CM. Transterm: a database of messenger RNA components and signals. Nucleic Acids Res 2000 Jan 1; 28(1) 293-5. pmid:10592251.
- Xie T and Hood L. ACGT-a comparative genomics tool. Bioinformatics 2003 May 22; 19(8) 1039-40. pmid:12761070.
- Schreiber M and Brown C. Compensation for nucleotide bias in a genome by representation as a discrete channel with noise. Bioinformatics 2002 Apr; 18(4) 507-12. pmid:12016048.
- Büssow K, Hoffmann S, and Sievert V. ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files. BMC Bioinformatics 2002 Dec 19; 3 40. pmid:12493080.
- Aerts S, Thijs G, Coessens B, Staes M, Moreau Y, and De Moor B. Toucan: deciphering the cis-regulatory logic of coregulated genes. Nucleic Acids Res 2003 Mar 15; 31(6) 1753-64. pmid:12626717.
- di Bernardo D, Down T, and Hubbard T. ddbRNA: detection of conserved secondary structures in multiple alignments. Bioinformatics 2003 Sep 1; 19(13) 1606-11. pmid:12967955.
- Brown CM, Jacobs G, Stockwell P, and Schreiber M. Detection of signals in mRNAs that influence translation. Appl Bioinformatics 2003; 2(3 Suppl) S47-51. pmid:15130816.
- Carbone A, Zinovyev A, and Képès F. Codon adaptation index as a measure of dominating codon bias. Bioinformatics 2003 Nov 1; 19(16) 2005-15. pmid:14594704.
- Gurvich OL, Baranov PV, Zhou J, Hammer AW, Gesteland RF, and Atkins JF. Sequences that direct significant levels of frameshifting are frequent in coding regions of Escherichia coli. EMBO J 2003 Nov 3; 22(21) 5941-50. doi:10.1093/emboj/cdg561 pmid:14592990.
- Huang Y, Ni T, Zhou L, and Su S. JXP4BIGI: a generalized, Java XML-based approach for biological information gathering and integration. Bioinformatics 2003 Dec 12; 19(18) 2351-8. pmid:14668218.
- Sugawara H and Miyazaki S. Biological SOAP servers and web services provided by the public sequence data bank. Nucleic Acids Res 2003 Jul 1; 31(13) 3836-9. pmid:12824432.
- Zuyderduyn SD and Jones SJ. A knowledge discovery object model API for Java. BMC Bioinformatics 2003 Oct 28; 4 51. doi:10.1186/1471-2105-4-51 pmid:14583100.
- Aerts S, Van Loo P, Moreau Y, and De Moor B. A genetic algorithm for the detection of new cis-regulatory modules in sets of coregulated genes. Bioinformatics 2004 Aug 12; 20(12) 1974-6. doi:10.1093/bioinformatics/bth179 pmid:15044242.
- Dong X, Stothard P, Forsythe IJ, and Wishart DS. PlasMapper: a web server for drawing and auto-annotating plasmid maps. Nucleic Acids Res 2004 Jul 1; 32(Web Server issue) W660-4. doi:10.1093/nar/gkh410 pmid:15215471.
- Down TA and Hubbard TJ. What can we learn from noncoding regions of similarity between genomes? BMC Bioinformatics 2004 Sep 15; 5 131. doi:10.1186/1471-2105-5-131 pmid:15369604.
- Hertz-Fowler C, Peacock CS, Wood V, Aslett M, Kerhornou A, Mooney P, Tivey A, Berriman M, Hall N, Rutherford K, Parkhill J, Ivens AC, Rajandream MA, and Barrell B. GeneDB: a resource for prokaryotic and eukaryotic organisms. Nucleic Acids Res 2004 Jan 1; 32(Database issue) D339-43. doi:10.1093/nar/gkh007 pmid:14681429.
- An HJ, Lee D, Lee KH, and Bhak J. The association of Alu repeats with the generation of potential AU-rich elements (ARE) at 3' untranslated regions. BMC Genomics 2004 Dec 21; 5(1) 97. doi:10.1186/1471-2164-5-97 pmid:15610565.
- Carbone A, Képès F, and Zinovyev A. Codon bias signatures, organization of microorganisms in codon space, and lifestyle. Mol Biol Evol 2005 Mar; 22(3) 547-61. doi:10.1093/molbev/msi040 pmid:15537809.
- Down TA and Hubbard TJ. NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence. Nucleic Acids Res 2005; 33(5) 1445-53. doi:10.1093/nar/gki282 pmid:15760844.
- Finak G, Godin N, Hallett M, Pepin F, Rajabi Z, Srivastava V, and Tang Z. BIAS: Bioinformatics Integrated Application Software. Bioinformatics 2005 Apr 15; 21(8) 1745-6. doi:10.1093/bioinformatics/bti170 pmid:15572471.
- Gorban AN, Popova TG, and Zinovyev AY. Four basic symmetry types in the universal 7-cluster structure of microbial genomic sequences. In Silico Biol 2005; 5(3) 265-82. pmid:15984937.
- Gouret P, Vitiello V, Balandraud N, Gilles A, Pontarotti P, and Danchin EG. FIGENIX: intelligent automation of genomic annotation: expertise integration in a new software platform. BMC Bioinformatics 2005 Aug 5; 6 198. doi:10.1186/1471-2105-6-198 pmid:16083500.
- Kersey P, Bower L, Morris L, Horne A, Petryszak R, Kanz C, Kanapin A, Das U, Michoud K, Phan I, Gattiker A, Kulikova T, Faruque N, Duggan K, Mclaren P, Reimholz B, Duret L, Penel S, Reuter I, and Apweiler R. Integr8 and Genome Reviews: integrated views of complete genomes and proteomes. Nucleic Acids Res 2005 Jan 1; 33(Database issue) D297-302. doi:10.1093/nar/gki039 pmid:15608201.
- Pain D, Chirn GW, Strassel C, and Kemp DM. Multiple retropseudogenes from pluripotent cell-specific gene expression indicates a potential signature for novel gene identification. J Biol Chem 2005 Feb 25; 280(8) 6265-8. doi:10.1074/jbc.C400587200 pmid:15640145.
- Prlić A, Down TA, and Hubbard TJ. Adding some SPICE to DAS. Bioinformatics 2005 Sep 1; 21 Suppl 2 ii40-1. doi:10.1093/bioinformatics/bti1106 pmid:16204122.
- Pudimat R, Schukat-Talamazzini EG, and Backofen R. A multiple-feature framework for modelling and predicting transcription factor binding sites. Bioinformatics 2005 Jul 15; 21(14) 3082-8. doi:10.1093/bioinformatics/bti477 pmid:15905283.
- Spindel ER, Pauley MA, Jia Y, Gravett C, Thompson SL, Boyle NF, Ojeda SR, and Norgren RB Jr. Leveraging human genomic information to identify nonhuman primate sequences for expression array development. BMC Genomics 2005 Nov 15; 6 160. doi:10.1186/1471-2164-6-160 pmid:16288651.
- Bindewald E, Schneider TD, and Shapiro BA. CorreLogo: an online server for 3D sequence logos of RNA and DNA alignments. Nucleic Acids Res 2006 Jul 1; 34(Web Server issue) W405-11. doi:10.1093/nar/gkl269 pmid:16845037.
- Down T, Leong B, and Hubbard TJ. A machine learning strategy to identify candidate binding sites in human protein-coding sequence. BMC Bioinformatics 2006 Sep 26; 7 419. doi:10.1186/1471-2105-7-419 pmid:17002805.
- Carter D and Durbin R. Vertebrate gene finding from multiple-species alignments using a two-level strategy. Genome Biol 2006; 7 Suppl 1 S6.1-12. doi:10.1186/gb-2006-7-s1-s6 pmid:16925840.
- Gille C and Robinson PN. HotSwap for bioinformatics: a STRAP tutorial. BMC Bioinformatics 2006 Feb 9; 7 64. doi:10.1186/1471-2105-7-64 pmid:16469097.
- Hasan S and Schreiber M. Recovering motifs from biased genomes: application of signal correction. Nucleic Acids Res 2006; 34(18) 5124-32. doi:10.1093/nar/gkl676 pmid:16990246.
- Hasan S, Daugelat S, Rao PS, and Schreiber M. Prioritizing genomic drug targets in pathogens: application to Mycobacterium tuberculosis. PLoS Comput Biol 2006 Jun 9; 2(6) e61. doi:10.1371/journal.pcbi.0020061 pmid:16789813.
- Lee CE, Gaëta B, Malming HR, Bain ME, Sewell WA, and Collins AM. Reconsidering the human immunoglobulin heavy-chain locus: 1. An evaluation of the expressed human IGHD gene repertoire. Immunogenetics 2006 Jan; 57(12) 917-25. doi:10.1007/s00251-005-0062-5 pmid:16402215.
- Liang C and Dandekar T. inGeno--an integrated genome and ortholog viewer for improved genome to genome comparisons. BMC Bioinformatics 2006 Oct 20; 7 461. doi:10.1186/1471-2105-7-461 pmid:17054788.
- Lu Q, Hao P, Curcin V, He W, Li YY, Luo QM, Guo YK, and Li YX. KDE Bioscience: platform for bioinformatics analysis workflows. J Biomed Inform 2006 Aug; 39(4) 440-50. doi:10.1016/j.jbi.2005.09.001 pmid:16260186.
- McDonald T, Sheng S, Stanley B, Chen D, Ko Y, Cole RN, Pedersen P, and Van Eyk JE. Expanding the subproteome of the inner mitochondria using protein separation technologies: one- and two-dimensional liquid chromatography and two-dimensional gel electrophoresis. Mol Cell Proteomics 2006 Dec; 5(12) 2392-411. doi:10.1074/mcp.T500036-MCP200 pmid:17000643.
- Powell BC and Hutchison CA 3rd. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs. BMC Bioinformatics 2006 Jan 19; 7 31. doi:10.1186/1471-2105-7-31 pmid:16423288.
- Ross C and Shen QJ. Computational prediction and experimental verification of HVA1-like abscisic acid responsive promoters in rice (Oryza sativa). Plant Mol Biol 2006 Sep; 62(1-2) 233-46. doi:10.1007/s11103-006-9017-y pmid:16845480.
- Schmidt T and Frishman D. PROMPT: a protein mapping and comparison tool. BMC Bioinformatics 2006 Jul 4; 7 331. doi:10.1186/1471-2105-7-331 pmid:16817977.
- Vernikos GS and Parkhill J. Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the Salmonella pathogenicity islands. Bioinformatics 2006 Sep 15; 22(18) 2196-203. doi:10.1093/bioinformatics/btl369 pmid:16837528.
- Vizcaíno JA, González FJ, Suárez MB, Redondo J, Heinrich J, Delgado-Jarana J, Hermosa R, Gutiérrez S, Monte E, Llobell A, and Rey M. Generation, annotation and analysis of ESTs from Trichoderma harzianum CECT 2413. BMC Genomics 2006 Jul 27; 7 193. doi:10.1186/1471-2164-7-193 pmid:16872539.
- Andreeva A, Prlić A, Hubbard TJ, and Murzin AG. SISYPHUS--structural alignments for proteins with non-trivial relationships. Nucleic Acids Res 2007 Jan; 35(Database issue) D253-9. doi:10.1093/nar/gkl746 pmid:17068077.
- Bui HH, Botten J, Fusseder N, Pasquetto V, Mothe B, Buchmeier MJ, and Sette A. Protein sequence database for pathogenic arenaviruses. Immunome Res 2007 Feb 8; 3 1. doi:10.1186/1745-7580-3-1 pmid:17288609.
- Down TA, Bergman CM, Su J, and Hubbard TJ. Large-scale discovery of promoter motifs in Drosophila melanogaster. PLoS Comput Biol 2007 Jan 19; 3(1) e7. doi:10.1371/journal.pcbi.0030007 pmid:17238282.
- Gewehr JE, Szugat M, and Zimmer R. BioWeka--extending the Weka framework for bioinformatics. Bioinformatics 2007 Mar 1; 23(5) 651-3. doi:10.1093/bioinformatics/btl671 pmid:17237069.
- Hanekamp K, Bohnebeck U, Beszteri B, and Valentin K. PhyloGena--a user-friendly system for automated phylogenetic annotation of unknown sequences. Bioinformatics 2007 Apr 1; 23(7) 793-801. doi:10.1093/bioinformatics/btm016 pmid:17332025.
- Macías JR, Jiménez-Lozano N, and Carazo JM. Integrating electron microscopy information into existing Distributed Annotation Systems. J Struct Biol 2007 May; 158(2) 205-13. doi:10.1016/j.jsb.2007.02.004 pmid:17400476.
- Nikolajewa S, Pudimat R, Hiller M, Platzer M, and Backofen R. BioBayesNet: a web server for feature extraction and Bayesian network modeling of biological sequence data. Nucleic Acids Res 2007 Jul; 35(Web Server issue) W688-93. doi:10.1093/nar/gkm292 pmid:17537825.
- Spjuth O, Helmus T, Willighagen EL, Kuhn S, Eklund M, Wagener J, Murray-Rust P, Steinbeck C, and Wikberg JE. Bioclipse: an open source workbench for chemo- and bioinformatics. BMC Bioinformatics 2007 Feb 22; 8 59. doi:10.1186/1471-2105-8-59 pmid:17316423.
- Zajac P, Pettersson E, Gry M, Lundeberg J, and Ahmadian A. Expression profiling of signature gene sets with trinucleotide threading. Genomics 2008 Feb; 91(2) 209-17. doi:10.1016/j.ygeno.2007.10.012 pmid:18061398.
- Vernikos GS and Parkhill J. Resolving the structural features of genomic islands: a machine learning approach. Genome Res 2008 Feb; 18(2) 331-42. doi:10.1101/gr.7004508 pmid:18071028.
- Chalk AM and Sonnhammer EL. siRNA specificity searching incorporating mismatch tolerance data. Bioinformatics 2008 May 15; 24(10) 1316-7. doi:10.1093/bioinformatics/btn121 pmid:18397893.
More BioJava publications can be found in Google Scholar.

