Write a FASTA parser
Your solution should:
- be high-quality Java code (maintainability, reusability, OO design, etc.)
- be efficient
- be capable of reading badly formatted FASTA files
- be capable of reading large files
- have a convenient API
- be extensible to reading FASTQ files
Please refrain from using any libraries that are not part of the standard Java 6 development kit in your production code. For testing code, feel free to use any library you are comfortable with.
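As an illustration of how the requirements above might fit together, here is a minimal sketch of a streaming reader that uses only the standard library and Java 6 syntax. The class names FastaRecord and FastaReader are hypothetical, and the tolerance rules (skipping blank lines, trimming whitespace, accepting sequence data before the first header, accepting headers with no sequence) are assumptions about what "badly formatted" means, not part of the exercise.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical record type: one header line plus its concatenated sequence. */
final class FastaRecord {
    final String header;
    final String sequence;

    FastaRecord(String header, String sequence) {
        this.header = header;
        this.sequence = sequence;
    }
}

/** Hypothetical streaming reader: keeps only the current record in memory. */
final class FastaReader {
    private final BufferedReader in;
    // Header line already consumed while finishing the previous record.
    private String nextHeader;

    FastaReader(Reader source) {
        this.in = new BufferedReader(source);
    }

    /** Returns the next record, or null when the input is exhausted. */
    FastaRecord next() throws IOException {
        String header = nextHeader;
        nextHeader = null;
        StringBuilder seq = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();                 // assumption: surrounding whitespace is noise
            if (line.length() == 0) {
                continue;                       // assumption: blank lines are ignored
            }
            if (line.charAt(0) == '>') {
                if (header == null && seq.length() == 0) {
                    header = line.substring(1).trim();
                } else {
                    // A new header closes the current record; remember it for the next call.
                    nextHeader = line.substring(1).trim();
                    return new FastaRecord(header == null ? "" : header, seq.toString());
                }
            } else {
                seq.append(line);
            }
        }
        if (header == null && seq.length() == 0) {
            return null;                        // nothing left to return
        }
        return new FastaRecord(header == null ? "" : header, seq.toString());
    }
}
```

A caller would loop with `while ((record = reader.next()) != null)`, so memory use depends on the longest single record rather than on the file size. A FASTQ reader could follow the same pull-style shape behind a shared interface, though that design is only one of several reasonable options.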
Write a FASTA writer
- Use your parser to read a FASTA file which contains sequences with ambiguous characters (you choose whether this is going to be ambiguous DNA or protein sequence)
- Write two FASTA output files: one with the sequences that contain ambiguous characters and another with the sequences that do not.
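The split described above could be sketched as follows, assuming the DNA option is chosen. The class FastaSplitter, its method names, and the 60-column line width are all hypothetical. The sketch also assumes that any character other than A, C, G or T (case-insensitive) counts as ambiguous, which means gap characters such as '-' are treated as ambiguous too; a real solution may want a stricter check against the IUPAC codes.

```java
import java.io.IOException;
import java.io.Writer;

/** Hypothetical helper that routes records to one of two FASTA outputs. */
final class FastaSplitter {
    private static final int LINE_WIDTH = 60; // assumed wrap width, not mandated by the exercise

    /** True if the sequence has any character other than A, C, G or T (case-insensitive). */
    static boolean isAmbiguousDna(String sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            char c = Character.toUpperCase(sequence.charAt(i));
            if (c != 'A' && c != 'C' && c != 'G' && c != 'T') {
                return true;
            }
        }
        return false;
    }

    /** Writes one record: a '>' header line, then the sequence wrapped at LINE_WIDTH. */
    static void writeRecord(Writer out, String header, String sequence) throws IOException {
        out.write('>');
        out.write(header);
        out.write('\n');
        for (int i = 0; i < sequence.length(); i += LINE_WIDTH) {
            out.write(sequence, i, Math.min(LINE_WIDTH, sequence.length() - i));
            out.write('\n');
        }
    }

    /** Sends the record to the ambiguous or the unambiguous output. */
    static void route(String header, String sequence, Writer ambiguous, Writer unambiguous)
            throws IOException {
        writeRecord(isAmbiguousDna(sequence) ? ambiguous : unambiguous, header, sequence);
    }
}
```

In a full program the two Writer arguments would typically be buffered file writers opened once and closed after the last record, so that the input is still processed one record at a time.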
Please submit your completed exercise to gsocexercise at gmail dot com by Friday, the 6th of April, inclusive. Your submission should be a ZIP archive that contains an executable JAR file with your FASTA parser and writer, as well as:
- your source files in the src directory
- your documentation files in the docs directory
- the test data file, named data.fasta, up to 10 KB in size
- the executable JAR containing the program, named runme.jar
- a pure ASCII text file called choices.txt describing the significant design choices you made, uncertainties you had regarding the project, and the decisions you made when resolving them.