Class FastaReader<S extends Sequence<?>,C extends Compound>

java.lang.Object
org.biojava.nbio.core.sequence.io.FastaReader<S,C>

public class FastaReader<S extends Sequence<?>,C extends Compound> extends Object
FastaReaderHelper should be the primary class used to read FASTA files; see FastaReaderHelper for an example of how to use this class.
Author:
Scooter Willis <willishf at gmail dot com>
  • Constructor Summary

    Constructors
    Constructor
    Description
    FastaReader(File file, SequenceHeaderParserInterface<S,C> headerParser, SequenceCreatorInterface<C> sequenceCreator)
    If you are going to use the FileProxyProteinSequenceCreator then you need to use this constructor because we need details about the location of the file.
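    FastaReader(InputStream is, SequenceHeaderParserInterface<S,C> headerParser, SequenceCreatorInterface<C> sequenceCreator)
    If you are going to use FileProxyProteinSequenceCreator then do not use this constructor because we need details about local file offsets for quick reads.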
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    close()
     
    Map<String,S>
    process()
    The parsing is done in this method.
    This method tries to process all the available FASTA records in the File or InputStream, closes the underlying resource, and returns the results in a LinkedHashMap.
    You don't need to call close() after calling this method.
    Map<String,S>
    process(int max)
    This method tries to parse at most max records from the open File or InputStream, and leaves the underlying resource open.
    Subsequent calls to the same method continue parsing the rest of the file.
    This is particularly useful when dealing with very big data files (e.g. the NCBI nr database), which can't fit into memory and would take a long time before the first result is available.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • FastaReader

      public FastaReader(InputStream is, SequenceHeaderParserInterface<S,C> headerParser, SequenceCreatorInterface<C> sequenceCreator)
      If you are going to use FileProxyProteinSequenceCreator then do not use this constructor, because we need details about local file offsets for quick reads. An InputStream does not give you the name of the underlying file, so the data cannot be accessed quickly via a file seek. A seek on an InputStream is forced to read all the data up to the target position, so you don't gain anything.
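      The difference between a file seek and a stream read can be illustrated with a minimal sketch. It uses only the JDK and is not BioJava code: the class FileOffsetSketch and its readAt helper are hypothetical names, and the sketch makes no claim about how FileProxyProteinSequenceCreator is implemented. It only shows that, given a File and a known byte offset, a read can jump straight to a sequence without consuming the bytes before it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not part of BioJava.
class FileOffsetSketch {

    /** Reads length bytes starting at the given byte offset, skipping everything before it. */
    static String readAt(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);            // jump directly; no earlier bytes are read
            byte[] buf = new byte[length];
            raf.readFully(buf);
            return new String(buf, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch", ".fasta");
        try {
            String fasta = ">seq1\nACGT\n>seq2\nTTGGCC\n";
            Files.write(tmp, fasta.getBytes(StandardCharsets.US_ASCII));
            // The content is pure ASCII, so a character index equals a byte offset.
            long offset = fasta.indexOf("TTGGCC");
            System.out.println(readAt(tmp, offset, 6));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

      With a plain InputStream there is no equivalent of seek on a named file, which is the reason given above for preferring the File constructor when sequences are loaded lazily by offset.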
      Parameters:
      is - inputStream
      headerParser -
      sequenceCreator -
    • FastaReader

      public FastaReader(File file, SequenceHeaderParserInterface<S,C> headerParser, SequenceCreatorInterface<C> sequenceCreator) throws FileNotFoundException
      If you are going to use the FileProxyProteinSequenceCreator then you need to use this constructor because we need details about the location of the file.
      Parameters:
      file -
      headerParser -
      sequenceCreator -
      Throws:
      FileNotFoundException - if the file does not exist, is a directory rather than a regular file, or for some other reason cannot be opened for reading.
      SecurityException - if a security manager exists and its checkRead method denies read access to the file.
  • Method Details

    • process

      public Map<String,S> process() throws IOException
      The parsing is done in this method.
      This method tries to process all the available FASTA records in the File or InputStream, closes the underlying resource, and returns the results in a LinkedHashMap.
      You don't need to call close() after calling this method.
      Returns:
      LinkedHashMap containing all the parsed FASTA records present, from the current fileIndex onwards.
      Throws:
      IOException - if an error occurs reading the input file
    • process

      public Map<String,S> process(int max) throws IOException
      This method tries to parse at most max records from the open File or InputStream, and leaves the underlying resource open.
      Subsequent calls to the same method continue parsing the rest of the file.
      This is particularly useful when dealing with very big data files (e.g. the NCBI nr database), which can't fit into memory and would take a long time before the first result is available.
      N.B.
      • This method can't be called after calling its no-argument twin, process().
      • Remember to close the underlying resource when you are done.
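      The batching contract described above (at most max records per call, the resource stays open, later calls resume where the previous one stopped) can be sketched with a small stand-in reader. This is not BioJava's implementation: BatchFastaSketch is a hypothetical class that uses only the JDK, maps each header to a plain String, and returns an empty map once the input is exhausted, which is an assumption of the sketch and not a documented property of FastaReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in that mirrors the process(int max) contract; not BioJava code.
class BatchFastaSketch implements AutoCloseable {
    private final BufferedReader in;
    private String pendingHeader; // header already consumed while finishing the previous record

    BatchFastaSketch(BufferedReader in) {
        this.in = in;
    }

    /** Parses at most max records (-1 for no limit) and leaves the reader open. */
    Map<String, String> process(int max) throws IOException {
        Map<String, String> records = new LinkedHashMap<>(); // keeps file order
        String line;
        if (pendingHeader == null) {
            // Skip ahead to the next header line, if any input is left.
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    pendingHeader = line.substring(1).trim();
                    break;
                }
            }
        }
        while (pendingHeader != null && (max < 0 || records.size() < max)) {
            String header = pendingHeader;
            pendingHeader = null;
            StringBuilder sequence = new StringBuilder();
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    // Start of the next record: remember it for the next iteration or call.
                    pendingHeader = line.substring(1).trim();
                    break;
                }
                sequence.append(line.trim());
            }
            records.put(header, sequence.toString());
        }
        return records;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">a\nAC\nGT\n>b\nTT\n>c\nGG\n";
        // try-with-resources closes the reader once all batches are consumed.
        try (BatchFastaSketch reader = new BatchFastaSketch(new BufferedReader(new StringReader(fasta)))) {
            Map<String, String> batch;
            while (!(batch = reader.process(2)).isEmpty()) {
                System.out.println(batch);
            }
        }
    }
}
```

      Processing in batches keeps only max records in memory at a time; the caller remains responsible for closing the reader, as the note above points out.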
      Parameters:
      max - maximum number of records to return, or -1 for no limit.
      Returns:
      Map containing at most max parsed FASTA records, from the current fileIndex onwards.
      Throws:
      IOException - if an error occurs reading the input file
      Since:
      3.0.6
    • close

      public void close() throws IOException
      Throws:
      IOException