fasta v0.0.1 FASTA.Parser

Provides the function FASTA.Parser.parse/1, which extracts a list of FASTA.Datum structs from a FASTA-formatted string.

A FASTA string contains one or more pieces of sequence data. Each piece of data consists of a header line starting with the > character, followed by one or more lines of sequence characters.
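To make the format concrete, the following sketch pulls the header lines out of a FASTA string. The module FormatSketch and its function header_lines/1 are hypothetical names used only for illustration; they are not part of this package.

```elixir
defmodule FormatSketch do
  # Hypothetical helper, not part of the fasta package.
  # Returns the lines of a FASTA string that start with ">".
  def header_lines(fasta_string) do
    fasta_string
    |> String.split("\n")
    |> Enum.map(&String.trim/1)
    |> Enum.filter(&String.starts_with?(&1, ">"))
  end
end

fasta_string = """
> locus6 | Gorilla gorilla
ATCGTCGCTGATAGCTGCATCAG

> locus7 | Gorilla gorilla
TGGGCTGCTATGCGGATGCAGAT
"""

# Two records, so two header lines are returned.
headers = FormatSketch.header_lines(fasta_string)
```

Every other non-blank line belongs to the sequence of the closest preceding header.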

Learn more about the FASTA format here.

Summary

Functions

parse(fasta_string)

Parses a FASTA string into a list of FASTA data

Functions

parse(fasta_string)

Parses a FASTA string into a list of FASTA data.

Each returned FASTA.Datum struct has the following fields:

  • header: the header line, with the > character and any leading and trailing whitespace removed
  • sequence: the sequence, with all whitespace characters removed
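The stripping rules above can be sketched as a small line-based parser. This is an illustration under stated assumptions, not the package's actual implementation: the module ParserSketch is hypothetical, it returns plain maps with header and sequence keys rather than FASTA.Datum structs, and it silently ignores sequence lines that appear before any header.

```elixir
defmodule ParserSketch do
  # Hypothetical stand-in for FASTA.Parser; the real implementation may differ.
  def parse(fasta_string) do
    fasta_string
    |> String.split("\n")
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
    |> Enum.reduce([], &add_line/2)
    |> Enum.reverse()
  end

  # A header line starts a new record; the ">" and surrounding whitespace are dropped.
  defp add_line(">" <> header, records) do
    [%{header: String.trim(header), sequence: ""} | records]
  end

  # A sequence line is appended to the most recent record, with whitespace removed.
  defp add_line(line, [current | rest]) do
    cleaned = String.replace(line, ~r/\s+/, "")
    [%{current | sequence: current.sequence <> cleaned} | rest]
  end

  # Assumption: sequence lines that appear before any header are ignored.
  defp add_line(_line, []), do: []
end

records = ParserSketch.parse("> locus6 | Gorilla gorilla\n  ATCG TCGC\n  TGAT\n")
```

Records are prepended to the accumulator while reducing and reversed once at the end, which keeps them in input order without repeated list appends.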

Parameters

  • fasta_string: FASTA-formatted string to parse

Example

iex> fasta_string = "> locus6 | Gorilla gorilla
...>                 ATCGTCGCTGATAGCTGCATCAG
...>
...>                 > locus7 | Gorilla gorilla
...>                 TGGGCTGCTATGCGGATGCAGAT"
iex> FASTA.Parser.parse(fasta_string)
[
  %FASTA.Datum{header: "locus6 | Gorilla gorilla", sequence: "ATCGTCGCTGATAGCTGCATCAG"},
  %FASTA.Datum{header: "locus7 | Gorilla gorilla", sequence: "TGGGCTGCTATGCGGATGCAGAT"}
]
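Because the result is an ordinary list, it can be processed with the Enum and Map modules. The sketch below uses plain maps as stand-in data with the same header and sequence keys, so that it runs without the package installed; the same field access should work on the structs shown above.

```elixir
# Stand-in data mirroring the shape of the parsed result above.
data = [
  %{header: "locus6 | Gorilla gorilla", sequence: "ATCGTCGCTGATAGCTGCATCAG"},
  %{header: "locus7 | Gorilla gorilla", sequence: "TGGGCTGCTATGCGGATGCAGAT"}
]

# Map each header to the length of its sequence.
lengths = Map.new(data, fn datum -> {datum.header, String.length(datum.sequence)} end)
```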