View Source Bio.Polymer (bio_ex_sequence v0.1.1)
Deals with conversions between polymers.
The sequences that this will work with must define an implementation for the
Bio.Polymeric
protocol. This is then used with the definition of
the to/1
callbacks for the Bio.Convertible
behaviour. These will
be given the kmer enumeration that they define with that function.
This module wraps the logic of accessing a given polymer's defined conversions. The primary idea is that I wanted to expose the ability to provide a non-default conversion without losing the semantics of a simple default when it's present.
To put that in more concrete terms, I wanted this to be viable:
iex>dna = DnaStrand.new("ttagccgt", label: "a label")
...>Bio.Polymer.convert(dna, RnaStrand)
{:ok, %RnaStrand{sequence: "uuagccgu", length: 8, label: "a label"}}
But, and this is the important part, other conversions are not well defined by defaults. For example:
iex>amino = AminoAcid.new("maktg")
...>Bio.Polymer.convert(amino, DnaStrand)
{:error, :undef_conversion}
The :undef_conversion
indicates that there is no viable default
implementation of the conversion between these polymers. It does not
indicate that there is none. Obviously one can convert from an amino acid to
some DNA strand. However, because this would imply making a selection from
the available codons, that is left to the logic of whatever application is
doing so.
The way that you would do that is straight forward, you would define a
conversion module and pass it to the convert/3
function as the keyword
argument :conversion
. For example, if we wanted to defined a mapping that
converted into a compressed DNA representation, we could do:
iex>defmodule CompressedAminoConversion do
...> use Bio.Convertible do
...> def to(DnaStrand), do: {:ok, &compressed/2, 1}
...> end
...>
...> def compressed({:ok, knumerable, data}, _) do
...> data = data
...> |> Map.drop([ :length ])
...> |> Map.to_list()
...> knumerable
...> |> Enum.map(&to_codon/1)
...> |> Enum.join("")
...> |> DnaStrand.new(data)
...> end
...>
...> defp to_codon(aa) do
...> case aa do
...> "a" -> "gcn"
...> "r" -> "cgn"
...> "n" -> "aay"
...> "d" -> "gay"
...> "c" -> "tgy"
...> "e" -> "gar"
...> "q" -> "car"
...> "g" -> "ggn"
...> "h" -> "cay"
...> "i" -> "ath"
...> "l" -> "ctn"
...> "k" -> "aar"
...> "m" -> "atg"
...> "f" -> "tty"
...> "p" -> "ccn"
...> "s" -> "tcn"
...> "t" -> "acn"
...> "w" -> "tgg"
...> "y" -> "tay"
...> "v" -> "gtn"
...> end
...> end
...>end
...>amino = AminoAcid.new("maktg", label: "polypeptide-∂")
...>Bio.Polymer.convert(amino, DnaStrand, conversion: CompressedAminoConversion)
{:ok, %DnaStrand{sequence: "atggcnaaracnggn", length: 15, label: "polypeptide-∂"}}
This is made possible because of the simple implementation of the
Bio.Polymeric
interface for the Bio.Sequence.AminoAcid
. If
you want to define your own convertible polymer types, you can. It requires
defining the module and the implementation of convert/1
. You can read the
Bio.Sequence.AminoAcid
source for more clarity on the details.
This package attempts to define reasonable defaults for all the occasions which it can. This includes converting DNA into RNA, and RNA to DNA. The conversions from DNA/RNA to Amino Acid are done using standard codon tables.
The Conversion module idea is provided as an escape hatch for more particular applications which may require bespoke logic. An example would be converting Amino Acids into a DNA sequence, as above. There are likely more use cases than I could possibly compile on my own, so I tried to come up with a way to alleviate that pressure.
Summary
Functions
Apply a conversion to a given datum.
Validate a given sequence struct according to its Bio.Polymeric
implementation.
Functions
Apply a conversion to a given datum.
The convert/3
function is at the core of using the Bio.Polymer
module. By passing the function a struct and the module you wish to convert
to, you are hooking into the underlying implementation of the
Bio.Convertible
for that module. This means that both the struct
you given as well as the module must have this implemented.
Examples
Given a struct and module with a known conversion:
iex>dna = DnaStrand.new("ttagccgt", label: "a label")
...>Bio.Polymer.convert(dna, RnaStrand)
{:ok, %RnaStrand{sequence: "uuagccgu", length: 8, label: "a label"}}
Given a struct and module with unknown conversions:
iex>amino = AminoAcid.new("maktg")
...>Bio.Polymer.convert(amino, DnaStrand)
{:error, :undef_conversion}
Given a struct that doesn't implement Bio.Sequential
:
iex>Bio.Polymer.convert(%SomeModule{}, DnaStrand)
{:error, :no_converter}
@spec validate( struct(), String.t() | nil ) :: {:ok, struct()} | {:error, :no_alpha} | {:error, {atom(), String.t(), integer()}} | {:error, [{atom(), String.t(), integer()}]}
Validate a given sequence struct according to its Bio.Polymeric
implementation.