View Source Bio.Polymer (bio_ex_sequence v0.1.1)

Deals with conversions between polymers.

The sequences that this will work with must define an implementation for the Bio.Polymeric protocol. This is then used with the definition of the to/1 callbacks for the Bio.Convertible behaviour. These will be given the kmer enumeration that they define with that function.

This module wraps the logic of accessing a given polymer's defined conversions. The primary idea is that I wanted to expose the ability to provide a non-default conversion without losing the semantics of a simple default when it's present.

To put that in more concrete terms, I wanted this to be viable:

iex>dna = DnaStrand.new("ttagccgt", label: "a label")
...>Bio.Polymer.convert(dna, RnaStrand)
{:ok, %RnaStrand{sequence: "uuagccgu", length: 8, label: "a label"}}

But, and this is the important part, other conversions are not well defined by defaults. For example:

iex>amino = AminoAcid.new("maktg")
...>Bio.Polymer.convert(amino, DnaStrand)
{:error, :undef_conversion}

The :undef_conversion indicates that there is no viable default implementation of the conversion between these polymers. It does not indicate that there is none. Obviously one can convert from an amino acid to some DNA strand. However, because this would imply making a selection from the available codons, that is left to the logic of whatever application is doing so.

The way that you would do that is straight forward, you would define a conversion module and pass it to the convert/3 function as the keyword argument :conversion. For example, if we wanted to defined a mapping that converted into a compressed DNA representation, we could do:

iex>defmodule CompressedAminoConversion do
...>  use Bio.Convertible do
...>    def to(DnaStrand), do: {:ok, &compressed/2, 1}
...>  end
...>
...>  def compressed({:ok, knumerable, data}, _) do
...>    data = data
...>           |> Map.drop([ :length ])
...>           |> Map.to_list()
...>    knumerable
...>    |> Enum.map(&to_codon/1)
...>    |> Enum.join("")
...>    |> DnaStrand.new(data)
...>  end
...>
...>  defp to_codon(aa) do
...>    case aa do
...>      "a" -> "gcn"
...>      "r" -> "cgn"
...>      "n" -> "aay"
...>      "d" -> "gay"
...>      "c" -> "tgy"
...>      "e" -> "gar"
...>      "q" -> "car"
...>      "g" -> "ggn"
...>      "h" -> "cay"
...>      "i" -> "ath"
...>      "l" -> "ctn"
...>      "k" -> "aar"
...>      "m" -> "atg"
...>      "f" -> "tty"
...>      "p" -> "ccn"
...>      "s" -> "tcn"
...>      "t" -> "acn"
...>      "w" -> "tgg"
...>      "y" -> "tay"
...>      "v" -> "gtn"
...>    end
...>  end
...>end
...>amino = AminoAcid.new("maktg", label: "polypeptide-∂")
...>Bio.Polymer.convert(amino, DnaStrand, conversion: CompressedAminoConversion)
{:ok, %DnaStrand{sequence: "atggcnaaracnggn", length: 15, label: "polypeptide-∂"}}

This is made possible because of the simple implementation of the Bio.Polymeric interface for the Bio.Sequence.AminoAcid. If you want to define your own convertible polymer types, you can. It requires defining the module and the implementation of convert/1. You can read the Bio.Sequence.AminoAcid source for more clarity on the details.

This package attempts to define reasonable defaults for all the occasions which it can. This includes converting DNA into RNA, and RNA to DNA. The conversions from DNA/RNA to Amino Acid are done using standard codon tables.

The Conversion module idea is provided as an escape hatch for more particular applications which may require bespoke logic. An example would be converting Amino Acids into a DNA sequence, as above. There are likely more use cases than I could possibly compile on my own, so I tried to come up with a way to alleviate that pressure.

Summary

Functions

Apply a conversion to a given datum.

Validate a given sequence struct according to its Bio.Polymeric implementation.

Functions

Link to this function

convert(data, module, opts \\ [])

View Source
@spec convert(struct(), module(), keyword()) ::
  {:ok, struct()} | {:error, :undef_conversion}

Apply a conversion to a given datum.

The convert/3 function is at the core of using the Bio.Polymer module. By passing the function a struct and the module you wish to convert to, you are hooking into the underlying implementation of the Bio.Convertible for that module. This means that both the struct you given as well as the module must have this implemented.

Examples

Given a struct and module with a known conversion:

iex>dna = DnaStrand.new("ttagccgt", label: "a label")
...>Bio.Polymer.convert(dna, RnaStrand)
{:ok, %RnaStrand{sequence: "uuagccgu", length: 8, label: "a label"}}

Given a struct and module with unknown conversions:

iex>amino = AminoAcid.new("maktg")
...>Bio.Polymer.convert(amino, DnaStrand)
{:error, :undef_conversion}

Given a struct that doesn't implement Bio.Sequential:

iex>Bio.Polymer.convert(%SomeModule{}, DnaStrand)
{:error, :no_converter}
Link to this function

valid?(data, alphabet \\ nil)

View Source
Link to this function

validate(data, alphabet \\ nil)

View Source
@spec validate(
  struct(),
  String.t() | nil
) ::
  {:ok, struct()}
  | {:error, :no_alpha}
  | {:error, {atom(), String.t(), integer()}}
  | {:error, [{atom(), String.t(), integer()}]}

Validate a given sequence struct according to its Bio.Polymeric implementation.