View Source Bio.Convertible behaviour (bio_ex_sequence v0.1.1)

Defines behavior for modules to act as a converter between sequences.

The core of this base module is to provide the default to/1 function. This will return the error tuple for undefined conversions. This alleviates the need of the user defined module to provide this implementation, and eliminates the possibility of the function to/1 raising due to no matching clauses.

To use this as a base for your converter you use the module and pass a block for defining the user-side to/1 calls. For example:

defmodule SomeConversion do
  use Bio.Convertible do
    def to(SomeModule), do: {:ok, &your_kmer_converter/2, 6}

    defp your_kmer_converter({:ok, kmers, data}, module) do
      # conversion logic
    end
  end
end

This defines the k-wise converter that will be used by Bio.Polymer.convert/3.

The function you define for the actual conversion will be given an :ok tuple with the kmers and any additional data defined for the struct that you're converting. How this data is partitioned is managed by the Bio.Polymeric definition.

As an example, the Bio.Sequence.DnaStrand struct simply drops the sequence key, retaining all other keys as the data given to the fn/2 you define. This allows you to retain any relevant information for a newly created struct.

If you wanted to define your own sequence, this then requires that you also implement the Bio.Polymeric interface. So if you were to implement SomeSequence, you would do the following:

defmodule SomeSequence do
  @behaviour Bio.Sequential

  @impl Bio.Sequential
  def converter, do: SomeConversion

  # implementation of other callbacks
end

The Bio.Sequential behavior ensures that we implement the converter/0 function which is called from the Bio.Polymer module. This in turn constructs the basic converter mechanic, and now you would implement Bio.Polymeric:

defimpl Bio.Polymeric, for: SomeSequence do
  def kmers(seq, k) do
    # your logic for splitting the polymer into k sized kmers
  end

  def valid?(seq, alpha) do
    # your logic checking if the polymer is valid
  end

  def validate(seq, alpha) do
    # your logic for validating the polymer
  end
end

Now you can simply call:

SomeSequence.new("some data")
|> Bio.Polymer.convert(SomeModule)

The Bio.Polymer.convert/3 function now handles calling your conversion method your_kmer_converter. This will be called with the kmers generated by Bio.Polymeric's implementation of kmers/2, which will be passed a value of k=6, as defined in the conversion callback.

This is perhaps overly complex, but here are the design goals:

  1. Allow a user to define a conversion from one type to another, regardless of if that type is internal.
  2. Allow that definition to leverage existing protocols for e.g. kmers.
  3. Allow a dead simple interface for working with internal components when extensions aren't needed.

A lot of complexity was pulled in to hit these targets, and I think that the goal is achieved. A user who doesn't need their own types can simply work with the Bio.Sequence types using the Bio.Polymer interface.

However, if they want to define e.g. a converter between Bio.Sequence.RnaStrand and Bio.Sequence.AminoAcid, they can simply define a module for that conversion and pass it to the Bio.Polymer.convert/3 function.

The complexity only arises to the user when they need their own sequences. Hopefully they'd have appreciated the necessity of the complexity before then.

Summary

Callbacks

Defines the converter's k-wise conversion function

Callbacks

@callback to(thing :: module()) ::
  {:ok, (term() -> term()), integer()} | {:error, :undef_conversion}

Defines the converter's k-wise conversion function

This is called within the Bio.Polymer.convert/3 function to acquire the k-wise conversion function for sequence to another.