Bio.Sequence.Alphabets (bio_elixir v0.2.0)

Alphabets for the supported sequence types. The coding schemes are expressed in a notation that is essentially BNF. Values and interpretations for the schemes were accessed from here.

  • Bio.Sequence.Dna The DNA alphabets provided are:

    • common - The standard bases ATGCatgc
    • with_n - The standard alphabet, but with the ambiguous "any" character Nn
    • iupac - The IUPAC standard values ACGTRYSWKMBDHVNacgtryswkmbdhvn
  • Bio.Sequence.Rna

    • common - The standard bases ACGUacgu
    • with_n - The standard alphabet, but with the ambiguous "any" character Nn
    • iupac - The IUPAC standard values ACGURYSWKMBDHVNacguryswkmbdhvn
  • Bio.Sequence.AminoAcid

    • common - The standard 20 amino acid codes ARNDCEQGHILKMFPSTWYVarndceqghilkmfpstwyv
    • iupac - The IUPAC standard values ABCDEFGHJIKLMNPQRSTVWXYZabcdefghjiklmnpqrstvwxyz
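
As a rough illustration of how an alphabet string can be used, the sketch below checks that every character of a sequence belongs to a given alphabet. The module `AlphabetSketch` and its functions are hypothetical and are not part of bio_elixir; the alphabet strings are copied from the list above.

```elixir
defmodule AlphabetSketch do
  # Hypothetical helper, not the bio_elixir API.
  # Alphabet strings copied verbatim from the documentation above.
  @dna_common "ATGCatgc"
  @dna_iupac "ACGTRYSWKMBDHVNacgtryswkmbdhvn"

  def dna_common, do: @dna_common
  def dna_iupac, do: @dna_iupac

  # Returns true when every grapheme of `sequence` occurs in `alphabet`.
  # An empty sequence is considered valid.
  def valid?(sequence, alphabet) when is_binary(sequence) and is_binary(alphabet) do
    allowed = alphabet |> String.graphemes() |> MapSet.new()

    sequence
    |> String.graphemes()
    |> Enum.all?(&MapSet.member?(allowed, &1))
  end
end

# "N" is outside the common alphabet but inside the IUPAC one.
IO.inspect(AlphabetSketch.valid?("ATGN", AlphabetSketch.dna_common()))
IO.inspect(AlphabetSketch.valid?("ATGN", AlphabetSketch.dna_iupac()))
```

Because each alphabet lists both upper and lower case characters, no case folding is needed before the check.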

Coding Schemes

Deoxyribonucleic Acid codes

A ::= Adenine
C ::= Cytosine
G ::= Guanine
T ::= Thymine

R ::= A | G
Y ::= C | T
S ::= G | C
W ::= A | T
K ::= G | T
M ::= A | C

B ::= S | T (¬A)
D ::= R | T (¬C)
H ::= M | T (¬G)
V ::= M | G (¬T)
N ::= ANY
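
The rules above can be read as a small grammar: each ambiguity code expands, possibly through other codes, into a set of concrete bases. A minimal sketch of that expansion follows. The module `DnaCodesSketch` is hypothetical, and it assumes that `ANY` means any of the four bases A, C, G and T. Only upper case codes are handled.

```elixir
defmodule DnaCodesSketch do
  # Hypothetical helper, not the bio_elixir API.
  @terminals ~w(A C G T)

  # Production rules transcribed from the scheme above.
  # Assumption: N ::= ANY is taken to mean any of the four bases.
  @rules %{
    "R" => ~w(A G),
    "Y" => ~w(C T),
    "S" => ~w(G C),
    "W" => ~w(A T),
    "K" => ~w(G T),
    "M" => ~w(A C),
    "B" => ~w(S T),
    "D" => ~w(R T),
    "H" => ~w(M T),
    "V" => ~w(M G),
    "N" => ~w(A C G T)
  }

  # A concrete base expands to itself.
  def expand(base) when base in @terminals, do: [base]

  # An ambiguity code expands recursively through its alternatives.
  # Unknown codes raise a KeyError from Map.fetch!/2.
  def expand(code) do
    @rules
    |> Map.fetch!(code)
    |> Enum.flat_map(&expand/1)
    |> Enum.uniq()
    |> Enum.sort()
  end
end

IO.inspect(DnaCodesSketch.expand("B"))
```

Here `B` resolves through `S` to C and G, plus T, which matches the `(¬A)` annotation in the rule.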

Ribonucleic Acid codes

A ::= Adenine
C ::= Cytosine
G ::= Guanine
U ::= Uracil

R ::= A | G
Y ::= C | U
S ::= G | C
W ::= A | U
K ::= G | U
M ::= A | C

B ::= S | U (¬A)
D ::= R | U (¬C)
H ::= M | U (¬G)
V ::= M | G (¬U)
N ::= ANY
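
One common use of these codes is matching a motif that contains ambiguity codes against a concrete sequence. The sketch below turns such a motif into a regular expression with one character class per code. The module `RnaCodesSketch` is hypothetical, the expansions are flattened by hand from the rules above, and `ANY` is assumed to mean any of A, C, G and U.

```elixir
defmodule RnaCodesSketch do
  # Hypothetical helper, not the bio_elixir API.
  # Each code is mapped to the concrete bases it covers.
  # Assumption: N ::= ANY is taken to mean any of the four bases.
  @expansions %{
    "A" => "A", "C" => "C", "G" => "G", "U" => "U",
    "R" => "AG", "Y" => "CU", "S" => "GC",
    "W" => "AU", "K" => "GU", "M" => "AC",
    "B" => "CGU", "D" => "AGU", "H" => "ACU", "V" => "ACG",
    "N" => "ACGU"
  }

  # Builds an unanchored regex with one character class per motif position.
  # The motif is upcased first; unknown codes raise a KeyError.
  def to_regex(motif) when is_binary(motif) do
    motif
    |> String.upcase()
    |> String.graphemes()
    |> Enum.map_join(fn code -> "[" <> Map.fetch!(@expansions, code) <> "]" end)
    |> Regex.compile!()
  end
end

# The third position of GNRA must be a purine (A or G).
IO.inspect(Regex.match?(RnaCodesSketch.to_regex("GNRA"), "GCAA"))
IO.inspect(Regex.match?(RnaCodesSketch.to_regex("GNRA"), "GCUA"))
```

The resulting regex matches upper case sequences only; a lower case target would need to be upcased first or the regex compiled with the caseless option.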

Amino Acid codes

A ::= Alanine
C ::= Cysteine
D ::= Aspartic Acid
E ::= Glutamic Acid
F ::= Phenylalanine
G ::= Glycine
H ::= Histidine
I ::= Isoleucine
K ::= Lysine
L ::= Leucine
M ::= Methionine
N ::= Asparagine
P ::= Proline
Q ::= Glutamine
R ::= Arginine
S ::= Serine
T ::= Threonine
V ::= Valine
W ::= Tryptophan
Y ::= Tyrosine

B ::= D | N
Z ::= Q | E
J ::= I | L
X ::= ANY
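
To show how the ambiguous amino acid codes might be applied, the sketch below tests whether two one-letter codes could denote the same residue. The module `AminoAcidCodesSketch` is hypothetical, and it assumes that `ANY` means any of the 20 standard residues listed above. Only upper case codes are handled.

```elixir
defmodule AminoAcidCodesSketch do
  # Hypothetical helper, not the bio_elixir API.
  @standard ~w(A C D E F G H I K L M N P Q R S T V W Y)

  # Ambiguity rules transcribed from the scheme above.
  # Assumption: X ::= ANY is taken to mean any of the 20 standard residues.
  @ambiguous %{
    "B" => ~w(D N),
    "Z" => ~w(Q E),
    "J" => ~w(I L),
    "X" => @standard
  }

  # Set of concrete residues a code can stand for.
  def residues(code) when code in @standard, do: MapSet.new([code])
  def residues(code), do: @ambiguous |> Map.fetch!(code) |> MapSet.new()

  # Two codes are compatible when their residue sets overlap.
  def compatible?(left, right) do
    not MapSet.disjoint?(residues(left), residues(right))
  end
end

IO.inspect(AminoAcidCodesSketch.compatible?("B", "D"))
IO.inspect(AminoAcidCodesSketch.compatible?("B", "E"))
```

Comparing sets rather than characters lets an ambiguous position such as `B` agree with either aspartic acid or asparagine.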