Bio.Sequence.Alphabets (bio_ex_sequence v0.1.1)
Alphabets relevant to the supported sequence types. Coding schemes are expressed in what is essentially BNF. Values and interpretations for the schemes were accessed from here.
Also exposes the complementary elements for DNA and RNA, allowing strands to be
complemented. These functions shouldn't be used directly; see
Bio.Sequence.Dna.complement/2 and Bio.Sequence.Rna.complement/1 for more
information.
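As a rough illustration of what complementing a strand involves, the sketch below maps each base of the common DNA alphabet to its partner. `ComplementSketch` and its `complement/1` function are hypothetical and standalone; they are not the library's implementation, and Bio.Sequence.Dna.complement/2 remains the intended entry point.

```elixir
defmodule ComplementSketch do
  # Hypothetical helper, not part of bio_ex_sequence.
  # Complementary pairs for the common DNA alphabet, in both cases.
  @dna_complements %{
    "A" => "T", "T" => "A", "C" => "G", "G" => "C",
    "a" => "t", "t" => "a", "c" => "g", "g" => "c"
  }

  # Returns {:ok, complemented} or {:error, {:unknown_base, char}}.
  def complement(sequence) when is_binary(sequence) do
    sequence
    |> String.graphemes()
    |> Enum.reduce_while({:ok, []}, fn base, {:ok, acc} ->
      case Map.fetch(@dna_complements, base) do
        {:ok, comp} -> {:cont, {:ok, [comp | acc]}}
        :error -> {:halt, {:error, {:unknown_base, base}}}
      end
    end)
    |> case do
      # Bases were prepended, so restore the original order before joining.
      {:ok, reversed} -> {:ok, reversed |> Enum.reverse() |> Enum.join()}
      error -> error
    end
  end
end
```

With this sketch, `ComplementSketch.complement("ATGC")` returns `{:ok, "TACG"}`, and a character outside the common alphabet, such as `"N"`, yields `{:error, {:unknown_base, "N"}}`.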
Alphabets may be used in the declaration of Bio.BaseSequence structs to
define how they should be validated. If one is not supplied, a default
may be used. See Bio.Sequence.Dna, Bio.Sequence.Rna,
Bio.Sequence.AminoAcid, and Bio.Polymer.valid?/2 for more information.
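To make the idea of validating against an alphabet concrete, here is a minimal sketch of a membership check. `AlphabetCheckSketch.valid?/2` is a hypothetical helper written for illustration; it is not Bio.Polymer.valid?/2 and makes no claim about how the library performs validation.

```elixir
defmodule AlphabetCheckSketch do
  # Hypothetical helper, not part of bio_ex_sequence.
  # Returns true when every character of `sequence` appears in `alphabet`.
  def valid?(sequence, alphabet) when is_binary(sequence) and is_binary(alphabet) do
    allowed = alphabet |> String.graphemes() |> MapSet.new()

    sequence
    |> String.graphemes()
    |> Enum.all?(&MapSet.member?(allowed, &1))
  end
end
```

For example, `AlphabetCheckSketch.valid?("ttagct", "ATGCatgc")` returns `true`, while `AlphabetCheckSketch.valid?("ttagcn", "ATGCatgc")` returns `false`. An empty sequence is treated as valid in this sketch.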
Bio.Sequence.Dna
The DNA alphabets provided are:
common - The standard bases: ATGCatgc
with_n - The standard alphabet, but with the ambiguous "any" character added: Nn
iupac - The IUPAC standard values: ACGTRYSWKMBDHVNacgtryswkmbdhvn
Bio.Sequence.Rna
The RNA alphabets provided are:
common - The standard bases: ACGUacgu
with_n - The standard alphabet, but with the ambiguous "any" character added: Nn
iupac - The IUPAC standard values: ACGURYSWKMBDHVNacguryswkmbdhvn
Bio.Sequence.AminoAcid
The amino acid alphabets provided are:
common - The standard 20 amino acid codes: ARNDCEQGHILKMFPSTWYVarndceqghilkmfpstwyv
iupac - ABCDEFGHJIKLMNPQRSTVWXYZabcdefghjiklmnpqrstvwxyz
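Because each DNA alphabet is broader than the one before it, a caller could look for the strictest alphabet that still covers a sequence. The sketch below does that under two assumptions: the module and function names are hypothetical, and `with_n` is taken to mean the common alphabet plus `N` and `n`.

```elixir
defmodule AlphabetPickSketch do
  # Hypothetical helper, not part of bio_ex_sequence.
  # DNA alphabets ordered from strictest to most permissive.
  # Assumption: `with_n` is the common alphabet plus "N" and "n".
  @dna_alphabets [
    common: "ATGCatgc",
    with_n: "ATGCNatgcn",
    iupac: "ACGTRYSWKMBDHVNacgtryswkmbdhvn"
  ]

  # Returns {:ok, name} for the first alphabet that covers the sequence,
  # or :error when none of them does.
  def strictest(sequence) when is_binary(sequence) do
    chars = String.graphemes(sequence)

    Enum.find_value(@dna_alphabets, :error, fn {name, alphabet} ->
      allowed = String.graphemes(alphabet)
      if Enum.all?(chars, &(&1 in allowed)), do: {:ok, name}
    end)
  end
end
```

Here `AlphabetPickSketch.strictest("ATNG")` returns `{:ok, :with_n}`, `AlphabetPickSketch.strictest("ATRY")` returns `{:ok, :iupac}`, and a sequence containing a character outside all three alphabets returns `:error`.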
Coding Schemes
Deoxyribonucleic Acid codes
A ::= Adenine
C ::= Cytosine
G ::= Guanine
T ::= Thymine
R ::= A | G
Y ::= C | T
S ::= G | C
W ::= A | T
K ::= G | T
M ::= A | C
B ::= S | T (¬A)
D ::= R | T (¬C)
H ::= M | T (¬G)
V ::= M | G (¬T)
N ::= ANY
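The DNA rules can be read as a small grammar: each ambiguity code expands, possibly through another code, into a set of concrete bases. The following sketch encodes the rules as a map and expands them recursively. The module name is hypothetical, and `ANY` is assumed here to mean any of the four concrete bases.

```elixir
defmodule DnaCodesSketch do
  # Hypothetical helper, not part of bio_ex_sequence.
  # Each entry mirrors one rule of the scheme; terminals map to themselves.
  # Assumption: ANY is interpreted as A | C | G | T.
  @rules %{
    "A" => ["A"], "C" => ["C"], "G" => ["G"], "T" => ["T"],
    "R" => ["A", "G"], "Y" => ["C", "T"], "S" => ["G", "C"],
    "W" => ["A", "T"], "K" => ["G", "T"], "M" => ["A", "C"],
    "B" => ["S", "T"], "D" => ["R", "T"], "H" => ["M", "T"],
    "V" => ["M", "G"], "N" => ["A", "C", "G", "T"]
  }

  # Expands a code into the sorted list of concrete bases it can stand for.
  def expand(code) do
    case Map.fetch(@rules, code) do
      # A terminal expands to itself.
      {:ok, [^code]} ->
        [code]

      # A non-terminal expands each alternative in turn.
      {:ok, alternatives} ->
        alternatives
        |> Enum.flat_map(&expand/1)
        |> Enum.uniq()
        |> Enum.sort()

      :error ->
        raise ArgumentError, "unknown DNA code: #{inspect(code)}"
    end
  end
end
```

For instance, `DnaCodesSketch.expand("B")` returns `["C", "G", "T"]`, which matches the `¬A` annotation on that rule, and `DnaCodesSketch.expand("V")` returns `["A", "C", "G"]`.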
Ribonucleic Acid codes
A ::= Adenine
C ::= Cytosine
G ::= Guanine
U ::= Uracil
R ::= A | G
Y ::= C | U
S ::= G | C
W ::= A | U
K ::= G | U
M ::= A | C
B ::= S | U (¬A)
D ::= R | U (¬C)
H ::= M | U (¬G)
V ::= M | G (¬U)
N ::= ANY
Amino Acid codes
A ::= Alanine
C ::= Cysteine
D ::= Aspartic Acid
E ::= Glutamic Acid
F ::= Phenylalanine
G ::= Glycine
H ::= Histidine
I ::= Isoleucine
K ::= Lysine
L ::= Leucine
M ::= Methionine
N ::= Asparagine
P ::= Proline
Q ::= Glutamine
R ::= Arginine
S ::= Serine
T ::= Threonine
V ::= Valine
W ::= Tryptophan
Y ::= Tyrosine
B ::= D | N
Z ::= Q | E
J ::= I | L
X ::= ANY
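One way to use the amino acid scheme is to test whether a concrete residue satisfies a possibly ambiguous code. The sketch below is a hypothetical helper for uppercase codes only, and it assumes `ANY` means any of the 20 standard residues.

```elixir
defmodule AminoMatchSketch do
  # Hypothetical helper, not part of bio_ex_sequence.
  # The 20 standard residues and the three two-way ambiguity codes.
  @standard String.graphemes("ARNDCEQGHILKMFPSTWYV")
  @ambiguous %{"B" => ["D", "N"], "Z" => ["Q", "E"], "J" => ["I", "L"]}

  # True when the concrete `residue` is one that `code` can stand for.
  def matches?(code, residue) when residue in @standard do
    cond do
      # Assumption: X (ANY) accepts every standard residue.
      code == "X" -> true
      Map.has_key?(@ambiguous, code) -> residue in Map.fetch!(@ambiguous, code)
      true -> code == residue
    end
  end

  # Anything that is not a standard residue never matches.
  def matches?(_code, _residue), do: false
end
```

With this sketch, `AminoMatchSketch.matches?("B", "D")` and `AminoMatchSketch.matches?("X", "W")` return `true`, while `AminoMatchSketch.matches?("B", "E")` returns `false`.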