DataMorph v0.1.0
Create Elixir structs, maps with atom keys, and keyword lists from CSV/TSV data.
Note: never convert user input to atoms. Atoms are not garbage collected; once an atom is created, it is never reclaimed.
Generating atoms from user input lets a user inject enough distinct names to exhaust system memory or to hit the Erlang VM's limit on the number of atoms, either of which brings the system down.
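As an illustration of the warning above, here is a minimal sketch of mapping untrusted header strings onto a fixed set of atoms instead of creating new ones. The module `SafeKeySketch` and its key table are hypothetical and are not part of DataMorph; only standard library calls are used.

```elixir
defmodule SafeKeySketch do
  # Hypothetical table of the only keys this application accepts.
  # The atoms are created once, at compile time, not from user input.
  @known_keys %{"name" => :name, "iso" => :iso}

  # Returns {:ok, atom} for a known header and :error otherwise.
  # No new atom is ever created from the input string.
  def fetch_key(header) when is_binary(header), do: Map.fetch(@known_keys, header)
end

SafeKeySketch.fetch_key("name")
# => {:ok, :name}

SafeKeySketch.fetch_key("made-up-by-user")
# => :error

# The VM reports how many atoms exist and the hard limit it enforces.
:erlang.system_info(:atom_count)
:erlang.system_info(:atom_limit)
```

`String.to_existing_atom/1` is another standard option: it raises `ArgumentError` rather than creating an atom that does not already exist.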
Examples
Define a struct and return stream of structs created from a tsv string, a namespace atom and name string.
iex> "name\tiso\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> DataMorph.structs_from_tsv(OpenRegister, "country") \
...> |> Enum.to_list
[
%OpenRegister.Country{iso: "nz", name: "New Zealand"},
%OpenRegister.Country{iso: "gb", name: "United Kingdom"}
]
Return stream of maps with atom keys created from a tsv stream.
iex> "name\tiso-code\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> String.split("\n") \
...> |> Stream.map(& &1) \
...> |> DataMorph.maps_from_tsv() \
...> |> Enum.to_list
[
%{iso_code: "nz", name: "New Zealand"},
%{iso_code: "gb", name: "United Kingdom"}
]
Return stream of keyword lists created from a tsv string.
iex> "name\tiso-code\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> DataMorph.keyword_lists_from_tsv() \
...> |> Enum.to_list
[
[name: "New Zealand", "iso-code": "nz"],
[name: "United Kingdom", "iso-code": "gb"]
]
Summary
Functions
Takes stream and applies filter regexp when not nil, and takes count when not nil.
Returns stream of keyword_lists created from csv string or stream.
Returns stream of keyword_lists created from tsv string or stream.
Returns stream of maps with atom keys created from csv string or stream.
Returns stream of maps with atom keys created from tsv string or stream.
Encode stream of string lists to TSV and write to standard out.
Concat headers to stream, encode to TSV and write to standard out.
Defines a struct and returns stream of structs created from csv string or stream, and a namespace and name.
Defines a struct and returns stream of structs created from tsv string or stream, and a namespace and name.
Functions
Takes stream and applies filter regexp when not nil, and takes count when not nil.
Parameters
stream: stream of string lines
regex: nil or regexp to match lines via Stream.filter/2 and String.match?/2
count: optional take count to apply via Stream.take/2
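The behaviour described by these parameters can be sketched with the standard library alone. This is a minimal sketch and not DataMorph's own source: the module `FilterTakeSketch` and the function name `filter_then_take` are hypothetical, chosen only for illustration.

```elixir
defmodule FilterTakeSketch do
  # Hypothetical helper: filters lines by regex when one is given,
  # then limits the stream to `count` lines when a count is given.
  def filter_then_take(stream, regex \\ nil, count \\ nil) do
    stream
    |> maybe_filter(regex)
    |> maybe_take(count)
  end

  # A nil regex leaves the stream untouched.
  defp maybe_filter(stream, nil), do: stream
  defp maybe_filter(stream, regex), do: Stream.filter(stream, &String.match?(&1, regex))

  # A nil count leaves the stream untouched.
  defp maybe_take(stream, nil), do: stream
  defp maybe_take(stream, count), do: Stream.take(stream, count)
end

["nz\tNew Zealand", "gb\tUnited Kingdom", "nl\tNetherlands"]
|> FilterTakeSketch.filter_then_take(~r/^n/, 1)
|> Enum.to_list()
# => ["nz\tNew Zealand"]
```

Because both steps use `Stream` functions, nothing is read from the source until the result is enumerated.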
Returns stream of keyword_lists created from csv string or stream.
Useful when you want to retain the field order of the original stream.
Parameters
csv: CSV stream or string
options: optionally pass in separator, e.g. separator: ";"
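To illustrate why a keyword list retains field order while a map does not promise it, here is a small standard-library sketch. The header and row values are taken from the examples on this page; the pairing with `Enum.zip/2` is an assumption made for illustration, not a description of how DataMorph builds its results.

```elixir
headers = [:name, :"iso-code"]
row = ["New Zealand", "nz"]

# A keyword list is a list of {key, value} tuples, so it keeps
# the order in which the fields were paired up.
keyword_list = Enum.zip(headers, row)
# => [name: "New Zealand", "iso-code": "nz"]

Keyword.keys(keyword_list)
# => [:name, :"iso-code"]

# A map holds the same data, but its key order is not something
# callers should rely on.
map = Map.new(keyword_list)
map[:name]
# => "New Zealand"
```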
Returns stream of keyword_lists created from tsv string or stream.
Useful when you want to retain the field order of the original stream.
Example
Return stream of keyword lists created from a tsv string.
iex> "name\tiso-code\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> DataMorph.keyword_lists_from_tsv() \
...> |> Enum.to_list
[
[name: "New Zealand", "iso-code": "nz"],
[name: "United Kingdom", "iso-code": "gb"]
]
Returns stream of maps with atom keys created from csv string or stream.
Parameters
csv: CSV stream or string
options: optionally pass in separator, e.g. separator: ";"
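As a rough picture of what a separator option controls, the following naive sketch splits delimited text on a caller-supplied separator. It ignores quoted fields and escaping, so it is not a real CSV parser and is not how DataMorph parses input; the module `SeparatorSketch` is hypothetical.

```elixir
defmodule SeparatorSketch do
  # Naive splitter: one row per line, one field per separator.
  # Quoted fields containing the separator are NOT handled.
  def rows(text, options \\ [separator: ","]) do
    separator = Keyword.get(options, :separator, ",")

    text
    |> String.split("\n", trim: true)
    |> Enum.map(&String.split(&1, separator))
  end
end

SeparatorSketch.rows("name;iso\nNew Zealand;nz", separator: ";")
# => [["name", "iso"], ["New Zealand", "nz"]]
```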
Returns stream of maps with atom keys created from tsv string or stream.
Example
Return stream of maps with atom keys created from a tsv stream.
iex> "name\tiso-code\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> String.split("\n") \
...> |> Stream.map(& &1) \
...> |> DataMorph.maps_from_tsv() \
...> |> Enum.to_list
[
%{iso_code: "nz", name: "New Zealand"},
%{iso_code: "gb", name: "United Kingdom"}
]
Parameters
tsv: TSV stream or string
Encode stream of string lists to TSV and write to standard out.
Example
Write a stream of string lists to standard out as TSV lines.
iex> "name\tiso\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> String.split("\n") \
...> |> DataMorph.structs_from_tsv("open-register", :iso_country) \
...> |> Stream.map(& [&1.iso, &1.name]) \
...> |> DataMorph.puts_tsv
nz\tNew Zealand
gb\tUnited Kingdom
Concat headers to stream, encode to TSV and write to standard out.
Example
Write a stream of string lists to standard out as TSV lines, with headers.
iex> "name\tiso\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> String.split("\n") \
...> |> DataMorph.structs_from_tsv("open-register", :iso_country) \
...> |> Stream.map(& [&1.iso, &1.name]) \
...> |> DataMorph.puts_tsv("iso-code","name")
iso-code\tname
nz\tNew Zealand
gb\tUnited Kingdom
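The two output steps described above (encode rows as TSV lines, optionally with a header row concatenated in front) can be sketched with the standard library. This is a naive sketch that does no escaping of tabs or newlines inside fields; the module `TsvOutSketch` and its function names are hypothetical and do not reflect DataMorph's implementation.

```elixir
defmodule TsvOutSketch do
  # Joins each list of strings into one tab-separated line.
  def lines(rows), do: Stream.map(rows, &Enum.join(&1, "\t"))

  # Prepends a header row before encoding.
  def lines(rows, headers), do: lines(Stream.concat([headers], rows))

  # Writes each encoded line to standard out.
  def puts(encoded_lines), do: Enum.each(encoded_lines, &IO.puts/1)
end

rows = [["nz", "New Zealand"], ["gb", "United Kingdom"]]

rows
|> TsvOutSketch.lines(["iso-code", "name"])
|> Enum.to_list()
# => ["iso-code\tname", "nz\tNew Zealand", "gb\tUnited Kingdom"]
```

Passing the result of `lines/2` to `TsvOutSketch.puts/1` prints one line per row.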
structs_from_csv(csv, namespace, name, options \\ [separator: ","])
Defines a struct and returns stream of structs created from csv string or stream, and a namespace and name.
See structs_from_tsv/3 for examples.
Parameters
csv: CSV stream or string
namespace: string or atom to form first part of struct alias
name: string or atom to form last part of struct alias
options: optionally pass in separator, e.g. separator: ";"
Defines a struct and returns stream of structs created from tsv string or stream, and a namespace and name.
Redefines struct when called again with same namespace and name but different fields. It sets struct fields to be the union of the old and new fields.
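One way to picture the "union of the old and new fields" behaviour is the following sketch, which defines a struct module at runtime with `Module.create/3` and merges in any fields the module already has. It is an assumption about how such redefinition could be done, not DataMorph's actual code; `StructUnionSketch` and `SketchCountry` are hypothetical names.

```elixir
defmodule StructUnionSketch do
  # Defines (or redefines) `module` as a struct whose fields are the
  # union of its current fields, if any, and the given `fields`.
  def define(module, fields) do
    existing =
      if Code.ensure_loaded?(module) and function_exported?(module, :__struct__, 0) do
        # Read the field names of the already-defined struct.
        module.__struct__() |> Map.from_struct() |> Map.keys()
      else
        []
      end

    merged = Enum.uniq(existing ++ fields)

    # Redefining an existing module makes the compiler print a warning.
    Module.create(module, quote(do: defstruct(unquote(merged))), Macro.Env.location(__ENV__))
    merged
  end
end

StructUnionSketch.define(SketchCountry, [:name, :iso])
StructUnionSketch.define(SketchCountry, [:name, :acronym])

# After the second call the struct has :name, :iso and :acronym;
# fields not supplied default to nil.
struct(SketchCountry, name: "New Zealand", acronym: "NZ")
```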
Example
Define a struct and return stream of structs created from a tsv stream, and a namespace string and name atom.
iex> "name\tiso\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> String.split("\n") \
...> |> Stream.map(& &1) \
...> |> DataMorph.structs_from_tsv("open-register", :iso_country) \
...> |> Enum.to_list
[
%OpenRegister.IsoCountry{iso: "nz", name: "New Zealand"},
%OpenRegister.IsoCountry{iso: "gb", name: "United Kingdom"}
]
Example
Add new fields to struct when called again with different tsv.
iex> "name\tiso\n" <>
...> "New Zealand\tnz\n" <>
...> "United Kingdom\tgb" \
...> |> DataMorph.structs_from_tsv(OpenRegister, "country") \
...> |> Enum.to_list
...>
...> "name\tacronym\n" <>
...> "New Zealand\tNZ\n" <>
...> "United Kingdom\tUK" \
...> |> DataMorph.structs_from_tsv(OpenRegister, "country") \
...> |> Enum.to_list
[
%OpenRegister.Country{acronym: "NZ", iso: nil, name: "New Zealand"},
%OpenRegister.Country{acronym: "UK", iso: nil, name: "United Kingdom"}
]
Parameters
tsv: TSV stream or string
namespace: string or atom to form first part of struct alias
name: string or atom to form last part of struct alias