Blink behaviour (blink v0.6.1)
Blink provides efficient database seeding with a clean, declarative syntax.
Example
defmodule MyApp.Seeder do
  use Blink

  def call do
    new()
    |> with_table("users")
    |> run(MyApp.Repo)
  end

  def table(_seeder, "users") do
    [
      %{id: 1, name: "Alice", email: "alice@example.com"},
      %{id: 2, name: "Bob", email: "bob@example.com"}
    ]
  end
end
Overview
Blink simplifies database seeding by providing a structured way to build and insert rows:
- Create an empty Seeder with new/0.
- Declare which tables to seed with with_table/2.
- Define table/2 clauses that return the rows to insert.
- Call run/2 or run/3 to bulk-insert the rows.
Seeders
Seeders are the central data unit in Blink. A Seeder is a struct that holds
the rows you want to seed, any contextual data you need during the seeding
process, and internal state that Blink uses to execute the bulk insert.
%Blink.Seeder{
  tables: %{
    "table_name" => [...]
  },
  context: %{
    "key" => [...]
  },
  table_order: ...,
  table_opts: ...
}

All keys in tables must match the name of a table in your database. Table
names can be either atoms or strings.
Tables
A mapping of table names to lists of rows. These rows will be persisted to the
database when run/2 or run/3 is called.
Context
Stores arbitrary data needed during the seeding process. This data is
available when building your seeds but is not inserted into the database by
run/2 or run/3. Use with_context/2 to declare context keys and define
corresponding context/2 clauses.
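A minimal sketch, assuming context data can be read back from the seeder's :context field inside a table/2 clause (the module name, "authors" key, and rows are illustrative):

defmodule MyApp.PostSeeder do
  use Blink

  def call do
    new()
    |> with_context("authors")
    |> with_table("posts")
    |> run(MyApp.Repo)
  end

  # Built via with_context/2; available while seeding, never inserted.
  def context(_seeder, "authors") do
    [%{id: 1, name: "Alice"}]
  end

  # Assumption: rows can reference context through the seeder struct's
  # :context map, as shown in the struct layout above.
  def table(seeder, "posts") do
    for author <- seeder.context["authors"] do
      %{title: "Post by #{author.name}", author_id: author.id}
    end
  end
end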
Custom Logic for Running the Seeder
By default, run/2 and run/3 bulk insert rows from the seeder into the
tables of a Postgres database. Internally they use Postgres' COPY command.
There are two ways to customize the insert behavior:
- Override the default implementation of run/2 or run/3 (see the sketch below).
- Pass a custom adapter to run/3 (e.g., for non-Postgres databases).
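A minimal sketch of the first approach, assuming the default run/2 injected by use Blink is overridable (the module name and rows are illustrative, and Ecto's Repo.insert_all/2 is swapped in for the default COPY-based insert):

defmodule MyApp.InsertAllSeeder do
  use Blink

  def call do
    new()
    |> with_table("users")
    |> run(MyApp.Repo)
  end

  def table(_seeder, "users") do
    [%{id: 1, name: "Alice", email: "alice@example.com"}]
  end

  # Custom run/2: insert with Repo.insert_all/2 instead of COPY.
  # Assumes the seeder's :tables field maps table names to lists of
  # row maps, as shown in the struct layout above.
  def run(seeder, repo) do
    Enum.each(seeder.tables, fn {table_name, rows} ->
      repo.insert_all(to_string(table_name), rows)
    end)

    :ok
  end
end

For the second approach, see copy_to_table/4 below: its :adapter option names the adapter module used for the bulk copy.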
Summary
Callbacks
- context/2 - Builds and returns the data to be stored under a context key in the given Seeder.
- run/2, run/3 - Specifies how to run the Seeder, performing a bulk insert of the seed data from a Seeder into the given Ecto repository.
- table/2 - Builds and returns the rows to be stored under a table key in the given Seeder.
Functions
- copy_to_table/4 - Copies rows into a database table using database-specific bulk copy commands.
- from_csv/2 - Reads a CSV file and returns a list or stream of maps.
- from_json/2 - Reads a JSON file and returns a list of maps.
Callbacks
@callback context(seeder :: Blink.Seeder.t(), key :: Blink.Seeder.key()) :: Enumerable.t()
Builds and returns the data to be stored under a context key in the given
Seeder.
Called internally by with_context/2. Each key passed to with_context must
have a corresponding context/2 clause.
run/2 and run/3 ignore context data and only insert data from :tables.
When the callback function is missing, an ArgumentError is raised.
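For example, a clause matching a key declared with with_context/2 (the key and rows are illustrative):

def context(_seeder, "plans") do
  [%{code: "free"}, %{code: "pro"}]
end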
@callback run(seeder :: Blink.Seeder.t(), repo :: Ecto.Repo.t()) :: :ok
Specifies how to run the Seeder, performing a bulk insert of the seed data
from a Seeder into the given Ecto repository.
This callback function is optional, since Blink ships with a default implementation.
@callback run(seeder :: Blink.Seeder.t(), repo :: Ecto.Repo.t(), opts :: Keyword.t()) :: :ok
@callback table(seeder :: Blink.Seeder.t(), table_name :: Blink.Seeder.key()) :: Enumerable.t()
Builds and returns the rows to be stored under a table key in the given
Seeder.
Called internally by with_table/2 and with_table/3. Each table name passed
to with_table must have a corresponding table/2 clause.
Data added to a Seeder with table/2 is inserted into the corresponding
database table when calling run/2 or run/3.
The callback can return either a list or a stream of maps. Returning a stream enables memory-efficient seeding of large datasets.
When the callback function is missing, an ArgumentError is raised.
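For instance, a table/2 clause might return a stream to seed a large table (the table name and row shape are illustrative):

def table(_seeder, "events") do
  # Rows are produced lazily, so the full dataset never resides in memory.
  Stream.map(1..1_000_000, fn i ->
    %{id: i, name: "event_#{i}"}
  end)
end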
Functions
@spec copy_to_table(rows :: Enumerable.t(), table_name :: String.t(), repo :: Ecto.Repo.t(), opts :: Keyword.t()) :: :ok
Copies rows into a database table using database-specific bulk copy commands.
Parameters
- rows - An enumerable (list or stream) of maps where each map represents a row to insert. All maps must have the same keys, which correspond to the table columns. Using a stream allows for memory-efficient seeding of large datasets.
- table_name - The name of the table to insert into (string or atom).
- repo - An Ecto repository module.
- opts - Keyword list of options:
  - :adapter - The adapter module to use. Defaults to Blink.Adapter.Postgres.
The following options are specific to Blink.Adapter.Postgres:
- :batch_size - Number of rows per batch (default: 8,000).
- :max_concurrency - Number of parallel COPY operations (default: 6).
- :timeout - Timeout in milliseconds for each batch operation (default: :infinity).
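A hedged usage sketch of these options (the values are purely illustrative, not tuning recommendations):

iex> copy_to_table(rows, "users", MyApp.Repo, batch_size: 10_000, max_concurrency: 2)
:ok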
Returns
:ok - When the copy operation succeeds.
Raises an exception when the copy operation fails.
Examples
iex> rows = [%{id: 1, name: "Alice"}, %{id: 2, name: "Bob"}]
iex> copy_to_table(rows, "users", MyApp.Repo)
:ok
# Using a stream for memory-efficient seeding
iex> stream = Stream.map(1..1_000_000, fn i -> %{id: i, name: "User #{i}"} end)
iex> copy_to_table(stream, "users", MyApp.Repo)
:ok
Notes
The function assumes all rows have the same structure. Column names are extracted from the first row in the enumerable.
Currently only PostgreSQL is supported via Blink.Adapter.Postgres.
@spec from_csv(path :: String.t(), opts :: Keyword.t()) :: Enumerable.t()
Reads a CSV file and returns a list or stream of maps.
Each column header becomes a string key in the resulting maps. All values are returned as strings.
Parameters
- path - Path to the CSV file (relative or absolute)
- opts - Keyword list of options:
  - :headers - List of header names to use, or :infer to read from the first row (default: :infer)
  - :transform - Function to transform each row map (default: identity)
  - :stream - When true, returns a stream instead of a list (default: false)
Examples
# Read CSV with headers in first row
from_csv("users.csv")
# Provide headers explicitly
from_csv("users.csv", headers: ["id", "name", "email"])
# Transform values
from_csv("users.csv", transform: fn row ->
Map.update!(row, "id", &String.to_integer/1)
end)
# Stream for memory-efficient processing
from_csv("large_users.csv", stream: true)Returns
A list of maps, or a stream of maps when stream: true.
Notes
For JSONB columns, use :transform to parse JSON strings into maps. The
Postgres adapter will automatically JSON-encode maps when inserting.
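For example, assuming a hypothetical "settings" column stored as JSON text in the CSV and the Jason library available for decoding:

from_csv("users.csv", transform: fn row ->
  # Hypothetical: decode the JSON-encoded "settings" value into a map
  Map.update!(row, "settings", &Jason.decode!/1)
end)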
from_json(path, opts)
Reads a JSON file and returns a list of maps.
The JSON file must contain an array of objects at the root level. Each object becomes a map with string keys.
Parameters
- path - Path to the JSON file
- opts - Keyword list of options:
  - :transform - Function to transform each row map (default: identity)
Examples
# Read JSON file
from_json("users.json")
# Transform values
from_json("users.json", transform: fn row ->
Map.update!(row, "id", &String.to_integer/1)
end)
Returns
A list of maps.