Mix.install([
  {:ducker, "~> 0.3.0"},
  {:adbc, "~> 0.7.9"},
  {:kino, "~> 0.16.0"},
  {:kino_db, "~> 0.3.0"},
  {:kino_explorer, "~> 0.1.24"}
])

Initialization

Create connection to DuckDB in-memory database

:ok = Adbc.download_driver!(:duckdb)
{:ok, db} = Kino.start_child({Adbc.Database, driver: :duckdb})
{:ok, conn} = Kino.start_child({Adbc.Connection, database: db})

# This example uses files in ducker's `priv/example`
work_dir = Path.join(:code.priv_dir(:ducker), "example")

You can attach to DuckDB database file by running these SQL queries

ATTACH '/path/to/your.db' AS file_db;
USE file_db;

Initialize Ducker to create SQL macros and tables needed by Ducker.

ducker = Ducker.new!(conn, work_dir)

Executing queries and tests

First, lets see what's inside our working directory

Ducker.FileHelper.list_files(Path.join(work_dir, "data"))
Ducker.FileHelper.list_files(Path.join(work_dir, "sources"))

In the data directory, we have 2 CSV files that contain data. In the sources directory, we have 1 SQL file that contains 2 queries and 1 YAML file that contains tests for 2 tables.

The following code will clear any test results, run all queries, and finally run all tests.

# clear previous test results
Ducker.reset_test_result!(ducker)

# run query from `sources` directory
# (relative to `work_dir`)
Ducker.execute_query_from_dir(ducker, "sources")
|> IO.inspect(label: "Executed queries")

# run data tests from `sources` directory
Ducker.execute_test_from_dir(ducker, "sources")
|> IO.inspect(label: "Executed tests")

:ok

Test results

All test results are recorded in the ducker_data_test_result table.

  • type is the type of the test.
  • entity is the name of the entity that the test is running on, it can be a table or a view.
  • label is the name of the test.
  • fail_count is the count of records that failed the test.
  • fail_query is an SQL query that can be run to show the records that failed the test.

The following code will show tests that failed.

result2 =
  Explorer.DataFrame.from_query!(
    conn,
    ~S"""
    select * from ducker_data_test_result
    where fail_count > 0
    """,
    []
  )
result = Explorer.DataFrame.from_query!(conn, ~S"SUMMARIZE mtcars ", [])