DuckDB Elixir Port - Implementation Agent Instructions

Mission

You are implementing a 100% exact port of the DuckDB Python client to Elixir. This is not an adaptation or interpretation—it is a faithful recreation of the Python API with Elixir idioms.

Critical Rules

  1. ALWAYS reference duckdb-python/ directory for implementation details
  2. NEVER guess or invent behavior—check the Python source first
  3. MUST use Test-Driven Development (TDD) approach
  4. MUST port tests from duckdb-python/tests/ before implementing features
  5. MUST use Mox for mocking during Elixir-side development
  6. MUST verify Docker environment builds before implementing
  7. MUST write implementation stubs that fail tests initially

Required Reading

Before starting ANY implementation work, you MUST read:

  1. docs/TECHNICAL_DESIGN.md - Complete technical architecture
  2. docs/IMPLEMENTATION_ROADMAP.md - Phased implementation plan
  3. docs/PYTHON_API_REFERENCE.md - Complete Python API catalog

Development Methodology: Test-Driven Development

TDD Workflow (MANDATORY)

For every feature you implement, follow this EXACT sequence:

Step 1: Port Python Tests

# 1. Find relevant test in duckdb-python/tests/
# Example: duckdb-python/tests/fast/test_connection.py

# 2. Port to ExUnit format
defmodule DuckdbEx.ConnectionTest do
  use ExUnit.Case

  # Port each Python test function to Elixir
  test "connect to memory database" do
    # This WILL fail initially
    {:ok, conn} = DuckdbEx.connect(:memory)
    assert conn != nil
  end
end

Step 2: Create Implementation Stubs

# lib/duckdb_ex/connection.ex
defmodule DuckdbEx.Connection do
  @moduledoc """
  DuckDB connection management.

  Reference: duckdb-python/src/duckdb_py/include/duckdb_python/pyconnection/pyconnection.hpp
  """

  @type t :: reference()

  @spec connect(String.t() | :memory, keyword()) :: {:ok, t()} | {:error, term()}
  def connect(_database, _opts \\ []) do
    # Stub implementation - SHOULD FAIL TESTS
    raise "Not implemented yet"
  end
end

Step 3: Run Tests (They MUST Fail)

mix test
# Expected: Failures because stubs raise/return wrong values

Step 4: Implement NIF Layer (Rust)

// native/duckdb_nif/src/connection.rs

use rustler::{Encoder, Env, Error, ResourceArc, Term};
use duckdb::Connection as DuckDBConnection;
use std::sync::Mutex;

#[derive(Debug)]
pub struct ConnectionResource {
    inner: Mutex<DuckDBConnection>,
}

#[rustler::nif]
fn new_connection(path: String) -> Result<ResourceArc<ConnectionResource>, Error> {
    let conn = DuckDBConnection::open(&path)
        .map_err(|e| Error::Term(Box::new(e.to_string())))?;

    Ok(ResourceArc::new(ConnectionResource {
        inner: Mutex::new(conn),
    }))
}

Step 5: Implement Elixir Wrapper

defmodule DuckdbEx.Connection do
  alias DuckdbEx.Native

  def connect(database, _opts \\ []) do
    path = case database do
      :memory -> ":memory:"
      str when is_binary(str) -> str
    end

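    case Native.new_connection(path) do
      {:ok, conn_ref} -> {:ok, conn_ref}
      {:error, reason} -> {:error, reason}
      # Rustler encodes Ok(value) as the bare value when the NIF returns Result<T, rustler::Error>
      conn_ref when is_reference(conn_ref) -> {:ok, conn_ref}
    end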
  end
end

Step 6: Run Tests Again

mix test
# Expected: Tests now pass

Step 7: Verify Against Python Behavior

# Run equivalent Python code to verify exact behavior
python3 -c "import duckdb; conn = duckdb.connect(':memory:'); print(conn)"

Docker Environment Setup (DO THIS FIRST)

Create Dockerfile

# Dockerfile
FROM elixir:1.18-alpine

# Install build dependencies
RUN apk add --no-cache \
    build-base \
    git \
    curl \
    rust \
    cargo \
    sqlite-dev

# Install Hex and Rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# Set working directory
WORKDIR /app

# Copy project files
COPY mix.exs mix.lock ./
COPY config ./config
RUN mix deps.get
RUN mix deps.compile

COPY . .

# Compile project
RUN mix compile

CMD ["iex", "-S", "mix"]

Create docker-compose.yml

# docker-compose.yml
version: '3.8'

services:
  dev:
    build: .
    volumes:
      - .:/app
      - build:/app/_build
      - deps:/app/deps
    environment:
      - MIX_ENV=dev
    command: iex -S mix

  test:
    build: .
    volumes:
      - .:/app
      - build:/app/_build
      - deps:/app/deps
    environment:
      - MIX_ENV=test
    command: mix test

volumes:
  build:
  deps:

Build and Verify

# Build the Docker environment
docker-compose build

# Run tests (should pass even if empty)
docker-compose run test

# Start dev shell
docker-compose run dev

Project Structure to Create

Phase 0 Deliverables

/home/home/p/g/n/duckdb_ex/
├── Dockerfile                 # CREATE THIS
├── docker-compose.yml         # CREATE THIS
├── mix.exs                    # UPDATE: Add dependencies
├── config/
│   └── config.exs            # CREATE: Basic config
├── lib/
│   └── duckdb_ex/
│       ├── native.ex         # CREATE: NIF wrapper
│       ├── connection.ex     # CREATE: Connection module stub
│       ├── relation.ex       # CREATE: Relation module stub
│       ├── result.ex         # CREATE: Result module stub
│       ├── type.ex           # CREATE: Type module stub
│       └── exceptions.ex     # CREATE: All exception modules
├── native/
│   └── duckdb_nif/
│       ├── Cargo.toml        # CREATE: Rust project
│       └── src/
│           ├── lib.rs        # CREATE: NIF entry point
│           ├── connection.rs # CREATE: Connection resource
│           └── error.rs      # CREATE: Error mapping
├── test/
│   ├── test_helper.exs       # UPDATE: Test setup
│   ├── duckdb_ex_test.exs    # CREATE: Module-level tests
│   ├── connection_test.exs   # CREATE: Connection tests
│   ├── relation_test.exs     # CREATE: Relation tests
│   └── support/
│       └── fixtures/         # CREATE: Test data files
└── docs/
    ├── TECHNICAL_DESIGN.md   # ✓ Already created
    ├── IMPLEMENTATION_ROADMAP.md  # ✓ Already created
    └── PYTHON_API_REFERENCE.md    # ✓ Already created

Dependencies to Add

mix.exs

defmodule DuckdbEx.MixProject do
  use Mix.Project

  def project do
    [
      app: :duckdb_ex,
      version: "0.1.0",
      elixir: "~> 1.18",
      start_permanent: Mix.env() == :prod,
      deps: deps(),

      # Add these
      compilers: [:rustler] ++ Mix.compilers(),
      rustler_crates: [
        duckdb_nif: [
          path: "native/duckdb_nif",
          mode: :release
        ]
      ]
    ]
  end

  def application do
    [
      extra_applications: [:logger]
    ]
  end

  defp deps do
    [
      # NIF framework
      {:rustler, "~> 0.35.0"},

      # Decimal precision
      {:decimal, "~> 2.0"},

      # JSON
      {:jason, "~> 1.4"},

      # Documentation
      {:ex_doc, "~> 0.34", only: :dev, runtime: false},

      # Testing
      {:mox, "~> 1.0", only: :test},
      {:stream_data, "~> 1.0", only: :test},

      # Optional: Explorer integration
      {:explorer, "~> 0.11", optional: true},

      # Optional: Nx integration
      {:nx, "~> 0.9", optional: true}
    ]
  end
end

native/duckdb_nif/Cargo.toml

[package]
name = "duckdb_nif"
version = "0.1.0"
edition = "2021"

[lib]
name = "duckdb_nif"
crate-type = ["cdylib"]

[dependencies]
rustler = "0.35"
duckdb = { version = "1.1", features = ["bundled"] }

Implementation Sequence

STOP! Before You Start

  1. ✅ Have you read ALL three docs files?
  2. ✅ Have you created the Docker environment?
  3. ✅ Does docker-compose run test work?
  4. ✅ Have you added all dependencies to mix.exs?
  5. ✅ Have you initialized the Rust NIF project?

If any answer is NO, STOP and complete that step first.

Phase 0: Infrastructure (START HERE)

Objective: Get the build system working

  1. Create Dockerfile and docker-compose.yml

    • Use templates above
    • Build: docker-compose build
    • Verify: docker-compose run dev
  2. Initialize Rustler

    # In project root
    mix rustler.new duckdb_nif
    
  3. Update mix.exs

    • Add all dependencies listed above
    • Configure Rustler compiler
    • Run: docker-compose run dev mix deps.get
  4. Create Basic NIF

    // native/duckdb_nif/src/lib.rs
    use rustler::{Env, Term};
    
    rustler::init!("Elixir.DuckdbEx.Native", [
        test_nif
    ]);
    
    #[rustler::nif]
    fn test_nif() -> String {
        "NIF is working!".to_string()
    }

    # lib/duckdb_ex/native.ex
    defmodule DuckdbEx.Native do
      use Rustler, otp_app: :duckdb_ex, crate: "duckdb_nif"
    
      def test_nif(), do: :erlang.nif_error(:nif_not_loaded)
    end
  5. Verify Build

    docker-compose run dev mix compile
    docker-compose run dev iex -S mix
    # In IEx:
    iex> DuckdbEx.Native.test_nif()
    "NIF is working!"
    
  6. Create Exception Modules

    # lib/duckdb_ex/exceptions.ex
    defmodule DuckdbEx.Exceptions do
      # Reference: duckdb-python/duckdb/__init__.py
    
      defmodule Error do
        defexception [:message]
      end
    
      # ... create ALL exception types from PYTHON_API_REFERENCE.md
      # Each should be a simple defexception with :message field
    end
  7. Create Module Stubs

    • lib/duckdb_ex/connection.ex - Empty module with @moduledoc
    • lib/duckdb_ex/relation.ex - Empty module with @moduledoc
    • lib/duckdb_ex/result.ex - Empty module with @moduledoc
    • lib/duckdb_ex/type.ex - Empty module with @moduledoc
  8. Create Test Infrastructure

    # test/test_helper.exs
    ExUnit.start()
    
    # Define a Mox mock for the NIF layer; DuckdbEx.NativeBehaviour must exist as a behaviour module
    Mox.defmock(DuckdbEx.MockNative, for: DuckdbEx.NativeBehaviour)
  9. CHECKPOINT: Docker builds, tests run, NIF loads
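
The Mox.defmock call in step 8 refers to DuckdbEx.NativeBehaviour, which this guide does not define. The following is a minimal sketch of what such a behaviour could look like, assuming its callbacks mirror the NIF functions introduced in Phase 1. The file name and the return shapes in the callback specs are assumptions, not something taken from the Python source or the design docs.

```elixir
# lib/duckdb_ex/native_behaviour.ex (hypothetical file, not in the Phase 0 deliverables list)
defmodule DuckdbEx.NativeBehaviour do
  @moduledoc false

  # Callback names mirror the Phase 1 NIFs. The return shapes are assumptions
  # and should be adjusted to whatever the Rust layer actually returns.
  @callback new_connection(path :: String.t()) :: {:ok, reference()} | {:error, term()}
  @callback close_connection(conn :: reference()) :: :ok | {:error, term()}
end
```

With a behaviour like this in place, DuckdbEx.Native could declare `@behaviour DuckdbEx.NativeBehaviour`, and callers could resolve the implementation through something like `Application.get_env(:duckdb_ex, :native, DuckdbEx.Native)` (a hypothetical config key) so that tests can substitute DuckdbEx.MockNative.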

Phase 1: Basic Connection (IMPLEMENT THIS AFTER PHASE 0)

Reference Files:

  • Python: duckdb-python/src/duckdb_py/pyconnection/
  • Tests: duckdb-python/tests/fast/test_connection.py

Step 1.1: Port Connection Tests

Create test/connection_test.exs:

defmodule DuckdbEx.ConnectionTest do
  use ExUnit.Case

  # Reference: duckdb-python/tests/fast/test_connection.py

  describe "connect/2" do
    test "connects to memory database" do
      {:ok, conn} = DuckdbEx.Connection.connect(:memory)
      assert is_reference(conn)
    end

    test "connects to file database" do
      path = "/tmp/test_#{:rand.uniform(10000)}.db"
      {:ok, conn} = DuckdbEx.Connection.connect(path)
      assert is_reference(conn)
      DuckdbEx.Connection.close(conn)
      File.rm(path)
    end

    test "returns error for invalid path" do
      {:error, _reason} = DuckdbEx.Connection.connect("/invalid/path/db.duckdb")
    end

    # Port MORE tests from Python test_connection.py
  end

  describe "close/1" do
    test "closes connection successfully" do
      {:ok, conn} = DuckdbEx.Connection.connect(:memory)
      assert :ok = DuckdbEx.Connection.close(conn)
    end
  end

  # Continue porting tests...
end

RUN TESTS: docker-compose run test. The tests should FAIL because connect/2 is not implemented yet.

Step 1.2: Implement NIF Layer

// native/duckdb_nif/src/connection.rs
use rustler::{Error, ResourceArc};
use duckdb::Connection as DuckDBConnection;
use std::sync::Mutex;

pub struct ConnectionResource {
    pub inner: Mutex<DuckDBConnection>,
}

#[rustler::nif]
pub fn new_connection(path: String) -> Result<ResourceArc<ConnectionResource>, Error> {
    let db_path = if path == ":memory:" {
        ":memory:"
    } else {
        &path
    };

    let conn = DuckDBConnection::open(db_path)
        .map_err(|e| Error::Term(Box::new(format!("Connection error: {}", e))))?;

    Ok(ResourceArc::new(ConnectionResource {
        inner: Mutex::new(conn),
    }))
}

#[rustler::nif]
pub fn close_connection(conn: ResourceArc<ConnectionResource>) -> Result<(), Error> {
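    // Dropping this handle only releases the NIF's own reference. The DuckDB
    // connection itself is closed once the last reference is garbage-collected.
    drop(conn);
    Ok(())
}

// native/duckdb_nif/src/lib.rs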
mod connection;

use rustler::{Env, Term};

fn on_load(env: Env, _info: Term) -> bool {
    rustler::resource!(connection::ConnectionResource, env);
    true
}

rustler::init!(
    "Elixir.DuckdbEx.Native",
    [
        connection::new_connection,
        connection::close_connection,
    ],
    load = on_load
);

Step 1.3: Implement Elixir Wrapper

# lib/duckdb_ex/connection.ex
defmodule DuckdbEx.Connection do
  @moduledoc """
  DuckDB connection management.

  This module provides a faithful port of the DuckDBPyConnection class.

  Reference: duckdb-python/src/duckdb_py/include/duckdb_python/pyconnection/pyconnection.hpp

  ## Examples

      iex> {:ok, conn} = DuckdbEx.Connection.connect(:memory)
      iex> DuckdbEx.Connection.close(conn)
      :ok
  """

  alias DuckdbEx.Native
  alias DuckdbEx.Exceptions

  @type t :: reference()

  @doc """
  Opens a connection to a DuckDB database.

  ## Parameters

  - `database` - Database path, or the atom `:memory` for an in-memory database
  - `opts` - Connection options (keyword list)
    - `:read_only` - Open in read-only mode (default: false)
    - `:config` - Database configuration map

  ## Examples

      {:ok, conn} = DuckdbEx.Connection.connect(:memory)
      {:ok, conn} = DuckdbEx.Connection.connect("/path/to/db.duckdb")
      {:ok, conn} = DuckdbEx.Connection.connect(:memory, config: %{"threads" => 4})

  Reference: duckdb.connect() in Python
  """
  @spec connect(String.t() | :memory, keyword()) :: {:ok, t()} | {:error, term()}
  # Options are accepted but not yet forwarded to the NIF
  def connect(database, _opts \\ []) do
    path = database_path(database)

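    case Native.new_connection(path) do
      {:ok, conn_ref} -> {:ok, conn_ref}
      {:error, reason} -> {:error, %Exceptions.ConnectionException{message: reason}}
      # Rustler encodes Ok(value) as the bare value when the NIF returns Result<T, rustler::Error>
      conn_ref when is_reference(conn_ref) -> {:ok, conn_ref}
    end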
  end

  @doc """
  Closes the database connection.

  Reference: DuckDBPyConnection.close() in Python
  """
  @spec close(t()) :: :ok
  def close(conn) do
    # Closing is idempotent: whatever the NIF returns (`:ok`, `{:ok, _}`, or
    # `{:error, _}` for an already-closed connection), the result is :ok
    _ = Native.close_connection(conn)
    :ok
  end

  defp database_path(:memory), do: ":memory:"
  defp database_path(path) when is_binary(path), do: path
end
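
The @doc for connect/2 lists `:read_only` and `:config`, but the wrapper above does not use them yet. Below is a minimal sketch of how the options could be validated before they are passed on, assuming the defaults from the @doc. The module name ConnectOptsSketch and the normalize/1 function are hypothetical, and how the options reach the NIF is left open.

```elixir
defmodule ConnectOptsSketch do
  @moduledoc false

  # Defaults taken from the option list in the connect/2 documentation.
  @defaults [read_only: false, config: %{}]

  # Returns a map with every option filled in, or raises ArgumentError
  # when an unknown option is passed.
  def normalize(opts) when is_list(opts) do
    opts
    |> Keyword.validate!(@defaults)
    |> Map.new()
  end
end

ConnectOptsSketch.normalize(read_only: true)
# => %{config: %{}, read_only: true}
```

Rejecting unknown keys early keeps typos such as `readonly: true` from being silently ignored. Whether that matches the Python client's handling of unexpected keyword arguments should be checked against the Python source.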

Step 1.4: Run Tests

docker-compose run test
# Should now PASS basic connection tests

Step 1.5: Reference Python for Exact Behavior

Before moving on, verify behavior matches Python:

# In Python
python3 << EOF
import duckdb
conn = duckdb.connect(':memory:')
print(type(conn))
print(conn)
conn.close()
EOF

Compare output with Elixir version. Adjust if needed.

Continue with Each Feature...

For each subsequent feature (execute, fetch, types, relations, etc.):

  1. Read relevant section in PYTHON_API_REFERENCE.md
  2. Port tests from duckdb-python/tests/
  3. Run tests (should fail)
  4. Check Python source for exact implementation
  5. Implement Rust NIF
  6. Implement Elixir wrapper
  7. Run tests (should pass)
  8. Verify against Python behavior
  9. Document any differences

Testing Guidelines

Test Categories

  1. Unit Tests: Test individual functions
  2. Integration Tests: Test complete workflows
  3. Property Tests: Use StreamData for property-based testing
  4. Comparison Tests: Compare results with Python client
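
For the comparison category, one possible approach is to shell out to the Python client and compare its output with the Elixir result. The sketch below assumes python3 with the duckdb package is available on the PATH. The PythonReference module and its functions are hypothetical helpers, not part of the planned API.

```elixir
defmodule PythonReference do
  @moduledoc false

  # One-line Python program: run the SQL passed as argv[1] and print the rows as JSON.
  # default=str keeps values that JSON cannot represent (dates, decimals) printable.
  @script "import duckdb, json, sys; " <>
            "print(json.dumps(duckdb.sql(sys.argv[1]).fetchall(), default=str))"

  # Builds the python3 argument list; kept separate so it can be checked without Python.
  def args(sql) when is_binary(sql), do: ["-c", @script, sql]

  # Returns {:ok, json_string} or {:error, {exit_status, output}}.
  # Raises if no python3 executable can be found.
  def rows_json(sql) do
    case System.cmd("python3", args(sql), stderr_to_stdout: true) do
      {output, 0} -> {:ok, String.trim(output)}
      {output, status} -> {:error, {status, output}}
    end
  end
end
```

A comparison test could decode the returned JSON with Jason and assert that it equals the rows fetched through the Elixir client for the same query. Python tuples arrive as JSON arrays, so the Elixir rows need to be converted to lists before comparing.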

Example Property Test

defmodule DuckdbEx.TypePropertyTest do
  use ExUnit.Case
  use ExUnitProperties

  property "round-trip integer values" do
    check all int <- integer() do
      {:ok, conn} = DuckdbEx.Connection.connect(:memory)
      {:ok, result} = DuckdbEx.Connection.execute(conn, "SELECT ?", [int])
      {:ok, [{returned}]} = DuckdbEx.Result.fetch_all(result)
      assert returned == int
    end
  end
end

Test Fixtures

Copy test data from duckdb-python/tests/:

# Copy CSV/Parquet/JSON fixtures
cp -r duckdb-python/tests/fast/data test/support/fixtures/

Error Handling

NIF Error Mapping

All DuckDB errors must be mapped to appropriate Elixir exceptions:

// native/duckdb_nif/src/error.rs

pub fn map_duckdb_error(err: duckdb::Error) -> String {
    match err {
        duckdb::Error::DuckDBFailure(_, msg) => {
            // The failure message is optional; fall back to an empty string
            let msg = msg.unwrap_or_default();
            // Parse msg to determine exception type
            if msg.contains("Binder Error") {
                format!("BinderException:{}", msg)
            } else if msg.contains("Catalog Error") {
                format!("CatalogException:{}", msg)
            } else {
                // ... map all error types; unrecognized messages fall through here
                format!("Error:{}", msg)
            }
        }
        other => format!("Error:{}", other),
    }
}

# lib/duckdb_ex/native.ex
defmodule DuckdbEx.Native do
  # ...

  @doc false
  def handle_error({:error, error_string}) do
    case String.split(error_string, ":", parts: 2) do
      ["BinderException", msg] ->
        {:error, %DuckdbEx.Exceptions.BinderException{message: msg}}
      ["CatalogException", msg] ->
        {:error, %DuckdbEx.Exceptions.CatalogException{message: msg}}
      # ... handle all exception types
      _ ->
        {:error, %DuckdbEx.Exceptions.Error{message: error_string}}
    end
  end
end
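
As the number of exception types grows, the prefix lookup can also be driven by a table instead of one case clause per type. The following is a sketch of that idea under the same "Prefix:message" convention. ErrorMappingSketch and its nested stand-in exceptions are hypothetical and exist only so the example compiles on its own.

```elixir
defmodule ErrorMappingSketch do
  @moduledoc false

  # Stand-in exception modules so the sketch compiles on its own;
  # the real ones would live under DuckdbEx.Exceptions.
  defmodule Error do
    defexception [:message]
  end

  defmodule BinderException do
    defexception [:message]
  end

  defmodule CatalogException do
    defexception [:message]
  end

  # Prefix emitted by the Rust layer => exception module.
  @exceptions %{
    "BinderException" => BinderException,
    "CatalogException" => CatalogException
  }

  # Splits on the first colon only, so colons inside the message are preserved.
  def to_exception(error_string) when is_binary(error_string) do
    with [prefix, message] <- String.split(error_string, ":", parts: 2),
         {:ok, module} <- Map.fetch(@exceptions, prefix) do
      struct(module, message: message)
    else
      _ -> %Error{message: error_string}
    end
  end
end
```

Strings without a known prefix fall back to the generic Error with the full string as the message, which matches the catch-all clause in handle_error above.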

Documentation Requirements

Every module must have:

defmodule DuckdbEx.SomeModule do
  @moduledoc """
  Brief description of module.

  Longer description explaining purpose and usage.

  Reference: duckdb-python/path/to/corresponding/file.hpp

  ## Examples

      iex> # Working example
      iex> {:ok, result} = DuckdbEx.SomeModule.some_function()
  """

  @doc """
  Function description.

  ## Parameters

  - `param1` - Description
  - `param2` - Description

  ## Returns

  Description of return value

  ## Examples

      iex> DuckdbEx.SomeModule.some_function(arg)
      {:ok, result}

  Reference: Python equivalent function name and location
  """
  @spec some_function(term()) :: {:ok, term()} | {:error, term()}
  def some_function(param) do
    # implementation
  end
end

Common Pitfalls to Avoid

❌ DON'T

  1. Don't guess Python behavior - Always check source
  2. Don't skip tests - TDD is mandatory
  3. Don't implement without reading docs - Read ALL reference docs first
  4. Don't change API without documenting - Any deviation must be justified
  5. Don't use BEAM processes for connections - Use NIF resources
  6. Don't forget error handling - Every NIF call can fail

✅ DO

  1. Do reference Python source constantly
  2. Do port tests before implementing
  3. Do verify behavior against Python
  4. Do use proper type specs
  5. Do document everything
  6. Do write property tests
  7. Do test concurrent access
  8. Do check for memory leaks
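
For the concurrent-access item, a small helper that runs the same function from several processes at once can serve as a starting point. This is a sketch only: ConcurrencySketch and run_concurrently/2 are hypothetical names, and the function passed in stands for whatever operation on a shared connection is being exercised.

```elixir
defmodule ConcurrencySketch do
  @moduledoc false

  # Calls `fun` with each of 1..n from n processes at once and returns the
  # results in input order. A crashing or timed-out task takes the caller
  # down with it, which makes the surrounding test fail.
  def run_concurrently(n, fun) when is_integer(n) and n > 0 and is_function(fun, 1) do
    1..n
    |> Task.async_stream(fun, max_concurrency: n, timeout: 30_000)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end
```

In a real test, the function would typically run a query against one shared connection, for example through DuckdbEx.Connection.execute/3 as used in the property test above, and the assertions would check that every process received a correct result.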

When You Need Help

If you encounter:

  1. Ambiguous Python behavior: Run Python code to clarify
  2. Rust compilation errors: Check Rustler documentation
  3. Type conversion issues: Reference TECHNICAL_DESIGN.md type mapping
  4. Test failures: Compare with equivalent Python test output
  5. Performance issues: Profile and compare with Python

Success Criteria for Each Phase

Before considering a phase complete:

  • [ ] All Python tests ported
  • [ ] All ported tests passing
  • [ ] No memory leaks (test with :observer)
  • [ ] All public functions documented
  • [ ] Type specs complete
  • [ ] Behavior verified against Python
  • [ ] Code reviewed
  • [ ] Integration tests passing

Final Checklist

Before submitting implementation:

  • [ ] Docker environment builds successfully
  • [ ] All tests pass: docker-compose run test
  • [ ] Documentation generates: mix docs
  • [ ] No compiler warnings
  • [ ] Code formatted: mix format
  • [ ] Dialyzer passes (if configured)
  • [ ] CHANGELOG.md updated
  • [ ] Example code in README works

Remember

This is a port, not a redesign. When in doubt:

  1. Check duckdb-python/ source
  2. Run Python version to see behavior
  3. Port that exact behavior to Elixir
  4. Document if you must deviate

Your goal: An Elixir developer should be able to use DuckDB with the exact same semantics as the Python client, just with Elixir syntax.

Good luck! 🦆