DuckdbEx.Relation (DuckdbEx v0.2.0)

View Source

DuckDB Relational API.

This module implements the DuckDB Relation API, which enables lazy, composable query building with method chaining. Relations are not executed until a fetch operation is called.

Reference: duckdb-python/src/duckdb_py/include/duckdb_python/pyrelation.hpp

Overview

A relation represents a lazy SQL query that can be composed with various operations (filter, project, join, etc.) before execution. This enables:

  • Lazy evaluation: Queries are only executed when results are needed
  • Composability: Operations can be chained in any order
  • Optimization: DuckDB can optimize the entire query tree
  • Reusability: Base relations can be used in multiple query branches

Examples

# Create a relation from SQL
relation = DuckdbEx.Connection.sql(conn, "SELECT * FROM users")

# Chain operations (lazy - not executed yet)
result = relation
|> DuckdbEx.Relation.filter("age > 25")
|> DuckdbEx.Relation.project(["name", "email"])
|> DuckdbEx.Relation.order("name ASC")
|> DuckdbEx.Relation.limit(10)

# Execute and fetch results
{:ok, rows} = DuckdbEx.Relation.fetch_all(result)

Lazy Evaluation

Relations build SQL incrementally. Each operation returns a new relation with updated SQL, but nothing is executed until you call:

This allows DuckDB to optimize the entire query before execution.

Summary

Functions

Performs aggregation on the relation.

Convenience function for AVG aggregation.

Convenience function for COUNT aggregation.

Creates a table from the relation.

Creates a view from the relation.

Performs a cross join (cartesian product) of two relations.

Removes duplicate rows from the relation.

Returns rows in the first relation but not in the second.

Executes the relation and returns the result struct.

Fetches all rows from the relation.

Fetches multiple rows from the relation.

Fetches one row from the relation.

Filters rows based on a condition.

Inserts a row of values into a table relation.

Inserts rows from the relation into an existing table.

Intersects two relations.

Limits the number of rows returned.

Convenience function for MAX aggregation.

Convenience function for MIN aggregation.

Orders (sorts) the rows by specified columns.

Projects (selects) specific columns from the relation.

Sorts rows by one or more columns.

Convenience function for SUM aggregation.

Exports the relation to a CSV file.

Exports the relation to a Parquet file.

Alias for create/2.

Alias for create_view/3.

Unions two relations.

Returns distinct values for the specified columns.

Updates rows in a table relation.

Types

t()

@type t() :: %DuckdbEx.Relation{
  alias: String.t() | nil,
  conn: DuckdbEx.Connection.t(),
  source: term(),
  sql: String.t()
}

Functions

aggregate(relation, expressions, opts \\ [])

@spec aggregate(t(), String.t() | [String.t()], keyword()) :: t()

Performs aggregation on the relation.

Supports both simple aggregations and GROUP BY aggregations. Aggregation expressions can include common functions like COUNT, SUM, AVG, MIN, MAX, and any SQL aggregate function supported by DuckDB.

Parameters

  • relation - The relation to aggregate
  • expressions - Aggregation expression(s) as string or list of strings
  • opts - Options (keyword list)
    • :group_by - List of columns to group by (optional)

Returns

A new relation with aggregation applied

Examples

# Simple aggregation
relation |> DuckdbEx.Relation.aggregate("count(*) as total")

# Multiple aggregations
relation |> DuckdbEx.Relation.aggregate([
  "count(*) as count",
  "sum(amount) as total",
  "avg(amount) as average"
])

# Group by single column
relation
|> DuckdbEx.Relation.aggregate("sum(sales) as total", group_by: ["region"])

# Group by multiple columns
relation
|> DuckdbEx.Relation.aggregate(
  ["sum(sales) as total", "count(*) as count"],
  group_by: ["region", "year"]
)

# Filter after aggregation (HAVING clause)
relation
|> DuckdbEx.Relation.aggregate("sum(amount) as total", group_by: ["category"])
|> DuckdbEx.Relation.filter("total > 1000")

Reference: DuckDBPyRelation.aggregate() in Python

avg(relation, column)

@spec avg(t(), String.t()) :: t()

Convenience function for AVG aggregation.

Returns a relation with an AVG aggregation on the specified column. The result will have a column named "avg".

Parameters

  • relation - The relation to aggregate
  • column - Column name or expression to average

Returns

A new relation with AVG aggregation

Examples

relation |> DuckdbEx.Relation.avg("price")
# Equivalent to: aggregate("avg(price) as avg")

Reference: DuckDBPyRelation.avg() in Python

count(relation)

@spec count(t()) :: t()

Convenience function for COUNT aggregation.

Returns a relation with a COUNT(*) aggregation. The result will have a column named "count".

Parameters

  • relation - The relation to count

Returns

A new relation with COUNT aggregation

Examples

relation |> DuckdbEx.Relation.count()
# Equivalent to: aggregate("count(*) as count")

Reference: DuckDBPyRelation.count() in Python

create(relation, table_name)

@spec create(t(), String.t()) :: :ok | {:error, term()}

Creates a table from the relation.

create_view(relation, view_name, opts \\ [])

@spec create_view(t(), String.t(), keyword()) :: :ok | {:error, term()}

Creates a view from the relation.

cross(relation1, relation2)

@spec cross(t(), t()) :: t()

Performs a cross join (cartesian product) of two relations.

Returns a new relation containing all possible combinations of rows from both relations.

Parameters

  • relation1 - First relation
  • relation2 - Second relation

Returns

A new relation with the cross join

Examples

rel1 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (1), (2)) t(x)")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (3), (4)) t(y)")
crossed = DuckdbEx.Relation.cross(rel1, rel2)
# Returns 4 rows: (1,3), (1,4), (2,3), (2,4)

Notes

Cross joins can produce very large result sets (rows1 × rows2). Use with caution on large relations.

Reference: DuckDBPyRelation.cross() in Python

distinct(relation)

@spec distinct(t()) :: t()

Removes duplicate rows from the relation.

Equivalent to SQL DISTINCT. Returns a new relation with duplicate rows removed.

Parameters

  • relation - The relation to remove duplicates from

Returns

A new relation with DISTINCT applied

Examples

# Remove duplicate rows
relation |> DuckdbEx.Relation.distinct()

# Chain with other operations
relation
|> DuckdbEx.Relation.filter("age > 25")
|> DuckdbEx.Relation.distinct()
|> DuckdbEx.Relation.order("name ASC")

Reference: DuckDBPyRelation.distinct() in Python

except_(relation1, relation2)

@spec except_(t(), t()) :: t()

Returns rows in the first relation but not in the second.

Equivalent to SQL EXCEPT. Returns a new relation containing rows from the first relation that are not in the second relation.

Parameters

  • relation1 - First relation
  • relation2 - Second relation to exclude

Returns

A new relation with the difference

Examples

# Find rows only in first relation
rel1 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (1), (2), (3)) t(x)")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (2), (3), (4)) t(x)")
diff = DuckdbEx.Relation.except_(rel1, rel2)
# Returns row with x = 1

Notes

Both relations must have the same number of columns and compatible types.

The function is named except_ (with underscore) because except is a reserved keyword in Elixir.

Reference: DuckDBPyRelation.except_() in Python

execute(relation)

@spec execute(t()) :: {:ok, term()} | {:error, term()}

Executes the relation and returns the result struct.

This triggers query execution and returns the raw result structure. For most use cases, prefer fetch_all/1 or fetch_one/1.

Parameters

  • relation - The relation to execute

Returns

  • {:ok, result} - Result struct with rows and metadata
  • {:error, exception} - Execution failed

Examples

{:ok, result} = DuckdbEx.Relation.execute(relation)

Reference: DuckDBPyRelation.execute() in Python

fetch_all(relation)

@spec fetch_all(t()) :: {:ok, [tuple()]} | {:error, term()}

Fetches all rows from the relation.

Executes the relation query and returns all rows as a list of tuples.

Parameters

  • relation - The relation to fetch from

Returns

  • {:ok, rows} - List of row tuples
  • {:error, exception} - Execution failed

Examples

{:ok, rows} = DuckdbEx.Relation.fetch_all(relation)
# => [{1, "Alice"}, ...]

Reference: DuckDBPyRelation.fetchall() in Python

fetch_many(relation, n)

@spec fetch_many(t(), pos_integer()) :: {:ok, [tuple()]} | {:error, term()}

Fetches multiple rows from the relation.

Executes the relation query and returns the first N rows.

Parameters

  • relation - The relation to fetch from
  • n - Number of rows to fetch

Returns

  • {:ok, rows} - List of row tuples (up to N rows)
  • {:error, exception} - Execution failed

Examples

{:ok, rows} = DuckdbEx.Relation.fetch_many(relation, 5)
# => [{1}, {2}, ...]

Reference: DuckDBPyRelation.fetchmany() in Python

fetch_one(relation)

@spec fetch_one(t()) :: {:ok, tuple() | nil} | {:error, term()}

Fetches one row from the relation.

Executes the relation query and returns the first row, or nil if the result is empty.

Parameters

  • relation - The relation to fetch from

Returns

  • {:ok, row} - Row tuple or nil if empty
  • {:error, exception} - Execution failed

Examples

{:ok, row} = DuckdbEx.Relation.fetch_one(relation)
# => {1, "Alice"}

# Empty result
{:ok, nil} = DuckdbEx.Relation.fetch_one(empty_relation)

Reference: DuckDBPyRelation.fetchone() in Python

filter(relation, condition)

@spec filter(t(), String.t()) :: t()

Filters rows based on a condition.

Equivalent to SQL WHERE clause. Returns a new relation with only rows that satisfy the condition.

Parameters

  • relation - The relation to filter
  • condition - SQL WHERE condition as string

Examples

# Simple condition
relation |> DuckdbEx.Relation.filter("age > 25")

# Complex condition
relation |> DuckdbEx.Relation.filter("age > 25 AND status = 'active'")

# Chain multiple filters (AND logic)
relation
|> DuckdbEx.Relation.filter("age > 25")
|> DuckdbEx.Relation.filter("status = 'active'")

Notes

Multiple filters are combined with AND. Each call to filter adds another condition to the WHERE clause.

Reference: DuckDBPyRelation.filter() in Python

insert(relation, values)

@spec insert(t(), list() | tuple() | term()) :: :ok | {:error, term()}

Inserts a row of values into a table relation.

insert_into(relation, table_name)

@spec insert_into(t(), String.t()) :: :ok | {:error, term()}

Inserts rows from the relation into an existing table.

intersect(relation1, relation2)

@spec intersect(t(), t()) :: t()

Intersects two relations.

Equivalent to SQL INTERSECT. Returns a new relation containing only rows that appear in both relations.

Parameters

  • relation1 - First relation
  • relation2 - Second relation

Returns

A new relation with the intersection

Examples

# Find common rows
rel1 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (1), (2), (3)) t(x)")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (2), (3), (4)) t(x)")
common = DuckdbEx.Relation.intersect(rel1, rel2)
# Returns rows with x = 2 and x = 3

Notes

Both relations must have the same number of columns and compatible types.

Reference: DuckDBPyRelation.intersect() in Python

join(relation1, relation2, condition, opts \\ [])

@spec join(t(), t(), String.t(), keyword()) :: t()

Joins two relations.

Supports various join types: inner, left, right, and outer joins. Returns a new relation combining rows from both relations based on the join condition.

Parameters

  • relation1 - First relation (left side)
  • relation2 - Second relation (right side)
  • condition - Join condition as SQL string
  • opts - Options (keyword list)
    • :type - Join type (:inner, :left, :right, :outer), defaults to :inner

Returns

A new relation with the join applied

Examples

# Inner join
users = DuckdbEx.Connection.table(conn, "users")
orders = DuckdbEx.Connection.table(conn, "orders")
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id")

# Left join
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id", type: :left)

# Right join
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id", type: :right)

# Outer join
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id", type: :outer)

# Chain multiple joins
users
|> DuckdbEx.Relation.join(orders, "users.id = orders.user_id")
|> DuckdbEx.Relation.join(products, "orders.product_id = products.id")

Notes

The join condition should reference columns with table names or aliases to avoid ambiguity.

Reference: DuckDBPyRelation.join() in Python

limit(relation, n, offset \\ 0)

@spec limit(t(), non_neg_integer(), non_neg_integer()) :: t()

Limits the number of rows returned.

Equivalent to SQL LIMIT clause. Returns a new relation that will return at most n rows when executed.

Parameters

  • relation - The relation to limit
  • n - Maximum number of rows to return
  • offset - Number of rows to skip before returning results (default: 0)

Examples

# Get first 10 rows
relation |> DuckdbEx.Relation.limit(10)

# Skip the first 5 rows
relation |> DuckdbEx.Relation.limit(10, 5)

# Combine with order for top-N queries
relation
|> DuckdbEx.Relation.order("score DESC")
|> DuckdbEx.Relation.limit(5)

Reference: DuckDBPyRelation.limit() in Python

max(relation, column)

@spec max(t(), String.t()) :: t()

Convenience function for MAX aggregation.

Returns a relation with a MAX aggregation on the specified column. The result will have a column named "max".

Parameters

  • relation - The relation to aggregate
  • column - Column name or expression to find maximum

Returns

A new relation with MAX aggregation

Examples

relation |> DuckdbEx.Relation.max("score")
# Equivalent to: aggregate("max(score) as max")

Reference: DuckDBPyRelation.max() in Python

min(relation, column)

@spec min(t(), String.t()) :: t()

Convenience function for MIN aggregation.

Returns a relation with a MIN aggregation on the specified column. The result will have a column named "min".

Parameters

  • relation - The relation to aggregate
  • column - Column name or expression to find minimum

Returns

A new relation with MIN aggregation

Examples

relation |> DuckdbEx.Relation.min("temperature")
# Equivalent to: aggregate("min(temperature) as min")

Reference: DuckDBPyRelation.min() in Python

new(conn, sql, relation_alias \\ nil, source \\ nil)

@spec new(DuckdbEx.Connection.t(), String.t(), String.t() | nil, term()) :: t()

Creates a new relation.

This is typically called internally by Connection functions like sql/2 or table/2. Users should not need to call this directly.

Parameters

  • conn - Database connection
  • sql - SQL query string
  • alias - Optional table alias

Returns

A new %DuckdbEx.Relation{} struct

order(relation, order_by)

@spec order(t(), String.t()) :: t()

Orders (sorts) the rows by specified columns.

Equivalent to SQL ORDER BY clause. Returns a new relation with rows sorted according to the order specification.

Parameters

  • relation - The relation to order
  • order_by - ORDER BY specification as string

Examples

# Single column ascending
relation |> DuckdbEx.Relation.order("name ASC")

# Single column descending
relation |> DuckdbEx.Relation.order("age DESC")

# Multiple columns
relation |> DuckdbEx.Relation.order("department ASC, salary DESC")

# Expression
relation |> DuckdbEx.Relation.order("length(name) DESC")

Reference: DuckDBPyRelation.order() in Python

project(relation, columns)

@spec project(t(), [String.t()]) :: t()

Projects (selects) specific columns from the relation.

Equivalent to SQL SELECT clause. Returns a new relation with only the specified columns or expressions.

Parameters

  • relation - The relation to project from
  • columns - List of column names or expressions

Examples

# Select specific columns
relation |> DuckdbEx.Relation.project(["name", "email"])

# Use expressions
relation |> DuckdbEx.Relation.project(["id", "upper(name) as upper_name"])

# Calculations
relation |> DuckdbEx.Relation.project(["price", "price * 1.1 as price_with_tax"])

Reference: DuckDBPyRelation.project() in Python

sort(relation, columns)

@spec sort(t(), String.t() | [String.t()]) :: t()

Sorts rows by one or more columns.

This is an alias for order/2, accepting either a list of column expressions or a single ORDER BY string.

sum(relation, column)

@spec sum(t(), String.t()) :: t()

Convenience function for SUM aggregation.

Returns a relation with a SUM aggregation on the specified column. The result will have a column named "sum".

Parameters

  • relation - The relation to aggregate
  • column - Column name or expression to sum

Returns

A new relation with SUM aggregation

Examples

relation |> DuckdbEx.Relation.sum("amount")
# Equivalent to: aggregate("sum(amount) as sum")

Reference: DuckDBPyRelation.sum() in Python

to_csv(relation, filename, opts \\ [])

@spec to_csv(t(), String.t(), keyword() | map()) :: :ok | {:error, term()}

Exports the relation to a CSV file.

to_parquet(relation, filename, opts \\ [])

@spec to_parquet(t(), String.t(), keyword() | map()) :: :ok | {:error, term()}

Exports the relation to a Parquet file.

to_table(relation, table_name)

@spec to_table(t(), String.t()) :: :ok | {:error, term()}

Alias for create/2.

to_view(relation, view_name, opts \\ [])

@spec to_view(t(), String.t(), keyword()) :: :ok | {:error, term()}

Alias for create_view/3.

union(relation1, relation2)

@spec union(t(), t()) :: t()

Unions two relations.

Equivalent to SQL UNION. Returns a new relation combining rows from both relations, with duplicates removed.

Parameters

  • relation1 - First relation
  • relation2 - Second relation

Returns

A new relation with both relations unioned

Examples

# Union two relations
rel1 = DuckdbEx.Connection.sql(conn, "SELECT 1 as x")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT 2 as x")
combined = DuckdbEx.Relation.union(rel1, rel2)

# Can be chained
rel1
|> DuckdbEx.Relation.union(rel2)
|> DuckdbEx.Relation.union(rel3)

Notes

Both relations must have the same number of columns and compatible types. UNION automatically removes duplicates. Use UNION ALL if you want to keep duplicates.

Reference: DuckDBPyRelation.union() in Python

unique(relation, columns)

@spec unique(t(), String.t() | [String.t()]) :: t()

Returns distinct values for the specified columns.

update(relation, set, condition \\ nil)

@spec update(t(), map(), String.t() | nil) :: :ok | {:error, term()}

Updates rows in a table relation.