DuckdbEx.Relation (DuckdbEx v0.2.0)
View SourceDuckDB Relational API.
This module implements the DuckDB Relation API, which enables lazy, composable query building with method chaining. Relations are not executed until a fetch operation is called.
Reference: duckdb-python/src/duckdb_py/include/duckdb_python/pyrelation.hpp
Overview
A relation represents a lazy SQL query that can be composed with various operations (filter, project, join, etc.) before execution. This enables:
- Lazy evaluation: Queries are only executed when results are needed
- Composability: Operations can be chained in any order
- Optimization: DuckDB can optimize the entire query tree
- Reusability: Base relations can be used in multiple query branches
Examples
# Create a relation from SQL
relation = DuckdbEx.Connection.sql(conn, "SELECT * FROM users")
# Chain operations (lazy - not executed yet)
result = relation
|> DuckdbEx.Relation.filter("age > 25")
|> DuckdbEx.Relation.project(["name", "email"])
|> DuckdbEx.Relation.order("name ASC")
|> DuckdbEx.Relation.limit(10)
# Execute and fetch results
{:ok, rows} = DuckdbEx.Relation.fetch_all(result)Lazy Evaluation
Relations build SQL incrementally. Each operation returns a new relation with updated SQL, but nothing is executed until you call:
fetch_all/1- Fetch all rowsfetch_one/1- Fetch first rowexecute/1- Execute and return result struct
This allows DuckDB to optimize the entire query before execution.
Summary
Functions
Performs aggregation on the relation.
Convenience function for AVG aggregation.
Convenience function for COUNT aggregation.
Creates a table from the relation.
Creates a view from the relation.
Performs a cross join (cartesian product) of two relations.
Removes duplicate rows from the relation.
Returns rows in the first relation but not in the second.
Executes the relation and returns the result struct.
Fetches all rows from the relation.
Fetches multiple rows from the relation.
Fetches one row from the relation.
Filters rows based on a condition.
Inserts a row of values into a table relation.
Inserts rows from the relation into an existing table.
Intersects two relations.
Joins two relations.
Limits the number of rows returned.
Convenience function for MAX aggregation.
Convenience function for MIN aggregation.
Creates a new relation.
Orders (sorts) the rows by specified columns.
Projects (selects) specific columns from the relation.
Sorts rows by one or more columns.
Convenience function for SUM aggregation.
Exports the relation to a CSV file.
Exports the relation to a Parquet file.
Alias for create/2.
Alias for create_view/3.
Unions two relations.
Returns distinct values for the specified columns.
Updates rows in a table relation.
Types
@type t() :: %DuckdbEx.Relation{ alias: String.t() | nil, conn: DuckdbEx.Connection.t(), source: term(), sql: String.t() }
Functions
Performs aggregation on the relation.
Supports both simple aggregations and GROUP BY aggregations. Aggregation expressions can include common functions like COUNT, SUM, AVG, MIN, MAX, and any SQL aggregate function supported by DuckDB.
Parameters
relation- The relation to aggregateexpressions- Aggregation expression(s) as string or list of stringsopts- Options (keyword list):group_by- List of columns to group by (optional)
Returns
A new relation with aggregation applied
Examples
# Simple aggregation
relation |> DuckdbEx.Relation.aggregate("count(*) as total")
# Multiple aggregations
relation |> DuckdbEx.Relation.aggregate([
"count(*) as count",
"sum(amount) as total",
"avg(amount) as average"
])
# Group by single column
relation
|> DuckdbEx.Relation.aggregate("sum(sales) as total", group_by: ["region"])
# Group by multiple columns
relation
|> DuckdbEx.Relation.aggregate(
["sum(sales) as total", "count(*) as count"],
group_by: ["region", "year"]
)
# Filter after aggregation (HAVING clause)
relation
|> DuckdbEx.Relation.aggregate("sum(amount) as total", group_by: ["category"])
|> DuckdbEx.Relation.filter("total > 1000")Reference: DuckDBPyRelation.aggregate() in Python
Convenience function for AVG aggregation.
Returns a relation with an AVG aggregation on the specified column. The result will have a column named "avg".
Parameters
relation- The relation to aggregatecolumn- Column name or expression to average
Returns
A new relation with AVG aggregation
Examples
relation |> DuckdbEx.Relation.avg("price")
# Equivalent to: aggregate("avg(price) as avg")Reference: DuckDBPyRelation.avg() in Python
Convenience function for COUNT aggregation.
Returns a relation with a COUNT(*) aggregation. The result will have a column named "count".
Parameters
relation- The relation to count
Returns
A new relation with COUNT aggregation
Examples
relation |> DuckdbEx.Relation.count()
# Equivalent to: aggregate("count(*) as count")Reference: DuckDBPyRelation.count() in Python
Creates a table from the relation.
Creates a view from the relation.
Performs a cross join (cartesian product) of two relations.
Returns a new relation containing all possible combinations of rows from both relations.
Parameters
relation1- First relationrelation2- Second relation
Returns
A new relation with the cross join
Examples
rel1 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (1), (2)) t(x)")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (3), (4)) t(y)")
crossed = DuckdbEx.Relation.cross(rel1, rel2)
# Returns 4 rows: (1,3), (1,4), (2,3), (2,4)Notes
Cross joins can produce very large result sets (rows1 × rows2). Use with caution on large relations.
Reference: DuckDBPyRelation.cross() in Python
Removes duplicate rows from the relation.
Equivalent to SQL DISTINCT. Returns a new relation with duplicate rows removed.
Parameters
relation- The relation to remove duplicates from
Returns
A new relation with DISTINCT applied
Examples
# Remove duplicate rows
relation |> DuckdbEx.Relation.distinct()
# Chain with other operations
relation
|> DuckdbEx.Relation.filter("age > 25")
|> DuckdbEx.Relation.distinct()
|> DuckdbEx.Relation.order("name ASC")Reference: DuckDBPyRelation.distinct() in Python
Returns rows in the first relation but not in the second.
Equivalent to SQL EXCEPT. Returns a new relation containing rows from the first relation that are not in the second relation.
Parameters
relation1- First relationrelation2- Second relation to exclude
Returns
A new relation with the difference
Examples
# Find rows only in first relation
rel1 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (1), (2), (3)) t(x)")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (2), (3), (4)) t(x)")
diff = DuckdbEx.Relation.except_(rel1, rel2)
# Returns row with x = 1Notes
Both relations must have the same number of columns and compatible types.
The function is named except_ (with underscore) because except is a
reserved keyword in Elixir.
Reference: DuckDBPyRelation.except_() in Python
Executes the relation and returns the result struct.
This triggers query execution and returns the raw result structure.
For most use cases, prefer fetch_all/1 or fetch_one/1.
Parameters
relation- The relation to execute
Returns
{:ok, result}- Result struct with rows and metadata{:error, exception}- Execution failed
Examples
{:ok, result} = DuckdbEx.Relation.execute(relation)Reference: DuckDBPyRelation.execute() in Python
Fetches all rows from the relation.
Executes the relation query and returns all rows as a list of tuples.
Parameters
relation- The relation to fetch from
Returns
{:ok, rows}- List of row tuples{:error, exception}- Execution failed
Examples
{:ok, rows} = DuckdbEx.Relation.fetch_all(relation)
# => [{1, "Alice"}, ...]Reference: DuckDBPyRelation.fetchall() in Python
@spec fetch_many(t(), pos_integer()) :: {:ok, [tuple()]} | {:error, term()}
Fetches multiple rows from the relation.
Executes the relation query and returns the first N rows.
Parameters
relation- The relation to fetch fromn- Number of rows to fetch
Returns
{:ok, rows}- List of row tuples (up to N rows){:error, exception}- Execution failed
Examples
{:ok, rows} = DuckdbEx.Relation.fetch_many(relation, 5)
# => [{1}, {2}, ...]Reference: DuckDBPyRelation.fetchmany() in Python
Fetches one row from the relation.
Executes the relation query and returns the first row, or nil if the result is empty.
Parameters
relation- The relation to fetch from
Returns
{:ok, row}- Row tuple or nil if empty{:error, exception}- Execution failed
Examples
{:ok, row} = DuckdbEx.Relation.fetch_one(relation)
# => {1, "Alice"}
# Empty result
{:ok, nil} = DuckdbEx.Relation.fetch_one(empty_relation)Reference: DuckDBPyRelation.fetchone() in Python
Filters rows based on a condition.
Equivalent to SQL WHERE clause. Returns a new relation with only rows that satisfy the condition.
Parameters
relation- The relation to filtercondition- SQL WHERE condition as string
Examples
# Simple condition
relation |> DuckdbEx.Relation.filter("age > 25")
# Complex condition
relation |> DuckdbEx.Relation.filter("age > 25 AND status = 'active'")
# Chain multiple filters (AND logic)
relation
|> DuckdbEx.Relation.filter("age > 25")
|> DuckdbEx.Relation.filter("status = 'active'")Notes
Multiple filters are combined with AND. Each call to filter adds another condition to the WHERE clause.
Reference: DuckDBPyRelation.filter() in Python
Inserts a row of values into a table relation.
Inserts rows from the relation into an existing table.
Intersects two relations.
Equivalent to SQL INTERSECT. Returns a new relation containing only rows that appear in both relations.
Parameters
relation1- First relationrelation2- Second relation
Returns
A new relation with the intersection
Examples
# Find common rows
rel1 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (1), (2), (3)) t(x)")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT * FROM (VALUES (2), (3), (4)) t(x)")
common = DuckdbEx.Relation.intersect(rel1, rel2)
# Returns rows with x = 2 and x = 3Notes
Both relations must have the same number of columns and compatible types.
Reference: DuckDBPyRelation.intersect() in Python
Joins two relations.
Supports various join types: inner, left, right, and outer joins. Returns a new relation combining rows from both relations based on the join condition.
Parameters
relation1- First relation (left side)relation2- Second relation (right side)condition- Join condition as SQL stringopts- Options (keyword list):type- Join type (:inner,:left,:right,:outer), defaults to:inner
Returns
A new relation with the join applied
Examples
# Inner join
users = DuckdbEx.Connection.table(conn, "users")
orders = DuckdbEx.Connection.table(conn, "orders")
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id")
# Left join
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id", type: :left)
# Right join
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id", type: :right)
# Outer join
joined = DuckdbEx.Relation.join(users, orders, "users.id = orders.user_id", type: :outer)
# Chain multiple joins
users
|> DuckdbEx.Relation.join(orders, "users.id = orders.user_id")
|> DuckdbEx.Relation.join(products, "orders.product_id = products.id")Notes
The join condition should reference columns with table names or aliases to avoid ambiguity.
Reference: DuckDBPyRelation.join() in Python
@spec limit(t(), non_neg_integer(), non_neg_integer()) :: t()
Limits the number of rows returned.
Equivalent to SQL LIMIT clause. Returns a new relation that will return
at most n rows when executed.
Parameters
relation- The relation to limitn- Maximum number of rows to returnoffset- Number of rows to skip before returning results (default: 0)
Examples
# Get first 10 rows
relation |> DuckdbEx.Relation.limit(10)
# Skip the first 5 rows
relation |> DuckdbEx.Relation.limit(10, 5)
# Combine with order for top-N queries
relation
|> DuckdbEx.Relation.order("score DESC")
|> DuckdbEx.Relation.limit(5)Reference: DuckDBPyRelation.limit() in Python
Convenience function for MAX aggregation.
Returns a relation with a MAX aggregation on the specified column. The result will have a column named "max".
Parameters
relation- The relation to aggregatecolumn- Column name or expression to find maximum
Returns
A new relation with MAX aggregation
Examples
relation |> DuckdbEx.Relation.max("score")
# Equivalent to: aggregate("max(score) as max")Reference: DuckDBPyRelation.max() in Python
Convenience function for MIN aggregation.
Returns a relation with a MIN aggregation on the specified column. The result will have a column named "min".
Parameters
relation- The relation to aggregatecolumn- Column name or expression to find minimum
Returns
A new relation with MIN aggregation
Examples
relation |> DuckdbEx.Relation.min("temperature")
# Equivalent to: aggregate("min(temperature) as min")Reference: DuckDBPyRelation.min() in Python
@spec new(DuckdbEx.Connection.t(), String.t(), String.t() | nil, term()) :: t()
Creates a new relation.
This is typically called internally by Connection functions like sql/2 or
table/2. Users should not need to call this directly.
Parameters
conn- Database connectionsql- SQL query stringalias- Optional table alias
Returns
A new %DuckdbEx.Relation{} struct
Orders (sorts) the rows by specified columns.
Equivalent to SQL ORDER BY clause. Returns a new relation with rows sorted according to the order specification.
Parameters
relation- The relation to orderorder_by- ORDER BY specification as string
Examples
# Single column ascending
relation |> DuckdbEx.Relation.order("name ASC")
# Single column descending
relation |> DuckdbEx.Relation.order("age DESC")
# Multiple columns
relation |> DuckdbEx.Relation.order("department ASC, salary DESC")
# Expression
relation |> DuckdbEx.Relation.order("length(name) DESC")Reference: DuckDBPyRelation.order() in Python
Projects (selects) specific columns from the relation.
Equivalent to SQL SELECT clause. Returns a new relation with only the specified columns or expressions.
Parameters
relation- The relation to project fromcolumns- List of column names or expressions
Examples
# Select specific columns
relation |> DuckdbEx.Relation.project(["name", "email"])
# Use expressions
relation |> DuckdbEx.Relation.project(["id", "upper(name) as upper_name"])
# Calculations
relation |> DuckdbEx.Relation.project(["price", "price * 1.1 as price_with_tax"])Reference: DuckDBPyRelation.project() in Python
Sorts rows by one or more columns.
This is an alias for order/2, accepting either a list of column expressions
or a single ORDER BY string.
Convenience function for SUM aggregation.
Returns a relation with a SUM aggregation on the specified column. The result will have a column named "sum".
Parameters
relation- The relation to aggregatecolumn- Column name or expression to sum
Returns
A new relation with SUM aggregation
Examples
relation |> DuckdbEx.Relation.sum("amount")
# Equivalent to: aggregate("sum(amount) as sum")Reference: DuckDBPyRelation.sum() in Python
Exports the relation to a CSV file.
Exports the relation to a Parquet file.
Alias for create/2.
Alias for create_view/3.
Unions two relations.
Equivalent to SQL UNION. Returns a new relation combining rows from both relations, with duplicates removed.
Parameters
relation1- First relationrelation2- Second relation
Returns
A new relation with both relations unioned
Examples
# Union two relations
rel1 = DuckdbEx.Connection.sql(conn, "SELECT 1 as x")
rel2 = DuckdbEx.Connection.sql(conn, "SELECT 2 as x")
combined = DuckdbEx.Relation.union(rel1, rel2)
# Can be chained
rel1
|> DuckdbEx.Relation.union(rel2)
|> DuckdbEx.Relation.union(rel3)Notes
Both relations must have the same number of columns and compatible types. UNION automatically removes duplicates. Use UNION ALL if you want to keep duplicates.
Reference: DuckDBPyRelation.union() in Python
Returns distinct values for the specified columns.
Updates rows in a table relation.