Builder API for writing DataFrames using the V2 DataSource API.
Mirrors PySpark's DataFrameWriterV2 with a builder pattern.
Examples
```elixir
import SparkEx.WriterV2

# Create a new table
df
|> SparkEx.DataFrame.write_v2("catalog.db.my_table")
|> using("parquet")
|> table_property("description", "My table")
|> create()

# Append to existing table
df
|> SparkEx.DataFrame.write_v2("my_table")
|> append()

# Overwrite with condition
import SparkEx.Functions, only: [col: 1, lit: 1]

df
|> SparkEx.DataFrame.write_v2("my_table")
|> overwrite(col("date") |> SparkEx.Column.eq(lit("2024-01-01")))
```
Summary
Functions
- Appends the DataFrame's data to the table.
- Sets the clustering columns.
- Creates a new table with the DataFrame's data.
- Creates a table or replaces it if it already exists.
- Alias for create_or_replace/2 (PySpark createOrReplace).
- Sets a single writer option.
- Merges a map of options into the writer.
- Overwrites rows in the table.
- Overwrites rows in the table matching the given condition.
- Overwrites all partitions that the DataFrame touches.
- Sets the partitioning expressions.
- Replaces an existing table with the DataFrame's data.
- Merges a map of table properties.
- Sets a single table property.
- Sets the provider (data source) for the table.
Types
```elixir
@type t() :: %SparkEx.WriterV2{
        cluster_by: [String.t()],
        df: SparkEx.DataFrame.t(),
        options: %{required(String.t()) => String.t()},
        overwrite_condition: term() | nil,
        partitioned_by: [term()],
        provider: String.t() | nil,
        table_name: String.t(),
        table_properties: %{required(String.t()) => String.t()}
      }
```
Functions
Appends the DataFrame's data to the table.
Sets the clustering columns.
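A minimal sketch of how this might be used, assuming the setter is named cluster_by/2 (after the :cluster_by struct field) and accepts a list of column-name strings — both assumptions, as the source does not show the call form:

```elixir
# Hypothetical usage: cluster the new table by two columns before creating it.
# cluster_by/2 and its list-of-strings argument are assumptions based on
# the :cluster_by field in the writer struct.
df
|> SparkEx.DataFrame.write_v2("catalog.db.events")
|> cluster_by(["region", "event_date"])
|> create()
```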
Creates a new table with the DataFrame's data.
Creates a table or replaces it if it already exists.
Alias for create_or_replace/2 (PySpark createOrReplace).
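A usage sketch, assuming the one-argument form create_or_replace/1 can be called on the writer (the /2 arity presumably taking a keyword list of options):

```elixir
# Create the table, or replace it wholesale if it already exists.
df
|> SparkEx.DataFrame.write_v2("catalog.db.daily_metrics")
|> using("parquet")
|> create_or_replace()
```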
Sets a single writer option.
Merges a map of options into the writer.
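By analogy with table_property/3 in the module example, a sketch assuming the two setters are named option/3 and options/2 — the names are assumptions, though the string-keyed, string-valued shape follows the :options field type:

```elixir
# Set a single option, then merge a map of further options.
# option/3 and options/2 are assumed names, not confirmed by the docs.
df
|> SparkEx.DataFrame.write_v2("my_table")
|> option("compression", "zstd")
|> options(%{"mergeSchema" => "true"})
|> append()
```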
Overwrites rows in the table.
Examples

```elixir
# Overwrite all data
overwrite(writer)

# Overwrite with condition
overwrite(writer, col("date") |> SparkEx.Column.eq(lit("2024-01-01")))
```

@spec overwrite(t(), keyword()) :: :ok | {:error, term()}
@spec overwrite(t(), SparkEx.Column.t()) :: :ok | {:error, term()}
Overwrites rows in the table matching the given condition.
@spec overwrite(t(), SparkEx.Column.t(), keyword()) :: :ok | {:error, term()}
Overwrites all partitions that the DataFrame touches.
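A sketch mirroring PySpark's overwritePartitions, assuming the Elixir name is overwrite_partitions/1:

```elixir
# Replace only the partitions present in df; other partitions are untouched.
# overwrite_partitions/1 is an assumed name mirroring PySpark's overwritePartitions.
df
|> SparkEx.DataFrame.write_v2("catalog.db.logs")
|> overwrite_partitions()
```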
Sets the partitioning expressions.
Accepts column expressions (e.g. from SparkEx.Functions.col/1).
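For illustration, assuming the setter is partitioned_by/2 (after the :partitioned_by struct field) and takes a list of column expressions:

```elixir
import SparkEx.Functions, only: [col: 1]

# Partition the new table by the event_date column.
# partitioned_by/2 and its list argument are assumptions based on
# the :partitioned_by field in the writer struct.
df
|> SparkEx.DataFrame.write_v2("catalog.db.sessions")
|> partitioned_by([col("event_date")])
|> create()
```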
Replaces an existing table with the DataFrame's data.
Merges a map of table properties.
Sets a single table property.
Sets the provider (data source) for the table.
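Tying the builder together, an end-to-end sketch using the using/2 and table_property/3 calls shown in the module example (the "iceberg" provider string is illustrative only; any V2 data source name should work):

```elixir
# Declare the provider with using/2, attach metadata, then create the table.
df
|> SparkEx.DataFrame.write_v2("catalog.db.sales")
|> using("iceberg")
|> table_property("description", "Daily sales snapshots")
|> create()
```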