C Data Interface Guide
View SourceThe Arrow C Data Interface (CDI) is a standardised C ABI that lets Arrow implementations transfer data without serialisation — no IPC bytes, no copies, just raw C struct pointers shared between runtimes that both live in the same OS process.
ExArrow v0.4 introduces ExArrow.CDI which exposes the full CDI export/import
cycle from Elixir.
How it works
ExArrow RecordBatch
│
▼ cdi_export (arrow-rs to_ffi)
FFI_ArrowSchema + FFI_ArrowArray (heap-allocated C structs)
│
▼ schema_ptr / array_ptr
integer addresses (uintptr_t cast to u64)
│
▼ external CDI consumer (future Explorer, Polars, DuckDB, …)
zero-copy import into the consumer's Arrow runtimeWhen both ExArrow and the consuming library are loaded into the same BEAM process, the C structs are valid shared memory — no network, no file, no binary copy is needed.
Within ExArrow (round-trip)
The simplest use is a within-ExArrow round-trip, which is also a useful correctness test:
{:ok, batch} = ExArrow.IPC.Reader.from_file("trades.arrow") |> then(&ExArrow.Stream.next/1)
{:ok, handle} = ExArrow.CDI.export(batch)
{:ok, batch2} = ExArrow.CDI.import(handle)
ExArrow.RecordBatch.num_rows(batch2) #=> same as batchexport/1 allocates FFI_ArrowArray and FFI_ArrowSchema on the heap and
wraps them in a BEAM-managed resource handle. import/1 consumes the handle,
rebuilds the RecordBatch, and safely releases all native memory.
With an external CDI consumer
Any CDI-compatible library loaded in the same BEAM process can import the raw C struct pointers:
{:ok, handle} = ExArrow.CDI.export(batch)
{schema_ptr, array_ptr} = ExArrow.CDI.pointers(handle)
# Hand the integer addresses to the external consumer.
# Keep `handle` alive (in scope) until the consumer has finished importing!
SomeLib.import_arrow_cdi(schema_ptr, array_ptr)
# Tell ExArrow the consumer has taken ownership (called release internally).
:ok = ExArrow.CDI.mark_consumed(handle)After mark_consumed/1 the BEAM GC will drop the handle without calling the
Arrow release callback a second time, preventing a double-free.
Memory safety guarantees
| Scenario | What happens |
|---|---|
import/1 called — ExArrow consumes the handle | Pointers atomically swapped to null; Drop is a no-op |
mark_consumed/1 called — external consumer took the data | Same as above |
| Handle GC'd without import or mark_consumed | Drop calls Arrow release callbacks; underlying data freed |
External consumer already called release (null'd the callback) | Drop sees null release; no double-free |
Explorer CDI path (roadmap)
ExArrow.Explorer currently uses an IPC binary round-trip. The CDI module
lays the groundwork for a zero-copy path that will activate automatically once
Explorer exposes a CDI import API. No code changes in user applications will
be required — the bridge will detect CDI availability at compile time and
choose the fastest available path.
See also
- Nx guide —
ExArrow.Nx.from_tensors/1and the full Nx tensor bridge API. - Parquet guide — lazy row-group streaming introduced in v0.4.