Dux.Flame (Dux v0.3.0)

Spin up Dux workers on FLAME runners for distributed queries.

FLAME handles pool management and machine lifecycle — see the FLAME docs for pool configuration. This module provides spin_up/2 to place Dux.Remote.Worker processes on FLAME runners, and status/1 to inspect the cluster.

Livebook

# 1. Start a FLAME pool (see FLAME docs for backend options)
Kino.start_child!(
  {FLAME.Pool,
    name: :dux_pool,
    code_sync: [
      start_apps: true,
      sync_beams: [Path.join(System.tmp_dir!(), "livebook_runtime")]
    ],
    min: 0,
    max: 10,
    max_concurrency: 1,
    backend: {FLAME.FlyBackend,
      cpu_kind: "performance", cpus: 4, memory_mb: 8192,
      token: System.fetch_env!("FLY_API_TOKEN"),
      env: %{"LIVEBOOK_COOKIE" => Atom.to_string(Node.get_cookie())}
    },
    boot_timeout: 120_000,
    idle_shutdown_after: :timer.minutes(5)}
)

# 2. Spin up workers and distribute
workers = Dux.Flame.spin_up(5, pool: :dux_pool)

Dux.from_parquet("s3://bucket/data/**/*.parquet")
|> Dux.distribute(workers)
|> Dux.filter(amount > 100)
|> Dux.group_by(:region)
|> Dux.summarise(total: sum(amount))
|> Dux.compute()

Deployed app

# In your application supervisor children:
{FLAME.Pool,
  name: :dux_pool,
  backend: {FLAME.FlyBackend, ...},
  max: 10,
  max_concurrency: 1,
  code_sync: [start_apps: [:dux], copy_apps: true],
  idle_shutdown_after: :timer.minutes(5)}

# Then at runtime:
workers = Dux.Flame.spin_up(5, pool: :dux_pool)

Workers read S3 data directly — no data flows through the calling node. After the idle timeout, FLAME terminates the runners automatically.

Summary

Functions

spin_up(n, opts \\ [])

Spin up n Dux workers on FLAME runners.

status(workers)

Get status of the Dux worker cluster.

Functions

spin_up(n, opts \\ [])

Spin up n Dux workers on FLAME runners.

Each worker is placed on a separate FLAME runner via FLAME.place_child/3. Returns a list of worker PIDs suitable for Dux.distribute/2.

Options

  • :pool — FLAME pool name (default: Dux.FlamePool)
  • :setup — callback function to run on each worker after startup
  • :memory_limit — DuckDB memory limit per worker (e.g. "2GB")
  • :temp_directory — spill-to-disk directory (default: system temp)
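A sketch combining these options. The option names come from the list above; the body of the `:setup` callback is a hypothetical example — replace it with whatever per-worker initialization your workload needs.

```elixir
workers =
  Dux.Flame.spin_up(4,
    pool: :dux_pool,
    # cap DuckDB memory on each worker
    memory_limit: "2GB",
    # spill large aggregations to local disk instead of system temp
    temp_directory: "/tmp/dux_spill",
    # runs on each worker after startup (hypothetical credential setup)
    setup: fn ->
      System.put_env("AWS_REGION", "us-east-1")
    end
  )
```

The returned `workers` list can be passed straight to `Dux.distribute/2`, as in the Livebook example above.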

status(workers)

Get status of the Dux worker cluster.

Pass the workers list returned by spin_up/2.

workers = Dux.Flame.spin_up(3, pool: :dux_pool)
Dux.Flame.status(workers)
# => %{total_workers: 3, nodes: %{:"flame-abc@..." => 1, ...}, ...}

Returns worker count grouped by node.