# `Condukt.Sandbox.Kubernetes`
[🔗](https://github.com/tuist/condukt/blob/1.5.1/lib/condukt/sandbox/kubernetes.ex#L1)

Sandbox that runs each session inside a dedicated Kubernetes Pod.

One Pod per session. All filesystem reads and writes and all subprocess
execution happen inside the Pod via the Kubernetes exec API. The agent
cannot reach the host running the Condukt BEAM process.

## Idempotent init via the session id

`init/1` is idempotent on a stable id: it derives a deterministic Pod name
from it and either adopts an existing Pod or creates a fresh one. The
session's `:id` (passed to `Condukt.Session.start_link/1`, or
auto-generated) flows into the sandbox by default, so a single id at the
session level drives both the session and the Pod. This is the
recommended pattern for Oban-style workers where the job lifecycle and
the Pod lifecycle are decoupled:

    defmodule MyApp.AgentWorker do
      use Oban.Worker, queue: :agents, max_attempts: 3

      @impl true
      def perform(%Oban.Job{id: job_id, args: %{"prompt" => prompt}}) do
        # The Oban job id is stable across retries, so a retried job
        # derives the same Pod name and adopts the existing Pod.
        {:ok, agent} =
          MyApp.CodingAgent.start_link(
            id: job_id,
            api_key: System.get_env("ANTHROPIC_API_KEY"),
            sandbox: {Condukt.Sandbox.Kubernetes, namespace: "agents"}
          )

        Condukt.Session.run(agent, prompt)
      end
    end

If the job is retried after a crash, the same `job_id` flows through and
the sandbox reattaches to the existing Pod. Repo clones and in-progress
file edits persist (they live in an `emptyDir` volume mounted at the
session cwd, which survives container restarts within the same Pod).
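
The adopt-or-create behaviour described above can be illustrated with a small standalone sketch. It is not Condukt's implementation: the `PodNameSketch` module, the `"condukt-"` prefix, the hashing scheme, and the in-memory map standing in for the cluster are all assumptions made for illustration. The point is only that a deterministic name turns "create a Pod" into "look up by name, then create if missing".

```elixir
defmodule PodNameSketch do
  # Hypothetical prefix; the real naming scheme is not documented here.
  @prefix "condukt-"

  # Same id in, same name out. Hashing keeps the result short and made
  # of lowercase hex characters, whatever the id looks like.
  def pod_name(id) do
    digest =
      id
      |> to_string()
      |> then(&:crypto.hash(:sha256, &1))
      |> Base.encode16(case: :lower)
      |> binary_part(0, 16)

    @prefix <> digest
  end

  # Adopt-or-create against a plain map that stands in for the cluster.
  def init(cluster, id) do
    name = pod_name(id)

    case Map.fetch(cluster, name) do
      {:ok, _pod} -> {:adopted, name, cluster}
      :error -> {:created, name, Map.put(cluster, name, %{phase: "Running"})}
    end
  end
end
```

Calling `PodNameSketch.init/2` twice with the same id returns `:created` the first time and `:adopted` the second, which mirrors what a retried job sees.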

Pass `:id` explicitly on the sandbox spec only when you want the pod
identity to diverge from the session identity. An explicit value wins
over the session-supplied default. When no id is supplied at the session
level, one is generated and the pod is single-use: `shutdown/1` deletes
it.
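
As a minimal sketch of the diverging case, the options below key the pod on something other than the session id. Both id values are hypothetical, and `MyApp.CodingAgent` is the module from the worker example above. The snippet only builds the option data; the `start_link` call is left commented out because it needs a running application.

```elixir
# Hypothetical ids: the pod is keyed on a repository workspace,
# while the session keeps its own identifier.
sandbox =
  {Condukt.Sandbox.Kubernetes, id: "repo-1234-workspace", namespace: "agents"}

session_opts = [
  id: "review-run-7",
  api_key: System.get_env("ANTHROPIC_API_KEY"),
  sandbox: sandbox
]

# {:ok, agent} = MyApp.CodingAgent.start_link(session_opts)
```

Because an `:id` is supplied, `:delete_on_shutdown` defaults to `false`, so a pod created this way stays around until it is deleted explicitly or its deadline expires.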

## Init options

* `:id` — stable identifier used to derive the pod name. Defaults to the
  session id when invoked through `Condukt.Session`. Pass it explicitly
  on the sandbox spec only to diverge from the session id.
* `:namespace` — Kubernetes namespace (default `"default"`).
* `:image` — container image (default `"debian:bookworm-slim"`).
* `:cwd` — working directory inside the pod, also where the workspace
  volume is mounted (default `"/workspace"`).
* `:env` — environment variables to set on the pod container, as a map or
  list of `{key, value}` pairs.
* `:labels` — additional pod labels (caller-supplied; merged on top of
  Condukt's defaults).
* `:annotations` — additional pod annotations.
* `:resources` — Kubernetes resource requests/limits map, e.g.
  `%{requests: %{cpu: "500m", memory: "1Gi"}, limits: %{cpu: "2", memory: "4Gi"}}`.
* `:service_account` — Kubernetes ServiceAccount the pod runs as.
* `:active_deadline_seconds` — K8s-side hard ceiling for the pod's lifetime
  (default 8 hours). Insurance against abandoned pods.
* `:heartbeat_interval` — milliseconds between pod heartbeat annotation
  updates (default `60_000`). Pass `false` to disable. Use
  `reap_stale/1` from a separate process to delete pods whose heartbeat is
  too old.
* `:workspace_source` — git repository to clone into the workspace at init.
  Accepts a git URL string or a keyword/map with `:git` and optional `:ref`.
  The runtime image must include `git`.
* `:workspace_source_timeout` — milliseconds to wait for the workspace
  clone or checkout command (default `300_000`).
* `:ready_timeout` — milliseconds to wait for a created pod to reach
  Running phase (default `120_000`).
* `:on_stale` — what to do when adopting a pod that is in an unexpected
  phase (Succeeded / Failed). `:error` (default) returns
  `{:error, {:stale_pod, phase}}`; `:recreate` deletes and recreates.
* `:delete_on_shutdown` — whether `shutdown/1` deletes the pod. Defaults
  to `false` when `:id` is supplied (the pod outlives any single BEAM
  process), `true` when no id is given.
* `:conn` — already-built `K8s.Conn`. Skips kubeconfig/in-cluster
  resolution.
* `:kubeconfig` — path to a kubeconfig file (default `~/.kube/config`).
* `:context` — kubeconfig context name (default: current-context).
* `:in_cluster` — `true` to use the pod's mounted ServiceAccount token.
  Auto-detected when `KUBERNETES_SERVICE_HOST` is set, so usually not
  needed.
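
To show how several of these options combine, here is a sketch of a fuller sandbox spec. The image name and repository URL are placeholders, and the string keys in `:env` are an assumption, since the key type is not specified above. The snippet only constructs the `{module, opts}` tuple that would be passed as `:sandbox`.

```elixir
sandbox =
  {Condukt.Sandbox.Kubernetes,
   namespace: "agents",
   # Placeholder image; it must ship `git` because :workspace_source is set.
   image: "registry.example.com/agents/runtime:latest",
   cwd: "/workspace",
   # Assumption: string keys and values.
   env: %{"CI" => "true"},
   resources: %{
     requests: %{cpu: "500m", memory: "1Gi"},
     limits: %{cpu: "2", memory: "4Gi"}
   },
   # Placeholder repository URL and ref.
   workspace_source: [git: "https://example.com/acme/app.git", ref: "main"],
   # Two hours instead of the 8 hour default.
   active_deadline_seconds: 2 * 60 * 60,
   heartbeat_interval: 30_000,
   ready_timeout: 180_000,
   on_stale: :recreate}
```

Setting `on_stale: :recreate` trades the explicit `{:error, {:stale_pod, phase}}` result for automatic replacement, which also discards whatever was in the old pod's workspace.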

## RBAC

The Kubernetes identity used by the Condukt process needs:

    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "list", "create", "patch", "delete"]

    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create"]

See `guides/sandbox.md` for a full sample `Role` + `RoleBinding`.

## Limitations

* `mount/3` is not supported. Volumes cannot be added to a running pod.
* Node failure loses the pod's `emptyDir` workspace. Mount a
  PersistentVolumeClaim into the pod manifest if you need cross-node
  durability. This currently requires a custom `:image` setup and is not
  exposed as an init option in v1.
* `:workspace_source` shells out to `git` inside the pod. Use an image that
  includes `git` when enabling it.

# `heartbeat`

Updates the heartbeat annotation on a Kubernetes sandbox pod.

The sandbox starts a heartbeat worker tied to the owner process by
default. This helper is exposed for callers that disable that worker
(`heartbeat_interval: false`) and want to drive heartbeats from their
own supervision tree.
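
One way to drive heartbeats from your own supervision tree is a small periodic `GenServer`. The sketch below is an assumption-laden illustration: `MyApp.SandboxHeartbeat` is a hypothetical module, and because the arguments of `heartbeat` are not listed in this section, it takes a zero-arity `:beat` function that the caller wraps around the real call.

```elixir
defmodule MyApp.SandboxHeartbeat do
  @moduledoc false
  use GenServer

  # opts: `:beat` is a zero-arity function supplied by the caller,
  # `:interval` is the delay between beats in milliseconds.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      beat: Keyword.fetch!(opts, :beat),
      interval: Keyword.get(opts, :interval, 60_000)
    }

    schedule(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:beat, state) do
    # The result is ignored in this sketch; real code would likely
    # log or count failures.
    state.beat.()
    schedule(state.interval)
    {:noreply, state}
  end

  defp schedule(interval), do: Process.send_after(self(), :beat, interval)
end
```

Under a supervisor this would appear as a child such as `{MyApp.SandboxHeartbeat, beat: fn -> ... end, interval: 60_000}`, with the function body calling the heartbeat helper for the pod in question.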

# `reap_stale`

Deletes Condukt-managed pods whose heartbeat annotation is older than
`:stale_after`.

Options:

  * `:namespace` - namespace to scan, default `"default"`.
  * `:stale_after` - heartbeat age in milliseconds, default 15 minutes.
  * `:now` - `DateTime` used for tests, default `DateTime.utc_now()`.
  * K8s connection options accepted by `init/1`, such as `:conn`,
    `:kubeconfig`, `:context`, and `:in_cluster`.
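
The interaction between `:stale_after` and `:now` can be sketched as a plain comparison. This is not the library's code: `StaleSketch` is a hypothetical module, and it assumes the heartbeat annotation has already been parsed into a `DateTime`.

```elixir
defmodule StaleSketch do
  # 15 minutes in milliseconds, matching the documented default.
  @default_stale_after 15 * 60 * 1000

  # Returns true when the heartbeat is older than :stale_after,
  # measured against :now (injectable so tests can pin the clock).
  def stale?(%DateTime{} = heartbeat_at, opts \\ []) do
    stale_after = Keyword.get(opts, :stale_after, @default_stale_after)
    now = Keyword.get(opts, :now, DateTime.utc_now())

    DateTime.diff(now, heartbeat_at, :millisecond) > stale_after
  end
end
```

With the clock pinned at noon, a heartbeat from 11:40 counts as stale under the default and one from 11:50 does not. A periodic job would pass the same kind of options to `reap_stale/1`, for example `namespace: "agents", stale_after: 1_800_000` for a 30 minute threshold.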

# `terminate`

Explicitly deletes the pod backing a session.

Use this when a session is truly done and you do not want the pod to
outlive the BEAM process, which is what happens by default when `:id`
is set.

    Condukt.Sandbox.Kubernetes.terminate(id, namespace: "agents")

