ADR-0010: Docker as recommended production executor

View Source

Status

Accepted

Context

Production deployments require isolated execution of skill commands. Isolation must protect against:

  • Filesystem access: Skills should only access designated directories
  • Network access: Skills should not exfiltrate data or attack internal services
  • Resource consumption: Skills should not exhaust CPU, memory, or disk
  • Process escape: Skills should not affect other processes or the host

Available isolation technologies:

TechnologyIsolation LevelPortabilitySetup ComplexityPerformance
DockerStrongHighLowGood
PodmanStrongMediumLowGood
FirecrackerVery StrongLowHighExcellent
gVisorVery StrongLowMediumFair
VM (QEMU)CompleteHighHighPoor
seccomp/LandlockPartialLinux onlyMediumExcellent

Docker provides the best balance of:

  • Widespread availability (most servers have it)
  • Strong isolation (namespaces, cgroups, seccomp)
  • Developer familiarity
  • Reasonable performance
  • Cross-platform support

Decision

We will provide Conjure.Executor.Docker as the recommended production executor.

Container Model

Each skill execution session runs in a fresh container:

def init(context) do
  volumes = [
    {context.skills_root, "/mnt/skills", :ro},      # Skills read-only
    {context.working_directory, "/workspace", :rw}   # Working dir read-write
  ]

  args = [
    "run", "-d",
    "--rm",
    "--network=none",
    "--memory=#{config.memory_limit}",
    "--cpus=#{config.cpu_limit}",
    "--security-opt=no-new-privileges",
    "--read-only",
    "--tmpfs=/tmp:size=100M",
    volume_args(volumes),
    config.image,
    "tail", "-f", "/dev/null"  # Keep container alive
  ]

  {container_id, 0} = System.cmd("docker", args)
  {:ok, %{context | container_id: String.trim(container_id)}}
end

Commands execute in the running container:

def bash(command, context) do
  {output, exit_code} = System.cmd("docker", [
    "exec",
    "-w", "/workspace",
    context.container_id,
    "bash", "-c", command
  ], stderr_to_stdout: true)

  case exit_code do
    0 -> {:ok, output}
    _ -> {:error, {:exit_code, exit_code, output}}
  end
end

Cleanup removes the container:

def cleanup(context) do
  System.cmd("docker", ["rm", "-f", context.container_id])
  :ok
end

Default Image

We will provide a default sandbox image with common tools:

FROM ubuntu:24.04

# System packages
RUN apt-get update && apt-get install -y \
    python3.12 python3-pip python3-venv \
    nodejs npm \
    bash git curl wget jq \
    poppler-utils qpdf \
    && rm -rf /var/lib/apt/lists/*

# Python packages (matching Anthropic's skill environment)
RUN pip3 install --break-system-packages \
    pyarrow openpyxl xlsxwriter pillow \
    python-pptx python-docx pypdf pdfplumber \
    reportlab pandas numpy matplotlib pyyaml

# Non-root user
RUN useradd -m -s /bin/bash -u 1000 sandbox
USER sandbox
WORKDIR /workspace

ENV PYTHONUNBUFFERED=1

Build via Mix task:

mix conjure.docker.build

Consequences

Positive

  • Strong isolation with well-understood security model
  • Resource limits enforced (memory, CPU)
  • Network isolation by default
  • Works with existing Docker infrastructure
  • Familiar to operations teams
  • Container images are immutable and auditable
  • Easy to customize image for specific needs

Negative

  • Docker daemon required on host
  • Container startup latency (~100-500ms per session)
  • Disk space for images
  • Docker socket access may be restricted in some environments
  • Windows requires Docker Desktop or WSL2

Neutral

  • Container per session (not per command) balances isolation and performance
  • Read-only root filesystem with tmpfs for temp files
  • Skills mounted read-only, working directory read-write

Configuration Options

config :conjure, :executor_config,
  docker: %{
    image: "conjure/sandbox:latest",
    memory_limit: "512m",
    cpu_limit: "1.0",
    network: :none,          # :none | :bridge | :host
    read_only: true,
    tmpfs_size: "100M",
    user: "sandbox",
    security_opts: ["no-new-privileges"]
  }

Security Hardening

1. No New Privileges

--security-opt=no-new-privileges

Prevents privilege escalation via setuid binaries.

2. Read-Only Root

--read-only --tmpfs=/tmp:size=100M

Container filesystem is immutable; only /tmp and mounted volumes are writable.

3. Dropped Capabilities

--cap-drop=ALL

Remove all Linux capabilities (optional, may break some tools).

4. Seccomp Profile

--security-opt=seccomp=/path/to/profile.json

Restrict available system calls (custom profile can be provided).

5. Network Isolation

--network=none

No network access by default. Skills requiring network must explicitly configure.

Performance Considerations

Container Reuse

For high-throughput scenarios, containers can be pooled:

defmodule Conjure.Executor.Docker.Pool do
  use GenServer

  def checkout(config) do
    # Return existing container or create new one
  end

  def checkin(container_id) do
    # Return container to pool (or destroy if limit reached)
  end
end

Warm Containers

Pre-start containers during low-load periods:

def warm(count, config) do
  for _ <- 1..count do
    {:ok, ctx} = Docker.init(%ExecutionContext{})
    Pool.add(ctx.container_id)
  end
end

Alternatives Considered

Podman

Daemonless container runtime. Considered as alternative because:

  • Rootless by default
  • Docker-compatible CLI
  • No daemon required

Not chosen as primary because:

  • Less widespread than Docker
  • Some compatibility issues with Docker images
  • Will be supported as alternative (same executor, different binary)

Firecracker

MicroVM technology from AWS. Rejected as default because:

  • Requires KVM (not available everywhere)
  • Complex setup
  • Overkill for typical skill workloads

Recommended for high-security deployments; custom executor can be implemented.

Kubernetes Jobs

Run each execution as a K8s Job. Rejected because:

  • Requires Kubernetes cluster
  • High latency (job scheduling)
  • Over-engineered for single-node deployments

Suitable for large-scale deployments; custom executor can be implemented.

References