GeminiCliSdk

An Elixir SDK for the Gemini CLI -- build AI-powered applications with Google Gemini through a robust, idiomatic wrapper around the Gemini command-line interface.

Documentation Menu

  • README.md - installation, quick start, and runtime boundaries
  • guides/getting-started.md - first execution and session flows
  • guides/options.md - runtime and CLI option shaping
  • guides/models.md - Gemini model selection behavior
  • guides/architecture.md - shared core runtime ownership
  • guides/testing.md - local validation workflow

Features

  • Streaming -- Lazy Stream-based API with typed event structs and backpressure
  • Synchronous -- Simple {:ok, text} | {:error, error} for request/response patterns
  • Session Management -- List, resume, and delete conversation sessions
  • Shared Core Runtime -- Streaming and one-shot command execution now run on cli_subprocess_core while preserving Gemini-specific public types and entrypoints
  • Subprocess Safety -- Built on cli_subprocess_core, which owns the raw transport lane and the native subprocess runtime used for cleanup and raw process control
  • Typed Events -- 6 event types (init, message, tool_use, tool_result, error, result) parsed from JSONL
  • Full Options -- Model selection, YOLO mode, sandboxing, extensions, tool restrictions, and more
  • OTP Integration -- Application supervision tree with TaskSupervisor for async I/O

Installation

Add gemini_cli_sdk to your dependencies in mix.exs:

def deps do
  [
    {:gemini_cli_sdk, "~> 0.2.0"}
  ]
end

Prerequisites: The Gemini CLI must be installed and authenticated.

Quick Start

Streaming

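# execute/1 returns a lazy stream, so events are handled as they are consumed.
GeminiCliSdk.execute("Explain GenServer in 3 sentences")
|> Enum.each(fn event ->
  case event do
    # Print assistant message chunks as they arrive.
    %GeminiCliSdk.Types.MessageEvent{role: "assistant", content: text} ->
      IO.write(text)

    # Ignore every other event type.
    _ ->
      :ok
  end
end)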
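
When the full reply is needed as one string rather than printed chunk by chunk, the same stream can be folded into a binary. The following is a minimal sketch, not part of the SDK: StreamSketch and assistant_text/1 are hypothetical names, and the clause matches on map keys (role, content) instead of the struct name so the snippet compiles without the dependency. It assumes MessageEvent carries those two fields, as the pattern above suggests.

```elixir
defmodule StreamSketch do
  @moduledoc "Illustrative helper; not part of GeminiCliSdk."

  # Concatenates assistant message chunks from any enumerable of events.
  # A struct with :role and :content fields matches the same map pattern,
  # so this also accepts plain maps in tests.
  def assistant_text(events) do
    events
    |> Stream.flat_map(fn
      %{role: "assistant", content: text} when is_binary(text) -> [text]
      _other -> []
    end)
    |> Enum.join()
  end
end

# With the SDK installed, the events would come from GeminiCliSdk.execute/1:
#   "Explain GenServer" |> GeminiCliSdk.execute() |> StreamSketch.assistant_text()
```

Because Stream.flat_map/2 is lazy, nothing is pulled from the source until Enum.join/1 runs, which keeps the helper compatible with a lazy event stream.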

Synchronous

{:ok, response} = GeminiCliSdk.run("What is Elixir?")
IO.puts(response)
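
The pattern match above raises if the call fails. A caller that wants to handle the {:error, error} branch can dispatch on the result tuple instead. This is a sketch under assumptions: RunSketch and describe/1 are hypothetical names, and the shape of the error term is not specified here, so it is rendered with inspect/1.

```elixir
defmodule RunSketch do
  @moduledoc "Illustrative helper; not part of GeminiCliSdk."

  # Turns the documented {:ok, text} | {:error, error} result into a
  # printable string. inspect/1 is used because the error shape is assumed
  # to be an arbitrary term.
  def describe({:ok, text}) when is_binary(text), do: text
  def describe({:error, error}), do: "gemini run failed: " <> inspect(error)
end

# With the SDK installed:
#   "What is Elixir?" |> GeminiCliSdk.run() |> RunSketch.describe() |> IO.puts()
```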

With Options

opts = %GeminiCliSdk.Options{
  model: GeminiCliSdk.Models.fast_model(),
  yolo: true,
  timeout_ms: 60_000
}

{:ok, response} = GeminiCliSdk.run("Refactor this function", opts)

Sessions

# List sessions
{:ok, sessions} = GeminiCliSdk.list_sessions()

# Resume a session
GeminiCliSdk.resume_session("abc123", %GeminiCliSdk.Options{}, "Continue")
|> Enum.each(fn event ->
  case event do
    %GeminiCliSdk.Types.MessageEvent{role: "assistant", content: text} ->
      IO.write(text)
    _ -> :ok
  end
end)

Event Types

Struct                 Description
Types.InitEvent        Session initialized with session_id and model
Types.MessageEvent     Message chunk with role and content
Types.ToolUseEvent     Tool invocation with tool_name and parameters
Types.ToolResultEvent  Tool result with tool_id and output
Types.ErrorEvent       Error with severity and message
Types.ResultEvent      Final result with status and stats
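
A consumer that handles all six event types might dispatch with one function clause per type. The sketch below is illustrative only: EventSketch and label/1 are hypothetical names, and the clauses key on the field names listed in the table above rather than on the struct modules, so the snippet runs without the SDK. Real code would more likely match on %GeminiCliSdk.Types.InitEvent{} and its siblings, which is stricter if two event types share a field.

```elixir
defmodule EventSketch do
  @moduledoc "Illustrative helper; not part of GeminiCliSdk."

  # One clause per event type, keyed on the fields listed in the table.
  # Clause order matters when an event carries fields from several rows.
  def label(%{session_id: id, model: model}), do: "init #{id} (#{model})"
  def label(%{role: role, content: _}), do: "message from #{role}"
  def label(%{tool_name: name, parameters: _}), do: "tool_use #{name}"
  def label(%{tool_id: id, output: _}), do: "tool_result #{id}"
  def label(%{severity: sev, message: msg}), do: "error [#{sev}] #{msg}"
  def label(%{status: status, stats: _}), do: "result #{status}"

  # Catch-all so new or unrecognized events never crash the consumer.
  def label(_unknown), do: "unknown event"
end
```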

All stream-event structs are now schema-backed. Known fields are normalized through Zoi, forward-compatible unknown fields are preserved in extra, and the event modules expose to_map/1 for projection back to wire shape.
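
The idea of keeping unknown fields in extra and projecting back to wire shape can be illustrated in plain Elixir. This sketch does not use Zoi and is not the SDK's implementation: ExtraSketch, normalize/1, and the two-field schema are assumptions chosen to keep the example small.

```elixir
defmodule ExtraSketch do
  @moduledoc "Illustrative only; the SDK normalizes through Zoi schemas."

  # Hypothetical set of known wire keys for a message-like event.
  @known ~w(role content)

  # Splits a decoded JSONL object into known fields plus an `extra` map
  # holding every key the schema does not recognize.
  def normalize(wire) when is_map(wire) do
    {known, extra} = Map.split(wire, @known)
    %{role: known["role"], content: known["content"], extra: extra}
  end

  # Projects back to wire shape by merging `extra` with the known fields.
  def to_map(%{role: role, content: content, extra: extra}) do
    Map.merge(extra, %{"role" => role, "content" => content})
  end
end
```

With this split, a field added by a newer CLI version survives a round trip even though the schema does not yet know about it.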

Architecture

GeminiCliSdk preserves its public API while running the common CLI session lane on cli_subprocess_core.

The current layering is:

GeminiCliSdk public API
  -> GeminiCliSdk.Stream / GeminiCliSdk.Runtime.CLI
  -> CliSubprocessCore.Session
  -> CliSubprocessCore raw transport
  -> gemini CLI

GeminiCliSdk command helpers
  -> CliSubprocessCore.Command.run/2
  -> CliSubprocessCore raw transport
  -> gemini CLI

GeminiCliSdk.Runtime.CLI is the Gemini runtime kit. It starts core sessions, preserves Gemini CLI command resolution and option shaping, and projects normalized core events back into GeminiCliSdk.Types.*.

The preserved GeminiCliSdk.Transport modules are public Gemini entrypoints backed by the core raw transport layer instead of owning a separate subprocess runtime.

Ownership Boundary

The final Phase 4 boundary for Gemini is:

  • shared session lifecycle
  • shared JSONL parsing and normalized event flow
  • shared raw transport ownership inside cli_subprocess_core
  • shared non-PTY command execution for session management and version helpers

Public Gemini entrypoints stay the same.

Gemini CLI resolution, option shaping, and public result/error mapping remain in this repo above the shared core.

No separate Gemini-owned common subprocess runtime remains here. Repo-local ownership is limited to Gemini CLI discovery, argument and environment shaping, typed event/result projection, and the public Gemini transport surface above the shared core.

The release and composition model is:

  • the common Gemini profile stays built into cli_subprocess_core
  • gemini_cli_sdk remains the provider-specific runtime-kit package above that shared core
  • no extra ASM extension seam is introduced unless Gemini later exposes a genuinely richer provider-native surface beyond the current common lane

If gemini_cli_sdk is installed alongside agent_session_manager, ASM reports Gemini runtime availability in ASM.Extensions.ProviderSDK.capability_report/0 but keeps namespaces: [] because Gemini currently composes through the common ASM surface only.

Centralized Model Selection

gemini_cli_sdk now consumes model payloads resolved by cli_subprocess_core. The SDK no longer owns active fallback/defaulting policy for provider selection.

Authoritative policy surface:

Gemini-side responsibility is limited to:

  • carrying the resolved model_payload on GeminiCliSdk.Options
  • projecting the resolved model for UX and metadata
  • rendering --model only when the resolved value is non-empty
  • treating repo-local env defaults as fallback inputs only when no explicit payload was supplied

No repo-local Gemini model fallback remains.
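
The rule of rendering --model only for a non-empty resolved value can be sketched as a small pure function. ModelArgsSketch and model_args/1 are hypothetical names, and treating whitespace-only strings as empty is an assumption; the SDK's actual argument shaping may differ.

```elixir
defmodule ModelArgsSketch do
  @moduledoc "Illustrative helper; not part of GeminiCliSdk."

  # Returns the CLI arguments for the model flag, or [] when there is
  # no usable value, so a nil or blank model never reaches the command line.
  def model_args(nil), do: []

  def model_args(model) when is_binary(model) do
    case String.trim(model) do
      "" -> []
      trimmed -> ["--model", trimmed]
    end
  end
end
```

Returning a list lets the caller concatenate the result into the full argument list without special-casing the missing-model path.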

GeminiCliSdk.Options.validate!/1 canonicalizes explicit payloads through the shared core boundary. A CliSubprocessCore.ModelRegistry.Selection is the preferred form, and Map.from_struct(selection) is normalized back into the same canonical payload when callers already have a serialized struct map.

Documentation

Full documentation is available at HexDocs.

Examples

See the examples/ directory for live examples that run against the real Gemini CLI:

mix run examples/simple_prompt.exs
mix run examples/streaming.exs
bash examples/run_all.sh

License

MIT License. See LICENSE for details.

Model Selection Contract

See Centralized Model Selection. The Gemini SDK renders provider transport args from the shared resolved payload and does not emit nil/null/blank model values.

Session Listing And Resume Surfaces

Gemini now exposes a typed session-history projection for orchestration layers that need to recover an existing CLI conversation instead of replaying prompts from scratch.

The runtime now also carries system_prompt through the validated options surface, so a caller can resume a session with the same instruction context it started with.