This guide explains the internal architecture of GeminiCliSdk for contributors and advanced users.
Module Hierarchy
graph TD
A[GeminiCliSdk] --> B[Stream]
A --> C[Command]
A --> D[Session]
B --> E[Runtime.CLI]
E --> F[CliSubprocessCore.Session]
F --> G[ExternalRuntimeTransport.Transport]
G --> H[Shared transport internals]
E --> I[CLI]
E --> J[ArgBuilder]
E --> K[Env]
E --> L[Config]
B --> M[Types]
C --> Q[CliSubprocessCore.Command]
C --> I
B --> O[Configuration]
C --> O
D --> B
P[GeminiCliSdk.Transport adapter] --> GData Flow
Streaming Execution
User Code
|
v
GeminiCliSdk.execute/2 -- validates options
|
v
Stream.execute/2 -- returns Stream.resource/3
|
v
Runtime.CLI.start_session/1 -- resolves CLI, preserves Gemini args/env
| starts a CliSubprocessCore.Session
v
CliSubprocessCore.Session -- shared common CLI session engine
|
v
ExternalRuntimeTransport.Transport -- shared raw transport
|
v
gemini CLI process -- emits JSONL to stdout
|
v
CliSubprocessCore.Session -- decodes Gemini profile events
|
v
Runtime.CLI.project_event/2 -- projects core events into Gemini public structs
|
v
Stream.receive_next/1 -- selective receive, timeout handling, cleanup
|
v
User's Enum/Stream consumer -- processes events lazilySynchronous Execution
GeminiCliSdk.run/2
|
v
GeminiCliSdk.execute/2 -- creates stream
|
v
Enum.reduce/3 -- collects assistant text
|
v
{:ok, text} | {:error, %Error{}}Key Modules
GeminiCliSdk (Public API)
The top-level module is a small delegating public API. It provides:
execute/2-- streaming executionrun/2-- synchronous executionlist_sessions/1,resume_session/3,delete_session/2-- session managementversion/0-- CLI version
GeminiCliSdk.Stream
The stream module still uses Stream.resource/3, but it now delegates session
ownership to the shared core lane.
Its main responsibilities are:
- start_fn: Starts
GeminiCliSdk.Runtime.CLI, subscribes to the scoped core session event stream, and closes stdin for prompt-driven runs. - next_fn: Does selective receive on that session's internal runtime events, captures stderr, projects public Gemini events, and enforces idle timeouts.
- after_fn: Closes the core session, flushes leftover mailbox messages, and cleans up temporary settings files.
The stream keeps the runtime-owned session tag inside its internal state and does not expose the raw core tag as public Gemini caller vocabulary.
GeminiCliSdk.Runtime.CLI
This is the Gemini runtime kit above CliSubprocessCore.Session.
It is responsible for:
- preserving Gemini CLI command resolution, including
GEMINI_CLI_PATHandnpx - preserving Gemini option-to-flag shaping for the public SDK surface
- starting the shared core session runtime
- projecting normalized core events back into
GeminiCliSdk.Types.*
The runtime kit uses a small Gemini invocation profile for command construction only. Parsing, event normalization, and subprocess ownership remain core-owned.
GeminiCliSdk.Transport
This module is the Gemini-facing raw transport entrypoint backed by
ExternalRuntimeTransport.Transport.
It preserves Gemini's public transport vocabulary while the actual transport
implementation lives in cli_subprocess_core.
GeminiCliSdk.Types
Defines the 6 public event structs and the parse_event/1 helper used by the
public projection layer. final_event?/1 still identifies stream-ending
events for the public Gemini surface.
GeminiCliSdk.CLI
Resolves the gemini binary location using a 4-strategy waterfall:
GEMINI_CLI_PATHenvironment variable (explicit path)geminion systemPATH(globally installed)- npm global bin directory (
npm prefix -g/bin/gemini) npxfallback (npx --yes --package @google/gemini-cli gemini)
Set GEMINI_NO_NPX=1 to disable the npx fallback. Returns a CommandSpec with program and argv_prefix (used by the npx strategy to prepend ["--yes", "--package", "@google/gemini-cli", "gemini"]).
GeminiCliSdk.ArgBuilder
Converts an Options struct into Gemini CLI arguments. After the replatform it
is no longer part of a Gemini-owned transport/runtime stack; it is only used by
GeminiCliSdk.Runtime.CLI to preserve Gemini-specific invocation semantics
above the shared core session engine.
GeminiCliSdk.Env
Builds the subprocess environment by filtering the current system env (passing through GEMINI_* and GOOGLE_* prefixed variables) and merging user overrides.
GeminiCliSdk.Config
Handles temporary settings files. When Options.settings is set, it writes a settings.json to a temp directory, which is cleaned up after the stream completes.
GeminiCliSdk.Command
Synchronous command runner for non-streaming operations (list sessions, delete
session, version). It now builds Gemini-specific invocations locally and
executes them through CliSubprocessCore.Command.run/2.
GeminiCliSdk.Models
Single source of truth for model names. Provides default_model/0, fast_model/0, alias resolution, and validation. All model references in the SDK flow through this module. See the Models guide.
GeminiCliSdk.Configuration
Centralizes all numeric constants (timeouts, buffer sizes, limits). Every internal module reads its constants from here. Supports runtime override via Application.get_env/3. See the Configuration guide.
OTP Integration
cli_subprocess_core starts the shared Task.Supervisor and provider registry
used by the core session and transport layers.
GeminiCliSdk keeps its own OTP application module for SDK-local processes and backwards-compatible application startup.
Error Architecture
Errors flow through two paths:
- Stream path: Errors become
Types.ErrorEventstructs in the event stream - Command path: Errors become
{:error, %Error{}}tuples
Core runtime errors are projected back into Gemini Types.ErrorEvent values so
callers do not need to adopt the normalized core event vocabulary.
The Gemini event structs remain the ergonomic public surface, while the stream
parser now treats Zoi as the source of truth for validation/normalization and
stores unknown fields in extra for forward-compatible projection.