Step Types Reference
Overview
Pipeline supports 18 distinct step types, organized into four categories:
- AI Provider Steps: Interact with Claude and Gemini APIs
- Control Flow Steps: Manage execution flow and pipeline composition
- Data & File Operations: Transform data and manipulate files
- State Management: Manage variables and checkpoints
AI Provider Steps
Gemini
Type: gemini
Purpose: Use Google's Gemini models for analysis, planning, and decision-making
```yaml
- name: "analyze_requirements"
  type: "gemini"
  role: "brain"                # Semantic role indicator
  model: "gemini-2.5-flash"    # Model selection

  # Token configuration
  token_budget:
    max_output_tokens: 4096
    temperature: 0.7           # Creativity level (0-1)
    top_p: 0.95                # Nucleus sampling
    top_k: 40                  # Top-k sampling

  # Function calling
  functions:
    - "evaluate_code_quality"
    - "generate_test_cases"

  # Prompt
  prompt:
    - type: "static"
      content: "Analyze these requirements and create a plan"
    - type: "file"
      path: "requirements.md"

  # Output
  output_to_file: "analysis_plan.json"
  output_schema:
    type: "object"
    required: ["plan", "risks", "timeline"]
```
Key Features:
- Multiple model options (flash, pro, etc.)
- Function calling support
- Structured output with schema validation
- Fine-tuned token control
Common Use Cases:
- Requirements analysis
- Architecture planning
- Code review and quality assessment
- Test generation
- Decision making
Claude
Type: claude
Purpose: Use Anthropic's Claude for code execution, file manipulation, and implementation tasks
```yaml
- name: "implement_feature"
  type: "claude"
  role: "muscle"                   # Execution role
  claude_options:
    # Core settings
    max_turns: 20                  # Conversation turns limit
    output_format: "json"          # Response format
    verbose: true                  # Detailed logging

    # Model selection
    model: "sonnet"                # Model choice: "sonnet", "opus", or specific version
    fallback_model: "sonnet"       # Fallback when primary model overloaded

    # Tool permissions
    allowed_tools: ["Write", "Edit", "Read", "Bash", "Search"]
    disallowed_tools: ["Delete"]   # Explicitly forbidden

    # System prompts
    system_prompt: "You are an expert Python developer"
    append_system_prompt: "Follow PEP 8 strictly"

    # Working environment
    cwd: "./workspace/backend"     # Working directory

    # Advanced options
    session_id: "feature_dev_001"  # For session continuation
    resume_session: true           # Resume if exists
    debug_mode: true
    telemetry_enabled: true
    cost_tracking: true

    # Async streaming (optional)
    async_streaming: true          # Enable real-time streaming
    stream_handler: "console"      # Handler: console/file/callback/buffer
    stream_buffer_size: 100        # Buffer size for batching
  prompt:
    - type: "static"
      content: "Implement the user authentication feature"
    - type: "previous_response"
      step: "analyze_requirements"
      extract: "implementation_plan"
  output_to_file: "implementation_result.json"
```
Available Tools:
- Write: Create new files
- Edit: Modify existing files
- Read: Read file contents
- Bash: Execute shell commands
- Search: Search across files
- Glob: Find files by pattern
- Grep: Search file contents
Key Features:
- Full file system access within workspace
- Shell command execution
- Session management for continuity
- Comprehensive error handling
- Cost tracking and telemetry
- Model selection for cost optimization (25x savings: sonnet vs opus)
- Fallback model support for reliability
Claude Smart
Type: claude_smart
Purpose: Intelligent preset-based Claude configuration with environment awareness
```yaml
- name: "smart_analysis"
  type: "claude_smart"
  preset: "analysis"           # Preset configuration
  environment_aware: true      # Auto-detect from Mix.env

  # Optional overrides
  claude_options:
    max_turns: 10              # Override preset default
  prompt:
    - type: "file"
      path: "pipelines/prompts/analysis/security_audit.md"
    - type: "file"
      path: "src/main.ex"
  output_to_file: "security_analysis.json"
```
Available Presets:
- development: Permissive settings, full tool access, verbose logging (uses sonnet - cost-effective)
- production: Restricted tools, optimized for safety and performance (uses opus with sonnet fallback)
- analysis: Read-only tools, optimized for code analysis (uses opus - best capability)
- chat: Simple conversation mode, basic tools
Key Features:
- Automatic configuration based on environment
- Preset-specific optimizations
- Simplified configuration
- Intelligent defaults
- Built-in model selection for cost optimization
Model Selection & Cost Control
All Claude step types support model selection for cost optimization and performance tuning:
Model Options
```yaml
claude_options:
  # Simple shortcuts (recommended)
  model: "sonnet"    # Fast, cost-effective (~$0.01 per query)
  model: "opus"      # Highest quality (~$0.26 per query, 25x more expensive)

  # Specific model versions (for reproducibility)
  model: "claude-3-5-sonnet-20241022"
  model: "claude-3-opus-20240229"

  # Production reliability with fallback
  model: "opus"
  fallback_model: "sonnet"   # Falls back when opus overloaded
```
Cost Optimization Examples
```yaml
# Development workflow - cost-effective
- name: "dev_task"
  type: "claude_smart"
  preset: "development"    # Automatically uses sonnet

# Production workflow - quality + reliability
- name: "prod_task"
  type: "claude_smart"
  preset: "production"     # Uses opus with sonnet fallback

# Manual cost control
- name: "simple_task"
  type: "claude"
  claude_options:
    model: "sonnet"        # 25x cheaper for simple tasks

- name: "complex_analysis"
  type: "claude"
  claude_options:
    model: "opus"          # Worth the cost for complex work
```
Claude Session
Type: claude_session
Purpose: Maintain stateful conversations across multiple interactions
```yaml
- name: "tutoring_session"
  type: "claude_session"
  session_config:
    session_name: "elixir_tutorial"
    persist: true                # Save session state
    continue_on_restart: true    # Resume after failures
    checkpoint_frequency: 5      # Save every 5 turns
    max_turns: 100               # Extended conversation
    description: "Interactive Elixir tutoring session"
  claude_options:
    allowed_tools: ["Write", "Read", "Bash"]
    output_format: "text"
  prompt:
    - type: "static"
      content: "Let's continue our Elixir tutorial"
    - type: "session_context"
      session_id: "elixir_tutorial"
      include_last_n: 10         # Include last 10 messages
  output_to_file: "session_transcript.md"
```
Key Features:
- Conversation state persistence
- Automatic checkpointing
- Session recovery after failures
- Context window management
- Multi-turn conversations
Use Cases:
- Interactive tutorials
- Long-running development tasks
- Iterative refinement processes
- Conversational workflows
Claude Extract
Type: claude_extract
Purpose: Advanced content extraction and post-processing
```yaml
- name: "extract_insights"
  type: "claude_extract"
  preset: "analysis"
  extraction_config:
    use_content_extractor: true
    format: "structured"         # Output format
    post_processing:
      - "extract_code_blocks"
      - "extract_recommendations"
      - "extract_key_points"
      - "extract_links"
    max_summary_length: 2000
    include_metadata: true
  claude_options:
    max_turns: 5
    allowed_tools: ["Read"]
  prompt:
    - type: "static"
      content: "Analyze this codebase and extract key insights"
    - type: "file"
      path: "src/"
  output_to_file: "extracted_insights.json"
```
Format Options:
- text: Plain text extraction
- json: Structured JSON output
- structured: Organized sections
- summary: Condensed summary
- markdown: Formatted markdown
Post-Processing Options:
- extract_code_blocks: Extract code snippets
- extract_recommendations: Pull out suggestions
- extract_key_points: Identify main points
- extract_links: Extract URLs and references
- extract_entities: Named entity extraction
Claude Batch
Type: claude_batch
Purpose: Process multiple tasks in parallel
```yaml
- name: "batch_code_review"
  type: "claude_batch"
  batch_config:
    max_parallel: 3             # Concurrent executions
    timeout_per_task: 300       # 5 minutes per task
    consolidate_results: true   # Merge all results
  tasks:
    - id: "review_backend"
      prompt:
        - type: "static"
          content: "Review the backend code"
        - type: "file"
          path: "backend/src/"
      output_to_file: "backend_review.json"
    - id: "review_frontend"
      prompt:
        - type: "static"
          content: "Review the frontend code"
        - type: "file"
          path: "frontend/src/"
      output_to_file: "frontend_review.json"
    - id: "review_tests"
      prompt:
        - type: "static"
          content: "Review test coverage and quality"
        - type: "file"
          path: "test/"
      output_to_file: "test_review.json"
```
Key Features:
- Parallel task execution
- Independent task configuration
- Result consolidation
- Load balancing
- Per-task timeouts
Claude Robust
Type: claude_robust
Purpose: Enterprise-grade error handling and retry logic
```yaml
- name: "critical_deployment"
  type: "claude_robust"
  retry_config:
    max_retries: 5
    backoff_strategy: "exponential"   # Exponential backoff
    retry_conditions:
      - "timeout"
      - "rate_limit"
      - "api_error"
    fallback_action: "simplified_prompt"
  claude_options:
    max_turns: 30
    allowed_tools: ["Write", "Edit", "Read", "Bash"]
    timeout_ms: 60000                 # 1 minute timeout
  prompt:
    - type: "file"
      path: "pipelines/prompts/deployment/production_deploy.md"
    - type: "previous_response"
      step: "deployment_plan"
  output_to_file: "deployment_result.json"
```
Retry Strategies:
- linear: Fixed delay between retries
- exponential: Exponentially increasing delays
Fallback Actions:
- simplified_prompt: Retry with simpler instructions
- mock_response: Return mock data
- skip: Skip the step
- fail: Fail the pipeline
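For a non-critical step, a softer fallback than failing the pipeline can be useful. A sketch (step name, tool list, and retry values are illustrative, not taken from the reference above):

```yaml
# Hypothetical: a lint pass that skips itself after exhausting retries
- name: "optional_lint"
  type: "claude_robust"
  retry_config:
    max_retries: 2
    backoff_strategy: "linear"
    retry_conditions: ["timeout", "rate_limit"]
    fallback_action: "skip"    # pipeline continues without this step's output
  claude_options:
    max_turns: 5
    allowed_tools: ["Read"]
  prompt:
    - type: "static"
      content: "Lint the changed files and report issues"
```

With fallback_action: "skip", later steps should treat this step's output as optional.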
Parallel Claude
Type: parallel_claude
Purpose: Execute multiple Claude instances simultaneously
```yaml
- name: "parallel_implementation"
  type: "parallel_claude"
  parallel_tasks:
    - id: "api_endpoints"
      claude_options:
        max_turns: 20
        allowed_tools: ["Write", "Edit", "Read"]
        cwd: "./workspace/api"
      prompt:
        - type: "static"
          content: "Implement the REST API endpoints"
        - type: "previous_response"
          step: "api_design"
      output_to_file: "api_implementation.json"
    - id: "database_schema"
      claude_options:
        max_turns: 15
        allowed_tools: ["Write", "Edit"]
        cwd: "./workspace/database"
      prompt:
        - type: "static"
          content: "Create the database schema"
        - type: "previous_response"
          step: "data_model"
      output_to_file: "db_implementation.json"
    - id: "test_suite"
      claude_options:
        max_turns: 25
        allowed_tools: ["Write", "Read", "Bash"]
        cwd: "./workspace/tests"
      prompt:
        - type: "static"
          content: "Write comprehensive tests"
      output_to_file: "test_implementation.json"
```
Key Features:
- True parallel execution
- Independent workspaces
- Task isolation
- Result aggregation
- Resource management
Gemini Instructor
Type: gemini_instructor
Purpose: Structured output generation with schema validation
```yaml
- name: "generate_api_spec"
  type: "gemini_instructor"
  model: "gemini-2.5-flash"
  validation_mode: "strict"    # Strict schema enforcement
  schema:
    type: "object"
    required: ["endpoints", "models", "authentication"]
    properties:
      endpoints:
        type: "array"
        items:
          type: "object"
          required: ["method", "path", "description"]
          properties:
            method:
              type: "string"
              enum: ["GET", "POST", "PUT", "DELETE"]
            path:
              type: "string"
              pattern: "^/api/.*"
            description:
              type: "string"
      models:
        type: "object"
      authentication:
        type: "object"
        required: ["type", "endpoints"]
  prompt:
    - type: "static"
      content: "Generate OpenAPI specification for this service"
    - type: "file"
      path: "docs/api_requirements.md"
  output_to_file: "api_specification.json"
```
Key Features:
- Guaranteed structured output
- Schema validation
- Type safety
- Integration with InstructorLite
- JSON Schema support
Async Streaming Configuration
For Claude-based step types, you can enable async streaming to get real-time responses:
```yaml
# Example: Claude with async streaming
- name: "streaming_implementation"
  type: "claude"
  claude_options:
    # Enable async streaming
    async_streaming: true

    # Choose a handler type
    stream_handler: "console"    # Options: console, file, callback, buffer

    # Configure buffer size for batching messages
    stream_buffer_size: 50       # Default: 100

    # Other options work as normal
    max_turns: 20
    allowed_tools: ["Write", "Edit", "Read"]
  prompt:
    - type: "static"
      content: "Implement the feature with real-time feedback"
```
Stream Handler Types:
- console: Stream output directly to console in real-time
- file: Stream output to a file as it arrives
- callback: Use custom callback function for handling stream
- buffer: Collect stream into memory buffer
Benefits of Async Streaming:
- Real-time feedback during long-running operations
- Lower memory usage for large responses
- Better user experience with progressive output
- Early error detection and interruption capability
Supported Step Types:
claude, claude_smart, claude_session, claude_extract, claude_batch, claude_robust, parallel_claude
Control Flow Steps
Pipeline
Type: pipeline
Purpose: Execute another pipeline as a step (recursive composition)
```yaml
- name: "data_processing"
  type: "pipeline"

  # Option 1: External file
  pipeline_file: "./pipelines/data_processor.yaml"

  # Option 2: Registry reference (future)
  # pipeline_ref: "common/data_processor"

  # Option 3: Inline definition
  # pipeline:
  #   name: "inline_processor"
  #   steps: [...]

  # Input mapping
  inputs:
    raw_data: "{{steps.extract.result}}"
    config: "{{workflow.processing_config}}"
    mode: "production"

  # Output extraction
  outputs:
    - "processed_data"            # Simple extraction
    - path: "metrics.accuracy"    # Path extraction
      as: "accuracy_score"
    - path: "report.summary"
      as: "summary_text"

  # Configuration
  config:
    inherit_context: true         # Pass parent context
    workspace_dir: "./nested/data_processing"
    timeout_seconds: 600
    max_depth: 5                  # Nesting limit
    enable_tracing: true
```
Key Features:
- Pipeline composition and modularity
- Context isolation
- Input/output mapping
- Safety limits (depth, memory, time)
- Circular dependency detection
Use Cases:
- Reusable workflows
- Complex multi-stage processing
- Modular architecture
- Testing pipeline components
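When a sub-pipeline is only a couple of steps, the inline form (Option 3 in the example above) avoids a separate file. A minimal sketch, with illustrative step and field names:

```yaml
# Hypothetical: same pipeline step type, with an inline definition
- name: "inline_processing"
  type: "pipeline"
  pipeline:
    name: "inline_processor"
    steps:
      - name: "dedupe"
        type: "data_transform"
        input_source: "{{inputs.raw_data}}"
        operations:
          - operation: "unique"
            field: "records"
        output_field: "processed_data"
  inputs:
    raw_data: "{{steps.extract.result}}"
  outputs:
    - "processed_data"
```

The trade-off is reuse: an inline pipeline cannot be shared across workflows the way a pipeline_file can.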
For Loop
Type: for_loop
Purpose: Iterate over collections with optional parallelization
```yaml
- name: "process_files"
  type: "for_loop"
  iterator: "file"                       # Loop variable name
  data_source: "{{steps.scan.files}}"    # Array to iterate

  # Parallel execution
  parallel: true
  max_parallel: 5                        # Concurrent limit

  # Error handling
  break_on_error: false                  # Continue on errors

  # Loop body
  steps:
    - name: "analyze_file"
      type: "claude"
      prompt:
        - type: "static"
          content: "Analyze this file:"
        - type: "static"
          content: "{{loop.file.path}}"
      output_to_file: "analysis_{{loop.file.name}}.json"
    - name: "update_progress"
      type: "set_variable"
      variables:
        processed_count: "{{state.processed_count + 1}}"
```
Variable Access:
- {{loop.variable}}: Current item
- {{loop.index}}: Current index (0-based)
- {{loop.parent.variable}}: Parent loop variable
- {{loop.is_first}}: Boolean first iteration
- {{loop.is_last}}: Boolean last iteration
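A sketch of nested loops reaching the outer item via loop.parent; the directory/file field names here are assumptions for illustration:

```yaml
# Hypothetical nested loop: outer iterates directories, inner iterates files
- name: "process_dirs"
  type: "for_loop"
  iterator: "dir"
  data_source: "{{steps.scan.directories}}"
  steps:
    - name: "process_dir_files"
      type: "for_loop"
      iterator: "file"
      data_source: "{{loop.dir.files}}"
      steps:
        - name: "tag_file"
          type: "set_variable"
          variables:
            # Inner body sees its own item as loop.file,
            # and the enclosing item as loop.parent.dir
            current_path: "{{loop.parent.dir.name}}/{{loop.file.name}}"
            first_in_dir: "{{loop.is_first}}"
```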
Data Sources:
- Arrays from previous steps
- Static arrays
- File listings
- Query results
While Loop
Type: while_loop
Purpose: Repeat steps until condition is met
```yaml
- name: "optimize_until_passing"
  type: "while_loop"
  condition: "{{steps.test.score < 90}}"

  # Safety limits
  max_iterations: 10
  timeout_seconds: 1800    # 30 minutes total

  # Loop body
  steps:
    - name: "optimize"
      type: "claude"
      prompt:
        - type: "static"
          content: "Improve the code to increase test score"
        - type: "previous_response"
          step: "test"
          extract: "failures"
    - name: "test"
      type: "gemini"
      prompt:
        - type: "static"
          content: "Run tests and calculate score"
      output_schema:
        type: "object"
        required: ["score", "failures"]
```
Condition Expressions:
- Comparison operators: ==, !=, >, <, >=, <=
- Boolean operators: and, or, not
- String operations: contains, matches
- Null checks: exists, empty
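One hedged example of each condition kind, as it might appear on a while_loop; the exact call syntax for contains and empty is not pinned down above, so treat these as sketches:

```yaml
condition: "{{steps.test.score < 90}}"                          # comparison
condition: "{{steps.test.score < 90 and state.retries < 3}}"    # boolean logic
condition: "{{steps.build.log contains 'FAILED'}}"              # string operation
condition: "{{not empty(steps.scan.pending_files)}}"            # null/empty check
```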
Safety Features:
- Maximum iteration limit
- Total timeout protection
- Infinite loop detection
- Memory usage monitoring
Switch/Case
Type: switch
Purpose: Branch execution based on values
```yaml
- name: "handle_by_type"
  type: "switch"
  expression: "{{steps.detect.file_type}}"
  cases:
    "python":
      - name: "analyze_python"
        type: "claude"
        prompt:
          - type: "static"
            content: "Analyze Python code"
    "javascript":
      - name: "analyze_js"
        type: "claude"
        prompt:
          - type: "static"
            content: "Analyze JavaScript code"
    "markdown":
      - name: "process_docs"
        type: "gemini"
        prompt:
          - type: "static"
            content: "Process documentation"
  default:
    - name: "generic_analysis"
      type: "gemini"
      prompt:
        - type: "static"
          content: "Perform generic analysis"
```
Features:
- Value-based branching
- Multiple steps per case
- Default fallback
- Expression evaluation
- Pattern matching support
Data & File Operations
Data Transform
Type: data_transform
Purpose: Transform and manipulate structured data
```yaml
- name: "process_results"
  type: "data_transform"
  input_source: "{{steps.analysis.results}}"
  operations:
    # Filter operation
    - operation: "filter"
      field: "items"
      condition: "score > 80 and category == 'critical'"

    # Map transformation
    - operation: "map"
      field: "items"
      expression: |
        {
          "id": item.id,
          "score": item.score * 100,
          "priority": item.score > 90 ? "high" : "normal"
        }

    # Aggregation
    - operation: "aggregate"
      field: "items"
      function: "average"
      group_by: "category"

    # Join with another dataset
    - operation: "join"
      left_field: "items"
      right_source: "{{steps.metadata.items}}"
      join_key: "id"
      join_type: "left"

    # Sort results
    - operation: "sort"
      field: "items"
      by: "score"
      order: "desc"
  output_field: "processed_results"
  output_to_file: "transformed_data.json"
```
Available Operations:
- filter: Filter items by condition
- map: Transform each item
- aggregate: Calculate statistics
- join: Combine datasets
- group_by: Group items
- sort: Order items
- unique: Remove duplicates
- flatten: Flatten nested arrays
Aggregate Functions:
sum, average, min, max, count, count_distinct, std_dev, variance
File Operations
Type: file_ops
Purpose: Manipulate files and directories
```yaml
- name: "organize_outputs"
  type: "file_ops"

  # Copy files
  operation: "copy"
  source:
    - "./workspace/src/*.py"
    - "./workspace/tests/*.py"
  destination: "./output/python_files/"

  # Alternative operations:

  # Move files
  # operation: "move"
  # source: "./temp/*"
  # destination: "./processed/"

  # Delete files
  # operation: "delete"
  # files: ["./temp/*.tmp", "./cache/*"]

  # Validate files exist
  # operation: "validate"
  # files:
  #   - path: "./config/app.yaml"
  #     must_exist: true
  #     min_size: 100
  #   - path: "./output/"
  #     must_be_dir: true

  # List files
  # operation: "list"
  # path: "./workspace"
  # pattern: "**/*.py"
  # recursive: true

  # Convert formats
  # operation: "convert"
  # source: "./data.csv"
  # destination: "./data.json"
  # format: "csv_to_json"
```
Supported Operations:
- copy: Duplicate files/directories
- move: Relocate files/directories
- delete: Remove files/directories
- validate: Check file properties
- list: Directory listing
- convert: Format conversion
Format Conversions:
csv_to_json, json_to_csv, yaml_to_json, json_to_yaml, xml_to_json, json_to_xml
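Writing the commented convert variant above out as a standalone step, a format conversion looks like this (paths are illustrative):

```yaml
# Convert a CSV export to JSON
- name: "metrics_to_json"
  type: "file_ops"
  operation: "convert"
  source: "./reports/metrics.csv"
  destination: "./reports/metrics.json"
  format: "csv_to_json"
```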
Codebase Query
Type: codebase_query
Purpose: Intelligent codebase analysis and file discovery
```yaml
- name: "analyze_project"
  type: "codebase_query"
  codebase_context: true    # Include project metadata
  queries:
    # Find specific file types
    source_files:
      find_files:
        - type: "source"
        - pattern: "lib/**/*.ex"
        - exclude_tests: true
        - modified_since: "2024-01-01"

    # Find test files related to source
    test_files:
      find_files:
        - related_to: "lib/user.ex"
        - type: "test"

    # Find functions in files
    public_functions:
      find_functions:
        - in_file: "lib/user.ex"
        - public_only: true
        - with_annotation: "@doc"

    # Analyze dependencies
    dependencies:
      find_dependencies:
        - for_file: "lib/user.ex"
        - include_transitive: false

    # Find dependent files
    dependents:
      find_dependents:
        - of_file: "lib/user.ex"
        - include_tests: true

    # Project information
    project_info:
      get_project_type: true
      get_dependencies: true
      get_git_status: true
  output_to_file: "codebase_analysis.json"
```
Query Types:
- find_files: Locate files by criteria
- find_functions: Extract function definitions
- find_dependencies: Analyze imports/requires
- find_dependents: Reverse dependency lookup
- get_project_type: Detect project framework
- get_git_status: Git repository information
Supported Languages:
- Elixir
- Python
- JavaScript/TypeScript
- Go
- Rust
State Management
Set Variable
Type: set_variable
Purpose: Manage workflow state and variables
```yaml
- name: "initialize_state"
  type: "set_variable"
  variables:
    # Static values
    iteration_count: 0
    max_retries: 3
    results: []

    # Computed values
    total_files: "{{length(steps.scan.files)}}"
    start_time: "{{now()}}"

    # Complex expressions
    threshold: "{{workflow.base_threshold * 1.5}}"
    is_production: "{{environment.mode == 'production'}}"
  scope: "global"    # Variable scope
```
Variable Scopes:
- global: Available to all steps
- local: Current step only
- session: Persists across runs
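A short sketch contrasting the global and session scopes; the step names and values are illustrative:

```yaml
# Global counter, visible to every later step in this run
- name: "init_counters"
  type: "set_variable"
  scope: "global"
  variables:
    error_count: 0

# Session-scoped value that persists across runs
- name: "remember_baseline"
  type: "set_variable"
  scope: "session"
  variables:
    baseline_score: "{{steps.test.score}}"
```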
Supported Operations:
- Arithmetic: +, -, *, /, %
- String concatenation
- Array operations
- Boolean logic
- Function calls
Checkpoint
Type: checkpoint
Purpose: Save workflow state for recovery
```yaml
- name: "save_progress"
  type: "checkpoint"
  state:
    completed_steps: "{{state.completed_steps}}"
    processed_items: "{{state.processed_items}}"
    current_phase: "implementation"
    metrics:
      files_processed: "{{state.file_count}}"
      errors_found: "{{state.error_count}}"
      completion_percentage: "{{state.progress}}"
  checkpoint_name: "phase_2_complete"
  include_workspace: true    # Save workspace files
  compress: true             # Compress checkpoint
```
Features:
- State persistence
- Workspace backup
- Compression support
- Automatic recovery
- Checkpoint management
Use Cases:
- Long-running workflows
- Failure recovery
- Progress tracking
- Partial execution
Step Type Selection Guide
| Task Type | Recommended Step Type | Key Consideration |
|---|---|---|
| Planning & Analysis | gemini | Best for reasoning and structure |
| Code Implementation | claude | Full tool access for file manipulation |
| Quick Analysis | claude_smart | Preset configurations |
| Long Conversations | claude_session | State persistence |
| Content Extraction | claude_extract | Structured output processing |
| Parallel Tasks | claude_batch or parallel_claude | Concurrent execution |
| Critical Operations | claude_robust | Error recovery |
| Structured Output | gemini_instructor | Schema validation |
| Modular Workflows | pipeline | Composition and reuse |
| Iteration | for_loop or while_loop | Collection processing |
| Branching | switch | Value-based routing |
| Data Manipulation | data_transform | JSONPath operations |
| File Management | file_ops | File system operations |
| Code Analysis | codebase_query | Language-aware analysis |
This reference provides comprehensive documentation for all available step types in Pipeline YAML v2 format.