Ragex Configuration Guide

This document covers all configuration options for Ragex, including embedding models, caching, and performance tuning.

Embedding Models

Ragex supports multiple embedding models for semantic code search. The model choice affects:

  • Quality: Accuracy of semantic search results
  • Speed: Embedding generation and search time
  • Memory: RAM required to load the model
  • Dimensions: Vector size (impacts storage and similarity computation)

Default Model

By default, Ragex uses all-MiniLM-L6-v2:

  • ✅ Fast inference (384 dimensions)
  • ✅ Small model size (~90MB)
  • ✅ Good quality for general-purpose search
  • ✅ Suitable for small to medium codebases

Configuration Methods

1. Via config/config.exs

import Config

# Set embedding model
config :ragex, :embedding_model, :all_minilm_l6_v2

# Available options:
# :all_minilm_l6_v2       (default)
# :all_mpnet_base_v2      (high quality)
# :codebert_base          (code-specific)
# :paraphrase_multilingual (multilingual)

2. Via Environment Variable

export RAGEX_EMBEDDING_MODEL=codebert_base
mix run --no-halt

This overrides config.exs settings.
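A hedged sketch of how that precedence might work internally — the resolution code below is illustrative, not Ragex's actual implementation:

```elixir
# Illustrative precedence: the environment variable, when set,
# wins over the value in config.exs.
model =
  case System.get_env("RAGEX_EMBEDDING_MODEL") do
    nil -> Application.get_env(:ragex, :embedding_model, :all_minilm_l6_v2)
    # to_existing_atom avoids growing the atom table from arbitrary input
    value -> String.to_existing_atom(value)
  end
```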

3. Checking Current Configuration

mix ragex.embeddings.migrate --check

Output example:

Checking embedding model status...

Configured Model: all-MiniLM-L6-v2
  ID: all_minilm_l6_v2
  Dimensions: 384
  Type: sentence_transformer
  Repository: sentence-transformers/all-MiniLM-L6-v2

No embeddings stored yet

Available Models:
  all_minilm_l6_v2 (current)
    all-MiniLM-L6-v2 - 384 dims
  all_mpnet_base_v2
    all-mpnet-base-v2 - 768 dims
  codebert_base
    CodeBERT Base - 768 dims
  paraphrase_multilingual
    paraphrase-multilingual-MiniLM-L12-v2 - 384 dims

Available Models

1. all-MiniLM-L6-v2 (Default)

Model ID: :all_minilm_l6_v2

Specifications:

  • Dimensions: 384
  • Max tokens: 256
  • Type: Sentence transformer
  • Model size: ~90MB

Best for:

  • ✅ General-purpose semantic search
  • ✅ Small to medium codebases (<10k entities)
  • ✅ Fast inference requirements
  • ✅ Limited memory environments

Performance:

  • Embedding generation: ~50ms per entity
  • Memory usage: ~400MB (model + runtime)
  • Quality: Good for most use cases

Configuration:

config :ragex, :embedding_model, :all_minilm_l6_v2

2. all-mpnet-base-v2 (High Quality)

Model ID: :all_mpnet_base_v2

Specifications:

  • Dimensions: 768
  • Max tokens: 384
  • Type: Sentence transformer
  • Model size: ~420MB

Best for:

  • ✅ Large codebases requiring high accuracy
  • ✅ Deep semantic understanding
  • ✅ When quality is more important than speed
  • ✅ Complex domain-specific terminology

Performance:

  • Embedding generation: ~100ms per entity
  • Memory usage: ~800MB (model + runtime)
  • Quality: Excellent semantic understanding

Trade-offs:

  • ⚠️ 2x slower than all-MiniLM-L6-v2
  • ⚠️ 2x more memory
  • ⚠️ 2x larger embeddings (storage)
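The storage trade-off is easy to quantify, assuming each dimension is stored as a 32-bit float (an assumption about the storage format; verify against your vector store):

```elixir
# Bytes per entity = dimensions × 4 (float32)
bytes_minilm = 384 * 4   # 1_536 bytes with all-MiniLM-L6-v2
bytes_mpnet  = 768 * 4   # 3_072 bytes with all-mpnet-base-v2

# The 768-dimensional model doubles per-entity storage
true = bytes_mpnet == 2 * bytes_minilm
```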

Configuration:

config :ragex, :embedding_model, :all_mpnet_base_v2

3. CodeBERT Base (Code-Specific)

Model ID: :codebert_base

Specifications:

  • Dimensions: 768
  • Max tokens: 512
  • Type: Code model
  • Model size: ~500MB

Best for:

  • ✅ Code similarity tasks
  • ✅ Programming-specific queries
  • ✅ Multi-language codebases
  • ✅ API discovery and documentation search

Performance:

  • Embedding generation: ~120ms per entity
  • Memory usage: ~900MB (model + runtime)
  • Quality: Optimized for code understanding

Special features:

  • Pre-trained on code and natural language
  • Better understanding of programming concepts
  • Good for finding similar code patterns

Configuration:

config :ragex, :embedding_model, :codebert_base

4. paraphrase-multilingual-MiniLM-L12-v2 (Multilingual)

Model ID: :paraphrase_multilingual

Specifications:

  • Dimensions: 384
  • Max tokens: 128
  • Type: Multilingual
  • Model size: ~110MB

Best for:

  • ✅ International teams
  • ✅ Non-English documentation
  • ✅ Multilingual codebases (50+ languages)
  • ✅ Mixed language comments/docs

Performance:

  • Embedding generation: ~60ms per entity
  • Memory usage: ~450MB (model + runtime)
  • Quality: Good for multilingual content

Supported languages: Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, Turkish, and 37+ more

Configuration:

config :ragex, :embedding_model, :paraphrase_multilingual

Model Selection Guide

Decision Tree

Do you have multilingual code/docs?
   YES → paraphrase_multilingual
   NO  → Continue...

Is your codebase primarily code-focused?
   YES → codebert_base
   NO  → Continue...

Do you need maximum quality?
   YES → all_mpnet_base_v2
   NO  → all_minilm_l6_v2 (default)

Use Case Recommendations

| Use Case                     | Recommended Model       | Why                            |
|------------------------------|-------------------------|--------------------------------|
| Startup/Small Project        | all_minilm_l6_v2        | Fast, lightweight, good enough |
| Enterprise/Large Codebase    | all_mpnet_base_v2       | Best quality, worth the cost   |
| Code-heavy (APIs, Libraries) | codebert_base           | Trained on code specifically   |
| International Team           | paraphrase_multilingual | Multi-language support         |
| Limited Memory (<4GB)        | all_minilm_l6_v2        | Smallest footprint             |
| Quality-Critical             | all_mpnet_base_v2       | Highest accuracy               |

Dimension Compatibility

Models with the same dimensions can share embeddings:

384-dimensional models (compatible):

  • all_minilm_l6_v2
  • paraphrase_multilingual

768-dimensional models (compatible):

  • all_mpnet_base_v2
  • codebert_base

You can switch between compatible models without regenerating embeddings!

AI Features Configuration

Ragex includes AI-powered features for enhanced code analysis (Phases A, B, C).

Master Switch

config :ragex, :ai,
  enabled: true,  # Master switch for all AI features
  default_provider: :deepseek_r1  # or :openai, :anthropic, :ollama

Feature Flags

config :ragex, :ai_features,
  # Phase B - High-Priority Features
  validation_error_explanation: true,
  refactor_preview_commentary: true,

  # Phase C - Analysis Features
  dead_code_refinement: true,
  duplication_semantic_analysis: true,
  dependency_insights: true

Feature-Specific Options

Each feature has optimized defaults for temperature and token limits:

| Feature                       | Temperature | Max Tokens | Cache TTL |
|-------------------------------|-------------|------------|-----------|
| Validation Error Explanation  | 0.3         | 300        | 7 days    |
| Refactor Preview Commentary   | 0.7         | 500        | 1 hour    |
| Dead Code Refinement          | 0.6         | 400        | 7 days    |
| Duplication Semantic Analysis | 0.5         | 600        | 3 days    |
| Dependency Insights           | 0.6         | 700        | 6 hours   |

Runtime Override

All features support runtime configuration overrides:

# Disable AI globally but enable for specific analysis
{:ok, dead} = DeadCode.find_unused_exports(ai_refine: true)

# Enable AI globally but disable for specific analysis
{:ok, clones} = Duplication.detect_in_files(files, ai_analyze: false)

# Override AI provider
{:ok, insights} = AIInsights.analyze_coupling(data, provider: :openai)

Graceful Degradation

  • All AI features are opt-in and disabled by default
  • When disabled or unavailable, features gracefully return original results
  • No failures or crashes if AI provider is unavailable
  • Cache reduces API calls by 40-60%

API Keys

AI features require API keys for external providers:

# DeepSeek (recommended)
export DEEPSEEK_API_KEY="sk-..."

# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-..."

# Ollama (local, no key needed)
export OLLAMA_HOST="http://localhost:11434"

Documentation

See phase completion documents for detailed usage:

  • stuff/phases/PHASE_A_AI_FEATURES_FOUNDATION.md
  • stuff/phases/PHASE_B_AI_FEATURES_COMPLETE.md
  • stuff/phases/PHASE_C_AI_ANALYSIS_COMPLETE.md

Auto-Analyze Directories

Overview

Ragex can automatically scan and index configured directories when the application starts. This is useful for:

  • Pre-loading commonly used projects into the knowledge graph
  • Ensuring fresh analysis on server startup
  • CI/CD pipelines that need indexed code immediately
  • Development workflows where you work on multiple codebases

Configuration

In config/config.exs:

# Auto-analyze directories on startup
config :ragex, :auto_analyze_dirs, [
  "/opt/Proyectos/MyProject",
  "/home/user/code/important-lib",
  Path.expand("~/workspace/api-server")  # Elixir does not expand ~; use Path.expand/1
]

Default: [] (no automatic analysis)

Behavior

  1. Startup Phase: Analysis runs during the :auto_analyze start phase
  2. Non-blocking: Server starts before analysis completes
  3. Logging: Progress logged to stderr (visible in logs)
  4. Error Handling: Failures logged as warnings; server continues
  5. Sequential: Directories are analyzed one at a time, in the configured order
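The behavior above could be wired roughly as follows; `Ragex.analyze_directory/1` and the surrounding callback are hypothetical names used only to illustrate the start-phase pattern:

```elixir
# Hypothetical sketch of the :auto_analyze start phase.
# Analysis runs inside a Task, so the server keeps starting
# while directories are scanned in order.
def start_phase(:auto_analyze, _type, _args) do
  dirs = Application.get_env(:ragex, :auto_analyze_dirs, [])

  Task.start(fn ->
    Enum.each(dirs, fn dir ->
      path = Path.expand(dir)

      case Ragex.analyze_directory(path) do
        {:ok, stats} ->
          Logger.info("Successfully analyzed #{path}: #{stats.files} files")

        {:error, reason} ->
          # Failures are logged as warnings; the loop continues
          Logger.warning("Failed to analyze #{path}: #{inspect(reason)}")
      end
    end)
  end)

  :ok
end
```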

Example Output

Auto-analyzing 2 configured directories...
Analyzing directory: /opt/Proyectos/MyProject
Successfully analyzed /opt/Proyectos/MyProject: 45 files (2 skipped, 0 errors)
Analyzing directory: /home/user/code/important-lib
Successfully analyzed /home/user/code/important-lib: 23 files (0 skipped, 0 errors)
Auto-analysis complete

Performance Considerations

  • Large codebases: Initial analysis can take 30-60 seconds per 1,000 files
  • Embeddings: Generation adds ~50ms per entity (enable caching to speed up subsequent starts)
  • Memory: Each analyzed file adds ~10KB to ETS tables
  • Recommendation: Limit to 3-5 active projects, use incremental updates
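To make those figures concrete, here is a back-of-envelope estimate for a 5,000-file project; the ~10 entities-per-file ratio is an illustrative assumption, not a measured Ragex statistic:

```elixir
files = 5_000
scan_seconds  = div(files, 1_000) * 45       # ≈ 225 s at ~45 s per 1,000 files
entities      = files * 10                   # assumed ~10 entities per file
embed_seconds = div(entities * 50, 1_000)    # ≈ 2_500 s at ~50 ms/entity, cold cache
```

With a warm embedding cache the second term largely disappears, which is why enabling the cache dominates restart time.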

Environment-Specific Configuration

Development:

# config/dev.exs
import Config

config :ragex, :auto_analyze_dirs, [
  ".",  # Current project
  "../shared-lib"  # Related library
]

Production:

# config/prod.exs
import Config

config :ragex, :auto_analyze_dirs, [
  "/app/src",  # Main application
  "/app/vendor/critical-deps"  # Important dependencies
]

CI/CD:

# config/ci.exs
import Config

# Analyze entire codebase for comprehensive checks
config :ragex, :auto_analyze_dirs, [
  System.get_env("CI_PROJECT_DIR", ".")
]

Disabling Auto-Analysis

Set to empty list:

config :ragex, :auto_analyze_dirs, []

Or override via environment:

export RAGEX_AUTO_ANALYZE="false"

Combining with File Watcher

Auto-analysis and file watching work together:

  1. Startup: Auto-analyze directories (full scan)
  2. Runtime: File watcher tracks changes (incremental updates)
  3. Result: Always up-to-date knowledge graph

Troubleshooting

Issue: "Failed to analyze directory"

Solutions:

  • Check path exists and is readable
  • Verify sufficient disk space for cache
  • Check file permissions
  • Look for syntax errors in source files

Issue: Startup takes too long

Solutions:

  • Reduce number of configured directories
  • Enable embedding cache (see Cache Configuration)
  • Use file watching for incremental updates instead
  • Consider analyzing on-demand via MCP tools

Cache Configuration

Enable/Disable Cache

config :ragex, :cache,
  enabled: true,  # Set to false to disable caching
  dir: Path.expand("~/.cache/ragex"),  # Cache directory
  max_age_days: 30  # Auto-cleanup after 30 days

Cache Location

Default: ~/.cache/ragex/embeddings/<project_hash>.ets

Custom location:

config :ragex, :cache,
  enabled: true,
  dir: "/custom/path/to/cache"

Cache Management Commands

# Show cache statistics
mix ragex.cache.stats

# Clear all caches
mix ragex.cache.clear

# Clear caches older than 7 days
mix ragex.cache.clear --older-than 7

Migration Guide

Switching Models

Scenario 1: Compatible Models (Same Dimensions)

Example: all_minilm_l6_v2 → paraphrase_multilingual (both 384 dims)

Steps:

  1. Update config/config.exs:

    config :ragex, :embedding_model, :paraphrase_multilingual
  2. Restart the server:

    # Kill existing process
    # Then restart
    mix run --no-halt
    
  3. ✅ Done! Existing embeddings still work.


Scenario 2: Incompatible Models (Different Dimensions)

Example: all_minilm_l6_v2 (384) → all_mpnet_base_v2 (768)

Steps:

  1. Check current status:

    mix ragex.embeddings.migrate --check
    
  2. Clear existing embeddings:

    # Stop the server
    # Embeddings are in-memory and will be cleared on restart
    
  3. Update config/config.exs:

    config :ragex, :embedding_model, :all_mpnet_base_v2
  4. Restart and re-analyze:

    mix run --no-halt
    # Then analyze your codebase via MCP tools
    

Using the Migration Tool

Check Status

mix ragex.embeddings.migrate --check

Plan Migration

mix ragex.embeddings.migrate --model codebert_base

This checks compatibility and provides instructions.

Force Migration

mix ragex.embeddings.migrate --model codebert_base --force

Performance Tuning

Memory Optimization

For systems with limited memory (<4GB):

  1. Use lightweight model:

    config :ragex, :embedding_model, :all_minilm_l6_v2
  2. Limit batch size (in Bumblebee adapter):

    compile: [batch_size: 16, sequence_length: 256]  # Reduce from 32
  3. Disable cache if needed:

    config :ragex, :cache, enabled: false
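In a Bumblebee-based adapter, those compile options are passed when building the embedding serving. A sketch using Bumblebee's public API (the exact wiring inside Ragex's adapter may differ):

```elixir
repo = {:hf, "sentence-transformers/all-MiniLM-L6-v2"}

{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)

# A smaller batch_size trades throughput for a lower peak memory footprint.
serving =
  Bumblebee.Text.text_embedding(model_info, tokenizer,
    compile: [batch_size: 16, sequence_length: 256],
    defn_options: [compiler: EXLA]
  )
```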

Speed Optimization

For faster embedding generation:

  1. Use faster model:

    config :ragex, :embedding_model, :all_minilm_l6_v2
  2. Reduce sequence length:

    compile: [batch_size: 32, sequence_length: 256]  # Reduce from 512
  3. Enable EXLA compiler (if not already):

    • Ensure exla dependency is included
    • First run will compile (slow), subsequent runs are fast
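Enabling EXLA is typically done through Nx's standard application config (this assumes an `exla` entry already exists in your mix.exs deps):

```elixir
# config/config.exs — run all Nx computations on the EXLA backend
import Config

config :nx, default_backend: EXLA.Backend
```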

Quality Optimization

For best search quality:

  1. Use high-quality model:

    config :ragex, :embedding_model, :all_mpnet_base_v2
  2. Generate embeddings for all entities:

    # In the analyze_file MCP tool (embeddings are always generated)
    {
      "generate_embeddings": true
    }
  3. Use longer text descriptions:

    • Include more context in function/module docs
    • Better descriptions = better embeddings

Environment-Specific Configuration

Development

# config/dev.exs
import Config

config :ragex, :embedding_model, :all_minilm_l6_v2  # Fast for dev
config :ragex, :cache, enabled: true  # Cache for quick restarts

Production

# config/prod.exs
import Config

config :ragex, :embedding_model, :all_mpnet_base_v2  # Quality for prod
config :ragex, :cache, enabled: true, max_age_days: 90  # Long cache

Testing

# config/test.exs
import Config

config :ragex, :embedding_model, :all_minilm_l6_v2  # Fast tests
config :ragex, :cache, enabled: false  # No cache for isolation

Troubleshooting

Model Won't Load

Symptom: "Failed to load Bumblebee model"

Solutions:

  1. Check internet connection (first download)
  2. Verify disk space (~500MB needed)
  3. Check cache directory permissions: ~/.cache/huggingface/
  4. Try clearing HuggingFace cache:
    rm -rf ~/.cache/huggingface/
    

Dimension Mismatch Error

Symptom: "Dimension mismatch: expected 384, got 768"

Solution:

mix ragex.embeddings.migrate --check
# Follow instructions to clear embeddings
# Then restart with new model

Out of Memory

Symptom: Server crashes or freezes during embedding generation

Solutions:

  1. Switch to smaller model:

    config :ragex, :embedding_model, :all_minilm_l6_v2
  2. Reduce batch size in code:

    compile: [batch_size: 8, sequence_length: 256]
  3. Increase system swap space


Advanced Configuration

Custom Model (Advanced)

To add a custom model, edit lib/ragex/embeddings/registry.ex:

custom_model: %{
  id: :custom_model,
  name: "Custom Model",
  repo: "organization/model-name",
  dimensions: 512,
  max_tokens: 256,
  description: "My custom embedding model",
  type: :sentence_transformer,
  recommended_for: ["custom use case"]
}

Then configure:

config :ragex, :embedding_model, :custom_model

Summary

Quick Start (Default):

# config/config.exs
config :ragex, :embedding_model, :all_minilm_l6_v2

For Best Quality:

config :ragex, :embedding_model, :all_mpnet_base_v2

For Code-Specific:

config :ragex, :embedding_model, :codebert_base

For Multilingual:

config :ragex, :embedding_model, :paraphrase_multilingual

Check Status Anytime:

mix ragex.embeddings.migrate --check
