Analyzer Plugin System

The Analyzer Plugin System provides a unified, extensible framework for code analysis in Metastatic. Write custom analysis rules once and apply them across all supported programming languages through the unified MetaAST representation.

Quick Start

alias Metastatic.{Document, Analysis.Runner}

# Create a document from a MetaAST (here: x + 5)
ast = {:binary_op, :arithmetic, :+, {:variable, "x"}, {:literal, :integer, 5}}
doc = Document.new(ast, :python)

# Run analyzers
{:ok, report} = Runner.run(doc)

# Check results
IO.puts("Found #{report.summary.total} issues")
Enum.each(report.issues, fn issue ->
  IO.puts("[#{issue.severity}] #{issue.message}")
end)

Core Concepts

Analyzer Behaviour

An analyzer is a module that implements the Metastatic.Analysis.Analyzer behaviour:

info/0 - Returns metadata about the analyzer
analyze/2 - Called for each AST node during traversal
run_before/1 (optional) - Called before traversal starts
run_after/2 (optional) - Called after traversal completes
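
To make the callback list concrete, here is a minimal sketch of what an analyzer module could look like. The module name, the fields returned by info/0, the (node, context) argument order, and the list-of-issues return value are assumptions made for illustration. The @behaviour line is commented out so the snippet compiles without Metastatic. Check the behaviour's own documentation for the actual callback contracts.

```elixir
defmodule LargeLiteralSketch do
  # Hypothetical analyzer, not part of Metastatic.
  # With the library available you would add:
  #   @behaviour Metastatic.Analysis.Analyzer

  # info/0: metadata about the analyzer (field names are assumed).
  def info do
    %{name: :large_literal, category: :style, description: "Flags large integer literals"}
  end

  # analyze/2: called once per node; assumed here to return a list of issues.
  def analyze({:literal, :integer, value} = node, _context) when value > 1000 do
    [
      %{
        analyzer: __MODULE__,
        category: :style,
        severity: :info,
        message: "Large integer literal #{value}",
        node: node
      }
    ]
  end

  # Every other node produces no issues.
  def analyze(_node, _context), do: []
end

IO.inspect(LargeLiteralSketch.analyze({:literal, :integer, 5000}, %{}))
```

The catch-all clause matters: analyze/2 sees every node in the tree, so an analyzer needs a cheap "nothing to report" path for nodes it does not care about.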

Registry

The Metastatic.Analysis.Registry manages analyzer discovery and configuration:

alias Metastatic.Analysis.Registry

# Register an analyzer
Registry.register(MyCustomAnalyzer)

# List all registered analyzers
Registry.list_all()

# List by category
Registry.list_by_category(:correctness)

# Configure an analyzer
Registry.configure(MyCustomAnalyzer, %{threshold: 10})

Runner

The Metastatic.Analysis.Runner executes analyzers on documents:

alias Metastatic.Analysis.{Runner, UnusedVariables, SimplifyConditional}

# Run all registered analyzers
{:ok, report} = Runner.run(doc)

# Run specific analyzers
{:ok, report} = Runner.run(doc, analyzers: [UnusedVariables, SimplifyConditional])

# With configuration
{:ok, report} = Runner.run(doc,
  analyzers: :all,
  config: %{
    nesting_depth: %{max_depth: 4},
    unused_variables: %{ignore_prefix: "_"}
  }
)

Using Built-in Analyzers

Business-logic Analyzers

1. Pure MetaAST (9 analyzers)

Language-agnostic patterns using only M2.1/M2.2 constructs:

CallbackHell - Detects deeply nested conditionals (callback hell pattern)
MissingErrorHandling - Pattern matching without error case handling
SilentErrorCase - Conditionals with only success path
SwallowingException - Exception handling without logging
HardcodedValue - Hardcoded URLs/IPs in string literals
NPlusOneQuery - Database queries in collection operations
InefficientFilter - Fetch-all then filter anti-pattern
UnmanagedTask - Unsupervised async operations
TelemetryInRecursiveFunction - Metrics emission in recursive functions

2. Function Name Heuristics (4 analyzers)

Detection based on function name patterns:

MissingTelemetryForExternalHttp - HTTP calls without telemetry/observability
SyncOverAsync - Blocking operations in async contexts
DirectStructUpdate - Struct/object updates bypassing validation
MissingHandleAsync - Fire-and-forget async operations without supervision

3. Naming Conventions (4 analyzers)

Detection based on function/module naming conventions:

BlockingInPlug - Blocking I/O operations in HTTP middleware
MissingTelemetryInAuthPlug - Authentication/authorization without audit logging
MissingTelemetryInLiveviewMount - Component lifecycle methods without telemetry
MissingTelemetryInObanWorker - Background job processing without metrics

4. Content Analysis (3 analyzers)

Pattern detection through string content analysis:

MissingPreload - Database queries without eager loading (N+1 risk)
InlineJavascript - Inline executable code in strings (XSS vulnerability)
MissingThrottle - Expensive operations without rate limiting

Generic Analyzers

SimplifyConditional

Suggests simplification of redundant conditionals:

# Detects patterns like:
# if x then true else false  →  x
# if x then false else true  →  not x

{:ok, report} = Runner.run(doc, analyzers: [SimplifyConditional])

Configuration: None
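
The two rewrites above can be sketched as plain pattern matching. The node shapes used here ({:conditional, condition, then, else}, {:literal, :boolean, value}, and {:unary_op, :boolean, :not, operand}) are hypothetical stand-ins, not confirmed MetaAST definitions, and this is not the analyzer's actual implementation.

```elixir
defmodule SimplifySketch do
  # if x then true else false  ->  x
  def simplify({:conditional, condition, {:literal, :boolean, true}, {:literal, :boolean, false}}) do
    {:ok, condition}
  end

  # if x then false else true  ->  not x
  def simplify({:conditional, condition, {:literal, :boolean, false}, {:literal, :boolean, true}}) do
    {:ok, {:unary_op, :boolean, :not, condition}}
  end

  # Anything else is left alone.
  def simplify(_node), do: :unchanged
end

redundant =
  {:conditional, {:variable, "x"}, {:literal, :boolean, true}, {:literal, :boolean, false}}

IO.inspect(SimplifySketch.simplify(redundant))
```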

DeadCodeAnalyzer

Detects unreachable and dead code:

# Detects:
# - Code after return statements
# - Branches in constant conditionals

{:ok, report} = Runner.run(doc,
  analyzers: [DeadCodeAnalyzer],
  config: %{dead_code: [min_confidence: :high]}
)

Configuration:

:min_confidence - Minimum confidence required to report a finding: :low (report everything), :medium, or :high (only definite dead code)
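
One way to picture the threshold is as an ordering over the three levels. The sketch below assumes each finding carries a :confidence field, which is a made-up detail for illustration; the source does not say where the analyzer stores confidence.

```elixir
defmodule ConfidenceSketch do
  # Ordering of the three levels named in the configuration.
  @rank %{low: 0, medium: 1, high: 2}

  # Keep findings whose confidence is at least `min`.
  def filter(findings, min) do
    threshold = Map.fetch!(@rank, min)
    Enum.filter(findings, fn finding -> Map.fetch!(@rank, finding.confidence) >= threshold end)
  end
end

findings = [
  %{message: "code after return", confidence: :high},
  %{message: "branch looks constant", confidence: :medium},
  %{message: "branch may be unreachable", confidence: :low}
]

IO.inspect(ConfidenceSketch.filter(findings, :medium))
```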

NestingDepth

Detects excessive nesting depth:

# Warns when nesting exceeds thresholds

{:ok, report} = Runner.run(doc,
  analyzers: [NestingDepth],
  config: %{nesting_depth: [max_depth: 4, warn_threshold: 3]}
)

Configuration:

:max_depth - Maximum allowed depth (default: 5)
:warn_threshold - Warning threshold (default: 4)
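
The interaction between the two options can be sketched as follows. This is an illustration only: it counts nested conditionals using a hypothetical {:conditional, condition, then, else} node shape, and it assumes a strict "greater than" comparison against each threshold. The real analyzer may count other constructs and compare differently.

```elixir
defmodule DepthSketch do
  # Depth of nested conditionals; any other node counts as depth 0.
  def depth({:conditional, _condition, then_branch, else_branch}) do
    1 + max(depth(then_branch), depth(else_branch))
  end

  def depth(_other), do: 0

  # Classify a tree against the two thresholds (defaults taken from the text above).
  def classify(ast, opts \\ []) do
    max_depth = Keyword.get(opts, :max_depth, 5)
    warn_threshold = Keyword.get(opts, :warn_threshold, 4)
    found = depth(ast)

    cond do
      found > max_depth -> {:exceeds_max, found}
      found > warn_threshold -> {:warn, found}
      true -> {:ok, found}
    end
  end
end

leaf = {:literal, :integer, 1}
test = {:variable, "x"}
nested = {:conditional, test, {:conditional, test, {:conditional, test, leaf, leaf}, leaf}, leaf}

IO.inspect(DepthSketch.classify(nested, max_depth: 4, warn_threshold: 2))
```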

UnusedVariables

Detects unused variables:

# Finds assigned but never used variables

{:ok, report} = Runner.run(doc,
  analyzers: [UnusedVariables],
  config: %{unused_variables: [ignore_prefix: "_"]}
)

Configuration:

:ignore_underscore - Ignore variables starting with underscore (default: true)

Understanding Reports

The runner returns a report with:

{:ok, report} = Runner.run(doc)

# Report structure:
%{
  document: Document.t(),           # The analyzed document
  analyzers_run: [module()],        # Which analyzers ran
  issues: [Analyzer.issue()],       # All issues found
  summary: %{                       # Aggregated statistics
    total: integer(),
    by_severity: %{atom() => integer()},
    by_category: %{atom() => integer()},
    by_analyzer: %{atom() => integer()}
  },
  timing: %{total_ms: float()} | nil # Performance info
}

Issue Structure

Each issue contains:

%{
  analyzer: module(),               # Which analyzer found it
  category: atom(),                 # Category (:correctness, :style, etc.)
  severity: atom(),                 # :error, :warning, :info, :refactoring_opportunity
  message: String.t(),              # Human-readable message
  node: Metastatic.AST.meta_ast(),  # The problematic node
  location: %{                      # Location info
    line: non_neg_integer() | nil,
    column: non_neg_integer() | nil,
    path: Path.t() | nil
  },
  suggestion: %{                    # Optional refactoring suggestion
    type: :replace | :remove | :insert_before | :insert_after,
    replacement: Metastatic.AST.meta_ast() | nil,
    message: String.t()
  } | nil,
  metadata: map()                   # Analyzer-specific data
}
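
The summary counts in a report relate to the issue list in a straightforward way. The sketch below rebuilds a summary-shaped map from hand-written issue maps using only the standard library. The sample issues are invented, and this is not necessarily how the runner computes its summary.

```elixir
# Invented sample issues carrying only the fields needed for counting.
issues = [
  %{analyzer: UnusedVariables, category: :correctness, severity: :warning},
  %{analyzer: NestingDepth, category: :style, severity: :warning},
  %{analyzer: SimplifyConditional, category: :style, severity: :refactoring_opportunity}
]

summary = %{
  total: length(issues),
  by_severity: Enum.frequencies_by(issues, & &1.severity),
  by_category: Enum.frequencies_by(issues, & &1.category),
  by_analyzer: Enum.frequencies_by(issues, & &1.analyzer)
}

IO.inspect(summary)
```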

Common Patterns

Filter by Severity

errors = Enum.filter(report.issues, &(&1.severity == :error))
warnings = Enum.filter(report.issues, &(&1.severity == :warning))
refactoring = Enum.filter(report.issues, &(&1.severity == :refactoring_opportunity))

Filter by Category

correctness_issues = Enum.filter(report.issues, &(&1.category == :correctness))
style_issues = Enum.filter(report.issues, &(&1.category == :style))

Group by Analyzer

by_analyzer = Enum.group_by(report.issues, & &1.analyzer)

Enum.each(by_analyzer, fn {analyzer, issues} ->
  # inspect/1 prints the module name without the "Elixir." prefix
  IO.puts("#{inspect(analyzer)}: #{length(issues)} issues")
end)

Get Refactoring Suggestions

refactorings =
  Enum.filter(report.issues, fn issue ->
    issue.severity == :refactoring_opportunity and issue.suggestion != nil
  end)

Enum.each(refactorings, fn issue ->
  IO.puts(issue.message)
  IO.puts("Suggestion: #{issue.suggestion.message}")
end)

Track Timing

{:ok, report} = Runner.run(doc, track_timing: true)

if report.timing do
  IO.puts("Analysis took #{report.timing.total_ms}ms")
end

Application Configuration

Configure analyzers at the application level:

# config/config.exs
config :metastatic, :analyzers,
  auto_register: [
    Metastatic.Analysis.UnusedVariables,
    Metastatic.Analysis.SimplifyConditional,
    Metastatic.Analysis.DeadCodeAnalyzer,
    Metastatic.Analysis.NestingDepth
  ],
  disabled: [],  # Disable specific analyzers
  config: %{
    unused_variables: %{ignore_prefix: "_"},
    nesting_depth: %{max_depth: 4},
    dead_code: %{min_confidence: :high}
  }

Then use:

# Runs all registered analyzers with configured settings
{:ok, report} = Runner.run(doc)

Best Practices

Use Specific Analyzers When Possible - Running fewer analyzers is faster
Configure Thresholds - Adjust defaults to match your project standards
Process Issues by Severity - Handle errors first, then warnings, then info
Cache Registry Lookups - Query the registry once and reuse the result instead of repeating the lookup
Use run_before/1 for Expensive Setup - Do heavy computation once before traversal rather than for every node in analyze/2
Combine with Other Tools - Use with formatter and type checker
Review Suggestions - Don't blindly apply refactoring suggestions
Monitor Performance - Use track_timing: true for performance-critical code
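
Processing issues by severity can be as simple as sorting with an explicit rank. The ranking below, in particular placing :refactoring_opportunity last, is an assumption for this sketch, and the sample issues are invented.

```elixir
# Lower rank is handled first.
severity_rank = %{error: 0, warning: 1, info: 2, refactoring_opportunity: 3}

issues = [
  %{severity: :info, message: "consider renaming"},
  %{severity: :error, message: "unreachable code"},
  %{severity: :refactoring_opportunity, message: "redundant conditional"},
  %{severity: :warning, message: "unused variable"}
]

# Enum.sort_by/2 is stable, so issues of equal severity keep their original order.
sorted = Enum.sort_by(issues, fn issue -> Map.fetch!(severity_rank, issue.severity) end)

Enum.each(sorted, fn issue -> IO.puts("[#{issue.severity}] #{issue.message}") end)
```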

Performance Considerations

Single-pass traversal: Multiple analyzers run in a single AST traversal
Lazy evaluation: Analysis only runs when explicitly called
Issue limit: Stop analysis early with the max_issues option
Language agnostic: Same analyzers work across all languages
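
The single-pass idea can be illustrated without the library: walk the tree once and hand every node to every check, instead of walking once per check. This is a simplified sketch and not the runner's implementation. Checks are plain anonymous functions here, and every tuple is treated as a node.

```elixir
defmodule SinglePassSketch do
  # Walks the tree once and hands every node to every check.
  def run(ast, checks), do: ast |> walk(checks, []) |> Enum.reverse()

  # Tuples are treated as nodes: run all checks, then descend into the elements.
  defp walk(node, checks, acc) when is_tuple(node) do
    acc = Enum.reduce(checks, acc, fn check, found -> Enum.reverse(check.(node)) ++ found end)
    node |> Tuple.to_list() |> Enum.reduce(acc, fn child, found -> walk(child, checks, found) end)
  end

  # Lists of children are walked element by element.
  defp walk(nodes, checks, acc) when is_list(nodes) do
    Enum.reduce(nodes, acc, fn child, found -> walk(child, checks, found) end)
  end

  # Atoms, strings, and numbers are leaves.
  defp walk(_leaf, _checks, acc), do: acc
end

ast = {:binary_op, :arithmetic, :+, {:variable, "x"}, {:literal, :integer, 5}}

find_variables = fn
  {:variable, name} -> ["variable #{name}"]
  _other -> []
end

find_integers = fn
  {:literal, :integer, value} -> ["integer #{value}"]
  _other -> []
end

findings = SinglePassSketch.run(ast, [find_variables, find_integers])
IO.inspect(findings)
```

Adding a third check costs one more function call per node, not another full traversal, which is why running several analyzers together is cheaper than running them one at a time.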

Troubleshooting

Analyzer Not Found

# Register it first
Registry.register(MyAnalyzer)

# Or pass explicitly
Runner.run(doc, analyzers: [MyAnalyzer])

No Issues Found

Verify the AST contains the patterns the analyzer looks for
Check analyzer configuration
Run with only that analyzer passed explicitly to confirm it actually executes

Performance Issues

Reduce number of analyzers
Use max_issues to stop early
Profile with track_timing: true
Consider language-specific analyzers if available
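
Stopping early on an issue limit can be pictured with lazy streams: once enough issues have been collected, no further nodes are examined. The sketch below is a generic illustration of that idea, not how Metastatic implements max_issues. The node list and the check function are invented.

```elixir
defmodule EarlyStopSketch do
  # Lazily apply `check` to each node and stop once `max_issues` are collected.
  def collect(nodes, check, max_issues) do
    nodes
    |> Stream.flat_map(check)
    |> Enum.take(max_issues)
  end
end

# A large range stands in for a big list of nodes; every even one "has an issue".
check = fn n -> if rem(n, 2) == 0, do: ["issue at node #{n}"], else: [] end

IO.inspect(EarlyStopSketch.collect(1..1_000_000, check, 3))
```

Because Stream.flat_map/2 is lazy and Enum.take/2 halts after the requested count, only the first few nodes are visited.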

Next Steps

See CUSTOM_ANALYZER_GUIDE.md to create your own analyzers
See BUILTIN_ANALYZERS.md for detailed reference
Check examples/ directory for working code samples
