Vaultx.Cache.L3 (Vaultx v0.7.0)


L3 persistent cache implementation with optional encryption at rest.

This module provides persistent caching with optional AES-256-GCM encryption for sensitive data. It is designed for long-term storage of cache entries that must survive application restarts.

Architecture

L3 cache operates as the persistent layer in VaultX's multi-tier caching system:

  • L1 (Memory) → L2 (Distributed) → L3 (Persistent)
  • Provides durability and long-term storage
  • Automatic promotion to higher cache layers on access
  • Intelligent cleanup and maintenance
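
The lookup order and promotion described above can be illustrated with a minimal sketch. It assumes each tier is a plain map, ordered fastest first; the module name TieredLookupSketch and this representation are hypothetical and are not Vaultx's internal API.

```elixir
defmodule TieredLookupSketch do
  @moduledoc """
  Illustrative only: each tier is a plain map, ordered fastest first
  (L1, L2, L3). This is not how Vaultx stores its tiers.
  """

  @doc """
  Looks `key` up tier by tier. On a hit in a lower tier, copies the value
  into every tier above it and returns the updated tiers.
  """
  def get(tiers, key) when is_list(tiers) do
    case Enum.find_index(tiers, &Map.has_key?(&1, key)) do
      nil ->
        {:miss, tiers}

      hit_index ->
        value = tiers |> Enum.at(hit_index) |> Map.fetch!(key)

        # Promote: write the value into all tiers faster than the one that hit.
        promoted =
          tiers
          |> Enum.with_index()
          |> Enum.map(fn
            {tier, index} when index < hit_index -> Map.put(tier, key, value)
            {tier, _index} -> tier
          end)

        {:ok, value, promoted}
    end
  end
end
```

A hit in the third map (standing in for L3) returns the value together with tiers in which the first two maps now hold it as well, so the next lookup is served from the fastest tier.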

Features

Core Functionality

  • File-based persistent storage with atomic write operations
  • Configurable TTL with automatic expiration handling
  • Pattern-based cache clearing for bulk operations
  • Automatic cleanup of expired entries
  • Cross-platform compatibility with proper file handling
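
One common way to obtain atomic writes is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that technique under the assumption that this is roughly what "atomic write operations" refers to; AtomicWriteSketch is a hypothetical name, and the module's actual implementation may differ.

```elixir
defmodule AtomicWriteSketch do
  @doc """
  Writes `binary` to `path` via a temporary file in the same directory.
  Readers see either the old content or the new content, never a partial file,
  because a rename within one file system replaces the target in a single step
  on POSIX systems.
  """
  def write(path, binary) when is_binary(binary) do
    suffix = Integer.to_string(System.unique_integer([:positive]))
    tmp_path = path <> ".tmp-" <> suffix

    with :ok <- File.write(tmp_path, binary),
         :ok <- File.rename(tmp_path, path) do
      :ok
    else
      {:error, reason} ->
        # Best-effort removal of a partially written temporary file.
        _ = File.rm(tmp_path)
        {:error, reason}
    end
  end
end
```

Keeping the temporary file next to the target matters: a rename across file systems is not atomic.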

Security & Encryption

  • AES-256-GCM encryption for sensitive cache data
  • Secure key management with multiple key sources
  • File permission hardening (0600 for key files)
  • Cryptographic integrity verification
  • Graceful degradation on encryption failures

Performance & Reliability

  • Atomic file operations prevent corruption
  • Efficient file naming with collision avoidance
  • Memory-efficient streaming for large cache entries
  • Concurrent access safety through file locking patterns
  • Automatic error recovery and logging

Encryption Design

Cryptographic Specifications

  • Algorithm: AES-256-GCM (Galois/Counter Mode)
  • Key Size: 256 bits (32 bytes)
  • IV Size: 96 bits (12 bytes) - optimal for GCM
  • Tag Size: 128 bits (16 bytes) - authentication tag
  • Key Derivation: Direct key usage (no PBKDF2 needed for generated keys)
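
The parameters above map directly onto Erlang's :crypto.crypto_one_time_aead/7 (OTP 22 or later). The following is a minimal sketch, not the module's actual code: GcmSketch is a hypothetical name, and the on-disk layout (IV, then tag, then ciphertext) and the empty additional authenticated data are assumptions made for illustration.

```elixir
defmodule GcmSketch do
  @iv_bytes 12
  @tag_bytes 16
  # Assumption: no additional authenticated data is bound to the entry.
  @aad ""

  @doc "Encrypts `plaintext` with a 32-byte key; returns iv <> tag <> ciphertext."
  def encrypt(plaintext, key) when is_binary(plaintext) and byte_size(key) == 32 do
    # A fresh random IV for every encryption; an IV must never repeat under one key.
    iv = :crypto.strong_rand_bytes(@iv_bytes)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, plaintext, @aad, @tag_bytes, true)

    iv <> tag <> ciphertext
  end

  @doc "Decrypts a payload produced by encrypt/2, verifying the authentication tag."
  def decrypt(
        <<iv::binary-size(@iv_bytes), tag::binary-size(@tag_bytes), ciphertext::binary>>,
        key
      )
      when byte_size(key) == 32 do
    case :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, @aad, tag, false) do
      # :error means the tag did not verify: wrong key or modified data.
      :error -> {:error, :decryption_failed}
      plaintext -> {:ok, plaintext}
    end
  end

  def decrypt(_too_short, _key), do: {:error, :invalid_payload}
end
```

With this layout every entry carries 28 bytes of overhead (12-byte IV plus 16-byte tag), and any modification of the stored bytes makes decryption fail instead of returning altered data.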

Key Management Strategy

The encryption key is sourced in the following priority order:

  1. Environment Variable (VAULTX_L3_ENCRYPTION_KEY)

    • Base64-encoded 256-bit key
    • Recommended for containerized deployments
    • Example: export VAULTX_L3_ENCRYPTION_KEY="$(openssl rand -base64 32)"
  2. Persistent Key File (.encryption_key in storage directory)

    • Automatically generated on first use
    • Stored with restrictive permissions (0600)
    • Survives application restarts
    • Base64-encoded for safe storage
  3. Fallback Generation (temporary, not recommended for production)

    • Generated in-memory if file operations fail
    • Will cause data loss on restart
    • Logged as a warning
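
The priority order above can be sketched as a chain of fallbacks. This is an illustration under stated assumptions: KeySourceSketch and its return shape are hypothetical, and treating an undecodable environment value as "not set" is a choice made here, not documented behavior.

```elixir
defmodule KeySourceSketch do
  @env_var "VAULTX_L3_ENCRYPTION_KEY"
  @key_file ".encryption_key"

  @doc """
  Returns {:ok, key, source}, where source is :env, :file, or :ephemeral.
  Each step is tried only if the previous one yielded :error.
  """
  def resolve(storage_dir) do
    with :error <- from_env(),
         :error <- from_file(storage_dir),
         :error <- generate_file(storage_dir) do
      # Last resort: an in-memory key that is lost on restart.
      {:ok, :crypto.strong_rand_bytes(32), :ephemeral}
    end
  end

  defp from_env do
    case System.get_env(@env_var) do
      nil -> :error
      encoded -> decode(encoded, :env)
    end
  end

  defp from_file(dir) do
    case File.read(Path.join(dir, @key_file)) do
      {:ok, encoded} -> decode(encoded, :file)
      {:error, _reason} -> :error
    end
  end

  defp generate_file(dir) do
    key = :crypto.strong_rand_bytes(32)
    path = Path.join(dir, @key_file)

    # Note: the file briefly exists with default permissions before chmod runs.
    with :ok <- File.write(path, Base.encode64(key)),
         :ok <- File.chmod(path, 0o600) do
      {:ok, key, :file}
    else
      _ -> :error
    end
  end

  # Accepts only Base64 that decodes to exactly 32 bytes (256 bits).
  defp decode(encoded, source) do
    case Base.decode64(String.trim(encoded)) do
      {:ok, <<key::binary-size(32)>>} -> {:ok, key, source}
      _ -> :error
    end
  end
end
```

A caller would log a warning when the source is :ephemeral, since entries encrypted with that key cannot be read after a restart.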

Security Considerations

Strengths

  • Authenticated encryption prevents tampering
  • Unique IV per encryption avoids IV reuse, which would otherwise break GCM's confidentiality and integrity guarantees
  • Secure key storage with proper file permissions
  • No key derivation overhead for performance
  • Cryptographically secure random IV generation

Limitations & Mitigations

  • Key in memory: Required for operation, cleared on process termination
  • No key rotation: Future enhancement, current keys remain valid
  • File system security: Relies on OS-level file permissions
  • Backup considerations: Encrypted cache files require key backup

Production Recommendations

  • Use environment variables for key management in containers
  • Implement regular key rotation policies (manual process currently)
  • Monitor key file permissions and access
  • Include encryption keys in backup/disaster recovery procedures
  • Consider using external key management systems for high-security environments

Configuration

# Application configuration
config :vaultx,
  cache_l3_enabled: true,
  cache_l3_storage_path: "/var/cache/vaultx",
  cache_l3_ttl_default: 86_400_000,  # 24 hours in milliseconds
  cache_l3_cleanup_interval: 3_600_000,  # 1 hour in milliseconds
  cache_l3_encryption: true

# Environment variables (override application config)
export VAULTX_CACHE_L3_ENABLED=true
export VAULTX_CACHE_L3_STORAGE_PATH="/secure/cache/path"
export VAULTX_CACHE_L3_ENCRYPTION=true
export VAULTX_L3_ENCRYPTION_KEY="$(openssl rand -base64 32)"
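
To make the precedence concrete, here is a small sketch of "environment overrides application config" for the storage path. ConfigOverrideSketch is a hypothetical helper; how Vaultx actually resolves and parses these settings is not shown in this document.

```elixir
defmodule ConfigOverrideSketch do
  @env_var "VAULTX_CACHE_L3_STORAGE_PATH"

  @doc """
  Returns the storage path from `env` when the variable is set there,
  otherwise the value from the application config keyword list.
  `env` defaults to the real process environment.
  """
  def storage_path(app_config, env \\ System.get_env()) do
    Map.get(env, @env_var) || Keyword.fetch!(app_config, :cache_l3_storage_path)
  end
end
```

Passing the environment as a map keeps the function easy to exercise without mutating the real process environment.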

Usage Examples

# L3 cache is typically managed by the cache manager
# Direct usage is not recommended in normal operations

# Basic operations (via cache manager)
{:ok, value} = Vaultx.Cache.get("persistent_key")
:ok = Vaultx.Cache.put("persistent_key", value, ttl: :timer.hours(24))

# Direct L3 operations (advanced usage)
{:ok, value} = Vaultx.Cache.L3.get("cache_key")
:ok = Vaultx.Cache.L3.put("cache_key", value, ttl: :timer.hours(1))
:ok = Vaultx.Cache.L3.delete("cache_key")

# Bulk operations
:ok = Vaultx.Cache.L3.clear("pattern:*")
:ok = Vaultx.Cache.L3.clear(:all)
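
The exact pattern syntax accepted by clear/1 is not specified here. Assuming a simple glob in which * matches any run of characters, key selection could look like the following sketch (PatternSketch is a hypothetical name):

```elixir
defmodule PatternSketch do
  @doc "Compiles a glob in which `*` matches any run of characters."
  def compile(glob) when is_binary(glob) do
    source =
      glob
      |> String.split("*")
      # Escape the literal parts so characters such as "." are not treated as regex syntax.
      |> Enum.map(&Regex.escape/1)
      |> Enum.join(".*")

    # Anchor both ends so the whole key must match, not just a substring.
    Regex.compile!("\\A" <> source <> "\\z")
  end

  @doc "Returns the keys selected by `pattern`, which is :all or a glob string."
  def matching_keys(keys, :all), do: keys

  def matching_keys(keys, glob) when is_binary(glob) do
    regex = compile(glob)
    Enum.filter(keys, &Regex.match?(regex, &1))
  end
end
```

Under this assumption, "pattern:*" selects "pattern:a" but not "xpattern:a", because the match is anchored at the start of the key.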

File Structure

/var/cache/vaultx/
├── .encryption_key                  # Encryption key (if file-based)
├── cache_abc123def_a1b2c3d4.cache   # Cache files with collision avoidance
├── cache_xyz789ghi_e5f6g7h8.cache
└── ...
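
The naming scheme behind these files is not documented here. One plausible reading, shown purely as a guess, is a readable fragment of the key followed by a short hash suffix that keeps distinct keys from colliding. FileNameSketch and the choice of SHA-256 are assumptions.

```elixir
defmodule FileNameSketch do
  @doc """
  Builds a file name from a cache key: a sanitized, truncated copy of the key
  for readability, plus the first 8 hex characters of its SHA-256 digest so that
  keys with the same sanitized form still get different names.
  """
  def file_name(key) when is_binary(key) do
    readable =
      key
      |> String.replace(~r/[^A-Za-z0-9]/, "")
      |> String.slice(0, 9)

    digest =
      :crypto.hash(:sha256, key)
      |> Base.encode16(case: :lower)
      |> binary_part(0, 8)

    "cache_#{readable}_#{digest}.cache"
  end
end
```

An 8-character suffix makes collisions unlikely rather than impossible; a real implementation would still need to verify the stored key on read.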

Performance Characteristics

  • Read latency: ~10-50ms (disk I/O dependent)
  • Write latency: ~20-100ms (disk I/O dependent)
  • Encryption overhead: ~1-5ms additional latency
  • Throughput: Limited by disk I/O performance
  • Storage: Limited by available disk space

Monitoring & Maintenance

L3 cache provides comprehensive monitoring through the metrics system:

  • Cache hit/miss ratios
  • Storage space utilization
  • Encryption/decryption performance
  • Cleanup operation statistics
  • Error rates and recovery actions

Error Handling

The L3 cache implements graceful error handling:

  • Encryption failures: Fall back to unencrypted storage with warnings
  • File system errors: Retry with exponential backoff
  • Corruption detection: Automatic cleanup of corrupted entries
  • Permission issues: Clear error messages and recovery suggestions
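
Retry with exponential backoff, as mentioned for file system errors, can be sketched as follows. RetrySketch, the default attempt count, and the initial delay are illustrative assumptions, not values taken from the module.

```elixir
defmodule RetrySketch do
  @doc """
  Calls `fun` up to `attempts` times. After each {:error, _} result it sleeps
  and doubles the delay before trying again. Any other result, or the final
  error once attempts are exhausted, is returned unchanged.
  """
  def with_backoff(fun, attempts \\ 3, delay_ms \\ 50)
      when is_function(fun, 0) and attempts >= 1 do
    case fun.() do
      {:error, _reason} when attempts > 1 ->
        Process.sleep(delay_ms)
        with_backoff(fun, attempts - 1, delay_ms * 2)

      result ->
        result
    end
  end
end
```

For example, RetrySketch.with_backoff(fn -> File.read(path) end) would retry a failing read twice, waiting 50 ms and then 100 ms, before returning the last error.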

Use Cases & Best Practices

Ideal Use Cases

  • Long-term secret caching: Store frequently accessed secrets with 24+ hour TTL
  • Offline resilience: Maintain critical data availability during network outages
  • Cross-restart persistence: Preserve expensive computations across deployments
  • Backup cache layer: Fallback when distributed caches are unavailable
  • Development environments: Reduce API calls during development and testing

Performance Optimization

  • Batch operations: Use pattern-based clearing for bulk cache invalidation
  • TTL tuning: Set longer TTLs for stable data, shorter for dynamic content
  • Storage location: Use fast SSDs for cache storage when possible
  • Cleanup scheduling: Adjust cleanup intervals based on cache churn rate
  • Encryption trade-offs: Disable encryption for non-sensitive data to improve performance

Security Best Practices

  • Key management: Use environment variables in production, never hardcode keys
  • File permissions: Ensure cache directory has restrictive permissions (0700)
  • Regular audits: Monitor cache access patterns for anomalies
  • Key rotation: Implement periodic key rotation (manual process currently)
  • Backup strategy: Include encryption keys in disaster recovery procedures
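
The permission recommendations above (0700 for the cache directory, 0600 for the key file) can be checked from Elixir with File.stat/1. This is a minimal sketch for Unix-like systems; PermissionSketch is a hypothetical helper, and the mode bits are not meaningful on Windows.

```elixir
defmodule PermissionSketch do
  import Bitwise

  @doc "Returns {:ok, mode} with only the permission bits, e.g. 0o700."
  def mode(path) do
    with {:ok, %File.Stat{mode: mode}} <- File.stat(path) do
      # Mask off the file-type bits and keep owner/group/other permissions.
      {:ok, band(mode, 0o777)}
    end
  end

  @doc "True when neither group nor others have any access to `path`."
  def private?(path) do
    match?({:ok, m} when band(m, 0o077) == 0, mode(path))
  end
end
```

Both 0700 and 0600 satisfy private?/1, so the same check covers the cache directory and the key file.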

Operational Considerations

  • Disk space monitoring: Set up alerts for cache directory disk usage
  • Performance monitoring: Track cache hit rates and response times
  • Log analysis: Monitor encryption/decryption errors and file system issues
  • Cleanup verification: Ensure expired entries are properly removed
  • Recovery procedures: Document cache rebuild processes for disaster recovery

Integration Patterns

  • Vault secrets: Cache KV secrets with appropriate TTLs based on sensitivity
  • Authentication data: Store token validation results with security-appropriate TTLs
  • Configuration data: Cache mount information and engine metadata
  • Policy evaluation: Store policy decision results for performance
  • Lease management: Cache lease information with proper expiration handling

Troubleshooting

Common Issues

  • Permission denied: Check file system permissions on cache directory
  • Encryption key mismatch: Verify key consistency across restarts
  • Disk space full: Monitor and clean up cache directory regularly
  • Slow performance: Check disk I/O performance and consider SSD storage
  • Memory usage: Monitor process memory for large cached objects

Debugging Tools

  • Cache statistics: Use Vaultx.Cache.stats() for performance metrics
  • Log analysis: Enable debug logging for detailed operation tracking
  • File inspection: Examine cache files directly (encrypted files will be binary)
  • Process monitoring: Check GenServer health and restart patterns
  • Metrics collection: Use telemetry data for performance analysis

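Summary

Functions

  • child_spec(init_arg): Returns a specification to start this module under a supervisor.
  • cleanup(): Performs cleanup of expired entries.
  • clear(pattern \\ :all): Clears L3 cache.
  • delete(key): Deletes a value from L3 cache.
  • get(key): Gets a value from L3 cache.
  • put(key, value, opts \\ []): Puts a value into L3 cache.
  • start_link(config): Starts the L3 cache.
  • stats(): Gets L3 cache statistics.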

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

cleanup()

Performs cleanup of expired entries.
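
How entries record their expiry is not described here. As an illustration only, the sketch below assumes each entry stores an absolute expires_at timestamp in milliseconds and that cleanup drops every entry whose timestamp has passed. ExpirySketch and the entry shape are hypothetical.

```elixir
defmodule ExpirySketch do
  @doc "Wraps a value with an absolute expiry computed from a TTL in milliseconds."
  def wrap(value, ttl_ms, now_ms), do: %{value: value, expires_at: now_ms + ttl_ms}

  @doc "An entry is expired once the current time reaches its expiry timestamp."
  def expired?(%{expires_at: expires_at}, now_ms), do: now_ms >= expires_at

  @doc "Returns `entries` (a map of key => entry) without the expired ones."
  def sweep(entries, now_ms) do
    entries
    |> Enum.reject(fn {_key, entry} -> expired?(entry, now_ms) end)
    |> Map.new()
  end
end
```

In real use, now_ms would come from System.system_time(:millisecond); passing it in explicitly keeps the logic deterministic and easy to test.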

clear(pattern \\ :all)

Clears L3 cache.

delete(key)

Deletes a value from L3 cache.

get(key)

Gets a value from L3 cache.

put(key, value, opts \\ [])

Puts a value into L3 cache.

start_link(config)

Starts the L3 cache.

stats()

Gets L3 cache statistics.