This guide covers monitoring, backup, retention, and troubleshooting for TimelessLogs.

Statistics

Get aggregate storage statistics without reading any blocks:

Elixir API

{:ok, stats} = TimelessLogs.stats()

Returns a %TimelessLogs.Stats{} struct:

Field                             Description
total_blocks                      Number of stored blocks (raw + compressed)
total_entries                     Total log entries across all blocks
total_bytes                       Total block storage size
disk_size                         On-disk storage size
index_size                        Index snapshot + log file size
oldest_timestamp                  Timestamp of oldest entry (microseconds)
newest_timestamp                  Timestamp of newest entry (microseconds)
raw_blocks                        Number of uncompressed raw blocks
raw_bytes                         Size of raw blocks
raw_entries                       Entries in raw blocks
zstd_blocks                       Number of zstd-compressed blocks
zstd_bytes                        Size of zstd blocks
zstd_entries                      Entries in zstd blocks
openzl_blocks                     Number of OpenZL-compressed blocks
openzl_bytes                      Size of OpenZL blocks
openzl_entries                    Entries in OpenZL blocks
compression_raw_bytes_in          Total raw bytes processed by the compactor
compression_compressed_bytes_out  Total compressed bytes produced
compaction_count                  Number of compaction runs
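For example, the compactor counters can be combined into an overall compression ratio (a sketch; the field names follow the table above):

```elixir
{:ok, stats} = TimelessLogs.stats()

# Overall compression ratio achieved by the compactor so far.
# Guard against division by zero before any compaction has run.
ratio =
  if stats.compression_compressed_bytes_out > 0 do
    stats.compression_raw_bytes_in / stats.compression_compressed_bytes_out
  else
    nil
  end
```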

HTTP API

curl http://localhost:9428/select/logsql/stats
curl http://localhost:9428/health

Flushing

Force flush the buffer to write pending entries to disk immediately:

TimelessLogs.flush()
curl http://localhost:9428/api/v1/flush

Use before backups or graceful shutdowns.

Backup

Create a consistent online backup without stopping the application.

Elixir API

{:ok, result} = TimelessLogs.backup("/tmp/logs_backup")
# => {:ok, %{path: "/tmp/logs_backup", files: ["index.snapshot", "blocks"], total_bytes: 24000000}}

HTTP API

curl -X POST http://localhost:9428/api/v1/backup \
  -H 'Content-Type: application/json' \
  -d '{"path": "/tmp/logs_backup"}'

Backup procedure

  1. The buffer is flushed (all pending entries written to disk)
  2. ETS index is written as a snapshot file (atomic rename)
  3. Block files are copied in parallel to the target directory
  4. The call returns the backup path, file list, and total bytes copied
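The steps above can be scripted, for instance as a task that writes timestamped backup directories (a sketch; the base path is an assumption):

```elixir
# Back up into a timestamped directory under base_dir (path is illustrative).
base_dir = "/var/backups/timeless_logs"
stamp = Calendar.strftime(DateTime.utc_now(), "%Y%m%d-%H%M%S")
target = Path.join(base_dir, stamp)

case TimelessLogs.backup(target) do
  {:ok, %{total_bytes: bytes}} ->
    IO.puts("backup written to #{target} (#{bytes} bytes)")

  {:error, reason} ->
    IO.puts("backup failed: #{inspect(reason)}")
end
```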

Restore procedure

  1. Stop the TimelessLogs application
  2. Replace the data directory contents with the backup files
  3. Start the application -- it will load from the restored data
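A shell sketch of the restore steps, assuming a release named my_app and a data directory of /var/lib/timeless_logs (both are assumptions; substitute your own paths):

```shell
# 1. Stop the TimelessLogs application
bin/my_app stop

# 2. Replace the data directory contents with the backup files
rm -rf /var/lib/timeless_logs/*
cp -a /tmp/logs_backup/. /var/lib/timeless_logs/

# 3. Start the application; it loads from the restored data
bin/my_app start
```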

Retention

Retention runs automatically to prevent unbounded disk growth. Two independent policies are enforced:

Age-based retention

Delete blocks with ts_max older than the cutoff:

config :timeless_logs,
  retention_max_age: 7 * 86_400  # 7 days (default)

Size-based retention

Delete oldest blocks until total size is under the limit:

config :timeless_logs,
  retention_max_size: 512 * 1024 * 1024  # 512 MB (default)

Disable retention

config :timeless_logs,
  retention_max_age: nil,    # No age limit
  retention_max_size: nil    # No size limit

Manual trigger

TimelessLogs.Retention.run_now()
# => :noop or {:ok, count_deleted}

Check interval

config :timeless_logs,
  retention_check_interval: 300_000  # 5 minutes (default)

Telemetry events

TimelessLogs emits telemetry events for monitoring and observability:

Event                                       Measurements                                  Metadata
[:timeless_logs, :flush, :stop]             duration, entry_count, byte_size              block_id
[:timeless_logs, :query, :stop]             duration, total, blocks_read                  filters
[:timeless_logs, :retention, :stop]         duration, blocks_deleted
[:timeless_logs, :compaction, :stop]        duration, raw_blocks, entry_count, byte_size
[:timeless_logs, :merge_compaction, :stop]  duration, batches_merged, blocks_consumed
[:timeless_logs, :block, :error]                                                          file_path, reason

Attaching handlers

:telemetry.attach_many(
  "my-log-metrics",
  [
    [:timeless_logs, :flush, :stop],
    [:timeless_logs, :query, :stop],
    [:timeless_logs, :compaction, :stop]
  ],
  fn event, measurements, metadata, _config ->
    IO.inspect({event, measurements, metadata})
  end,
  nil
)

Key metrics to monitor

Metric                    Source                              Alert threshold
Flush duration            [:flush, :stop] duration            Sustained > 100ms
Flush entry count         [:flush, :stop] entry_count         Sustained at max_buffer_size
Query latency             [:query, :stop] duration            > 5s for typical queries
Blocks read per query     [:query, :stop] blocks_read         Growing linearly
Compaction entry count    [:compaction, :stop] entry_count    Not firing (raw blocks accumulating)
Block read errors         [:block, :error]                    Any occurrence
Retention blocks deleted  [:retention, :stop] blocks_deleted  0 when disk is growing
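The thresholds above can be wired into a telemetry handler. A minimal sketch that warns on slow flushes (the 100ms cutoff follows the table; treating `duration` as native time units is an assumption, as is conventional for telemetry spans):

```elixir
require Logger

:telemetry.attach(
  "flush-duration-alert",
  [:timeless_logs, :flush, :stop],
  fn _event, %{duration: duration}, _metadata, _config ->
    # Span durations are typically in native units; convert to milliseconds.
    ms = System.convert_time_unit(duration, :native, :millisecond)
    if ms > 100, do: Logger.warning("slow flush: #{ms}ms")
  end,
  nil
)
```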

Troubleshooting

High memory usage

  • Check raw_blocks in stats -- many uncompacted raw blocks use more memory
  • Trigger compaction: TimelessLogs.Compactor.compact_now()
  • Reduce max_buffer_size to flush smaller batches
  • Check for slow subscribers blocking the buffer
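Putting the first two checks together (a sketch using the stats fields and compactor call named above; the 100-block threshold is an arbitrary assumption):

```elixir
{:ok, stats} = TimelessLogs.stats()

# Many uncompacted raw blocks keep more data resident; compact if they pile up.
if stats.raw_blocks > 100 do
  TimelessLogs.Compactor.compact_now()
end
```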

Disk space growing

  • Verify retention is configured: check retention_max_age and retention_max_size
  • Trigger retention manually: TimelessLogs.Retention.run_now()
  • Check stats for total_bytes and total_entries trends
  • Reduce retention age or size limits

Slow queries

  • Use :level and :metadata filters to leverage the term index
  • Avoid full scans (no filters) on large datasets
  • Reduce the time range with :since and :until
  • Check raw_blocks count -- many small raw blocks are slower to query than fewer compressed blocks
  • Trigger merge compaction to consolidate small compressed blocks: TimelessLogs.merge_now()
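For instance, a bounded, indexed query might look like this (a sketch; `TimelessLogs.query/1` and the exact option shapes are assumptions based on the filters named above):

```elixir
# Narrow by level, metadata, and time range so the term index is used
# instead of a full scan.
since = DateTime.add(DateTime.utc_now(), -3600, :second)

{:ok, results} =
  TimelessLogs.query(
    level: :error,
    metadata: [request_id: "abc123"],
    since: since,
    until: DateTime.utc_now()
  )
```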

Logs not appearing in queries

  • Flush the buffer: TimelessLogs.flush()
  • Check that the Logger handler is installed: :logger.get_handler_config(:timeless_logs)
  • Verify the data_dir exists and is writable
  • Check for block read errors in telemetry events

Compaction not running

  • Check raw_blocks and raw_entries in stats
  • Verify compaction_threshold isn't set too high for your log volume
  • Trigger manually: TimelessLogs.Compactor.compact_now()
  • Check that compaction_max_raw_age is reasonable (default: 60 seconds)
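The two knobs mentioned above live in config; a sketch with the documented default for `compaction_max_raw_age` (the `compaction_threshold` value shown is an illustrative assumption):

```elixir
config :timeless_logs,
  compaction_threshold: 1_000,  # entries before compaction kicks in (illustrative value)
  compaction_max_raw_age: 60    # seconds (default: 60)
```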