RocksDB Statistics Guide


This guide explains how to collect and use RocksDB statistics in Erlang RocksDB. Statistics provide insights into database performance, I/O operations, cache efficiency, and more.

Getting Started

Creating a Statistics Object

First, create a statistics object and attach it to your database:

%% Create a statistics collector
{ok, Stats} = rocksdb:new_statistics(),

%% Open database with statistics enabled
{ok, Db} = rocksdb:open("my_db", [
    {create_if_missing, true},
    {statistics, Stats}
]).

Reading Statistics

Use rocksdb:statistics_ticker/2 to read counter values:

{ok, KeysWritten} = rocksdb:statistics_ticker(Stats, number_keys_written),
io:format("Keys written: ~p~n", [KeysWritten]).

Use rocksdb:statistics_histogram/2 to read histogram data:

{ok, WriteHist} = rocksdb:statistics_histogram(Stats, db_write),
io:format("Write latency - median: ~.2f us, p99: ~.2f us~n",
          [maps:get(median, WriteHist), maps:get(percentile99, WriteHist)]).

Histogram Data Format

Histogram results are returned as a map with the following keys:

Key                 Type     Description
median              float    50th percentile value
percentile95        float    95th percentile value
percentile99        float    99th percentile value
average             float    Mean value
standard_deviation  float    Standard deviation
max                 float    Maximum observed value
count               integer  Number of samples
sum                 integer  Sum of all samples
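Because every histogram shares this map shape, a generic formatter is handy for logging. A minimal sketch (the format_histogram name is illustrative, not part of the rocksdb API):

```erlang
%% Illustrative helper: render any histogram map returned by
%% rocksdb:statistics_histogram/2 as a one-line summary.
format_histogram(Name, Hist) ->
    io_lib:format("~p: median=~.2f p95=~.2f p99=~.2f avg=~.2f max=~.2f n=~p",
                  [Name,
                   maps:get(median, Hist),
                   maps:get(percentile95, Hist),
                   maps:get(percentile99, Hist),
                   maps:get(average, Hist),
                   maps:get(max, Hist),
                   maps:get(count, Hist)]).
```

For example, io:format("~s~n", [format_histogram(db_write, WriteHist)]) prints a one-line latency summary for the write histogram read above.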

Setting Statistics Level

Control the level of detail collected:

%% Set to collect all statistics
ok = rocksdb:set_stats_level(Stats, stats_all),

%% Or disable expensive timing stats
ok = rocksdb:set_stats_level(Stats, stats_except_timers).

Available levels:

  • stats_disable_all - Disable all statistics
  • stats_except_tickers - Collect histograms only
  • stats_except_histogram_or_timers - Collect tickers only (no histograms or timing)
  • stats_except_timers - Collect everything except timing measurements
  • stats_except_detailed_timers - Collect everything except detailed timing
  • stats_except_time_for_mutex - Collect everything except mutex timing
  • stats_all - Collect all statistics (default)

Cleanup

Release the statistics object when done:

ok = rocksdb:close(Db),
ok = rocksdb:release_statistics(Stats).

Complete Ticker Reference

Tickers are simple counters that track cumulative values.

Database Operation Tickers

Ticker               Description
number_keys_written  Total number of keys written to the database
number_keys_read     Total number of keys read from the database
number_keys_updated  Total number of keys updated (via merge)
bytes_written        Total bytes written to the database
bytes_read           Total bytes read from the database
iter_bytes_read      Total bytes read through iterators

Iterator Tickers

Ticker                Description
number_db_seek        Number of seek operations on iterators
number_db_next        Number of next operations on iterators
number_db_prev        Number of prev operations on iterators
number_db_seek_found  Number of seek operations that found a key
number_db_next_found  Number of next operations that found a key
number_db_prev_found  Number of prev operations that found a key
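Pairing each operation counter with its *_found counterpart shows how often iterator operations actually land on a key. A sketch for seeks (the same pattern works for next and prev):

```erlang
%% Ratio of seeks that found a key; a low ratio may indicate
%% seeks targeting missing or deleted key ranges.
{ok, Seeks} = rocksdb:statistics_ticker(Stats, number_db_seek),
{ok, SeeksFound} = rocksdb:statistics_ticker(Stats, number_db_seek_found),
SeekFoundRatio = case Seeks of
    0 -> 0.0;
    _ -> SeeksFound / Seeks * 100
end,
io:format("Seeks that found a key: ~.2f%~n", [SeekFoundRatio]).
```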

Block Cache Tickers

Ticker                    Description
block_cache_miss          Total block cache misses
block_cache_hit           Total block cache hits
block_cache_add           Number of blocks added to cache
block_cache_add_failures  Number of failures adding blocks to cache
block_cache_index_miss    Index block cache misses
block_cache_index_hit     Index block cache hits
block_cache_filter_miss   Filter block cache misses
block_cache_filter_hit    Filter block cache hits
block_cache_data_miss     Data block cache misses
block_cache_data_hit      Data block cache hits
block_cache_bytes_read    Total bytes read from block cache
block_cache_bytes_write   Total bytes written to block cache

Example - Calculate cache hit ratio:

{ok, Hits} = rocksdb:statistics_ticker(Stats, block_cache_hit),
{ok, Misses} = rocksdb:statistics_ticker(Stats, block_cache_miss),
HitRatio = case Hits + Misses of
    0 -> 0.0;
    Total -> Hits / Total * 100
end,
io:format("Block cache hit ratio: ~.2f%~n", [HitRatio]).

Memtable Tickers

Ticker         Description
memtable_hit   Number of reads served from the memtable
memtable_miss  Number of reads not found in the memtable

Write Path Tickers

Ticker               Description
write_done_by_self   Writes completed by the calling thread
write_done_by_other  Writes batched and completed by another thread
wal_file_synced      Number of WAL file sync operations
stall_micros         Total microseconds spent in write stalls
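The first two tickers together show how effectively RocksDB is group-committing writes: a write finished "by other" was batched into another thread's commit. A sketch that reports the batching ratio alongside stall time:

```erlang
%% Writes completed by another thread were group-committed;
%% a higher percentage generally means better write batching.
{ok, BySelf} = rocksdb:statistics_ticker(Stats, write_done_by_self),
{ok, ByOther} = rocksdb:statistics_ticker(Stats, write_done_by_other),
{ok, StallMicros} = rocksdb:statistics_ticker(Stats, stall_micros),
BatchedPct = case BySelf + ByOther of
    0 -> 0.0;
    TotalWrites -> ByOther / TotalWrites * 100
end,
io:format("Batched writes: ~.2f%, total stall time: ~.2f ms~n",
          [BatchedPct, StallMicros / 1000]).
```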

Compaction Tickers

Ticker                           Description
compact_read_bytes               Bytes read during compaction
compact_write_bytes              Bytes written during compaction
flush_write_bytes                Bytes written during memtable flush
compaction_key_drop_newer_entry  Keys dropped due to newer version existing
compaction_key_drop_obsolete     Keys dropped due to being obsolete (deleted)
compaction_key_drop_range_del    Keys dropped due to range delete
compaction_key_drop_user         Keys dropped by user compaction filter
compaction_cancelled             Number of cancelled compactions
number_superversion_acquires     Superversion acquire operations
number_superversion_releases     Superversion release operations
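A common derived metric from these tickers is write amplification: bytes physically written by flushes and compactions divided by the logical bytes the application wrote. A sketch under that definition:

```erlang
%% Write amplification: physical bytes (flush + compaction)
%% per logical byte written by the application.
{ok, UserBytes} = rocksdb:statistics_ticker(Stats, bytes_written),
{ok, FlushBytes} = rocksdb:statistics_ticker(Stats, flush_write_bytes),
{ok, CompactBytes} = rocksdb:statistics_ticker(Stats, compact_write_bytes),
WriteAmp = case UserBytes of
    0 -> 0.0;
    _ -> (FlushBytes + CompactBytes) / UserBytes
end,
io:format("Write amplification: ~.2fx~n", [WriteAmp]).
```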

BlobDB Tickers

Ticker                            Description
blob_db_num_put                   Number of put operations
blob_db_num_write                 Number of write operations
blob_db_num_get                   Number of get operations
blob_db_num_multiget              Number of multi-get operations
blob_db_num_seek                  Number of seek operations
blob_db_num_next                  Number of next operations
blob_db_num_prev                  Number of prev operations
blob_db_num_keys_written          Number of keys written
blob_db_num_keys_read             Number of keys read
blob_db_bytes_written             Total bytes written
blob_db_bytes_read                Total bytes read
blob_db_write_inlined             Writes stored inline (not in blob)
blob_db_write_inlined_ttl         Inline writes with TTL
blob_db_write_blob                Writes stored in blob files
blob_db_write_blob_ttl            Blob writes with TTL
blob_db_blob_file_bytes_written   Bytes written to blob files
blob_db_blob_file_bytes_read      Bytes read from blob files
blob_db_blob_file_synced          Number of blob file syncs
blob_db_blob_index_expired_count  Expired blob index entries
blob_db_blob_index_expired_size   Size of expired blob index entries
blob_db_blob_index_evicted_count  Evicted blob index entries
blob_db_blob_index_evicted_size   Size of evicted blob index entries
blob_db_gc_num_files              Blob files processed by GC
blob_db_gc_num_new_files          New blob files created by GC
blob_db_gc_failures               Number of GC failures
blob_db_gc_num_keys_relocated     Keys relocated during GC
blob_db_gc_bytes_relocated        Bytes relocated during GC
blob_db_fifo_num_files_evicted    Files evicted by FIFO compaction
blob_db_fifo_num_keys_evicted     Keys evicted by FIFO compaction
blob_db_fifo_bytes_evicted        Bytes evicted by FIFO compaction
blob_db_cache_miss                Blob cache misses
blob_db_cache_hit                 Blob cache hits
blob_db_cache_add                 Blobs added to cache
blob_db_cache_add_failures        Failed blob cache additions
blob_db_cache_bytes_read          Bytes read from blob cache
blob_db_cache_bytes_write         Bytes written to blob cache

Transaction Tickers

Ticker                             Description
txn_prepare_mutex_overhead         Time spent waiting on prepare mutex (ns)
txn_old_commit_map_mutex_overhead  Time spent waiting on commit map mutex (ns)
txn_duplicate_key_overhead         Time spent on duplicate key checking (ns)
txn_snapshot_mutex_overhead        Time spent waiting on snapshot mutex (ns)
txn_get_try_again                  Number of TryAgain errors from transaction gets

Complete Histogram Reference

Histograms track the distribution of values over time.

Database Operation Histograms

Histogram    Description
db_get       Get operation latency (microseconds)
db_write     Write operation latency (microseconds)
db_multiget  Multi-get operation latency (microseconds)
db_seek      Iterator seek latency (microseconds)

Example - Monitor read latency:

{ok, GetHist} = rocksdb:statistics_histogram(Stats, db_get),
io:format("Get latency:~n"),
io:format("  Median: ~.2f us~n", [maps:get(median, GetHist)]),
io:format("  P95:    ~.2f us~n", [maps:get(percentile95, GetHist)]),
io:format("  P99:    ~.2f us~n", [maps:get(percentile99, GetHist)]),
io:format("  Max:    ~.2f us~n", [maps:get(max, GetHist)]).

Compaction and Flush Histograms

Histogram        Description
compaction_time  Compaction duration (microseconds)
flush_time       Memtable flush duration (microseconds)

I/O Histograms

Histogram             Description
sst_read_micros       SST file read latency (microseconds)
sst_write_micros      SST file write latency (microseconds)
table_sync_micros     SST file sync latency (microseconds)
wal_file_sync_micros  WAL file sync latency (microseconds)
bytes_per_read        Bytes per read operation
bytes_per_write       Bytes per write operation

BlobDB Histograms

Histogram                       Description
blob_db_key_size                Key size distribution (bytes)
blob_db_value_size              Value size distribution (bytes)
blob_db_write_micros            Write latency (microseconds)
blob_db_get_micros              Get latency (microseconds)
blob_db_multiget_micros         Multi-get latency (microseconds)
blob_db_seek_micros             Seek latency (microseconds)
blob_db_next_micros             Next latency (microseconds)
blob_db_prev_micros             Prev latency (microseconds)
blob_db_blob_file_write_micros  Blob file write latency (microseconds)
blob_db_blob_file_read_micros   Blob file read latency (microseconds)
blob_db_blob_file_sync_micros   Blob file sync latency (microseconds)
blob_db_compression_micros      Compression time (microseconds)
blob_db_decompression_micros    Decompression time (microseconds)

Transaction Histograms

Histogram               Description
num_op_per_transaction  Number of operations per transaction

Example: Comprehensive Monitoring

Here's a complete example that monitors key database metrics:

-module(db_monitor).
-export([report/1]).

report(Stats) ->
    %% Operation counts
    {ok, KeysWritten} = rocksdb:statistics_ticker(Stats, number_keys_written),
    {ok, KeysRead} = rocksdb:statistics_ticker(Stats, number_keys_read),

    %% Cache efficiency
    {ok, CacheHits} = rocksdb:statistics_ticker(Stats, block_cache_hit),
    {ok, CacheMisses} = rocksdb:statistics_ticker(Stats, block_cache_miss),
    CacheHitRatio = safe_ratio(CacheHits, CacheHits + CacheMisses),

    %% Memtable efficiency
    {ok, MemHits} = rocksdb:statistics_ticker(Stats, memtable_hit),
    {ok, MemMisses} = rocksdb:statistics_ticker(Stats, memtable_miss),
    MemHitRatio = safe_ratio(MemHits, MemHits + MemMisses),

    %% Write stalls
    {ok, StallMicros} = rocksdb:statistics_ticker(Stats, stall_micros),

    %% Latency histograms
    {ok, GetHist} = rocksdb:statistics_histogram(Stats, db_get),
    {ok, WriteHist} = rocksdb:statistics_histogram(Stats, db_write),

    io:format("=== RocksDB Statistics ===~n"),
    io:format("Keys written: ~p, Keys read: ~p~n", [KeysWritten, KeysRead]),
    io:format("Block cache hit ratio: ~.2f%~n", [CacheHitRatio * 100]),
    io:format("Memtable hit ratio: ~.2f%~n", [MemHitRatio * 100]),
    io:format("Total stall time: ~.2f ms~n", [StallMicros / 1000]),
    io:format("Get latency p99: ~.2f us~n", [maps:get(percentile99, GetHist)]),
    io:format("Write latency p99: ~.2f us~n", [maps:get(percentile99, WriteHist)]),
    ok.

safe_ratio(_, 0) -> 0.0;
safe_ratio(Num, Denom) -> Num / Denom.

Statistics with Column Families

When using column families, create one statistics object and share it:

{ok, Stats} = rocksdb:new_statistics(),
{ok, Db, [DefaultCf, DataCf, IndexCf]} = rocksdb:open_with_cf(
    "my_db",
    [{create_if_missing, true}, {statistics, Stats}],
    [{"default", []}, {"data", []}, {"index", []}]
),

%% Statistics are aggregated across all column families
{ok, TotalKeysWritten} = rocksdb:statistics_ticker(Stats, number_keys_written).

Statistics with Transactions

Statistics also work with pessimistic transaction databases:

{ok, Stats} = rocksdb:new_statistics(),
{ok, Db, _} = rocksdb:open_pessimistic_transaction_db(
    "txn_db",
    [{create_if_missing, true}, {statistics, Stats}],
    [{"default", []}]
),

%% Perform transactions...
{ok, Txn} = rocksdb:pessimistic_transaction(Db, []),
ok = rocksdb:pessimistic_transaction_put(Txn, <<"key">>, <<"value">>),
ok = rocksdb:pessimistic_transaction_commit(Txn),
ok = rocksdb:release_pessimistic_transaction(Txn),

%% Check transaction-specific stats
{ok, OpsPerTxn} = rocksdb:statistics_histogram(Stats, num_op_per_transaction),
io:format("Avg ops per transaction: ~.2f~n", [maps:get(average, OpsPerTxn)]).

Performance Considerations

  1. Statistics overhead: Collecting statistics has a small performance cost. Use stats_except_timers or stats_except_detailed_timers in production if timing precision isn't critical.

  2. Polling frequency: Statistics are cumulative. Poll them periodically and compute deltas for rate metrics.

  3. Memory: The statistics object uses minimal memory but should be released when no longer needed.

  4. Thread safety: Statistics are thread-safe and can be read from any Erlang process.
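Point 2 above can be sketched as a simple polling loop that turns cumulative tickers into rates. The poll_loop name and the 10-second interval are illustrative choices, not part of the API:

```erlang
%% Illustrative: poll a cumulative ticker periodically and report
%% the write rate over each interval as a delta.
poll_loop(Stats, PrevKeysWritten) ->
    timer:sleep(10000),
    {ok, KeysWritten} = rocksdb:statistics_ticker(Stats, number_keys_written),
    Rate = (KeysWritten - PrevKeysWritten) / 10,
    io:format("Write rate: ~.2f keys/s~n", [Rate]),
    poll_loop(Stats, KeysWritten).
```

In a real application this loop would typically live in its own process (for example, spawned with spawn_link or wrapped in a gen_server) so that polling does not block other work.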