Posting Lists

View Source

Posting lists are used for inverted indexes, search engines, and document tagging systems. erlang-rocksdb provides a specialized merge operator for efficient posting list management with built-in support for set operations and roaring bitmap indexes.

Overview

A posting list stores a set of keys (e.g., document IDs) associated with a term. The merge operator allows efficient append operations and handles key deletion using tombstones that are automatically cleaned up during merge operations (reads and compaction).

Key Features:

  • Sorted keys (lexicographic order)
  • Roaring bitmap for fast existence checks and set operations
  • Automatic V1 to V2 format migration
  • Efficient intersection, union, and difference operations

Binary Formats

V2 Format (Current)

V2 is the default format as of version 2.5.0. It stores keys sorted and includes a roaring64 bitmap for fast operations.

FieldSizeDescription
Version1 byte0x02 for V2
BitmapSize4 bytes (big-endian)Size of serialized roaring bitmap
BitmapDataBitmapSize bytesSerialized roaring64 bitmap
KeyCount4 bytes (big-endian)Number of keys
KeysVariableSorted entries: <Len:32/big><Key:Len>...

V1 Format (Legacy)

V1 format is still readable but will be automatically upgraded to V2 on the next merge operation.

FieldSizeDescription
KeyLength4 bytes (big-endian)Length of the key data
Flag1 byte0 = normal, non-zero = tombstone
KeyDataKeyLength bytesThe key binary

Setup

Open a database with the posting list merge operator:

{ok, Db} = rocksdb:open("mydb", [
    {create_if_missing, true},
    {merge_operator, posting_list_merge_operator}
]).

Adding Keys

Use merge with {posting_add, Key}:

ok = rocksdb:merge(Db, <<"term:erlang">>, {posting_add, <<"doc1">>}, []),
ok = rocksdb:merge(Db, <<"term:erlang">>, {posting_add, <<"doc2">>}, []),
ok = rocksdb:merge(Db, <<"term:erlang">>, {posting_add, <<"doc3">>}, []).

Deleting Keys

Use merge with {posting_delete, Key} to add a tombstone:

ok = rocksdb:merge(Db, <<"term:erlang">>, {posting_delete, <<"doc2">>}, []).

The key is logically deleted and removed during the merge operation.

Reading Posting Lists

{ok, Binary} = rocksdb:get(Db, <<"term:erlang">>, []).

Helper Functions

Get Active Keys (Sorted)

Returns deduplicated keys in lexicographic order:

Keys = rocksdb:posting_list_keys(Binary),
%% [<<"doc1">>, <<"doc3">>]  % sorted

Check if Key is Active

true = rocksdb:posting_list_contains(Binary, <<"doc1">>),
false = rocksdb:posting_list_contains(Binary, <<"doc2">>).  % deleted

Find Key

Returns {ok, IsTombstone} or not_found:

{ok, false} = rocksdb:posting_list_find(Binary, <<"doc1">>),
not_found = rocksdb:posting_list_find(Binary, <<"unknown">>).

Count Active Keys

Count = rocksdb:posting_list_count(Binary).

Get Format Version

2 = rocksdb:posting_list_version(Binary).  % V2 format

Convert to Map

Get the full state as a map:

Map = rocksdb:posting_list_to_map(Binary),
%% #{<<"doc1">> => active, <<"doc3">> => active}

Decode to List (V1 only)

For V1 format binaries:

Entries = rocksdb:posting_list_decode(Binary),
%% [{<<"doc1">>, false}, {<<"doc2">>, true}]

Fold Over Entries

Count = rocksdb:posting_list_fold(
    fun(_Key, _IsTombstone, Acc) -> Acc + 1 end,
    0,
    Binary
).

Set Operations

V2 posting lists support efficient set operations using roaring bitmaps:

Intersection

Find keys present in both posting lists:

{ok, Bin1} = rocksdb:get(Db, <<"term:erlang">>, []),
{ok, Bin2} = rocksdb:get(Db, <<"term:otp">>, []),
ResultBin = rocksdb:posting_list_intersection(Bin1, Bin2),
CommonKeys = rocksdb:posting_list_keys(ResultBin).

Union

Combine all keys from both posting lists:

ResultBin = rocksdb:posting_list_union(Bin1, Bin2),
AllKeys = rocksdb:posting_list_keys(ResultBin).

Difference

Find keys in the first list but not in the second:

ResultBin = rocksdb:posting_list_difference(Bin1, Bin2),
UniqueKeys = rocksdb:posting_list_keys(ResultBin).

Fast Intersection Count

Get cardinality without materializing the result (uses roaring bitmap):

Count = rocksdb:posting_list_intersection_count(Bin1, Bin2).

Multi-List Intersection

Intersect multiple posting lists efficiently (processes smallest first):

ResultBin = rocksdb:posting_list_intersect_all([Bin1, Bin2, Bin3]).

Bitmap Contains (Fast Lookup)

Fast hash-based lookup using the embedded bitmap:

true = rocksdb:posting_list_bitmap_contains(Binary, <<"doc1">>).

Note: May have rare false positives due to hash collisions. Use posting_list_contains/2 for exact checks.

Postings Resource API

For repeated lookups on the same posting list, use the resource-based API. Parse once, lookup many times with O(1) performance.

Open/Parse Posting List

{ok, Binary} = rocksdb:get(Db, <<"term:erlang">>, []),
{ok, Postings} = rocksdb:postings_open(Binary).

Fast Contains Lookup

%% Exact match (O(log n) using sorted set)
true = rocksdb:postings_contains(Postings, <<"doc1">>),

%% Hash-based lookup (O(1), rare false positives)
true = rocksdb:postings_bitmap_contains(Postings, <<"doc1">>).

Count and Keys

Count = rocksdb:postings_count(Postings),
Keys = rocksdb:postings_keys(Postings).  %% sorted

Set Operations on Resources

Set operations accept both binaries and resources, returning a resource:

%% Intersection (AND)
{ok, Result} = rocksdb:postings_intersection(Postings1, Postings2),

%% Union (OR)
{ok, Result} = rocksdb:postings_union(Postings1, Postings2),

%% Difference (A - B)
{ok, Result} = rocksdb:postings_difference(Postings1, Postings2),

%% Fast intersection count
Count = rocksdb:postings_intersection_count(Postings1, Postings2),

%% Multi-way intersection
{ok, Result} = rocksdb:postings_intersect_all([P1, P2, P3]).

Convert Back to Binary

Binary = rocksdb:postings_to_binary(Postings).

Performance Comparison

MethodLookup Time
postings_contains/2~0.1-0.2 us
postings_bitmap_contains/2~0.1 us
posting_list_to_map/1 + maps:is_key/2~0.1 us
posting_list_contains/2 (binary)~1-10 us

Use the resource API for batch lookups (e.g., checking many document IDs against a term's posting list).

Tombstone Cleanup

Tombstones are automatically cleaned up by the merge operator:

  • During reads: When reading a key, the merge operator combines all entries and removes tombstoned keys from the result.
  • During compaction: The merge operator consolidates entries, keeping only active (non-tombstoned) keys.

This means you don't need a separate compaction filter for tombstone removal - it's built into the merge operator.

Format Migration

V1 data is automatically migrated to V2 format on the next merge operation. No manual migration is needed. You can check the format version with:

Version = rocksdb:posting_list_version(Binary).
%% 1 = V1 (legacy), 2 = V2 (current)

Use Cases

  • Inverted Index: Map terms to document IDs for full-text search
  • Tagging System: Map tags to item IDs
  • Graph Adjacency: Store outgoing edges for each node
  • Set Membership: Efficient set operations via roaring bitmaps
  • Query Intersection: Fast AND queries across multiple terms

Performance Tips

  1. Batch Writes: Use write_batch to add multiple keys atomically
  2. Periodic Compaction: Run compaction to reclaim space and optimize layout
  3. Large Lists: For very large posting lists (100K+), consider sharding by key prefix
  4. Use Set Operations: For multi-term queries, use postings_intersection/2 or postings_intersect_all/1 instead of manual iteration
  5. Intersection Count: Use postings_intersection_count/2 when you only need cardinality
  6. Use Resources for Batch Lookups: For multiple contains checks, use postings_open/1 once then postings_contains/2 for each lookup (~0.1 us vs ~1-10 us per lookup)
  7. NIF Functions: All helper functions are implemented as NIFs for efficiency

Example: Inverted Index with Query

%% Index a document
index_document(Db, DocId, Terms) ->
    lists:foreach(fun(Term) ->
        ok = rocksdb:merge(Db, <<"term:", Term/binary>>, {posting_add, DocId}, [])
    end, Terms).

%% Remove document from index
remove_document(Db, DocId, Terms) ->
    lists:foreach(fun(Term) ->
        ok = rocksdb:merge(Db, <<"term:", Term/binary>>, {posting_delete, DocId}, [])
    end, Terms).

%% Search for documents containing a term
search(Db, Term) ->
    case rocksdb:get(Db, <<"term:", Term/binary>>, []) of
        {ok, Binary} -> rocksdb:posting_list_keys(Binary);
        not_found -> []
    end.

%% Search for documents containing ALL terms (AND query)
search_all(Db, Terms) ->
    Postings = lists:filtermap(fun(Term) ->
        case rocksdb:get(Db, <<"term:", Term/binary>>, []) of
            {ok, Binary} ->
                {ok, P} = rocksdb:postings_open(Binary),
                {true, P};
            not_found -> false
        end
    end, Terms),
    case Postings of
        [] -> [];
        _ ->
            {ok, Result} = rocksdb:postings_intersect_all(Postings),
            rocksdb:postings_keys(Result)
    end.

%% Search for documents containing ANY term (OR query)
search_any(Db, Terms) ->
    Postings = lists:filtermap(fun(Term) ->
        case rocksdb:get(Db, <<"term:", Term/binary>>, []) of
            {ok, Binary} ->
                {ok, P} = rocksdb:postings_open(Binary),
                {true, P};
            not_found -> false
        end
    end, Terms),
    case Postings of
        [] -> [];
        [Single] -> rocksdb:postings_keys(Single);
        [First | Rest] ->
            {ok, Result} = lists:foldl(fun(P, {ok, Acc}) ->
                rocksdb:postings_union(Acc, P)
            end, {ok, First}, Rest),
            rocksdb:postings_keys(Result)
    end.

%% Check if document contains term (single lookup)
has_term(Db, Term, DocId) ->
    case rocksdb:get(Db, <<"term:", Term/binary>>, []) of
        {ok, Binary} -> rocksdb:posting_list_contains(Binary, DocId);
        not_found -> false
    end.

%% Check if document contains term (batch lookups - use resource)
has_term_batch(Postings, DocId) ->
    rocksdb:postings_contains(Postings, DocId).

%% Count documents matching AND query
count_matches(Db, Terms) ->
    Postings = lists:filtermap(fun(Term) ->
        case rocksdb:get(Db, <<"term:", Term/binary>>, []) of
            {ok, Binary} ->
                {ok, P} = rocksdb:postings_open(Binary),
                {true, P};
            not_found -> false
        end
    end, Terms),
    case Postings of
        [] -> 0;
        [Single] -> rocksdb:postings_count(Single);
        [First, Second | Rest] ->
            %% Use fast intersection count
            lists:foldl(fun(P, Count) ->
                min(Count, rocksdb:postings_intersection_count(First, P))
            end, rocksdb:postings_intersection_count(First, Second), Rest)
    end.

%% Filter documents by multiple terms (batch contains check)
filter_docs(Db, Term, DocIds) ->
    case rocksdb:get(Db, <<"term:", Term/binary>>, []) of
        {ok, Binary} ->
            {ok, Postings} = rocksdb:postings_open(Binary),
            [DocId || DocId <- DocIds,
                      rocksdb:postings_contains(Postings, DocId)];
        not_found -> []
    end.