ebson_iter (ebson v0.2.0)

BSON binary iterator for zero-copy traversal.

This module provides efficient traversal over BSON documents without decoding values eagerly. It uses offset-based ValueRefs to defer decoding until explicitly requested.

Design

The iterator operates directly on the raw BSON binary using offsets, allowing traversal without memory allocation for values. This is ideal for hot paths like query filtering where only specific fields need to be accessed.
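
The offset idea can be illustrated without the library. The following is a minimal sketch, not ebson_iter's actual implementation: the module value_ref_sketch and both of its functions are hypothetical, and it handles only a document whose first element is an int32 (BSON type byte 16#10). It records where the payload sits and reads it only on request.

```erlang
-module(value_ref_sketch).
-export([first_int32_ref/1, read_int32/1]).

%% Hypothetical helper, not part of ebson_iter: locate the first element of a
%% BSON document and, if it is an int32 (type byte 16#10), return a map
%% describing where its 4-byte payload sits inside Bin.
first_int32_ref(<<_Len:32/little, 16#10, Rest/binary>> = Bin) ->
    %% The key is a NUL-terminated cstring starting at byte offset 5.
    case binary:match(Rest, <<0>>) of
        {KeyLen, 1} ->
            Key = binary:part(Rest, 0, KeyLen),
            %% 4 length bytes + 1 type byte + key bytes + 1 NUL byte.
            Off = 5 + KeyLen + 1,
            {ok, Key, #{bin => Bin, off => Off, len => 4}};
        nomatch ->
            {error, unterminated_key}
    end;
first_int32_ref(_) ->
    {error, unsupported}.

%% Decode only when asked: read the payload at the recorded offset.
read_int32(#{bin := Bin, off := Off, len := 4}) ->
    <<_:Off/binary, Int:32/little-signed, _/binary>> = Bin,
    Int.
```

For the 12-byte document <<12,0,0,0,16,$a,0,1,0,0,0,0>>, which encodes {"a": 1}, the payload starts at byte offset 7, so the returned map carries off => 7 and len => 4.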

API Overview

  • new/1 - Create an iterator from a BSON binary
  • next/1 - Get the next element (key, type, value ref)
  • peek/2 - Look up a top-level key without creating an iterator
  • find_path/2 - Navigate nested documents via path
  • decode_value/2 - Decode a value ref to Erlang term

ValueRef and Memory Safety

A ValueRef is a map of the form #{bin := Bin, off := Offset, len := Length} that points into the original binary without copying data. This is efficient, but it means the original binary is retained in memory for as long as any ValueRef derived from it exists.

To break this reference chain, call decode_value/2, which copies binary data with binary:copy/1. For full document decoding, use ebson:decode_map/1, which ensures all data is copied.
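
The retention effect described above is a general property of Erlang sub-binaries and can be observed with the standard library alone. This is a minimal sketch that does not use ebson at all: the module retention_sketch is hypothetical, and a plain 1000-byte binary stands in for a large BSON document.

```erlang
-module(retention_sketch).
-export([demo/0]).

demo() ->
    %% A 1000-byte binary standing in for a large BSON document.
    Big = binary:copy(<<0>>, 1000),
    %% A 10-byte slice of it, similar to what a ValueRef points at.
    Slice = binary:part(Big, 100, 10),
    %% Size of the underlying binary that the slice keeps alive.
    Retained = binary:referenced_byte_size(Slice),
    %% binary:copy/1 allocates a fresh binary with no link to Big.
    Copied = binary:referenced_byte_size(binary:copy(Slice)),
    {Retained, Copied}.
```

The first element of the result is the size of the parent binary that the slice pins in memory (1000 here); the second is the size referenced after copying (10 here).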

Example Usage

  %% Read and decode the first element
  {ok, Iter} = ebson_iter:new(BsonBin),
  case ebson_iter:next(Iter) of
      {ok, Key, Type, ValueRef, Iter2} ->
          {ok, Value} = ebson_iter:decode_value(Type, ValueRef),
          ...;
      done -> ...
  end.
 
  %% Quick lookup without full iteration
  {ok, Type, ValueRef} = ebson_iter:peek(BsonBin, <<"fieldname">>),
  {ok, Value} = ebson_iter:decode_value(Type, ValueRef).
 
  %% Nested path lookup
  {ok, Type, ValueRef} = ebson_iter:find_path(BsonBin, [<<"a">>, <<"b">>, <<"c">>]).

Types

bson_type/0

-type bson_type() ::
          double | string | document | array | binary | objectid | boolean | datetime | null | int32 |
          int64 | timestamp | decimal128 | regex | javascript | minkey | maxkey.

iter/0

-opaque iter()

value_ref/0

-type value_ref() :: #{bin := binary(), off := non_neg_integer(), len := non_neg_integer()}.

Functions

decode_value(Type, ValueRef)

-spec decode_value(bson_type(), value_ref()) -> {ok, term()} | {error, term()}.

Decode a value from a ValueRef into an Erlang term. All binary data is copied using binary:copy/1 to break reference chains. Embedded documents and arrays are NOT recursively decoded; use ebson:decode_map/1 for that.

find_path(Bin, RestPath)

-spec find_path(binary(), [binary()]) -> {ok, bson_type(), value_ref()} | not_found | {error, term()}.

Find a value by navigating a path through nested documents. Path is a list of binary keys. Skips entire subdocuments efficiently using length prefixes.
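
A nested lookup might look like the sketch below. The module find_path_sketch and its functions are hypothetical; only ebson_iter:find_path/2 and ebson_iter:decode_value/2 come from this module's specs. The test document is built by hand, assuming the standard BSON layout: a little-endian int32 size, type byte 16#03 for an embedded document, type byte 16#10 for int32, and a trailing NUL.

```erlang
-module(find_path_sketch).
-export([nested_doc/0, lookup/2]).

%% Hand-built BSON for {"a": {"b": 1}}.
nested_doc() ->
    %% Inner document {"b": 1}: 4 + 1 + 2 + 4 + 1 = 12 bytes.
    Inner = <<12:32/little, 16#10, "b", 0, 1:32/little, 0>>,
    %% Outer size: size prefix + type byte + "a\0" + inner + terminator.
    Size = 4 + 1 + 2 + byte_size(Inner) + 1,
    <<Size:32/little, 16#03, "a", 0, Inner/binary, 0>>.

%% Resolve Path and decode the value found there, passing through
%% not_found and {error, Reason} unchanged.
lookup(Bin, Path) ->
    case ebson_iter:find_path(Bin, Path) of
        {ok, Type, ValueRef} -> ebson_iter:decode_value(Type, ValueRef);
        not_found -> not_found;
        {error, _} = Err -> Err
    end.
```

A call such as find_path_sketch:lookup(find_path_sketch:nested_doc(), [<<"a">>, <<"b">>]) would then resolve the inner field.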

new(Bin)

-spec new(binary()) -> {ok, iter()} | {error, term()}.

Create a new iterator from a BSON binary. Validates document structure: length prefix and terminator.

next(Bson_iter)

-spec next(iter()) -> {ok, binary(), bson_type(), value_ref(), iter()} | done | {error, term()}.

Advance to the next element in the document. Returns {ok, Key, Type, ValueRef, NewIter} for each element, done when iteration is complete, or {error, Reason} on malformed input.
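
Because next/1 returns a new iterator with each element, a full traversal is naturally written as a tail-recursive loop. The sketch below assumes only the return shapes listed in the spec above; the module iter_fold_sketch and its keys/1 function are hypothetical.

```erlang
-module(iter_fold_sketch).
-export([keys/1]).

%% Hypothetical helper: collect the top-level keys of a BSON document
%% in order, without decoding any values.
keys(Bin) ->
    case ebson_iter:new(Bin) of
        {ok, Iter} -> collect(Iter, []);
        {error, _} = Err -> Err
    end.

collect(Iter, Acc) ->
    case ebson_iter:next(Iter) of
        %% Thread the updated iterator through each step.
        {ok, Key, _Type, _ValueRef, Iter2} -> collect(Iter2, [Key | Acc]);
        done -> {ok, lists:reverse(Acc)};
        {error, _} = Err -> Err
    end.
```

Since the ValueRef is ignored here, no value is ever decoded; a filter that needs one field could call decode_value/2 only for the matching key.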

peek(Bin, KeyBin)

-spec peek(binary(), binary()) -> {ok, bson_type(), value_ref()} | not_found | {error, term()}.

Look up a key at the top level of a BSON document without decoding any values. Returns {ok, Type, ValueRef} if found, not_found if the key doesn't exist, or {error, Reason} on malformed input.
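
The three outcomes can be handled in a single case expression. This sketch wraps peek/2 and decode_value/2 in a hypothetical helper, peek_sketch:get_field/3, that substitutes a default when the key is absent; the helper is an illustration and not part of ebson.

```erlang
-module(peek_sketch).
-export([get_field/3]).

%% Hypothetical helper: fetch and decode one top-level field, falling back
%% to Default when the key is absent.
get_field(Bin, Key, Default) ->
    case ebson_iter:peek(Bin, Key) of
        {ok, Type, ValueRef} -> ebson_iter:decode_value(Type, ValueRef);
        not_found -> {ok, Default};
        {error, _} = Err -> Err
    end.
```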