Goblin (Goblin v0.5.0)
An embedded LSM-Tree database for Elixir.
Goblin provides a persistent key-value store with ACID transactions, automatic compaction, and crash recovery. It uses a Log-Structured Merge-Tree architecture optimized for write-heavy workloads.
Starting a database
{:ok, db} = Goblin.start_link(
  name: MyApp.DB,
  db_dir: "/path/to/db"
)
Basic operations
Goblin.put(db, :user_123, %{name: "Alice", age: 30})
Goblin.get(db, :user_123)
# => %{name: "Alice", age: 30}
Goblin.remove(db, :user_123)
Goblin.get(db, :user_123)
# => nil
Batch operations
Goblin.put_multi(db, [
  {:user_1, %{name: "Alice"}},
  {:user_2, %{name: "Bob"}}
])
Goblin.remove_multi(db, [:user_1, :user_2])
Transactions
alias Goblin.Tx
Goblin.transaction(db, fn tx ->
  count = Tx.get(tx, :counter, 0)
  tx = Tx.put(tx, :counter, count + 1)
  {:commit, tx, count + 1}
end)
Goblin.read(db, fn tx ->
  key1 = Tx.get(tx, :start, 0)
  key2 = Tx.get(tx, key1)
  {key1, key2}
end)
Configuration
- name - Registered name for the database supervisor (optional)
- db_dir - Directory path for storing database files (required)
- flush_level_file_limit - Number of files allowed in the flush level before compaction is triggered (default: 4)
- mem_limit - Number of bytes held in memory before flushing (default: 64 * 1024 * 1024)
- level_base_size - Number of bytes stored in level 1 before compaction is triggered (default: 256 * 1024 * 1024)
- level_size_multiplier - Factor by which each level's size limit is multiplied (default: 10)
- max_sst_size - Maximum size, in bytes, of the SST portion of a disk table (default: level_base_size / level_size_multiplier)
- bf_fpp - Bloom filter false positive probability (default: 0.01)
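To make the size-related defaults concrete, the following sketch computes per-level capacities from level_base_size and level_size_multiplier. It assumes that level 1 holds level_base_size bytes and that each deeper level holds level_size_multiplier times the previous one, and that the max_sst_size default uses integer division. Both are readings of the option descriptions, not confirmed implementation details. The LevelSizes module is hypothetical and not part of Goblin.

```elixir
defmodule LevelSizes do
  @mib 1024 * 1024

  # Defaults copied from the configuration list.
  @level_base_size 256 * @mib
  @level_size_multiplier 10

  # Assumed model: level 1 holds `base` bytes and every deeper level
  # holds `multiplier` times as much as the level before it.
  def capacity(level, base \\ @level_base_size, multiplier \\ @level_size_multiplier)
      when is_integer(level) and level >= 1 do
    base * Integer.pow(multiplier, level - 1)
  end

  # Documented default: level_base_size / level_size_multiplier.
  # Integer division is an assumption made for this sketch.
  def default_max_sst_size(base \\ @level_base_size, multiplier \\ @level_size_multiplier) do
    div(base, multiplier)
  end
end

for level <- 1..3 do
  IO.puts("level #{level}: #{div(LevelSizes.capacity(level), 1024 * 1024)} MiB")
end
```

Under these assumptions the default configuration gives 256 MiB, 2560 MiB, and 25600 MiB for levels 1 to 3.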
Architecture
Goblin consists of several supervised components:
- Broker - Transactions management
- MemTable - Memory buffer server
- DiskTables - Disk table management
- Compactor - Compaction server
- Cleaner - Safe file removal server
- WAL - Write-ahead log server
- Manifest - Tracks database state and file metadata
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
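Because the module exposes a child specification, it can presumably be listed as a child in an application's supervision tree. The sketch below assumes that the child spec forwards its options unchanged to start_link/1, which is the usual convention but is not stated on this page. MyApp.Application and MyApp.Supervisor are hypothetical names, and the path is a placeholder.

```elixir
defmodule MyApp.Application do
  use Application

  # Child list for the application supervisor. The options are the same
  # keyword list that Goblin.start_link/1 accepts (assumed to be forwarded
  # as-is by the child spec).
  def children do
    [
      {Goblin, name: MyApp.DB, db_dir: "/var/lib/myapp/db"}
    ]
  end

  @impl true
  def start(_type, _args) do
    Supervisor.start_link(children(), strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

Other processes can then refer to the database by its registered name, for example Goblin.get(MyApp.DB, :user_123).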
@spec compacting?(Supervisor.supervisor()) :: boolean()
Checks whether the database is currently compacting or not.
Parameters
- db - The database server (PID or registered name)
Returns
- A boolean indicating whether the database is compacting or not
@spec export(Supervisor.supervisor(), Path.t()) :: {:ok, Path.t()} | {:error, term()}
Exports the database files to the provided dir, export_dir.
export can be used to create back-ups of the database.
A copy of the manifest is created and the currently tracked files along with the manifest copy is archived in a .tar.gz file in export_dir.
Note
The export runs in a read/2, thus preventing deletion of files.
Parameters
- db - The database server (PID or registered name)
- export_dir - Path to the directory to place the exported .tar.gz
Returns
- {:ok, export_path} - Path of the exported .tar.gz file
- {:error, reason} - If an error occurred
Examples
Goblin.export(db, "path/to/back_ups")
# => {:ok, "path/to/back_ups/goblin_20260106T102414Z.tar.gz"}
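Since the export is a gzip-compressed tar archive, its contents can be inspected with OTP's :erl_tar module before relying on it as a back-up. The helper below is a minimal sketch. ExportInspect is a hypothetical module, and the names of the entries inside a real export are not documented here.

```elixir
defmodule ExportInspect do
  # Lists the entry names of a gzip-compressed tar archive.
  # :erl_tar works with charlists, so names are converted to binaries.
  def list_entries(archive_path) do
    case :erl_tar.table(String.to_charlist(archive_path), [:compressed]) do
      {:ok, names} -> {:ok, Enum.map(names, &List.to_string/1)}
      {:error, reason} -> {:error, reason}
    end
  end
end

# Intended usage (requires a running database):
#
#   {:ok, export_path} = Goblin.export(db, "path/to/back_ups")
#   {:ok, entries} = ExportInspect.list_entries(export_path)
```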
@spec flushing?(Supervisor.supervisor()) :: boolean()
Checks whether the database is currently flushing or not.
Parameters
- db - The database server (PID or registered name)
Returns
- A boolean indicating whether the database is flushing or not
@spec get(Supervisor.supervisor(), db_key(), db_value() | nil) :: db_value() | nil
Retrieves the value associated with the provided key. Returns the default value if the key was not found.
Parameters
- db - The database server (PID or registered name)
- key - The key to look up
- default - Value to return if key is not found (default: nil)
Returns
- The value associated with the key, or default if not found
Examples
Goblin.get(db, :user_123)
# => nil
Goblin.put(db, :user_123, %{name: "Alice", age: 30})
Goblin.get(db, :user_123)
# => %{name: "Alice", age: 30}
Goblin.get(db, :non_existent_key, :default_value)
# => :default_value
@spec get_multi(Supervisor.supervisor(), [db_key()]) :: [{db_key(), db_value()}]
Retrieves values for multiple keys in a single operation. If a key in the provided list of keys is not found then it is excluded from the resulting list of key-value pairs.
If a lot of keys are already flushed to disk, then this operation is more efficient than calling get/3 multiple times.
Parameters
- db - The database server (PID or registered name)
- keys - A list of keys to look up
Returns
- A list of {key, value} tuples for keys found, sorted by key
Examples
Goblin.put_multi(db, [
  {:user_1, %{name: "Alice"}},
  {:user_2, %{name: "Bob"}}
])
Goblin.get_multi(db, [:user_1, :user_2])
# => [user_1: %{name: "Alice"}, user_2: %{name: "Bob"}]
Goblin.get_multi(db, [:user_1, :user_2, :non_existing_user])
# => [user_1: %{name: "Alice"}, user_2: %{name: "Bob"}]
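Because missing keys are silently excluded from the result, callers that need to know which keys were absent have to compare the result against the requested keys. The following sketch shows one way to do that. MultiLookup is a hypothetical helper, and the found list is a stand-in copied from the example output rather than a live call.

```elixir
defmodule MultiLookup do
  # Splits a lookup into a map of found values and a list of missing keys.
  # `found` has the shape documented for get_multi/2: a list of {key, value}.
  def partition(requested_keys, found) do
    found_keys = MapSet.new(found, fn {key, _value} -> key end)
    missing = Enum.reject(requested_keys, &MapSet.member?(found_keys, &1))
    {Map.new(found), missing}
  end
end

# Stand-in for Goblin.get_multi(db, [:user_1, :user_2, :non_existing_user]).
found = [user_1: %{name: "Alice"}, user_2: %{name: "Bob"}]

{values, missing} =
  MultiLookup.partition([:user_1, :user_2, :non_existing_user], found)

IO.inspect(values)
IO.inspect(missing)
```

Here missing is [:non_existing_user], and values maps the two found keys to their stored maps.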
@spec put(Supervisor.supervisor(), db_key(), db_value()) :: :ok
Writes a key-value pair to the database.
Parameters
- db - The database server (PID or registered name)
- key - Any Elixir term to use as the key
- value - Any Elixir term to associate with key
Returns
:ok
Examples
Goblin.put(db, :user_123, %{name: "Alice", age: 30})
# => :ok
@spec put_multi(Supervisor.supervisor(), [{db_key(), db_value()}]) :: :ok
Writes multiple key-value pairs in a single operation.
This is more efficient than calling put/3 multiple times as it batches the writes together.
Parameters
- db - The database server (PID or registered name)
- pairs - A list of {key, value} tuples
Returns
:ok
Examples
Goblin.put_multi(db, [
  {:user_1, %{name: "Alice"}},
  {:user_2, %{name: "Bob"}},
  {:user_3, %{name: "Charlie"}}
])
# => :ok
@spec read(Supervisor.supervisor(), (Goblin.Tx.t() -> any())) :: any()
Performs a read transaction. A snapshot is taken in order to provide consistent reads of the database. Readers run concurrently and do not block each other.
Attempting to write to the database during this transaction raises an error.
Any file deletion is prevented while there exists an ongoing read transaction.
Parameters
- db - The database server (PID or registered name)
- f - A function that takes a Goblin.Tx.t() struct
Returns
- The return value of f
Examples
# Get two values within snapshot
Goblin.read(db, fn tx ->
  key1 = Goblin.Tx.get(tx, :start, 0)
  key2 = Goblin.Tx.get(tx, key1)
  {key1, key2}
end)
# => {key1, key2}
# Attempting to write will raise
Goblin.read(db, fn tx ->
  tx = Goblin.Tx.put(tx, :user_123, %{name: "Alice", age: 30})
  {:commit, tx, :ok}
end)
# => RuntimeError
@spec remove(Supervisor.supervisor(), db_key()) :: :ok
Removes a key from the database.
This operation writes a tombstone marker for the key. The actual data is eventually removed during compaction.
Parameters
- db - The database server (PID or registered name)
- key - The key to remove
Returns
:ok
Examples
Goblin.get(db, :user_123)
# => %{name: "Alice", age: 30}
Goblin.remove(db, :user_123)
# => :ok
Goblin.get(db, :user_123)
# => nil
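The tombstone behaviour described above can be illustrated with a toy append-only log. This is a deliberately simplified, single-log model of the general LSM technique and not Goblin's implementation. TombstoneModel and its marker atom are hypothetical.

```elixir
defmodule TombstoneModel do
  @tombstone :__tombstone__

  # Writes only ever append; nothing is deleted in place.
  def put(log, key, value), do: log ++ [{key, value}]
  def remove(log, key), do: log ++ [{key, @tombstone}]

  # A read returns the newest entry for the key; a tombstone reads as "not found".
  def get(log, key, default \\ nil) do
    case List.keyfind(Enum.reverse(log), key, 0) do
      {^key, @tombstone} -> default
      {^key, value} -> value
      nil -> default
    end
  end

  # Compaction keeps the newest entry per key and drops tombstoned keys.
  def compact(log) do
    log
    |> Map.new()
    |> Enum.reject(fn {_key, value} -> value == @tombstone end)
    |> Enum.sort()
  end
end

log =
  []
  |> TombstoneModel.put(:user_123, %{name: "Alice", age: 30})
  |> TombstoneModel.remove(:user_123)

IO.inspect(TombstoneModel.get(log, :user_123))
IO.inspect(length(log))
IO.inspect(TombstoneModel.compact(log))
```

In this model the read returns nil immediately after the removal, while the log still holds two entries until compact/1 discards them.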
@spec remove_multi(Supervisor.supervisor(), [db_key()]) :: :ok
Removes multiple keys from the database in a single operation.
This is more efficient than calling remove/2 multiple times as it batches the removals together.
Parameters
- db - The database server (PID or registered name)
- keys - A list of keys to remove
Returns
:ok
Examples
Goblin.remove_multi(db, [:user_1, :user_2, :user_3])
# => :ok
@spec select(Supervisor.supervisor(), keyword()) :: Enumerable.t({db_key(), db_value()})
Retrieves all key-value pairs within a specified range in the form of a stream.
Entries in the stream are sorted by key in ascending order.
Both min and max are inclusive.
If neither bound is specified, then all key-value pairs are retrieved.
Specifying only min returns all key-value pairs from min and onwards.
Specifying only max returns all key-value pairs until max.
Parameters
- db - The database server (PID or registered name)
- opts - Keyword list of options:
  - :min - Minimum key (inclusive, optional)
  - :max - Maximum key (inclusive, optional)
Returns
- A stream over the key-value pairs in the provided range
Examples
Goblin.put_multi(db, [
  {1, "one"},
  {2, "two"},
  {3, "three"},
  {4, "four"},
  {5, "five"}
])
Goblin.select(db, min: 2, max: 4) |> Enum.to_list()
# => [{2, "two"}, {3, "three"}, {4, "four"}]
Goblin.select(db, min: 3) |> Enum.to_list()
# => [{3, "three"}, {4, "four"}, {5, "five"}]
Goblin.select(db, max: 2) |> Enum.to_list()
# => [{1, "one"}, {2, "two"}]
Goblin.select(db) |> Enum.to_list()
# => [{1, "one"}, {2, "two"}, {3, "three"}, {4, "four"}, {5, "five"}]
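The bound semantics can be modelled with plain streams over a key-sorted list. The sketch below mirrors the documented rules (inclusive bounds, a missing bound means unbounded) but is only a model. RangeModel is hypothetical, and it compares keys with Elixir's built-in term ordering, which is assumed rather than confirmed to match Goblin's key order.

```elixir
defmodule RangeModel do
  # Lazily filters an already key-sorted list of {key, value} pairs.
  # Both bounds are inclusive; an absent bound leaves that side open.
  def select(sorted_pairs, opts \\ []) do
    min = Keyword.fetch(opts, :min)
    max = Keyword.fetch(opts, :max)

    sorted_pairs
    |> Stream.drop_while(fn {key, _value} -> below?(key, min) end)
    |> Stream.take_while(fn {key, _value} -> not above?(key, max) end)
  end

  defp below?(_key, :error), do: false
  defp below?(key, {:ok, min}), do: key < min

  defp above?(_key, :error), do: false
  defp above?(key, {:ok, max}), do: key > max
end

pairs = [{1, "one"}, {2, "two"}, {3, "three"}, {4, "four"}, {5, "five"}]

RangeModel.select(pairs, min: 2, max: 4) |> Enum.to_list() |> IO.inspect()
# => [{2, "two"}, {3, "three"}, {4, "four"}]
```

As with a real stream, nothing is evaluated until the result is enumerated, so a consumer can stop early with functions such as Enum.take/2.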
@spec start_link(keyword()) :: Supervisor.on_start()
Starts the Goblin database supervisor and all child processes.
Creates the database directory if it doesn't exist and initializes all components including the Writer, Store, Compactor, WAL, and Manifest.
Options
- :name - Registered name for the supervisor (optional)
- :db_dir - Directory path for database files (required)
- :flush_level_file_limit - Number of files allowed in the flush level before compaction is triggered (default: 4)
- :mem_limit - Number of bytes held in memory before flushing (default: 64 * 1024 * 1024)
- :level_base_size - Number of bytes stored in level 1 before compaction is triggered (default: 256 * 1024 * 1024)
- :level_size_multiplier - Factor by which each level's size limit is multiplied (default: 10)
- :max_sst_size - Maximum size, in bytes, of the SST portion of a disk table (default: level_base_size / level_size_multiplier)
- :bf_fpp - Bloom filter false positive probability (default: 0.01)
Returns
- {:ok, pid} - On successful start
- {:error, reason} - On failure
Examples
{:ok, db} = Goblin.start_link(
  name: MyApp.DB,
  db_dir: "/var/lib/myapp/db",
  flush_level_file_limit: 4,
  mem_limit: 64 * 1024 * 1024,
  level_base_size: 256 * 1024 * 1024,
  level_size_multiplier: 10,
  bf_fpp: 0.01
)
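When tuning :bf_fpp it helps to know roughly what a given false positive probability costs in memory. The sketch below applies the textbook formulas for an optimally sized Bloom filter: bits per key = -ln(p) / (ln 2)^2, and hash functions = bits per key * ln 2. Whether Goblin sizes its filters this way is an assumption, and BloomSizing is a hypothetical module.

```elixir
defmodule BloomSizing do
  # Bits needed per stored key for a target false positive probability,
  # using the standard optimal-size formula.
  def bits_per_key(fpp) when fpp > 0 and fpp < 1 do
    -:math.log(fpp) / :math.pow(:math.log(2), 2)
  end

  # Number of hash functions that minimises false positives at that size.
  def hash_count(fpp) when fpp > 0 and fpp < 1 do
    round(bits_per_key(fpp) * :math.log(2))
  end
end

IO.inspect(Float.round(BloomSizing.bits_per_key(0.01), 2))
IO.inspect(BloomSizing.hash_count(0.01))
```

For the default of 0.01 this gives about 9.59 bits per key and 7 hash functions. Each further tenfold reduction in the probability adds roughly 4.8 bits per key.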
@spec stop(Supervisor.supervisor(), term(), non_neg_integer() | :infinity) :: :ok
Stops the Goblin database supervisor.
Parameters
- db - Database supervisor PID or registered name
- reason - Supervisor reason, defaults to :normal
- timeout - Defaults to :infinity
Returns
:ok
@spec subscribe(Supervisor.supervisor()) :: :ok | {:error, term()}
Subscribes the current process to database writes.
When a write occurs, either {:put, key, value} or {:remove, key} is dispatched to each subscriber's mailbox.
Batch writes result in separate messages.
Parameters
- db - The database server (PID or registered name)
Returns
- :ok if successful, {:error, reason} otherwise
Examples
Goblin.subscribe(db)
# => :ok
Goblin.put(db, :user_1, %{name: "Alice"})
flush()
# => {:put, :user_1, %{name: "Alice"}}
Goblin.remove(db, :user_1)
flush()
# => {:remove, :user_1}
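Outside of IEx, a subscriber would normally consume these messages with receive rather than flush(). The sketch below keeps an in-memory map in sync with the documented message shapes. CacheMirror is a hypothetical module, and the messages are simulated with send/2 so that the example runs without a database.

```elixir
defmodule CacheMirror do
  # Applies every write notification currently in the mailbox to `cache`
  # and returns the updated map once no matching message is left.
  def drain(cache \\ %{}) do
    receive do
      {:put, key, value} -> drain(Map.put(cache, key, value))
      {:remove, key} -> drain(Map.delete(cache, key))
    after
      0 -> cache
    end
  end
end

# Simulated notifications, in the shapes documented for subscribe/1.
send(self(), {:put, :user_1, %{name: "Alice"}})
send(self(), {:put, :user_2, %{name: "Bob"}})
send(self(), {:remove, :user_1})

IO.inspect(CacheMirror.drain())
# => %{user_2: %{name: "Bob"}}
```

A long-running subscriber would more likely be a GenServer that calls Goblin.subscribe/1 in init/1 and handles the same tuples in handle_info/2.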
@spec transaction(Supervisor.supervisor(), (Goblin.Tx.t() -> Goblin.Tx.return())) :: any() | {:error, :aborted | :already_in_tx | :not_writer}
Executes a function within a database transaction that allows writes.
Transactions are executed serially and are ACID-compliant.
The provided function, f, receives a transaction struct and must return either {:commit, tx, reply} in order to commit the transaction, or :abort to abort the transaction.
Returning anything else raises an error.
Parameters
- db - The database server (PID or registered name)
- f - A function that takes a Goblin.Tx.t() struct and returns a transaction result
Returns
- reply - The value returned from the transaction when successfully committed
- {:error, :aborted} - When the transaction is aborted
- {:error, :already_in_tx} - When attempting to create nested transactions
- {:error, :not_writer} - When a process other than the current writer attempts to commit
Examples
# Update a value
Goblin.transaction(db, fn tx ->
  count = Goblin.Tx.get(tx, :counter, 0)
  tx = Goblin.Tx.put(tx, :counter, count + 1)
  {:commit, tx, :ok}
end)
# => :ok
# Abort a transaction
Goblin.transaction(db, fn _tx ->
  :abort
end)
# => {:error, :aborted}
@spec unsubscribe(Supervisor.supervisor()) :: :ok | {:error, term()}
Unsubscribes the current process from future database writes.
Parameters
- db - The database server (PID or registered name)
Returns
:ok