View Source Ecto.Adapters.FoundationDB (Ecto.Adapters.FoundationDB v0.4.0)
Adapter module for FoundationDB.
It uses erlfdb for communicating with the database, and implements an Ecto-compatible Layer on top of FDB's key-value store.
Installation
Install the latest stable release of FoundationDB from the official FoundationDB Releases.
The foundationdb-server
package is required on any system that will be running
a FoundationDB server instance. For example, it's common to
run the foundationdb-server
on your development machine and on managed
instances running a FoundationDB cluster, but not for your stateless Elixir
application server in production.
foundationdb-clients
is always required.
Include :ecto_foundationdb
in your list of dependencies in mix.exs
:
defp deps do
[
{:ecto_foundationdb, "~> 0.4"}
]
end
Usage
Define this module in your MyApp
application:
defmodule MyApp.Repo do
use Ecto.Repo, otp_app: :my_app, adapter: Ecto.Adapters.FoundationDB
use EctoFoundationDB.Migrator
def migrations(), do: []
end
Edit your config.exs
file to include the following:
config :my_app, MyApp.Repo,
cluster_file: "/etc/foundationdb/fdb.cluster"
In your app's supervisor, you can start the repo as follows:
children = [
...
MyApp.Repo,
...
]
Supervisor.start_link(...)
Tenants
EctoFoundationDB requires your application to use multitenancy via EctoFoundationDB.Tenant
.
We strongly encourage your application to decide on a multitenancy strategy that makes sense
for your problem domain (or simply use "default"
to get started).
Once you have opened a tenant, you'll use the Ecto :prefix
option to provide the tenant
to each Repo operation.
Creating a schema
This is a standard Ecto.Schema
that will be used in the examples of this documentation.
defmodule User do
use Ecto.Schema
@primary_key {:id, :binary_id, autogenerate: true}
schema "users" do
field(:name, :string)
field(:department, :string)
field(:balance, :integer)
timestamps()
end
# ...
end
Opening a tenant
See EctoFoundationDB.Tenant
for more details.
alias EctoFoundationDB.Tenant
tenant = Tenant.open!(MyApp.Repo, "some-org")
Inserting a new struct
The Tenant is stored on the struct as metadata, so that when passing structs
to Repo functions, the :prefix
option is not required.
alice = %User{name: "Alice", department: "Engineering"}
|> FoundationDB.usetenant(tenant)
alice = MyApp.Repo.insert!(alice)
Querying for a struct using the primary key
The :prefix
option is required. The struct returned will have the tenant
in the metadata, like above.
MyApp.Repo.get!(User, alice.id, prefix: tenant)
Transactions
Multiple calls to Repo functions can be grouped inside a single transaction, which is ACID and globally serializable.
Here's a simple example. Transactions can be arbitrarily complex, but there is a limit to the size and runtime.
MyApp.Repo.transaction(fn ->
alice = MyApp.Repo.get!(User, alice.id)
if alice.balance > 0 do
MyApp.Repo.update(User.changeset(alice, %{balance: alice.balance - 1}))
else
raise "Overdraft"
end
end, prefix: tenant)
Indexes and Queries
The implication of the FoundationDB Layer Concept is that the manner in which data can be stored and accessed is the responsibility of the client application. To achieve general purpose data storage and access patterns commonly desired in web applications, EctoFoundationDB provides support for indexes out of the box, and also allows your application to define custom indexes.
When you define a migration that calls create index(User, [:department])
, a Default
index is
created on the :department
field. The following two indexes are equivalent:
# ...
create index(User, [:department])
# is equivalent to
create index(User, [:department], options: [indexer: EctoFoundationDB.Indexer.Default]))]
# ...
Timestamp indexes and multi-field indexes are supported:
create index(User, [:inserted_at])
create index(User, [:birthyear, :department, :birthdate])
In fact, the index value can be any Elixir term. All types support Equal queries (of the form x.field == "value"
).
However, certain Ecto Schema types support Between queries (of the form x.field >= 10 and x.field < 20
).
See the Data Types section for the list.
See the Migrations section for further details about managing indexes in your application.
Index Query Examples
Retrieve all users in the Engineering department (with an Equal constraint):
query = from(u in User, where: u.department == ^"Engineering")
MyApp.Repo.all(query, prefix: tenant)
Retrieve all users with a birthyear from 1992 to 1994 (with a Between constraint):
query = from(u in User,
where: u.birthyear >= ^1992 and u.birthyear <= ^1994
)
MyApp.Repo.all(query, prefix: tenant)
Retrieve all Engineers born in August 1992:
query = from(u in User,
where: u.birthyear == ^1992 and
u.department == ^"Engineering" and
u.birthdate >= ^~D[1992-08-01] and u.birthdate < ^~D[1992-09-01]
)
MyApp.Repo.all(query, prefix: tenant)
Index ordering matters
When you create an index using multiple fields, the FDB key that stores the index will be extended with all the values in the order of your defined index fields. Because FoundationDB stores the keys themselves in a well-defined order, the order of the fields in your index determines the Between queries you can perform.
RULE: For any Query, there can be 0 or 1 Between clauses, and if one exists for a field, then there must not exist an Equal clause for ay field that follows, according to the list of fields in the index. Moreover, the Between field must be of a Data Type supported for Between queries.
Above, we demonstrated a nontrivial query with 2 Equal constraints and 1 Between constraint. Compare that with an invalid
query below, which attempts a Between constraint on the :birthyear
field followed by an Equal constraint on the :department
field.
# Invalid query!
from(u in User,
where: u.birthyear >= ^1992 and u.birthyear < ^1995 and
u.department == ^"Engineering"
)
Why is this query unsupported by EctoFDB? With the index defined as-is, EctoFDB is unable to extract a minimal dataset from the database in one operation. Instead of over-extracting data and then applying a filter, we've chosen to reject the query. We hope this encourages correct index management and removes worry about execution time of individual Repo function calls.
Yeah that's cool, but how do I run the query I want to run? You have 2 options.
- Purposefully over-extract the data using Stream, and do the filtering in Elixir:
from(u in User, where: u.birthyear >= ^1992 and u.birthyear < 1995 and)
|> MyApp.Repo.stream(prefix: tenant)
|> Stream.filter(&((&1).department == "Engineering"))
|> Enum.to_list()
or
- Create the necessary index. For our example above, the new index might look like:
create index(User, [:department, :birthyear])
We encourage you to test execution times of your expected workload before deciding to create an index.
Custom Indexes
As data is inserted, updated, deleted, and queried, the Indexer callbacks (via behaviour EctoFoundationDB.Indexer
)
are executed. Your Indexer module does all the work necessary to maintain the index within the transaction
that the data itself is being manipulated. Many different types of indexes are possible. You are in control!
For an example, please see the MaxValue implementation which is used internally by EctoFoundationDB to keep track of the maximum value of a field in a Schema.
Transactions
The Repo functions will automatically use FoundationDB transactions behind the scenes. EctoFoundationDB also exposes the Ecto.Repo transaction function, allowing you to group operations together with transactional isolation:
MyApp.Repo.transaction(fn ->
# Ecto work
end, prefix: tenant)
Please read the FoundationDB Developer Guide on transactions for more information about how to develop with transactions.
It's important to remember that even though the database gives ACID guarantees about the
keys updated in a transaction, the Elixir function passed into Repo.transaction/1
may be executed more than once when any conflicts
are encountered. For this reason, your function must not have any side effects other than
the database operations. For example, do not publish any messages to other processes
(such as Phoenix.PubSub
) from within the transaction.
# Do not do this!
MyApp.Repo.transaction(fn ->
MyApp.Repo.insert(%User{name: "Alice"})
...
Phoenix.PubSub.broadcast("my_topic", "new_user") # Not safe! :(
end, prefix: tenant)
# Instead, do this:
MyApp.Repo.transaction(fn ->
MyApp.Repo.insert(%User{name: "Alice"})
...
end, prefix: tenant)
Phoenix.PubSub.broadcast("my_topic", "new_user") # Safe :)
If you're looking for PubSub-like functionality at the transaction-level, please see Watches.
Migrations
At first glance, EctoFoundationDB migrations may look similar to that of :ecto_sql
,
but the actual execution of migrations and how your app must be configured are very
different, so please read this section in full.
If your app uses indexes on any of your schemas, you must include a module that implements
the EctoFoundationDB.Migrator
behaviour. By default, this is assumed to be your Repo
module, as we have demonstrated above.
Migrations are only used for index management. There are no tables to add, drop, or modify.
When you add a field to your Schema, any read requests on old objects will return nil
in
that field.
As tenants are opened during your application runtime, migrations will be executed
automatically. This distributes the migration across a potentially long period of time,
as migrations will not be executed unless the tenant is opened. Specifically, a call to
Tenant.open/2
will block until the migration for that tenant completes. Other tenants
will not be affected.
The Migrator is a module in your application runtime that provides the full list of ordered migrations. These are the migrations that will be executed when a tenant is opened. If you leave out a migration from the list, it will not be applied.
The result of migrations/0
will change over time to include new migrations, like so:
defmodule MyApp.Repo do
# ...
def migrations() do
[
{0, MyApp.IndexUserByDepartment}
# At a later date, we my add a new index with a corresponding addition
# tho the migrations list:
# {1, MyApp.IndexUserByRole}
]
end
end
Each migration is contained in a separate module, much like EctoSQL's. However, the operations to be carried out must be returned as a list. For example, the creation an index may look like this:
defmodule MyApp.IndexUserByDepartment do
use EctoFoundationDB.Migration
def change() do
[
create index(User, [:department])
]
end
end
Your migrator and MyApp.<Migration>
modules are part of your codebase. They must be
included in your application release.
Migrations can be completed in full at any time with a call to
EctoFoundationDB.Migrator.up_all/1
. Depending on how many tenants you have in
your database, and the size of the data for each tenant, this operation might take a long
time and make heavy use of system resources. It is safe to interrupt this operation, as
migrations are performed transactionally. However, if you interrupt a migration, the next
attempt for that tenant may have a brief delay (~5 sec); the migrator must ensure that the
previous migration has indeed stopped executing.
Data Types
EctoFoundationDB stores your struct's data using :erlang.term_to_binary/1
, and
retrieves it with :erlang.binary_to_term/1
. As such, there is no data type
conversion between Elixir types and Database Types. Any term you put in your
struct will be stored and later retrieved.
Data types are used by EctoFoundationDB for the creation and querying of indexes. Certain Ecto types are encoded into the FDB key, which allows you to formulate Between queries on indexes of these types:
:id
:binary_id
:integer
:float
:boolean
:string
:binary
:date
:time
:time_usec
:naive_datetime
:naive_datetime_usec
:utc_datetime
:utc_datetime_usec
See Indexes and Queries for more information.
Watches
FoundationDB Watches are
similar to Triggers in an RDBMS. Registering a watch on a particular key provides a guarantee*
that when that key in the database is changed, the client application is notified with a
push-style notification. This registration is initiated inside a transaction, and the caller is
provided a EctoFoundationDB.Future
object that is used to receive the push notification at a later time.
The notification is delivered directly to the Elixir process that requested it.
The EctoFDB Repo module allows you to register a watch on any struct that you have written to the database. The FDB watch is registered on the key that represents the Primary Write (the key-value that stores all the data of your struct). When that struct changes in the database via any means, the Future is resolved and the caller has the opportunity to refresh its internal state.
EctoFDB Repo functions for watches
EctoFDB adds the following functions to your Repo module. They are each discussed in detail below.
Repo.watch/1
/Repo.watch/2
: Registers a watchRepo.assign_ready/2
/Repo.assign_ready/3
: Handles Future resolution via a push message
Repo.watch/2
With arguments struct
and options
, registers
a watch on the given struct and returns a Future. This Future resolves to
a value when the FDB key is set or cleared. Until then, the Future remains unresolved.
The Future from a watch is resolvable outside of an FDB transaction.
In addition to other standard options, a :label
option will prime the Future for use by assign_ready/3
.
It's recommended that you provide a label and use assign_ready/3
, but not required.
Repo.assign_ready/3
With arguments futures
, ready_refs
, and
options
, returns a tuple containing: 1) a Keyword of structs with keys according to the :label
provided to watch/2
, and 2) an updated list of futures that are still unresolved.
In order for the caller to acquire the ready_refs
arguments, you must receive messages from the
process mailbox of the form {ref, :ready}
. See the example below for detail. If your process chooses
to buffer a list ready_refs
longer than 1, then assign_ready/3
uses pipelining to retrieve the
resulting structs. Each watched struct must have been registered with a unique :label
.
The options
are provided to Repo.async_get/3
internally. In addition to standard options, a
watch?: true
option will re-register a new watch on the struct so that your calling
process's event loop can always be aware of changes.
Example
The power of watches is best illustrated in the context of a stateful Elixir process that is interested in tracking the up-to-date value for a particular schema. This could be a GenServer, LiveView, or some other stateful process.
defmodule MyAliceView do
# ...
def init(tenant) do
{alice, future} = MyRepo.transaction(
fn ->
alice = MyRepo.get_by(User, name: "Alice")
{alice, MyRepo.watch(alice, label: :alice)}
end, prefix: tenant)
assigns = [alice: alice]
{:ok, %{assigns: assigns, futures: [future]}}
end
# ...
def handle_info({ref, :ready}, state=%{assigns: assigns, futures: futures}) when is_reference(ref) do
{new_assigns, new_futures} = MyRepo.assign_ready(futures, [ref], watch?: true)
{:noreply, %{
state |
assigns: Map.merge(assigns, Enum.into(new_assigns, %{})),
futures: new_futures
}}
end
end
By handling the Future message resolution event {ref, :ready}
in the handle_info/2
, we never need to
call Repo.await/1
. Instead, our process can do other work and react to the :ready
event immediately
when it's delivered.
And Repo.assign_ready/3
provides a conventional way to retrieve the result and register a new watch
in one go, so that updates to Alice are not missed.
Upserts
The FoundationDB Adapter supports the following upsert approaches.
Upserts with on_conflict
The following choices for on_conflict
are supported, and they work in the manner
described by the official Ecto documentation.
:raise
: The default.:nothing
:replace_all
{:replace_all_except, fields}
{:replace, fields}
Upserts with conflict_target
If you provide conflict_target: []
, the FoundationDB Adapter will make no effort to
inspect the existence of objects in the database before insert. When an existing
object is blindly overwritten, your indexes may become inconsistent, resulting in undefined
behavior. You should only consider its use if one of the following is true:
- (a) your schema has no indexes
- (b) you know apriori that any indexed fields are unchanged at the time of upsert, or
- (c) you know apirori that the primary key you're inserting doesn't already exist.
For these use cases, this option may speed up the loading of initial data at scale.
Pipelining
Pipelining is the technique of sending multiple queries to a database at the same time, and
then receiving the results at a later time. In doing so, you can often avoid waiting
for multiple network round trips. EctoFDB uses a async
/await
syntax to
implement pipelining.
EctoFDB Repo functions for pipelining
EctoFDB adds the following functions to your Repo module. You'll notice that there are async_*
versions of all the Queryable functions on the standard Ecto.Repo
. These have the same arguments
as their counterparts, and they each return a single EctoFoundationDB.Future
.
Repo.async_get/2
/Repo.async_get/3
Repo.async_get!/2
/Repo.async_get!/3
Repo.async_get_by/2
/Repo.async_get_by/3
Repo.async_get_by!/2
/Repo.async_get_by!/3
Repo.async_one/1
/Repo.async_one/2
Repo.async_one!/1
/Repo.async_one!/2
Repo.async_all/1
/Repo.async_all/2
Repo.await/1
: The results of the above functions are provided toRepo.await/1
to resolve the Futures. It accepts a single Future or a list of Futures.
Example
Consider this example. Our transaction below has 2 network round trips in serial. The "Alice" User is retrieved, and then the "Bob" user is retrieved. Finally, the list is constructed and the transaction ends.
result = MyApp.Repo.transaction(fn ->
[
MyApp.Repo.get_by(User, name: "Alice"),
MyApp.Repo.get_by(User, name: "Bob")
]
end, prefix: tenant)
[alice, bob] = result
With pipelining, we can issue 2 async_get_by
queries without waiting for their result,
and then only when we need the data, we await
the result. Notice that the list of futures
is constructed, and then we await
on them.
result = MyApp.Repo.transaction(fn ->
futures = [
MyApp.Repo.async_get_by(User, name: "Alice"),
MyApp.Repo.async_get_by(User, name: "Bob")
]
MyApp.Repo.await(futures)
end, prefix: tenant)
[alice, bob] = result
Conceptually, only 1 network round-trip is required, so this will be faster than waiting on each in sequence.
Pipelining on range queries (interleaving waits)
The functions Repo.async_all/1
and Repo.async_all/2
work a little bit different on the await. When the DB must
return a list of results, if that list ends up being larger than :erlfdb
's configured :target_bytes
,
then the result set is split into multiple round-trips to the database. If your application awaits multiple
such queries, then :erlfdb
's "interleaving waits" feature is used.
Consider this example, where we're retrieving all Users and all Posts in the same transaction.
MyApp.Repo.transaction(fn ->
futures = [
MyApp.Repo.async_all(User),
MyApp.Repo.async_all(Post)
]
MyApp.Repo.await(futures)
end, prefix: tenant)
Each query individually executed would be multiple round-trips to the database as pages of results are retrieved.
With :erlfdb
's interleaving waits, we send as many get_range requests as possible with pipelining
until all are exhausted.
This happens automatically when you include the list of futures to your Repo.await/1
.
Standard Options
:cluster_file
- The path to the fdb.cluster file. The default is"/etc/foundationdb/fdb.cluster"
.:migrator
- A module that implements theEctoFoundationDB.Migrator
behaviour. Required when using any indexes. Defaults tonil
. Whennil
, your Repo module is assumed to be the Migrator.
Advanced Options
:storage_id
- All tenants created by this adapter are prefixed with this string. This allows multiple configurations to operate on the same FoundationDB cluster independently. Defaults to"Ecto.Adapters.FoundationDB"
.:open_db
- 1-arity function accepting the Repo module. Used for opening a reference to the FoundationDB cluster. Defaults to&EctoFoundationDB.Database.open/1
. When usingEctoFoundationDB.Sandbox
, you should set this option toSandbox.open_db/1
.:migration_step
- The maximum number of keys to process in a single transaction when running migrations. Defaults to1000
. If you use a number that is too large, the FDB transactions run by the Migrator will fail.:idx_cache
- When set to:enabled
, the Ecto ets cache is used to store the available indexes per tenant. This speeds up all database operations. Defaults to:enabled
.:max_single_value_size
- Number of Bytes above which an encoded struct will be split into multiple FDB key-value pairs upon insert and update. Defaults to100_000
. Any value beyond100_000
is unsupported by FoundationDB. You probably do not want to change this value. Also see:max_value_size
.:max_value_size
- Number of Bytes above which an encoded struct will be rejected by EctoFDB upon insert and update. Defaults to:infinity
, which means that a single encoded struct of any size can be written to any number of keys. Please remember that FDB's transaction size limit of 10MB still applies. The maximum number of key-value pairs for a given encoded struct, not counting indexes, is computed by1 + ceil(:max_value_size / :max_single_value_size)
.
Optimizing for throughput and latency
Pipelining operations inside your transactions can offer significant speed-ups, and you should also consider parallelizing multiple transactions. Both techniques are useful in different circumstances.
- Pipelining: The transaction can contain arbitrary logic and remain ACID compliant. If you make smart use of pipelining, the latency of execution of that logic is kept to a minimum.
- Parallel transactions: Each transaction is internally ACID compliant, but any 2 given transactions may conflict if they're accessing the same keys. When transactions do not conflict, throughput goes up.
If you're attempting to get the maximum throughput on data loading, you'll probably want to make use of both!
Read more at FDB Developer Guide | Loading Data.
Limitations and Caveats
FoundationDB
Please read FoundationDB Known Limitations.
Layer
EctoFoundationDB implements a specific Layer on the FoundationDB
key-value store. This Layer is intended to be generally useful, but you
may not find it suitable for your workloads. The Layer is documented in
detail at EctoFoundationDB.Layer
. This project does not support
plugging in other Layers.
Tenants
The use of a tenant implies an ownership relationship between the tenant and the data. It's up to you how you use this relationship in your application.
Failure to specify a tenant will result in a raised exception at runtime.
Migrations
The following are not supported:
- Renaming a table
- Renaming a field
- Dropping an index
- Moving down in migration bersions (rollback)
Finally, the Ecto Mix tasks are known to be unsupported:
# These commands are not supported. Do not use them with :ecto_foundationdb!
# mix ecto.create
# mix ecto.drop
# mix ecto.migrate
# mix ecto.gen.migration
# mix ecto.rollback
# mix ecto.migrations
Key and Value Size
FoundationDB has limits on key and value size Please be aware of these limitations as you develop. EctoFoundationDB doesn't make any attempt to protect you from these errors.
Ecto Queries
You'll find that most queries outside of those detailed in the documentation fail with
an EctoFoundationDB.Exception.Unsupported
exception. The FoundationDB Layer concept
precludes complex queries from being executed within the database. Therefore, it only makes sense
to implement a limited set of query types -- specifically those that do have optimized database
query semantics. All other filtering, aggregation, and grouping must be done by your Elixir code.
In other words, you'll be writing less SQL, and more Elixir.
Refer to the integration tests for a collection of queries that is known to work.
Other Ecto Features
As you test the boundaries of EctoFoundationDB, you'll find that many other Ecto features just plain don't work with this adapter. Ecto has a lot of features, and this adapter is in the early stage. For any given unsupported feature, the underlying reason will be one of the following.
- It makes sense for EctoFoundationDB, but it just hasn't been implemented yet.
- The fundamental differences between FDB and SQL backends mean that there is no practical reason to implement it in EctoFoundationDB.
Summary
Functions
Executes the given function in a transaction on the database.
Functions
@spec db(Ecto.Repo.t()) :: EctoFoundationDB.Database.t()
@spec transactional(EctoFoundationDB.Tenant.t() | nil, function()) :: any()
Executes the given function in a transaction on the database.
If you provide an arity-0 function, your function will be executed in a newly spawned process. This is to ensure that EctoFoundationDB can safely manage the process dictionary.
Please be aware of the limitations that FoundationDB imposes on transactions.
For example, a transaction must complete within 5 seconds.
@spec usetenant(Ecto.Schema.schema(), any()) :: Ecto.Schema.schema()