Polymorphic associations with many to many
Besides belongs_to, has_many, has_one and :through associations, Ecto also includes many_to_many. many_to_many relationships, as the name says, allow a record from table X to have many associated entries from table Y and vice versa. Although many_to_many associations can be written as has_many :through, using many_to_many may considerably simplify some workflows.
In this guide, we will talk about polymorphic associations and how many_to_many can remove boilerplate from certain approaches compared to has_many :through.
Todo lists
The internet has seen its share of todo list applications. But that won't stop us from creating our own!
In our case, there is one aspect of todo list applications we are interested in, which is the relationship where the todo list has many todo items. This exact scenario is explored in detail in a post about nested associations and embeds from Dashbit's blog. Let's recap the important points.
Our todo list app has two schemas, MyApp.TodoList and MyApp.TodoItem:
defmodule MyApp.TodoList do
  use Ecto.Schema
  schema "todo_lists" do
    field :title
    has_many :todo_items, MyApp.TodoItem
    timestamps()
  end
end
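defmodule MyApp.TodoItem do
  use Ecto.Schema
  schema "todo_items" do
    field :description
    # Defines the todo_list_id foreign key that has_many :todo_items relies on
    belongs_to :todo_list, MyApp.TodoList
    timestamps()
  end
end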
One of the ways to introduce a todo list with multiple items into the database is to couple our UI representation to our schemas. That's the approach we took in the blog post with Phoenix. Roughly:
<%= form_for @todo_list_changeset,
             todo_list_path(@conn, :create),
             fn f -> %>
  <%= text_input f, :title %>
  <%= inputs_for f, :todo_items, fn i -> %>
    ...
  <% end %>
<% end %>
When such a form is submitted in Phoenix, it will send parameters with the following shape:
%{
  "todo_list" => %{
    "title" => "shopping list",
    "todo_items" => %{
      "0" => %{"description" => "bread"},
      "1" => %{"description" => "eggs"}
    }
  }
}
We could then retrieve those parameters and pass them to an Ecto changeset, and Ecto would automatically figure out what to do:
# In MyApp.TodoList
def changeset(struct, params \\ %{}) do
  struct
  |> Ecto.Changeset.cast(params, [:title])
  |> Ecto.Changeset.cast_assoc(:todo_items, required: true)
end

# And then in MyApp.TodoItem
def changeset(struct, params \\ %{}) do
  struct
  |> Ecto.Changeset.cast(params, [:description])
end
By calling Ecto.Changeset.cast_assoc/3, Ecto will look for a "todo_items" key inside the parameters given on cast, and compare those parameters with the items stored in the todo list struct. Ecto will automatically generate instructions to insert, update or delete todo items such that:
- if a todo item sent as parameter has an ID and it matches an existing associated todo item, we consider that todo item should be updated
- if a todo item sent as parameter does not have an ID (nor a matching ID), we consider that todo item should be inserted
- if a todo item is currently associated but its ID was not sent as parameter, we consider the todo item is being replaced and we act according to the :on_replace option. By default, :on_replace will raise, so you must choose a behaviour between replacing, deleting, ignoring or nilifying the association
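The three rules above can be sketched in plain Elixir. This is only an illustration of the classification, not Ecto's implementation: the AssocDiffSketch module and its classify/2 function are hypothetical, IDs are compared as plain integers (Ecto casts them according to the schema), and the "replace" bucket is simply returned rather than handled according to :on_replace.

```elixir
defmodule AssocDiffSketch do
  # Hypothetical helper, not part of Ecto: splits incoming params into
  # inserts and updates, and lists the existing records that were not sent.
  def classify(existing, params) do
    existing_ids = MapSet.new(existing, & &1.id)

    # Params whose "id" matches an existing record are updates;
    # everything else (no ID, or an unknown ID) is an insert.
    {updates, inserts} =
      Enum.split_with(params, fn p -> MapSet.member?(existing_ids, p["id"]) end)

    # Existing records whose ID was not sent are considered replaced.
    sent_ids = MapSet.new(updates, & &1["id"])
    replaced = Enum.reject(existing, &MapSet.member?(sent_ids, &1.id))

    %{insert: inserts, update: updates, replace: replaced}
  end
end

existing = [%{id: 1, description: "bread"}, %{id: 2, description: "eggs"}]

params = [
  %{"id" => 1, "description" => "sourdough bread"},
  %{"description" => "milk"}
]

result = AssocDiffSketch.classify(existing, params)
IO.inspect(result)
```

Here the item with ID 1 is classified as an update, "milk" as an insert, and the item with ID 2 as replaced, which is the case where :on_replace decides what happens next.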
The advantage of using cast_assoc/3 is that Ecto is able to do all of the hard work of keeping the entries associated, as long as we pass the data exactly in the format that Ecto expects. However, such an approach is not always preferable, and in many situations it is better to design our associations differently or decouple our UIs from our database representation.
Polymorphic todo items
To show an example of where using cast_assoc/3 is just too complicated to be worth it, let's imagine you want your "todo items" to be polymorphic. For example, you want to be able to add todo items not only to "todo lists" but to many other parts of your application, such as projects, milestones, you name it.
First of all, it is important to remember Ecto does not provide the same type of polymorphic associations available in frameworks such as Rails and Laravel. In such frameworks, a polymorphic association uses two columns, the parent_id and parent_type. For example, one todo item would have parent_id of 1 with parent_type of "TodoList" while another would have parent_id of 1 with parent_type of "Project".
The issue with the design above is that it breaks database references. The database is no longer capable of guaranteeing that the item you associate to exists or will continue to exist in the future. This leads to an inconsistent database, which ends up pushing workarounds into your application.
The design above is also extremely inefficient, especially if you're working with large tables. Bear in mind that if that's your case, you might be forced to remove such polymorphic references in the future when frequent polymorphic queries start grinding the database to a halt even after adding indexes and optimizing the database.
Luckily, the documentation for the Ecto.Schema.belongs_to/3 macro includes a section named "Polymorphic associations" with some examples on how to design sane and performant associations. One of those approaches consists of using many join tables. Besides the "todo_lists" and "projects" tables and the "todo_items" table, we would create "todo_list_items" and "project_items" to associate todo items to todo lists and todo items to projects respectively. In terms of migrations, we are looking at the following:
create table(:todo_lists) do
  # add/3 requires a column type
  add :title, :string
  timestamps()
end

create table(:projects) do
  add :name, :string
  timestamps()
end

create table(:todo_items) do
  add :description, :string
  timestamps()
end

create table(:todo_list_items) do
  add :todo_item_id, references(:todo_items)
  add :todo_list_id, references(:todo_lists)
  timestamps()
end

create table(:project_items) do
  add :todo_item_id, references(:todo_items)
  add :project_id, references(:projects)
  timestamps()
end
By adding one table per association pair, we keep database references and can efficiently perform queries that rely on indexes.
First let's see how to implement this functionality in Ecto using has_many :through, and then use many_to_many to remove a lot of the boilerplate we were forced to introduce.
Polymorphism with has_many :through
Given we want our todo items to be polymorphic, we can no longer associate a todo list to todo items directly. Instead we will create an intermediate schema to tie MyApp.TodoList and MyApp.TodoItem together.
defmodule MyApp.TodoList do
  use Ecto.Schema
  schema "todo_lists" do
    field :title
    has_many :todo_list_items, MyApp.TodoListItem
    has_many :todo_items,
      through: [:todo_list_items, :todo_item]
    timestamps()
  end
end

defmodule MyApp.TodoListItem do
  use Ecto.Schema
  schema "todo_list_items" do
    belongs_to :todo_list, MyApp.TodoList
    belongs_to :todo_item, MyApp.TodoItem
    timestamps()
  end
end

defmodule MyApp.TodoItem do
  use Ecto.Schema
  schema "todo_items" do
    field :description
    timestamps()
  end
end
Although we introduced MyApp.TodoListItem as an intermediate schema, has_many :through allows us to access all todo items for any todo list transparently:
todo_lists |> Repo.preload(:todo_items)
The trouble is that :through associations are read-only since Ecto does not have enough information to fill in the intermediate schema. This means that, if we still want to use cast_assoc to insert a todo list with many todo items directly from the UI, we cannot use the :through association and instead must go step by step. We would need to first cast_assoc(:todo_list_items) from TodoList and then call cast_assoc(:todo_item) from the TodoListItem schema:
# In MyApp.TodoList
def changeset(struct, params \\ %{}) do
  struct
  |> Ecto.Changeset.cast(params, [:title])
  |> Ecto.Changeset.cast_assoc(
    :todo_list_items,
    required: true
  )
end

# And then in MyApp.TodoListItem
def changeset(struct, params \\ %{}) do
  struct
  # cast_assoc needs a changeset built from params, so cast first
  # even though there are no fields of our own to cast
  |> Ecto.Changeset.cast(params, [])
  |> Ecto.Changeset.cast_assoc(:todo_item, required: true)
end

# And then in MyApp.TodoItem
def changeset(struct, params \\ %{}) do
  struct
  |> Ecto.Changeset.cast(params, [:description])
end
To further complicate things, remember cast_assoc expects a particular shape of data that reflects your associations. In this case, because of the intermediate schema, the data sent through your forms in Phoenix would have to look as follows:
%{"todo_list" => %{
  "title" => "shopping list",
  "todo_list_items" => %{
    "0" => %{"todo_item" => %{"description" => "bread"}},
    "1" => %{"todo_item" => %{"description" => "eggs"}}
  }
}}
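To make the difference between the two parameter shapes concrete, here is a minimal sketch in plain Elixir that converts the flat shape (items directly under "todo_items") into the nested shape the intermediate schema requires. The ParamsSketch module and nest_todo_items/1 are hypothetical names used only for illustration, and the index keys are assumed to be strings.

```elixir
defmodule ParamsSketch do
  # Hypothetical helper: wraps every entry under "todo_items" in a
  # "todo_item" key and moves the result to "todo_list_items".
  def nest_todo_items(%{"todo_items" => items} = params) do
    nested = Map.new(items, fn {index, item} -> {index, %{"todo_item" => item}} end)

    params
    |> Map.delete("todo_items")
    |> Map.put("todo_list_items", nested)
  end

  # Params without "todo_items" are returned untouched.
  def nest_todo_items(params), do: params
end

flat = %{
  "title" => "shopping list",
  "todo_items" => %{
    "0" => %{"description" => "bread"},
    "1" => %{"description" => "eggs"}
  }
}

nested = ParamsSketch.nest_todo_items(flat)
IO.inspect(nested)
```

Needing a translation step like this, or the equivalent nesting in every form, is part of the boilerplate that the intermediate schema forces on us.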
To make matters worse, you would have to duplicate this logic for every intermediate schema, and introduce MyApp.TodoListItem for todo lists, MyApp.ProjectItem for projects, etc.
Luckily, many_to_many allows us to remove all of this boilerplate.
Polymorphism with many_to_many
In a way, the idea behind many_to_many associations is that they allow us to associate two schemas via an intermediate schema while automatically taking care of all details about the intermediate schema. Let's rewrite the schemas above to use many_to_many:
defmodule MyApp.TodoList do
  use Ecto.Schema
  schema "todo_lists" do
    field :title
    many_to_many :todo_items, MyApp.TodoItem,
      join_through: MyApp.TodoListItem
    timestamps()
  end
end

defmodule MyApp.TodoListItem do
  use Ecto.Schema
  schema "todo_list_items" do
    belongs_to :todo_list, MyApp.TodoList
    belongs_to :todo_item, MyApp.TodoItem
    timestamps()
  end
end

defmodule MyApp.TodoItem do
  use Ecto.Schema
  schema "todo_items" do
    field :description
    timestamps()
  end
end
Notice MyApp.TodoList no longer needs to define a has_many association pointing to the MyApp.TodoListItem schema and instead we can just associate to :todo_items using many_to_many.
Unlike has_many :through, many_to_many associations are also writable. This means we can send data through our forms exactly as we did at the beginning of this guide:
%{"todo_list" => %{
  "title" => "shopping list",
  "todo_items" => %{
    "0" => %{"description" => "bread"},
    "1" => %{"description" => "eggs"}
  }
}}
And we no longer need to define a changeset function in the intermediate schema:
# In MyApp.TodoList
def changeset(struct, params \\ %{}) do
  struct
  |> Ecto.Changeset.cast(params, [:title])
  |> Ecto.Changeset.cast_assoc(:todo_items, required: true)
end

# And then in MyApp.TodoItem
def changeset(struct, params \\ %{}) do
  struct
  |> Ecto.Changeset.cast(params, [:description])
end
In other words, we can use exactly the same code we had in the "todo lists has_many todo items" case. So even when external constraints require us to use a join table, many_to_many associations can automatically manage them for us. Everything you know about associations will just work with many_to_many associations as well.
Finally, even though we have specified a schema as the :join_through option in many_to_many, many_to_many can also work without intermediate schemas altogether by simply giving it a table name:
defmodule MyApp.TodoList do
  use Ecto.Schema
  schema "todo_lists" do
    field :title
    many_to_many :todo_items, MyApp.TodoItem,
      join_through: "todo_list_items"
    timestamps()
  end
end
In this case, you can completely remove the MyApp.TodoListItem schema from your application and the code above will still work. The only difference is that when using tables, any autogenerated value that is filled by the Ecto schema, such as timestamps, won't be filled, as we no longer have a schema. To solve this, you can either drop those fields from your migrations or set a default at the database level.
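As a sketch of the second option, the migration below asks the database to fill the timestamp columns itself. It assumes PostgreSQL (now() is database specific) and relies on the :default option of Ecto.Migration.timestamps/1 together with fragment/1; the migration module name is hypothetical. Note that a column default only applies on insert, so updated_at would not change on later updates.

```elixir
defmodule MyApp.Repo.Migrations.CreateTodoListItems do
  use Ecto.Migration

  def change do
    create table("todo_list_items", primary_key: false) do
      add :todo_item_id, references(:todo_items)
      add :todo_list_id, references(:todo_lists)

      # Assumption: PostgreSQL. The database sets inserted_at and
      # updated_at to now() because no schema is there to do it.
      timestamps(default: fragment("now()"))
    end
  end
end
```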
Summary
In this guide we used many_to_many associations to drastically improve a polymorphic association design that relied on has_many :through. Our goal was to allow "todo_items" to associate to different entities in our code base, such as "todo_lists" and "projects". We have done this by creating intermediate tables and by using many_to_many associations to automatically manage those join tables.
At the end, our schemas may look like:
defmodule MyApp.TodoList do
  use Ecto.Schema
  schema "todo_lists" do
    field :title
    many_to_many :todo_items, MyApp.TodoItem,
      join_through: "todo_list_items"
    timestamps()
  end

  def changeset(struct, params \\ %{}) do
    struct
    |> Ecto.Changeset.cast(params, [:title])
    |> Ecto.Changeset.cast_assoc(
      :todo_items,
      required: true
    )
  end
end

defmodule MyApp.Project do
  use Ecto.Schema
  schema "projects" do
    field :name
    many_to_many :todo_items, MyApp.TodoItem,
      join_through: "project_items"
    timestamps()
  end

  def changeset(struct, params \\ %{}) do
    struct
    |> Ecto.Changeset.cast(params, [:name])
    |> Ecto.Changeset.cast_assoc(
      :todo_items,
      required: true
    )
  end
end

defmodule MyApp.TodoItem do
  use Ecto.Schema
  schema "todo_items" do
    field :description
    timestamps()
  end

  def changeset(struct, params \\ %{}) do
    struct
    |> Ecto.Changeset.cast(params, [:description])
  end
end
And the database migration:
create table("todo_lists") do
  # add/3 requires a column type
  add :title, :string
  timestamps()
end

create table("projects") do
  add :name, :string
  timestamps()
end

create table("todo_items") do
  add :description, :string
  timestamps()
end

# Primary key and timestamps are not required if
# using many_to_many without schemas
create table("todo_list_items", primary_key: false) do
  add :todo_item_id, references(:todo_items)
  add :todo_list_id, references(:todo_lists)
  # timestamps()
end

# Primary key and timestamps are not required if
# using many_to_many without schemas
create table("project_items", primary_key: false) do
  add :todo_item_id, references(:todo_items)
  add :project_id, references(:projects)
  # timestamps()
end
Overall our code looks structurally the same as has_many would, although at the database level our relationships are expressed with join tables.
While in this guide we changed our code to cope with the parameter format required by cast_assoc, in Constraints and Upserts we drop cast_assoc altogether and use put_assoc, which brings more flexibility when working with associations.