Library Guidelines
This document outlines general guidelines, anti-patterns, and rules for those writing and publishing Elixir libraries meant to be consumed by other developers.
Getting started
You can create a new Elixir library by running the mix new command:
$ mix new my_library
The project name is given in the snake_case convention, where all letters are lowercase and words are separated with underscores. This is the same convention used by variables, function names, and atoms in Elixir. See the Naming Conventions document for more information.
Every project has a mix.exs file, with instructions on how to build, compile, run tests, and so on. Libraries commonly have a lib directory, which includes Elixir source code, and a test directory. A src directory may also exist for Erlang sources.
For more information on running your project, see the official Mix & OTP guide or Mix documentation.
Applications with supervision tree
The mix new command also allows the --sup option to scaffold an application with a supervision tree out of the box. We talk about supervision trees later on when discussing one of the common anti-patterns when writing libraries.
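For example, the following generates the same project as before, but with an application callback module and supervisor already in place (the project name is illustrative):

$ mix new my_library --sup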
Publishing
Writing code is only the first of many steps to publish a package. We strongly recommend developers to:
- Choose a versioning schema. Elixir requires versions to be in the format MAJOR.MINOR.PATCH, but the meaning of those numbers is up to you. Most projects choose Semantic Versioning.
- Choose a license. The most common licenses in the Elixir community are the MIT License and the Apache License 2.0. The latter is also the one used by Elixir itself.
- Run the code formatter. The code formatter formats your code according to a consistent style shared by your library and the whole community, making it easier for other developers to understand your code and contribute.
- Write tests. Elixir ships with a test framework named ExUnit. The project generated by mix new includes sample tests and doctests.
- Write documentation. The Elixir community is proud of treating documentation as a first-class citizen and making documentation easily accessible. Libraries contribute to the status quo by providing complete API documentation with examples for their modules, types, and functions. See the Writing Documentation guide for more information. Projects like ExDoc can be used to generate HTML and EPUB documents from the documentation. ExDoc also supports "extra pages", like this one that you are reading. Such pages augment the documentation with tutorials, guides, and references.
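As a rough sketch, the version and documentation tooling typically live in mix.exs; the version numbers and the :ex_doc requirement below are illustrative:

defmodule MyLibrary.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_library,
      # MAJOR.MINOR.PATCH; most projects follow Semantic Versioning
      version: "0.1.0",
      elixir: "~> 1.15",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  defp deps do
    [
      # ExDoc generates HTML and EPUB documentation from your docs
      {:ex_doc, "~> 0.30", only: :dev, runtime: false}
    ]
  end
end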
Projects are often made available to other developers by publishing a Hex package. Hex also supports private packages for organizations. If ExDoc is configured for the Mix project, publishing a package on Hex will also automatically publish the generated documentation to HexDocs.
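With a Hex account configured locally, publishing the package, and the generated documentation when ExDoc is set up, typically boils down to a single task:

$ mix hex.publish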
Dependency handling
When your library is published and used as a dependency, its lockfile (usually named mix.lock) is ignored by the host project. Running mix deps.get in the host project attempts to get the latest possible versions of your library's dependencies, as specified by the requirements in the deps section of your mix.exs. These versions might be greater than those stored in your mix.lock (and hence used in your tests / CI).
On the other hand, contributors to your library need a deterministic build, which implies the presence of mix.lock in your Version Control System (VCS).
The best practice for handling the mix.lock file is therefore to keep it in your VCS and run two different Continuous Integration (CI) workflows: the usual deterministic one, and another one that starts with mix deps.unlock --all and compiles your library and runs tests against the latest versions of dependencies. The latter may even run nightly or otherwise recurrently, so you stay notified about any issues caused by dependency updates.
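A minimal sketch of that second workflow, expressed as the commands the CI job would run:

$ mix deps.unlock --all
$ mix deps.get
$ mix test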
Anti-patterns
In this section we document common anti-patterns to avoid when writing libraries.
Avoid using exceptions for control-flow
You should avoid using exceptions for control-flow. For example, instead of:
try do
  contents = File.read!("some_path_that_may_or_may_not_exist")
  {:it_worked, contents}
rescue
  File.Error ->
    :it_failed
end
you should prefer:
case File.read("some_path_that_may_or_may_not_exist") do
  {:ok, contents} -> {:it_worked, contents}
  {:error, _} -> :it_failed
end
As a library author, it is your responsibility to make sure users are not required to use exceptions for control-flow in their applications. You can follow the same convention as Elixir here, using the name without ! for returning :ok/:error tuples and appending ! for a version of the function which raises an exception.
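A sketch of how a library could expose both variants; MyLib, fetch/1, and fetch!/1 are hypothetical names:

defmodule MyLib do
  # Returns {:ok, contents} or {:error, reason} for expected failures
  def fetch(path) do
    File.read(path)
  end

  # Same operation, but raises on failure
  def fetch!(path) do
    case fetch(path) do
      {:ok, contents} -> contents
      {:error, reason} -> raise "could not fetch #{path}: #{inspect(reason)}"
    end
  end
end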
It is important to note that a name without ! does not mean a function will never raise. For example, even File.read/1 can fail in case of bad arguments:
iex> File.read(1)
** (FunctionClauseError) no function clause matching in IO.chardata_to_string/1
The usage of :ok/:error tuples is about the domain that the function works on, in this case, file system access. Bad arguments, logical errors, and invalid options should raise regardless of the function name. If in doubt, prefer to return tuples instead of raising, as users of your library can always match on the results and raise if necessary.
Avoid working with invalid data
Elixir programs should prefer to validate data as close to the end user as possible, so the errors are easy to locate and fix. This practice also saves you from writing defensive code in the internals of the library.
For example, imagine you have an API that receives a filename as a binary. At some point you will want to write to this file. You could have a function like this:
def my_fun(some_arg, file_to_write_to, options \\ []) do
  ...some code...
  AnotherModuleInLib.invoke_something_that_will_eventually_write_to_file(file_to_write_to)
  ...more code...
end
The problem with the code above is that, if the user supplies an invalid input, the error will be raised deep inside the library, which makes it confusing for users. Furthermore, when you don't validate the values at the boundary, the internals of your library are never quite sure which kind of values they are working with.
A better function definition would be:
def my_fun(some_arg, file_to_write_to, options \\ []) when is_binary(file_to_write_to) do
Elixir also leverages pattern matching and guards in function clauses to provide clear error messages in case invalid arguments are given.
This advice does not only apply to libraries, but to any Elixir code. Every time you receive multiple options or work with external data, you should validate the data at the boundary and convert it to structured data. For example, if you provide a GenServer that can be started with multiple options, you want to validate those options when the server starts and rely only on structured data throughout the process life cycle. Similarly, if a database or a socket gives you a map of strings, after you receive the data, you should validate it and potentially convert it to a struct or a map of atoms.
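As a sketch of option validation at a process boundary, the server below validates its options on start and works with structured data afterwards; the module and option names are made up, and Keyword.validate!/2 requires Elixir v1.13 or later:

defmodule MyLib.Server do
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    # Unknown or misspelled options raise right here, at the boundary
    opts = Keyword.validate!(opts, parts: 2, timeout: 5_000)
    # The rest of the process life cycle relies only on structured data
    {:ok, Map.new(opts)}
  end
end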
Avoid application configuration
You should avoid using the application environment (see Application.get_env/2) as the configuration mechanism for libraries. The application environment is global, which means it becomes impossible for two dependencies to use your library in two different ways.
Let's see a simple example. Imagine that you implement a library that breaks a string in two parts based on the first occurrence of the dash (-) character:
defmodule DashSplitter do
  def split(string) when is_binary(string) do
    String.split(string, "-", parts: 2)
  end
end
Now imagine someone wants to split the string in three parts. You decide to make the number of parts configurable via the application environment:
def split(string) when is_binary(string) do
  parts = Application.get_env(:dash_splitter, :parts, 2)
  String.split(string, "-", parts: parts)
end
Now users can configure your library in their config/config.exs file as follows:
config :dash_splitter, :parts, 3
Once your library is configured this way, it changes the behaviour for all users of your library. Because the configuration is global, a dependency that expected the string to be split in 2 parts will now see it split in 3 parts.
The solution is to provide configuration as close as possible to where it is used and not via the application environment. In case of a function, you could expect keyword lists as a new argument:
def split(string, opts \\ []) when is_binary(string) and is_list(opts) do
  parts = Keyword.get(opts, :parts, 2)
  String.split(string, "-", parts: parts)
end
In case you need to configure a process, the options should be passed when starting that process.
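For example, under a supervision tree the option travels with the child specification; MyLib.Server here is a hypothetical process-based API:

children = [
  {MyLib.Server, parts: 3}
]

Supervisor.start_link(children, strategy: :one_for_one)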
The application environment should be reserved only for configurations that are truly global, for example, to control your application boot process and its supervision tree. And, generally speaking, it is best to avoid global configuration. If you must use configuration, then prefer runtime configuration instead of compile-time configuration. See the Application module for more information.
For all remaining scenarios, libraries should not force their users to use the application environment for configuration. If the user of a library believes that a certain parameter should be configured globally, then they can wrap the library functionality with their own application environment configuration.
Avoid defining modules that are not in your "namespace"
Even though Elixir does not formally have the concept of namespaces, a library should use its name as a "prefix" for all of its modules (except for special cases like mix tasks). For example, if the library's OTP application name is :my_lib, then all of its modules should start with the MyLib prefix, for example MyLib.User, MyLib.SubModule, and MyLib.Application.
This is important because the Erlang VM can only load one instance of a module at a time, so multiple libraries that define the same module are incompatible with each other. Always using the library name as a prefix avoids such module name clashes, since the prefix is unique.
Furthermore, when writing a library that is an extension of another library, you should avoid defining modules inside the parent library's namespace. For example, if you are writing a package that adds authentication to Plug called plug_auth, its modules should be namespaced under PlugAuth instead of Plug.Auth, so it avoids conflicts with Plug if it were to ever define its own authentication functionality.
Avoid use when an import is enough
A library should not provide use MyLib functionality if all use MyLib does is to import/alias the module itself. For example, this is an anti-pattern:
defmodule MyLib do
  defmacro __using__(_) do
    quote do
      import MyLib
    end
  end

  def some_fun(arg1, arg2) do
    ...
  end
end
Defining the __using__ macro above should be avoided because, when a developer writes:
defmodule MyApp do
  use MyLib
end
use MyLib is allowed to run arbitrary code inside the MyApp module. For someone reading the code, it is impossible to assess the impact that use MyLib has on a module without looking at the implementation of __using__.
The following code is clearer:
defmodule MyApp do
  import MyLib
end
The code above says we are only bringing in the functions from MyLib, so we can invoke some_fun(arg1, arg2) directly without the MyLib. prefix. Even more important, import MyLib says that we have an option to not import MyLib at all, as we can simply invoke the function as MyLib.some_fun(arg1, arg2).
If the module you want to invoke a function on has a long name, such as SomeLibrary.Namespace.MyLib, and you find it verbose, you can leverage the alias/2 special form and still refer to the module as MyLib.
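For instance, with the hypothetical long module name above:

defmodule MyApp do
  # Refer to the module by its last segment without importing anything
  alias SomeLibrary.Namespace.MyLib

  def run(arg1, arg2) do
    MyLib.some_fun(arg1, arg2)
  end
end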
While there are situations where use SomeModule is necessary, use should be skipped when all it does is to import or alias other modules. In a nutshell, alias should be preferred, as it is simpler and clearer than import, while import is simpler and clearer than use.
Avoid macros
Although the previous section could be summarized as "avoid macros", both topics are important enough to deserve their own sections.
To quote the official guide on macros:
Even though Elixir attempts its best to provide a safe environment for macros, the major responsibility of writing clean code with macros falls on developers. Macros are harder to write than ordinary Elixir functions and it's considered to be bad style to use them when they're not necessary. So write macros responsibly.
Elixir already provides mechanisms to write your everyday code in a simple and readable fashion by using its data structures and functions. Macros should only be used as a last resort. Remember that explicit is better than implicit. Clear code is better than concise code.
When you absolutely have to use a macro, make sure that a macro is not the only way the user can interface with your library and keep the amount of code generated by a macro to a minimum. For example, the Logger module provides Logger.debug/2, Logger.info/2, and friends as macros that are capable of extracting environment information, but a low-level mechanism for logging is still available with Logger.bare_log/3.
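As a rough illustration of the two levels; the message and metadata below are arbitrary:

require Logger

# Macro form: can capture environment information at the call site
Logger.debug("processing request", request_id: "abc123")

# Function form: a plain low-level call, no macro expansion involved
Logger.bare_log(:debug, "processing request", request_id: "abc123")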
Avoid using processes for code organization
A developer must never use a process for code organization purposes. A process must be used to model runtime properties such as:
- Mutable state and access to shared resources (such as ETS, files, and others)
- Concurrency and distribution
- Initialization, shutdown and restart logic (as seen in supervisors)
- System messages such as timer messages and monitoring events
In Elixir, code organization is done by modules and functions; processes are not necessary. For example, imagine you are implementing a calculator and you decide to put all the calculator operations behind a GenServer:
def add(a, b) do
  GenServer.call(__MODULE__, {:add, a, b})
end

def handle_call({:add, a, b}, _from, state) do
  {:reply, a + b, state}
end

def handle_call({:subtract, a, b}, _from, state) do
  {:reply, a - b, state}
end
This is an anti-pattern not only because it convolutes the calculator logic but also because you put the calculator logic behind a single process that will potentially become a bottleneck in your system, especially as the number of calls grows. Instead, just define the functions directly:
def add(a, b) do
  a + b
end

def subtract(a, b) do
  a - b
end
Use processes only to model runtime properties, never for code organization. And even when you think something could be done in parallel with processes, often it is best to let the callers of your library decide how to parallelize, rather than impose a certain execution flow on users of your code.
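For example, callers of plain functions can decide to fan the work out themselves; Calculator below stands for a hypothetical module exposing the add/2 shown above, and the concurrency level is arbitrary:

1..100
|> Task.async_stream(fn i -> Calculator.add(i, i) end, max_concurrency: 8)
|> Enum.map(fn {:ok, result} -> result end)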
Avoid spawning unsupervised processes
You should avoid spawning processes outside of a supervision tree, especially long-running ones. Instead, processes must be started inside supervision trees. This guarantees developers have full control over the initialization, restarts, and shutdown of the system.
If your application does not have a supervision tree, one can be added by changing def application inside mix.exs to include a :mod key with the application callback module:
def application do
  [
    extra_applications: [:logger],
    mod: {MyApp.Application, []}
  ]
end
and then defining a my_app/application.ex file with the following template:
defmodule MyApp.Application do
  # See https://hexdocs.pm/elixir/Application.html
  # for more information on OTP Applications
  @moduledoc false

  use Application

  def start(_type, _args) do
    children = [
      # Starts a worker by calling: MyApp.Worker.start_link(arg)
      # {MyApp.Worker, arg}
    ]

    # See https://hexdocs.pm/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
This is the same template generated by mix new --sup.
Each process started with the application must be listed as a child under the Supervisor above. We call those "static processes" because they are known upfront. For handling dynamic processes, such as the ones started during requests and other user inputs, look at the DynamicSupervisor module.
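A minimal sketch of the dynamic case, with illustrative names: the DynamicSupervisor is itself started as a static child, and workers are attached to it on demand:

children = [
  {DynamicSupervisor, name: MyApp.DynamicSupervisor, strategy: :one_for_one}
]

Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)

# Later, for example while handling a request:
DynamicSupervisor.start_child(MyApp.DynamicSupervisor, {MyApp.Worker, :some_arg})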
One of the few times when it is acceptable to start a process outside of a supervision tree is with Task.async/1 and Task.await/2. Opposite to Task.start_link/1, the async/await mechanism gives you full control over the spawned process life cycle, which is also why you must always call Task.await/2 after starting a task with Task.async/1. Even so, if your application is spawning multiple async processes, you should consider using Task.Supervisor for better visibility when instrumenting and monitoring the system.
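A sketch of the supervised variant, with an illustrative supervisor name: start a Task.Supervisor in your tree, then route async work through it:

children = [
  {Task.Supervisor, name: MyApp.TaskSupervisor}
]

Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)

# Later, instead of Task.async/1 (do_some_work/0 is a placeholder):
task = Task.Supervisor.async(MyApp.TaskSupervisor, fn -> do_some_work() end)
Task.await(task)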