Task (Elixir v1.12.2)
Conveniences for spawning and awaiting tasks.
Tasks are processes meant to execute one particular action throughout their lifetime, often with little or no communication with other processes. The most common use case for tasks is to convert sequential code into concurrent code by computing a value asynchronously:
task = Task.async(fn -> do_some_work() end)
res = do_some_other_work()
res + Task.await(task)
Tasks spawned with async can be awaited on by their caller process (and only their caller) as shown in the example above. They are implemented by spawning a process that sends a message to the caller once the given computation is performed.
Besides async/1 and await/2, tasks can also be started as part of a supervision tree and dynamically spawned on remote nodes. We will explore these scenarios next.
async and await
One of the common uses of tasks is to convert sequential code into concurrent code with Task.async/1 while keeping its semantics. When invoked, a new process will be created, linked and monitored by the caller. Once the task action finishes, a message will be sent to the caller with the result.
Task.await/2 is used to read the message sent by the task.
There are two important things to consider when using async:
- If you are using async tasks, you must await a reply as they are always sent. If you are not expecting a reply, consider using Task.start_link/1, detailed below.
- async tasks link the caller and the spawned process. This means that, if the caller crashes, the task will crash too and vice-versa. This is on purpose: if the process meant to receive the result no longer exists, there is no purpose in completing the computation.
  If this is not desired, you will want to use supervised tasks, described next.
Dynamically supervised tasks
The Task.Supervisor module allows developers to dynamically create multiple supervised tasks.
A short example is:
{:ok, pid} = Task.Supervisor.start_link()
task =
Task.Supervisor.async(pid, fn ->
# Do something
end)
Task.await(task)
However, in the majority of cases, you want to add the task supervisor to your supervision tree:
Supervisor.start_link([
{Task.Supervisor, name: MyApp.TaskSupervisor}
], strategy: :one_for_one)
And now you can use async/await once again, passing the name of the supervisor instead of the pid:
Task.Supervisor.async(MyApp.TaskSupervisor, fn ->
# Do something
end)
|> Task.await()
We encourage developers to rely on supervised tasks as much as possible. Supervised tasks enable a wide variety of patterns that give you explicit control over how to handle results, errors, and timeouts. Here is a summary:
- Use Task.Supervisor.start_child/2 to start a fire-and-forget task when you don't care about its result or whether it completes successfully.
- Use Task.Supervisor.async/2 + Task.await/2 to execute tasks concurrently and retrieve their results. If a task fails, the caller will also fail.
- Use Task.Supervisor.async_nolink/2 + Task.yield/2 + Task.shutdown/2 to execute tasks concurrently and retrieve their results or the reason they failed within a given time frame. If a task fails, the caller won't fail: you will receive the error reason either on yield or shutdown.
See the Task.Supervisor module for details on the supported operations.
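As a minimal sketch of the async_nolink + yield + shutdown pattern, the snippet below starts an unnamed task supervisor and runs one task that succeeds and one that raises. The variable names (sup, ok_task, failing_task) are illustrative only. The failing task is expected to log an error, but the caller keeps running and receives the failure as a value:

```elixir
# Start an anonymous task supervisor just for this sketch.
{:ok, sup} = Task.Supervisor.start_link()

# One task that succeeds and one that raises. Neither is linked to the caller.
ok_task = Task.Supervisor.async_nolink(sup, fn -> 21 * 2 end)
failing_task = Task.Supervisor.async_nolink(sup, fn -> raise "boom" end)

# Wait up to one second for each reply; shut the task down if nothing arrived.
ok_result = Task.yield(ok_task, 1_000) || Task.shutdown(ok_task)
failed_result = Task.yield(failing_task, 1_000) || Task.shutdown(failing_task)

IO.inspect(ok_result)
# The failure arrives as {:exit, {exception, stacktrace}} instead of crashing us.
IO.inspect(failed_result)
```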
Distributed tasks
Since Elixir provides a Task.Supervisor, it is easy to use one to dynamically start tasks across nodes:
# On the remote node
Task.Supervisor.start_link(name: MyApp.DistSupervisor)
# On the client
supervisor = {MyApp.DistSupervisor, :remote@local}
Task.Supervisor.async(supervisor, MyMod, :my_fun, [arg1, arg2, arg3])
Note that, when working with distributed tasks, one should use the Task.Supervisor.async/4 function that expects explicit module, function, and arguments, instead of Task.Supervisor.async/2 that works with anonymous functions. That's because anonymous functions expect the same module version to exist on all involved nodes. Check the Agent module documentation for more information on distributed processes, as the limitations described there apply to the whole ecosystem.
Statically supervised tasks
The Task module implements the child_spec/1 function, which allows it to be started directly under a regular Supervisor - instead of a Task.Supervisor - by passing a tuple with a function to run:
Supervisor.start_link([
{Task, fn -> :some_work end}
], strategy: :one_for_one)
This is often useful when you need to execute some steps while setting up your supervision tree. For example: to warm up caches, log the initialization status, etc.
If you don't want to put the Task code directly under the Supervisor, you can wrap the Task in its own module, similar to how you would do with a GenServer or an Agent:
defmodule MyTask do
  use Task
  def start_link(arg) do
    Task.start_link(__MODULE__, :run, [arg])
  end
  def run(arg) do
    # ...
  end
end
And then pass it to the supervisor:
Supervisor.start_link([
{MyTask, arg}
], strategy: :one_for_one)
Since these tasks are supervised and not directly linked to the caller, they cannot be awaited on. By default, the functions Task.start and Task.start_link are for fire-and-forget tasks, where you don't care about the results or whether they complete successfully.
use Task defines a child_spec/1 function, allowing the defined module to be put under a supervision tree. The generated child_spec/1 can be customized with the following options:
- :id - the child specification identifier, defaults to the current module
- :restart - when the child should be restarted, defaults to :temporary
- :shutdown - how to shut down the child, either immediately or by giving it time to shut down
Unlike GenServer, Agent and Supervisor, a Task has a default :restart of :temporary. This means the task will not be restarted even if it crashes. If you want the task to be restarted for non-successful exits, do:
use Task, restart: :transient
If you want the task to always be restarted:
use Task, restart: :permanent
See the "Child specification" section in the Supervisor module for more detailed information. The @doc annotation immediately preceding use Task will be attached to the generated child_spec/1 function.
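To see what the generated child_spec/1 looks like, here is a small sketch. The module name TransientTask and the argument :some_arg are hypothetical; the point is only that the :restart option given to use Task shows up in the returned specification:

```elixir
defmodule TransientTask do
  # Hypothetical module: restart the task only after abnormal exits.
  use Task, restart: :transient

  def start_link(arg) do
    Task.start_link(__MODULE__, :run, [arg])
  end

  def run(arg) do
    IO.inspect(arg, label: "running with")
  end
end

# Inspect the child specification generated by `use Task`.
spec = TransientTask.child_spec(:some_arg)
IO.inspect(spec)
```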
Ancestor and Caller Tracking
Whenever you start a new process, Elixir annotates the parent of that process through the $ancestors key in the process dictionary. This is often used to track the hierarchy inside a supervision tree.
For example, we recommend that developers always start tasks under a supervisor. This provides more visibility and allows you to control how those tasks are terminated when a node shuts down. That might look something like Task.Supervisor.start_child(MySupervisor, task_specification). This means that, although your code is the one that invokes the task, the actual ancestor of the task is the supervisor, as the supervisor is the one effectively starting it.
To track the relationship between your code and the task, we use the $callers key in the process dictionary. Therefore, assuming the Task.Supervisor call above, we have:
[your code] -- calls --> [supervisor] ---- spawns --> [task]
Which means we store the following relationships:
[your code]              [supervisor] <-- ancestor -- [task]
    ^                                                  |
    |--------------------- caller ---------------------|
The list of callers of the current process can be retrieved from the Process dictionary with Process.get(:"$callers"). This will return either nil or a list [pid_n, ..., pid2, pid1] with at least one entry, where pid_n is the PID that called the current process, pid2 called pid_n, and pid2 was called by pid1.
If a task crashes, the callers field is included as part of the log message metadata under the :callers key.
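A quick way to observe caller tracking is to read the :"$callers" entry from inside a task. This is only a sketch; the variable names parent and callers are illustrative:

```elixir
parent = self()

# The task reads its own "$callers" entry and returns it as its result.
callers =
  Task.async(fn -> Process.get(:"$callers") end)
  |> Task.await()

# The head of the list is the process that started the task.
IO.inspect(hd(callers) == parent)
```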
Summary
Functions
%Task{}
The Task struct.
async(fun)
Starts a task that must be awaited on.
async(module, function_name, args)
Starts a task that must be awaited on.
async_stream(enumerable, fun, options \\ [])
Returns a stream that runs the given function fun concurrently on each element in enumerable.
async_stream(enumerable, module, function_name, args, options \\ [])
Returns a stream where the given function (module and function_name) is mapped concurrently on each element in enumerable.
await(task, timeout \\ 5000)
Awaits a task reply and returns it.
await_many(tasks, timeout \\ 5000)
Awaits replies from multiple tasks and returns them.
child_spec(arg)
Returns a specification to start a task under a supervisor.
shutdown(task, shutdown \\ 5000)
Unlinks and shuts down the task, and then checks for a reply.
start(fun)
Starts a task.
start(module, function_name, args)
Starts a task.
start_link(fun)
Starts a task as part of a supervision tree with the given fun.
start_link(module, function, args)
Starts a task as part of a supervision tree with the given module, function, and args.
yield(task, timeout \\ 5000)
Temporarily blocks the current process waiting for a task reply.
yield_many(tasks, timeout \\ 5000)
Yields to multiple tasks in the given time interval.
Types
t()
The Task type.
See %Task{} for information about each field of the structure.
Functions
%Task{}
The Task struct.
It contains these fields:
- :pid - the PID of the task process; nil if the task does not use a task process
- :ref - the task monitor reference
- :owner - the PID of the process that started the task
async(fun)
Starts a task that must be awaited on.
fun must be a zero-arity anonymous function. This function spawns a process that is linked to and monitored by the caller process. A Task struct is returned containing the relevant information. Developers must eventually call Task.await/2 or Task.yield/2 followed by Task.shutdown/2 on the returned task.
Read the Task module documentation for more information about the general usage of async tasks.
Linking
This function spawns a process that is linked to and monitored by the caller process. The linking part is important because it aborts the task if the parent process dies. It also guarantees the code before async/await has the same properties after you add the async call. For example, imagine you have this:
x = heavy_fun()
y = some_fun()
x + y
Now you want to make the heavy_fun() async:
x = Task.async(&heavy_fun/0)
y = some_fun()
Task.await(x) + y
As before, if heavy_fun/0 fails, the whole computation will fail, including the parent process. If you don't want the task to fail then you must change the heavy_fun/0 code in the same way you would achieve it if you didn't have the async call. For example, you could return {:ok, val} | :error results or, in more extreme cases, use try/rescue. In other words, an asynchronous task should be thought of as an extension of a process rather than a mechanism to isolate it from all errors.
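The idea of handling the failure inside the function itself can be sketched as follows. The anonymous function below is a hypothetical stand-in for heavy_fun/0; it turns an expected error into an :error value, so neither the task nor the caller crashes:

```elixir
# Hypothetical stand-in for heavy_fun/0: an expected failure becomes a
# return value instead of an exception that would crash the task.
heavy_fun = fn ->
  try do
    {:ok, String.to_integer("not a number")}
  rescue
    ArgumentError -> :error
  end
end

task = Task.async(heavy_fun)
result = Task.await(task)

case result do
  {:ok, value} -> IO.puts("got #{value}")
  :error -> IO.puts("heavy_fun failed, falling back")
end
```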
If you don't want to link the caller to the task, then you must use a supervised task with Task.Supervisor and call Task.Supervisor.async_nolink/2.
In any case, avoid any of the following:
- Setting :trap_exit to true - trapping exits should be used only in special circumstances as it would make your process immune to not only exits from the task but from any other processes.
  Moreover, even when trapping exits, calling await will still exit if the task has terminated without sending its result back.
- Unlinking the task process started with async/await. If you unlink the processes and the task does not belong to any supervisor, you may leave dangling tasks in case the parent dies.
async(module, function_name, args)
Starts a task that must be awaited on.
Similar to async/1 except the function to be started is specified by the given module, function_name, and args.
async_stream(enumerable, fun, options \\ [])
@spec async_stream(Enumerable.t(), (term() -> term()), keyword()) :: Enumerable.t()
Returns a stream that runs the given function fun concurrently on each element in enumerable.
Works the same as async_stream/5 but with an anonymous function instead of a module-function-arguments tuple. fun must be a one-arity anonymous function.
Each enumerable element is passed as argument to the given function fun and processed by its own task. The tasks will be linked to the current process, similarly to async/1.
Example
Count the code points in each string asynchronously, then add the counts together using reduce.
iex> strings = ["long string", "longer string", "there are many of these"]
iex> stream = Task.async_stream(strings, fn text -> text |> String.codepoints() |> Enum.count() end)
iex> Enum.reduce(stream, 0, fn {:ok, num}, acc -> num + acc end)
47
See async_stream/5 for discussion, options, and more examples.
async_stream(enumerable, module, function_name, args, options \\ [])
(since 1.4.0)
@spec async_stream(Enumerable.t(), module(), atom(), [term()], keyword()) :: Enumerable.t()
Returns a stream where the given function (module and function_name) is mapped concurrently on each element in enumerable.
Each element of enumerable will be prepended to the given args and processed by its own task. The tasks will be linked to an intermediate process that is then linked to the current process. This means a failure in a task terminates the current process and a failure in the current process terminates all tasks.
When streamed, each task will emit {:ok, value} upon successful completion or {:exit, reason} if the caller is trapping exits. The order of results depends on the value of the :ordered option.
The level of concurrency and the time tasks are allowed to run can be controlled via options (see the "Options" section below).
Consider using Task.Supervisor.async_stream/6 to start tasks under a supervisor. If you find yourself trapping exits to handle exits inside the async stream, consider using Task.Supervisor.async_stream_nolink/6 to start tasks that are not linked to the calling process.
Options
- :max_concurrency - sets the maximum number of tasks to run at the same time. Defaults to System.schedulers_online/0.
- :ordered - whether the results should be returned in the same order as the input stream. When the output is ordered, Elixir may need to buffer results to emit them in the original order. Setting this option to false disables the need to buffer at the cost of removing ordering. This is also useful when you're using the tasks only for the side effects. Note that regardless of what :ordered is set to, the tasks will process asynchronously. If you need to process elements in order, consider using Enum.map/2 or Enum.each/2 instead. Defaults to true.
- :timeout - the maximum amount of time (in milliseconds or :infinity) each task is allowed to execute for. Defaults to 5000.
- :on_timeout - what to do when a task times out. The possible values are:
  - :exit (default) - the process that spawned the tasks exits.
  - :kill_task - the task that timed out is killed. The value emitted for that task is {:exit, :timeout}.
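As an illustration of how :timeout and :on_timeout interact, the sketch below uses the anonymous-function variant with made-up sleep durations. The second element sleeps longer than the 200 ms timeout, so with on_timeout: :kill_task its slot is expected to hold {:exit, :timeout} while the other results stay in input order:

```elixir
results =
  [50, 500, 10]
  |> Task.async_stream(
    fn ms ->
      # Simulate work that takes `ms` milliseconds.
      Process.sleep(ms)
      ms
    end,
    max_concurrency: 3,
    timeout: 200,
    on_timeout: :kill_task
  )
  |> Enum.to_list()

IO.inspect(results)
```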
Example
Let's build a stream and then enumerate it:
stream = Task.async_stream(collection, Mod, :expensive_fun, [])
Enum.to_list(stream)
The concurrency can be increased or decreased using the :max_concurrency
option. For example, if the tasks are IO heavy, the value can be increased:
max_concurrency = System.schedulers_online() * 2
stream = Task.async_stream(collection, Mod, :expensive_fun, [], max_concurrency: max_concurrency)
Enum.to_list(stream)
If you do not care about the results of the computation, you can run the stream with Stream.run/1. Also set ordered: false, as you don't care about the order of the results either:
stream = Task.async_stream(collection, Mod, :expensive_fun, [], ordered: false)
Stream.run(stream)
Attention: async + take
Given items in an async stream are processed concurrently, doing async_stream followed by Enum.take/2 may cause more items than requested to be processed. Let's see an example:
1..100
|> Task.async_stream(fn i ->
Process.sleep(100)
IO.puts(to_string(i))
end)
|> Enum.take(10)
For a machine with 8 cores, the above will process 16 items instead of 10. The reason is that async_stream/5 always has 8 elements processing at once. So by the time Enum says it got all elements it needed, there are still 6 elements left to be processed.
The solution here is to use Stream.take/2 instead of Enum.take/2 to filter elements beforehand:
1..100
|> Stream.take(10)
|> Task.async_stream(fn i ->
Process.sleep(100)
IO.puts(to_string(i))
end)
|> Enum.to_list()
If for some reason you cannot take the elements beforehand, you can use :max_concurrency to limit how many extra elements may be processed, at the cost of reducing concurrency.
await(task, timeout \\ 5000)
Awaits a task reply and returns it.
In case the task process dies, the current process will exit with the same reason as the task.
A timeout, in milliseconds or :infinity, can be given with a default value of 5000. If the timeout is exceeded, then the current process will exit. If the task process is linked to the current process, which is the case when a task is started with async, then the task process will also exit. If the task process is trapping exits or not linked to the current process, then it will continue to run.
This function assumes the task's monitor is still active or the monitor's :DOWN message is in the message queue. If it has been demonitored, or the message already received, this function will wait for the duration of the timeout awaiting the message.
This function can only be called once for any given task. If you want to be able to check multiple times if a long-running task has finished its computation, use yield/2 instead.
Examples
iex> task = Task.async(fn -> 1 + 1 end)
iex> Task.await(task)
2
Compatibility with OTP behaviours
It is not recommended to await a long-running task inside an OTP behaviour such as GenServer. Instead, you should match on the message coming from a task inside your GenServer.handle_info/2 callback.
A GenServer will receive two messages on handle_info/2:
- {ref, result} - the reply message where ref is the monitor reference returned by task.ref and result is the task result
- {:DOWN, ref, :process, pid, reason} - since all tasks are also monitored, you will also receive the :DOWN message delivered by Process.monitor/1. If you receive the :DOWN message without a reply, it means the task crashed
Another consideration to have in mind is that tasks started by Task.async/1 are always linked to their callers and you may not want the GenServer to crash if the task crashes. Therefore, it is preferable to instead use Task.Supervisor.async_nolink/3 inside OTP behaviours. For completeness, here is an example of a GenServer that starts tasks and handles their results:
defmodule GenServerTaskExample do
  use GenServer
  def start_link(opts) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end
  def init(_opts) do
    # We will keep all running tasks in a map
    {:ok, %{tasks: %{}}}
  end
  # Imagine we invoke a task from the GenServer to access a URL...
  def handle_call(:some_message, _from, state) do
    url = ...
    task = Task.Supervisor.async_nolink(MyApp.TaskSupervisor, fn -> fetch_url(url) end)
    # After we start the task, we store its reference and the url it is fetching
    state = put_in(state.tasks[task.ref], url)
    {:reply, :ok, state}
  end
  # If the task succeeds...
  def handle_info({ref, result}, state) do
    # The task succeeded, so we can cancel the monitoring and discard the DOWN message
    Process.demonitor(ref, [:flush])
    {url, state} = pop_in(state.tasks[ref])
    IO.puts "Got #{inspect(result)} for URL #{inspect url}"
    {:noreply, state}
  end
  # If the task fails...
  def handle_info({:DOWN, ref, _, _, reason}, state) do
    {url, state} = pop_in(state.tasks[ref])
    IO.puts "URL #{inspect url} failed with reason #{inspect(reason)}"
    {:noreply, state}
  end
end
With the server defined, you will want to start the task supervisor above and the GenServer in your supervision tree:
children = [
{Task.Supervisor, name: MyApp.TaskSupervisor},
{GenServerTaskExample, name: MyApp.GenServerTaskExample}
]
Supervisor.start_link(children, strategy: :one_for_one)
await_many(tasks, timeout \\ 5000)
Awaits replies from multiple tasks and returns them.
This function receives a list of tasks and waits for their replies in the given time interval. It returns a list of the results, in the same order as the tasks supplied in the tasks input argument.
If any of the task processes dies, the current process will exit with the same reason as that task.
A timeout, in milliseconds or :infinity, can be given with a default value of 5000. If the timeout is exceeded, then the current process will exit. Any task processes that are linked to the current process (which is the case when a task is started with async) will also exit. Any task processes that are trapping exits or not linked to the current process will continue to run.
This function assumes the tasks' monitors are still active or the monitors' :DOWN message is in the message queue. If any tasks have been demonitored, or the message already received, this function will wait for the duration of the timeout.
This function can only be called once for any given task. If you want to be able to check multiple times if a long-running task has finished its computation, use yield_many/2 instead.
Compatibility with OTP behaviours
It is not recommended to await long-running tasks inside an OTP behaviour such as GenServer. See await/2 for more information.
Examples
iex> tasks = [
...> Task.async(fn -> 1 + 1 end),
...> Task.async(fn -> 2 + 3 end)
...> ]
iex> Task.await_many(tasks)
[2, 5]
child_spec(arg)
@spec child_spec(term()) :: Supervisor.child_spec()
Returns a specification to start a task under a supervisor.
arg is passed as the argument to Task.start_link/1 in the :start field of the spec.
For more information, see the Supervisor module, the Supervisor.child_spec/2 function and the Supervisor.child_spec/0 type.
shutdown(task, shutdown \\ 5000)
Unlinks and shuts down the task, and then checks for a reply.
Returns {:ok, reply} if the reply is received while shutting down the task, {:exit, reason} if the task died, otherwise nil.
The second argument is either a timeout or :brutal_kill. In case of a timeout, a :shutdown exit signal is sent to the task process and if it does not exit within the timeout, it is killed. With :brutal_kill the task is killed straight away. In case the task terminates abnormally (possibly killed by another process), this function will exit with the same reason.
It is not required to call this function when terminating the caller, unless exiting with reason :normal or if the task is trapping exits. If the caller is exiting with a reason other than :normal and the task is not trapping exits, the caller's exit signal will stop the task. The caller can exit with reason :shutdown to shut down all of its linked processes, including tasks, that are not trapping exits without generating any log messages.
If a task's monitor has already been demonitored or received, and there is no response waiting in the message queue, this function will return {:exit, :noproc}, as the result or exit reason cannot be determined.
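The two most common return values can be sketched as follows. The variable names (stuck, done) and the 100 ms pause are illustrative; the pause only gives the second task time to send its reply before it is shut down:

```elixir
# A task that never finishes on its own; killing it yields no reply.
stuck = Task.async(fn -> Process.sleep(:infinity) end)
stuck_result = Task.shutdown(stuck, :brutal_kill)

# A task that has already replied by the time we shut it down.
done = Task.async(fn -> :done end)
Process.sleep(100)
done_result = Task.shutdown(done)

IO.inspect(stuck_result)
IO.inspect(done_result)
```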
start(fun)
Starts a task.
fun must be a zero-arity anonymous function.
This should only be used when the task is used for side-effects (like I/O) and you have no interest in its results or whether it completes successfully.
If the current node is shut down, the node will terminate even if the task was not completed. For this reason, we recommend using Task.Supervisor.start_child/2 instead, which allows you to control the shutdown time via the :shutdown option.
start(module, function_name, args)
Starts a task.
This should only be used when the task is used for side-effects (like I/O) and you have no interest in its results or whether it completes successfully.
If the current node is shut down, the node will terminate even if the task was not completed. For this reason, we recommend using Task.Supervisor.start_child/2 instead, which allows you to control the shutdown time via the :shutdown option.
start_link(fun)
Starts a task as part of a supervision tree with the given fun.
fun must be a zero-arity anonymous function.
This is used to start a statically supervised task under a supervision tree.
start_link(module, function, args)
Starts a task as part of a supervision tree with the given module, function, and args.
This is used to start a statically supervised task under a supervision tree.
yield(task, timeout \\ 5000)
Temporarily blocks the current process waiting for a task reply.
Returns {:ok, reply} if the reply is received, nil if no reply has arrived, or {:exit, reason} if the task has already exited. Keep in mind that normally a task failure also causes the process owning the task to exit. Therefore this function can return {:exit, reason} only if
- the task process exited with the reason :normal
- it isn't linked to the caller
- the caller is trapping exits
A timeout, in milliseconds or :infinity, can be given with a default value of 5000. If the time runs out before a message from the task is received, this function will return nil and the monitor will remain active. Therefore yield/2 can be called multiple times on the same task.
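Calling yield/2 repeatedly on the same task can be sketched like this. The sleep duration and timeouts are arbitrary values chosen so that the first check returns before the task finishes and the second one waits long enough to receive the reply:

```elixir
task =
  Task.async(fn ->
    Process.sleep(300)
    :finished
  end)

# The first check times out before the task is done, so there is no reply yet.
first = Task.yield(task, 50)

# The monitor is still active, so we can ask again with a longer timeout.
second = Task.yield(task, 2_000)

IO.inspect(first)
IO.inspect(second)
```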
This function assumes the task's monitor is still active or the monitor's :DOWN message is in the message queue. If it has been demonitored or the message already received, this function will wait for the duration of the timeout awaiting the message.
If you intend to shut the task down if it has not responded within timeout milliseconds, you should chain this together with shutdown/1, like so:
# Logger.warn/1 is a macro, so Logger must be required first
require Logger
case Task.yield(task, timeout) || Task.shutdown(task) do
  {:ok, result} ->
    result
  nil ->
    Logger.warn("Failed to get a result in #{timeout}ms")
    nil
end
That ensures that if the task completes after the timeout but before shutdown/1 has been called, you will still get the result, since shutdown/1 is designed to handle this case and return the result.
yield_many(tasks, timeout \\ 5000)
Yields to multiple tasks in the given time interval.
This function receives a list of tasks and waits for their replies in the given time interval. It returns a list of two-element tuples, with the task as the first element and the yielded result as the second. The tasks in the returned list will be in the same order as the tasks supplied in the tasks input argument.
Similarly to yield/2, each task's result will be
- {:ok, term} if the task has successfully reported its result back in the given time interval
- {:exit, reason} if the task has died
- nil if the task keeps running past the timeout
A timeout, in milliseconds or :infinity, can be given with a default value of 5000.
Check yield/2 for more information.
Example
Task.yield_many/2 allows developers to spawn multiple tasks and retrieve the results received in a given timeframe. If we combine it with Task.shutdown/2, it allows us to gather those results and cancel the tasks that have not replied in time.
Let's see an example.
tasks =
  for i <- 1..10 do
    Task.async(fn ->
      Process.sleep(i * 1000)
      i
    end)
  end
tasks_with_results = Task.yield_many(tasks, 5000)
results =
  Enum.map(tasks_with_results, fn {task, res} ->
    # Shut down the tasks that did not reply nor exit
    res || Task.shutdown(task, :brutal_kill)
  end)
# Here we are matching only on {:ok, value} and
# ignoring {:exit, _} (crashed tasks) and `nil` (no replies)
for {:ok, value} <- results do
  IO.inspect(value)
end
In the example above, we create tasks that sleep from 1 up to 10 seconds and return the number of seconds they slept for. If you execute the code all at once, you should see 1 up to 5 printed, as those were the tasks that have replied in the given time. All other tasks will have been shut down using the Task.shutdown/2 call.