Exile.Process (exile v0.11.0)
A GenServer which wraps a spawned external command.
Prefer Exile.stream!/1 over using this module directly. Use this only if you are
familiar with the life-cycle and need more control over the IO streams
and the OS process.
Comparison with Port
- It is demand driven. The user has to explicitly read the command output, and the progress of the external command is controlled using OS pipes. Exile never loads more output than we can consume, so we should never experience memory issues.
- It can close stdin while consuming output.
- It tries to handle zombie processes by attempting to clean up the external process. Note that there is no middleware involved with Exile, so it is still possible to end up with a zombie process.
- It can selectively consume stdout and stderr.
Internally, Exile uses non-blocking asynchronous system calls to interact with the external process. It does not use the port's message-based communication; instead it uses raw stdio and a NIF. Most of the system calls are non-blocking, so they should not block the BEAM schedulers. Exile makes use of dirty schedulers for IO.
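For contrast, here is a minimal sketch of the message-based model, using only the standard Port API and no Exile. Once the port is open, output is pushed into the owner's mailbox whether or not it was asked for. The `collect` helper and the 5 second timeout are illustrative choices, and the sketch assumes an `echo` executable can be found on the PATH.

```elixir
# Message-based communication with a plain Port (not Exile).
# Assumes an `echo` executable can be found on the PATH.
echo = System.find_executable("echo")

port =
  Port.open({:spawn_executable, echo}, [:binary, :exit_status, args: ["hello"]])

# Illustrative helper: drain the mailbox until the port reports the exit status.
collect = fn collect, acc ->
  receive do
    {^port, {:data, chunk}} -> collect.(collect, [acc, chunk])
    {^port, {:exit_status, status}} -> {IO.iodata_to_binary(acc), status}
  after
    5_000 -> raise "timed out waiting for the port"
  end
end

{output, status} = collect.(collect, [])
IO.inspect({output, status})
```

With a port the data arrives as messages as soon as the command produces it. With Exile.Process nothing is read until the pipe owner asks for it.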
Introduction
Exile.Process is a process-based wrapper around the external
process. It is similar to a port as an entity, but the interface is
different. All communication with the external process must happen
via the Exile.Process interface.
The life-cycle of an Exile process is tied to the external process and its owners. All
system resources, such as open file descriptors and the external process,
are cleaned up when the Exile.Process dies.
Owner
Each Exile.Process has an owner, which is the process that
created it (via Exile.Process.start_link/2). The process owner cannot
be changed.
The owner process is linked to the Exile.Process. So when the
Exile process dies abnormally, the owner is killed too, and
vice versa. The owner process should avoid trapping the exit signal. If
you want to avoid the caller getting killed, create a separate process
as the owner to run the command and monitor that process.
Only the owner can get the exit status of the command, using
Exile.Process.await_exit/2. All Exile processes MUST be
awaited. The exit status or reason is ALWAYS sent to the owner. This
is similar to Task. If the
owner exits without calling await_exit, the Exile process will be killed,
but if the owner continues without calling await_exit, the Exile
process will linger until the process exits.
iex> alias Exile.Process
iex> {:ok, p} = Process.start_link(~w(echo hello))
iex> Process.read(p, 100)
{:ok, "hello\n"}
iex> Process.read(p, 100) # read till we get :eof
:eof
iex> Process.await_exit(p)
{:ok, 0}
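The advice to run the command from a separate owner process and monitor it could look roughly like the sketch below. It is an illustration under assumptions, not the library's prescribed pattern: the `:command_result` message shape is made up for this example, and the package is fetched with `Mix.install`, which needs network access and a C compiler.

```elixir
# Assumption: exile is fetched with Mix.install for this standalone script.
Mix.install([{:exile, "~> 0.11"}])

parent = self()

# The spawned process becomes the owner, because it calls start_link/2.
{owner, ref} =
  spawn_monitor(fn ->
    {:ok, p} = Exile.Process.start_link(~w(echo hello))
    {:ok, output} = Exile.Process.read(p, 100)
    {:ok, status} = Exile.Process.await_exit(p)
    # Hypothetical message shape, used only in this sketch.
    send(parent, {:command_result, self(), IO.iodata_to_binary(output), status})
  end)

result =
  receive do
    {:command_result, ^owner, output, status} ->
      # Drop the monitor and discard the pending :DOWN message, if any.
      Process.demonitor(ref, [:flush])
      {:ok, output, status}

    {:DOWN, ^ref, :process, ^owner, reason} ->
      # The owner (and therefore the command) went down; the caller survives.
      {:error, reason}
  end

IO.inspect(result)
```

Because the caller only monitors the owner and is not linked to it, an abnormal exit shows up as a `:DOWN` message instead of an exit signal.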
Pipe & Pipe Owner
The standard IO pipes/channels/streams of the external process, such as STDIN, STDOUT and STDERR, are called Pipes. The user can either write data to or read data from a pipe.
Each pipe has an owner process, and only that process can write to or
read from that pipe. By default, the process that created the
Exile process is the owner of all the pipes. The pipe owner can be
changed using Exile.Process.change_pipe_owner/3.
The pipe owner is monitored, and the pipes are closed automatically when
the pipe owner exits. The pipe owner can close a pipe early using
Exile.Process.close_stdin/1 etc.
Exile.Process.await_exit/2 closes all of the pipes owned by the caller by
default.
iex> {:ok, p} = Process.start_link(~w(cat))
iex> writer = Task.async(fn ->
...> :ok = Process.change_pipe_owner(p, :stdin, self())
...> Process.write(p, "Hello World")
...> end)
iex> Task.await(writer)
:ok
iex> Process.read(p, 100)
{:ok, "Hello World"}
iex> Process.await_exit(p)
{:ok, 0}
Pipe Operations
Only the pipe owner can read data from or write data to the owned pipe.
All pipe operations (read/write) block the caller as a mechanism
to apply back-pressure, and this also makes the API simpler.
This is the same as how command-line programs work in the shell
with pipes in between, for example: cat large-file | grep "foo".
Internally, Exile uses asynchronous IO APIs to avoid blocking the VM
(by default, NIF calls block the VM scheduler),
so you can open several pipes and do concurrent IO operations without
blocking the VM.
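The blocking behavior can be illustrated with a payload larger than a typical OS pipe buffer. The sketch below is a hedged example: the 1 MB payload, the 4096 byte read size and the `drain` helper are arbitrary choices for this illustration, and the package is fetched with `Mix.install`. The writer blocks inside `write/2` until the reader has consumed enough for `cat` to accept more input.

```elixir
# Assumption: exile is fetched with Mix.install for this standalone script.
Mix.install([{:exile, "~> 0.11"}])

{:ok, p} = Exile.Process.start_link(~w(cat))

# Illustrative payload, chosen to be larger than a typical OS pipe buffer.
payload = :binary.copy("x", 1_000_000)

writer =
  Task.async(fn ->
    :ok = Exile.Process.change_pipe_owner(p, :stdin, self())
    # Blocks whenever the pipe is full, until `cat` has consumed more input.
    :ok = Exile.Process.write(p, payload)
    # Closing stdin lets `cat` see the end of its input and exit.
    Exile.Process.close_stdin(p)
  end)

# Illustrative helper: keep reading from stdout until :eof, counting the bytes.
drain = fn drain, total ->
  case Exile.Process.read(p, 4096) do
    {:ok, data} -> drain.(drain, total + IO.iodata_length(data))
    :eof -> total
  end
end

total = drain.(drain, 0)
:ok = Task.await(writer)
{:ok, status} = Exile.Process.await_exit(p)
IO.inspect({total, status})
```

Neither side holds more than a pipe's worth of unread data at any time, which is the back-pressure described above.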
stderr
By default, stderr is connected to the console, and data written to
stderr will appear on the console.
You can change the behavior by setting the :stderr option:
- :console - stderr output is redirected to the console (default)
- :redirect_to_stdout - stderr output is redirected to stdout
- :consume - stderr output is read separately, allowing you to consume it separately from stdout. See below for more details
- :disable - stderr output is redirected to /dev/null, suppressing all output. See below for more details.
Using redirect_to_stdout
stderr data will be redirected to stdout. When you read stdout
you will see both stdout and stderr combined, and you won't be
able to differentiate between the two.
This is similar to the :stderr_to_stdout option present in
Ports.
Unexpected Behaviors
On many systems, stdout and stderr are separated. And between the source program and Exile, via the kernel, there are several places that may buffer data, even temporarily, before Exile is ready to read them. There is no enforced ordering of the readiness of these independent buffers for Exile to make use of.
This can result in unexpected behavior, including:
- mangled data, for example, UTF-8 characters may be incomplete until an additional buffered segment is released on the same source
- raw data, where binary data sent on one source, is incompatible with data sent on the other source.
- interleaved data, where what appears to be synchronous, is not
In short, the two streams might be combined at an arbitrary byte position, leading to the issues mentioned above.
Most well-behaved command-line programs are unlikely to exhibit this, but you need to be aware of the risk.
A good example of this unexpected behavior is streaming JSON from an external tool to Exile, where normal JSON output is expected on stdout, and errors or warnings via stderr. In the case of an unexpected error, the stdout stream could be incomplete, or the stderr message might arrive before the closing data on the stdout stream.
Using consume
stderr data can be consumed separately using
Exile.Process.read_stderr/2. The special function
Exile.Process.read_any/2 can be used to read from either stdout or
stderr, whichever has data available. See the examples for more
details.
Unexpected Behaviors
When set, the stderr output MUST be consumed to avoid blocking the external program when the stderr buffer is full.
Reading from stderr using read_stderr
# write "Hello" to stdout and "World" to stderr
iex> script = Enum.join(["echo Hello", "echo World >&2"], "\n")
iex> {:ok, p} = Process.start_link(["sh", "-c", script], stderr: :consume)
iex> Process.read(p, 100)
{:ok, "Hello\n"}
iex> Process.read_stderr(p, 100)
{:ok, "World\n"}
iex> Process.await_exit(p)
{:ok, 0}
Reading using read_any
# write "Hello" to stdout and "World" to stderr
iex> script = Enum.join(["echo Hello", "echo World >&2"], "\n")
iex> {:ok, p} = Process.start_link(["sh", "-c", script], stderr: :consume)
iex> Process.read_any(p)
{:ok, {:stdout, "Hello\n"}}
iex> Process.read_any(p)
{:ok, {:stderr, "World\n"}}
iex> Process.await_exit(p)
{:ok, 0}
Process Termination
When the owner dies (normally or abnormally), the Exile process is always terminated, irrespective of pipe status or process status. The external process gets a chance to terminate gracefully; if that fails, it will be killed.
If the owner calls await_exit, then the pipes owned by the owner are closed
and we wait for the external process to terminate. If the process
has already terminated, the call returns immediately with the exit
status. Otherwise, Exile attempts to stop the command gracefully, following
the exit sequence based on the timeout value (5s by default).
If the owner calls await_exit with timeout as :infinity, then
Exile does not attempt to forcefully stop the external command and
waits for the command to exit on its own. The await_exit call can block
indefinitely, waiting for the external process to terminate.
If the external process exits on its own, the exit status is collected and the Exile process will wait for the owner to close the pipes. Most commands exit when their pipes are closed, so just ensuring that the pipes are closed when the work is done should be enough.
Example of a process getting terminated by the SIGTERM signal
# sleep command does not watch for stdin or stdout, so closing the
# pipe does not terminate the sleep command.
iex> {:ok, p} = Process.start_link(~w(sleep 100000000)) # sleep indefinitely
iex> Process.await_exit(p, 100) # ensure `await_exit` finish within `100ms`. By default it waits for 5s
{:ok, 143} # 143 is the exit status when command exit due to SIGTERM
Examples
Run a command without any input or output
iex> {:ok, p} = Process.start_link(["sh", "-c", "exit 1"])
iex> Process.await_exit(p)
{:ok, 1}
Single process reading and writing to the command
# bc is a calculator, which reads from stdin and writes output to stdout
iex> {:ok, p} = Process.start_link(~w(bc))
iex> Process.write(p, "1 + 1\n") # there must be new-line to indicate the end of the input line
:ok
iex> Process.read(p)
{:ok, "2\n"}
iex> Process.write(p, "2 * 10 + 1\n")
:ok
iex> Process.read(p)
{:ok, "21\n"}
# We must close stdin to signal the `bc` command that we are done.
# since `await_exit` implicitly closes the pipes, in this case we don't have to
iex> Process.await_exit(p)
{:ok, 0}
Running a command which flushes its output when stdin is closed. This is not supported by Erlang/Elixir ports.
# `base64` command reads all input and writes encoded output when stdin is closed.
iex> {:ok, p} = Process.start_link(~w(base64))
iex> Process.write(p, "abcdef")
:ok
iex> Process.close_stdin(p) # we can selectively close stdin and read all output
:ok
iex> Process.read(p)
{:ok, "YWJjZGVm\n"}
iex> Process.read(p) # typically it is better to read till we receive :eof when we are not sure how big the output data size is
:eof
iex> Process.await_exit(p)
{:ok, 0}
Read and write to pipes in separate processes
iex> {:ok, p} = Process.start_link(~w(cat))
iex> writer = Task.async(fn ->
...> :ok = Process.change_pipe_owner(p, :stdin, self())
...> Process.write(p, "Hello World")
...> # no need to close the pipe explicitly here. Pipe will be closed automatically when process exit
...> end)
iex> reader = Task.async(fn ->
...> :ok = Process.change_pipe_owner(p, :stdout, self())
...> Process.read(p)
...> end)
iex> :timer.sleep(500) # wait for the reader and writer to change pipe owner, otherwise `await_exit` will close the pipes before we change pipe owner
iex> Process.await_exit(p, :infinity) # let the reader and writer take indefinite time to finish
{:ok, 0}
iex> Task.await(writer)
:ok
iex> Task.await(reader)
{:ok, "Hello World"}
Summary
Functions
Wait for the program to terminate and get exit status.
Changes the Pipe owner of the pipe to specified pid.
Returns a specification to start this module under a supervisor.
Closes external program's standard error pipe (stderr)
Closes external program's standard input pipe (stdin).
Closes external program's standard output pipe (stdout)
Sends a system signal to the external program
Returns OS pid of the command
Returns bytes from executed command's stdout with maximum size max_size.
Returns bytes from either stdout or stderr with maximum size max_size, whichever is available at that time.
Returns bytes from executed command's stderr with maximum size max_size.
Pipe must be enabled with stderr: :consume to read the data.
Starts Exile.Process server.
Writes iodata data to external program's standard input pipe.
Types
@type exit_status() :: non_neg_integer()
@type pipe_name() :: :stdin | :stdout | :stderr
@type signal() :: :sigkill | :sigterm
Functions
@spec await_exit(t(), timeout :: timeout()) :: {:ok, exit_status()}
Wait for the program to terminate and get exit status.
ONLY the process owner can call this function. And all Exile processes MUST be awaited (similar to Task).
Exile first politely asks the program to terminate by closing the pipes owned by the process owner (by default, the process owner is the owner of the pipes). Most programs terminate when the standard pipes are closed.
If you have changed the pipe owner to another process, you have to close the pipe yourself or wait for the program to exit.
If the program fails to terminate within the timeout (default 5s),
then the program will be killed using the exit sequence, by sending the
SIGTERM and SIGKILL signals in sequence.
When timeout is set to :infinity, await_exit waits indefinitely for the
program to terminate.
For more details check the module documentation.
Changes the Pipe owner of the pipe to specified pid.
Note that currently any process can change the pipe owner.
For more details about Pipe Owner, please check module docs.
Returns a specification to start this module under a supervisor.
See Supervisor.
Closes external program's standard error pipe (stderr)
Only owner of the pipe can close the pipe. This call will return immediately.
Closes external program's standard input pipe (stdin).
Only owner of the pipe can close the pipe. This call will return immediately.
Closes external program's standard output pipe (stdout)
Only owner of the pipe can close the pipe. This call will return immediately.
@spec kill(t(), :sigkill | :sigterm) :: :ok
Sends a system signal to the external program
Note that :sigkill kills the program unconditionally.
Avoid sending signals manually; use await_exit instead.
@spec os_pid(t()) :: pos_integer()
Returns OS pid of the command
This is meant only for debugging. Avoid interacting with the external process directly
@spec read(t(), pos_integer()) :: {:ok, iodata()} | :eof | {:error, any()}
Returns bytes from executed command's stdout with maximum size max_size.
Blocks if no data is present in the stdout pipe yet, and returns as soon as data of any size is available.
Note that max_size is the maximum size of the returned data. The
returned data can be smaller than that, depending on how the program
flushes the data etc.
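To make the role of max_size concrete, the following sketch reads the output of `echo` with a deliberately tiny limit and collects the chunks until the first non-`{:ok, _}` result (normally `:eof`). The limit of 4 bytes is an arbitrary choice for illustration. The exact chunk boundaries depend on how the data arrives, so only the upper bound on each chunk is assumed here. The package is fetched with `Mix.install`.

```elixir
# Assumption: exile is fetched with Mix.install for this standalone script.
Mix.install([{:exile, "~> 0.11"}])

{:ok, p} = Exile.Process.start_link(["echo", "hello world"])

# Deliberately tiny limit, so that several reads are needed.
max_size = 4

chunks =
  Stream.repeatedly(fn -> Exile.Process.read(p, max_size) end)
  # Stop at the first result that is not {:ok, data}, such as :eof.
  |> Enum.take_while(&match?({:ok, _}, &1))
  |> Enum.map(fn {:ok, data} -> IO.iodata_to_binary(data) end)

{:ok, status} = Exile.Process.await_exit(p)
IO.inspect(chunks)
```

Joining the chunks gives back the complete output, while each individual chunk stays within max_size bytes.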
@spec read_any(t(), pos_integer()) :: {:ok, {:stdout, iodata()}} | {:ok, {:stderr, iodata()}} | :eof | {:error, any()}
Returns bytes from either stdout or stderr with maximum size max_size, whichever is available at that time.
Blocks if no bytes are written to stdout or stderr yet, and returns as soon as data is available.
Note that max_size is the maximum size of the returned data. The
returned data can be smaller than that, depending on how the program
flushes the data etc.
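A loop over read_any that keeps the two streams apart might look like the sketch below. It assumes that read_any returns `:eof` once there is nothing more to read from either stream, which this page does not state explicitly. The `drain` helper and the 1024 byte limit are illustrative, and the package is fetched with `Mix.install`.

```elixir
# Assumption: exile is fetched with Mix.install for this standalone script.
Mix.install([{:exile, "~> 0.11"}])

# Write "Hello" to stdout and "World" to stderr.
script = "echo Hello; echo World >&2"
{:ok, p} = Exile.Process.start_link(["sh", "-c", script], stderr: :consume)

# Illustrative helper: route each chunk to the matching accumulator until :eof.
# Assumes :eof means that both streams are exhausted.
drain = fn drain, out, err ->
  case Exile.Process.read_any(p, 1024) do
    {:ok, {:stdout, data}} -> drain.(drain, [out, data], err)
    {:ok, {:stderr, data}} -> drain.(drain, out, [err, data])
    :eof -> {IO.iodata_to_binary(out), IO.iodata_to_binary(err)}
  end
end

{stdout, stderr} = drain.(drain, [], [])
{:ok, status} = Exile.Process.await_exit(p)
IO.inspect({stdout, stderr, status})
```

Consuming both streams in one loop also satisfies the requirement that stderr must be read when `stderr: :consume` is set.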
@spec read_stderr(t(), pos_integer()) :: {:ok, iodata()} | :eof | {:error, any()}
Returns bytes from executed command's stderr with maximum size max_size.
Pipe must be enabled with stderr: :consume to read the data.
Blocks if no bytes are written to stderr yet, and returns as soon as bytes are available.
Note that max_size is the maximum size of the returned data. The
returned data can be smaller than that, depending on how the program
flushes the data etc.
@spec start_link([String.t(), ...], cd: String.t(), env: [{String.t(), String.t()}], stderr: :console | :disable | :stream ) :: {:ok, t()} | {:error, any()}
Starts Exile.Process server.
Starts the external program using cmd_with_args with the options opts.
cmd_with_args must be a list containing the command with its arguments,
for example: ["cat", "file.txt"].
Options
- cd - the directory to run the command in
- env - a list of tuples containing environment key-value pairs. These can be accessed in the external program
- stderr - different ways to handle the stderr stream.
  - :console - stderr output is redirected to the console (default)
  - :redirect_to_stdout - stderr output is redirected to stdout
  - :disable - stderr output is redirected to /dev/null, suppressing all output
  - :consume - connects stderr for consumption. When set, the stderr output must be consumed to avoid blocking the external program.
See :stderr for more details and the issues associated with them.
The caller will be the owner of the Exile process, and the default owner of all opened pipes.
Please check the module documentation for more details.
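As an illustration of the options above, the sketch below starts a shell command with a working directory, an extra environment variable and stderr disabled. `GREETING` is a made-up variable name, the temporary directory is used only because it is known to exist, and the `read_all` helper is illustrative. The package is fetched with `Mix.install`.

```elixir
# Assumption: exile is fetched with Mix.install for this standalone script.
Mix.install([{:exile, "~> 0.11"}])

# GREETING is a hypothetical variable name used only for this illustration.
{:ok, p} =
  Exile.Process.start_link(
    ["sh", "-c", "echo $GREETING; pwd"],
    cd: System.tmp_dir!(),
    env: [{"GREETING", "hi"}],
    stderr: :disable
  )

# Illustrative helper: read stdout until :eof and return it as one binary.
read_all = fn read_all, acc ->
  case Exile.Process.read(p, 1024) do
    {:ok, data} -> read_all.(read_all, [acc, data])
    :eof -> IO.iodata_to_binary(acc)
  end
end

output = read_all.(read_all, [])
{:ok, status} = Exile.Process.await_exit(p)
IO.puts(output)
```

The first line of the output is the value passed through env, and the second line is the directory the command ran in, as reported by `pwd`.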
Writes iodata data to external program's standard input pipe.
This call blocks when the pipe is full. Returns :ok when
the complete data is written.