Exile.Process (exile v0.11.0)

GenServer which wraps a spawned external command.

Prefer Exile.stream!/1 over using this module directly. Use this module only if you are familiar with the life-cycle and need more control over the IO streams and the OS process.

Comparison with Port

  • it is demand driven. The user has to read the command output explicitly, and the progress of the external command is controlled using OS pipes. Exile never loads more output than the user can consume, so it should never cause memory issues

  • it can close stdin while consuming output

  • it tries to handle zombie processes by attempting to clean up the external process. Note that there is no middleware involved with Exile, so it is still possible to end up with a zombie process.

  • it can selectively consume stdout and stderr

Internally Exile uses non-blocking asynchronous system calls to interact with the external process. It does not use the port's message-based communication; instead it uses raw stdio and a NIF. Most of the system calls are non-blocking, so they should not block the BEAM schedulers. Dirty schedulers are used for IO.

Introduction

Exile.Process is a process-based wrapper around the external process. It is similar to a port as an entity, but the interface is different. All communication with the external process must happen via the Exile.Process interface.

The Exile process life-cycle is tied to the external process and the owners. All system resources, such as open file descriptors and the external process, are cleaned up when the Exile.Process dies.

Owner

Each Exile.Process has an owner, which is the process that created it (via Exile.Process.start_link/2). The process owner cannot be changed.

The owner process is linked to the Exile.Process, so when the Exile process dies abnormally the owner is killed too, and vice versa. The owner process should avoid trapping the exit signal. If you want to avoid the caller getting killed, create a separate process as the owner to run the command and monitor that process.
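
A minimal sketch of the "separate owner process" approach described above. OwnerSketch and run_isolated/2 are hypothetical names used only for illustration and are not part of Exile. The spawned process becomes the owner, and the caller only monitors it, so a crash on the Exile side reaches the caller as a :DOWN message rather than an exit signal.

```elixir
defmodule OwnerSketch do
  # Hypothetical helper, not part of Exile.
  # Runs the command in a separate owner process and reports the result
  # to the caller through a monitor instead of a link.
  def run_isolated(cmd_with_args, timeout \\ 5_000) do
    {pid, ref} =
      spawn_monitor(fn ->
        # The spawned process is the owner of the Exile process
        {:ok, p} = Exile.Process.start_link(cmd_with_args)
        # Hand the await_exit result to the caller via the exit reason
        exit({:shutdown, Exile.Process.await_exit(p, timeout)})
      end)

    receive do
      {:DOWN, ^ref, :process, ^pid, {:shutdown, {:ok, status}}} -> {:ok, status}
      {:DOWN, ^ref, :process, ^pid, reason} -> {:error, reason}
    end
  end
end
```

With this sketch, OwnerSketch.run_isolated(["sh", "-c", "exit 3"]) is expected to return {:ok, 3}, while any failure inside the owner surfaces as {:error, reason} without killing the caller.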

Only the owner can get the exit status of the command, using Exile.Process.await_exit/2. All Exile processes MUST be awaited. The exit status or reason is ALWAYS sent to the owner, similar to Task. If the owner exits without calling await_exit, the Exile process will be killed, but if the owner continues without calling await_exit, the Exile process will linger around until the process exits.

iex> alias Exile.Process
iex> {:ok, p} = Process.start_link(~w(echo hello))
iex> Process.read(p, 100)
{:ok, "hello\n"}
iex> Process.read(p, 100) # read till we get :eof
:eof
iex> Process.await_exit(p)
{:ok, 0}

Pipe & Pipe Owner

The standard IO pipes/channels/streams of the external process, such as STDIN, STDOUT and STDERR, are called Pipes. The user can either write data to or read data from pipes.

Each pipe has an owner process, and only that process can write to or read from that pipe. By default the process that created the Exile process is the owner of all the pipes. The pipe owner can be changed using Exile.Process.change_pipe_owner/3.

The pipe owner is monitored, and the pipes are closed automatically when the pipe owner exits. The pipe owner can close a pipe early using Exile.Process.close_stdin/1 etc.

Exile.Process.await_exit/2 closes all of the caller-owned pipes by default.

iex> {:ok, p} = Process.start_link(~w(cat))
iex> writer = Task.async(fn ->
...>   :ok = Process.change_pipe_owner(p, :stdin, self())
...>   Process.write(p, "Hello World")
...> end)
iex> Task.await(writer)
:ok
iex> Process.read(p, 100)
{:ok, "Hello World"}
iex> Process.await_exit(p)
{:ok, 0}

Pipe Operations

Only the pipe owner can read data from or write data to the owned pipe. All pipe operations (read/write) block the caller as a mechanism to apply back-pressure, which also makes the API simpler. This is the same as how command-line programs work in the shell with pipes in between, for example: cat large-file | grep "foo". Internally Exile uses asynchronous IO APIs to avoid blocking the VM (by default NIF calls block the VM scheduler), so you can open several pipes and do concurrent IO operations without blocking the VM.
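
Because each read blocks until some data is available, draining stdout is a plain loop that ends at :eof. The following is a small sketch of that loop; ReadAllSketch and read_all/2 are hypothetical names for illustration, not part of Exile.

```elixir
defmodule ReadAllSketch do
  # Hypothetical helper, not part of Exile.
  # Reads stdout chunk by chunk until :eof and returns one binary.
  def read_all(process, acc \\ []) do
    case Exile.Process.read(process) do
      # Each call blocks until data is available, so this loop never spins
      {:ok, data} -> read_all(process, [acc, data])
      :eof -> {:ok, IO.iodata_to_binary(acc)}
      {:error, _reason} = error -> error
    end
  end
end

{:ok, p} = Exile.Process.start_link(~w(echo hello))
{:ok, output} = ReadAllSketch.read_all(p)
{:ok, 0} = Exile.Process.await_exit(p)
```

The sketch accumulates everything in memory, which is only reasonable for small outputs. For large outputs, process each chunk as it arrives instead.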

stderr

By default :stderr is connected to the console, so data written to stderr will appear on the console.

You can change the behavior by setting :stderr:

  1. :console - stderr output is redirected to console (Default)
  2. :redirect_to_stdout - stderr output is redirected to stdout
  3. :consume - stderr output is read separately, allowing you to consume it separately from stdout. See below for more details
  4. :disable - stderr output is redirected to /dev/null, suppressing all output. See below for more details.

Using redirect_to_stdout

stderr data will be redirected to stdout. When you read stdout you will see stdout and stderr combined, and you won't be able to tell the two apart. This is similar to the :stderr_to_stdout option present in Ports.

Unexpected Behaviors

On many systems, stdout and stderr are separate streams. Between the source program and Exile, via the kernel, there are several places that may buffer data, even temporarily, before Exile is ready to read it. There is no enforced ordering of the readiness of these independent buffers for Exile to make use of.

This can result in unexpected behavior, including:

  • mangled data, for example, UTF-8 characters may be incomplete until an additional buffered segment is released on the same source
  • raw data, where binary data sent on one source is incompatible with data sent on the other source
  • interleaved data, where what appears to be synchronous is not

In short, the two streams might be combined at an arbitrary byte position, leading to the issues mentioned above.

Most well-behaved command-line programs are unlikely to exhibit this, but you need to be aware of the risk.

A good example of this unexpected behavior is streaming JSON from an external tool to Exile, where normal JSON output is expected on stdout, and errors or warnings via stderr. In the case of an unexpected error, the stdout stream could be incomplete, or the stderr message might arrive before the closing data on the stdout stream.

Using consume

stderr data can be consumed separately using Exile.Process.read_stderr/2. The special function Exile.Process.read_any/2 can be used to read from either stdout or stderr, whichever has data available. See the examples for more details.

Unexpected Behaviors

When set, the stderr output MUST be consumed; otherwise the external program blocks once the stderr buffer is full.

Reading from stderr using read_stderr

# write "Hello" to stdout and "World" to stderr
iex> script = Enum.join(["echo Hello", "echo World >&2"], "\n")
iex> {:ok, p} = Process.start_link(["sh", "-c", script], stderr: :consume)
iex> Process.read(p, 100)
{:ok, "Hello\n"}
iex> Process.read_stderr(p, 100)
{:ok, "World\n"}
iex> Process.await_exit(p)
{:ok, 0}

Reading using read_any

# write "Hello" to stdout and "World" to stderr
iex> script = Enum.join(["echo Hello", "echo World >&2"], "\n")
iex> {:ok, p} = Process.start_link(["sh", "-c", script], stderr: :consume)
iex> Process.read_any(p)
{:ok, {:stdout, "Hello\n"}}
iex> Process.read_any(p)
{:ok, {:stderr, "World\n"}}
iex> Process.await_exit(p)
{:ok, 0}
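
Since stderr must be consumed when stderr: :consume is set, one way to make sure neither stream fills up is to loop on read_any until :eof. This is a hedged sketch rather than a recommended implementation; DrainSketch and drain/3 are hypothetical names, not part of Exile.

```elixir
defmodule DrainSketch do
  # Hypothetical helper, not part of Exile.
  # Reads whichever of stdout or stderr has data, until :eof,
  # and keeps the two streams in separate accumulators.
  def drain(process, out \\ [], err \\ []) do
    case Exile.Process.read_any(process) do
      {:ok, {:stdout, data}} -> drain(process, [out, data], err)
      {:ok, {:stderr, data}} -> drain(process, out, [err, data])
      :eof -> {:ok, IO.iodata_to_binary(out), IO.iodata_to_binary(err)}
      {:error, _reason} = error -> error
    end
  end
end

# write "Hello" to stdout and "World" to stderr
script = Enum.join(["echo Hello", "echo World >&2"], "\n")
{:ok, p} = Exile.Process.start_link(["sh", "-c", script], stderr: :consume)
result = DrainSketch.drain(p)
{:ok, 0} = Exile.Process.await_exit(p)
```

For the script above, result is expected to be {:ok, "Hello\n", "World\n"}. The relative order in which the stdout and stderr chunks arrive is not guaranteed, which is why the sketch keeps them in separate accumulators.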

Process Termination

When the owner dies (normally or abnormally), the Exile process is always terminated, irrespective of pipe status or process status. The external process gets a chance to terminate gracefully; if that fails, it will be killed.

If the owner calls await_exit, the owner-owned pipes are closed and Exile waits for the external process to terminate. If the process has already terminated, the call returns immediately with the exit status. Otherwise Exile attempts to stop the command gracefully, following the exit sequence based on the timeout value (5s by default).

If the owner calls await_exit with the timeout set to :infinity, Exile does not attempt to forcefully stop the external command and waits for the command to exit on its own. The await_exit call can block indefinitely waiting for the external process to terminate.

If the external process exits on its own, the exit status is collected and the Exile process waits for the owner to close the pipes. Most commands exit when their pipes are closed, so closing the pipes when the work is done should be enough.

Example of a process getting terminated by the SIGTERM signal

# sleep command does not watch for stdin or stdout, so closing the
# pipe does not terminate the sleep command.
iex> {:ok, p} = Process.start_link(~w(sleep 100000000)) # sleep indefinitely
iex> Process.await_exit(p, 100) # ensure `await_exit` finishes within `100ms`. By default it waits for 5s
{:ok, 143} # 143 is the exit status when the command exits due to SIGTERM

Examples

Run a command without any input or output

iex> {:ok, p} = Process.start_link(["sh", "-c", "exit 1"])
iex> Process.await_exit(p)
{:ok, 1}

Single process reading and writing to the command

# bc is a calculator, which reads from stdin and writes output to stdout
iex> {:ok, p} = Process.start_link(~w(bc))
iex> Process.write(p, "1 + 1\n") # there must be new-line to indicate the end of the input line
:ok
iex> Process.read(p)
{:ok, "2\n"}
iex> Process.write(p, "2 * 10 + 1\n")
:ok
iex> Process.read(p)
{:ok, "21\n"}
# We must close stdin to signal the `bc` command that we are done.
# Since `await_exit` implicitly closes the pipes, we don't have to do it explicitly here.
iex> Process.await_exit(p)
{:ok, 0}

Running a command which flushes its output when stdin is closed. This is not supported by Erlang/Elixir ports.

# `base64` command reads all input and writes encoded output when stdin is closed.
iex> {:ok, p} = Process.start_link(~w(base64))
iex> Process.write(p, "abcdef")
:ok
iex> Process.close_stdin(p) # we can selectively close stdin and read all output
:ok
iex> Process.read(p)
{:ok, "YWJjZGVm\n"}
iex> Process.read(p) # typically it is better to read till we receive :eof when we are not sure how big the output data size is
:eof
iex> Process.await_exit(p)
{:ok, 0}

Read and write to pipes in separate processes

iex> {:ok, p} = Process.start_link(~w(cat))
iex> writer = Task.async(fn ->
...>   :ok = Process.change_pipe_owner(p, :stdin, self())
...>   Process.write(p, "Hello World")
...>   # no need to close the pipe explicitly here. The pipe is closed automatically when the process exits
...> end)
iex> reader = Task.async(fn ->
...>   :ok = Process.change_pipe_owner(p, :stdout, self())
...>   Process.read(p)
...> end)
iex> :timer.sleep(500) # wait for the reader and writer to change pipe owner, otherwise `await_exit` will close the pipes before we change pipe owner
iex> Process.await_exit(p, :infinity) # let the reader and writer take indefinite time to finish
{:ok, 0}
iex> Task.await(writer)
:ok
iex> Task.await(reader)
{:ok, "Hello World"}

Summary

Functions

await_exit/2

Wait for the program to terminate and get exit status.

change_pipe_owner/3

Changes the Pipe owner of the pipe to specified pid.

child_spec/1

Returns a specification to start this module under a supervisor.

close_stderr/1

Closes external program's standard error pipe (stderr).

close_stdin/1

Closes external program's standard input pipe (stdin).

close_stdout/1

Closes external program's standard output pipe (stdout).

kill/2

Sends a system signal to the external program.

os_pid/1

Returns OS pid of the command.

read/2

Returns bytes from executed command's stdout with maximum size max_size.

read_any/2

Returns bytes from either stdout or stderr with maximum size max_size whichever is available at that time.

read_stderr/2

Returns bytes from executed command's stderr with maximum size max_size. Pipe must be enabled with stderr: :consume to read the data.

start_link/2

Starts Exile.Process server.

write/2

Writes iodata data to external program's standard input pipe.

Types

@type exit_status() :: non_neg_integer()
@type pipe_name() :: :stdin | :stdout | :stderr
@type signal() :: :sigkill | :sigterm
@type t() :: %Exile.Process{
  exit_ref: reference(),
  monitor_ref: reference(),
  owner: pid(),
  pid: pid() | nil
}

Functions

await_exit(process, timeout \\ 5000)

@spec await_exit(t(), timeout :: timeout()) :: {:ok, exit_status()}

Wait for the program to terminate and get exit status.

ONLY the process owner can call this function, and all Exile processes MUST be awaited (similar to Task).

Exile first politely asks the program to terminate by closing the pipes owned by the process owner (by default the process owner is the owner of all pipes). Most programs terminate when their standard pipes are closed.

If you have changed the pipe owner to another process, you have to close the pipe yourself or wait for the program to exit.

If the program fails to terminate within the timeout (default 5s), the program will be killed using the exit sequence, which sends the SIGTERM and SIGKILL signals in that order.

When timeout is set to :infinity, await_exit waits indefinitely for the program to terminate.

For more details check module documentation.

change_pipe_owner(process, pipe_name, target_owner_pid)

@spec change_pipe_owner(t(), pipe_name(), pid()) :: :ok | {:error, any()}

Changes the Pipe owner of the pipe to specified pid.

Note that currently any process can change the pipe owner.

For more details about Pipe Owner, please check module docs.

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

@spec close_stderr(t()) :: :ok | {:error, any()}

Closes external program's standard error pipe (stderr).

Only owner of the pipe can close the pipe. This call will return immediately.

@spec close_stdin(t()) ::
  :ok | {:error, :pipe_closed_or_invalid_caller} | {:error, any()}

Closes external program's standard input pipe (stdin).

Only owner of the pipe can close the pipe. This call will return immediately.

@spec close_stdout(t()) :: :ok | {:error, any()}

Closes external program's standard output pipe (stdout).

Only owner of the pipe can close the pipe. This call will return immediately.

@spec kill(t(), :sigkill | :sigterm) :: :ok

Sends a system signal to the external program.

Note that :sigkill kills the program unconditionally.

Avoid sending signals manually, use await_exit instead.

@spec os_pid(t()) :: pos_integer()

Returns the OS pid of the command.

This is meant only for debugging. Avoid interacting with the external process directly.

read(process, max_size \\ 65535)

@spec read(t(), pos_integer()) :: {:ok, iodata()} | :eof | {:error, any()}

Returns bytes from executed command's stdout with maximum size max_size.

Blocks if no data is present in the stdout pipe yet, and returns as soon as data of any size is available.

Note that max_size is the maximum size of the returned data. The returned data can be smaller than that, depending on how the program flushes its output.

read_any(process, size \\ 65535)

@spec read_any(t(), pos_integer()) ::
  {:ok, {:stdout, iodata()}}
  | {:ok, {:stderr, iodata()}}
  | :eof
  | {:error, any()}

Returns bytes from either stdout or stderr with maximum size max_size whichever is available at that time.

Blocks if no bytes have been written to stdout or stderr yet, and returns as soon as data is available.

Note that max_size is the maximum size of the returned data. The returned data can be smaller than that, depending on how the program flushes its output.

read_stderr(process, size \\ 65535)

@spec read_stderr(t(), pos_integer()) :: {:ok, iodata()} | :eof | {:error, any()}

Returns bytes from executed command's stderr with maximum size max_size. Pipe must be enabled with stderr: :consume to read the data.

Blocks if no bytes have been written to stderr yet, and returns as soon as bytes are available.

Note that max_size is the maximum size of the returned data. The returned data can be smaller than that, depending on how the program flushes its output.

start_link(cmd_with_args, opts \\ [])

@spec start_link([String.t(), ...],
  cd: String.t(),
  env: [{String.t(), String.t()}],
  stderr: :console | :redirect_to_stdout | :disable | :consume
) :: {:ok, t()} | {:error, any()}

Starts Exile.Process server.

Starts the external program using cmd_with_args with options opts.

cmd_with_args must be a list containing the command followed by its arguments. Example: ["cat", "file.txt"].

Options

  • cd - the directory to run the command in

  • env - a list of tuples containing environment key-value pairs. These can be accessed in the external program

  • stderr - different ways to handle stderr stream.

    1. :console - stderr output is redirected to console (Default)
    2. :redirect_to_stdout - stderr output is redirected to stdout
    3. :disable - stderr output is redirected to /dev/null, suppressing all output
    4. :consume - connects stderr for consumption. When set, the stderr output must be consumed to prevent the external program from blocking.

    See the stderr section of the module documentation for more details and the issues associated with each option
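
As an illustration of the options above, the following sketch passes cd, env and stderr together. The GREETING variable name is a made-up example, and System.tmp_dir!/0 is used only to get a directory that is known to exist.

```elixir
{:ok, p} =
  Exile.Process.start_link(["sh", "-c", "echo $GREETING"],
    # run the command in a directory that is known to exist
    cd: System.tmp_dir!(),
    # GREETING is a hypothetical variable, visible only to the external program
    env: [{"GREETING", "hello"}],
    # discard anything the command writes to stderr
    stderr: :disable
  )

result = Exile.Process.read(p)
{:ok, 0} = Exile.Process.await_exit(p)
```

Here result should hold the value of GREETING as printed by the shell, followed by a newline.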

The caller will be the owner of the Exile process and the default owner of all opened pipes.

Please check the module documentation for more details.

@spec write(t(), binary()) :: :ok | {:error, any()}

Writes iodata data to external program's standard input pipe.

This call blocks when the pipe is full. Returns :ok when the complete data is written.