URP (urp v0.10.0)

Pure Elixir client for document conversion via LibreOffice's UNO Remote Protocol.

Talks directly to a soffice process over TCP. No Python, no unoserver, no Gotenberg.

Setup

Add :urp to your dependencies — that's it. A default connection pool starts automatically, connecting to localhost:2002.
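
As a sketch, the dependency entry in mix.exs might look like the following. The version requirement is an assumption taken from the v0.10.0 in this page's title; adjust it to the release you actually target.

```elixir
# mix.exs
defp deps do
  [
    # Version requirement assumed from the v0.10.0 shown in this page's title
    {:urp, "~> 0.10.0"}
  ]
end
```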

# config/runtime.exs (optional — defaults shown)
config :urp, :default,
  host: "soffice",
  port: 2002,
  pool_size: 1,
  backoff_initial: 500,  # ms, initial reconnection delay
  backoff_max: 5_000     # ms, max reconnection delay

See convert/2 for usage examples and options.

Diagnostics

Query soffice state without converting anything:

{:ok, "26.2.0.3"} = URP.version()
{:ok, services} = URP.services()
{:ok, filters} = URP.filters()
{:ok, types} = URP.types()
{:ok, locale} = URP.locale()

Named pools

For multiple soffice instances, configure named pools:

config :urp, :pools,
  spreadsheets: [host: "soffice-2", port: 2002, pool_size: 3]

{:ok, pdf} = URP.convert({:binary, bytes}, filter: "calc_pdf_Export", pool: :spreadsheets)

Named pools are started on first use.
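
When several pools exist, callers usually need one place that decides which filter and pool a document should use. The sketch below is a hypothetical helper (ConvertOpts is not part of URP), and its extension-to-filter mapping is an assumption for illustration. It returns a keyword list that could be passed as the second argument to URP.convert/2, sending spreadsheets to the :spreadsheets pool configured above.

```elixir
defmodule ConvertOpts do
  # Hypothetical helper, not part of URP. The extension mapping is an
  # assumption; extend it to the formats you actually handle.
  @spec for_path(Path.t()) :: {:ok, keyword()} | {:error, String.t()}
  def for_path(path) do
    case path |> Path.extname() |> String.downcase() do
      ext when ext in [".docx", ".odt"] ->
        {:ok, [filter: "writer_pdf_Export"]}

      ext when ext in [".xlsx", ".ods"] ->
        # Spreadsheets go to the named pool from the config above
        {:ok, [filter: "calc_pdf_Export", pool: :spreadsheets]}

      ext when ext in [".pptx", ".odp"] ->
        {:ok, [filter: "impress_pdf_Export"]}

      ext ->
        {:error, "no filter known for #{inspect(ext)}"}
    end
  end
end

{:ok, opts} = ConvertOpts.for_path("/tmp/q3.xlsx")
# opts is [filter: "calc_pdf_Export", pool: :spreadsheets]
```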

Testing

URP.Test.stub(fn _input, _opts -> {:ok, "/tmp/fake.pdf"} end)
{:ok, _} = URP.convert({:binary, docx_bytes}, filter: "writer_pdf_Export")

See URP.Test for details.

Summary

Functions

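convert(input, opts \\ [])

Convert a document via LibreOffice.

convert!(input, opts \\ [])

Like convert/2 but raises on error.

filters(opts \\ [])

List all export filter names registered in soffice.

filters!(opts \\ [])

Like filters/1 but raises on error.

locale(opts \\ [])

Query the soffice locale string.

locale!(opts \\ [])

Like locale/1 but raises on error.

services(opts \\ [])

List all service names registered in the UNO service manager.

services!(opts \\ [])

Like services/1 but raises on error.

types(opts \\ [])

List all document type names registered in soffice.

types!(opts \\ [])

Like types/1 but raises on error.

version(opts \\ [])

Query the soffice version string over URP.

version!(opts \\ [])

Like version/1 but raises on error.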

Types

io_mode()

@type io_mode() :: :file | :stream | {:file | :stream, :file | :stream}
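
To make the two shapes of io_mode() concrete, here is a minimal sketch that expands a bare atom into an {input, output} pair. IoModeSketch is a hypothetical module, not part of URP. The {input, output} ordering is inferred from the :io option description, and how URP handles the value internally is not documented here.

```elixir
defmodule IoModeSketch do
  # Hypothetical helper, not part of URP.
  @type mode :: :file | :stream

  # A bare atom applies the same strategy to input and output.
  @spec normalize(mode() | {mode(), mode()}) :: {mode(), mode()}
  def normalize(mode) when mode in [:file, :stream], do: {mode, mode}

  # A tuple is assumed to be {input_strategy, output_strategy}.
  def normalize({input, output})
      when input in [:file, :stream] and output in [:file, :stream],
      do: {input, output}
end

IoModeSketch.normalize(:stream)
# => {:stream, :stream}
```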

opt()

@type opt() ::
  {:output, output()}
  | {:pool, atom()}
  | {:filter, String.t()}
  | {:filter_data, keyword()}
  | {:settings, [setting()]}
  | {:io, io_mode()}
  | {:timeout, non_neg_integer()}
  | {:recv_timeout, timeout()}
  | {:max_frame_size, pos_integer()}

output()

@type output() :: Path.t() | :binary | (binary() -> any())

setting()

@type setting() :: {String.t(), String.t(), boolean() | integer() | String.t()}
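
A setting() value can be checked before it is sent to soffice. The following is a sketch only: SettingSketch is a hypothetical module, not part of URP, and it mirrors just the type above (two strings plus a boolean, integer, or string value).

```elixir
defmodule SettingSketch do
  # Hypothetical validator mirroring the setting() type; not part of URP.
  @spec valid?(term()) :: boolean()
  def valid?({path, property, value}) when is_binary(path) and is_binary(property) do
    is_boolean(value) or is_integer(value) or is_binary(value)
  end

  def valid?(_other), do: false
end

SettingSketch.valid?(
  {"org.openoffice.Office.Common/Cache/GraphicManager", "GraphicMemoryLimit", 500_000_000}
)
# => true
```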

Functions

convert(input, opts \\ [])

@spec convert(binary() | {:binary, binary()} | Enumerable.t(), [opt()]) ::
  {:ok, Path.t()} | {:ok, binary()} | :ok | {:error, String.t()}

Convert a document via LibreOffice.

Input types

  • path (binary) — local file path, loaded via file-backed streaming
  • {:binary, bytes} — raw document bytes
  • enumerable — any Enumerable (e.g. File.stream!/2), streamed lazily

Options

  • :filter — export filter name (required). Common filters include "writer_pdf_Export", "calc_pdf_Export", "impress_pdf_Export", and "Markdown".
  • :filter_data — keyword list of filter-specific export options. Values can be booleans, integers, or strings. For PDF filters, see PDF export options (e.g. [UseLosslessCompression: true, ExportFormFields: false]).
  • :settings — list of {path, property, value} triplets to set on soffice before conversion via ConfigurationUpdateAccess. Useful for tuning cache limits, graphic memory, etc. Values can be booleans, integers, or strings. See officecfg schema for all available settings.
  • :io — I/O transfer strategy (default :file):
    • :file — transfer complete files via temp files on soffice's filesystem. Fast (~6 URP round-trips), but requires temp disk space on soffice.
    • :stream — stream document bytes over the URP socket via XInputStream/XOutputStream. Stream input is ~40-50% slower than :file input (many seek/read round-trips for ZIP formats). Stream output adds <5% overhead: soffice writes in fixed 32 767-byte chunks, so a 7 MB PDF is ~223 writeBytes calls. No temp files and constant memory usage.
    • {:file, :stream} or {:stream, :file} — mix strategies independently for input and output. {:file, :stream} is a good defensive choice: fast file-based input with chunked stream output (no large single allocation on the BEAM).
  • :output — where to write converted output:
    • path string — write to file, returns {:ok, path}
    • :binary — return bytes, returns {:ok, bytes}
    • fun/1 — call with each chunk, returns :ok
    • not set — write to temp file, returns {:ok, tmp_path}
  • :pool — named pool to use (default: the auto-started pool)
  • :timeout — checkout timeout in ms (default 120_000)
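
The chunk arithmetic in the :stream description can be checked with a short sketch. ChunkMath is a hypothetical module, not part of URP. It only divides a byte count by the 32 767-byte chunk size quoted above and rounds up, and it says nothing about how soffice actually schedules its writes.

```elixir
defmodule ChunkMath do
  # Chunk size quoted in the :stream description above.
  @chunk_size 32_767

  # Number of fixed-size writes needed to emit `bytes` bytes (ceiling division).
  @spec write_calls(non_neg_integer()) :: non_neg_integer()
  def write_calls(bytes) when is_integer(bytes) and bytes >= 0 do
    div(bytes + @chunk_size - 1, @chunk_size)
  end
end

# A PDF of about 7.3 million bytes lands on the figure quoted above.
ChunkMath.write_calls(7_300_000)
# => 223
```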

Examples

File path input with various output modes:

{:ok, pdf_path} = URP.convert("/tmp/report.docx", filter: "writer_pdf_Export")
{:ok, "/tmp/out.pdf"} = URP.convert("/tmp/report.docx", filter: "writer_pdf_Export", output: "/tmp/out.pdf")
{:ok, pdf_bytes} = URP.convert("/tmp/report.docx", filter: "writer_pdf_Export", output: :binary)
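
The fun/1 output mode is not shown above, so here is a minimal sketch of a chunk sink. To stay runnable without a soffice instance, it feeds the function a few simulated chunks. In real use the same function would be passed as the :output option of URP.convert/2. The file name and chunk contents are placeholders.

```elixir
# Open a destination file; the path is a placeholder for illustration.
path = Path.join(System.tmp_dir!(), "urp_sink_sketch.pdf")
{:ok, device} = File.open(path, [:write, :binary])

# fun/1 sink: called once per chunk, appends the chunk to the open file.
sink = fn chunk -> IO.binwrite(device, chunk) end

# Stand-in for the chunks a conversion would deliver (assumption: in real
# code you would call URP.convert(input, filter: "...", output: sink)).
simulated_chunks = ["%PDF-", "1.7\n", "body"]
Enum.each(simulated_chunks, sink)

:ok = File.close(device)
written = File.read!(path)
File.rm!(path)
```

Because each chunk is written as it arrives, the full document never has to sit in memory as one binary.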

Raw bytes:

{:ok, pdf_bytes} = URP.convert({:binary, docx_bytes}, filter: "writer_pdf_Export", output: :binary)

Enumerable (e.g. streaming a large file):

{:ok, pdf_path} = URP.convert(File.stream!("huge.docx", 65_536), filter: "writer_pdf_Export")

With soffice settings (e.g. raise graphic memory cache for image-heavy docs):

{:ok, pdf} = URP.convert("charts.pptx",
  filter: "impress_pdf_Export",
  settings: [
    {"org.openoffice.Office.Common/Cache/GraphicManager", "GraphicMemoryLimit", 500_000_000}
  ]
)

The :filter option is required:

iex> URP.convert({:binary, "bytes"}, output: :binary)
** (ArgumentError) URP.convert/2 requires the :filter option. Common filters: "writer_pdf_Export", "calc_pdf_Export", "impress_pdf_Export", "Markdown"

With a test stub (see URP.Test):

iex> URP.Test.stub(fn _input, _opts -> {:ok, "/tmp/fake.pdf"} end)
:ok
iex> URP.convert("/tmp/test.docx", filter: "writer_pdf_Export")
{:ok, "/tmp/fake.pdf"}

convert!(input, opts \\ [])

@spec convert!(binary() | {:binary, binary()} | Enumerable.t(), [opt()]) ::
  Path.t() | binary() | :ok

Like convert/2 but raises on error.
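
The relationship between the two variants can be sketched as a small unwrapping function. This is not URP's implementation: BangSketch is a hypothetical module, and raising RuntimeError is an assumption, since this page does not say which exception convert!/2 raises.

```elixir
defmodule BangSketch do
  # Hypothetical helper, not URP's code. Maps the return shapes of
  # convert/2 onto the return shapes in the convert!/2 spec.
  @spec unwrap!(:ok | {:ok, term()} | {:error, String.t()}) :: term()
  def unwrap!(:ok), do: :ok
  def unwrap!({:ok, value}), do: value

  # Assumption: the error string becomes the exception message.
  def unwrap!({:error, message}) when is_binary(message) do
    raise RuntimeError, message
  end
end

BangSketch.unwrap!({:ok, "/tmp/out.pdf"})
# => "/tmp/out.pdf"
```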

filters(opts \\ [])

@spec filters(keyword()) :: {:ok, [String.t()]} | {:error, String.t()}

List all export filter names registered in soffice.

Returns a list of filter name strings like "writer_pdf_Export". Useful for discovering which filters are available on the connected soffice.

Examples

{:ok, filters} = URP.filters()
"writer_pdf_Export" in filters
# => true
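
Once the list is in hand, ordinary Enum functions narrow it down. The sketch below uses a hard-coded sample list in place of a live URP.filters/1 result, and the "_pdf_Export" suffix check is an assumption based on the filter names that appear on this page.

```elixir
# Sample data standing in for the list a live soffice would return.
filters = ["writer_pdf_Export", "calc_pdf_Export", "impress_pdf_Export", "Markdown"]

# Keep only the filters whose names follow the PDF export naming pattern.
pdf_filters = Enum.filter(filters, &String.ends_with?(&1, "_pdf_Export"))
# => ["writer_pdf_Export", "calc_pdf_Export", "impress_pdf_Export"]
```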

Options

  • :pool — named pool to use (default: the auto-started pool)
  • :timeout — checkout timeout in ms (default 120_000)

filters!(opts \\ [])

@spec filters!(keyword()) :: [String.t()]

Like filters/1 but raises on error.

locale(opts \\ [])

@spec locale(keyword()) :: {:ok, String.t()} | {:error, String.t()}

Query the soffice locale string.

Returns the locale string (e.g. "en-US") or "" if not configured.

Examples

{:ok, locale} = URP.locale()
# => "en-US"
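
Because an unconfigured locale comes back as "" rather than an error, callers may want a fallback. A minimal sketch, using a hypothetical helper (LocaleSketch is not part of URP) and a default chosen by the caller:

```elixir
defmodule LocaleSketch do
  # Hypothetical helper, not part of URP.
  # Replaces the empty "not configured" value with a caller-chosen default.
  @spec or_default(String.t(), String.t()) :: String.t()
  def or_default("", default), do: default
  def or_default(locale, _default) when is_binary(locale), do: locale
end

LocaleSketch.or_default("", "en-US")
# => "en-US"
```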

Options

  • :pool — named pool to use (default: the auto-started pool)
  • :timeout — checkout timeout in ms (default 120_000)

locale!(opts \\ [])

@spec locale!(keyword()) :: String.t()

Like locale/1 but raises on error.

services(opts \\ [])

@spec services(keyword()) :: {:ok, [String.t()]} | {:error, String.t()}

List all service names registered in the UNO service manager.

Returns a list of service name strings like "com.sun.star.frame.Desktop".

Examples

{:ok, services} = URP.services()
"com.sun.star.frame.Desktop" in services
# => true

Options

  • :pool — named pool to use (default: the auto-started pool)
  • :timeout — checkout timeout in ms (default 120_000)

services!(opts \\ [])

@spec services!(keyword()) :: [String.t()]

Like services/1 but raises on error.

types(opts \\ [])

@spec types(keyword()) :: {:ok, [String.t()]} | {:error, String.t()}

List all document type names registered in soffice.

Returns a list of type name strings like "writer8". These are the internal names soffice uses for file format detection.

Examples

{:ok, types} = URP.types()
"writer8" in types
# => true

Options

  • :pool — named pool to use (default: the auto-started pool)
  • :timeout — checkout timeout in ms (default 120_000)

types!(opts \\ [])

@spec types!(keyword()) :: [String.t()]

Like types/1 but raises on error.

version(opts \\ [])

@spec version(keyword()) :: {:ok, String.t()} | {:error, String.t()}

Query the soffice version string over URP.

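Returns the raw version string (e.g. "26.2.0.3"). Elixir's Version.parse/1 accepts only three-component MAJOR.MINOR.PATCH strings, so a four-component string like this one must be trimmed before parsing.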

Examples

{:ok, version} = URP.version()
# => "26.2.0.3"
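
A minimal sketch of feeding the version string to Elixir's Version module, which returns :error for a four-component string such as "26.2.0.3". VersionSketch.parse_soffice/1 is a hypothetical helper, not part of URP. It keeps the first three components and drops the rest.

```elixir
defmodule VersionSketch do
  # Hypothetical helper, not part of URP.
  # Keeps MAJOR.MINOR.PATCH and discards any further components.
  @spec parse_soffice(String.t()) :: {:ok, Version.t()} | :error
  def parse_soffice(raw) when is_binary(raw) do
    case String.split(raw, ".") do
      [major, minor, patch | _rest] ->
        Version.parse(Enum.join([major, minor, patch], "."))

      _too_short ->
        :error
    end
  end
end

# The raw string is rejected as-is because it has four components.
:error = Version.parse("26.2.0.3")

{:ok, version} = VersionSketch.parse_soffice("26.2.0.3")
Version.match?(version, ">= 25.8.0")
# => true
```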

Options

  • :pool — named pool to use (default: the auto-started pool)
  • :timeout — checkout timeout in ms (default 120_000)

version!(opts \\ [])

@spec version!(keyword()) :: String.t()

Like version/1 but raises on error.