Pdf.Reader.Utils (ExPDF v1.0.1)

Shared utility helpers for Pdf.Reader sub-modules.

Provides string decoding and rectangle parsing used by AcroForm, Outlines, Annotations, and Destination modules.

Spec references

Summary

Functions

decode_pdf_string/1

Decodes a PDF string value to a UTF-8 String.t().

parse_rect/1

Parses a PDF /Rect array into a {x1, y1, x2, y2} tuple of floats.

Functions

decode_pdf_string(binary)

@spec decode_pdf_string(any()) :: String.t() | nil

Decodes a PDF string value to a UTF-8 String.t().

Handles the following input variants:

  • nil → nil
  • non-binary, non-tuple → nil
  • {:string, binary} tuple — unwraps and decodes the binary
  • <<0xFE, 0xFF, ...>> — UTF-16BE BOM prefix → decoded to UTF-8 via :unicode
  • plain binary — if valid UTF-8, returned as-is; otherwise best-effort ASCII extraction (non-ASCII bytes replaced with "?")
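
The variants above can be pictured as a chain of function clauses. The following is a minimal sketch only, not ExPDF's actual source: the module name DecodeSketch and the helper ascii_only/1 are hypothetical. Two behaviours are assumptions that the documentation does not specify: tuples other than {:string, binary} return nil, and malformed UTF-16 falls back to ASCII extraction.

```elixir
defmodule DecodeSketch do
  # Hypothetical re-implementation of the documented rules; not ExPDF's code.

  def decode(nil), do: nil

  # {:string, binary} tuple: unwrap and decode the inner binary.
  def decode({:string, bin}) when is_binary(bin), do: decode(bin)

  # UTF-16BE byte-order mark: convert the remainder to UTF-8 via :unicode.
  def decode(<<0xFE, 0xFF, rest::binary>>) do
    case :unicode.characters_to_binary(rest, {:utf16, :big}, :utf8) do
      utf8 when is_binary(utf8) -> utf8
      # Assumption: on malformed UTF-16, fall back to ASCII extraction.
      _error_or_incomplete -> ascii_only(rest)
    end
  end

  # Plain binary: keep valid UTF-8 as-is, otherwise extract ASCII.
  def decode(bin) when is_binary(bin) do
    if String.valid?(bin), do: bin, else: ascii_only(bin)
  end

  # Anything else (assumption: this includes other tuple shapes).
  def decode(_other), do: nil

  # Replace every byte above 0x7F with "?".
  defp ascii_only(bin) do
    for <<byte <- bin>>, into: "", do: if(byte < 128, do: <<byte>>, else: "?")
  end
end
```

Under this sketch, DecodeSketch.decode(<<0xFE, 0xFF, 0x00, 0x48, 0x00, 0x69>>) yields "Hi", and DecodeSketch.decode(<<"caf", 0xE9>>) yields "caf?" because the lone 0xE9 byte is not valid UTF-8.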

Spec reference

PDF 1.7 § 7.9.2.2 — Text String Type (UTF-16BE BOM).

parse_rect/1

@spec parse_rect(any()) :: {number(), number(), number(), number()} | nil

Parses a PDF /Rect array into a {x1, y1, x2, y2} tuple of floats.

Returns nil for any input that is not a 4-element list of numbers.

Examples

iex> Pdf.Reader.Utils.parse_rect([0, 0, 100, 200])
{0.0, 0.0, 100.0, 200.0}

iex> Pdf.Reader.Utils.parse_rect(nil)
nil
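
For readers who want to see how the documented contract could be met, here is a minimal sketch under stated assumptions. The module name RectSketch is hypothetical and this is not ExPDF's implementation. It assumes that integers are widened to floats by multiplying by 1.0, which matches the {0.0, 0.0, 100.0, 200.0} result shown above.

```elixir
defmodule RectSketch do
  # Hypothetical re-implementation of the documented contract; not ExPDF's code.

  # Exactly four numbers: convert each coordinate to a float.
  def parse([x1, y1, x2, y2])
      when is_number(x1) and is_number(y1) and is_number(x2) and is_number(y2) do
    {x1 * 1.0, y1 * 1.0, x2 * 1.0, y2 * 1.0}
  end

  # Wrong length, non-numeric elements, or a non-list input.
  def parse(_other), do: nil
end
```

With this sketch, RectSketch.parse([10, 20.5, 110, 220]) gives {10.0, 20.5, 110.0, 220.0}, while RectSketch.parse([0, 0, 100]) and RectSketch.parse(["0", 0, 1, 2]) both give nil.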