Pdf.Reader.CID.CMapParser (ExPDF v1.0.1)

Minimal PostScript subset parser for Adobe predefined CMap files.

Handles only the operators required for CID lookup: begin/endcodespacerange, begin/endcidchar, begin/endcidrange, begin/endnotdefchar, begin/endnotdefrange, usecmap.

All other PostScript content (comments, /CMapName, /CIDSystemInfo, /WMode, dict/array literals, dup/def/pop, etc.) is silently skipped.
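To make the input format concrete, here is a minimal, self-contained sketch of pulling begincidrange/endcidrange entries out of CMap text while ignoring everything else. It is not this module's implementation: the module name CidRangeSketch and the regex-based approach are assumptions made purely for illustration, and a real parser would tokenize the PostScript rather than pattern-match it.

```elixir
defmodule CidRangeSketch do
  # Hypothetical helper, not part of ExPDF.
  # Returns [{lo, hi, base_cid}] for every entry found inside
  # begincidrange ... endcidrange blocks; all other text is ignored.
  def extract(text) when is_binary(text) do
    # The "s" flag lets "." span newlines; ".*?" keeps each block separate.
    ~r/begincidrange(.*?)endcidrange/s
    |> Regex.scan(text, capture: :all_but_first)
    |> Enum.flat_map(fn [body] -> entries(body) end)
  end

  # One entry looks like: <lo-hex> <hi-hex> decimal-cid
  defp entries(body) do
    ~r/<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>\s+(\d+)/
    |> Regex.scan(body, capture: :all_but_first)
    |> Enum.map(fn [lo, hi, cid] ->
      {String.to_integer(lo, 16), String.to_integer(hi, 16), String.to_integer(cid)}
    end)
  end
end
```

Given a fragment containing `<20> <7e> 1` between begincidrange and endcidrange, the sketch yields the tuple {0x20, 0x7E, 1}, which has the same shape as an entry in the :cidrange list.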

Returns a plain map (see cmap()) compatible with Pdf.Reader.CID.PredefinedCMap for caching and lookup.

Spec references

Summary

Functions

parse(text)

Parse a PostScript CMap text and return a plain map with the extracted CID mapping data.

Types

@type cmap() :: %{
  cidchar: %{required(non_neg_integer()) => non_neg_integer()},
  cidrange: [{non_neg_integer(), non_neg_integer(), non_neg_integer()}],
  notdef_chars: %{required(non_neg_integer()) => non_neg_integer()},
  notdef_ranges: [{non_neg_integer(), non_neg_integer(), non_neg_integer()}],
  codespaces: %{required(1..4) => [{non_neg_integer(), non_neg_integer()}]},
  parent: String.t() | nil
}
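As an illustration of how the :codespaces field could be used, the sketch below picks the byte width of the next character code in a byte string. The module CodespaceSketch and its next_code/2 function are hypothetical and not part of ExPDF. It also simplifies matters by comparing the whole code as one integer against each {lo, hi} pair, which is looser than the per-byte range matching described in Adobe's CMap specification.

```elixir
defmodule CodespaceSketch do
  # Hypothetical helper, not part of ExPDF.
  # Tries widths 1..4 in order and returns {:ok, code, width, rest}
  # for the first width whose codespace ranges contain the code,
  # or :error when no width matches.
  def next_code(codespaces, bytes) when is_map(codespaces) and is_binary(bytes) do
    Enum.find_value(1..4, :error, fn width ->
      with <<code::unsigned-big-integer-size(width)-unit(8), rest::binary>> <- bytes,
           ranges = Map.get(codespaces, width, []),
           # Simplification: integer comparison instead of per-byte matching.
           true <- Enum.any?(ranges, fn {lo, hi} -> code >= lo and code <= hi end) do
        {:ok, code, width, rest}
      else
        _ -> nil
      end
    end)
  end
end
```

With codespaces of `%{1 => [{0x00, 0x80}], 2 => [{0x8140, 0xFEFE}]}`, the byte 0x41 is consumed as a one-byte code, while the bytes 0x81 0x40 are consumed together as a two-byte code.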

Functions

@spec parse(text :: binary()) :: {:ok, cmap()} | {:error, term()}

Parse a PostScript CMap text and return a plain map with the extracted CID mapping data.

Returns {:ok, cmap_fields} on success, or {:error, reason} if the input is fundamentally unparseable. Unknown or irrelevant tokens are silently skipped; this function never raises.

Return map keys

  • :cidchar - %{code_integer => cid_integer}
  • :cidrange - [{lo, hi, base_cid}]
  • :notdef_chars - %{code_integer => cid_integer}
  • :notdef_ranges - [{lo, hi, base_cid}]
  • :codespaces - %{byte_length => [{lo, hi}]}, grouped by byte width
  • :parent - String.t() | nil, the name from the usecmap directive
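The keys above are enough to resolve a character code to a CID. The following is a minimal sketch of one way to do that over a map of this shape. The module CidLookupSketch is hypothetical, and the lookup order (cidchar, then cidrange, then the notdef tables) is an assumption rather than the documented behaviour of Pdf.Reader.CID.PredefinedCMap. It also assumes that every code in a notdef range maps to the same CID, as in Adobe's CMap specification, and it does not follow :parent.

```elixir
defmodule CidLookupSketch do
  # Hypothetical helper, not part of ExPDF.
  # Returns {:ok, cid} or :error for an integer character code.
  def lookup(cmap, code) when is_map(cmap) and is_integer(code) do
    # Each step only runs when the previous one returned :error;
    # the first {:ok, cid} falls straight through as the result.
    with :error <- Map.fetch(cmap.cidchar, code),
         :error <- from_ranges(cmap.cidrange, code, :offset),
         :error <- Map.fetch(cmap.notdef_chars, code) do
      from_ranges(cmap.notdef_ranges, code, :constant)
    end
  end

  defp from_ranges(ranges, code, mode) do
    case Enum.find(ranges, fn {lo, hi, _cid} -> code >= lo and code <= hi end) do
      nil -> :error
      # cidrange: CIDs increase with the code, starting at base_cid for lo.
      {lo, _hi, base} when mode == :offset -> {:ok, base + (code - lo)}
      # notdef range (assumed): the whole range shares a single CID.
      {_lo, _hi, cid} -> {:ok, cid}
    end
  end
end
```

For a :cidrange entry of {0x20, 0x7E, 1}, the code 0x41 resolves to CID 1 + (0x41 - 0x20) = 34. A caller that needs inherited mappings would have to repeat the lookup against the CMap named in :parent when this sketch returns :error.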