# `Arcana.FileParser.PDF`
[🔗](https://github.com/georgeguimaraes/arcana/blob/main/lib/arcana/file_parser/pdf.ex#L1)

Behaviour for PDF parsing providers.

Arcana accepts any module that implements this behaviour for PDF text extraction.
The built-in implementation uses poppler's `pdftotext` command.

## Configuration

Configure your PDF parser in `config.exs`:

    # Default: poppler's pdftotext
    config :arcana, pdf_parser: :poppler

    # Custom module implementing this behaviour
    config :arcana, pdf_parser: MyApp.PDFParser
    config :arcana, pdf_parser: {MyApp.PDFParser, some_option: "value"}

## Implementing a Custom PDF Parser

Create a module that implements this behaviour:

    defmodule MyApp.PDFParser do
      @behaviour Arcana.FileParser.PDF

      @impl true
      def parse(path, opts) when is_binary(path) do
        # Parse PDF at file path
        {:ok, extracted_text}
      end

      # Optional: handle binary content directly
      def parse(binary, opts) when is_binary(binary) do
        # Parse PDF binary content
        {:ok, extracted_text}
      end
    end

Then configure:

    config :arcana, pdf_parser: {MyApp.PDFParser, some_option: "value"}

# `parse`

```elixir
@callback parse(path_or_binary :: binary(), opts :: keyword()) ::
  {:ok, String.t()} | {:error, term()}
```

Parses a PDF and extracts text content.

The first argument can be either:
- A file path (string) - the implementation reads the file
- Binary content - the implementation parses directly

Returns `{:ok, text}` on success, or `{:error, reason}` on failure.

## Options

Options are implementation-specific. The default Poppler implementation
supports:

  * `:layout` - Preserve original layout (default: true)

# `parse`

Parses a PDF using the configured parser.

The parser is a `{module, opts}` tuple where module implements
this behaviour.

# `supports_binary?`

Checks if the given parser module supports binary input.

Some parsers (like Poppler) require a file path and don't support
parsing binary content directly.

---

*Consult [api-reference.md](api-reference.md) for complete listing*
