ChromicPDF (ChromicPDF v1.17.0)
ChromicPDF is a fast HTML-to-PDF/A renderer based on Chrome & Ghostscript.
Usage
Start
Start ChromicPDF as part of your supervision tree:
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    children = [
      # other apps...
      {ChromicPDF, chromic_pdf_opts()}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  defp chromic_pdf_opts do
    []
  end
end
Print a PDF
ChromicPDF.print_to_pdf({:file, "example.html"}, output: "output.pdf")
This tells Chrome to open the example.html file from your current directory and save the rendered page as output.pdf. PDF printing comes with a ton of options. Please see ChromicPDF.print_to_pdf/2 for details.
Print a PDF/A
ChromicPDF.print_to_pdfa({:file, "example.html"}, output: "output.pdf")
This prints the same PDF with Chrome and afterwards passes it to Ghostscript to convert it to a PDF/A. Please see ChromicPDF.print_to_pdfa/2 or ChromicPDF.convert_to_pdfa/2 for details.
Security considerations
By default, ChromicPDF will allow Chrome to make use of its own "sandbox" process jail. The sandbox tries to limit system resource access of the renderer processes to the minimum resources they require to perform their task. It is designed to make displaying HTML pages relatively safe, in terms of preventing undesired access of a page to the host operating system.
Nevertheless, running a browser as part of your application, especially when used to process user-supplied content, significantly increases your attack surface. Hence, before adding ChromicPDF to your application's (perhaps already long) list of dependencies, you may want to consider the security hints below.
Architectural isolation
A great, if not the best, option to mitigate the security risks of running ChromicPDF (and thus a browser) in your stack is to turn your "document renderer" component into a containerized service with a small RPC interface. This creates a barrier between Chrome and the rest of your application, so that even if an attacker manages to escape Chrome's sandbox, they will still be jailed within the container. It also has other benefits, like better control of resources, e.g. how much CPU you want to dedicate to PDF rendering.
Escape user-supplied data
Make sure to always escape user-provided data with something like Phoenix.HTML.html_escape/1. This should prevent an attacker from injecting malicious scripts into your template.
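As a minimal sketch of the escaping step, assuming the phoenix_html package is available and the HTML is assembled by plain string interpolation (the variable names and the output file name are made up for illustration):

```elixir
# Untrusted input, e.g. taken from a form field.
user_name = "<script>alert(1)</script>"

# html_escape/1 returns a {:safe, iodata} tuple;
# safe_to_string/1 turns that into a plain binary.
escaped_name =
  user_name
  |> Phoenix.HTML.html_escape()
  |> Phoenix.HTML.safe_to_string()

html = "<h1>Invoice for #{escaped_name}</h1>"

ChromicPDF.print_to_pdf({:html, html}, output: "invoice.pdf")
```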
Disabling scripts
If your template allows, you can disable JavaScript execution altogether (using the DevTools command Emulation.setScriptExecutionDisabled) with the :disable_scripts option:
def chromic_pdf_opts do
[disable_scripts: true]
end
Note that this doesn't prevent other features like the evaluate option from working; it applies solely to scripts supplied by the rendered page itself.
Running in offline mode
To prevent your templates from accessing any remote hosts, the browser targets can be spawned in "offline mode" (using the DevTools command Network.emulateNetworkConditions). Chrome targets with network conditions set to offline can't resolve any external URLs (e.g. https://), whether entered as the navigation URL or contained within the HTML body.
def chromic_pdf_opts do
[offline: true]
end
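If your templates need neither JavaScript nor remote resources, the two options above can be combined. The following is only a sketch; the module and function names are hypothetical, and the options are the ones described in this section:

```elixir
defmodule MyApp.PdfOptions do
  # Hypothetical helper collecting the hardening options in one place.
  def hardened do
    [
      # no JavaScript supplied by the rendered page is executed
      disable_scripts: true,
      # targets cannot resolve external URLs
      offline: true
    ]
  end
end
```

The resulting keyword list can then be passed to the supervisor child, e.g. `{ChromicPDF, MyApp.PdfOptions.hardened()}`.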
Chrome Sandbox in Docker containers
In Docker containers running Linux images (e.g. images based on Alpine), and which are configured to run their main job as a non-root user, the sandbox may cause Chrome to crash on startup as it requires root privileges.
The error output (visible with the discard_stderr: false option) looks as follows:
Failed to move to new namespace: PID namespaces supported, Network namespace supported,
but failed: errno = Operation not permitted
The best way to resolve this issue is to configure your Docker container to use seccomp rules that grant Chrome access to the relevant system calls. See the excellent Zenika/alpine-chrome repository for details on how to make this work.
Alternatively, you may choose to disable Chrome's sandbox with the no_sandbox option.
defp chromic_pdf_opts do
[no_sandbox: true]
end
Only local Chrome instances
This option is available only for local Chrome instances.
SSL connections
If you are fetching your print source from an https:// URL, Chrome verifies the remote host's SSL certificate as usual when establishing the secure connection, and aborts navigation if the certificate has expired or is not signed by a known certificate authority (i.e. self-signed certificates are rejected).
For production systems, this security check is essential and should not be circumvented.
However, if for some reason you need to bypass certificate verification in development or test, you can do this with the :ignore_certificate_errors option.
defp chromic_pdf_opts do
[ignore_certificate_errors: true]
end
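To keep the bypass out of production, the option can be loaded from the application environment. This is only a sketch: the :my_app application name and the file layout are assumptions, and the option is set per environment in the usual Mix config files.

```elixir
# config/config.exs (default: certificate verification stays on)
import Config
config :my_app, ChromicPDF, ignore_certificate_errors: false

# config/dev.exs and config/test.exs (bypass for local work only)
import Config
config :my_app, ChromicPDF, ignore_certificate_errors: true
```

The supervisor options can then be read with `Application.get_env(:my_app, ChromicPDF, [])` when building the child spec.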
Session pool
ChromicPDF spawns a pool of targets (= tabs) inside the launched Chrome process. These are held in memory to reduce initialization time in the PDF print jobs.
Operation timeouts
By default, ChromicPDF allows the print process to take 5 seconds to finish. If you are printing large PDFs and run into timeouts, the limit can be configured by passing the timeout option to the session pool.
defp chromic_pdf_opts do
[
session_pool: [timeout: 10_000] # in milliseconds
]
end
Concurrency
ChromicPDF depends on the NimblePool library to manage the browser sessions in a pool. To increase or limit the number of concurrent sessions, you can pass pool configuration to the supervisor.
defp chromic_pdf_opts do
[
session_pool: [size: 3]
]
end
NimblePool performs simple queueing of operations. The maximum time an operation is allowed to wait in the queue is configurable with the :checkout_timeout option, and defaults to 5 seconds.
defp chromic_pdf_opts do
[
session_pool: [checkout_timeout: 5_000]
]
end
Please note that this is not a persistent queue. If your concurrent demand exceeds the configured concurrency, your jobs will begin to time out. In this case, an asynchronous approach backed by a persistent job processor like Oban will give you better results, and likely improve your application's UX.
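As a rough sketch of the asynchronous approach, assuming Oban is already set up in your application with a queue named :pdf (the worker module, the queue name and the job arguments are all made up for illustration):

```elixir
defmodule MyApp.PdfWorker do
  # Hypothetical worker; the :pdf queue must exist in your Oban configuration.
  use Oban.Worker, queue: :pdf, max_attempts: 3

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"url" => url, "output" => output}}) do
    # print_to_pdf/2 returns :ok when writing to an output path,
    # which Oban treats as a successful job.
    ChromicPDF.print_to_pdf({:url, url}, output: output)
  end
end

# Enqueue a print job instead of printing in the request process.
%{"url" => "https://example.net", "output" => "/tmp/example.pdf"}
|> MyApp.PdfWorker.new()
|> Oban.insert()
```

With this design the number of jobs printed at once is bounded by the queue's concurrency limit, which you would typically keep at or below the session pool size.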
Automatic session restarts to avoid memory drain
By default, ChromicPDF will restart sessions within the Chrome process after 1000 operations. This helps to prevent infinite growth in Chrome's memory consumption. The "max age" of a session can be configured with the :max_uses option.
defp chromic_pdf_opts do
[
session_pool: [max_uses: 1000]
]
end
Multiple session pools
ChromicPDF supports running multiple named session pools to allow varying session configuration. For example, this makes it possible to have one pool that is not allowed to execute JavaScript while others can use JavaScript.
defp chromic_pdf_opts do
  [
    session_pool: %{
      with_scripts: [],
      without_scripts: [disable_scripts: true]
    }
  ]
end
When you define multiple session pools, you need to assign the pool to use in each PDF job:
ChromicPDF.print_to_pdf(..., session_pool: :without_scripts)
Global options are used as defaults for each configured pool. See ChromicPDF.session_option/0 for a list of options for the session pools.
Chrome zombies
Help, a Chrome army tries to take over my system!
ChromicPDF tries its best to gracefully close the external Chrome process when its supervisor is terminated. Unfortunately, when the BEAM is not shut down gracefully, Chrome processes will keep running. While in a containerized production environment this is unlikely to be of concern, in development it can lead to unpleasant performance degradation of your operating system.
In particular, the BEAM is not shut down properly…
- when you exit your application or iex console with the Ctrl+C abort mechanism (see issue #56),
- and when you run your tests. No, after an ExUnit run your application's supervisor is not terminated cleanly.
There are a few ways to mitigate this issue.
"On Demand" mode
In case you habitually end your development server with Ctrl+C, you should consider enabling "On Demand" mode which disables the session pool, and instead starts and stops Chrome instances as needed. If multiple PDF operations are requested simultaneously, multiple Chrome processes will be launched (each with a pool size of 1, disregarding the pool configuration).
defp chromic_pdf_opts do
[on_demand: true]
end
To enable it only for development, you can load the option from the application environment.
# config/config.exs
config :my_app, ChromicPDF, on_demand: false
# config/dev.exs
config :my_app, ChromicPDF, on_demand: true
# application.ex
@chromic_pdf_opts Application.compile_env!(:my_app, ChromicPDF)
defp chromic_pdf_opts do
@chromic_pdf_opts ++ [... other opts ...]
end
Terminating your supervisor after your test suite
You can enable "On Demand" mode for your tests, as well. However, please be aware that each test that prints a PDF will have an increased runtime (plus about 0.5s) due to the added Chrome boot time cost. Luckily, ExUnit provides a method to run code at the end of your test suite.
# test/test_helper.exs
ExUnit.after_suite(fn _ -> Supervisor.stop(MyApp.Supervisor) end)
ExUnit.start()
Only start ChromicPDF in production
The easiest way to prevent Chrome from spawning in development is to only run ChromicPDF in the prod environment. However, you then won't be able to print PDFs in development or test.
Chrome options
By default, ChromicPDF will try to run a Chrome instance in the local environment. The following options allow you to customize the generated command line.
Custom command line switches
The :chrome_args option allows you to pass arbitrary options to the Chrome/Chromium executable.
defp chromic_pdf_opts do
[chrome_args: "--font-render-hinting=none"]
end
In some cases, ChromicPDF's default arguments (e.g. --disable-gpu) may conflict with the ones you would like to add. In this case, you can supply a keyword list to the :chrome_args option, which allows targeted removal of default arguments.
defp chromic_pdf_opts do
[chrome_args: [
append: "--headless=new --angle=swiftshader",
remove: ["--headless", "--disable-gpu"]
]]
end
The :chrome_executable option allows you to specify a custom Chrome/Chromium executable.
defp chromic_pdf_opts do
[chrome_executable: "/usr/bin/google-chrome-beta"]
end
Font rendering issues on Linux
On Linux, Chrome and its rendering engine Skia have longstanding issues with rendering certain fonts for print media, especially with regards to letter kerning. See this issue in puppeteer for a discussion. If your documents suffer from incorrectly spaced letters, you can try some of the following:
- Apply the text-rendering: geometricPrecision CSS rule. In our tests, this has proven to be the most reliable option. It is also the most flexible one, as you can apply it to individual elements depending on the font face they use. Recommended.
- Set the --font-render-hinting=none or --disable-font-subpixel-positioning command line switches (see the :chrome_args option above). While this generally improved text rendering in all our tests, it is a bit of a mallet method.
See also this blog post for more hints.
Debugging Chrome errors
Chrome's stderr logging is silently discarded to not obscure your logfiles. In case you would like to take a peek, add the discard_stderr: false option.
defp chromic_pdf_opts do
[discard_stderr: false]
end
Remote Chrome
Instead of running a local Chrome instance, you may connect to an external Chrome instance via its websocket-based debugging interface. For example, you can run a headless Chrome inside a docker container using the minimalistic Zenika/alpine-chrome images:
$ docker run --rm -p 9222:9222 \
zenika/alpine-chrome:114 \
--no-sandbox \
--headless \
--remote-debugging-port=9222 \
--remote-debugging-address=0.0.0.0
See the ChromicPDF.ChromeRunner module for a list of command line arguments that may improve your headless Chrome experience.
To enable remote connections to Chrome, you need to specify the hostname and port of the running Chrome instance using the :chrome_address option. Setting this option will disable the command line-related options discussed above.
defp chromic_pdf_opts do
[chrome_address: {"localhost", 9222}]
end
To communicate with Chrome through its websocket interface, ChromicPDF has an optional dependency on the websockex package, which you need to explicitly add to your mix.exs:
def deps do
[
{:chromic_pdf, "..."},
{:websockex, "~> 0.4.3"}
]
end
In case you have added websockex after chromic_pdf had already been compiled, you need to force a recompilation with mix deps.compile --force chromic_pdf.
Experimental
Please note that support for remote connections is considered experimental. Be aware that between restarts ChromicPDF may leave tabs behind and your external Chrome process may leak memory.
Ghostscript pool
In addition to the session pool, a pool of Ghostscript "executors" is started in order to limit this resource as well. By default, ChromicPDF allows the same number of concurrent Ghostscript processes to run as it spawns sessions in Chrome itself.
defp chromic_pdf_opts do
[
ghostscript_pool: [size: 10]
]
end
Telemetry support
To provide insights into PDF and PDF/A generation performance, ChromicPDF executes the following telemetry events:
[:chromic_pdf, :print_to_pdf, :start | :stop | :exception]
[:chromic_pdf, :capture_screenshot, :start | :stop | :exception]
[:chromic_pdf, :convert_to_pdfa, :start | :stop | :exception]
Please see :telemetry.span/3 for details on their payloads, and :telemetry.attach/4 for how to attach to them.
Each of the corresponding functions accepts a telemetry_metadata option which is passed to the attached event handler. This can, for instance, be used to mark events with custom tags such as the type of the print document.
ChromicPDF.print_to_pdf(..., telemetry_metadata: %{template: "invoice"})
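A handler for the :stop event might look like the sketch below. The module name and handler id are hypothetical. It relies on :telemetry.span/3 reporting a :duration measurement in native time units on :stop events, and it simply inspects the metadata, since the exact shape in which telemetry_metadata arrives is not spelled out here.

```elixir
defmodule MyApp.PdfTelemetry do
  require Logger

  # Call once at application start, e.g. from MyApp.Application.start/2.
  def attach do
    :telemetry.attach(
      "my-app-chromic-pdf-print-stop",
      [:chromic_pdf, :print_to_pdf, :stop],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event([:chromic_pdf, :print_to_pdf, :stop], measurements, metadata, _config) do
    # Span durations are reported in native time units.
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    Logger.info("print_to_pdf took #{duration_ms}ms, metadata: #{inspect(metadata)}")
  end
end
```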
The print_to_pdfa/2 function emits both the print_to_pdf and convert_to_pdfa event series, in that order.
Last but not least, the print_to_pdf/2 function emits :join_pdfs events when concatenating multiple input sources.
[:chromic_pdf, :join_pdfs, :start | :stop | :exception]
Further options
Debugging JavaScript errors & warnings
By default, unhandled runtime exceptions thrown in JavaScript execution contexts are logged. You may choose to instead convert them into an Elixir exception by passing the following option:
defp chromic_pdf_opts do
# :ignore | :log (default) | :raise
[unhandled_runtime_exceptions: :raise]
end
Alternatively, you can pass :ignore to silence the log statement.
Calls to console.log & friends are ignored by default, and can be configured to be logged like this:
defp chromic_pdf_opts do
# :ignore (default) | :log | :raise
[console_api_calls: :log]
end
On Accessibility / PDF/UA
Since its version 85, Chrome generates "Tagged PDF" files by default. These files contain structural information about the document, i.e. type information about the nodes (headings, paragraphs, etc.), as well as metadata like node attributes (e.g., image alt texts). This information allows assistive tools like screen readers to do their job, at the cost of (at times significantly) increasing the file size. To check whether a PDF file is tagged, you can use the pdfinfo utility, which reports these files as Tagged: yes. You can review some of the contained information with the pdfinfo -struct-text <file> command. Tagging may be disabled by passing the --disable-pdf-tagging argument to Chrome via the chrome_args option.
However, at the time of writing, Chrome's most recent beta version 109 does not generate files compliant to the PDF/UA standard (ISO 14289-1:2014). Both the "PAC 2021" accessibility checker and the VeraPDF validator (capable of validating a subset of the PDF/UA rules since version 1.18 from April 2021) report rule violations concerning mandatory metadata.
So, if your use case requires you to generate fully PDF/UA-compliant files, at the moment Chrome (and by extension, ChromicPDF) is not going to fulfill your needs.
Furthermore, any operation that involves running the Chrome-generated file through Ghostscript (PDF/A conversion, concatenation) will remove all structural information, so that pdfinfo reports Tagged: no, and thereby prevent assistive tools from functioning properly.
Summary
Types
These options apply to remote Chrome instances only.
These options apply to local Chrome instances only.
Functions
Captures a screenshot.
Returns a specification to start this module as part of a supervision tree.
Converts a PDF to PDF/A (either PDF/A-2b or PDF/A-3b).
Retrieves the currently set name (set using put_dynamic_name/1) or the default name.
Prints a PDF.
Prints a PDF and converts it to PDF/A in a single call.
Activates a particular ChromicPDF instance, which was started with the name option. After calling this function, all calls in the current process will use this instance of ChromicPDF.
Starts ChromicPDF.
Runs a one-off Chrome process to allow Chrome to initialize its caches.
Types
@type capture_screenshot_option() :: {:capture_screenshot, map()} | {:full_page, boolean()} | {:protocol, module()} | navigate_option()
@type deprecated_max_session_uses_option() :: {:max_session_uses, non_neg_integer()}
@type evaluate_option() :: {:evaluate, %{expression: binary()}}
@type ghostscript_pool_option() :: {:size, non_neg_integer()}
@type global_option() :: {:name, atom()} | {:on_demand, boolean()} | session_option() | {:session_pool, [session_pool_option()]} | {:session_pool, named_session_pools()} | {:ghostscript_pool, [ghostscript_pool_option()]} | local_chrome_option() | inet_chrome_option() | deprecated_max_session_uses_option()
@type inet_chrome_option() :: {:chrome_address, {host :: binary(), port :: non_neg_integer()}}
These options apply to remote Chrome instances only.
@type info_option() :: {:info, %{ optional(:title) => binary(), optional(:author) => binary(), optional(:subject) => binary(), optional(:keywords) => binary(), optional(:creator) => binary(), optional(:creation_date) => binary() | DateTime.t(), optional(:mod_date) => binary() | DateTime.t() }}
@type local_chrome_option() :: {:no_sandbox, boolean()} | {:discard_stderr, boolean()} | {:chrome_args, binary() | extended_chrome_args()} | {:chrome_executable, binary()}
These options apply to local Chrome instances only.
@type named_session_pools() :: %{required(atom()) => [session_pool_option()]}
@type output_function() :: (any() -> output_function_result())
@type output_function_result() :: any()
@type output_option() :: {:output, binary()} | {:output, output_function()}
@type path() :: binary()
@type pdf_option() :: {:print_to_pdf, map()} | {:protocol, module()} | navigate_option()
@type pdfa_option() :: {:pdfa_version, binary()} | {:compatibility_level, binary()} | {:pdfa_def_ext, binary()} | {:permit_read, binary()} | info_option()
@type plug_option() :: {:url, url()} | {:forward, plug_forward()}
@type result() :: :ok | {:ok, any()} | {:ok, output_function_result()}
@type session_pool_option() :: session_option() | {:size, non_neg_integer()} | {:max_uses, non_neg_integer()} | {:init_timeout, timeout()} | {:checkout_timeout, timeout()} | {:timeout, timeout()}
@type source() :: source_tuple() | source_and_options()
@type source_and_options() :: %{source: source_tuple(), opts: [pdf_option()]}
@type source_tuple() :: {:url, url()} | {:html, iodata()} | {:plug, [plug_option()]}
@type url() :: binary()
Functions
@spec capture_screenshot(source(), [capture_screenshot_option() | shared_option()]) :: result()
Captures a screenshot.
This call blocks until the screenshot has been created.
Print and return Base64-encoded PNG
{:ok, blob} = ChromicPDF.capture_screenshot({:url, "file:///example.html"})
Custom options for Page.captureScreenshot
Custom options for the Page.captureScreenshot call can be specified by passing a map to the :capture_screenshot option.
ChromicPDF.capture_screenshot(
{:url, "file:///example.html"},
capture_screenshot: %{
format: "jpeg"
}
)
For navigational options (source, cookies, evaluating scripts) see print_to_pdf/2.
You may also use ChromicPDF.Template as an input source for capture_screenshot/2, yet keep in mind that many of the page-related styles do not take effect for screenshots.
Full page screenshots
You can pass the :full_page option to make ChromicPDF increase the viewport dimensions to fit the entire content. This option only works with Chrome version 91 or greater.
ChromicPDF.capture_screenshot(
{:url, "file:///very-long-content.html"},
full_page: true
)
@spec child_spec([global_option()]) :: Supervisor.child_spec()
Returns a specification to start this module as part of a supervision tree.
@spec convert_to_pdfa(path(), [pdfa_option()]) :: result()
Converts a PDF to PDF/A (either PDF/A-2b or PDF/A-3b).
Convert an input PDF and return a Base64-encoded blob
{:ok, blob} = ChromicPDF.convert_to_pdfa("some_pdf_file.pdf")
Convert and write to file
ChromicPDF.convert_to_pdfa("some_pdf_file.pdf", output: "output.pdf")
PDF/A versions & levels
Ghostscript supports both PDF/A-2 and PDF/A-3 versions, both in their b (basic) level. By default, ChromicPDF generates version PDF/A-3b files. Set the pdfa_version option for version 2.
ChromicPDF.convert_to_pdfa("some_pdf_file.pdf", pdfa_version: "2")
Generated files pass the verapdf validation. When you verify this, please pass the corresponding profile arguments (-f 2b or -f 3b).
Specifying PDF metadata
The converter is able to transfer PDF metadata (the Info dictionary) from the original PDF file to the output file. However, files printed by Chrome do not contain any metadata information (except "Creator" being "Chrome").
The :info option of the PDF/A converter allows you to specify metadata for the output file directly.
ChromicPDF.convert_to_pdfa("some_pdf_file.pdf", info: %{creator: "ChromicPDF"})
The converter understands the following keys, all of which accept String values:
:title
:author
:subject
:keywords
:creator
:creation_date
:mod_date
By specification, date values in :creation_date and :mod_date do not need to follow a specific syntax. However, Ghostscript inserts date strings like "D:20200208153049+00'00'" and Info extractor tools might rely on this or another specific format. The converter will automatically format given DateTime values like this.
Both :creation_date and :mod_date are filled with the current date automatically (by Ghostscript), if the original file did not contain any.
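The converter formats DateTime values for you, so the following is purely an illustration of how the quoted date string is laid out. It is a sketch that assumes a UTC timestamp; the module name is hypothetical and this is not the converter's own code.

```elixir
defmodule MyApp.PdfDate do
  # Illustrative only: renders a UTC DateTime in the
  # "D:YYYYMMDDHHMMSS+00'00'" layout quoted above.
  def format(%DateTime{time_zone: "Etc/UTC"} = datetime) do
    Calendar.strftime(datetime, "D:%Y%m%d%H%M%S+00'00'")
  end
end
```

In practice you would simply pass a DateTime in the :info map, e.g. `info: %{creation_date: DateTime.utc_now()}`.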
Adding more PostScript to the conversion
The pdfa_def_ext option can be used to feed more PostScript code into the final conversion step.
ChromicPDF.convert_to_pdfa(
  "some_pdf_file.pdf",
  pdfa_def_ext: "[/Title (OverriddenTitle) /DOCINFO pdfmark"
)
If your extra PostScript requires read permissions for additional files, pass the :permit_read option.
ChromicPDF.convert_to_pdfa(
"some_pdf_file.pdf",
pdfa_def_ext: "custom-postscript",
permit_read: "/some/path",
permit_read: "/some/other/path"
)
Embedded color scheme
Since it is required to embed a color scheme into PDF/A files, ChromicPDF ships with a copy of the royalty-free eciRGB_V2 scheme by the European Color Initiative. If you need to use a different color scheme, please open an issue.
Accessibility
Please note that running a PDF file through Ghostscript removes all structural annotations ("Tags") and hence disables accessibility features of assistive technologies. See the On Accessibility / PDF/UA section for details.
@spec get_dynamic_name() :: atom()
Retrieves the currently set name (set using put_dynamic_name/1) or the default name.
@spec print_to_pdf(source() | [source()], [pdf_option() | shared_option()]) :: result()
Prints a PDF.
This call blocks until the PDF has been created.
Output options
Print and return Base64-encoded PDF
{:ok, blob} = ChromicPDF.print_to_pdf({:url, "file:///example.html"})
# Can be displayed in iframes
"data:application/pdf;base64,#{blob}"
Print to file
:ok = ChromicPDF.print_to_pdf({:url, "file:///example.html"}, output: "output.pdf")
Print to temporary file
{:ok, :some_result} =
ChromicPDF.print_to_pdf({:url, "file:///example.html"}, output: fn path ->
send_download(path)
:some_result
end)
The temporary file passed to the callback will be deleted when the callback returns.
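Because the temporary file does not outlive the callback, anything you want to keep has to be copied or read inside it. A minimal sketch, with a made-up destination path:

```elixir
{:ok, size_in_bytes} =
  ChromicPDF.print_to_pdf({:url, "file:///example.html"},
    output: fn path ->
      # Copy the file somewhere permanent before the callback returns.
      File.cp!(path, "/tmp/kept_copy.pdf")

      # The callback's return value becomes the second element of {:ok, _}.
      File.stat!(path).size
    end
  )
```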
Input options
You can choose between multiple methods of supplying Chrome with the HTML source to print:
- Printing from a URL
- Internal endpoint with request forwarding
- Injecting the HTML markup directly into the DOM through the remote debugging API
Print from URL
Passing in a URL is the simplest way of printing a PDF. A target in Chrome is told to navigate to the given URL. When navigation is finished, the PDF is printed.
ChromicPDF.print_to_pdf({:url, "file:///example.html"})
ChromicPDF.print_to_pdf({:url, "http://example.net"})
ChromicPDF.print_to_pdf({:url, "https://example.net"})
Printing from URL has the benefit of being the tried-and-true solution, as Chrome's content loading works just as you would expect, including its assets cache.
Cookies
If your URL requires authentication, you can pass in a session cookie. The cookie is automatically cleared after the PDF has been printed.
cookie = %{
name: "foo",
value: "bar",
domain: "localhost"
}
ChromicPDF.print_to_pdf({:url, "http://example.net"}, set_cookie: cookie)
See Network.setCookie for options. The name and value keys are required.
Internal endpoint with request forwarding
Serving HTML templates from an internal endpoint allows you to leverage your existing HTTP server and HTML rendering infrastructure. Usually, you will want to render an HTML template from data you have in hand when calling print_to_pdf/2. ChromicPDF.Plug allows you to pass a callback function from the caller to the process serving Chrome's HTTP request.
ChromicPDF.print_to_pdf(
{:plug,
url: "http://localhost:4000/makepdf",
forward:
fn conn ->
# this is executed in the context of the incoming Chrome request
end
}
)
You can also pass a {module, func, [postargs]} tuple to the :forward option.
Print from in-memory HTML
Alternatively, print_to_pdf/2 allows you to pass an in-memory HTML blob to Chrome in a {:html, blob()} tuple. The HTML is sent to the target using the Page.setDocumentContent function.
ChromicPDF.print_to_pdf(
{:html, "<h1>Hello World!</h1>"}
)
This method is useful for setups where Chrome has no network access to the application that hosts ChromicPDF, or you prefer not to have an HTTP server in your application.
Caveats
However, in-memory HTML printing comes with a few caveats.
References to external files in HTML source
Since the document content is replaced without navigating to a URL, Chrome has no way of telling which host it should contact to resolve relative URLs contained in the source.
If your HTML contains markup like
<!-- BAD: relative link to stylesheet in <head> element -->
<head>
<link rel="stylesheet" href="selfhtml.css">
</head>
<!-- BAD: relative link to image -->
<img src="some_logo.png">
... you will need to replace these lines with either absolute URLs or inline data.
Of course, absolute URLs can use the file:// scheme to point to files on the local filesystem, assuming Chrome has access to them. For the purpose of displaying small inline images (e.g. logos), data URLs are a good way of embedding them without the need for an absolute URL.
<!-- GOOD: inline styles -->
<style>
/* ... */
</style>
<!-- GOOD: data URLs -->
<img src="data:image/png;base64,R0lGODdhEA...">
<!-- GOOD: absolute URLs -->
<img src="http://localhost/path/to/image.png">
<img src="file:///path/to/image.png">
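A data URL like the one above can be built with the standard library alone. The helper below is a sketch; the module and function names are hypothetical, and the MIME type has to be supplied by the caller.

```elixir
defmodule MyApp.DataUrl do
  # Builds a data URL from raw bytes, e.g. the contents of a small logo.
  def from_binary(binary, mime_type) when is_binary(binary) do
    "data:#{mime_type};base64," <> Base.encode64(binary)
  end

  # Convenience wrapper that reads the bytes from a file first.
  def from_file(path, mime_type) do
    path |> File.read!() |> from_binary(mime_type)
  end
end
```

The result can be interpolated into the template, e.g. `~s(<img src="#{MyApp.DataUrl.from_file("priv/logo.png", "image/png")}">)`, where the file path is again only an example.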
Content from Phoenix templates
If your content is generated by a Phoenix template (and hence comes in the form of {:safe, iodata()} or %Phoenix.LiveView.Rendered{}), you will need to pass it to Phoenix.HTML.safe_to_string/1 first.
content = SomeView.render("body.html") |> Phoenix.HTML.safe_to_string()
ChromicPDF.print_to_pdf({:html, content})
Concatenating multiple sources
Pass a list of sources as first argument to instruct ChromicPDF to create a PDF file for each source and concatenate these using Ghostscript. This is particularly useful when some sections of your final document require a different page layout than others. You may use ChromicPDF.Template or tuple sources.
[
ChromicPDF.Template.source_and_options(
content: "<h1>First part with header</h1>",
header_height: "20mm",
header: "<p>Some header text</p>"
),
{:html, "second part without header"}
]
|> ChromicPDF.print_to_pdf()
You can pass additional options to print_to_pdf/2 as usual, e.g. :output to control the return value handling.
Individual sources are processed sequentially and eventually concatenated, so expect runtime to increase linearly with the number of sources. The session timeout is applied per source.
Each source emits the normal :print_to_pdf telemetry events. The final concatenation emits :join_pdfs events.
Please note that running PDF files through Ghostscript removes all structural annotations ("Tags") and hence disables accessibility features of assistive technologies. See the On Accessibility / PDF/UA section for details.
Custom options for Page.printToPDF
You can provide custom options for the Page.printToPDF call by passing a map to the :print_to_pdf option.
ChromicPDF.print_to_pdf(
{:url, "file:///example.html"},
print_to_pdf: %{
# Margins are given in inches
marginTop: 0.393701,
marginLeft: 0.787402,
marginRight: 0.787402,
marginBottom: 1.1811,
# Print header and footer (on each page).
# This will print the default templates if none are given.
displayHeaderFooter: true,
# Even on empty string.
# To disable header or footer, pass an empty element.
headerTemplate: "<span></span>",
# Example footer template.
# They are completely unstyled by default and have a font-size of zero,
# so don't despair if they don't show up at first.
# There's a lot of documentation online about how to style them properly,
# this is just a basic example. Also, take a look at the documentation for the
# ChromicPDF.Template module.
# The <span> classes shown below are interpolated by Chrome.
footerTemplate: """
<style>
p {
color: #333;
font-size: 10pt;
text-align: right;
margin: 0 0.787402in;
width: 100%;
z-index: 1000;
}
</style>
<p>
Page <span class="pageNumber"></span> of <span class="totalPages"></span>
</p>
"""
}
)
Please note the camel-case. For a full list of options, please see the Chrome documentation at:
https://chromedevtools.github.io/devtools-protocol/tot/Page#method-printToPDF
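Since the margin options are given in inches, a small conversion helper can make metric layouts easier to read. This is only a sketch with a hypothetical module name; the values in the example above correspond to 10, 20 and 30 millimeters.

```elixir
defmodule MyApp.PdfUnits do
  @mm_per_inch 25.4

  # Converts millimeters to inches, rounded to six decimal places.
  def mm_to_inches(mm) when is_number(mm) do
    Float.round(mm / @mm_per_inch, 6)
  end
end
```

It could then be used as `marginTop: MyApp.PdfUnits.mm_to_inches(10)` inside the :print_to_pdf map.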
Page size and margins
Chrome will use the provided paperWidth and paperHeight dimensions as the PDF paper format. Please be aware that the @page section in the body CSS is not correctly interpreted; see ChromicPDF.Template for a discussion.
Header and footer
Chrome's support for native header and footer sections is a little bit finicky. Still, to the best of my knowledge, Chrome is currently the only well-functioning solution for HTML-to-PDF conversion if you need headers or footers that are repeated on multiple pages even in the presence of body elements stretching across a page break.
In order to make header and footer visible in the first place, you will need to be aware of a couple of caveats:
You cannot use any external (http:// or https://) resources in the header or footer, not even per absolute URL. You need to inline all your CSS and convert your images to data-URLs.
JavaScript is not interpreted either.
HTML for header and footer is interpreted in a new page context, which means no body styles will be applied. In fact, even default browser styles are not present, so all content will have a default font-size of zero, and so on.
You need to make space for the header and footer templates first, by adding page margins. Margins can either be given using the marginTop and marginBottom options or with CSS styles. If you use the options, the height of header and footer elements will inherit these values. If you use CSS styles, make sure to set the height of the elements in CSS as well.
Header and footer have a default padding to the page ends of 0.4 centimeters. To remove this, add the following to header/footer template styles (source).
#header, #footer { padding: 0 !important; }
Header and footer have a default zoom level of 1/0.75, so everything appears to be smaller than in the body when the same styles are applied.
If header or footer are not displayed even though they should be, make sure your HTML is valid. Tuning the margins for an hour looking for mistakes there, only to discover that you are missing a closing </style> tag, can be quite painful.
Background colors are not applied unless you include -webkit-print-color-adjust: exact in your stylesheet.
See print_header_footer_template.html
from the Chromium sources to see how these values are interpreted.
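Several of the caveats above can be handled in one place when the template is built. The following is a minimal sketch, assuming a hypothetical module `FooterSketch` (not part of ChromicPDF): it inlines an image as a data URL, sets an explicit font size, removes the default padding, and enables background colors.

```elixir
defmodule FooterSketch do
  # Hypothetical helper, not part of ChromicPDF.

  # Encodes image bytes as a data URL, since external resources are not loaded.
  def data_url(bytes, mime_type) when is_binary(bytes) do
    "data:#{mime_type};base64,#{Base.encode64(bytes)}"
  end

  # Returns a self-contained footer: inline CSS only, explicit font size,
  # default padding removed, background colors enabled.
  def footer(logo_data_url) do
    """
    <style>
      #header, #footer { padding: 0 !important; }
      .footer {
        -webkit-print-color-adjust: exact;
        background-color: #eee;
        font-size: 10pt;
        width: 100%;
        text-align: right;
      }
      .footer img { height: 10pt; }
    </style>
    <div class="footer">
      <img src="#{logo_data_url}">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>
    """
  end
end

# Placeholder bytes; in practice these would be the contents of an image file.
logo = FooterSketch.data_url(<<1, 2, 3>>, "image/png")
footer_template = FooterSketch.footer(logo)
```

The resulting string could be passed as footerTemplate, together with displayHeaderFooter: true and a bottom margin large enough to hold it.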
Dynamic Content
Evaluate script before printing
In case your print source is generated by client-side scripts, for instance to render graphics or load additional resources, you can trigger these by evaluating a JavaScript expression before the PDF is printed.
evaluate = %{
  expression: """
  document.querySelector('body').innerHTML = 'hello world';
  """
}
ChromicPDF.print_to_pdf({:url, "http://example.net"}, evaluate: evaluate)
If your script returns a Promise, Chrome will wait for it to be resolved.
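As a sketch of the Promise case, the expression below evaluates to a Promise that resolves once the page's fonts have loaded. `document.fonts.ready` is a standard browser API; that your page benefits from waiting on fonts, and the `fonts-loaded` class name, are assumptions made for illustration.

```elixir
# JavaScript evaluated in the page before printing. The value of the expression
# is a Promise, so Chrome waits for it to resolve before the PDF is printed.
evaluate = %{
  expression: """
  document.fonts.ready.then(() => {
    document.body.classList.add('fonts-loaded');
  });
  """
}

# Requires a running ChromicPDF instance:
# ChromicPDF.print_to_pdf({:url, "http://example.net"}, evaluate: evaluate)
```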
Wait for attribute on element
Some JavaScript libraries signal their successful initialization to the user by setting an
attribute on a DOM element. The wait_for
option allows you to wait for this attribute to
be set before printing. It evaluates a script that repeatedly queries the element given by
the query selector and tests whether it has the given attribute.
wait_for = %{
  selector: "#my-element",
  attribute: "ready-to-print"
}
ChromicPDF.print_to_pdf({:url, "http://example.net"}, wait_for: wait_for)
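For illustration, the page side of this handshake might look like the sketch below: a script finishes its (here simulated) work and then sets the attribute that wait_for is polling for. The markup, the 500 ms delay, and the use of an `{:html, ...}` source in the commented call are assumptions, not taken from the text above.

```elixir
# Page source whose script sets the attribute once its simulated work is done.
html = """
<div id="my-element">Loading</div>
<script>
  setTimeout(() => {
    const el = document.querySelector('#my-element');
    el.textContent = 'Ready';
    el.setAttribute('ready-to-print', 'true');
  }, 500);
</script>
"""

wait_for = %{selector: "#my-element", attribute: "ready-to-print"}

# Requires a running ChromicPDF instance:
# ChromicPDF.print_to_pdf({:html, html}, wait_for: wait_for)
```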
@spec print_to_pdfa(source() | [source()], [ pdf_option() | pdfa_option() | shared_option() ]) :: result()
Prints a PDF and converts it to PDF/A in a single call.
See print_to_pdf/2
and convert_to_pdfa/2
for options.
Example
ChromicPDF.print_to_pdfa({:url, "https://example.net"})
Activates a particular ChromicPDF instance that was started with the name option.
After calling this function, all calls in the current process will use this instance of ChromicPDF.
You can use this function if you need to run ChromicPDF as part of a supervision tree with a particular name, for example:
defmodule MySupervisor do
  use Supervisor

  @impl true
  def init(opts) do
    children = [
      # other apps...
      {ChromicPDF, name: MyName}
    ]

    Supervisor.init(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
Returns the previously set name or the default name.
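The per-process behaviour described above can be pictured with the process dictionary. The sketch below is an analogy under that assumption, not ChromicPDF's actual implementation; `DynamicNameSketch` and `:default_instance` are hypothetical names.

```elixir
defmodule DynamicNameSketch do
  # Hypothetical illustration of a per-process "active instance" setting.
  @key {__MODULE__, :dynamic_name}
  @default :default_instance

  # Stores the name for the calling process only and returns the
  # previously set name, or the default if none was set.
  def put_dynamic_name(name) when is_atom(name) do
    Process.put(@key, name) || @default
  end

  # Reads the name that calls made from the current process would use.
  def get_dynamic_name, do: Process.get(@key, @default)
end

previous = DynamicNameSketch.put_dynamic_name(:my_name)
current = DynamicNameSketch.get_dynamic_name()

# A freshly spawned process does not inherit the setting.
other = Task.await(Task.async(fn -> DynamicNameSketch.get_dynamic_name() end))
```

In this sketch the setting is not shared with other processes, so a newly spawned process has to activate the instance itself.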
@spec start_link([global_option()]) :: Supervisor.on_start()
Starts ChromicPDF.
"On Demand" mode
If the given config includes the on_demand: true
flag, this will not spawn a Chrome
instance but instead hold the configuration in an Agent until a PDF print job is triggered.
The print job will launch a temporary browser process and perform a graceful shutdown at
the end.
Please note that the browser process is spawned from your client process and that these
processes are linked. If your client process is trapping EXIT
signals, you will receive
a message when the browser is terminated.
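The note on EXIT signals follows from standard process linking. The sketch below uses a plain linked process as a stand-in for the temporary browser process (`ExitTrapSketch` is hypothetical and does not launch Chrome) to show the message a trapping client receives.

```elixir
defmodule ExitTrapSketch do
  # Hypothetical illustration; the spawned process stands in for the browser.
  def run_job do
    # Trap exits for the duration of the job, remembering the previous setting.
    previous = Process.flag(:trap_exit, true)

    # A linked process that finishes right away, like a browser shutting down.
    browser = spawn_link(fn -> :ok end)

    result =
      receive do
        {:EXIT, ^browser, reason} -> {:browser_exited, reason}
      after
        1_000 -> :no_exit_message
      end

    Process.flag(:trap_exit, previous)
    result
  end
end

result = ExitTrapSketch.run_job()
```

A client that traps exits, such as a GenServer with trap_exit enabled, would need to handle or ignore such a message.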
@spec warm_up([local_chrome_option()]) :: {:ok, binary()}
Runs a one-off Chrome process to allow Chrome to initialize its caches.
On some infrastructure (notably, GitHub Actions), Chrome occasionally takes a long nap between process launch and first replying to DevTools commands. If you happen to print a PDF in the meantime (that is, before any sessions have been spawned by the session pool), the session checkout will fail with a timeout error:
Caught EXIT signal from NimblePool.checkout!/4
** (EXIT) time out
This function mitigates the issue by launching a Chrome process via a shell command, bypassing ChromicPDF's internals.
Usage
# in your test_helper.exs
{:ok, _} = ChromicPDF.warm_up()
...
ExUnit.start()
Options
This function accepts all options of print_to_pdf/2 related to the external Chrome process.
If you pass discard_stderr: false, Chrome's standard error is returned.
{:ok, stderr} = ChromicPDF.warm_up(discard_stderr: false)
IO.inspect(stderr, label: "chrome stderr")
Mix Task
Alternatively, you can run a mix task as part of your CI script; see Mix.Tasks.ChromicPdf.WarmUp. The task currently does not accept any options.
...
$ mix chromic_pdf.warm_up
$ mix test