PDFInfo (PDFInfo v0.1.17)
Extracts all /Info and /Metadata objects from a PDF binary using regular expressions, with zero dependencies.
Limitations: if the PDF is encrypted or its metadata is compressed, you must first decrypt and uncompress the file, for example with qpdf:
qpdf --stream-data=uncompress --compress-streams=n --decrypt --password='' myfile.pdf myfile_out.pdf
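The same preprocessing step could be driven from Elixir. The following is a minimal sketch, assuming qpdf is installed and on the PATH; `QpdfSketch`, `args/3`, and `prepare/3` are hypothetical names for illustration and are not part of PDFInfo. Only `System.cmd/3` from the standard library is used.

```elixir
defmodule QpdfSketch do
  # Hypothetical helper, not part of PDFInfo: builds the argument list
  # for the qpdf invocation shown above.
  def args(input, output, password \\ "") do
    [
      "--stream-data=uncompress",
      "--compress-streams=n",
      "--decrypt",
      "--password=" <> password,
      input,
      output
    ]
  end

  # Runs qpdf and returns :ok on exit status 0, {:error, output} otherwise.
  # Assumes the qpdf executable can be found on the PATH.
  def prepare(input, output, password \\ "") do
    case System.cmd("qpdf", args(input, output, password), stderr_to_stdout: true) do
      {_out, 0} -> :ok
      {out, _status} -> {:error, out}
    end
  end
end
```

After `QpdfSketch.prepare("myfile.pdf", "myfile_out.pdf")` returns `:ok`, the output file can be read with `File.read!/1` and passed to the functions documented below.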
Functions
Returns a list of /Encrypt reference strings.
Examples
iex> PDFInfo.encrypt_refs(binary)
["/Encrypt 52 0 R"]
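To illustrate the regex-based approach, here is a minimal sketch of how such reference strings might be collected. The pattern and the `EncryptRefSketch` module are assumptions for illustration; the library's actual expression may differ.

```elixir
defmodule EncryptRefSketch do
  # Assumed pattern: "/Encrypt", an object number, a generation number, "R".
  # No unicode modifier is used, so arbitrary (non-UTF-8) PDF bytes are fine.
  def encrypt_refs(binary) when is_binary(binary) do
    ~r{/Encrypt\s+\d+\s+\d+\s+R}
    |> Regex.scan(binary)
    |> Enum.map(&hd/1)
  end

  # True when at least one /Encrypt reference is present.
  def encrypted?(binary), do: encrypt_refs(binary) != []
end

trailer = "trailer\n<< /Size 60 /Root 2 0 R /Encrypt 52 0 R >>"
IO.inspect(EncryptRefSketch.encrypt_refs(trailer))
# prints ["/Encrypt 52 0 R"]
```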
Maps /Info reference strings to objects and parses the objects.
Examples
iex> PDFInfo.info_objects(binary)
%{
  "/Info 1 0 R" => [
    %{
      "Author" => "The PostgreSQL Global Development Group",
      "CreationDate" => "D:20200212212756Z",
      ...
    }
  ]
}
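Because the result maps each reference string to a list of parsed objects, reading a single field means walking two levels. The sketch below works on a literal copy of the shape shown in the example; `InfoSketch.values/2` is a hypothetical helper, not part of PDFInfo.

```elixir
# Literal copy of the result shape from the example (truncated to two keys).
info = %{
  "/Info 1 0 R" => [
    %{
      "Author" => "The PostgreSQL Global Development Group",
      "CreationDate" => "D:20200212212756Z"
    }
  ]
}

defmodule InfoSketch do
  # Hypothetical helper: collects every value stored under `key`
  # across all references and all of their parsed objects.
  def values(info_objects, key) do
    for {_ref, objects} <- info_objects,
        object <- objects,
        Map.has_key?(object, key),
        do: Map.fetch!(object, key)
  end
end

IO.inspect(InfoSketch.values(info, "Author"))
# prints ["The PostgreSQL Global Development Group"]
```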
Returns a list of /Info reference strings.
Examples
iex> PDFInfo.info_refs(binary)
["/Info 1 0 R"]
Returns true if the PDF has at least one /Encrypt reference.
Returns false if the PDF has no /Encrypt reference.
Checks if the binary contains the PDF header.
The PDF header does not have to be at offset 0; it can be anywhere in the first 1024 bytes.
Returns true if the PDF header is found there.
Returns false otherwise.
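A check of this kind can be sketched as a search restricted to the first 1024 bytes. This is an illustration under assumptions, not the library's implementation: `HeaderSketch` and `pdf?/1` are hypothetical names, and the sketch treats the marker `%PDF-` as the header.

```elixir
defmodule HeaderSketch do
  @search_window 1024

  # Looks for the "%PDF-" marker anywhere within the first 1024 bytes.
  def pdf?(binary) when is_binary(binary) do
    window = binary_part(binary, 0, min(byte_size(binary), @search_window))
    :binary.match(window, "%PDF-") != :nomatch
  end
end

IO.inspect(HeaderSketch.pdf?("%PDF-1.5\n"))
IO.inspect(HeaderSketch.pdf?("not a pdf"))
```

A header that begins beyond the 1024-byte window, or straddles its end, is not detected by this sketch.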
Maps /Metadata reference strings to objects and parses the objects.
Examples
iex> PDFInfo.metadata_objects(binary)
[
  %{
    {"dc", "format"} => "application/pdf",
    {"pdf", "Producer"} => "Adobe PDF Library 15.0",
    {"xmp", "CreateDate"} => "2018-06-06T17:02:53+02:00",
    {"xmp", "CreatorTool"} => "Acrobat PDFMaker 17 für Word",
    {"xmp", "MetadataDate"} => "2018-06-06T17:03:13+02:00",
    {"xmp", "ModifyDate"} => "2018-06-06T17:03:13+02:00",
    ...
  }
]
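The keys in the example are `{namespace, name}` tuples, so entries can be selected by namespace with ordinary pattern matching. The following sketch assumes the result shape shown in the example; `MetadataSketch.by_namespace/2` is a hypothetical helper, not part of PDFInfo.

```elixir
defmodule MetadataSketch do
  # Hypothetical helper: keeps only entries whose key tuple starts with
  # `namespace` and returns them as a map keyed by the plain name.
  def by_namespace(metadata_maps, namespace) do
    for map <- metadata_maps,
        {{^namespace, name}, value} <- map,
        into: %{},
        do: {name, value}
  end
end

# Literal copy of part of the example result.
metadata = [
  %{
    {"dc", "format"} => "application/pdf",
    {"pdf", "Producer"} => "Adobe PDF Library 15.0",
    {"xmp", "CreateDate"} => "2018-06-06T17:02:53+02:00"
  }
]

IO.inspect(MetadataSketch.by_namespace(metadata, "xmp"))
# prints %{"CreateDate" => "2018-06-06T17:02:53+02:00"}
```

A single value can also be read directly, for example `Map.get(hd(metadata), {"dc", "format"})`.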
Returns a list of /Metadata reference strings.
Examples
iex> PDFInfo.metadata_refs(binary)
["/Metadata 5 0 R"]
Extracts the PDF version from the PDF header.
The PDF header can be anywhere in the first 1024 bytes.
Returns {:ok, version} if a valid PDF header is found.
Returns :error if no valid PDF header is found.
Examples
iex> PDFInfo.pdf_version(binary)
{:ok, "1.5"}
iex> PDFInfo.pdf_version("not a pdf")
:error
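The behaviour described above can be approximated with a single regular expression applied to the first 1024 bytes. This is a sketch under assumptions: the `VersionSketch` module and the pattern `%PDF-(\d\.\d)` are illustrative and may not match the expression the library actually uses.

```elixir
defmodule VersionSketch do
  # Searches the first 1024 bytes for "%PDF-<digit>.<digit>" and
  # returns the captured version string.
  def pdf_version(binary) when is_binary(binary) do
    window = binary_part(binary, 0, min(byte_size(binary), 1024))

    case Regex.run(~r/%PDF-(\d\.\d)/, window) do
      [_header, version] -> {:ok, version}
      nil -> :error
    end
  end
end

IO.inspect(VersionSketch.pdf_version("%PDF-1.5\n%rest of file"))
# prints {:ok, "1.5"}
IO.inspect(VersionSketch.pdf_version("not a pdf"))
# prints :error
```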
Maps the /Info reference strings to the raw objects.
Examples
iex> PDFInfo.raw_info_objects(binary)
%{"/Info 1 0 R" => ["1 0 obj <<..."]}
Maps the /Metadata reference strings to the raw objects.
Examples
iex> PDFInfo.raw_metadata_objects(binary)
["<x:xmpmeta" <> ...]