PDFInfo v0.1.14
Extracts all /Info and /Metadata objects from a PDF binary using Regex and without any dependencies.
Limitations: if the PDF is encrypted or the metadata is compressed, you first have to decrypt and uncompress the file, for example with qpdf:
qpdf --stream-data=uncompress --compress-streams=n --decrypt --password='' myfile.pdf myfile_out.pdf
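If the preprocessing step has to run from Elixir rather than from a shell, the command above could be wrapped roughly as follows. This is only a sketch: QpdfSketch and its functions are hypothetical names that are not part of PDFInfo, and it assumes the qpdf executable is on the PATH (System.cmd/3 raises if it is not).

```elixir
defmodule QpdfSketch do
  # Hypothetical helper, not part of PDFInfo.

  # Builds the argument list for the qpdf command line shown above.
  # The shell form --password='' becomes the bare "--password=" here,
  # because System.cmd/3 does not go through a shell.
  def args(input, output) do
    [
      "--stream-data=uncompress",
      "--compress-streams=n",
      "--decrypt",
      "--password=",
      input,
      output
    ]
  end

  # Runs qpdf and returns {:ok, output} or {:error, {exit_status, message}}.
  # Assumption: exit status 3 (success with warnings) is treated as success.
  def uncompress(input, output) do
    case System.cmd("qpdf", args(input, output), stderr_to_stdout: true) do
      {_message, status} when status in [0, 3] -> {:ok, output}
      {message, status} -> {:error, {status, message}}
    end
  end
end
```

The resulting file can then be read with File.read!/1 and passed to the functions below.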
Summary
Functions
Returns a list of /Encrypt reference strings.
Maps /Info reference strings to objects and parses the objects.
Returns a list of /Info reference strings.
Returns true if the PDF has at least one /Encrypt reference, false otherwise.
Checks if the binary starts with the PDF header.
Maps /Metadata reference strings to objects and parses the objects.
Returns a list of /Metadata reference strings.
Extracts PDF version from the PDF header.
Maps the /Info reference strings to the raw objects.
Maps the /Metadata reference strings to the raw objects.
Functions
encrypt_refs(binary)
Returns a list of /Encrypt reference strings.
Examples
iex> PDFInfo.encrypt_refs(binary)
["/Encrypt 52 0 R"]
info_objects(binary)
Maps /Info reference strings to objects and parses the objects.
Examples
iex> PDFInfo.info_objects(binary)
%{
"/Info 1 0 R" => [
%{
"Author" => "The PostgreSQL Global Development Group",
"CreationDate" => "D:20200212212756Z",
...
}
]
}
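To make the parsing step more concrete, here is a rough sketch of how a raw /Info object might be turned into a map like the one above. InfoDictSketch is a hypothetical module, not PDFInfo's implementation, and the sample object is invented for illustration. It only handles literal strings in parentheses and ignores hex strings, escaped or nested parentheses, and indirect references.

```elixir
defmodule InfoDictSketch do
  # Hypothetical helper, not part of PDFInfo.
  # Collects "/Key (literal string)" pairs from a raw /Info object.
  # Simplification: no hex strings, no escaped or nested parentheses.
  def parse(raw) when is_binary(raw) do
    ~r{/(\w+)\s*\(([^)]*)\)}
    |> Regex.scan(raw, capture: :all_but_first)
    |> Map.new(fn [key, value] -> {key, value} end)
  end
end

# Invented sample object, for illustration only.
raw = "1 0 obj << /Author (Jane Doe) /CreationDate (D:20200212212756Z) >> endobj"
InfoDictSketch.parse(raw)
```

For the sample object this yields a map with the keys "Author" and "CreationDate".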
info_refs(binary)
Returns a list of /Info reference strings.
Examples
iex> PDFInfo.info_refs(binary)
["/Info 1 0 R"]
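Reference strings such as "/Info 1 0 R" are indirect references: a name followed by an object number, a generation number, and the letter R. A regex-based lookup could look roughly like this. RefSketch is a hypothetical module and the pattern is an assumption about what such a scan needs, not PDFInfo's actual regex.

```elixir
defmodule RefSketch do
  # Hypothetical helper, not part of PDFInfo.
  # Finds indirect references of the form "/Name <object> <generation> R"
  # and returns each distinct match once, in order of first appearance.
  def refs(binary, name) when is_binary(binary) and is_binary(name) do
    pattern = Regex.compile!("/" <> Regex.escape(name) <> "\\s+\\d+\\s+\\d+\\s+R\\b")

    pattern
    |> Regex.scan(binary)
    |> List.flatten()
    |> Enum.uniq()
  end
end

# Invented trailer fragment, for illustration only.
trailer = "trailer << /Root 2 0 R /Info 1 0 R >> ... /Info 1 0 R"
RefSketch.refs(trailer, "Info")
```

The regex is compiled without the unicode modifier, so it works on raw bytes and does not require the PDF to be valid UTF-8.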
Returns true if the PDF has at least one /Encrypt reference, false otherwise.
Checks if the binary contains the PDF header.
The PDF header does not have to be at byte 0; it can be anywhere in the first 1024 bytes.
Returns true if the PDF header is found there.
Returns false otherwise.
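The 1024-byte rule can be sketched as a bounded search. HeaderSketch and its function name are hypothetical, and the header pattern (%PDF- followed by a single-digit major and minor version) is an assumption, not necessarily the one PDFInfo uses.

```elixir
defmodule HeaderSketch do
  # Hypothetical helper, not part of PDFInfo.
  @window 1024

  # Returns true if a "%PDF-x.y" marker occurs in the first 1024 bytes.
  def pdf_header?(binary) when is_binary(binary) do
    # Look only at the leading window; shorter inputs are used as-is.
    head = binary_part(binary, 0, min(byte_size(binary), @window))
    Regex.match?(~r/%PDF-\d\.\d/, head)
  end
end
```

Restricting the search to the leading window keeps the check cheap for large files and avoids matching a header-like string deep inside the document.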
metadata_objects(binary)
Maps /Metadata reference strings to objects and parses the objects.
Examples
iex> PDFInfo.metadata_objects(binary)
%{
"/Metadata 285 0 R" => [
%{
{"dc", "format"} => "application/pdf",
{"pdf", "Producer"} => "Adobe PDF Library 15.0",
{"xmp", "CreateDate"} => "2018-06-06T17:02:53+02:00",
{"xmp", "CreatorTool"} => "Acrobat PDFMaker 17 für Word",
{"xmp", "MetadataDate"} => "2018-06-06T17:03:13+02:00",
{"xmp", "ModifyDate"} => "2018-06-06T17:03:13+02:00",
...
}
]
}
metadata_refs(binary)
Returns a list of /Metadata reference strings.
Examples
iex> PDFInfo.metadata_refs(binary)
["/Metadata 5 0 R"]
pdf_version(binary)
Extracts the PDF version from the PDF header.
The PDF header can be anywhere in the first 1024 bytes.
Returns {:ok, version} if the PDF header is correct.
Returns :error if the PDF header is incorrect.
Examples
iex> PDFInfo.pdf_version(binary)
{:ok, "1.5"}
iex> PDFInfo.pdf_version("not a pdf")
:error
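The return shape documented above can be reproduced with a capture group on the header. VersionSketch is a hypothetical module and the pattern is an assumption; it is meant as a sketch of the documented behaviour, not as PDFInfo's code.

```elixir
defmodule VersionSketch do
  # Hypothetical helper, not part of PDFInfo.
  # Returns {:ok, version} if a "%PDF-x.y" header is found in the
  # first 1024 bytes, :error otherwise.
  def version(binary) when is_binary(binary) do
    head = binary_part(binary, 0, min(byte_size(binary), 1024))

    case Regex.run(~r/%PDF-(\d\.\d)/, head, capture: :all_but_first) do
      [version] -> {:ok, version}
      nil -> :error
    end
  end
end
```

Returning a tagged tuple lets callers pattern match on the result, for example in a case or with expression.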
raw_info_objects(binary)
Maps the /Info reference strings to the raw objects.
Examples
iex> PDFInfo.raw_info_objects(binary)
%{"/Info 1 0 R" => ["1 0 obj <<..."]}
raw_metadata_objects(binary)
Maps the /Metadata reference strings to the raw objects.
Examples
iex> PDFInfo.raw_metadata_objects(binary)
["<x:xmpmeta" <> ...]