Docxir.XmlExtractor (docxir v0.1.0)
Extracts XML content from DOCX files.
DOCX files are ZIP archives containing XML documents. This module handles the extraction of the main document XML from the archive.
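Because a DOCX file is an ordinary ZIP archive, the main document part can in principle be read with Erlang's built-in `:zip` module. The sketch below shows one way to do that; it is not taken from the Docxir source. The module name `DocxSketch`, the function `read_document_xml/1`, and the `:document_xml_not_found` reason are hypothetical, and the entry name `word/document.xml` assumes the conventional location of the main document part.

```elixir
defmodule DocxSketch do
  # Hypothetical helper, not part of Docxir.
  # Assumes the main document part lives at the conventional entry name.
  @entry ~c"word/document.xml"

  def read_document_xml(docx_path) do
    # :zip expects a charlist filename; to_charlist/1 accepts strings and charlists.
    archive = to_charlist(docx_path)

    # :memory returns file contents as binaries instead of writing them to disk;
    # :file_list restricts extraction to the single entry we care about.
    case :zip.unzip(archive, [:memory, {:file_list, [@entry]}]) do
      {:ok, [{_name, xml}]} -> {:ok, xml}
      {:ok, []} -> {:error, :document_xml_not_found}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Extracting to memory avoids leaving temporary files behind, which suits a function that only needs to hand back the XML as a binary.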
Summary
Functions
extract(docx_path)
Extracts the document.xml content from a DOCX file.
extract!(docx_path)
Extracts the document.xml content from a DOCX file, raising on error.
Functions
extract(docx_path)
Extracts the document.xml content from a DOCX file.
Parameters
docx_path - Path to the DOCX file as a string or charlist
Returns
{:ok, xml_content} - The XML content as a binary
{:error, reason} - Error tuple with reason
Examples
iex> Docxir.XmlExtractor.extract("contract.docx")
{:ok, "<?xml version=\"1.0\"..."}
iex> Docxir.XmlExtractor.extract("nonexistent.docx")
{:error, :enoent}
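Callers will usually pattern match on the two documented return shapes. A minimal sketch of that handling follows; `ExtractResultSketch` and `describe/1` are hypothetical names used only for illustration, and the wording of the messages is arbitrary.

```elixir
defmodule ExtractResultSketch do
  # Hypothetical helper: turns the documented result tuples into a log line.
  def describe({:ok, xml}) when is_binary(xml) do
    "extracted #{byte_size(xml)} bytes of XML"
  end

  def describe({:error, reason}) do
    # inspect/1 keeps this safe for any reason term, not just atoms.
    "extraction failed: #{inspect(reason)}"
  end
end

# With docxir available, the result of extract/1 can be piped straight in:
#   "contract.docx"
#   |> Docxir.XmlExtractor.extract()
#   |> ExtractResultSketch.describe()
```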
extract!(docx_path)
Extracts the document.xml content from a DOCX file, raising on error.
Parameters
docx_path - Path to the DOCX file
Returns
The XML content as a binary
Raises
File.Error - if the file cannot be read or is not a valid DOCX
Examples
iex> Docxir.XmlExtractor.extract!("contract.docx")
"<?xml version=\"1.0\"..."
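A raising variant like this is conventionally a thin wrapper that unwraps the `{:ok, value}` tuple and raises otherwise. The sketch below illustrates that convention, along with rescuing the documented `File.Error`. It is an assumption about the general pattern, not a copy of the Docxir implementation; `BangSketch`, `unwrap!/2`, and the `action` text are hypothetical.

```elixir
defmodule BangSketch do
  # Hypothetical helper: unwraps a result tuple or raises File.Error.
  def unwrap!({:ok, xml}, _path), do: xml

  def unwrap!({:error, reason}, path) do
    # File.Error carries the reason, a description of the action, and the path.
    raise File.Error,
      reason: reason,
      action: "extract document.xml from",
      path: to_string(path)
  end
end

# Rescuing the exception at a call site:
try do
  BangSketch.unwrap!({:error, :enoent}, "nonexistent.docx")
rescue
  e in File.Error ->
    IO.puts("could not read DOCX: #{Exception.message(e)}")
end
```

The raising form suits scripts and pipelines where a missing or invalid file should abort the run; the tuple-returning form is the better fit when the caller wants to recover.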