NLdoc.Conversion.Reader.Docx.Convert (NLdoc.Conversion.Reader.Docx v1.0.43)


Converts a Docx struct into an NLdoc spec document, raising any errors that occur during the conversion.

This module implements a recursive conversion of the XML structure of a Word document into an NLdoc spec document. Its implementation in NLdoc was first written by Stephan Meijer and later refactored by Bart van Oort.

Huge thanks to Pandoc for inspiration and guidance on how to convert Word documents to a structured format.

Supported Features

  • [x] Title elements (converted to headings)
  • [x] Headings
  • [x] Text Paragraphs
  • [x] Definition lists, details and terms
  • [x] Preformatted blocks, also known as "Source Code"
  • [x] Text in Wingdings, Webdings and Symbol fonts is converted to Unicode equivalents
  • [-] Hyperlinks (missing support for w:r with w:instrText child that contains a hyperlink field code)
  • [-] Tables (missing support for colspan and rowspan, table captions)
  • [-] Ordered and unordered lists (Wingdings as list characters are not implemented)
  • [-] Images (only in w:drawing; media is not yet exported to assets, and captions are not supported)
  • [ ] Captions
  • [ ] LineBreak
  • [ ] Alternate content

Summary

Types

conversion()

@type conversion() ::
  {resources :: [NLdoc.Spec.object()],
   state :: NLdoc.Conversion.Reader.Docx.State.t()}

Functions

convert(docx)

convert(list, acc)

convert(arg1, acc, docx)

@spec convert(
  elements :: Saxy.XML.element() | [Saxy.XML.element()],
  accumulator :: conversion(),
  docx :: NLdoc.Conversion.Reader.Docx.t()
) :: conversion()
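
The spec above suggests a fold: each call takes one element or a list of elements plus a conversion() accumulator and returns an updated accumulator. The following is only a minimal sketch of that shape, not the module's actual implementation. The module name ConvertSketch, the paragraph map, and the :paragraphs counter are hypothetical, and the elements are plain {name, attributes, children} tuples with string children, which is assumed to match how Saxy represents elements.

```elixir
defmodule ConvertSketch do
  # Hypothetical stand-in used for illustration only; it does not reproduce
  # the real NLdoc.Conversion.Reader.Docx.Convert logic.

  # A list of elements: fold over it, threading the accumulator through.
  def convert(elements, acc, docx) when is_list(elements) do
    Enum.reduce(elements, acc, fn element, inner -> convert(element, inner, docx) end)
  end

  # A paragraph: gather all character data below it and emit one resource.
  def convert({"w:p", _attrs, children}, {resources, state}, _docx) do
    paragraph = %{type: :paragraph, text: text_of(children)}
    {resources ++ [paragraph], Map.update(state, :paragraphs, 1, &(&1 + 1))}
  end

  # Any other element: recurse into its children.
  def convert({_name, _attrs, children}, acc, docx), do: convert(children, acc, docx)

  # Character data and unknown nodes outside a paragraph leave the accumulator unchanged.
  def convert(_other, acc, _docx), do: acc

  defp text_of(children) when is_list(children), do: Enum.map_join(children, "", &text_of/1)
  defp text_of({_name, _attrs, children}), do: text_of(children)
  defp text_of({:characters, text}), do: text
  defp text_of(text) when is_binary(text), do: text
  defp text_of(_other), do: ""
end

# Usage: a hand-built tree in place of parsed document.xml; `nil` stands in
# for the Docx struct, which this sketch never reads.
tree =
  {"w:body", [],
   [
     {"w:p", [], [{"w:r", [], [{"w:t", [], ["Hello"]}]}]},
     {"w:p", [], [{"w:r", [], [{"w:t", [], ["world"]}]}]}
   ]}

{resources, state} = ConvertSketch.convert(tree, {[], %{}}, nil)
```

Threading a {resources, state} tuple through every clause lets each element handler both append output and update shared conversion state without any mutable data, which is one plausible reason for the accumulator-based signature.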