Xlsxir v1.6.4 Xlsxir.SaxParser
Provides SAX (Simple API for XML) parsing of the XML files contained in an .xlsx
file via the Erlsom Erlang library. SAX is an event-driven approach that parses
large XML files in chunks, avoiding the need to load the entire DOM into memory. The current chunk size is set to 10,000.
Summary
Functions
Parses XmlFile (xl/worksheets/sheet#{n}.xml at index n, xl/styles.xml, xl/workbook.xml or xl/sharedStrings.xml) using SAX parsing. An Erlang Term Storage (ETS) process is started to hold the state of data parsed. The style and sharedstring XML files (if they exist) must be parsed first in order for the worksheet parser to successfully complete.
Functions
parse(xml_file, type, excel \\ nil)
Parses XmlFile (xl/worksheets/sheet#{n}.xml at index n, xl/styles.xml, xl/workbook.xml or xl/sharedStrings.xml) using SAX parsing. An Erlang Term Storage (ETS) process is started to hold the state of data parsed. The style and sharedstring XML files (if they exist) must be parsed first in order for the worksheet parser to successfully complete.
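The ordering requirement follows from the .xlsx format itself: a string cell in a worksheet holds only an index into xl/sharedStrings.xml, so the string state has to exist before worksheet cells can be resolved. The following is a minimal sketch of that dependency using plain ETS calls. It is not Xlsxir's internal code; the table name, the `raw_cells` tuples and the `resolve` function are hypothetical and exist only for illustration.

```elixir
# Hypothetical stand-in for the shared-strings state: index => string.
strings = :ets.new(:shared_strings, [:set, :public])
:ets.insert(strings, [{0, "string one"}, {1, "string two"}])

# Hypothetical raw worksheet cells as {reference, kind, raw value}.
# String cells carry an index into the shared-strings table, not the text.
raw_cells = [
  {"A1", :shared_string, 0},
  {"B1", :shared_string, 1},
  {"C1", :number, 10}
]

# Resolve each cell; string cells need the shared-strings table to exist.
resolve = fn
  {ref, :shared_string, index} ->
    [{^index, value}] = :ets.lookup(strings, index)
    [ref, value]

  {ref, :number, value} ->
    [ref, value]
end

row = Enum.map(raw_cells, resolve)
```

If the shared-strings table were empty at this point, the lookup for "A1" would return `[]` and the match would fail, which is the situation the required parse order avoids.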
Parameters
- content - XML string to parse
- type - file type identifier (:worksheet, :style, :workbook or :string) of the XML file to be parsed
- max_rows - the maximum number of rows in this worksheet that should be parsed
Example
An example file named test.xlsx located in ./test/test_data contains the following in the worksheet at index 0:
- cell 'A1' -> "string one"
- cell 'B1' -> "string two"
- cell 'C1' -> integer of 10
- cell 'D1' -> formula of =4*5
- cell 'E1' -> date of 1/1/2016 or Excel date serial of 42370

The .xlsx file contents have been extracted to ./test/test_data/test. For purposes of this example, we utilize the get_at/1 function of each ETS process module to pull a sample of the parsed data. Keep in mind that the worksheet data is stored in the ETS process as a list of row lists, so the Xlsxir..get_row/2 function will return a full row of values.

    iex> {:ok, %Xlsxir.ParseStyle{tid: tid1}, _} = Xlsxir.SaxParser.parse(%Xlsxir.XmlFile{content: File.read!("./test/test_data/test/xl/styles.xml")}, :style)
    iex> :ets.lookup(tid1, 0)
    [{0, nil}]
    iex> {:ok, %Xlsxir.ParseString{tid: tid2}, _} = Xlsxir.SaxParser.parse(%Xlsxir.XmlFile{content: File.read!("./test/test_data/test/xl/sharedStrings.xml")}, :string)
    iex> :ets.lookup(tid2, 0)
    [{0, "string one"}]
    iex> {:ok, %Xlsxir.ParseWorkbook{tid: tid3}, _} = Xlsxir.SaxParser.parse(%Xlsxir.XmlFile{content: File.read!("./test/test_data/test/xl/workbook.xml")}, :workbook)
    iex> :ets.lookup(tid3, 1)
    [{1, "Sheet1"}]
    iex> {:ok, %Xlsxir.ParseWorksheet{tid: tid4}, _} = Xlsxir.SaxParser.parse(%Xlsxir.XmlFile{name: "sheet1.xml", content: File.read!("./test/test_data/test/xl/worksheets/sheet1.xml")}, :worksheet, %Xlsxir.XlsxFile{shared_strings: tid2, styles: tid1, workbook: tid3})
    iex> :ets.lookup(tid4, 1)
    [{1, [["A1", "string one"], ["B1", "string two"], ["C1", 10], ["D1", 20], ["E1", {2016, 1, 1}]]}]
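
Cell 'E1' in the example holds the Excel date serial 42370 and is returned as {2016, 1, 1}. The arithmetic behind that pair can be sketched with the standard library alone. This assumes the workbook uses Excel's 1900 date system and is not necessarily how Xlsxir performs the conversion internally; the `epoch` name is chosen here for illustration. The epoch of 1899-12-30 compensates for Excel counting a nonexistent 1900-02-29, so it is only valid for serials of 61 and above.

```elixir
# Assumed epoch for the 1900 date system (valid for serials >= 61).
epoch = ~D[1899-12-30]

# Add the serial as a number of days to get the calendar date.
date = Date.add(epoch, 42370)

# Convert to the {year, month, day} tuple form shown in the example output.
date_tuple = Date.to_erl(date)
```

Here `date_tuple` matches the {2016, 1, 1} value shown for "E1" in the parsed row.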