Crawly.Pipelines.WriteToFile (Crawly v0.17.2)
Stores a given item on the filesystem.
Pipeline Lifecycle:
- When run (by Crawly.Utils.pipe), creates a file descriptor if not already created.
- Performs the write operation.
- File descriptor is reused by passing it through the pipeline state with :write_to_file_fd.
Note: File.close is not necessary because the file descriptor is closed automatically when the parent process ends. Refer to https://github.com/oltarasenko/crawly/pull/19#discussion_r350599526 for the relevant discussion.
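The lifecycle above can be sketched in plain Elixir. This is a minimal illustration, not Crawly's actual source: the module name FdReuseSketch and the :path option are hypothetical, and the item is assumed to be an already-encoded string. Only the :write_to_file_fd state key and the {item, state} return shape come from this page.

```elixir
defmodule FdReuseSketch do
  # Hypothetical stand-in for a pipeline's run/3; not Crawly's implementation.

  # Descriptor already in the state: reuse it and perform the write.
  def run(item, %{write_to_file_fd: fd} = state, _opts) do
    :ok = IO.write(fd, [item, "\n"])
    {item, state}
  end

  # No descriptor yet: open the file once, store it in the state, then write.
  # File.close is never called; the descriptor is tied to the calling process.
  def run(item, state, opts) do
    path = Keyword.fetch!(opts, :path) # :path is an assumption of this sketch
    {:ok, fd} = File.open(path, [:append, :utf8])
    run(item, Map.put(state, :write_to_file_fd, fd), opts)
  end
end

# Usage: the second call receives the state returned by the first.
path =
  Path.join(System.tmp_dir!(), "fd_reuse_sketch_#{System.unique_integer([:positive])}.jl")

{_item, state1} = FdReuseSketch.run(~s({"my":"item"}), %{}, path: path)
{_item, state2} = FdReuseSketch.run(~s({"my":"other"}), state1, path: path)
```

Because the second call finds :write_to_file_fd in the state, it appends to the same open file instead of opening it again.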
Options
In the absence of tuple-based options, the pipeline falls back to the config of :crawly, Crawly.Pipelines.WriteToFile, for the :folder and :extension keys.
- :folder, optional. The folder in which the file will be created. Defaults to the current project's folder. If the provided folder does not exist, it is created.
- :extension, optional. The extension of the created file. Defaults to jl.
- :include_timestamp, boolean, optional, true by default. Adds a timestamp to the filename.
Example Declaration
pipelines: [
  Crawly.Pipelines.JSONEncoder,
  {Crawly.Pipelines.WriteToFile, folder: "/tmp", extension: "csv"}
]
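When the pipeline is declared without tuple-based options, the :folder and :extension keys can instead come from application config, as described under Options. A sketch of that fallback, assuming a standard config/config.exs file (the values shown are illustrative):

```elixir
# config/config.exs (file location is an assumption of this sketch)
import Config

# Fallback values read when no tuple-based options are passed.
config :crawly, Crawly.Pipelines.WriteToFile,
  folder: "/tmp",
  extension: "jl"
```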
Example Output
iex> item = %{my: "item"}
iex> WriteToFile.run(item, %{}, folder: "/tmp", extension: "csv")
{%{my: "item"}, %{write_to_file_fd: #PID<0.123.0>}}