Crawler Changelog
- [Added] Add `:force` option
- [Added] Add `:scope` option
- [Added] Allow multiple instances of Crawler sharing the same queue
- [Improved] Logger now logs entries as `debug` or `warn`
- [Added] `:store` option, defaults to `nil` to reduce memory usage
- [Added] `:max_pages` option
- [Added] `Crawler.running?/1` to check whether Crawler is running
- [Improved] The queue is now supervised
- [Improved] Documentation improvements (thanks @kianmeng)
- [Improved] Updated `floki` and other dependencies
- [Added] `:modifier` option
- [Added] `:encode_uri` option
- [Improved] Various small fixes and improvements
- [Added] Pause / resume / stop Crawler
- [Improved] Various small fixes and improvements
- [Added] `:scraper` option to allow scraping content
- [Improved] Various small fixes and improvements
- [Improved] `Crawler.Store.DB` now stores the `opts` metadata
- [Improved] Code documentation
- [Improved] Various small fixes and improvements
- [Added] `:retrier` option to allow custom fetch retry logic
- [Added] `:url_filter` option to allow custom URL filtering logic
- [Improved] The parser is now more stable and skips unparsable files
- [Improved] Various small fixes and improvements
- [Added] `:workers` option
- [Added] `:interval` option
- [Added] `:timeout` option
- [Added] `:user_agent` option
- [Added] `:save_to` option
- [Added] `:assets` option
- [Added] `:parser` option to allow custom parsing logic
- [Improved] Renamed `:max_levels` to `:max_depths`
- [Improved] Various small fixes and improvements
- [Added] A semi-functioning prototype
- [Added] Finished the very basic crawling function
- [Added] `:max_levels` option
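
Most of the options listed above are passed as a keyword list to `Crawler.crawl/2`. A minimal sketch, assuming the option names behave as described in this changelog; the specific values shown are illustrative, not library defaults:

```elixir
# Start a crawl with a few of the options mentioned in this changelog.
# Values are illustrative; check the README for actual defaults.
{:ok, opts} =
  Crawler.crawl("http://example.com",
    max_depths: 2,     # renamed from :max_levels
    workers: 10,
    interval: 100,
    timeout: 5_000,
    user_agent: "MyCrawler/1.0"
  )

# Pause / resume / stop the crawl, and check whether it is running,
# using the returned opts.
Crawler.pause(opts)
Crawler.resume(opts)
Crawler.running?(opts)
Crawler.stop(opts)
```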