Crawler Changelog
- [Added] Add `:force` option (see the sketch below)
- [Added] Add `:scope` option
- [Added] Allow multiple instances of Crawler sharing the same queue
- [Improved] Logger will now log entries as `debug` or `warn`
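A minimal sketch of the two new options together, assuming `force: true` re-fetches URLs the store has already seen and that distinct `:scope` values keep forced crawls' state apart; the URL and scope values are illustrative:

```elixir
# Assumed semantics: `force: true` bypasses the already-fetched check, and
# different `:scope` values separate the two crawls' deduplication state.
Crawler.crawl("http://example.com", force: true, scope: 1)
Crawler.crawl("http://example.com", force: true, scope: 2)
```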
- [Added] `:store` option, defaults to `nil` to save memory usage
- [Added] `:max_pages` option (see the sketch below)
- [Added] `Crawler.running?/1` to check whether Crawler is running
- [Improved] The queue is now supervised
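A sketch of these additions, assuming `Crawler.crawl/2` returns `{:ok, opts}` and that those opts are what `Crawler.running?/1` expects; the URL and page limit are illustrative:

```elixir
# Stop after 100 pages; `:store` keeps its nil default, so page bodies are
# not retained in memory.
{:ok, opts} = Crawler.crawl("http://example.com", max_pages: 100)

# Later, poll the crawl's status with the opts returned above.
Crawler.running?(opts)
```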
- [Improved] Documentation improvements (thanks @kianmeng)
- [Improved] Updated `floki` and other dependencies
- [Added] `:modifier` option
- [Added] `:encode_uri` option (see the sketch below)
- [Improved] Various small fixes and improvements
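A one-line sketch of `:encode_uri`, assuming it URI-encodes links containing unencoded characters (such as spaces) before fetching them; the URL is illustrative:

```elixir
Crawler.crawl("http://example.com/page with spaces", encode_uri: true)
```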
- [Added] Pause / resume / stop Crawler (see the sketch below)
- [Improved] Various small fixes and improvements
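The control functions in context, following the pattern in the project README: `Crawler.crawl/2` returns `{:ok, opts}`, and the pause/resume/stop functions operate on those opts.

```elixir
{:ok, opts} = Crawler.crawl("http://example.com")

Crawler.pause(opts)   # pause the crawl
Crawler.resume(opts)  # pick it back up
Crawler.stop(opts)    # shut it down
```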
- [Added] `:scraper` option to allow scraping content (see the sketch below)
- [Improved] Various small fixes and improvements
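A sketch of a custom scraper, assuming the `Crawler.Scraper.Spec` behaviour contract of receiving a `%Crawler.Store.Page{}` and returning `{:ok, page}`; the module name and its body are made up:

```elixir
defmodule MyScraper do
  @behaviour Crawler.Scraper.Spec

  # Log each page's URL, then hand the page back unchanged.
  def scrape(%Crawler.Store.Page{url: url} = page) do
    IO.puts("scraped #{url}")
    {:ok, page}
  end
end

Crawler.crawl("http://example.com", scraper: MyScraper)
```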
- [Improved] `Crawler.Store.DB` now stores the opts metadata
- [Improved] Code documentation
- [Improved] Various small fixes and improvements
- [Added] `:retrier` option to allow custom fetch retrying logic
- [Added] `:url_filter` option to allow custom URL filtering logic (see the sketch below)
- [Improved] Parser is now more stable and skips unparsable files
- [Improved] Various small fixes and improvements
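A sketch of a custom URL filter, assuming the `Crawler.Fetcher.UrlFilter.Spec` contract of `filter(url, opts)` returning `{:ok, boolean}`; the module name and host check are made up. A custom `:retrier` plugs in the same way through its own behaviour.

```elixir
defmodule SameHostFilter do
  @behaviour Crawler.Fetcher.UrlFilter.Spec

  # Follow a link only when it stays on example.com.
  def filter(url, _opts), do: {:ok, String.contains?(url, "example.com")}
end

Crawler.crawl("http://example.com", url_filter: SameHostFilter)
```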
- [Added] `:workers` option (combined sketch below)
- [Added] `:interval` option
- [Added] `:timeout` option
- [Added] `:user_agent` option
- [Added] `:save_to` option
- [Added] `:assets` option
- [Added] `:parser` option to allow custom parsing logic
- [Improved] Renamed `:max_levels` to `:max_depths`
- [Improved] Various small fixes and improvements
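The options from this group in one hedged sketch; every value below is illustrative rather than a documented default:

```elixir
Crawler.crawl("http://example.com",
  workers: 10,                      # concurrent worker processes
  interval: 100,                    # delay between requests, in ms
  timeout: 5_000,                   # per-fetch timeout, in ms
  user_agent: "MyCrawler/1.0",
  save_to: "/tmp/crawl",            # write fetched pages under this directory
  assets: ["js", "css", "images"],  # also fetch these asset types
  max_depths: 2                     # formerly :max_levels
)
```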
- [Added] A semi-functioning prototype
- [Added] Finished the very basic crawling function
- [Added] `:max_levels` option
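For the historical record, the earliest API in one line, using the option name as it was before the rename noted above; the URL and depth are illustrative:

```elixir
Crawler.crawl("http://example.com", max_levels: 2)
```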