API Reference spider_man v0.6.3
Modules
SpiderMan, a fast high-level web crawling & scraping framework for Elixir.
A common spider that takes its callbacks as functions in settings instead of defining them in a module
Download request.
Store items.
Analyze web pages.
Handle settings for a spider
Engine
Item Struct
Sets the user-agent for requests
Message counter for a component
Used for debugging messages per component
Filters out messages with duplicate keys
Encodes item.value to JSON for the ItemProcessor component
Encodes item.value to JSON and saves it to files for the ItemProcessor component
A post_pipeline used to download files directly, for the downloader component
Automatically saves cookies for the spider component and sets cookies for the downloader component
Uses Splash as the JavaScript rendering service
ETS Producer
Request Struct
A Requester used by the downloader component
Uses Finch as the Requester
Uses Hackney as the Requester
Response Struct
Saves items to *.csv files via Storage
Saves items to an *.ets file via Storage
Saves items to a JSON Lines (*.jsonl) file via Storage
Just logs each item with Logger
Supports setting multiple Storages for the ItemProcessor component
Utils
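
The duplicate-key filter listed above can be pictured with a short sketch in plain Elixir. This is not spider_man's implementation or API: the `DedupSketch` module, its function names, and the example URLs are hypothetical, and using an ETS set is an assumption made only to illustrate the idea of dropping messages whose key has already been seen.

```elixir
defmodule DedupSketch do
  @moduledoc "Hypothetical helper for illustration; not part of spider_man."

  # Create an ETS set that remembers every key seen so far.
  def new_table do
    :ets.new(:seen_keys, [:set, :public])
  end

  # Keep only the first message for each key.
  # :ets.insert_new/2 returns true when the key was absent, false otherwise,
  # so it doubles as the "seen before?" check.
  def filter(table, messages, key_fun) do
    Enum.filter(messages, fn msg ->
      :ets.insert_new(table, {key_fun.(msg)})
    end)
  end
end

table = DedupSketch.new_table()
urls = ["https://example.com/a", "https://example.com/b", "https://example.com/a"]

# Here the URL itself serves as the key.
unique = DedupSketch.filter(table, urls, & &1)
IO.inspect(unique)
```

Because the table persists between calls, a second pass over the same URLs yields an empty list, which is the behavior one would want when the same request is produced again later in a crawl.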