robots (robots v1.1.2)

Parse and manipulate robots.txt files according to the specification (RFC 9309).
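
For example, a minimal flow looks like the sketch below (it assumes the robots.txt body and HTTP status code were fetched elsewhere, e.g. with httpc; the agent name and URL are illustrative):

%% Body and Code come from fetching https://example.com/robots.txt (not shown).
{ok, RulesIndex} = robots:parse(Body, Code),
robots:is_allowed(<<"MyBot">>, <<"https://example.com/some/page">>, RulesIndex).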

Summary

Functions

is_allowed(Agent, Url, RulesIndex)
Verifies that the given URL is allowed for the specified agent.

parse(Content, Code)
Parses the content of the robots.txt and returns all the rules indexed by their agents.

sitemap(RulesIndex)
Fetches the sitemap of the parsed index.

Types

-type agent() :: binary().
-opaque agent_rules()
-type allowed_all() :: {allowed, all}.
-type code() :: 100..599.
-type content() :: string() | binary().
-type rule() :: binary().
-type rules() :: [rule()].
-type rules_index() ::
    #{agent() := {Allowed :: rules(), Disallowed :: rules()} | allowed_all(), sitemap => binary()}.
-type sitemap() :: binary().
-type status() :: allowed | disallowed.
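
For illustration only (the values below are made up, not produced by the library), a rules_index() for two agents could look like:

#{<<"googlebot">> => {[<<"/public/">>], [<<"/private/">>]},
  <<"*">> => {allowed, all},
  sitemap => <<"https://example.com/sitemap.xml">>}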

Functions

is_allowed(Agent, Url, RulesIndex)

-spec is_allowed(agent(), uri_string:uri_string(), agent_rules()) -> boolean().
Verifies that the given URL is allowed for the specified agent.
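
For example (a sketch; RulesIndex is assumed to come from parse/2, and the agent and URL are illustrative):

case robots:is_allowed(<<"MyBot">>, "https://example.com/admin", RulesIndex) of
    true -> crawl;
    false -> skip
end.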

parse(Content, Code)

-spec parse(content(), code()) -> {ok, agent_rules()} | {error, term()}.
Parses the content of the robots.txt and returns all the rules indexed by their agents.
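
For example (the content and status code shown are illustrative; in practice they come from fetching the site's robots.txt):

Content = <<"User-Agent: *\nDisallow: /private\n">>,
{ok, RulesIndex} = robots:parse(Content, 200).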

sitemap(RulesIndex)

-spec sitemap(agent_rules()) -> {ok, sitemap()} | {error, not_found}.
Fetches the sitemap of the parsed index.
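
For example (RulesIndex is assumed to come from parse/2; {error, not_found} is returned when the parsed robots.txt declared no sitemap):

case robots:sitemap(RulesIndex) of
    {ok, Sitemap} -> Sitemap;
    {error, not_found} -> undefined
end.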