An Omni.Tool for fetching and simplifying web content.
Fetches one or more URLs, extracts content appropriate for LLM consumption (HTML to Markdown, pretty-printed JSON, plain text passthrough), and returns the results as a string.
tool = Omni.Tools.WebFetch.new()
tool = Omni.Tools.WebFetch.new(max_output: 30_000, timeout: 10_000)
Strategies
Content extraction is handled by pluggable strategies. Each strategy
implements Omni.Tools.WebFetch.Strategy and declares which URLs it
handles via match?/2. Strategies are tried in order — the first
match wins.
Three strategies are always active, appended after any user-provided strategies:
- GitHub: matches github.com blob URLs and redirects to raw.githubusercontent.com for direct file content.
- Reddit: matches *.reddit.com, fetches via Reddit's JSON API, and formats posts and comments as readable Markdown.
- Default: catch-all that handles HTML (→ Markdown), JSON (→ pretty-printed), plain text (→ passthrough), and binary (→ metadata).
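The ordering rule (user strategies first, built-ins after, first match wins) can be sketched in plain Elixir. This is an illustration only, not the library's implementation: every module under `Sketch` is hypothetical, and the argument shape of `match?/2` (a parsed `URI` plus the strategy's options) is an assumption, since only the callback name and arity are documented here.

```elixir
defmodule Sketch.Strategy do
  # Hypothetical stand-in behaviour; the (uri, opts) arguments are assumed.
  @callback match?(uri :: URI.t(), opts :: keyword()) :: boolean()
end

defmodule Sketch.GitHub do
  @behaviour Sketch.Strategy

  # Matches github.com URLs whose path contains a /blob/ segment.
  @impl true
  def match?(%URI{host: "github.com", path: path}, _opts) when is_binary(path),
    do: String.contains?(path, "/blob/")

  def match?(_uri, _opts), do: false
end

defmodule Sketch.Default do
  @behaviour Sketch.Strategy

  # Catch-all: matches every URL.
  @impl true
  def match?(_uri, _opts), do: true
end

defmodule Sketch.Wiki do
  @behaviour Sketch.Strategy

  # Hypothetical custom strategy; the :host option is invented for this sketch.
  @impl true
  def match?(%URI{host: host}, opts),
    do: host == Keyword.get(opts, :host, "wiki.example.com")
end

defmodule Sketch.Dispatch do
  @builtins [{Sketch.GitHub, []}, {Sketch.Default, []}]

  # Returns the first strategy module whose match?/2 accepts the URL.
  def pick(url, user_strategies \\ []) do
    uri = URI.parse(url)
    # User strategies go first, so they win over the built-ins.
    strategies = Enum.map(user_strategies, &normalize/1) ++ @builtins
    {mod, _opts} = Enum.find(strategies, fn {mod, opts} -> mod.match?(uri, opts) end)
    mod
  end

  # Accept either a bare module or a {module, opts} tuple.
  defp normalize({mod, opts}) when is_atom(mod) and is_list(opts), do: {mod, opts}
  defp normalize(mod) when is_atom(mod), do: {mod, []}
end
```

Because the catch-all sits last, `Enum.find/2` always returns a strategy. A user strategy that also matches `github.com` would be picked before the built-in GitHub one, simply because it appears earlier in the list.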
Custom strategies are prepended, so they take priority over the
built-ins. To override the built-in GitHub handling, for example,
provide your own strategy that matches github.com first:
tool = Omni.Tools.WebFetch.new(strategies: [{MyApp.WikiStrategy, []}])
Custom Req
Pass a pre-configured Req.Request struct to control the HTTP
transport. This is useful for attaching middleware, setting
authentication, or replacing the transport layer entirely.
req = Req.new() |> MyApp.Auth.attach()
tool = Omni.Tools.WebFetch.new(req: req)
Options
- :req - base Req.Request struct. Default: Req.new().
- :strategies - list of strategy modules or {module, opts} tuples. Default: [].
- :max_output - per-URL content truncation limit in bytes. Set to :infinity to disable truncation. Default: 100_000.
- :max_urls - maximum number of URLs per batch call. Default: 10.
- :timeout - HTTP receive timeout in milliseconds. Default: 15_000.
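As a rough illustration, several of these options can be combined in one call. The environment variable name, the user agent string, and the specific limits below are made up for the example; only the option names come from the list above, and `:auth` and `:headers` are standard `Req.new/1` options.

```elixir
# Hypothetical token source and user agent, chosen only for illustration.
req =
  Req.new(
    auth: {:bearer, System.fetch_env!("WIKI_API_TOKEN")},
    headers: [{"user-agent", "my-app/1.0"}]
  )

tool =
  Omni.Tools.WebFetch.new(
    req: req,
    # Disable per-URL truncation entirely.
    max_output: :infinity,
    # Allow at most three URLs per batch call.
    max_urls: 3,
    # Give up on a response after five seconds.
    timeout: 5_000
  )
```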
Summary
Functions
Builds a %Omni.Tool{} struct with a bound handler.