SitemapXml.SitemapUrlTree (sitemap_xml v0.1.2)
A module to fetch and parse sitemap XML concurrently and return a nested data structure.
Summary
Functions
fetch_sitemap/1
Fetches the raw sitemap XML from the given URL.
fetch_url_tree/1
Fetches and parses the sitemap from the provided URL and returns a nested structure.
parse_sitemap/2
Parses the sitemap XML to extract URLs with their attributes, or processes nested sitemaps.
Functions
fetch_sitemap/1
Fetches the raw sitemap XML from the given URL.
Examples
iex> SitemapXml.SitemapUrlTree.fetch_sitemap("https://web.site/sitemap.xml")
{:ok, "<?xml version=\"1.0\" encoding=\"UTF-8\"?><?xml-styleshee..."}
iex> SitemapXml.SitemapUrlTree.fetch_sitemap("https://web.site/404.xml")
{:error, "HTTP error with status 404"}
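The two result shapes above can be handled with ordinary pattern matching. The following is a minimal sketch, not part of sitemap_xml: `FetchResultExample` and `describe/1` are hypothetical names, and the sketch assumes only the `{:ok, xml}` and `{:error, reason}` tuples shown in the examples.

```elixir
defmodule FetchResultExample do
  # Hypothetical helper, not part of sitemap_xml: turns a result tuple
  # shaped like the ones documented above into a short log line.
  def describe({:ok, xml}) when is_binary(xml) do
    "fetched #{byte_size(xml)} bytes"
  end

  # inspect/1 is used so the clause also works if the reason is not a string.
  def describe({:error, reason}) do
    "fetch failed: #{inspect(reason)}"
  end
end

# With the library loaded, the call would presumably look like:
#
#   "https://web.site/sitemap.xml"
#   |> SitemapXml.SitemapUrlTree.fetch_sitemap()
#   |> FetchResultExample.describe()

IO.puts(FetchResultExample.describe({:error, "HTTP error with status 404"}))
```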
fetch_url_tree/1
Fetches and parses the sitemap from the provided URL and returns a nested structure.
Examples
iex> SitemapXml.SitemapUrlTree.fetch_url_tree("https://web.site/sitemap.xml")
{:ok, [%{"sitemap.xml" => [%{url: "https://web.site/page1", lastmod: ..., priority: ...}, ...]}]}
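To illustrate how the nested result might be consumed, here is a hedged sketch that flattens such a tree into a plain list of URLs. `UrlTreeExample` and `urls/1` are hypothetical, the sample data is made up, and the sketch assumes that every entry is either a leaf map with a `:url` key or a map from a sitemap file name to a list of child entries.

```elixir
defmodule UrlTreeExample do
  # Hypothetical helper: collects every :url value from a tree shaped like
  # the fetch_url_tree/1 result shown above.
  def urls(entries) when is_list(entries), do: Enum.flat_map(entries, &urls/1)

  # Leaf entry: a map carrying a :url key.
  def urls(%{url: url}), do: [url]

  # Assumed branch entry: a map from sitemap file name to child entries.
  def urls(%{} = branch), do: branch |> Map.values() |> Enum.flat_map(&urls/1)
end

# Made-up sample data in the documented shape.
tree = [
  %{
    "sitemap.xml" => [
      %{url: "https://web.site/page1", lastmod: nil, priority: nil},
      %{url: "https://web.site/page2", lastmod: nil, priority: nil}
    ]
  }
]

IO.inspect(UrlTreeExample.urls(tree))
```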
parse_sitemap/2
Parses the sitemap XML to extract URLs with their attributes, or processes nested sitemaps.
Examples
iex> SitemapXml.SitemapUrlTree.parse_sitemap("https://web.site/sitemap.xml", "<urlset>...</urlset>")
{:ok, [%{"sitemap.xml" => [%{url: "https://web.site/page1", lastmod: ..., priority: ...}, ...]}]}
iex> SitemapXml.SitemapUrlTree.parse_sitemap("https://web.site/nested_sitemap.xml", "<sitemapindex>...</sitemapindex>")
...
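The module description mentions concurrent fetching. One way child sitemaps of a `<sitemapindex>` could be fetched in parallel is with `Task.async_stream/3` from the standard library. This is only a sketch under that assumption and may differ from what the library does internally; `ConcurrentFetchExample`, `fetch_all/2`, and the stub fetcher are hypothetical.

```elixir
defmodule ConcurrentFetchExample do
  # Hypothetical helper: runs `fetch_fun` over each child sitemap URL in
  # parallel and pairs every URL with its result, preserving input order.
  def fetch_all(urls, fetch_fun) do
    urls
    |> Task.async_stream(fn url -> {url, fetch_fun.(url)} end, timeout: 15_000)
    |> Enum.map(fn {:ok, pair} -> pair end)
  end
end

# Stub standing in for SitemapXml.SitemapUrlTree.fetch_sitemap/1 so the
# sketch runs without network access.
stub = fn
  "https://web.site/a.xml" -> {:ok, "<urlset></urlset>"}
  _other -> {:error, "HTTP error with status 404"}
end

results =
  ConcurrentFetchExample.fetch_all(
    ["https://web.site/a.xml", "https://web.site/b.xml"],
    stub
  )

IO.inspect(results)
```

In real use, `&SitemapXml.SitemapUrlTree.fetch_sitemap/1` would be passed in place of the stub.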