GoogleNewsDecoder (GoogleNewsDecoder v0.1.0)

Copy Markdown View Source

GoogleNewsDecoder

Decode Google News redirect URLs to their original source URLs.

Google News wraps every article link in an encrypted redirect (news.google.com/rss/articles/CBMi...). This library resolves those back to the real URL by extracting decoding parameters from the article page and calling Google's internal batchexecute endpoint.

Zero dependencies — uses Erlang's built-in :httpc for HTTP and Elixir's built-in JSON module for encoding/decoding.

This library does not avoid rate-limiting or CAPTCHAs presented by Google.

Installation

Add google_news_decoder to your list of dependencies in mix.exs:

def deps do
  [
    {:google_news_decoder, "~> 0.1.0"}
  ]
end

Quickstart

Decode a Google News URL

{:ok, url} = GoogleNewsDecoder.decode("https://news.google.com/rss/articles/CBMiK2h0dHBz...")
# {:ok, "https://www.reuters.com/world/..."}

Resolve (decode or pass through)

If you have a mix of Google News and regular URLs, resolve/1 handles both — it decodes Google News URLs and returns everything else unchanged:

GoogleNewsDecoder.resolve("https://news.google.com/rss/articles/CBMiK2h0dHBz...")
# "https://www.reuters.com/world/..."

GoogleNewsDecoder.resolve("https://example.com/article")
# "https://example.com/article"

Check if a URL is a Google News redirect

GoogleNewsDecoder.google_news_url?("https://news.google.com/rss/articles/CBMi...")
# true

GoogleNewsDecoder.google_news_url?("https://example.com")
# false

Batch decoding

Decode a list of URLs concurrently with Task.async_stream/3:

urls
|> Task.async_stream(&GoogleNewsDecoder.resolve/1, max_concurrency: 5, timeout: 15_000)
|> Enum.map(fn {:ok, url} -> url end)

API

FunctionReturnsDescription
decode/1{:ok, url} or {:error, reason}Decode a Google News URL to its source
resolve/1urlDecode if Google News, otherwise pass through
google_news_url?/1booleanCheck if a URL is a Google News redirect

How it works

  1. Extracts the base64-encoded article ID from the URL path
  2. Fetches the Google News article page to obtain a signature (data-n-a-sg) and timestamp (data-n-a-ts)
  3. POSTs those parameters to Google's batchexecute endpoint
  4. Parses the nested JSON response to extract the original source URL

Requirements

Summary

Functions

Decodes a Google News URL to its original source URL.

Returns true if the URL is a Google News redirect URL.

Resolves a URL, decoding it if it's a Google News redirect.

Functions

decode(url)

@spec decode(String.t()) :: {:ok, String.t()} | {:error, term()}

Decodes a Google News URL to its original source URL.

Returns {:ok, source_url} or {:error, reason}.

google_news_url?(url)

@spec google_news_url?(term()) :: boolean()

Returns true if the URL is a Google News redirect URL.

resolve(url)

@spec resolve(String.t()) :: String.t()

Resolves a URL, decoding it if it's a Google News redirect.

Returns the original URL unchanged if it's not a Google News URL or if decoding fails.