mediawiki-client notebook

Mix.install([
  {:mediawiki_client, path: "#{__DIR__}/.."}
])

Site matrix

Any wiki farm that includes the SiteMatrix extension (such as the default Wikimedia cluster) exposes an API for fetching a list of all participating sites. The Wiki.SiteMatrix module returns this full list:

site_matrix = Wiki.SiteMatrix.new()

You can look up a single site if you already know its dbname (wiki database name):

dewiki = Wiki.SiteMatrix.get!(site_matrix, "dewiki")

Working with a wiki

You can pass a site spec to the initializer of other modules, or you can supply the base API URL if you already know it.

For the remainder of this notebook, we use the explicit URL style so that the code blocks are self-contained and easy to copy and paste, and so that it is more obvious how to target a local wiki. The two styles produce the same result:

Wiki.Action.new(dewiki) == Wiki.Action.new("https://de.wikipedia.org/w/api.php")

Free-form Action API

MediaWiki's main external interface is called the Action API, and the Wiki.Action module gives access to any of its commands using keyword syntax:

{:ok, %{result: %{"query" => %{"babel" => babel}}}} =
  Wiki.Action.new("https://mediawiki.org/w/api.php")
  |> Wiki.Action.get(
    action: :query,
    meta: :babel,
    babuser: "Adamw"
  )

babel
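To make the keyword syntax more concrete, the sketch below encodes the same keywords as URL query parameters with the standard library's URI.encode_query/1. It only illustrates the mapping from keywords to Action API parameters; that the client builds its requests in a similar way is an assumption, and the `format: :json` entry is added by hand for this example.

```elixir
# The keywords passed to Wiki.Action.get/2 above, plus an explicit
# format parameter that is added here for illustration only.
params = [action: :query, meta: :babel, babuser: "Adamw", format: :json]

# URI.encode_query/1 converts atom keys and values to strings.
query = URI.encode_query(params)

# The resulting URL can be pasted into a browser to inspect the raw JSON.
url = "https://mediawiki.org/w/api.php?" <> query
IO.puts(url)
```

Opening the printed URL in a browser is a quick way to compare the raw API response with what the client returns.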

Streaming continuation

The Wiki.Action module transparently handles continuation and can provide output as a stream. This example returns 50 results by making 10 calls with 5 results per call:

Wiki.Action.new("https://de.wikipedia.org/w/api.php")
|> Wiki.Action.stream(
  action: :query,
  list: :recentchanges,
  rclimit: 5
)
|> Stream.take(10)
|> Enum.flat_map(fn response -> response["query"]["recentchanges"] end)
|> Enum.map(fn rc -> rc["timestamp"] <> " " <> rc["title"] end)
|> Enum.to_list()
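Continuation in the Action API works by returning a "continue" object with each partial response, whose values the caller merges into the next request. The following is a minimal offline sketch of that idea using Stream.unfold/2. It is not the library's implementation: the ContinuationSketch module, fake_fetch/2, and the :offset and :limit parameters are hypothetical stand-ins for a real HTTP call.

```elixir
defmodule ContinuationSketch do
  # Hypothetical stand-in for an HTTP request: serves one batch from an
  # in-memory list and adds a "continue" map while more items remain.
  def fake_fetch(items, params) do
    offset = Map.get(params, :offset, 0)
    limit = Map.fetch!(params, :limit)
    next = offset + limit
    response = %{"query" => %{"items" => Enum.slice(items, offset, limit)}}

    if next < length(items) do
      Map.put(response, "continue", %{offset: next})
    else
      response
    end
  end

  # Lazily emits one response per request, merging each "continue" map
  # into the parameters of the following request.
  def stream(items, params) do
    Stream.unfold(params, fn
      :done ->
        nil

      current ->
        case fake_fetch(items, current) do
          %{"continue" => continue} = response ->
            {response, Map.merge(current, continue)}

          response ->
            {response, :done}
        end
    end)
  end
end

# Take two batches of five; the third batch is never requested.
results =
  ContinuationSketch.stream(Enum.to_list(1..12), %{limit: 5})
  |> Stream.take(2)
  |> Enum.flat_map(fn response -> response["query"]["items"] end)

IO.inspect(results)
```

Because the stream is lazy, Stream.take/2 bounds the number of requests rather than the number of results, which is why the example above multiplies the batch size by the number of batches.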

Site statistics

The siteinfo API returns summary statistics for a wiki.

Wiki.Action.new("https://de.wikipedia.org/w/api.php")
|> Wiki.Action.get!(
  action: :query,
  meta: :siteinfo,
  siprop: :statistics
)
|> Map.get(:result)
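The result is a nested map with string keys, so individual counters can be pulled out by pattern matching. The sketch below uses a hand-written sample shaped like a siteinfo statistics response; the numbers are made up for illustration and the sample is trimmed to a few fields.

```elixir
# Sample shaped like a siteinfo statistics result. The values are
# invented for illustration and are not real site statistics.
result = %{
  "query" => %{
    "statistics" => %{
      "pages" => 1000,
      "articles" => 250,
      "edits" => 12_345,
      "activeusers" => 42
    }
  }
}

# Destructure only the counters we care about.
%{"query" => %{"statistics" => %{"articles" => articles, "edits" => edits}}} = result

# Integer division gives a rough edits-per-article figure.
edits_per_article = div(edits, articles)
IO.puts("#{articles} articles, about #{edits_per_article} edits per article")
```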

Event Streams

You can monitor many kinds of events in real time using the EventStreams interface. These snippets sample from the most recent edits:

{:ok, pid} = Wiki.EventStreams.start_link(streams: "recentchange")

Wiki.EventStreams.stream()
|> Stream.take(6)
|> Enum.each(fn event -> IO.inspect(event) end)

Process.exit(pid, :normal)

and from the feed of ORES results for recent edits:

{:ok, pid} = Wiki.EventStreams.start_link(streams: "revision-score")

Wiki.EventStreams.stream()
|> Stream.take(6)
|> Enum.each(fn event -> IO.inspect(event) end)

Process.exit(pid, :normal)
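Since the events form an ordinary Elixir stream, they can be filtered before sampling. The sketch below runs offline on hand-written sample events; it assumes events arrive as string-keyed maps and that the "wiki", "type", "bot", and "title" fields follow the public recentchange schema. The sample values themselves are illustrative.

```elixir
# Hand-written samples shaped like recentchange events.
events = [
  %{"wiki" => "dewiki", "type" => "edit", "bot" => false, "title" => "Berlin"},
  %{"wiki" => "enwiki", "type" => "edit", "bot" => true, "title" => "Paris"},
  %{"wiki" => "dewiki", "type" => "log", "bot" => false, "title" => "Spezial:Log"},
  %{"wiki" => "dewiki", "type" => "edit", "bot" => false, "title" => "Hamburg"}
]

# Keep only human edits on German Wikipedia, then sample as before.
titles =
  events
  |> Stream.filter(fn event ->
    event["wiki"] == "dewiki" and event["type"] == "edit" and event["bot"] == false
  end)
  |> Stream.take(6)
  |> Enum.map(fn event -> event["title"] end)

IO.inspect(titles)
```

With a live feed, the same Stream.filter/2 step would sit between Wiki.EventStreams.stream() and Stream.take/2.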

Machine learning scores

To query ORES directly, use the Wiki.Ores module. List the models available for your target wiki (FIXME: eliminate the dummy request parameter):

Wiki.Ores.new("enwiki")
|> Wiki.Ores.request!(dummy: [])

Retrieve scores for multiple revisions by ID:

Wiki.Ores.new("enwiki")
|> Wiki.Ores.request!(
  models: [:damaging, :wp10],
  revids: [456_789, 123_456]
)
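As a rough illustration of how list-valued parameters could end up on the wire, the sketch below joins each list with a pipe character and URL-encodes the result. The pipe-separated form is an assumption based on the ORES v3 URL convention, and this is not necessarily how Wiki.Ores builds its requests.

```elixir
# Same parameters as in the request above. Note that 456_789 is just
# the integer 456789 written with a digit separator.
params = [models: [:damaging, :wp10], revids: [456_789, 123_456]]

# Join each list with "|" (assumed convention), then URL-encode.
query =
  params
  |> Enum.map(fn {key, values} -> {key, Enum.join(values, "|")} end)
  |> URI.encode_query()

IO.puts(query)
```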

Wikidata

Wikibase, the software behind Wikidata, adds its own commands to the Action API, so you can call them through Wiki.Action as shown below.

This snippet runs a search:

Wiki.Action.new("https://www.wikidata.org/w/api.php")
|> Wiki.Action.get!(
  action: :wbsearchentities,
  search: "alphabet",
  language: :en
)
|> Map.get(:result)

This snippet looks up Wikidata in the site matrix and returns the detailed, structured data for a single item:

Wiki.SiteMatrix.new()
|> Wiki.SiteMatrix.get!("wikidatawiki")
|> Wiki.Action.new()
|> Wiki.Action.get!(
  action: :wbgetentities,
  ids: "Q42"
)
|> Map.get(:result)
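Entity data is deeply nested, so get_in/2 is a convenient way to reach a single value. The sketch below uses a heavily trimmed, hand-written sample shaped like a wbgetentities result; a real response contains many more languages and fields.

```elixir
# Trimmed sample shaped like a wbgetentities result for Q42.
result = %{
  "entities" => %{
    "Q42" => %{
      "id" => "Q42",
      "labels" => %{
        "en" => %{"language" => "en", "value" => "Douglas Adams"}
      }
    }
  }
}

# get_in/2 walks the string keys one level at a time.
label = get_in(result, ["entities", "Q42", "labels", "en", "value"])

# A missing key anywhere along the path yields nil instead of raising.
missing = get_in(result, ["entities", "Q42", "labels", "de", "value"])

IO.inspect({label, missing})
```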