LastfmArchive (lastfm_archive v0.10.0)

lastfm_archive is a tool for creating a local Last.fm scrobble file archive, a Solr archive and analytics.

The software is currently experimental and in preliminary development. It should eventually provide the capability to perform ETL and analytic tasks on Lastfm scrobble data.

Current usage:

Summary

Functions

Returns the total playcount and the registered time, i.e. the earliest scrobble time, for a user.

Load all TSV data from the archive into Solr for a Lastfm user.

Sync scrobbles of a default user specified in configuration.

Sync scrobbles for a Lastfm user.

Transform downloaded raw JSON data and create a TSV file archive for a Lastfm user.

Types

@type archive() :: LastfmArchive.Behaviour.Archive.t()
@type solr_url() :: atom() | Hui.URL.t()
@type time_range() :: {integer(), integer()}

Functions

Returns the total playcount and the registered time, i.e. the earliest scrobble time, for a user.

@spec load_archive(binary(), solr_url()) :: :ok | {:error, Hui.Error.t()}

Load all TSV data from the archive into Solr for a Lastfm user.

The function finds TSV files in the archive and sends them to Solr for ingestion one at a time. It uses the Hui client to interact with Solr and the Hui.URL.t/0 struct to specify the Solr endpoint.

Example

  # define a Solr endpoint with %Hui.URL{} struct
  headers = [{"Content-type", "application/json"}]
  url = %Hui.URL{url: "http://localhost:8983/solr/lastfm_archive", handler: "update", headers: headers}

  LastfmArchive.load_archive("a_lastfm_user", url)

TSV files must be created before loading; see transform_archive/2.
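
The @spec above gives two return shapes for load_archive/2: :ok, or an {:error, ...} tuple carrying a Hui error. A minimal sketch of handling both is shown below. LoadReport is a hypothetical helper and :example_error is a stand-in value, neither is part of lastfm_archive or Hui.

```elixir
defmodule LoadReport do
  # Hypothetical helper: turns a load_archive/2 return value into a message.
  def describe(:ok), do: "archive loaded into Solr"
  def describe({:error, error}), do: "Solr load failed: " <> inspect(error)
end

# Stand-in return values, used here so the sketch runs without Solr.
IO.puts(LoadReport.describe(:ok))
IO.puts(LoadReport.describe({:error, :example_error}))

# In a project with lastfm_archive installed, the call would look like:
#   "a_lastfm_user" |> LastfmArchive.load_archive(url) |> LoadReport.describe()
```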

@spec sync() :: :ok | {:error, :file.posix()}

Sync scrobbles of a default user specified in configuration.

Example

  LastfmArchive.sync

The default user is specified in configuration, for example user_a in config/config.exs:

  config :lastfm_archive,
    user: "user_a",
    ... # other archiving options

See sync/2 for further details and archiving options.

sync(user, options \\ [])

@spec sync(
  binary(),
  keyword()
) :: {:ok, archive()} | {:error, :file.posix()}

Sync scrobbles for a Lastfm user.

Example

  LastfmArchive.sync("a_lastfm_user")

The first sync downloads all daily scrobbles in 200-track (gzip compressed) chunks that are written into a local file archive. Subsequent syncs extract further scrobbles, starting from the date of the latest downloaded scrobbles.

The data is currently in raw Lastfm recenttracks JSON format, chunked into 200-track (max) gzip compressed pages and stored within directories corresponding to the days when tracks were scrobbled.
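
As an illustration of the 200-track paging described above, the sketch below splits one day's scrobbles into pages. The list of 450 integers is a stand-in for real scrobble data, and the snippet only shows the chunking arithmetic, not how lastfm_archive pages internally.

```elixir
# Maximum number of tracks per page, as stated above.
per_page = 200

# Stand-in for one day's scrobbles (450 items instead of real track data).
scrobbles = Enum.to_list(1..450)

# Split the day into pages of at most `per_page` scrobbles each.
pages = Enum.chunk_every(scrobbles, per_page)

IO.inspect(Enum.map(pages, &length/1))
# → [200, 200, 50]
```

A day with 450 scrobbles would therefore occupy three pages in that day's directory.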

Options:

  • :interval - default 1000 (ms), the duration between successive Lastfm API requests. This controls the request rate. The default interval ensures a safe rate that is within Lastfm's terms of service: no more than 5 requests per second

  • :overwrite - default false (not available currently). If set to true, the system will (re)fetch and overwrite any previously downloaded data. Use this option to refresh the file archive. Otherwise (false), the system will not call Lastfm to check and re-fetch data if existing data chunks / pages are found, which speeds up archive updates

  • :per_page - default 200, number of scrobbles per page in archive. The default is the max number of tracks per request permissible by Lastfm

  • :data_dir - default lastfm_data. The file archive is created within a main data directory, e.g. ./lastfm_data/a_lastfm_user/.

These options can be configured in config/config.exs:

  config :lastfm_archive,
    ...
    data_dir: "./lastfm_data/"
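
Since sync/2 accepts a keyword list, the options above can presumably also be passed per call. The sketch below checks a candidate :interval against the 5 requests per second limit before use. RateCheck is a hypothetical helper and the option values are examples only.

```elixir
defmodule RateCheck do
  # Limit cited in the :interval option: no more than 5 requests per second.
  @max_requests_per_second 5

  # Requests per second implied by an :interval given in milliseconds.
  def requests_per_second(interval_ms) when interval_ms > 0, do: 1000 / interval_ms

  # True when the interval keeps the request rate within the limit.
  def within_limit?(interval_ms) do
    requests_per_second(interval_ms) <= @max_requests_per_second
  end
end

# Example option values (assumptions, not recommended settings).
options = [interval: 500, per_page: 100, data_dir: "./lastfm_data/"]

IO.inspect(RateCheck.within_limit?(options[:interval]))

# In a project with lastfm_archive installed, the call would look like:
#   LastfmArchive.sync("a_lastfm_user", options)
```

An interval of 200 ms sits exactly at the limit; anything shorter exceeds it.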

transform_archive(user, mode \\ :tsv)

@spec transform_archive(binary(), :tsv) :: :ok

Transform downloaded raw JSON data and create a TSV file archive for a Lastfm user.

Example

  LastfmArchive.transform_archive("a_lastfm_user")

The function only transforms archive data that has already been downloaded to the local filesystem. It does not fetch data from Lastfm, which can be done via sync/0 or sync/2.

The TSV files are created on a yearly basis and stored in gzip compressed format. They are stored in a tsv directory within either the default ./lastfm_data/ or the directory specified in config/config.exs (:lastfm_archive, :data_dir).
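
A gzip compressed TSV file can be read back with the Erlang :zlib module and String.split/2. The sketch below writes a small stand-in file to the system temp directory first so that it runs on its own. The file name and the column values are hypothetical and do not reflect the actual archive layout or TSV columns.

```elixir
# Stand-in for a yearly TSV file; real file names and columns may differ.
path = Path.join(System.tmp_dir!(), "lastfm_tsv_example.tsv.gz")
File.write!(path, :zlib.gzip("col_a\tcol_b\nvalue_1\tvalue_2\n"))

# Read the file, decompress it, then split into rows and tab-separated fields.
rows =
  path
  |> File.read!()
  |> :zlib.gunzip()
  |> String.split("\n", trim: true)
  |> Enum.map(&String.split(&1, "\t"))

IO.inspect(rows)

# Remove the stand-in file.
File.rm!(path)
```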