Smee.Publish (Smee v0.5.0)

Publishes (exports) streams or lists of entity structs in various formats, as binaries, streams or text files.

You can use this module to build your own metadata aggregates, create DiscoFeed files for discovery services, output data for reports and documents, load structured data into databases, populate MDQ services, and so on.

Formats can be output as complete binary strings or streamed as text chunks or structs. Streamed output can be useful for web services, allowing gradual downloads generated on-the-fly with no need to render a very large document in advance.

When no :format or other options are specified, Publish defaults to creating SAML2 metadata files.

Options:

  • :alias (boolean) - create hashed aliases for written files (only for write_ functions)
  • :filename (file path) - write an aggregate to a file with this name (only for write_ functions)
  • :format - the publishing format. Defaults to :saml for SAML metadata. See below for other options.
  • :id_type - the ID type used for keys for items and for creating item filenames automatically (only for item and raw functions)
  • :to - the directory to write automatically-named files to. Defaults to a directory called published in the current working directory
  • :valid_until - pass a DateTime to set the validUntil attribute for the entity metadata. Alternatively, pass an integer to request a validity of n days, or :default or :auto to use the default validity period.

Publishing formats:

  • :csv - a brief CSV summary of the entities
  • :disco - Shibboleth DiscoFeed format JSON, used by the Embedded Discovery Service and others (schema)
  • :index - a plain text format containing entity ID and an optional name on each line
  • :markdown - a simple Markdown table summarising the entities
  • :saml - SAML2 metadata, either as a single aggregate XML file or many per-entity XML files
  • :thiss - entity information in the JSON format used by THISS software such as the Seamless Access discovery service
  • :udest - a compact JSON format for SP info, used by the Little Disco discovery service
  • :udisco - an efficient JSON format used by Little Disco as an alternative to :disco/DiscoFeed

ID types:

  • :hash - a hashed entityID, as used by Local Dynamic
  • :entity_id, :uri - an entityID URI. This will be sanitized when used as a filename
  • :number - a simple incremented number
  • :mdq - the full MDQ style transformed entityURI, made up of "{sha1}" and a hash
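
As a rough illustration of the :hash and :mdq ID types, the sketch below derives both from an entityID using only the standard library. It assumes the hash is a lowercase hexadecimal SHA-1 digest of the entityID, as in the MDQ identifier transform; the IdSketch module, its function names and the example entityID are hypothetical and not part of Smee.

```elixir
defmodule IdSketch do
  # Hypothetical helper, not part of Smee: lowercase hex SHA-1 of an entityID.
  # Assumption: this matches the hash used for :hash IDs.
  def hash(entity_id) when is_binary(entity_id) do
    :crypto.hash(:sha, entity_id) |> Base.encode16(case: :lower)
  end

  # MDQ-style transformed identifier: the literal "{sha1}" followed by the hash.
  def mdq(entity_id) when is_binary(entity_id) do
    "{sha1}" <> hash(entity_id)
  end
end

# Example entityID, invented for illustration.
IO.puts(IdSketch.mdq("https://idp.example.org/entity"))
```

The result is always the "{sha1}" prefix followed by 40 hexadecimal characters, which is why the :mdq and :hash types are safe to use as filenames without further sanitizing.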

Some publishing formats automatically filter entities for suitable roles; others accept any role. There is no automatic checking for uniqueness: if conflicts are possible (for example, when combining multiple sources) you must filter for uniqueness yourself.
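
One possible way to filter for uniqueness before publishing is Stream.uniq_by/2 from the Elixir standard library. The sketch below uses plain maps with a :uri key as stand-ins for entity records; that key, the :source key and the example values are assumptions made for illustration, so check the real entity struct before adapting it.

```elixir
# Stand-in entity records: plain maps with a :uri key holding the entityID.
# Assumption: real Smee entity structs may use a different field name.
entities = [
  %{uri: "https://idp.example.org/entity", source: "federation-a"},
  %{uri: "https://sp.example.com/shibboleth", source: "federation-a"},
  %{uri: "https://idp.example.org/entity", source: "federation-b"}
]

# Stream.uniq_by/2 keeps the first record seen for each key and drops later
# duplicates. It stays lazy, remembering only the keys it has already seen.
unique =
  entities
  |> Stream.uniq_by(fn entity -> entity.uri end)
  |> Enum.to_list()

IO.inspect(Enum.map(unique, fn entity -> entity.source end))
```

Because the first occurrence wins, the order in which sources are combined decides which copy of a duplicated entity is published.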

Examples

1. Writing aggregated metadata XML containing entities created in the last 6 months, with a specified filename:

iex> Smee.source("http://metadata.ukfederation.org.uk/ukfederation-metadata.xml")
iex> |> Smee.fetch!()
iex> |> Smee.Metadata.stream_entities()
iex> |> Smee.Filter.days(180)
iex> |> Smee.Publish.write_aggregate(filename: "my_aggregate.xml")

2. Writing a DiscoFeed file, with the default filename:

iex> Smee.source("http://metadata.ukfederation.org.uk/ukfederation-metadata.xml")
iex> |> Smee.fetch!()
iex> |> Smee.Metadata.stream_entities()
iex> |> Smee.Publish.write_aggregate(format: :disco)

3. Creating a directory of files for use in an IdP's Local Dynamic metadata provider, with friendly file names:

iex> Smee.source("http://metadata.ukfederation.org.uk/ukfederation-metadata.xml")
iex> |> Smee.fetch!()
iex> |> Smee.Metadata.stream_entities()
iex> |> Smee.Filter.sp()
iex> |> Smee.Publish.write_items(alias: true, id_type: :uri)

Summary

Functions

aggregate(entities, options \\ [])
Processes the stream of entity records and returns a single binary in the selected format (defaulting to SAML2 metadata).

aggregate_stream(entities, options \\ [])
Processes the stream of entity records and returns a stream of text in the selected format that will become a single, valid aggregated file when combined.

eslength(entities, options \\ [])
Estimates the size (in bytes) of an aggregated published file or stream in the selected format (defaulting to SAML2 metadata).

formats()
Lists the available supported formats, as atoms, that are used with the :format option in other Publish functions.

items(entities, options \\ [])
Processes the stream of entity records and returns a map of IDs and individual entity records in the selected format (defaulting to SAML2 metadata).

items_stream(entities, options \\ [])
Processes the stream of entity records and returns a stream of tuples containing IDs and individual entity records in the selected format (defaulting to SAML2 metadata).

raw_stream(entities, options \\ [])
Processes the stream of entity records and returns a stream of tuples containing IDs and raw maps of processed entity information that would be used to create text in the selected format (defaulting to SAML2 metadata).

write_aggregate(entities, options \\ [])
Writes a single aggregated file to disk in the selected format (defaulting to SAML2 metadata).

write_items(entities, options \\ [])
Writes multiple files to disk in the selected format (defaulting to SAML2 metadata), one per entity.

Functions

aggregate(entities, options \\ [])

@spec aggregate(entities :: Enumerable.t(), options :: keyword()) :: binary()

Processes the stream of entity records and returns a single binary in the selected format (defaulting to SAML2 metadata).

This is more memory-intensive than aggregate_stream/2 but simpler to use.

By default this function will produce a SAML2 metadata aggregate in XML, as used by the Smee.Metadata module and all decent SAML software.

aggregate_stream(entities, options \\ [])

@spec aggregate_stream(entities :: Enumerable.t(), options :: keyword()) ::
  Enumerable.t(binary())

Processes the stream of entity records and returns a stream of text in the selected format that will become a single, valid aggregated file when combined.

The aggregated file will be returned in a stream of text chunks, usually one-entity-per-chunk. This approach uses much less memory than generating the file up-front, and can begin sending data to the user almost immediately.

By default this function will produce a SAML2 metadata aggregate in XML.
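
A stream of text chunks like this can be written out piece by piece. The sketch below is a minimal illustration using only the standard library: the chunks value is a hand-made stand-in for what aggregate_stream/2 might return (its contents, the example URLs and the file name are assumptions), and it is piped into a file with Stream.into/2.

```elixir
# Stand-in for the stream returned by aggregate_stream/2: an opening chunk,
# one chunk per entity, and a closing chunk. Contents are illustrative only.
chunks =
  Stream.concat([
    ["<EntitiesDescriptor>\n"],
    Stream.map(1..3, fn n ->
      "  <EntityDescriptor entityID=\"https://example.org/#{n}\"/>\n"
    end),
    ["</EntitiesDescriptor>\n"]
  ])

# Hypothetical output location in the system temporary directory.
path = Path.join(System.tmp_dir!(), "aggregate_sketch.xml")

# Each chunk is written as it is produced, so the full document is never
# built as one binary before writing.
chunks
|> Stream.into(File.stream!(path))
|> Stream.run()

# Joining the same chunks gives the same bytes that were written to disk.
IO.puts(File.read!(path) == Enum.join(chunks))
```

The same pattern applies to any collectable destination, which is what makes the streamed form suitable for sending a response to a client as it is generated.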

eslength(entities, options \\ [])

@spec eslength(entities :: Enumerable.t(), options :: keyword()) :: integer()

Estimates the size (in bytes) of an aggregated published file or stream in the selected format (defaulting to SAML2 metadata).

The calculation is made without generating the actual data, so a large (100MB) XML file can be sized without using much memory.

This function is useful when streaming data over HTTP or other protocols where a file size is needed for headers.

formats()

@spec formats() :: [atom()]

Lists the available supported formats, as atoms, that are used with the :format option in other Publish functions.

items(entities, options \\ [])

@spec items(entities :: Enumerable.t(), options :: keyword()) :: map()

Processes the stream of entity records and returns a map of IDs and individual entity records in the selected format (defaulting to SAML2 metadata).

This is more memory-intensive than items_stream/2 but simpler to use.

By default this function will return individual SAML2 metadata files, one-per-entity, suitable for use in MDQ services and "Local Dynamic" metadata providers.

items_stream(entities, options \\ [])

@spec items_stream(entities :: Enumerable.t(), options :: keyword()) ::
  Enumerable.t(tuple())

Processes the stream of entity records and returns a stream of tuples containing IDs and individual entity records in the selected format (defaulting to SAML2 metadata).

Each tuple contains an ID and the corresponding entity record. This uses less memory than items/2, because records are produced as the stream is consumed rather than collected into a map first.

By default this function will return individual SAML2 metadata files, one-per-entity, suitable for use in MDQ services and "Local Dynamic" metadata providers.

raw_stream(entities, options \\ [])

@spec raw_stream(entities :: Enumerable.t(), options :: keyword()) ::
  Enumerable.t(tuple())

Processes the stream of entity records and returns a stream of tuples containing IDs and raw maps of processed entity information that would be used to create text in the selected format (defaulting to SAML2 metadata).

This function is similar to items_stream/2 but returns the unencoded structs used to create the item records.

Use this if you want to store JSON records in a key/value store or database as structured data, rather than writing them directly to disk as text.
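
As a minimal sketch of that idea, the example below loads ID and record pairs into an in-memory ETS table, which stands in for a key/value store. The raw_records list is a hand-made stand-in for raw_stream/2 output: the {id, record} tuple order, the IDs and the map keys are all assumptions made for illustration.

```elixir
# Stand-in for raw_stream/2 output: {id, map} tuples. The IDs and map keys
# are invented for illustration and may not match Smee's real shapes.
raw_records = [
  {"id-1", %{entityID: "https://idp.example.org/entity", displayName: "Example IdP"}},
  {"id-2", %{entityID: "https://idp.example.net/entity", displayName: "Another IdP"}}
]

# An in-memory ETS table stands in for a key/value store.
table = :ets.new(:entity_records, [:set, :public])

# Insert each record as structured data, keyed by its ID.
Enum.each(raw_records, fn {id, record} ->
  :ets.insert(table, {id, record})
end)

# Look a record up again by ID; it comes back as a map, not encoded text.
[{_id, record}] = :ets.lookup(table, "id-1")
IO.puts(record.displayName)
```

Keeping the records unencoded means the store can index or query individual fields, and encoding to JSON can be deferred until a record is actually served.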

write_aggregate(entities, options \\ [])

@spec write_aggregate(entities :: Enumerable.t(), options :: keyword()) :: binary()

Writes a single aggregated file to disk in the selected format (defaulting to SAML2 metadata).

By default this will write a single SAML2 metadata aggregate file to disk.

The file is written using an IO stream, so it should not require much RAM to process.

write_items(entities, options \\ [])

@spec write_items(entities :: Enumerable.t(), options :: keyword()) :: list()

Writes multiple files to disk in the selected format (defaulting to SAML2 metadata), one per entity.

By default this will write many individual SAML2 metadata files, one-per-entity, to disk, using the ID as a filename.

This is a simple way to create files for use by an MDQ service or Local Dynamic metadata provider.

Hints

  • Use the :id_type option to choose the filename type. :uri will produce readable filenames based on entity IDs.
  • Set alias: true to create MDQ-compatible symlinks if you are using a different type of ID for the file itself.