Smee.Metadata (Smee v0.5.0)
The Metadata module wraps metadata XML in a struct and contains functions that may be helpful when working with it. The metadata is either an aggregate (as used by federations to contain many entity records) or a single entity.
Many of the functions mirror those in the Smee.Entity module - the same actions but on larger source XML rather than on fragments.
The XML in metadata structs can be compressed or decompressed, but unlike Entities there is no cached, parsed xmerl record by default - this is to save time and memory.
Wherever possible use Metadata.update/2 to make changes; do not write to the Metadata struct directly. If you must write directly, you can use Metadata.update/1 to resync the state of the record.
Functions in Smee.Metadata can be used to extract individual entity records, each containing a fragment of XML. It's strongly recommended to stream these using stream_entities/2 to save on memory, or select a particular entity using entity/2.
Types
@type t() :: %Smee.Metadata{
  cache_duration: nil | binary(),
  cert_fingerprint: nil | binary(),
  cert_url: nil | binary(),
  changes: integer(),
  compressed: boolean(),
  data: nil | binary(),
  data_hash: nil | binary(),
  downloaded_at: nil | DateTime.t(),
  entity_count: integer(),
  etag: nil | binary(),
  file_uid: nil | binary(),
  id: nil | binary(),
  label: nil | binary(),
  modified_at: nil | DateTime.t(),
  priority: integer(),
  size: integer(),
  tags: [binary()],
  trustiness: float(),
  type: atom(),
  uri: nil | binary(),
  uri_hash: nil | binary(),
  url: nil | binary(),
  url_hash: nil | binary(),
  valid_until: nil | DateTime.t(),
  verified: boolean()
}
Functions
Raises an exception if the metadata has expired (based on valid_until datetime), otherwise returns the metadata.
If no valid_until has been set (if it's nil) then the metadata will always be returned.
Returns compressed metadata, containing gzipped XML. This greatly reduces the size of the metadata record.
Returns true if the XML data in a metadata struct has been compressed
Returns the number of entities in the metadata file
Returns decompressed metadata, with plain-text XML data. This makes the struct much larger.
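The size trade-off between the compressed and decompressed forms can be illustrated with Erlang's built-in :zlib module. This is a minimal sketch, not Smee's implementation: GzipSketch and the sample XML are hypothetical, and the only thing taken from the text is that the compressed form holds gzipped XML.

```elixir
defmodule GzipSketch do
  # Hypothetical helper module, not part of Smee.

  # Gzip a plain-text XML binary.
  def compress(xml) when is_binary(xml), do: :zlib.gzip(xml)

  # Restore the original XML binary from its gzipped form.
  def decompress(gz) when is_binary(gz), do: :zlib.gunzip(gz)
end

# Repetitive XML, as found in large aggregates, compresses well.
xml =
  String.duplicate(
    ~s(<EntityDescriptor entityID="https://idp.example.org/shibboleth"/>\n),
    1_000
  )

gz = GzipSketch.compress(xml)

IO.puts("plain: #{byte_size(xml)} bytes, gzipped: #{byte_size(gz)} bytes")
```

The round trip is lossless, so a struct can be kept compressed while idle and decompressed only when its XML is needed.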
@spec derive(data :: Enumerable.t() | Smee.Entity.t(), options :: keyword()) :: t()
Returns a new metadata struct based on the streamed entities passed as the first parameter.
You can set or override various parts of the struct by passing options:
- url - the original location of the metadata
- uri - a URI that identifies the metadata (Name)
- downloaded_at - A DateTime to record when the record was downloaded
- modified_at - A DateTime to record when the record was updated upstream
- valid_until - A DateTime to indicate when the metadata expires
- priority - An integer between 0 and 9 to show priority
- trustiness - a Float between 0.0 and 0.9 to indicate, well, trustiness.
- etag - a string to use as an etag (unique content identifier).
- cert_url - location of a certificate to use for signature verification
- cert_fingerprint - fingerprint of the certificate to use for certificate verification
- label - a description for the metadata
@spec entities(metadata :: t()) :: [Smee.Entity.t()]
Returns all entities in the metadata as a list of entity structs.
This can produce very large lists very slowly. The stream_entities/2 function is usually a better choice because it avoids building the whole list in memory.
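Why streaming helps can be shown with plain Elixir streams. The sketch below does not call Smee: the stream of made-up entity IDs stands in for whatever stream_entities/2 returns, on the assumption (suggested by its Enumerable.t() return type) that the result is lazy.

```elixir
# Stand-in for a large aggregate: a lazy stream of made-up entity IDs.
# With Smee, the source would be the stream returned by stream_entities/2.
entity_ids =
  Stream.map(1..1_000_000, fn n -> "https://sp#{n}.example.org/shibboleth" end)

# Only as many elements as needed are generated; the full list is never built.
first_matches =
  entity_ids
  |> Stream.filter(&String.ends_with?(&1, "7.example.org/shibboleth"))
  |> Enum.take(3)

IO.inspect(first_matches)
```

Replacing the stream with a fully materialised list would give the same three results, but only after allocating all one million strings first.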
@spec entity(metadata :: t(), uri :: binary()) :: Smee.Entity.t() | nil
Returns the specified entity from the metadata, or nil if no entity with that URI is found
@spec entity!(metadata :: t(), uri :: binary()) :: Smee.Entity.t()
Returns the specified entity from the metadata or raises an exception if not found
Returns a list of all entity IDs in the metadata
Returns true if the metadata has expired (based on valid_until datetime)
If no valid_until has been set (if it's nil) then false will be returned
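The expiry rule described here, including the nil case, can be sketched in a few lines. ExpirySketch and ensure_fresh!/2 are hypothetical names for illustration, not Smee functions, and treating the metadata as expired only once the current time is strictly later than valid_until is an assumption.

```elixir
defmodule ExpirySketch do
  # Hypothetical module, not part of Smee.

  # `now` is injectable so the rule can be checked against a fixed time.
  def expired?(metadata, now \\ DateTime.utc_now())

  # No valid_until set: never treated as expired.
  def expired?(%{valid_until: nil}, _now), do: false

  # Assumption: expired only when `now` is strictly after valid_until.
  def expired?(%{valid_until: %DateTime{} = valid_until}, now) do
    DateTime.compare(now, valid_until) == :gt
  end

  # Raising variant: returns the metadata unchanged unless it has expired.
  def ensure_fresh!(metadata, now \\ DateTime.utc_now()) do
    if expired?(metadata, now) do
      raise "Metadata has expired"
    else
      metadata
    end
  end
end
```

Returning the metadata from the raising variant keeps it usable in a pipeline, which matches the "otherwise returns the metadata" behaviour described for the exception-raising check.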
Returns a suggested filename for the metadata.
Returns a suggested filename for the metadata in the specified format.
Two formats can be specified: :sha1 and :uri
Returns a new metadata struct based on the XML data passed as the first parameter.
You can set or override various parts of the struct by passing options:
- url - the original location of the metadata
- uri - a URI that identifies the metadata (Name)
- downloaded_at - A DateTime to record when the record was downloaded
- modified_at - A DateTime to record when the record was updated upstream
- valid_until - A DateTime to indicate when the metadata expires
- priority - An integer between 0 and 9 to show priority
- trustiness - a Float between 0.0 and 0.9 to indicate, well, trustiness.
- etag - a string to use as an etag (unique content identifier).
- cert_url - location of a certificate to use for signature verification
- cert_fingerprint - fingerprint of the certificate to use for certificate verification
- label - a description for the metadata
In most cases it is better to use Smee.Source and then Smee.Fetch to generate a metadata struct.
@spec random_entity(metadata :: t()) :: Smee.Entity.t()
Returns one randomly selected entity from the metadata
@spec stream_entities(metadata :: t(), options :: keyword()) :: Enumerable.t()
Returns a stream of all entities in the metadata.
Tags a metadata record with one or more tags, replacing existing tags.
Tags are arbitrary classifiers, initially inherited from sources
Returns the tags of the metadata struct, a list of binary strings
Tags are arbitrary strings, which may be initially inherited from source records, and will be passed on to entities.
Resyncs the internal state of a %Metadata{} struct
If changes have been made using Metadata.update/2 then this will not be needed - it's there for when the struct has been changed directly.
Returns an updated %Metadata{} struct with new XML, refreshing various parts of the struct correctly.
This should be the only way updated Metadata structs are produced - the raw struct should not be changed directly.
Raises an exception if the metadata has invalid XML, otherwise returns the metadata.
Returns a parsed Erlang xmerl structure representing the metadata XML, for use with xmerl, SweetXML and other tools.
Using this is not recommended as it will create a very, very large xmerl structure. The Smee.Transform and Smee.Extract modules may be more efficient for working with large metadata files, and the best approach is to stream and work with Smee.Entity records using Smee.Metadata.stream_entities/2.
Unlike the similar function for Entity it is not possible to cache this in the struct, so it will be regenerated every time.
Returns the XML for the metadata, unchanged, and decompressed.
The XML is returned as a binary string - it may be very large, and larger than the struct it comes from.
Returns the XML for the metadata, decompressed, after a processing stage.
Available processing options:
- :default and :none - Nothing is changed, so the output will be the same as Smee.Metadata.xml/1
- :strip - XML has comments removed, signature removed, and XML declaration removed.
The XML is returned as a binary string - it may be very large.