Runbox.Deduplicator (runbox v7.0.1)
Decide if a message is a duplicate by comparing it to messages already seen.
Only messages with the highest timestamp seen so far are remembered as "seen". If the incoming message has a lower timestamp than the "seen" messages, it is considered "old". If it has exactly the same timestamp, it is compared for equality with all the "seen" messages; if it matches one of them, it is considered "duplicate". In any other case (for example a higher timestamp, or the same timestamp with no equal "seen" message), the message is considered "new".
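The rules above can be sketched as a small, self-contained module. This is only an illustration of the described behaviour, not the Runbox implementation: the module name DedupSketch, its struct fields and its new/1 function are hypothetical, and the real state is opaque.

```elixir
defmodule DedupSketch do
  # Hypothetical stand-in for the opaque state: the highest timestamp seen
  # so far, the messages carrying that timestamp, and the extractor function.
  defstruct latest_ts: nil, seen: [], extract_timestamp: nil

  # Assumption: starts with nothing seen (no initial stream in this sketch).
  def new(extract_timestamp) do
    %__MODULE__{extract_timestamp: extract_timestamp}
  end

  def deduplicate(msg, %__MODULE__{} = state) do
    ts = state.extract_timestamp.(msg)

    cond do
      # Nothing seen yet, or strictly newer: the old "seen" set is dropped.
      state.latest_ts == nil or ts > state.latest_ts ->
        {:new, %{state | latest_ts: ts, seen: [msg]}}

      # Older than everything currently remembered.
      ts < state.latest_ts ->
        {:old, state}

      # Same timestamp and equal to a remembered message.
      msg in state.seen ->
        {:duplicate, state}

      # Same timestamp but different content: new, and remembered as well.
      true ->
        {:new, %{state | seen: [msg | state.seen]}}
    end
  end
end
```

One consequence of remembering only the newest timestamp: once a message with a higher timestamp arrives, a repeat of an earlier message is reported as old rather than duplicate.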
Types
@opaque t()
Functions
deduplicate(msg, state)
Decides if a given message is a duplicate or not.
Returns a tuple {msg_condition, deduplicator_state}, where msg_condition can be one of:
:new - message is new and was not yet seen.
:old - message is older than the messages already seen.
:duplicate - message is equal to one of the latest messages already seen.
new(stream, extract_timestamp)
@spec new(Enumerable.t(), (any() -> non_neg_integer())) :: t() | no_return()
Initializes a deduplicator.
stream is an enumerable (possibly empty) which contains messages with descending timestamps, from newest to oldest. It is used to initialize the "seen" messages from messages which were already processed.
extract_timestamp is a function which returns a timestamp for a given message.
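Because stream is ordered from newest to oldest and only messages with the highest timestamp are remembered, one plausible reading is that initialization only needs the leading run of messages that share the newest timestamp. The sketch below illustrates that idea under this assumption; DedupInitSketch and init_seen/2 are hypothetical names, and the actual behaviour of new/2 (including when it raises) is not shown here.

```elixir
defmodule DedupInitSketch do
  # Returns {latest_ts, seen_msgs}, or {nil, []} for an empty stream.
  # Assumes the stream is already sorted by descending timestamp.
  def init_seen(stream, extract_timestamp) do
    # Group consecutive messages by timestamp and keep only the first group,
    # so a lazy stream is consumed just past the newest run of messages.
    case stream |> Stream.chunk_by(extract_timestamp) |> Enum.take(1) do
      [] ->
        {nil, []}

      [[newest | _] = seen] ->
        {extract_timestamp.(newest), seen}
    end
  end
end
```

For example, given messages with timestamps 5, 5 and 4, only the two messages with timestamp 5 would be kept as "seen".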