Membrane.RTP.VAD (Membrane RTP plugin v0.21.0)
Simple voice activity detection (VAD) based on the audio level sent in the RTP header.
To make this module work, the appropriate RTP header extension has to be set in the SDP offer/answer.
If the average audio level of the packets in time_window exceeds vad_threshold, the element emits Membrane.RTP.VadEvent on its output pad.
When the average falls below vad_threshold and doesn't exceed it again within the next vad_silence_time, it also emits the event.
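The threshold rule described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: the module VadSketch and the function classify/3 are hypothetical names, the levels are assumed to be already expressed in dBov (-127..0), and the silence timer is left out.

```elixir
defmodule VadSketch do
  # Hypothetical helper, not part of Membrane.RTP.VAD.
  # `levels` holds the audio levels (in dBov, -127..0) of the packets
  # that fall inside the current time window.
  def classify(levels, vad_threshold \\ -50, min_packet_num \\ 50) do
    if length(levels) < min_packet_num do
      # Not enough packets yet, so speech cannot be detected.
      :silence
    else
      avg = Enum.sum(levels) / length(levels)
      # Speech is reported only when the average exceeds the threshold.
      if avg > vad_threshold, do: :speech, else: :silence
    end
  end
end
```

With the default options, 60 packets at -20 dBov would be classified as :speech, while the same number of packets at -90 dBov would be classified as :silence.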
Buffers that are processed by this element may or may not have been processed by
a depayloader and passed through a jitter buffer. If they have not, then the only timestamp
available for time comparison is the RTP timestamp. The delta between RTP timestamps is
dependent on the clock rate used by the encoding. For Opus the clock rate is 48 kHz and packets are sent every 20 ms, so the RTP timestamp delta between sequential packets should be 48000 / 1000 * 20, or 960.
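The arithmetic above can be written as a small helper. This is only a sketch: the module RtpDelta and both functions are hypothetical names used for illustration and are not part of the plugin.

```elixir
defmodule RtpDelta do
  # Hypothetical helper: expected RTP timestamp increment between two
  # consecutive packets, given the clock rate and the packet duration.
  def timestamp_delta(clock_rate_hz, packet_duration_ms) do
    div(clock_rate_hz * packet_duration_ms, 1000)
  end

  # Hypothetical helper: converts an RTP timestamp difference back to
  # milliseconds so it can be compared with time_window.
  def ticks_to_ms(ticks, clock_rate_hz) do
    div(ticks * 1000, clock_rate_hz)
  end
end

RtpDelta.timestamp_delta(48_000, 20)
# => 960
```

Converting in the other direction, RtpDelta.ticks_to_ms(960, 48_000) gives back the 20 ms packet interval.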
When calculating the epoch of the timestamp, we need to account for 32-bit integer wrapping:
:current - the difference between timestamps is low: the timestamp has not wrapped around.
:next - the timestamp has wrapped around to 0. To simplify queue processing we reset the state.
:prev - the timestamp has recently wrapped around. We might receive an out-of-order packet from before the rollover, which we ignore.
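One possible way to tell the three epochs apart is to check in which epoch the new timestamp lies closest to the previous one. The sketch below assumes this nearest-distance rule; the module EpochSketch and the function epoch/2 are hypothetical, and the plugin may use a different criterion.

```elixir
defmodule EpochSketch do
  # RTP timestamps are 32-bit, so they wrap around at 2^32.
  @rollover 4_294_967_296

  # Hypothetical helper: returns the epoch (:current, :next or :prev)
  # in which `new_ts` is closest to `prev_ts`.
  def epoch(prev_ts, new_ts) do
    [
      current: abs(prev_ts - new_ts),
      next: abs(prev_ts - (new_ts + @rollover)),
      prev: abs(prev_ts - (new_ts - @rollover))
    ]
    |> Enum.min_by(fn {_epoch, distance} -> distance end)
    |> elem(0)
  end
end
```

For example, EpochSketch.epoch(4_294_967_000, 664) treats the small new timestamp as having wrapped around (:next), while EpochSketch.epoch(500, 4_294_966_836) treats the large new timestamp as a late packet from before the rollover (:prev).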
Element options
Passed via struct Membrane.RTP.VAD.t/0
vad_id
1..14
Required
ID of VAD header extension.
clock_rate
Membrane.RTP.clock_rate_t()
Default value: 48000
Clock rate (in Hz) for the encoding.
time_window
pos_integer()
Default value: 2000
Time window (in ms) in which the average audio level is measured.
min_packet_num
pos_integer()
Default value: 50
Minimal number of packets from which to compute the average audio level. Speech won't be detected until there are enough packets.
vad_threshold
-127..0
Default value: -50
Audio level in dBov representing the VAD threshold. Values above it are considered to represent voice activity. The value -127 represents digital silence.
vad_silence_time
pos_integer()
Default value: 300
Time to wait before emitting Membrane.RTP.VadEvent after the audio track is no longer considered to represent speech. If the audio track is considered to represent speech again within this time, the event will not be sent.
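As an illustration, the options struct could be built as shown below. This fragment assumes the membrane_rtp_plugin dependency is available in the project, and the vad_id value of 1 is an arbitrary placeholder: it has to match the extension ID negotiated in the SDP offer/answer. The remaining fields simply repeat the documented defaults.

```elixir
# Assumes membrane_rtp_plugin is listed in the project's dependencies.
vad_options = %Membrane.RTP.VAD{
  # Placeholder ID; use the one negotiated for the audio level extension.
  vad_id: 1,
  clock_rate: 48_000,
  time_window: 2_000,
  min_packet_num: 50,
  vad_threshold: -50,
  vad_silence_time: 300
}
```

Only vad_id is required; the other fields can be omitted to keep their defaults.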
Pads
input
:input
Accepted formats: _any
Direction: :input
Availability: :always
Mode: :pull
Demand mode: :auto
Demand unit: :buffers
output
:output
Accepted formats: _any
Direction: :output
Availability: :always
Mode: :pull
Demand mode: :auto
Summary
Types
Struct containing options for Membrane.RTP.VAD
Types
@type t() :: %Membrane.RTP.VAD{
  clock_rate: Membrane.RTP.clock_rate_t(),
  min_packet_num: pos_integer(),
  time_window: pos_integer(),
  vad_id: 1..14,
  vad_silence_time: pos_integer(),
  vad_threshold: -127..0
}
Struct containing options for Membrane.RTP.VAD
Functions
@spec options() :: keyword()
Returns a description of the options available for this module.