Voice Activity Detection (VAD) processor. Runs each InputAudioRawFrame through a VAD analyzer and emits UserStartedSpeakingFrame / UserStoppedSpeakingFrame when its state machine changes state.
States: :quiet -> :speaking -> :quiet
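Transitions are debounced by the configurable start_secs and stop_secs: speech must persist for start_secs before :quiet -> :speaking, and silence must persist for stop_secs before :speaking -> :quiet.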
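
The debounced state machine described above could be sketched roughly as follows. This is an illustrative Python sketch, not the processor's actual code: the `DebouncedVAD` class, its `update` method, and the use of plain strings for states and frame names are hypothetical. It assumes the VAD analyzer yields one boolean speech decision per audio frame, and that an interruption resets the pending timer.

```python
from dataclasses import dataclass

QUIET = "quiet"
SPEAKING = "speaking"


@dataclass
class DebouncedVAD:
    """Hypothetical two-state VAD debouncer (not the real processor)."""

    start_secs: float  # speech needed before quiet -> speaking
    stop_secs: float   # silence needed before speaking -> quiet
    state: str = QUIET
    pending_secs: float = 0.0  # time accumulated toward the other state

    def update(self, is_speech: bool, frame_secs: float):
        """Feed one frame's VAD decision; return a frame name or None."""
        # A transition is "wanted" when the frame contradicts the state.
        wants_transition = is_speech if self.state == QUIET else not is_speech
        if not wants_transition:
            # Assumption: any contradicting frame resets the timer.
            self.pending_secs = 0.0
            return None

        self.pending_secs += frame_secs
        threshold = self.start_secs if self.state == QUIET else self.stop_secs
        if self.pending_secs < threshold:
            return None

        # Threshold reached: flip the state and report the transition.
        self.pending_secs = 0.0
        if self.state == QUIET:
            self.state = SPEAKING
            return "UserStartedSpeakingFrame"
        self.state = QUIET
        return "UserStoppedSpeakingFrame"
```

Calling `update(is_speech, frame_secs)` once per audio frame returns a frame name only on the frame that completes a transition, so a short burst of noise or a brief pause shorter than the corresponding threshold produces no event.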