Processor that buffers TTS audio and sends it over a WebSocket in timed chunks. Emits BotStartedSpeakingFrame and BotStoppedSpeakingFrame to coordinate echo suppression.
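The buffering and pacing behaviour described above can be illustrated with a minimal sketch. It is not the processor's actual implementation: the class name `ChunkedAudioSender`, the `send` and `on_event` callbacks, and the event strings are all hypothetical stand-ins (the event strings loosely mirror BotStartedSpeakingFrame and BotStoppedSpeakingFrame), and the audio format (16 kHz, 16-bit mono) and 20 ms chunk size are assumptions chosen for illustration.

```python
import asyncio
from typing import Awaitable, Callable


class ChunkedAudioSender:
    """Hypothetical sketch: buffer raw PCM bytes and send them in paced chunks."""

    def __init__(
        self,
        send: Callable[[bytes], Awaitable[None]],    # assumed transport callback, e.g. a WebSocket send
        on_event: Callable[[str], Awaitable[None]],  # assumed hook for start/stop notifications
        sample_rate: int = 16000,                    # assumed: 16 kHz
        sample_width: int = 2,                       # assumed: 16-bit mono samples
        chunk_ms: int = 20,                          # assumed: 20 ms per chunk
    ) -> None:
        self._send = send
        self._on_event = on_event
        # Number of bytes that correspond to chunk_ms of audio.
        self._chunk_bytes = sample_rate * sample_width * chunk_ms // 1000
        self._chunk_secs = chunk_ms / 1000
        self._buffer = bytearray()
        self._speaking = False

    def push(self, audio: bytes) -> None:
        """Append TTS audio to the internal buffer."""
        self._buffer.extend(audio)

    async def drain(self) -> None:
        """Send buffered audio chunk by chunk, sleeping one chunk duration between sends."""
        while self._buffer:
            if not self._speaking:
                # First chunk of an utterance: signal that the bot started speaking.
                self._speaking = True
                await self._on_event("bot_started_speaking")
            chunk = bytes(self._buffer[: self._chunk_bytes])
            del self._buffer[: self._chunk_bytes]
            await self._send(chunk)
            # Pace the output so audio is not delivered faster than real time.
            await asyncio.sleep(self._chunk_secs)
        if self._speaking:
            # Buffer is empty: signal that the bot stopped speaking.
            self._speaking = False
            await self._on_event("bot_stopped_speaking")
```

In use, a caller would `push()` audio as it arrives from the TTS engine and `await drain()` to transmit it. A listener on `on_event` could mute or gate the microphone between the start and stop notifications, which is one plausible way to coordinate echo suppression.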