Modules
Feline — Real-time voice AI pipelines for Elixir.
Behaviour for resampling PCM16 audio between sample rates.
Linear interpolation resampler for PCM16 (16-bit signed little-endian) audio.
Pure Elixir audio utilities for PCM16 (16-bit signed little-endian) audio.
Simple energy-based Voice Activity Detection. Computes RMS of audio chunk and compares to threshold.
Monotonic clock wrapper for time measurements.
LLM conversation context. Holds the messages list, available tools, and tool choice configuration.
Frame classification and introspection.
Behaviour for observing frame processing in a pipeline.
Struct holding a list of processor specifications for a pipeline.
A processor that runs N pipeline branches concurrently.
Top-level entry point for running a pipeline. Starts a Pipeline.Task, waits for completion, and handles graceful shutdown.
Exit point of a pipeline. Downstream frames are forwarded to the pipeline task. Upstream frames pass through to the last user processor.
Entry point of a pipeline. Downstream frames pass through to the first processor. Upstream frames are forwarded to the pipeline task.
Orchestrates pipeline execution. Starts processors under a DynamicSupervisor, links them in order, sends StartFrame, and manages the lifecycle.
Behaviour for pipeline processors.
GenServer that wraps a Feline.Processor callback module.
Accumulates LLM response text into the conversation context. Collects text between LLMFullResponseStartFrame and LLMFullResponseEndFrame.
Processor that plays TTS audio in real-time via sox play.
Logs bot LLM responses to the console, streaming token by token. Place after AssistantContextAggregator.
Logs user transcriptions and speaking state to the console. Place before UserContextAggregator (which absorbs TranscriptionFrame).
Creates a shared context between UserContextAggregator and AssistantContextAggregator. Both aggregators read/write through the same Agent, keeping conversation history in sync.
Intercepts FunctionCallInProgressFrame, executes the registered function, and pushes FunctionCallResultFrame downstream.
Calculates Time-To-First-Byte (TTFB) metrics for LLM and TTS.
Buffers LLM token stream and emits complete sentences.
Pluggable turn management processor. Delegates turn detection to a configurable strategy module that implements the Feline.Processors.TurnManager.Strategy behaviour.
Push-to-talk turn detection strategy. User explicitly signals start/stop of speech via InputTransportMessageFrame payloads.
Behaviour for pluggable turn detection strategies.
VAD-based turn detection strategy. Extracts the energy-based voice activity detection logic from VADProcessor into a pluggable strategy.
Accumulates user transcriptions into the LLM context. When the user stops speaking (UserStoppedSpeakingFrame), pushes an LLMContextFrame to trigger LLM processing.
Voice Activity Detection processor. Analyzes InputAudioRawFrame using a VAD analyzer and emits UserStartedSpeakingFrame / UserStoppedSpeakingFrame based on a state machine.
Deepgram speech-to-text service. Streams audio to Deepgram's WebSocket API and produces TranscriptionFrame results.
Deepgram streaming speech-to-text via WebSocket. Sends audio chunks as binary frames and receives JSON transcription results in real time.
ElevenLabs streaming text-to-speech via WebSocket. Sends text chunks and receives audio data in real time, pushing TTSAudioRawFrame as chunks arrive.
ElevenLabs text-to-speech service. Sends text to ElevenLabs API and returns audio as TTSAudioRawFrame.
Behaviour for LLM services. Processes LLM context frames and produces text output frames (LLMTextFrame, LLMFullResponseStartFrame, etc.).
OpenAI LLM service. Sends conversation context to the OpenAI Chat Completions API and streams back LLMTextFrame responses.
OpenAI LLM service with streaming SSE support. Spawns a task per request that pushes LLMTextFrame tokens as they arrive. Supports interruption by killing the in-flight task.
Behaviour for speech-to-text services. Processes audio frames and produces transcription frames.
Behaviour for text-to-speech services. Processes text frames and produces audio output frames.
Behaviour for transports. A transport provides input and output processor specs that can be included in a pipeline.
Configuration for WebSocket transport audio settings and session timeout.
Base input transport processor. Passes through all frames.
Base output transport processor. Receives OutputAudioRawFrame and TTSAudioRawFrame and forwards the audio data to a callback function or registered process.
WebSock handler that receives binary audio and text messages from clients and queues them as frames into the pipeline.
Processor that receives WebSocket messages via handle_info and converts them to InputAudioRawFrame or InputTransportMessageFrame.
Processor that buffers TTS audio and sends it in timed chunks over WebSocket. Emits BotStartedSpeakingFrame / BotStoppedSpeakingFrame to coordinate echo suppression.
Plug that upgrades matching HTTP requests to WebSocket connections.
GenServer managing the Bandit WebSocket server and routing frames between WebSocket clients and the active pipeline.
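The linear-interpolation resampler above converts PCM16 audio between sample rates by interpolating between adjacent samples. The following is a minimal sketch of that technique, not the library's implementation; the module and function names are hypothetical, and it works on a list of sample integers rather than a binary for clarity.

```elixir
defmodule ResampleSketch do
  # Resample a list of PCM16 sample values from from_rate to to_rate
  # using linear interpolation between neighboring source samples.
  def resample(samples, from_rate, to_rate) do
    ratio = from_rate / to_rate
    out_len = trunc(length(samples) * to_rate / from_rate)
    src = List.to_tuple(samples)
    n = tuple_size(src)

    for i <- 0..(out_len - 1) do
      # Fractional position in the source signal for output sample i.
      pos = i * ratio
      idx = trunc(pos)
      frac = pos - idx
      # Clamp indices so the last output samples reuse the final source sample.
      a = elem(src, min(idx, n - 1))
      b = elem(src, min(idx + 1, n - 1))
      round(a + (b - a) * frac)
    end
  end
end
```

Upsampling [0, 100] from 8 kHz to 16 kHz yields [0, 50, 100, 100]: the new in-between samples sit halfway along the line connecting their neighbors.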
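The energy-based VAD described above computes the RMS of each audio chunk and compares it to a threshold. Here is a minimal sketch of that idea, assuming PCM16 signed little-endian input; the module name and threshold value are illustrative, not Feline's actual API.

```elixir
defmodule VadSketch do
  # Root-mean-square energy of a PCM16 (signed 16-bit little-endian) binary.
  def rms(<<>>), do: 0.0

  def rms(pcm16) do
    # Decode the binary into signed 16-bit sample values.
    samples = for <<sample::signed-little-16 <- pcm16>>, do: sample
    sum_sq = Enum.reduce(samples, 0, fn s, acc -> acc + s * s end)
    :math.sqrt(sum_sq / length(samples))
  end

  # Voice activity is flagged when chunk energy exceeds the threshold.
  def speech?(pcm16, threshold), do: rms(pcm16) > threshold
end
```

A real detector would smooth decisions over several chunks (as the VADProcessor's state machine does) rather than toggling on a single frame.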
Mix Tasks
Live voice agent — speak into your microphone and hear the agent respond.