VoxCPMEx.Server (voxcpmex v0.3.0)


GenServer that manages a Python VoxCPM2 bridge via Erlang Port.

Protocol (v2.1 — MessagePack binary framing, single-ref streaming)

Frame format: [4-byte BE total_length][msgpack-encoded payload]

Audio is raw WAV bytes inside msgpack — no base64.
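As a sketch of the framing, assuming total_length counts only the msgpack payload (not the 4-byte header itself), the prefix can be built and parsed with plain binary pattern matching. The placeholder binary below stands in for a real msgpack encoding (e.g. one produced by a library such as Msgpax):

```elixir
# Encode: prefix the payload with its byte size as a 4-byte big-endian integer.
# `payload` is a placeholder for the msgpack-encoded request.
encode_frame = fn payload when is_binary(payload) ->
  <<byte_size(payload)::unsigned-big-integer-size(32), payload::binary>>
end

# Decode: split off one complete frame, returning the payload and any
# trailing bytes; return :more when the buffer does not yet hold a full frame.
decode_frame = fn
  <<len::unsigned-big-integer-size(32), payload::binary-size(len), rest::binary>> ->
    {:ok, payload, rest}

  _incomplete ->
    :more
end

frame = encode_frame.(<<1, 2, 3>>)
{:ok, <<1, 2, 3>>, <<>>} = decode_frame.(frame)
```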

Streaming: unlike v2.0, which maintained a dual stream_id/ref mapping, v2.1 eliminates stream IDs entirely. The Elixir ref sent in the request is the sole stream identifier; Python simply echoes it back on stream_start, chunk, and end messages, delivered in order.

Summary

Functions

await_ready(server, timeout \\ 120_000)

Waits for the model to finish loading.

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

generate_streaming_async(server, text, opts \\ [])

Starts async streaming. Returns {:ok, ref} immediately.

info(server)

Returns runtime model info: device, sample_rate, status.

stop(server)

Gracefully stops the GenServer and the Python bridge process.

Types

generate_opt()

@type generate_opt() ::
  {:audio_prompt, String.t()}
  | {:prompt_wav_path, String.t()}
  | {:prompt_text, String.t()}
  | {:cfg_value, float()}
  | {:inference_timesteps, pos_integer()}
  | {:min_len, pos_integer()}
  | {:max_len, pos_integer()}
  | {:normalize, boolean()}
  | {:denoise, boolean()}

model_option()

@type model_option() ::
  {:model, String.t()}
  | {:device, String.t()}
  | {:load_denoiser, boolean()}
  | {:optimize, boolean()}
  | {:name, atom()}

start_opts()

@type start_opts() :: [model_option()]

Functions

await_ready(server, timeout \\ 120_000)

@spec await_ready(GenServer.server(), timeout()) :: :ok | {:error, term()}

Waits for the model to finish loading.

Returns :ok when ready, {:error, :loading} if still initializing, or {:error, reason} if initialization failed.
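A typical startup sequence might look like the following sketch. The BootExample module name is hypothetical, and the options passed to start_link/1 are whichever model_option() values your deployment needs:

```elixir
defmodule BootExample do
  # Start the server, then block (up to two minutes) until the Python
  # bridge reports that the model has finished loading.
  def start_and_wait(opts \\ []) do
    {:ok, pid} = VoxCPMEx.Server.start_link(opts)

    case VoxCPMEx.Server.await_ready(pid, 120_000) do
      :ok -> {:ok, pid}
      {:error, reason} -> {:error, reason}
    end
  end
end
```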

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

collect_stream(server, ref)

@spec collect_stream(GenServer.server(), reference()) ::
  {:ok, binary()} | {:error, term()}

Blocks until the stream identified by ref finishes, returning the accumulated audio as a single WAV binary.

generate(server, text, opts \\ [])

@spec generate(GenServer.server(), String.t(), [generate_opt()]) ::
  {:ok, binary()} | {:error, term()}

Generates speech for text synchronously, returning the audio as a WAV binary.

generate(server, text, opts, timeout)

@spec generate(GenServer.server(), String.t(), [generate_opt()], timeout()) ::
  {:ok, binary()} | {:error, term()}
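Combining generate/3 with save/2, a minimal sketch; the GenerateExample module name is hypothetical and the cfg_value shown is illustrative, not a documented default:

```elixir
defmodule GenerateExample do
  # Synchronous generation: generate/3 returns the complete WAV binary,
  # which save/2 then writes to disk.
  def synth_to_file(server, text, path) do
    with {:ok, wav} <- VoxCPMEx.Server.generate(server, text, cfg_value: 2.0),
         :ok <- VoxCPMEx.Server.save(wav, path) do
      :ok
    end
  end
end
```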

generate_streaming_async(server, text, opts \\ [])

@spec generate_streaming_async(GenServer.server(), String.t(), [generate_opt()]) ::
  {:ok, reference()} | {:error, term()}

Starts async streaming. Returns {:ok, ref} immediately.

Poll with next_chunk/2: {:ok, chunk} (raw float32 PCM) | :eos | {:error, reason}. Collect everything with collect_stream/2: {:ok, wav_binary}.
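The poll loop can be sketched as follows (StreamExample is a hypothetical module name):

```elixir
defmodule StreamExample do
  # Start an async stream, then poll next_chunk/2 until :eos, accumulating
  # the raw float32 PCM chunks as iodata.
  def run(server, text, opts \\ []) do
    {:ok, ref} = VoxCPMEx.Server.generate_streaming_async(server, text, opts)
    loop(server, ref, [])
  end

  defp loop(server, ref, acc) do
    case VoxCPMEx.Server.next_chunk(server, ref) do
      {:ok, chunk} -> loop(server, ref, [acc, chunk])
      :eos -> {:ok, IO.iodata_to_binary(acc)}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

If you only need the finished audio, collect_stream/2 returns the whole stream as {:ok, wav_binary} instead.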

info(server)

@spec info(GenServer.server()) :: map()

Returns runtime model info: device, sample_rate, status.

load_lora(server, lora_path)

@spec load_lora(GenServer.server(), String.t()) ::
  {:ok, non_neg_integer(), non_neg_integer()} | {:error, term()}

Loads a LoRA adapter from the file at lora_path.

next_chunk(server, ref)

@spec next_chunk(GenServer.server(), reference()) ::
  {:ok, binary()} | :eos | {:error, term()}

Returns the next chunk of raw float32 PCM audio for the stream ref, :eos once the stream has ended, or {:error, reason}.

save(audio, path)

@spec save(binary(), Path.t()) :: :ok | {:error, term()}

Writes a WAV binary, as returned by generate/3 or collect_stream/2, to path.

start_link(opts \\ [])

@spec start_link(start_opts()) :: GenServer.on_start()

Starts the server and the Python bridge; use await_ready/2 to wait for the model to finish loading.

stop(server)

@spec stop(GenServer.server()) :: :ok

Gracefully stops the GenServer and the Python bridge process.

unload_lora(server)

@spec unload_lora(GenServer.server()) :: :ok | {:error, term()}

Unloads the currently loaded LoRA adapter.