OpenJTalk (open_jtalk_elixir v0.3.0)
Use Open JTalk from Elixir. This package builds a local open_jtalk CLI and,
by default, bundles a UTF-8 dictionary and an HTS voice (you can disable this),
exposing convenient functions:
- OpenJTalk.say/2 - synthesize and play via a system audio player
- OpenJTalk.to_wav_file/2 - synthesize text to a WAV file
- OpenJTalk.to_wav_binary/2 - synthesize and return WAV bytes
- OpenJTalk.Wav.concat_binaries/1 - merge multiple WAV binaries (same format)
- OpenJTalk.Wav.concat_files/1 - merge multiple WAV files from paths (same format)
Install
Add the dependency to your mix.exs:
def deps do
  [
    {:open_jtalk_elixir, "~> 0.3"}
  ]
end
Then:
mix deps.get
mix compile
On first compile the project may download and build MeCab, HTS Engine API,
and Open JTalk. By default it also downloads and bundles a UTF-8 dictionary
and a Mei voice into priv/ (you can turn this off with
OPENJTALK_BUNDLE_ASSETS=0).
Build requirements
You’ll need common build tools: gcc/g++, make, curl, tar, unzip.
On macOS, Xcode Command Line Tools are sufficient.
Optional environment flags (honored by the Makefile):
- OPENJTALK_FULL_STATIC=1 - attempt a fully static open_jtalk (Linux only; requires static libstdc++)
- OPENJTALK_BUNDLE_ASSETS=0|1 - whether to bundle dictionary/voice into priv/
Tested platforms
Host builds (compile and run on the same machine):
- Linux x86_64
- Linux aarch64
- macOS 14 (arm64, Apple Silicon)
Cross-compile (host → target):
- Linux x86_64 → Nerves rpi4 (aarch64)
Quick start
# play via system audio player (aplay/paplay/afplay/play)
OpenJTalk.say("元氣ですかあ 、元氣が有れば、なんでもできる")
Options
All synthesis calls accept the same options (values are clamped):
- :timbre - voice color offset -0.8..0.8 (default 0.0)
- :pitch_shift - semitones -24..24 (default 0)
- :rate - speaking speed 0.5..2.0 (default 1.0)
- :gain - output gain in dB (default 0)
- :voice - path to a .htsvoice file (optional)
- :dictionary - path to a directory containing sys.dic (optional)
- :timeout - max runtime in ms (default 20_000)
- :out - output WAV path (only for to_wav_file/2)
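As a sketch of how these options combine, the module below passes several of them to OpenJTalk.to_wav_binary/2. OptionsDemo and its function names are hypothetical, chosen for illustration only; the example assumes the package is installed with its bundled dictionary and voice.

```elixir
defmodule OptionsDemo do
  # Hypothetical helper (not part of open_jtalk_elixir).

  # A slightly slower, lower voice. Values stay inside the documented ranges.
  def opts do
    [
      rate: 0.8,        # speaking speed; documented range 0.5..2.0
      pitch_shift: -3,  # semitones; documented range -24..24
      timbre: 0.1,      # voice color offset; documented range -0.8..0.8
      timeout: 10_000   # max runtime in ms
    ]
  end

  # Synthesizes `text` with the options above and reports the WAV size in bytes.
  def slow_and_low(text) when is_binary(text) do
    case OpenJTalk.to_wav_binary(text, opts()) do
      {:ok, wav} -> {:ok, byte_size(wav)}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Because the same keyword list is accepted by every synthesis call, keeping it in one place like this makes it easy to reuse with say/2 or to_wav_file/2.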
Concatenate WAVs
You can combine multiple WAVs (same format: channels/rate/bit depth/etc.) into one:
{:ok, a} = OpenJTalk.to_wav_binary("これは一つ目。")
{:ok, b} = OpenJTalk.to_wav_binary("これは二つ目。")
{:ok, c} = OpenJTalk.to_wav_binary("これは三つ目。")
{:ok, merged} = OpenJTalk.Wav.concat_binaries([a, b, c])
# or from files:
# {:ok, merged} = OpenJTalk.Wav.concat_files(["a.wav", "b.wav", "c.wav"])
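The same idea can be generalized to a list of sentences. The sketch below is one possible way to do it; ConcatDemo is a hypothetical module, and it assumes that concat_binaries/1 returns a non-{:ok, _} value on failure, which the source does not spell out.

```elixir
defmodule ConcatDemo do
  # Hypothetical helper (not part of open_jtalk_elixir).
  # Synthesizes each sentence separately, then merges the results into one WAV binary.
  def speak_all(sentences) when is_list(sentences) do
    with {:ok, wavs} <- synthesize_each(sentences),
         {:ok, merged} <- OpenJTalk.Wav.concat_binaries(wavs) do
      {:ok, merged}
    end
  end

  # Stops at the first synthesis error instead of raising.
  defp synthesize_each(sentences) do
    sentences
    |> Enum.reduce_while({:ok, []}, fn text, {:ok, acc} ->
      case OpenJTalk.to_wav_binary(text) do
        {:ok, wav} -> {:cont, {:ok, [wav | acc]}}
        {:error, reason} -> {:halt, {:error, reason}}
      end
    end)
    |> case do
      {:ok, reversed} -> {:ok, Enum.reverse(reversed)}
      error -> error
    end
  end
end
```

The merged binary can then be written with File.write!/2 or handed to OpenJTalk.play_wav_binary/2.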
Types
@type gain() :: number()
Output gain in dB. Typical useful range is about -20..20 (values are clamped).
@type info_entry() :: %{ path: String.t() | nil, source: :env | :bundled | :system | :none }
Entry describing a component path and where it came from.
@type info_map() :: %{ bin: info_entry(), dictionary: info_entry(), voice: info_entry(), audio_player: info_entry() }
Return type of info/0.
@type pitch_shift() :: -24..24
Pitch shift in semitones. Range: -24..24 (values are clamped).
@type playback_mode() :: :auto | :file | :stdin
Audio playback mode:
- :auto - prefer stdin when available; otherwise fall back to file playback
- :file - always use file-based playback
- :stdin - stream WAV bytes via stdin (diskless); falls back to file if unsupported
@type player_option() :: {:timeout, non_neg_integer()} | {:playback_mode, playback_mode()}
Options accepted by playback functions.
@type rate() :: float()
Speaking rate multiplier. Range: 0.5..2.0 (values are clamped).
@type say_option() :: player_option() | synth_option() | {:out, Path.t()}
Options accepted by say/2 (synth + playback + optional :out).
@type synth_option() :: {:timbre, timbre()} | {:pitch_shift, pitch_shift()} | {:rate, rate()} | {:gain, gain()} | {:voice, Path.t()} | {:dictionary, Path.t()} | {:timeout, non_neg_integer()}
Options accepted by synthesis functions.
@type timbre() :: float()
Voice color adjustment. Range: -0.8..0.8 (values are clamped).
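To illustrate what "values are clamped" means for these types, here is a minimal self-contained sketch. ClampSketch and its bounds table are hypothetical, written from the ranges documented above; the package's actual implementation may differ. Gain is left out because only a typical range is documented for it.

```elixir
defmodule ClampSketch do
  # Hypothetical illustration; not the package's implementation.
  # Bounds are copied from the documented ranges.
  @bounds %{
    timbre: {-0.8, 0.8},
    pitch_shift: {-24, 24},
    rate: {0.5, 2.0}
  }

  # Pulls an out-of-range value back to the nearest documented bound.
  def clamp(key, value) when is_number(value) do
    {lo, hi} = Map.fetch!(@bounds, key)
    value |> max(lo) |> min(hi)
  end
end

ClampSketch.clamp(:rate, 3.0)        # => 2.0
ClampSketch.clamp(:pitch_shift, -40) # => -24
ClampSketch.clamp(:timbre, 0.25)     # => 0.25
```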
Functions
@spec info() :: {:ok, info_map()}
Return useful information about the local Open JTalk setup.
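One way to inspect the result is to print one line per component. The sketch below relies only on the info_map() shape shown in the Types section; InfoReport is a hypothetical helper and the output format is arbitrary.

```elixir
defmodule InfoReport do
  # Hypothetical helper (not part of open_jtalk_elixir).
  # Turns an info map into one human-readable line per component.
  def lines(info) when is_map(info) do
    for key <- [:bin, :dictionary, :voice, :audio_player] do
      %{path: path, source: source} = Map.fetch!(info, key)
      "#{key}: #{path || "(not found)"} [#{source}]"
    end
  end
end

# Usage with the real package (assumes it is installed):
# {:ok, info} = OpenJTalk.info()
# info |> InfoReport.lines() |> Enum.each(&IO.puts/1)
```

A :source of :none together with a nil path is a quick hint that a dictionary, voice, or audio player could not be located.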
@spec play_wav_binary(iodata(), [player_option()]) :: :ok | {:error, term()}
Play RIFF/WAV bytes already in memory (no temp files).
Accepts the same :playback_mode and :timeout options as say/2.
Use playback_mode: :stdin for diskless playback when a stdin-capable player is available.
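Playing from memory is useful when the same audio is needed more than once. The following sketch synthesizes a phrase a single time and replays the bytes; ReplayDemo is a hypothetical module and the timeout value is an arbitrary choice.

```elixir
defmodule ReplayDemo do
  # Hypothetical helper (not part of open_jtalk_elixir).
  # Synthesizes once, then plays the same bytes `times` times without re-synthesizing.
  def say_repeatedly(text, times)
      when is_binary(text) and is_integer(times) and times > 0 do
    with {:ok, wav} <- OpenJTalk.to_wav_binary(text) do
      Enum.reduce_while(1..times, :ok, fn _n, :ok ->
        case OpenJTalk.play_wav_binary(wav, playback_mode: :stdin, timeout: 20_000) do
          :ok -> {:cont, :ok}
          {:error, reason} -> {:halt, {:error, reason}}
        end
      end)
    end
  end
end
```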
@spec play_wav_file(Path.t(), [player_option()]) :: :ok | {:error, term()}
Play a WAV from a file path. See play_wav_binary/2 for options.
@spec say(binary(), [say_option()]) :: :ok | {:error, term()}
Synthesize text with Open JTalk and play it.
:playback_mode controls how playback occurs:
- :auto (default) tries stdin first, then falls back to file playback.
@spec to_wav_binary(binary(), [synth_option()]) :: {:ok, binary()} | {:error, term()}
Synthesize text and return RIFF/WAV bytes.
@spec to_wav_file(binary(), [synth_option()]) :: {:ok, Path.t()} | {:error, term()}
Synthesize text to a WAV file.
Respects :out when provided; otherwise creates a unique path in the system temp dir.
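A small sketch of both behaviors follows: writing to an explicit path via :out, or letting the library pick a temp path. FileDemo is a hypothetical module. The source does not say whether temp files are cleaned up automatically, so callers may want to delete them when done.

```elixir
defmodule FileDemo do
  # Hypothetical helper (not part of open_jtalk_elixir).
  # Writes to `path` when given; otherwise lets the library choose a temp path.
  # Returns the path that was written together with the file size in bytes.
  def render(text, path \\ nil) when is_binary(text) do
    opts = if path, do: [out: path], else: []

    case OpenJTalk.to_wav_file(text, opts) do
      {:ok, written} -> {:ok, written, File.stat!(written).size}
      {:error, reason} -> {:error, reason}
    end
  end
end
```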
Validate options for synthesis and playback.
Allowed keys:
- Synthesis: :timbre, :pitch_shift, :rate, :gain, :voice, :dictionary, :timeout
- Playback: :playback_mode, :timeout
- Files: :out
Enforcement:
- Unknown keys raise ArgumentError
- :playback_mode must be one of :auto | :file | :stdin (if present)
- :timeout must be a non-negative integer (if present)
Returns the original opts on success.
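The rules above can be sketched in a few lines of plain Elixir. This is a hypothetical re-implementation for illustration (OptionCheckSketch and validate!/1 are made-up names), not the package's own code, and its error messages are invented.

```elixir
defmodule OptionCheckSketch do
  # Hypothetical re-implementation for illustration; not the package's code.
  @allowed [:timbre, :pitch_shift, :rate, :gain, :voice, :dictionary,
            :timeout, :playback_mode, :out]
  @modes [:auto, :file, :stdin]

  # Raises ArgumentError on the first invalid entry; returns `opts` unchanged otherwise.
  def validate!(opts) when is_list(opts) do
    Enum.each(opts, &check!/1)
    opts
  end

  defp check!({:playback_mode, mode}) when mode in @modes, do: :ok

  defp check!({:playback_mode, mode}),
    do: raise(ArgumentError, "invalid :playback_mode #{inspect(mode)}")

  defp check!({:timeout, ms}) when is_integer(ms) and ms >= 0, do: :ok

  defp check!({:timeout, ms}),
    do: raise(ArgumentError, "invalid :timeout #{inspect(ms)}")

  defp check!({key, _value}) when key in @allowed, do: :ok

  defp check!({key, _value}),
    do: raise(ArgumentError, "unknown option #{inspect(key)}")
end
```

Returning the original list on success lets a validator like this sit at the head of a pipeline, before the options reach synthesis or playback.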