OpenJTalk (open_jtalk_elixir v0.2.0)
Use Open JTalk from Elixir. This package builds a local open_jtalk CLI and, by default, bundles a UTF-8 dictionary and an HTS voice (you can disable this). It exposes three convenient APIs:
- OpenJTalk.to_wav/2: synthesize text to a WAV file
- OpenJTalk.to_binary/2: synthesize and return WAV bytes
- OpenJTalk.say/2: synthesize and play via a system audio player
Install
Add the dependency to your mix.exs:
def deps do
  [
    {:open_jtalk_elixir, "~> 0.2"}
  ]
end
Then:
mix deps.get
mix compile
On first compile the project may download and build MeCab, HTS Engine API,
and Open JTalk. By default it also downloads and bundles a UTF-8 dictionary
and a Mei voice into priv/ (you can turn this off with OPENJTALK_BUNDLE_ASSETS=0).
Build requirements
You’ll need common build tools: gcc/g++, make, curl, tar, unzip.
On macOS, the Xcode Command Line Tools are sufficient.
Optional environment flags (honored by the Makefile):
- OPENJTALK_FULL_STATIC=1: attempt a fully static open_jtalk (Linux only; requires static libstdc++)
- OPENJTALK_BUNDLE_ASSETS=0|1: whether to bundle dictionary/voice into priv/
Tested platforms
Host builds (compile and run on the same machine):
- Linux x86_64
- Linux aarch64
- macOS 14 (arm64, Apple Silicon)
Cross-compile (host → target):
- Linux x86_64 → Nerves rpi4 (aarch64)
Quick start
# play via system audio player (aplay/paplay/afplay/play)
OpenJTalk.say("元氣ですかあ 、元氣が有れば、なんでもできる")
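The other two entry points can be used in a similar way. The sketch below is illustrative only: QuickStartSketch and its functions are hypothetical names, and the `{:ok, binary}` return shape of to_binary/2 is an assumption that this page does not confirm.

```elixir
defmodule QuickStartSketch do
  # Hypothetical helper module, not part of open_jtalk_elixir.

  # Synthesize `text` to WAV bytes and write them to `path` ourselves.
  # ASSUMPTION: to_binary/2 returns {:ok, wav_binary} on success.
  def save_bytes(text, path) do
    case OpenJTalk.to_binary(text, []) do
      {:ok, wav} when is_binary(wav) ->
        File.write!(path, wav)
        {:ok, path}

      {:error, _reason} = err ->
        err

      other ->
        {:error, {:unexpected, other}}
    end
  end

  # Let the library write the file via the documented :out option.
  # The result is passed through untouched, since its shape is not shown here.
  def save_file(text, path) do
    OpenJTalk.to_wav(text, out: path)
  end
end
```

Something like `QuickStartSketch.save_file("こんにちは", "hello.wav")` would then write the file at the given path, provided the package compiled successfully.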
Options
All synthesis calls accept the same options (values are clamped):
- :timbre: voice color offset, -0.8..0.8 (default 0.0)
- :pitch_shift: semitones, -24..24 (default 0)
- :rate: speaking speed, 0.5..2.0 (default 1.0)
- :gain: output gain in dB (default 0)
- :voice: path to a .htsvoice file (optional)
- :dictionary: path to a directory containing sys.dic (optional)
- :timeout: max runtime in ms (default 20_000)
- :out: output WAV path (only for to_wav/2)
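As a rough illustration, the options above can be collected in an ordinary keyword list and passed as the second argument. OptionsSketch, slow_and_low/0, and to_file/2 are hypothetical names chosen for this sketch; only the option keys come from the list above.

```elixir
defmodule OptionsSketch do
  # Hypothetical helper module, not part of open_jtalk_elixir.

  # A keyword list that uses only the documented option names.
  def slow_and_low do
    [
      # slightly slower than the default 1.0
      rate: 0.8,
      # three semitones down (integer within -24..24)
      pitch_shift: -3,
      # small voice color offset within -0.8..0.8
      timbre: 0.1,
      # output gain in dB
      gain: 2,
      # allow up to 30 seconds instead of the default 20_000 ms
      timeout: 30_000
    ]
  end

  # :out is only meaningful for to_wav/2, so it is added here.
  def to_file(text, path) do
    OpenJTalk.to_wav(text, Keyword.put(slow_and_low(), :out, path))
  end
end
```

Keeping the shared settings in one function makes it easy to reuse them across say/2, to_binary/2, and to_wav/2, adding :out only where it applies.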
Types
@type gain() :: number()
Output gain in dB. Typical useful range is about -20..20 (values are clamped).
@type opt() ::
        {:timbre, timbre()}
        | {:pitch_shift, pitch_shift()}
        | {:rate, rate()}
        | {:gain, gain()}
        | {:voice, Path.t()}
        | {:dictionary, Path.t()}
        | {:timeout, non_neg_integer()}
        | {:out, Path.t()}
@type opts() :: [opt()]
@type pitch_shift() :: -24..24
Pitch shift in semitones. Range: -24..24 (values are clamped).
@type rate() :: float()
Speaking rate multiplier. Range: 0.5..2.0 (values are clamped).
@type timbre() :: float()
Voice color adjustment. Range: -0.8..0.8 (values are clamped).
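To make "values are clamped" concrete, here is a minimal sketch of what clamping to the documented ranges means. ClampSketch is a hypothetical module written for illustration; the library's actual clamping code is not shown on this page and may differ.

```elixir
defmodule ClampSketch do
  # Hypothetical illustration, not the library's implementation.

  # Force `value` into the closed interval [lo, hi].
  def clamp(value, lo, hi) when lo <= hi do
    value |> max(lo) |> min(hi)
  end

  # Ranges taken from the type descriptions above.
  def rate(value), do: clamp(value, 0.5, 2.0)
  def pitch_shift(value), do: clamp(value, -24, 24)
  def timbre(value), do: clamp(value, -0.8, 0.8)
end

ClampSketch.rate(3.0)        # => 2.0
ClampSketch.pitch_shift(30)  # => 24
ClampSketch.timbre(-1.0)     # => -0.8
```

Under this reading, an out-of-range option does not raise; it is replaced by the nearest bound of its range.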
Functions
Return useful information about the local Open JTalk setup.
Synthesize text and play it via a system audio player.
Synthesize text and return a WAV as a binary.
Synthesize text to a WAV file.