open_jtalk_elixir
Use Open JTalk from Elixir. This package builds a local open_jtalk CLI and,
by default, bundles a UTF-8 dictionary and an HTS voice (you can disable this),
exposing three convenient APIs:
- OpenJTalk.to_wav/2: synthesize text to a WAV file
- OpenJTalk.to_binary/2: synthesize and return WAV bytes
- OpenJTalk.say/2: synthesize and play via a system audio player
Install
Add the dependency to your mix.exs:
def deps do
  [
    {:open_jtalk_elixir, "~> 0.2"}
  ]
end
Then:
mix deps.get
mix compile
On first compile the project may download and build MeCab, HTS Engine API,
and Open JTalk. By default it also downloads and bundles a UTF-8 dictionary
and a Mei voice into priv/ (you can turn this off with OPENJTALK_BUNDLE_ASSETS=0).
Build requirements
You'll need common build tools: gcc/g++, make, curl, tar, unzip.
On macOS, the Xcode Command Line Tools are sufficient.
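Before the first compile, it can help to confirm that the tools listed above are on $PATH. The following is a minimal sketch, not part of this package: the module name BuildToolCheck is hypothetical, and it only relies on System.find_executable/1 from the Elixir standard library.

```elixir
defmodule BuildToolCheck do
  # Tools named in the build requirements above.
  @tools ~w(gcc g++ make curl tar unzip)

  @doc "Returns the subset of `tools` that cannot be found on $PATH."
  def missing(tools \\ @tools) do
    # find_executable/1 returns a path (truthy) or nil, so found tools are rejected.
    Enum.reject(tools, &System.find_executable/1)
  end
end

case BuildToolCheck.missing() do
  [] -> IO.puts("All build tools found")
  missing -> IO.puts("Missing build tools: " <> Enum.join(missing, ", "))
end
```

Run it with `elixir check_tools.exs` (the file name is arbitrary). It only checks presence on $PATH, not versions.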
Optional environment flags (honored by the Makefile):
- OPENJTALK_FULL_STATIC=1: attempt a fully static open_jtalk (Linux only; requires static libstdc++)
- OPENJTALK_BUNDLE_ASSETS=0|1: whether to bundle dictionary/voice into priv/
Tested platforms
Host builds (compile and run on the same machine):
- Linux x86_64
- Linux aarch64
- macOS 14 (arm64, Apple Silicon)
Cross-compile (host → target):
- Linux x86_64 → Nerves rpi4 (aarch64)
Quick start
# play via system audio player (aplay/paplay/afplay/play)
OpenJTalk.say("元氣ですかあ 、元氣が有れば、なんでもできる")
Options
All synthesis calls accept the same options (values are clamped):
- :timbre: voice color offset, -0.8..0.8 (default 0.0)
- :pitch_shift: semitones, -24..24 (default 0)
- :rate: speaking speed, 0.5..2.0 (default 1.0)
- :gain: output gain in dB (default 0)
- :voice: path to a .htsvoice file (optional)
- :dictionary: path to a directory containing sys.dic (optional)
- :timeout: max runtime in ms (default 20_000)
- :out: output WAV path (only for to_wav/2)
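To illustrate what "values are clamped" means for the ranges listed above, here is a minimal sketch. It is not the library's implementation: the module OptionSketch is hypothetical, and because no range is documented for :gain, the sketch passes it through unchanged.

```elixir
defmodule OptionSketch do
  # Ranges copied from the option list above; :gain has no documented
  # range there, so it is passed through unchanged (an assumption).
  @ranges %{timbre: {-0.8, 0.8}, pitch_shift: {-24, 24}, rate: {0.5, 2.0}}

  @doc "Clamps known numeric options into their documented ranges."
  def clamp(opts) when is_list(opts) do
    Enum.map(opts, fn {key, value} ->
      case Map.fetch(@ranges, key) do
        {:ok, {lo, hi}} when is_number(value) -> {key, value |> max(lo) |> min(hi)}
        _ -> {key, value}
      end
    end)
  end
end

IO.inspect(OptionSketch.clamp(timbre: 2.0, pitch_shift: -30, rate: 1.2, gain: 3))
# prints [timbre: 0.8, pitch_shift: -24, rate: 1.2, gain: 3]
```

Since the library clamps on its own, callers do not need a helper like this; it only shows the effect of passing out-of-range values.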
How asset resolution works
The package resolves required assets in this order:
- Environment variable override
- Bundled asset in priv/
- System-installed location
CLI binary (open_jtalk)
- Env: OPENJTALK_CLI, the full path to open_jtalk.
- Bundled: priv/bin/open_jtalk (built during compile).
- System: open_jtalk found on $PATH.
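The lookup order for the CLI can be sketched as a first-match search over three candidates. This is an illustration under assumptions, not the code in OpenJTalk.Assets: the module CliLookupSketch and the priv_dir argument are hypothetical, and skipping an OPENJTALK_CLI value that does not point at an existing file is a choice made for this sketch.

```elixir
defmodule CliLookupSketch do
  @doc """
  Returns `{source, path}` for the first candidate that exists, or `nil`.
  `priv_dir` stands in for the package's priv/ directory.
  """
  def find_cli(priv_dir) do
    # Candidates are listed in the documented order: env, bundled, system.
    candidates = [
      {:env, System.get_env("OPENJTALK_CLI")},
      {:bundled, Path.join(priv_dir, "bin/open_jtalk")},
      {:system, System.find_executable("open_jtalk")}
    ]

    Enum.find(candidates, fn {_source, path} ->
      is_binary(path) and File.regular?(path)
    end)
  end
end

IO.inspect(CliLookupSketch.find_cli("priv"))
```

Returning the source alongside the path makes it easy to see which of the three locations was used.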
Dictionary (sys.dic)
- Env: OPENJTALK_DIC_DIR, a directory containing sys.dic.
- Bundled: priv/dic/sys.dic or any priv/dic/**/sys.dic (e.g. naist-jdic).
- System: common locations such as /var/lib/mecab/dic/open-jtalk/naist-jdic, /usr/lib/*/mecab/dic/open-jtalk/naist-jdic, etc.
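The bundled-dictionary rule (sys.dic directly under the root, or anywhere below it) maps naturally onto a glob search. A minimal sketch, assuming a hypothetical module DicLookupSketch and a root path without glob metacharacters; it uses only Path.wildcard/1 and File.regular?/1 from the standard library and is not the package's own lookup code.

```elixir
defmodule DicLookupSketch do
  @doc "Returns the first directory at or under `root` that contains sys.dic, or nil."
  def find_dic_dir(root) do
    # Check root/sys.dic first, then any nested match of root/**/sys.dic.
    [Path.join(root, "sys.dic") | Path.wildcard(Path.join(root, "**/sys.dic"))]
    |> Enum.find(&File.regular?/1)
    |> case do
      nil -> nil
      sys_dic -> Path.dirname(sys_dic)
    end
  end
end

IO.inspect(DicLookupSketch.find_dic_dir("priv/dic"))
```

The returned directory is the kind of value expected by OPENJTALK_DIC_DIR or the :dictionary option, since both take a directory containing sys.dic.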
Voice (.htsvoice)
- Env: OPENJTALK_VOICE, the path to a .htsvoice file.
- Bundled: first file matching priv/voices/**/*.htsvoice.
- System: standard locations like /usr/share/hts-voice/** or /usr/local/share/hts-voice/**.
If you change environment variables at runtime (or move files), refresh the cached paths:
OpenJTalk.Assets.reset_cache()
Using with Nerves
This library is Nerves-aware. When MIX_TARGET is set, the build defaults to:
- OPENJTALK_FULL_STATIC=1: try to statically link the CLI on Linux targets when possible
- OPENJTALK_BUNDLE_ASSETS=1: bundle CLI, dictionary, and voice into priv/
So for many projects no extra configuration is needed.
Quick Nerves flow
export MIX_TARGET=rpi4
mix deps.get
mix compile
mix firmware
On the device:
{:ok, info} = OpenJTalk.info()
# bundled assets should show up as :bundled
OpenJTalk.say("こんにちは")
Audio on Nerves
OpenJTalk.say/2 requires a system audio player. Most Nerves images use ALSA aplay.
If your image does not include a player:
- add one to the system image, or
- use OpenJTalk.to_wav/2 and play the WAV with your chosen mechanism.
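One way to check whether a player is available, and to play a WAV produced by OpenJTalk.to_wav/2 yourself, is sketched below. The module PlayerSketch is hypothetical; the player names come from the quick-start comment, and the sketch assumes each of them accepts the WAV path as its only argument.

```elixir
defmodule PlayerSketch do
  # Player names taken from the quick-start comment (aplay/paplay/afplay/play).
  @players ~w(aplay paplay afplay play)

  @doc "Returns the full path of the first player found on $PATH, or nil."
  def find_player(names \\ @players) do
    Enum.find_value(names, &System.find_executable/1)
  end

  @doc "Plays `wav_path` with the first available player."
  def play(wav_path, names \\ @players) do
    case find_player(names) do
      nil ->
        {:error, :no_player}

      player ->
        {_output, status} = System.cmd(player, [wav_path], stderr_to_stdout: true)
        if status == 0, do: :ok, else: {:error, {:exit_status, status}}
    end
  end
end

IO.inspect(PlayerSketch.find_player())
```

On a device, a nil result from find_player/0 suggests that the image has no supported player and one needs to be added.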
Firmware size notes
Bundling the full dictionary + voice + binary increases firmware size. Approximate (uncompressed) sizes:
- Dictionary (NAIST-JDIC): ~100–110 MB
- Mei voice: ~2.2 MB
- CLI binary: ~0.7 MB
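As a quick worked total of the figures above (a sketch that simply adds the approximate sizes; the variable names are illustrative):

```elixir
# Approximate uncompressed sizes in MB, taken from the list above.
{dic_low, dic_high} = {100.0, 110.0}
voice_mb = 2.2
cli_mb = 0.7

total_low = dic_low + voice_mb + cli_mb
total_high = dic_high + voice_mb + cli_mb

IO.puts(
  "Bundled assets add roughly #{Float.round(total_low, 1)} to " <>
    "#{Float.round(total_high, 1)} MB"
)
```

That is roughly 103 to 113 MB uncompressed, almost all of it from the dictionary.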
If that's too large, you can avoid bundling at compile time and provision assets
separately (rootfs overlay, /data, OTA, etc.):
MIX_TARGET=rpi4 OPENJTALK_BUNDLE_ASSETS=0 mix deps.compile open_jtalk_elixir
Then point the library to the provisioned assets (for example in config/runtime.exs):
System.put_env("OPENJTALK_CLI", "/data/open_jtalk/bin/open_jtalk")
System.put_env("OPENJTALK_DIC_DIR", "/data/open_jtalk/dic")
System.put_env("OPENJTALK_VOICE", "/data/open_jtalk/voices/mei_normal.htsvoice")
OpenJTalk.Assets.reset_cache()
How you provision those files into your image is outside the scope of this library.
Third-party components & licenses
This package does not redistribute third-party assets by default. At compile time it may download and build:
- Open JTalk 1.11: Modified BSD (BSD 3-Clause)
  Source: http://open-jtalk.sourceforge.net/
- HTS Engine API 1.10: Modified BSD (BSD 3-Clause)
  Source: http://hts-engine.sourceforge.net/
- MeCab 0.996: tri-licensed (GPL / LGPL / BSD); this project uses the BSD terms
  Source: https://taku910.github.io/mecab/
- Open JTalk Dictionary (NAIST-JDIC UTF-8) 1.11: BSD-style by NAIST
  Source: https://sourceforge.net/projects/open-jtalk/files/Dictionary/
- HTS Voice “Mei” (MMDAgent_Example 1.8): CC BY 3.0
  Source: https://sourceforge.net/projects/mmdagent/files/MMDAgent_Example/
  Attribution: “HTS Voice ‘Mei’ © Nagoya Institute of Technology, licensed CC BY 3.0.”