ExAzureSpeech.TextToSpeech.Websocket (ex_azure_speech v0.2.2)

Module for handling the websocket connection to the Azure Text to Speech service.

The Text-to-Speech WebSocket flow works as follows:

  1. The client opens a WebSocket connection to the Azure Text to Speech service.
  2. The client sends an ExAzureSpeech.Common.Messages.SpeechConfigMessage carrying the basic configuration for the synthesis session.
  3. The client sends an ExAzureSpeech.TextToSpeech.Messages.SynthesisContextMessage describing the synthesis context.
  4. The client sends an ExAzureSpeech.TextToSpeech.Messages.SynthesisMessage to start the synthesis.
  5. The client receives audio metadata from the service, which can be processed by the asynchronous callbacks.
  6. The client receives audio data from the service in binary format.
  7. The client receives an ExAzureSpeech.TextToSpeech.Responses.AudioMetadata.session_end message when the synthesis ends.
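
The ordering of the three client messages in steps 2 to 4 can be sketched as a small pure function. This is only an illustration: the module name TtsFlowSketch and the atoms :speech_config, :synthesis_context and :synthesis are hypothetical labels, not part of ex_azure_speech.

```elixir
defmodule TtsFlowSketch do
  # Hypothetical labels for the client messages, in the order listed above.
  @client_steps [:speech_config, :synthesis_context, :synthesis]

  # Returns the ordered list of client messages.
  def client_steps, do: @client_steps

  # Given the messages already sent, returns the next one to send,
  # or :done once all three have gone out.
  def next_step(sent) when is_list(sent) do
    case @client_steps -- sent do
      [next | _] -> next
      [] -> :done
    end
  end
end
```

Calling TtsFlowSketch.next_step([]) yields the first message to send, and each further call with the updated list advances through the sequence until :done.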

Types

Callbacks for handling audio metadata.

viseme_callback: Executes every time Viseme metadata is received.
word_boundary_callback: Executes every time Word Boundary metadata is received.
sentence_boundary_callback: Executes every time Sentence Boundary metadata is received.
session_end_callback: Executes every time Session End metadata is received.
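
A minimal sketch of a callback set, assuming the callbacks are supplied as a keyword list of one-argument functions. That shape, and the payload passed to each function, are assumptions: the exact type is not shown in this section. Each callback here simply forwards the metadata to the calling process as a tagged message.

```elixir
# Assumption: callbacks are a keyword list of one-argument functions.
parent = self()

callbacks = [
  viseme_callback: fn viseme -> send(parent, {:viseme, viseme}) end,
  word_boundary_callback: fn boundary -> send(parent, {:word_boundary, boundary}) end,
  sentence_boundary_callback: fn boundary -> send(parent, {:sentence_boundary, boundary}) end,
  session_end_callback: fn info -> send(parent, {:session_end, info}) end
]
```

Forwarding to a process keeps the callbacks short and lets the receiving process handle the metadata at its own pace.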

@type expected_responses() ::
  :turn_start | :response | :audio_metadata | :audio | :turn_end

Expected websocket frame responses from the Azure Text-to-Speech Service.

turn_start: The start of a new synthesis turn.
response: Returns information about the stream; it carries nothing the client needs to act on.
audio_metadata: Returns metadata about the audio, such as boundaries and visemes.
audio: Returns the audio data in binary format.
turn_end: The end of a synthesis turn.
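
One way to consume these frame kinds is to accumulate the binary :audio chunks until :turn_end arrives and ignore the rest. The sketch below is not the library's implementation: the module FrameKindSketch, the function handle/3 and the {:cont, _} / {:halt, _} return shape are all hypothetical.

```elixir
defmodule FrameKindSketch do
  @type expected_responses() ::
          :turn_start | :response | :audio_metadata | :audio | :turn_end

  @spec handle(expected_responses(), term(), [binary()]) ::
          {:cont, [binary()]} | {:halt, binary()}
  # Audio chunks are prepended to the accumulator (reversed at the end).
  def handle(:audio, chunk, acc) when is_binary(chunk), do: {:cont, [chunk | acc]}

  # The end of the turn flushes the accumulated chunks as a single binary.
  def handle(:turn_end, _payload, acc) do
    {:halt, acc |> Enum.reverse() |> IO.iodata_to_binary()}
  end

  # The remaining frame kinds do not change the audio accumulator in this sketch.
  def handle(kind, _payload, acc) when kind in [:turn_start, :response, :audio_metadata] do
    {:cont, acc}
  end
end
```

In a real client the :audio_metadata clause would be the place to invoke the metadata callbacks rather than discard the payload.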

Functions

open_connection(opts, context, callbacks)

Opens a connection to the Azure Text to Speech service.

synthesize(pid, command, close_connection_callback)

Synthesises the given text using the Azure Text to Speech service.