ExGoogleSTT.TranscriptionServer (Google Speech gRPC API v0.5.1)

A server to handle transcription requests.

Summary

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

process_audio(transcription_server_pid, audio_data)

The main entrypoint for processing audio. It starts a stream if one is not already running, sends the config if it has not been sent yet, and then sends the audio to the stream.

start_link(opts \\ [])

Starts a transcription server. The basic usage is to start the server with the config you want; the config is kept in the server's state and used for subsequent audio requests.

Functions

cancel_stream(transcription_server_pid)

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

chunk_every(binary, chunk_size)

chunk_every_rem(binary, chunk_size)

@spec chunk_every_rem(binary(), chunk_size :: pos_integer()) ::
  {[binary()], remainder :: binary()}
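
A hypothetical illustration of the spec above, assuming chunk_every_rem/2 splits the binary into chunk_size-byte chunks and returns the trailing bytes as the remainder (the values are made up, not taken from the library's tests):

iex> TranscriptionServer.chunk_every_rem(<<1, 2, 3, 4, 5>>, 2)
{[<<1, 2>>, <<3, 4>>], <<5>>}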

end_stream(transcription_server_pid)

process_audio(transcription_server_pid, audio_data)

@spec process_audio(pid(), binary()) :: :ok

The main entrypoint for processing audio. It starts a stream if one is not already running, sends the config if it has not been sent yet, and then sends the audio to the stream.
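
A minimal usage sketch, assuming the audio is read from a file (the path is a placeholder) and that end_stream/1 is how the caller signals that no more audio will follow:

{:ok, server} = TranscriptionServer.start_link()

# Placeholder path; any source of audio bytes works the same way.
audio = File.read!("path/to/audio.raw")

# The first call starts the stream and sends the config, then sends the audio.
:ok = TranscriptionServer.process_audio(server, audio)

# Presumably closes the stream once all audio has been sent.
TranscriptionServer.end_stream(server)

Results are delivered as messages to the target pid (self() by default); their exact shape is not shown on this page.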

start_link(opts \\ [])

Starts a transcription server. The basic usage is to start the server with the config you want; the config is kept in the server's state and used for subsequent audio requests.

Examples

iex> TranscriptionServer.start_link()
{:ok, #PID<0.123.0>}

Options

All of these options are optional; the recognizer should be the main point of configuration. A usage sketch follows the list below.

  • target - a pid to send the results to, defaults to self()
  • language_codes - a list of language codes to use for recognition, defaults to ["en-US"]
  • enable_automatic_punctuation - a boolean to enable automatic punctuation, defaults to true
  • interim_results - a boolean to enable interim results, defaults to false
  • recognizer - a string representing the recognizer to use, defaults to use the recognizer from the config
  • model - a string representing the model to use, defaults to "latest_long". Be careful: changing it to "short" may have unintended consequences
  • split_by_chunk - a boolean to control whether the audio is split into chunks, defaults to true. Used to avoid hitting the Google STT limit
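
A sketch of starting the server with some of these options and observing what comes back. The option values and the audio path are illustrative, and the exact shape of the result messages is not documented on this page, so the receive below matches loosely:

{:ok, server} =
  TranscriptionServer.start_link(
    target: self(),
    language_codes: ["en-US"],
    interim_results: true,
    model: "latest_long"
  )

# Placeholder audio binary.
audio = File.read!("path/to/audio.raw")
:ok = TranscriptionServer.process_audio(server, audio)

# Results arrive as messages to the `target` pid; inspect them to see their shape.
receive do
  message -> IO.inspect(message, label: "transcription result")
end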