NxAudio.Transforms.MelSpectrogramConfig (nx_audio v0.2.0)

Configuration options for mel spectrogram transformation.

Summary

Types

t()
  • :sample_rate (non_neg_integer/0) - Sample rate of audio signal. The default value is 16000.

Functions

validate(config)

Parses and validates a keyword list into a valid mel spectrogram config.

Types

t()

@type t() :: [
  sample_rate: non_neg_integer(),
  n_fft: non_neg_integer(),
  win_length: non_neg_integer(),
  hop_length: non_neg_integer(),
  f_min: float(),
  f_max: float(),
  pad: non_neg_integer(),
  n_mels: non_neg_integer(),
  window_fn: term(),
  power: float(),
  normalized: term(),
  wkwargs: keyword(),
  center: boolean(),
  pad_mode: atom(),
  onesided: boolean(),
  norm: term(),
  mel_scale: term()
]
  • :sample_rate (non_neg_integer/0) - Sample rate of audio signal. The default value is 16000.

  • :n_fft (non_neg_integer/0) - Size of the FFT; produces div(n_fft, 2) + 1 frequency bins. The default value is 400.

  • :win_length (non_neg_integer/0) - Number of samples in each frame. Defaults to n_fft.

  • :hop_length (non_neg_integer/0) - Number of samples between successive frames. Defaults to half of win_length.

  • :f_min (float/0) - Minimum frequency. The default value is 0.0.

  • :f_max (float/0) - Maximum frequency. The default value is nil.

  • :pad (non_neg_integer/0) - Two sided padding of signal. The default value is 0.

  • :n_mels (non_neg_integer/0) - Number of mel filterbanks. The default value is 128.

  • :window_fn (term/0) - Window function to apply to each frame. The default value is &NxAudio.Commons.Windows.haan/1.

  • :power (float/0) - Exponent applied to the magnitude spectrogram; must be greater than 0 (e.g., 1 for magnitude, 2 for power). If nil, the complex spectrum is returned instead. The default value is 2.

  • :normalized - Whether to normalize by magnitude after the STFT. The choices are "window" and "frame_length" when a specific normalization type is desired.

  • :wkwargs (keyword/0) - Arguments for window function. The default value is [].

  • :center (boolean/0) - Whether to pad waveform on both sides so that the t-th frame is centered at time t * hop_length. The default value is true.

  • :pad_mode (atom/0) - Controls the padding method used when center is true. The default value is :reflect.

  • :onesided (boolean/0) - Controls whether to return half of results to avoid redundancy. The default value is true.

  • :norm - If “slaney”, divide the triangular mel weights by the width of the mel band (area normalization).

  • :mel_scale - Mel scale to use. The default value is :htk.
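
The defaults listed above depend on one another: win_length falls back to n_fft, hop_length falls back to half of win_length, and the number of frequency bins follows from n_fft. The sketch below illustrates that resolution in plain Elixir. MelConfigSketch, resolve/1, and n_freq_bins/1 are hypothetical names that are not part of nx_audio, and the library may resolve its defaults differently.

```elixir
defmodule MelConfigSketch do
  # Hypothetical helper for illustration only; not part of nx_audio.
  # Default values are copied from the option list documented above
  # (:power is written as 2.0 because its documented type is float).
  @defaults [
    sample_rate: 16_000,
    n_fft: 400,
    f_min: 0.0,
    pad: 0,
    n_mels: 128,
    power: 2.0,
    center: true,
    pad_mode: :reflect,
    onesided: true,
    mel_scale: :htk
  ]

  # Merges user options over the defaults, then fills in the two
  # options whose defaults are derived from other options.
  def resolve(opts) do
    config = Keyword.merge(@defaults, opts)
    n_fft = Keyword.fetch!(config, :n_fft)
    win_length = Keyword.get(config, :win_length) || n_fft
    hop_length = Keyword.get(config, :hop_length) || div(win_length, 2)

    config
    |> Keyword.put(:win_length, win_length)
    |> Keyword.put(:hop_length, hop_length)
  end

  # Number of frequency bins produced by an FFT of size n_fft.
  def n_freq_bins(config), do: div(Keyword.fetch!(config, :n_fft), 2) + 1
end

config = MelConfigSketch.resolve(n_fft: 512, n_mels: 80)

# win_length resolves to 512 and hop_length to 256.
IO.inspect(Keyword.take(config, [:n_fft, :win_length, :hop_length, :n_mels]))

# div(512, 2) + 1 = 257 frequency bins.
IO.inspect(MelConfigSketch.n_freq_bins(config))
```

With the documented default n_fft of 400, the same arithmetic gives 201 frequency bins and a hop_length of 200 samples.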

Functions

validate(config)

Parses and validates a keyword list into a valid mel spectrogram config.
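
A minimal usage sketch, assuming nx_audio is already a dependency of the project. This page does not document the return shape of validate/1, so the example inspects the result instead of pattern-matching on it. The option values are illustrative only.

```elixir
# Assumes :nx_audio is available in the current project.
alias NxAudio.Transforms.MelSpectrogramConfig

# Options not listed here are expected to fall back to the documented defaults.
opts = [sample_rate: 16_000, n_fft: 512, hop_length: 256, n_mels: 80]

# The return value is inspected as-is because its exact shape
# (bare keyword list or tagged tuple) is an unknown here.
opts
|> MelSpectrogramConfig.validate()
|> IO.inspect(label: "validated config")
```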