NxAudio.Transforms.MelSpectrogramConfig (nx_audio v0.2.0)
Configuration options for mel spectrogram transformation.
Summary
Types
:sample_rate (non_neg_integer/0) - Sample rate of audio signal. The default value is 16000.
Functions
Parses and validates a keyword list into a valid mel spectrogram config.
Types
@type t() :: [
  sample_rate: non_neg_integer(),
  n_fft: non_neg_integer(),
  win_length: non_neg_integer(),
  hop_length: non_neg_integer(),
  f_min: float(),
  f_max: float(),
  pad: non_neg_integer(),
  n_mels: non_neg_integer(),
  window_fn: term(),
  power: float(),
  normalized: term(),
  wkwargs: keyword(),
  center: boolean(),
  pad_mode: atom(),
  onesided: boolean(),
  norm: term(),
  mel_scale: term()
]
:sample_rate (non_neg_integer/0) - Sample rate of audio signal. The default value is 16000.
:n_fft (non_neg_integer/0) - Size of FFT, creates n_fft // 2 + 1 bins. The default value is 400.
:win_length (non_neg_integer/0) - Number of samples in each frame. Defaults to n_fft.
:hop_length (non_neg_integer/0) - Number of samples between successive frames. Defaults to win_length / 2.
:f_min (float/0) - Minimum frequency. The default value is 0.0.
:f_max (float/0) - Maximum frequency. The default value is nil.
:pad (non_neg_integer/0) - Two-sided padding of the signal. The default value is 0.
:n_mels (non_neg_integer/0) - Number of mel filterbanks. The default value is 128.
:window_fn (term/0) - Window function to apply to each frame. The default value is &NxAudio.Commons.Windows.haan/1.
:power (float/0) - Exponent for the magnitude spectrogram (must be > 0), e.g. 1 for magnitude, 2 for power. If nil, the complex spectrum is returned instead. The default value is 2.
:normalized - Whether to normalize by magnitude after the STFT. Choose "window" or "frame_length" if a specific normalization type is desired.
:wkwargs (keyword/0) - Arguments for the window function. The default value is [].
:center (boolean/0) - Whether to pad the waveform on both sides so that the t-th frame is centered at time t * hop_length. The default value is true.
:pad_mode (atom/0) - Controls the padding method used when center is true. The default value is :reflect.
:onesided (boolean/0) - Controls whether to return only half of the results to avoid redundancy. The default value is true.
:norm - If "slaney", divide the triangular mel weights by the width of the mel band (area normalization).
:mel_scale - Scale to use. The default value is :htk.
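Several options above are defined in terms of others (win_length from n_fft, hop_length from win_length, the bin count from n_fft). The following is a minimal sketch of that arithmetic in plain Elixir. MelConfigSketch and derive/1 are hypothetical names for illustration and are not part of nx_audio; the sketch also assumes that win_length / 2 means integer division, which the documentation does not state.

```elixir
defmodule MelConfigSketch do
  @moduledoc false

  # Hypothetical helper for illustration only; it is not part of nx_audio.
  def derive(opts \\ []) do
    # Keyword.validate!/2 (Elixir 1.13+) fills in the listed defaults and
    # raises ArgumentError for keys outside this list.
    opts =
      Keyword.validate!(opts, [
        :win_length,
        :hop_length,
        sample_rate: 16_000,
        n_fft: 400
      ])

    n_fft = opts[:n_fft]
    # win_length falls back to n_fft when it is not given.
    win_length = opts[:win_length] || n_fft
    # Assumption: hop_length falls back to win_length / 2 with integer division.
    hop_length = opts[:hop_length] || div(win_length, 2)

    %{
      win_length: win_length,
      hop_length: hop_length,
      # n_fft // 2 + 1 frequency bins
      n_bins: div(n_fft, 2) + 1,
      # Spacing between frame centers in milliseconds, since frame t is
      # centered at sample t * hop_length when center is true.
      frame_step_ms: hop_length * 1000 / opts[:sample_rate]
    }
  end
end

# With the documented defaults: win_length 400, hop_length 200,
# n_bins 201, frame_step_ms 12.5
MelConfigSketch.derive()
```

With the documented defaults, successive frames are 200 samples apart, which is 12.5 ms at a 16000 Hz sample rate. Setting only n_fft: 512 would, under the same assumptions, give a win_length of 512, a hop_length of 256, and 257 bins.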