Bumblebee.Audio.WhisperFeaturizer (Bumblebee v0.5.3)
Whisper featurizer for audio data.
Configuration
* :feature_size - the dimension of the extracted features. This corresponds to the number of Mel bins. Defaults to 80
* :sampling_rate - the sampling rate, in Hertz, at which the audio is digitally expressed. Defaults to 16000
* :num_seconds - the maximum duration of the audio sequence, in seconds. This implies that the maximum length of the input sequence is :num_seconds * :sampling_rate samples. Defaults to 30
* :hop_length - the hop, in samples, between consecutive overlapping windows of the STFT used to obtain Mel frequency coefficients. Defaults to 160
* :fft_length - the size of the Fourier transform. Defaults to 400
* :padding_value - the value used to pad the audio. Should correspond to silence. Defaults to 0.0
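The relationship between these options can be checked with a little arithmetic. The following is a minimal sketch in plain Elixir that does not call Bumblebee at all. The module `WhisperFeaturizerDefaults` and its functions are hypothetical names introduced only for illustration, and the frame count assumes one STFT frame per hop, which may differ from what the featurizer actually returns.

```elixir
defmodule WhisperFeaturizerDefaults do
  # Hypothetical helper, not part of Bumblebee. The values below are the
  # defaults listed in the Configuration section.
  @defaults [
    feature_size: 80,
    sampling_rate: 16_000,
    num_seconds: 30,
    hop_length: 160,
    fft_length: 400,
    padding_value: 0.0
  ]

  def defaults, do: @defaults

  # Maximum input length in samples: :num_seconds * :sampling_rate.
  def max_samples(opts \\ []) do
    opts = Keyword.merge(@defaults, opts)
    opts[:num_seconds] * opts[:sampling_rate]
  end

  # Approximate number of STFT frames for a full-length input,
  # assuming one frame per hop (an assumption, see the lead-in).
  def approx_frames(opts \\ []) do
    opts = Keyword.merge(@defaults, opts)
    div(max_samples(opts), opts[:hop_length])
  end

  # Duration of one analysis window in milliseconds.
  def window_ms(opts \\ []) do
    opts = Keyword.merge(@defaults, opts)
    opts[:fft_length] * 1000 / opts[:sampling_rate]
  end
end

IO.inspect(WhisperFeaturizerDefaults.max_samples())   # prints 480000
IO.inspect(WhisperFeaturizerDefaults.approx_frames()) # prints 3000
IO.inspect(WhisperFeaturizerDefaults.window_ms())     # prints 25.0
```

With the defaults, a full 30-second input is 480000 samples, each analysis window covers 25 ms of audio, and consecutive windows start 10 ms apart (160 samples at 16000 Hz). Passing overrides, for example `WhisperFeaturizerDefaults.max_samples(num_seconds: 10)`, shows how changing a single option affects the derived sizes.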