NxSignal (NxSignal v0.1.0)
Nx library extension for digital signal processing.
Summary
Functions: Time-Frequency
fft_frequencies/2 - Computes the frequency bins for an FFT with given options.
istft/3 - Computes the Inverse Short-Time Fourier Transform of a tensor.
mel_filters/4 - Generates weights for converting an STFT representation into MEL-scale.
stft/3 - Computes the Short-Time Fourier Transform of a tensor.
stft_to_mel/3 - Converts a given STFT time-frequency spectrum into a MEL-scale time-frequency spectrum.
Functions: Windowing
as_windowed/2 - Returns a tensor of K windows of length N.
overlap_and_add/2 - Performs the overlap-and-add algorithm over an M by N tensor, where M is the number of windows and N is the window size.
Functions: Time-Frequency
fft_frequencies/2
Computes the frequency bins for an FFT with given options.
Arguments
sampling_rate - Sampling frequency in Hz.
Options
:fft_length - Number of FFT frequency bins.
:type - Optional output type. Defaults to {:f, 32}.
:name - Optional axis name for the tensor. Defaults to :frequencies.
Examples
iex> NxSignal.fft_frequencies(1.6e4, fft_length: 10)
#Nx.Tensor<
f32[frequencies: 10]
[0.0, 1.6e3, 3.2e3, 4.8e3, 6.4e3, 8.0e3, 9.6e3, 1.12e4, 1.28e4, 1.44e4]
>
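The spacing in the output above is consistent with bin k sitting at k * sampling_rate / fft_length Hz. The following is a minimal sketch of that relationship on plain Elixir lists, not the library's implementation; the module name FftFrequenciesSketch and the function bins/2 are hypothetical.

```elixir
defmodule FftFrequenciesSketch do
  # Hypothetical helper, not part of NxSignal.
  # Assumes bin k is located at k * sampling_rate / fft_length Hz,
  # which matches the spacing shown in the example above.
  def bins(sampling_rate, fft_length) do
    for k <- 0..(fft_length - 1), do: k * sampling_rate / fft_length
  end
end

FftFrequenciesSketch.bins(1.6e4, 10)
```

With a sampling rate of 1.6e4 and 10 bins, the list runs from 0.0 to 14400.0 in steps of 1600.0, the same values as the tensor above.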
istft/3
Computes the Inverse Short-Time Fourier Transform of a tensor.
Returns a tensor of M time-domain frames of length fft_length.
See also: NxSignal.Windows, stft/3
Options
:fft_length - the DFT length that will be passed to Nx.fft/2. Defaults to :power_of_two.
:overlap_length - the number of samples for the overlap between frames. Defaults to half the window size.
:sampling_rate - the sampling rate $F_s$ in Hz. Defaults to 1000.
:scaling - nil, :spectrum or :psd.
  :spectrum - each frame is multiplied by $\sum_{i} window[i]$.
  nil - No scaling is applied.
  :psd - each frame is multiplied by $\sqrt{F_s\sum_{i} window[i]^2}$.
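These factors are the inverses of the ones described for stft/3, which divides each frame by the same quantity. As a short worked relation, where $S$ and $\hat{Z}$ are notation introduced here for illustration only:

$$ \hat{Z}[m, k] = \frac{Z[m, k]}{S}, \qquad S \cdot \hat{Z}[m, k] = Z[m, k], \qquad S = \begin{cases} \sum_{i} window[i] & \text{for :spectrum} \\ \sqrt{F_s \sum_{i} window[i]^2} & \text{for :psd} \end{cases} $$

Under this reading, passing the same :scaling, window and :sampling_rate to both functions makes the two factors cancel.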
Examples
In general, istft/3 takes in the same parameters and window as the stft/3 that generated the spectrum.
In the first example, the reconstruction is exact except for the first sample.
This is because the Hann window only ensures perfect reconstruction in overlapping regions, so the edges of the signal end up distorted.
iex> t = Nx.tensor([10, 10, 1, 0, 10, 10, 2, 20])
iex> w = NxSignal.Windows.hann(n: 4)
iex> opts = [sampling_rate: 1, fft_length: 4]
iex> {z, _time, _freqs} = NxSignal.stft(t, w, opts)
iex> result = NxSignal.istft(z, w, opts)
iex> Nx.as_type(result, Nx.type(t))
#Nx.Tensor<
s64[8]
[0, 10, 1, 0, 10, 10, 2, 20]
>
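The edge effect in the example above can be illustrated by summing the shifted windows that cover each sample. This is a rough sketch under stated assumptions, not the library's algorithm: it assumes a periodic Hann window, w[n] = 0.5 - 0.5 cos(2πn/N), and three unpadded frames with a hop of 2 (what 8 samples, a 4-sample window and the default half-window overlap would give). The module HannEnvelopeSketch and its functions are hypothetical.

```elixir
defmodule HannEnvelopeSketch do
  # Periodic Hann window. This definition is an assumption made for the
  # sketch; the exact output of NxSignal.Windows.hann/1 may differ.
  def hann(n) do
    for i <- 0..(n - 1), do: 0.5 - 0.5 * :math.cos(2 * :math.pi() * i / n)
  end

  # Total window weight seen by each output sample when `frames` copies of
  # the window are placed `hop` samples apart and summed.
  def envelope(window, frames, hop) do
    n = length(window)
    total = (frames - 1) * hop + n

    for t <- 0..(total - 1) do
      for m <- 0..(frames - 1), i = t - m * hop, i >= 0 and i < n, reduce: 0.0 do
        acc -> acc + Enum.at(window, i)
      end
    end
  end
end

HannEnvelopeSketch.envelope(HannEnvelopeSketch.hann(4), 3, 2)
```

Under these assumptions the weights are approximately 0, 0.5, 1, 1, 1, 1, 1, 0.5. The first sample receives no weight at all, so no rescaling can bring it back, while every interior sample is covered with a constant total weight.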
Different scaling options are available (see stft/3 for a more detailed explanation).
For perfect reconstruction, use the same scaling as the STFT:
iex> t = Nx.tensor([10, 10, 1, 0, 10, 10, 2, 20])
iex> w = NxSignal.Windows.hann(n: 4)
iex> opts = [scaling: :spectrum, sampling_rate: 1, fft_length: 4]
iex> {z, _time, _freqs} = NxSignal.stft(t, w, opts)
iex> result = NxSignal.istft(z, w, opts)
iex> Nx.as_type(result, Nx.type(t))
#Nx.Tensor<
s64[8]
[0, 10, 1, 0, 10, 10, 2, 20]
>
iex> t = Nx.tensor([10, 10, 1, 0, 10, 10, 2, 20], type: :f32)
iex> w = NxSignal.Windows.hann(n: 4)
iex> opts = [scaling: :psd, sampling_rate: 1, fft_length: 4]
iex> {z, _time, _freqs} = NxSignal.stft(t, w, opts)
iex> result = NxSignal.istft(z, w, opts)
iex> Nx.as_type(result, Nx.type(t))
#Nx.Tensor<
f32[8]
[0.0, 10.0, 0.9999999403953552, -2.1900146407460852e-7, 10.0, 10.0, 2.000000238418579, 20.0]
>
mel_filters/4
Generates weights for converting an STFT representation into MEL-scale.
See also: stft/3, istft/3, stft_to_mel/3
Arguments
fft_length - Number of FFT bins
mel_bins - Number of target MEL bins
sampling_rate - Sampling frequency in Hz
Options
:max_mel - the pitch for the last MEL bin before log scaling. Defaults to 3016
:mel_frequency_spacing - the distance in Hz between two MEL bins before log scaling. Defaults to 66.6
:type - Target output type. Defaults to {:f, 32}
Examples
iex> NxSignal.mel_filters(10, 5, 8.0e3)
#Nx.Tensor<
f32[mels: 5][frequencies: 10]
[
[0.0, 8.129207999445498e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 9.972016559913754e-4, 2.1870288765057921e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 9.510891977697611e-4, 4.150509194005281e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 4.035891906823963e-4, 5.276656011119485e-4, 2.574124082457274e-4, 0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 7.329034269787371e-5, 2.342205698369071e-4, 3.8295105332508683e-4, 2.8712040511891246e-4, 1.9128978601656854e-4, 9.545915963826701e-5]
]
>
stft/3
Computes the Short-Time Fourier Transform of a tensor.
Returns the complex spectrum Z, the time in seconds for each frame and the frequency bins in Hz.
The STFT is parameterized through:
- $k$: index of the frequency bin in the Discrete Fourier Transform (DFT)
- $N$: length of each frame
- $H$: hop (in samples) between frames (calculated as $H = N - \text{overlap\_length}$)
- $M$: number of frames
- $x[n]$: the input time-domain signal
- $w[n]$: the window function to be applied to each frame
$$
\begin{aligned}
DFT(x, w) &:= \sum_{n=0}^{N - 1} x[n] w[n] e^{-\frac{2 \pi i k n}{N}} \\
X[m, k] &= DFT(x[mH..(mH + N - 1)], w)
\end{aligned}
$$
where $m$ takes every integer value in the interval $[0, M - 1]$.
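The definition above can be sketched directly on plain Elixir lists. This is a naive illustration, not the library's implementation: it treats $k$ as the frequency bin index, assumes the DFT length equals the frame length $N$ with no padding or scaling, and represents complex numbers as {real, imaginary} tuples. The module StftSketch and its functions are hypothetical.

```elixir
defmodule StftSketch do
  # Naive O(N^2) DFT of one windowed frame.
  # Complex values are represented as {real, imaginary} tuples.
  def dft(frame, window) do
    n = length(frame)
    xw = Enum.zip_with(frame, window, fn x, w -> x * w end)

    for k <- 0..(n - 1) do
      xw
      |> Enum.with_index()
      |> Enum.reduce({0.0, 0.0}, fn {v, i}, {re, im} ->
        # e^(-2*pi*i*k*n/N) = cos(angle) + i*sin(angle)
        angle = -2 * :math.pi() * k * i / n
        {re + v * :math.cos(angle), im + v * :math.sin(angle)}
      end)
    end
  end

  # Slices the signal into frames of length N with hop H = N - overlap_length
  # (incomplete trailing frames are dropped) and transforms each frame.
  def stft(signal, window, overlap_length) do
    n = length(window)
    hop = n - overlap_length

    signal
    |> Enum.chunk_every(n, hop, :discard)
    |> Enum.map(&dft(&1, window))
  end
end

StftSketch.stft([0, 1, 2, 3], [1.0, 1.0], 1)
```

For the signal [0, 1, 2, 3] with a rectangular window of length 2 and an overlap of 1, this yields three frames whose values are approximately 1 and -1, 3 and -1, 5 and -1, with imaginary parts near zero. That agrees with the z tensor in the stft/3 example in this section.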
See also: NxSignal.Windows, istft/3, stft_to_mel/3
Options
:sampling_rate - the sampling frequency $F_s$ for the input in Hz. Defaults to 1000.
:fft_length - the DFT length that will be passed to Nx.fft/2. Defaults to :power_of_two.
:overlap_length - the number of samples for the overlap between frames. Defaults to half the window size.
:window_padding - :reflect, :zeros or nil. See as_windowed/2 for more details.
:scaling - nil, :spectrum or :psd.
  :spectrum - each frame is divided by $\sum_{i} window[i]$.
  nil - No scaling is applied.
  :psd - each frame is divided by $\sqrt{F_s\sum_{i} window[i]^2}$.
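The two :scaling divisors can be computed by hand for a small window. The following sketch works on a plain list of window samples and only mirrors the formulas listed above; the module ScalingSketch and its functions are hypothetical and not part of NxSignal.

```elixir
defmodule ScalingSketch do
  # Divisor described for :spectrum, the sum of the window samples.
  def spectrum(window), do: Enum.sum(window)

  # Divisor described for :psd, sqrt(F_s * sum of squared window samples).
  def psd(window, sampling_rate) do
    energy = window |> Enum.map(&(&1 * &1)) |> Enum.sum()
    :math.sqrt(sampling_rate * energy)
  end
end

window = [1.0, 1.0]
{ScalingSketch.spectrum(window), ScalingSketch.psd(window, 400)}
```

For a rectangular window of length 2 and a sampling rate of 400 Hz, the :spectrum divisor is 2.0 and the :psd divisor is the square root of 800, roughly 28.28.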
Examples
iex> {z, t, f} = NxSignal.stft(Nx.iota({4}), NxSignal.Windows.rectangular(n: 2), overlap_length: 1, fft_length: 2, sampling_rate: 400)
iex> z
#Nx.Tensor<
c64[frames: 3][frequencies: 2]
[
[1.0+0.0i, -1.0+0.0i],
[3.0+0.0i, -1.0+0.0i],
[5.0+0.0i, -1.0+0.0i]
]
>
iex> t
#Nx.Tensor<
f32[frames: 3]
[0.0024999999441206455, 0.004999999888241291, 0.007499999832361937]
>
iex> f
#Nx.Tensor<
f32[frequencies: 2]
[0.0, 200.0]
>
stft_to_mel/3
Converts a given STFT time-frequency spectrum into a MEL-scale time-frequency spectrum.
See also: stft/3, istft/3, mel_filters/4
Arguments
z - STFT spectrum
sampling_rate - Sampling frequency in Hz
Options
:fft_length - Number of FFT bins
:mel_bins - Number of target MEL bins. Defaults to 128
:type - Target output type. Defaults to {:f, 32}
Examples
iex> fft_length = 16
iex> sampling_rate = 8.0e3
iex> {z, _, _} = NxSignal.stft(Nx.iota({10}), NxSignal.Windows.hann(n: 4), overlap_length: 2, fft_length: fft_length, sampling_rate: sampling_rate, window_padding: :reflect)
iex> Nx.axis_size(z, :frequencies)
16
iex> Nx.axis_size(z, :frames)
5
iex> NxSignal.stft_to_mel(z, sampling_rate, fft_length: fft_length, mel_bins: 4)
#Nx.Tensor<
f32[frames: 5][mel: 4]
[
[0.2900530695915222, 0.17422175407409668, 0.18422472476959229, 0.09807997941970825],
[0.6093881130218506, 0.5647397041320801, 0.4353824257850647, 0.08635270595550537],
[0.7584103345870972, 0.7085014581680298, 0.5636920928955078, 0.179118812084198],
[0.8461772203445435, 0.7952491044998169, 0.6470762491226196, 0.2520409822463989],
[0.908548891544342, 0.8572604656219482, 0.7078656554222107, 0.3086767792701721]
]
>
Functions: Windowing
as_windowed/2
Returns a tensor of K windows of length N.
Options
:window_length - the number of samples in a window.
:stride - the number of samples to skip between windows. Defaults to 1.
:padding - can be :reflect, :zeros, or a valid padding as per Nx.pad/3 over the input tensor's shape. Defaults to :valid. If :reflect or :zeros, the first window will be centered at the start of the signal. For :reflect, each incomplete window will be reflected as if it were periodic (see examples for as_windowed/2). For :zeros, each incomplete window will be zero-padded.
Examples
iex> NxSignal.as_windowed(Nx.tensor([0, 1, 2, 3, 4, 10, 11, 12]), window_length: 4)
#Nx.Tensor<
s64[5][4]
[
[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 10],
[3, 4, 10, 11],
[4, 10, 11, 12]
]
>
iex> NxSignal.as_windowed(Nx.tensor([0, 1, 2, 3, 4, 10, 11, 12]), window_length: 3)
#Nx.Tensor<
s64[6][3]
[
[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 10],
[4, 10, 11],
[10, 11, 12]
]
>
iex> NxSignal.as_windowed(Nx.tensor([0, 1, 2, 3, 4, 10, 11]), window_length: 2, stride: 2, padding: [{0, 3}])
#Nx.Tensor<
s64[5][2]
[
[0, 1],
[2, 3],
[4, 10],
[11, 0],
[0, 0]
]
>
iex> t = Nx.iota({7})
iex> NxSignal.as_windowed(t, window_length: 6, padding: :reflect, stride: 1)
#Nx.Tensor<
s64[7][6]
[
[1, 2, 1, 0, 1, 2],
[2, 1, 0, 1, 2, 3],
[1, 0, 1, 2, 3, 4],
[0, 1, 2, 3, 4, 5],
[1, 2, 3, 4, 5, 6],
[2, 3, 4, 5, 6, 5],
[3, 4, 5, 6, 5, 4]
]
>
iex> NxSignal.as_windowed(Nx.iota({10}), window_length: 6, padding: :reflect, stride: 2)
#Nx.Tensor<
s64[5][6]
[
[1, 2, 1, 0, 1, 2],
[1, 0, 1, 2, 3, 4],
[1, 2, 3, 4, 5, 6],
[3, 4, 5, 6, 7, 8],
[5, 6, 7, 8, 9, 8]
]
>
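The unpadded examples above behave like a sliding window over a list. As a minimal sketch, assuming no padding and using Elixir's Enum.chunk_every/4 as a stand-in for the tensor operation, the windows and their count can be reproduced as follows. The module WindowingSketch and its functions are hypothetical.

```elixir
defmodule WindowingSketch do
  # List-based stand-in for sliding windows without padding.
  # Incomplete trailing windows are discarded.
  def windows(samples, window_length, stride \\ 1) do
    Enum.chunk_every(samples, window_length, stride, :discard)
  end

  # Number of complete windows: K = floor((L - N) / stride) + 1,
  # assuming the input length L is at least the window length N.
  def count(len, window_length, stride \\ 1) do
    div(len - window_length, stride) + 1
  end
end

WindowingSketch.windows([0, 1, 2, 3, 4, 10, 11, 12], 4)
```

For the 8-sample input with a window length of 4 this gives the same five rows as the first example, and count/3 returns 5. Appending three zeros to the 7-sample input and using a window length and stride of 2 reproduces the rows of the explicit padding example.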
overlap_and_add/2
Performs the overlap-and-add algorithm over an M by N tensor, where M is the number of windows and N is the window size.
The tensor is zero-padded on the right so the last window fully appears in the result.
Options
:overlap_length - The number of overlapping samples between windows
:type - output type for casting the accumulated result. If not given, defaults to Nx.Type.to_complex/1 called on the input type.
Examples
iex> NxSignal.overlap_and_add(Nx.iota({3, 4}), overlap_length: 0)
#Nx.Tensor<
s64[12]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>
iex> NxSignal.overlap_and_add(Nx.iota({3, 4}), overlap_length: 3)
#Nx.Tensor<
s64[6]
[0, 5, 15, 18, 17, 11]
>
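The two results above can be checked with a small list-based sketch of overlap-and-add, in which frame m is added into the output starting at offset m * (N - overlap_length). This is an illustration on plain lists rather than the tensor implementation; the module OverlapAddSketch is hypothetical.

```elixir
defmodule OverlapAddSketch do
  # Adds each frame into the output at offset m * hop, where
  # hop = N - overlap_length and N is the frame length.
  def overlap_and_add(frames, overlap_length) do
    n = length(hd(frames))
    hop = n - overlap_length
    total = (length(frames) - 1) * hop + n

    # Accumulate the sum for every output position in a map.
    sums =
      for {frame, m} <- Enum.with_index(frames),
          {value, i} <- Enum.with_index(frame),
          reduce: %{} do
        acc -> Map.update(acc, m * hop + i, value, &(&1 + value))
      end

    for t <- 0..(total - 1), do: Map.get(sums, t, 0)
  end
end

frames = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
OverlapAddSketch.overlap_and_add(frames, 3)
```

With an overlap of 3 the hop is 1 and the output has 6 samples: [0, 5, 15, 18, 17, 11]. With an overlap of 0 the frames are simply concatenated.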