NxSignal (NxSignal v0.1.0)

Nx library extension for digital signal processing.

Summary

Functions: Time-Frequency

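fft_frequencies(sampling_rate, opts \\ [])

Computes the frequency bins for an FFT with the given options.

istft(data, window, opts)

Computes the Inverse Short-Time Fourier Transform of a tensor.

mel_filters(fft_length, mel_bins, sampling_rate, opts \\ [])

Generates weights for converting an STFT representation into MEL-scale.

stft(data, window, opts \\ [])

Computes the Short-Time Fourier Transform of a tensor.

stft_to_mel(z, sampling_rate, opts \\ [])

Converts a given STFT time-frequency spectrum into a MEL-scale time-frequency spectrum.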

Functions: Windowing

as_windowed(tensor, opts \\ [])

Returns a tensor of K windows of length N.

overlap_and_add(tensor, opts \\ [])

Performs the overlap-and-add algorithm over an M by N tensor, where M is the number of windows and N is the window size.

Functions: Time-Frequency

fft_frequencies(sampling_rate, opts \\ [])

Computes the frequency bins for an FFT with the given options.

Arguments

  • sampling_rate - Sampling frequency in Hz.

Options

  • :fft_length - Number of FFT frequency bins.
  • :type - Optional output type. Defaults to {:f, 32}
  • :name - Optional axis name for the tensor. Defaults to :frequencies

Examples

iex> NxSignal.fft_frequencies(1.6e4, fft_length: 10)
#Nx.Tensor<
  f32[frequencies: 10]
  [0.0, 1.6e3, 3.2e3, 4.8e3, 6.4e3, 8.0e3, 9.6e3, 1.12e4, 1.28e4, 1.44e4]
>
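
The bin values in the example are consistent with evenly spaced frequencies, where bin k sits at k * sampling_rate / fft_length. A minimal sketch of that arithmetic on plain lists follows; the module and function names are hypothetical and this is not necessarily how NxSignal computes the tensor internally.

```elixir
defmodule FftFrequenciesSketch do
  # Hypothetical helper, not part of NxSignal.
  # Returns the centre frequency in Hz of each of the `fft_length` bins,
  # assuming bin k sits at k * sampling_rate / fft_length.
  def bins(sampling_rate, fft_length) do
    step = sampling_rate / fft_length
    for k <- 0..(fft_length - 1), do: k * step
  end
end

# Same inputs as the example above: 16 kHz sampling rate, 10 bins.
# The spacing is 1600.0 Hz, so the last bin is at 14400.0 Hz.
IO.inspect(FftFrequenciesSketch.bins(1.6e4, 10))
```

The result is a plain list of floats rather than a named tensor, which keeps the sketch free of dependencies.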

istft(data, window, opts)

Computes the Inverse Short-Time Fourier Transform of a tensor.

Returns a tensor of M time-domain frames of length fft_length.

See also: NxSignal.Windows, stft/3

Options

  • :fft_length - the DFT length that will be passed to Nx.fft/2. Defaults to :power_of_two.
  • :overlap_length - the number of samples for the overlap between frames. Defaults to half the window size.
  • :sampling_rate - the sampling rate $F_s$ in Hz. Defaults to 1000.
  • :scaling - nil, :spectrum or :psd.
    • :spectrum - each frame is multiplied by $\sum_{i} window[i]$.
    • nil - No scaling is applied.
    • :psd - each frame is multiplied by $\sqrt{F_s\sum_{i} window[i]^2}$.

Examples

In general, istft/3 takes the same parameters and window as the stft/3 call that generated the spectrum. In the first example, the reconstruction matches the input exactly except for the first sample.

This is because the Hann window only ensures perfect reconstruction in overlapping regions, so the edges of the signal end up distorted.

iex> t = Nx.tensor([10, 10, 1, 0, 10, 10, 2, 20])
iex> w = NxSignal.Windows.hann(n: 4)
iex> opts = [sampling_rate: 1, fft_length: 4]
iex> {z, _time, _freqs} = NxSignal.stft(t, w, opts)
iex> result = NxSignal.istft(z, w, opts)
iex> Nx.as_type(result, Nx.type(t))
#Nx.Tensor<
  s64[8]
  [0, 10, 1, 0, 10, 10, 2, 20]
>
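
One way to see why the first sample is lost is to add up the window weights that land on each sample position. The sketch below assumes the 4-point Hann window takes the values 0, 0.5, 1, 0.5 (a periodic Hann window; this is an assumption about what NxSignal.Windows.hann(n: 4) returns) and that frames are taken without padding at a hop of 2 samples. The variable names are illustrative only.

```elixir
# Assumed sample values of a periodic 4-point Hann window.
window = [0.0, 0.5, 1.0, 0.5]
# Hop between frames: window length 4 minus the default overlap of 2.
hop = 2
# Number of unpadded frames over an 8-sample signal: (8 - 4) / 2 + 1.
frames = 3

# For each sample position, sum the window weight of every frame covering it.
envelope =
  for pos <- 0..7 do
    for m <- 0..(frames - 1),
        i = pos - m * hop,
        i >= 0 and i < length(window),
        reduce: 0.0 do
      acc -> acc + Enum.at(window, i)
    end
  end

# Position 0 is covered only by the first frame, whose first weight is 0.0.
IO.inspect(envelope)
```

Under these assumptions the summed weight at position 0 is zero, so no frame carries information about that sample, while every other position has a nonzero weight.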

Different scaling options are available (see stft/3 for a more detailed explanation). For perfect reconstruction, you want to use the same scaling as the STFT:

iex> t = Nx.tensor([10, 10, 1, 0, 10, 10, 2, 20])
iex> w = NxSignal.Windows.hann(n: 4)
iex> opts = [scaling: :spectrum, sampling_rate: 1, fft_length: 4]
iex> {z, _time, _freqs} = NxSignal.stft(t, w, opts)
iex> result = NxSignal.istft(z, w, opts)
iex> Nx.as_type(result, Nx.type(t))
#Nx.Tensor<
  s64[8]
  [0, 10, 1, 0, 10, 10, 2, 20]
>

iex> t = Nx.tensor([10, 10, 1, 0, 10, 10, 2, 20], type: :f32)
iex> w = NxSignal.Windows.hann(n: 4)
iex> opts = [scaling: :psd, sampling_rate: 1, fft_length: 4]
iex> {z, _time, _freqs} = NxSignal.stft(t, w, opts)
iex> result = NxSignal.istft(z, w, opts)
iex> Nx.as_type(result, Nx.type(t))
#Nx.Tensor<
  f32[8]
  [0.0, 10.0, 0.9999999403953552, -2.1900146407460852e-7, 10.0, 10.0, 2.000000238418579, 20.0]
>

mel_filters(fft_length, mel_bins, sampling_rate, opts \\ [])

Generates weights for converting an STFT representation into MEL-scale.

See also: stft/3, istft/3, stft_to_mel/3

Arguments

  • fft_length - Number of FFT bins
  • mel_bins - Number of target MEL bins
  • sampling_rate - Sampling frequency in Hz

Options

  • :max_mel - the pitch for the last MEL bin before log scaling. Defaults to 3016
  • :mel_frequency_spacing - the distance in Hz between two MEL bins before log scaling. Defaults to 66.6
  • :type - Target output type. Defaults to {:f, 32}

Examples

iex> NxSignal.mel_filters(10, 5, 8.0e3)
#Nx.Tensor<
  f32[mels: 5][frequencies: 10]
  [
    [0.0, 8.129207999445498e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 9.972016559913754e-4, 2.1870288765057921e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 9.510891977697611e-4, 4.150509194005281e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 4.035891906823963e-4, 5.276656011119485e-4, 2.574124082457274e-4, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 7.329034269787371e-5, 2.342205698369071e-4, 3.8295105332508683e-4, 2.8712040511891246e-4, 1.9128978601656854e-4, 9.545915963826701e-5]
  ]
>

stft(data, window, opts \\ [])

Computes the Short-Time Fourier Transform of a tensor.

Returns the complex spectrum Z, the time in seconds for each frame and the frequency bins in Hz.

The STFT is parameterized through:

  • $k$: frequency bin index of the Discrete Fourier Transform (DFT)
  • $N$: length of each frame
  • $H$: hop (in samples) between frames (calculated as $H = N - \text{overlap\_length}$)
  • $M$: number of frames
  • $x[n]$: the input time-domain signal
  • $w[n]$: the window function to be applied to each frame

$$
\begin{aligned}
DFT(x, w) &:= \sum_{n=0}^{N - 1} x[n] w[n] e^{\frac{-2 \pi i k n}{N}} \\
X[m, k] &= DFT(x[mH..(mH + N - 1)], w)
\end{aligned}
$$

where $m$ takes every integer value in the interval $[0, M - 1]$.
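
The definition above can be sketched directly on plain lists: slice the signal into frames of length N at hop H, multiply each frame by the window, and take a naive N-point DFT. The module below is a hypothetical reference sketch, not NxSignal's implementation; it applies no padding or scaling, assumes the DFT length equals the window length, and represents each complex bin as a {real, imaginary} tuple.

```elixir
defmodule StftSketch do
  # Hypothetical reference sketch on plain lists; not NxSignal's code.

  # Splits `signal` into frames of length N (the window length) with hop
  # H = N - overlap_length, windows each frame, and takes its N-point DFT.
  # Incomplete trailing frames are discarded (no padding).
  def stft(signal, window, overlap_length) do
    n = length(window)
    hop = n - overlap_length

    signal
    |> Enum.chunk_every(n, hop, :discard)
    |> Enum.map(fn frame ->
      frame
      |> Enum.zip(window)
      |> Enum.map(fn {x, w} -> x * w end)
      |> dft()
    end)
  end

  # Naive O(N^2) DFT. Each bin k is returned as a {real, imaginary} tuple.
  def dft(samples) do
    n = length(samples)

    for k <- 0..(n - 1) do
      samples
      |> Enum.with_index()
      |> Enum.reduce({0.0, 0.0}, fn {x, i}, {re, im} ->
        angle = -2.0 * :math.pi() * k * i / n
        {re + x * :math.cos(angle), im + x * :math.sin(angle)}
      end)
    end
  end
end

# Signal 0, 1, 2, 3 with a length-2 rectangular window and an overlap of 1.
# The real parts come out as 1, -1 / 3, -1 / 5, -1 up to rounding error;
# the imaginary parts are zero up to rounding error.
IO.inspect(StftSketch.stft([0, 1, 2, 3], [1.0, 1.0], 1))
```

The quadratic-time DFT is chosen for readability; an FFT would give the same values more efficiently.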

See also: NxSignal.Windows, istft/3, stft_to_mel/3

Options

  • :sampling_rate - the sampling frequency $F_s$ for the input in Hz. Defaults to 1000.
  • :fft_length - the DFT length that will be passed to Nx.fft/2. Defaults to :power_of_two.
  • :overlap_length - the number of samples for the overlap between frames. Defaults to half the window size.
  • :window_padding - :reflect, :zeros or nil. See as_windowed/2 for more details.
  • :scaling - nil, :spectrum or :psd.
    • :spectrum - each frame is divided by $\sum_{i} window[i]$.
    • nil - No scaling is applied.
    • :psd - each frame is divided by $\sqrt{F_s\sum_{i} window[i]^2}$.
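
The two scaling factors listed above can be worked out by hand for a small window. The sketch below uses a hypothetical helper module and an illustrative 4-sample window (the values are assumed, not taken from NxSignal.Windows); stft/3 divides each frame by the factor and istft/3 multiplies by the same factor.

```elixir
defmodule ScalingSketch do
  # Hypothetical helper mirroring the formulas in the options list;
  # not NxSignal's code.

  # :spectrum uses the sum of the window samples.
  def factor(window, :spectrum, _fs), do: Enum.sum(window)

  # :psd uses sqrt(Fs * sum of squared window samples).
  def factor(window, :psd, fs) do
    :math.sqrt(fs * Enum.sum(Enum.map(window, &(&1 * &1))))
  end

  # nil means no scaling, i.e. a factor of 1.
  def factor(_window, nil, _fs), do: 1.0
end

# Illustrative window values (assumed for this sketch).
window = [0.0, 0.5, 1.0, 0.5]

# Sum of the window samples: 2.0
IO.inspect(ScalingSketch.factor(window, :spectrum, 1000))
# sqrt(1000 * 1.5), roughly 38.73
IO.inspect(ScalingSketch.factor(window, :psd, 1000))
```

Because the forward and inverse transforms apply reciprocal factors, using the same :scaling value in both calls cancels the scaling out.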

Examples

iex> {z, t, f} = NxSignal.stft(Nx.iota({4}), NxSignal.Windows.rectangular(n: 2), overlap_length: 1, fft_length: 2, sampling_rate: 400)
iex> z
#Nx.Tensor<
  c64[frames: 3][frequencies: 2]
  [
    [1.0+0.0i, -1.0+0.0i],
    [3.0+0.0i, -1.0+0.0i],
    [5.0+0.0i, -1.0+0.0i]
  ]
>
iex> t
#Nx.Tensor<
  f32[frames: 3]
  [0.0024999999441206455, 0.004999999888241291, 0.007499999832361937]
>
iex> f
#Nx.Tensor<
  f32[frequencies: 2]
  [0.0, 200.0]
>

stft_to_mel(z, sampling_rate, opts \\ [])

Converts a given STFT time-frequency spectrum into a MEL-scale time-frequency spectrum.

See also: stft/3, istft/3, mel_filters/4

Arguments

  • z - STFT spectrum
  • sampling_rate - Sampling frequency in Hz

Options

  • :fft_length - Number of FFT bins
  • :mel_bins - Number of target MEL bins. Defaults to 128
  • :type - Target output type. Defaults to {:f, 32}

Examples

iex> fft_length = 16
iex> sampling_rate = 8.0e3
iex> {z, _, _} = NxSignal.stft(Nx.iota({10}), NxSignal.Windows.hann(n: 4), overlap_length: 2, fft_length: fft_length, sampling_rate: sampling_rate, window_padding: :reflect)
iex> Nx.axis_size(z, :frequencies)
16
iex> Nx.axis_size(z, :frames)
5
iex> NxSignal.stft_to_mel(z, sampling_rate, fft_length: fft_length, mel_bins: 4)
#Nx.Tensor<
  f32[frames: 5][mel: 4]
  [
    [0.2900530695915222, 0.17422175407409668, 0.18422472476959229, 0.09807997941970825],
    [0.6093881130218506, 0.5647397041320801, 0.4353824257850647, 0.08635270595550537],
    [0.7584103345870972, 0.7085014581680298, 0.5636920928955078, 0.179118812084198],
    [0.8461772203445435, 0.7952491044998169, 0.6470762491226196, 0.2520409822463989],
    [0.908548891544342, 0.8572604656219482, 0.7078656554222107, 0.3086767792701721]
  ]
>

Functions: Windowing

as_windowed(tensor, opts \\ [])

Returns a tensor of K windows of length N.

Options

  • :window_length - the number of samples in a window
  • :stride - The number of samples to skip between windows. Defaults to 1.
  • :padding - Can be :reflect, :zeros, or a valid padding as per Nx.pad/3 over the input tensor's shape. Defaults to :valid. If :reflect or :zeros, the first window will be centered at the start of the signal. For :reflect, each incomplete window will be reflected as if it were periodic (see the examples for as_windowed/2). For :zeros, each incomplete window will be zero-padded.

Examples

iex> NxSignal.as_windowed(Nx.tensor([0, 1, 2, 3, 4, 10, 11, 12]), window_length: 4)
#Nx.Tensor<
  s64[5][4]
  [
    [0, 1, 2, 3],
    [1, 2, 3, 4],
    [2, 3, 4, 10],
    [3, 4, 10, 11],
    [4, 10, 11, 12]
  ]
>

iex> NxSignal.as_windowed(Nx.tensor([0, 1, 2, 3, 4, 10, 11, 12]), window_length: 3)
#Nx.Tensor<
  s64[6][3]
  [
    [0, 1, 2],
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 10],
    [4, 10, 11],
    [10, 11, 12]
  ]
>

iex> NxSignal.as_windowed(Nx.tensor([0, 1, 2, 3, 4, 10, 11]), window_length: 2, stride: 2, padding: [{0, 3}])
#Nx.Tensor<
  s64[5][2]
  [
    [0, 1],
    [2, 3],
    [4, 10],
    [11, 0],
    [0, 0]
  ]
>

iex> t = Nx.iota({7})
iex> NxSignal.as_windowed(t, window_length: 6, padding: :reflect, stride: 1)
#Nx.Tensor<
  s64[7][6]
  [
    [1, 2, 1, 0, 1, 2],
    [2, 1, 0, 1, 2, 3],
    [1, 0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 6],
    [2, 3, 4, 5, 6, 5],
    [3, 4, 5, 6, 5, 4]
  ]
>

iex> NxSignal.as_windowed(Nx.iota({10}), window_length: 6, padding: :reflect, stride: 2)
#Nx.Tensor<
  s64[5][6]
  [
    [1, 2, 1, 0, 1, 2],
    [1, 0, 1, 2, 3, 4],
    [1, 2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7, 8],
    [5, 6, 7, 8, 9, 8]
  ]
>
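
For the unpadded and explicitly zero-padded cases, the windowing shown in the first and third examples can be reproduced on plain lists. The sketch below uses a hypothetical helper (not NxSignal's implementation) whose `pad_right` argument appends zeros, loosely mirroring padding: [{0, pad_right}]. It does not attempt the :reflect behaviour.

```elixir
defmodule WindowingSketch do
  # Hypothetical helper on plain lists; not NxSignal's implementation.
  # Appends `pad_right` zeros, then takes every complete window of
  # `window_length` samples, advancing `stride` samples each time.
  def as_windowed(samples, window_length, stride \\ 1, pad_right \\ 0) do
    padded = samples ++ List.duplicate(0, pad_right)
    Enum.chunk_every(padded, window_length, stride, :discard)
  end
end

# Five windows of length 4 at the default stride of 1.
IO.inspect(WindowingSketch.as_windowed([0, 1, 2, 3, 4, 10, 11, 12], 4))

# Five windows of length 2 at stride 2 after appending three zeros.
IO.inspect(WindowingSketch.as_windowed([0, 1, 2, 3, 4, 10, 11], 2, 2, 3))
```

Enum.chunk_every/4 with :discard drops any trailing window that would be incomplete, which matches the window counts in those two examples.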

overlap_and_add(tensor, opts \\ [])

Performs the overlap-and-add algorithm over an M by N tensor, where M is the number of windows and N is the window size.

The tensor is zero-padded on the right so the last window fully appears in the result.

Options

  • :overlap_length - The number of overlapping samples between windows
  • :type - output type for casting the accumulated result. If not given, defaults to Nx.Type.to_complex/1 called on the input type.

Examples

iex> NxSignal.overlap_and_add(Nx.iota({3, 4}), overlap_length: 0)
#Nx.Tensor<
  s64[12]
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>
iex> NxSignal.overlap_and_add(Nx.iota({3, 4}), overlap_length: 3)
#Nx.Tensor<
  s64[6]
  [0, 5, 15, 18, 17, 11]
>
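
The overlap-and-add step in the examples above can be sketched on plain lists: window m starts at position m * (N - overlap_length), and samples that land on the same position are summed. The module below is a hypothetical sketch, not NxSignal's implementation, and it ignores the :type option.

```elixir
defmodule OverlapAddSketch do
  # Hypothetical helper on lists of lists; not NxSignal's implementation.
  # `frames` is a list of M windows, each a list of N samples.
  def overlap_and_add(frames, overlap_length) do
    n = length(hd(frames))
    hop = n - overlap_length
    # The last window starts at (M - 1) * hop and must fit entirely.
    out_len = (length(frames) - 1) * hop + n

    # Accumulate each sample into a map keyed by its output position.
    sums =
      frames
      |> Enum.with_index()
      |> Enum.reduce(%{}, fn {frame, m}, acc ->
        frame
        |> Enum.with_index(m * hop)
        |> Enum.reduce(acc, fn {x, pos}, inner ->
          Map.update(inner, pos, x, &(&1 + x))
        end)
      end)

    for pos <- 0..(out_len - 1), do: Map.get(sums, pos, 0)
  end
end

frames = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# No overlap: the windows are simply concatenated.
IO.inspect(OverlapAddSketch.overlap_and_add(frames, 0))

# Overlap of 3: a hop of 1 gives six samples, [0, 5, 15, 18, 17, 11].
IO.inspect(OverlapAddSketch.overlap_and_add(frames, 3))
```

A map keyed by position keeps the sketch short; a tensor implementation would more likely pad and sum shifted rows.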