Audio Compression using Periodic Gabor with Biorthogonal Exchange: Implementation Using the Zak Transform

Roger Alimi, David J. Tannor

TL;DR

The paper introduces Audio Compression using Periodic Gabor with Biorthogonal Exchange implemented via the Zak Transform (PGBZ). By exchanging the Gabor and biorthogonal bases, the method achieves sparsity, while the Zak transform enables a fast, low-memory computation of PGBZ coefficients that scales as $O(N\log N)$, avoiding large matrix inversions. Compared with STFT and DWT on a diverse set of audio signals, PGBZ delivers substantially lower reconstruction error at high compression levels and maintains strong temporal fidelity, with practical memory and speed advantages. The approach offers a parameter-light, basis-based alternative to conventional TF methods and shows promise for extension to image compression; residual periodic noise is manageable via post-processing such as noise reduction filters or Porat-like corrections.

Abstract

An efficient new approach to signal compression is presented, based on a novel variation on the Gabor basis set. Following earlier work by Shimshovitz and Tannor, we convolve the conventional Gabor functions with Dirichlet functions to obtain a Periodic Gabor basis set (PG). The PG basis is exact for continuous functions that are periodic and band-limited. Using the orthonormality of the Dirichlet functions, the calculation of the PG coefficients becomes trivial and numerically stable, but the resulting representation does not allow compression. Large compression factors are achieved by exchanging the PG basis with its biorthogonal basis, thereby using the localized PG basis to calculate the coefficients (PGB). Here we implement the PGB formalism using the Fast Zak Transform and obtain very high efficiency with respect to both CPU and memory. We compare the method with the state-of-the-art Short-Time Fourier Transform (STFT) and Discrete Wavelet Transform (DWT) methods on a variety of audio files, including music and speech samples. In all cases tested, our scheme surpasses the STFT by far and in most cases outperforms the DWT.
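The abstract relies on the Fast Zak Transform for its efficiency. As a rough illustration only, the sketch below computes a finite Zak transform of a length-$N$ signal with $N = KM$ under one common convention, $Z[k, n] = \sum_{j=0}^{K-1} f[n + jM]\, e^{-2\pi i jk/K}$, by reshaping the signal and applying an FFT along one axis. The convention, the function names `zak_transform` and `inverse_zak_transform`, and the parameter `num_shifts` are assumptions made for this example; the paper's own normalization and index layout may differ.

```python
import numpy as np


def zak_transform(signal, num_shifts):
    """Finite Zak transform (assumed convention, see lead-in).

    Returns an array Z of shape (num_shifts, period) with
    Z[k, n] = sum_j signal[n + j*period] * exp(-2j*pi*j*k/num_shifts).
    """
    signal = np.asarray(signal, dtype=complex)
    if signal.size % num_shifts != 0:
        raise ValueError("signal length must be a multiple of num_shifts")
    period = signal.size // num_shifts
    # blocks[j, n] = signal[n + j*period]
    blocks = signal.reshape(num_shifts, period)
    # One FFT of length num_shifts per column: O(N log K) work overall.
    return np.fft.fft(blocks, axis=0)


def inverse_zak_transform(zak):
    """Invert zak_transform by undoing the FFT and flattening the blocks."""
    return np.fft.ifft(zak, axis=0).reshape(-1)


# Usage: transform a sampled Gaussian and recover it.
t = np.linspace(-4.0, 4.0, 1024, endpoint=False)
gaussian = np.exp(-t**2)
Z = zak_transform(gaussian, num_shifts=32)
recovered = inverse_zak_transform(Z).real
```

Because the transform is only a reshape followed by short FFTs, no matrix is ever formed or inverted, which is the kind of saving the abstract attributes to the Zak-based implementation.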

Paper Structure

This paper contains 9 sections, 28 equations, 14 figures, 1 table.

Figures (14)

  • Figure 1: (a) 9 Gabor unit cells and 9 values of a BL Fourier series cover the same area in TF space, $S=2\pi N$. Superimposed is a schematic Gabor function. (b) The periodic Gabor basis has the same boundary conditions as the BL Fourier series and is therefore a complete set for the truncated space.
  • Figure 2: The Zak transform (absolute value) of a Gaussian (8 s duration, sampled at 44.1 kHz).
  • Figure 3: Flowchart of the compression process.
  • Figure 4: (a) The difference between the original and the PGBZ reconstructed signal (single sine + glitch). (b) Zoom on (a). The arrows all have the same length. Note the periodic, spiky shape of the error in the reconstructed signal.
  • Figure 5: Original signal vs. compressed and corrected signals (5a, top). Zoom around sample 3646 (5b, bottom).
  • ...and 9 more figures