Evaluation of Audio Compression Codecs
Thien T. Duong, Jan P. Springer
TL;DR
The paper argues that audio codecs should be evaluated on perceptual quality as well as compression efficiency. It combines traditional performance metrics, visual spectral analyses, and PEAQ-based scores to compare FLAC, MP3, AAC, Vorbis, and the AI-based RVQGAN. Lossy codecs often trade fidelity for size, and Vorbis performs notably well among them. RVQGAN delivers extreme compression but shows poor perceptual quality and playback compatibility, which highlights current limitations and the need for psychoacoustic integration. The results offer practical guidance for codec selection and underscore the importance of perceptual assessment in real-world applications, especially as AI-based approaches mature and streaming contexts grow.
Abstract
The perceptual quality of audio combines aural accuracy with listener-perceived sound fidelity: it describes how humans respond to the accuracy, intelligibility, and fidelity of aural media. Today, this fidelity is also heavily influenced by the audio compression codecs used to store aural media in digital form. We argue that, when choosing an audio compression codec, users should consider not only compression efficiency but also the perceptual quality of the available codecs. We evaluate several commonly used audio compression codecs in terms of both compression performance and perceptual quality, using codec performance measurements, visualizations, and PEAQ scores. We demonstrate how digital audio compression techniques affect perceptual quality, providing insights for users choosing a digital audio compression scheme.
