The CARFAC v2 Cochlear Model in Matlab, NumPy, and JAX

Richard F. Lyon, Rob Schonberger, Malcolm Slaney, Mihajlo Velimirović, Honglin Yu

TL;DR

CARFAC v2 advances the cochlear model by delivering cross-platform improvements (MATLAB, NumPy, JAX) with DC distortion fixes, high-frequency synchrony reduction via a revised IHC, and a differentiable impairment framework linked to ohc_health. Key methodological updates include relocating the AC highpass filter, introducing a two-capacitor IHC, a delay-per-stage option, and streamlined AGC smoothing, together with stronger AMT integration and broader testing. The NumPy and JAX implementations enable scalable parameter optimization and fast execution, with benchmarks indicating substantial speedups (notably in JAX) and support for differentiable training contexts. The work also documents a clear path toward richer neural-output representations and future porting to C++, supporting broader multi-model comparisons and potential clinical modeling of hearing impairment.

Abstract

The open-source CARFAC (Cascade of Asymmetric Resonators with Fast-Acting Compression) cochlear model is upgraded to version 2, with improvements to the Matlab implementation, and with new Python/NumPy and JAX implementations -- but C++ version changes are still pending. One change addresses the DC (direct current, or zero frequency) quadratic distortion anomaly previously reported; another reduces the neural synchrony at high frequencies; the others have little or no noticeable effect in the default configuration. A new feature allows modeling a reduction of cochlear amplifier function, as a step toward a differentiable parameterized model of hearing impairment. In addition, the integration into the Auditory Model Toolbox (AMT) has been extensively improved, as the prior integration had bugs that made it unsuitable for including CARFAC in multi-model comparisons.

Paper Structure (18 sections, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Spectral view of propagating distortion products. This FFT analysis of the BM outputs, with four-tone waveform input, shows no propagating DC (0 Hz) component, unlike the book's Figure 17.7. Other low-frequency quadratic distortion tones (e.g. 200, 400, and 600 Hz) are still prominent, as are cubic distortion tones (e.g. 1000, 1200, 1400, and 2400 Hz).
  • Figure 2: The two-cap IHC model. The dynamic state of the IHC is in the three LPF blocks, which are unity-gain one-pole smoothing filters. The first one smooths the receptor potential, and is the main effect contributing to reduced synchrony at high frequencies; the last one further reduces the synchrony. The middle LPF represents the reservoir of neurotransmitter that can deplete and recover on a time scale of milliseconds; like its counterpart in the one-cap model, and unlike the first LPF here, it does not cause any smoothing in the forward pass.
  • Figure 3: IHC model tone-burst responses. Responses of the IHC to a 3 kHz, 10 ms tone burst (without ramping at onset or offset, starting at time 0, at $-40$ dBFS, channel 23 in default CARFAC, with CF near 3 kHz). Top: BM response, the IHC input (this is from v2, but v1 is not noticeably different), offset to +12. Middle: the receptor potential of the two-cap IHC model, multiplied by 20 and offset to +6. Bottom: NAP response, the IHC output, with the corresponding v1 output dotted. Note that the receptor potential shows a subtle onset sharpening, but not much overshoot. The v2 NAP still shows a strong onset emphasis, but has about half the AC component, or vector strength, of the v1 model at 3 kHz; the difference is much smaller at lower frequencies.
  • Figure 4: IHC smoothing comparison. The longer smoothing time constant in v2's two-cap IHC model yields a (linearized, small-signal) magnitude transfer function that is lower than the one for v1's one-cap model. Their ratio is plotted (dashed), and lines are drawn connecting the points corresponding to the 3 kHz tone plotted in Figure 3, as well as the 300 Hz tone that is shown in the book, where the synchrony reduction is minor. This more severe loss of synchrony in v2 may be more realistic.
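
The captions for Figures 2 and 4 describe unity-gain one-pole smoothing filters and compare their small-signal magnitude responses at 300 Hz and 3 kHz. The NumPy sketch below illustrates that kind of comparison for a single generic one-pole smoother evaluated with two different time constants. The sample rate `FS`, the time constants `TAU_SHORT` and `TAU_LONG`, and all function names are illustrative assumptions, not values or identifiers taken from CARFAC, and the sketch ignores every other part of the IHC model.

```python
import numpy as np

# All numeric values below are placeholders chosen for illustration only;
# they are NOT the CARFAC v1 or v2 parameters.
FS = 22050.0        # assumed sample rate in Hz
TAU_SHORT = 80e-6   # hypothetical shorter smoothing time constant (s)
TAU_LONG = 200e-6   # hypothetical longer smoothing time constant (s)


def one_pole_coeff(tau, fs):
    """Coefficient a of the smoother y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    return 1.0 - np.exp(-1.0 / (tau * fs))


def one_pole_magnitude(freq_hz, tau, fs):
    """Magnitude of H(z) = a / (1 - (1 - a) z^-1) on the unit circle.

    The gain at 0 Hz is exactly 1, i.e. the filter is unity-gain at DC.
    """
    a = one_pole_coeff(tau, fs)
    z_inv = np.exp(-2j * np.pi * np.asarray(freq_hz, dtype=float) / fs)
    return np.abs(a / (1.0 - (1.0 - a) * z_inv))


def one_pole_filter(x, tau, fs):
    """Time-domain version of the same smoother, for cross-checking."""
    a = one_pole_coeff(tau, fs)
    y = np.zeros(len(x))
    state = 0.0
    for n, xn in enumerate(x):
        state += a * (xn - state)
        y[n] = state
    return y


def magnitude_ratio(freq_hz):
    """Ratio of the longer-tau response to the shorter-tau response."""
    return (one_pole_magnitude(freq_hz, TAU_LONG, FS)
            / one_pole_magnitude(freq_hz, TAU_SHORT, FS))


ratio_300 = float(magnitude_ratio(300.0))
ratio_3k = float(magnitude_ratio(3000.0))
print(f"long/short magnitude ratio at 300 Hz: {ratio_300:.3f}")
print(f"long/short magnitude ratio at 3 kHz:  {ratio_3k:.3f}")
```

With these placeholder values the ratio stays close to 1 at 300 Hz and drops well below 1 at 3 kHz, which is the qualitative pattern the Figure 4 caption describes. The actual curves depend on the model's real coefficients and on how many smoothing stages are cascaded.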