Cool-chic video: Learned video coding with 800 parameters
Thomas Leguay, Théo Ladune, Pierrick Philippe, Olivier Déforges
TL;DR
This work targets low-complexity learned video compression by building on the Cool-chic image codec and adding an inter-frame coding module to exploit temporal redundancies. The proposed decoder requires roughly 900 multiplications per decoded pixel and about 800 parameters, and its frame-wise encoding supports both low-delay and random-access configurations. Rate-distortion (RD) performance is reported to be close to AVC and better than previous overfitted codecs such as FFNeRV, while decoding complexity stays very low. The codec is released as open source for further research. The study also highlights current limitations in motion estimation, high-rate performance, and encoding time, and outlines concrete directions for improving practical deployment of learned video codecs.
Abstract
We propose a lightweight learned video codec with 900 multiplications per decoded pixel and 800 parameters overall. To the best of our knowledge, this is one of the neural video codecs with the lowest decoding complexity. It is built upon the overfitted image codec Cool-chic and supplements it with an inter coding module to leverage the video's temporal redundancies. The proposed model is able to compress videos using both low-delay and random-access configurations and achieves rate-distortion performance close to AVC while outperforming other overfitted codecs such as FFNeRV. The system is made open source: orange-opensource.github.io/Cool-Chic.
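As a rough illustration of what 900 multiplications per decoded pixel means at the frame level, the sketch below scales the per-pixel figure to a whole frame and to one second of video. The 1920×1080 resolution and the 30 fps frame rate are assumptions chosen for illustration, not settings taken from the paper, and the function name is hypothetical.

```python
def decoder_multiplications(width: int, height: int, mults_per_pixel: int = 900) -> int:
    """Total multiplications needed to decode one frame at a given per-pixel cost."""
    return width * height * mults_per_pixel


# Assumed example settings (not from the paper): 1080p video at 30 frames per second.
WIDTH, HEIGHT, FPS = 1920, 1080, 30

per_frame = decoder_multiplications(WIDTH, HEIGHT)
per_second = per_frame * FPS

print(f"{per_frame:,} multiplications per frame")
print(f"{per_second / 1e9:.1f} G multiplications per second at {FPS} fps")
```

Under these assumptions the decoder performs about 1.9 billion multiplications per frame, or roughly 56 billion per second of video.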
