Auxiliary Learning by Implicit Differentiation
Aviv Navon, Idan Achituve, Haggai Maron, Gal Chechik, Ethan Fetaya
TL;DR
AuxiLearn is a bi-level optimization framework that uses implicit differentiation to learn how auxiliary tasks are used when training neural networks. It handles two settings: (i) when auxiliary losses are predefined, it learns a non-linear combination of them that forms a single coherent objective; and (ii) when none are provided, it generates novel auxiliary tasks from data via a teacher-student setup. In both settings, the implicit function theorem yields hypergradients with respect to the auxiliary parameters, and a Neumann-series approximation keeps the computation efficient. Theoretical analysis uncovers potential overfitting risks in the auxiliary space and highlights the Newton update as a key indicator of auxiliary usefulness. Empirically, AuxiLearn improves main-task performance on image segmentation and low-data classification tasks, often outperforming strong baselines, which indicates that auxiliary design and weighting can be automated effectively.
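The hypergradient computation mentioned above can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a hypothetical quadratic inner (training) loss with a single scalar auxiliary parameter `phi`, a hypothetical quadratic outer (main-task) loss, and one common form of the truncated Neumann series, H^{-1} v ≈ alpha * sum_j (I - alpha*H)^j v. All names (`neumann_inverse_hvp`, `A`, `b`, `target`, `alpha`, `num_terms`) are illustrative choices.

```python
# Toy illustration (assumed setup, not the paper's code): hypergradient of a
# scalar auxiliary parameter phi via the implicit function theorem, with the
# inverse Hessian approximated by a truncated Neumann series.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def neumann_inverse_hvp(hvp, v, alpha, num_terms):
    """Approximate H^{-1} v by alpha * sum_{j=0}^{num_terms} (I - alpha*H)^j v.

    hvp(x) must return H x. The series converges when the eigenvalues of
    (I - alpha*H) lie strictly inside the unit circle.
    """
    term = list(v)   # current term (I - alpha*H)^j v, starting at j = 0
    total = list(v)  # running sum of the terms
    for _ in range(num_terms):
        h_term = hvp(term)
        term = [t - alpha * h for t, h in zip(term, h_term)]
        total = [s + t for s, t in zip(total, term)]
    return [alpha * s for s in total]

# Hypothetical inner (training) loss: L_T(w, phi) = 0.5 * w^T A w - phi * b^T w
# Hypothetical outer (main-task) loss: L_A(w) = 0.5 * ||w - target||^2
A = [[2.0, 0.5], [0.5, 1.0]]  # symmetric positive definite Hessian of L_T in w
b = [1.0, -1.0]
target = [0.5, 0.0]
phi = 0.3

# Closed-form inner solution w*(phi) = phi * A^{-1} b (2x2 inverse by hand).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]
w_star = [phi * x for x in matvec(A_inv, b)]

# Ingredients of the implicit-function-theorem hypergradient:
#   dL_A/dphi = - grad_w L_A^T  H^{-1}  d(grad_w L_T)/dphi
grad_w_outer = [w - t for w, t in zip(w_star, target)]  # grad_w L_A at w*
mixed = [-x for x in b]                                 # d(grad_w L_T)/dphi

# H is symmetric here, so grad^T H^{-1} mixed equals (H^{-1} grad) . mixed.
q = neumann_inverse_hvp(lambda v: matvec(A, v), grad_w_outer,
                        alpha=0.4, num_terms=100)
hypergrad_neumann = -dot(q, mixed)

# Reference value using the exact inverse, for comparison.
hypergrad_exact = -dot(matvec(A_inv, grad_w_outer), mixed)

print(hypergrad_neumann, hypergrad_exact)
```

In a neural network the Hessian is never formed explicitly; the `hvp` callback would instead be a Hessian-vector product from automatic differentiation, and the inner solution would be approximate rather than closed-form. The step size `alpha` and the number of terms trade accuracy against cost.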
Abstract
Training neural networks with auxiliary tasks is a common practice for improving performance on a main task of interest. Two main challenges arise in this multi-task learning setting: (i) designing useful auxiliary tasks; and (ii) combining auxiliary tasks into a single coherent loss. Here, we propose a novel framework, AuxiLearn, that targets both challenges based on implicit differentiation. First, when useful auxiliaries are known, we propose learning a network that combines all losses into a single coherent objective function. This network can learn non-linear interactions between tasks. Second, when no useful auxiliary task is known, we describe how to learn a network that generates a meaningful, novel auxiliary task. We evaluate AuxiLearn in a series of tasks and domains, including image segmentation and learning with attributes in the low-data regime, and find that it consistently outperforms competing methods.
