Parallel Algorithms for Combined Regularized Support Vector Machines: Application in Music Genre Classification
Rongmei Liang, Zizheng Liu, Xiaofei Wu, Jingwen Tu
TL;DR
This paper tackles scalable learning for combined regularized SVMs (CR-SVMs) when data are distributed across multiple machines. It introduces a unified consensus optimization framework solved by a parallel ADMM with Gaussian back-substitution, which ensures convergence and applies regardless of the chosen loss or regularization. A new SGL-SVM model is proposed to capture group and within-group sparsity, and it is applied to music genre classification. The authors prove convergence and show that the computational complexity does not depend on the choice of loss or regularizer. Experiments on synthetic data and the Free Music Archive (FMA) demonstrate effectiveness and highlight MFCCs as the most informative feature group.
Abstract
Artificial intelligence is developing rapidly, and its applications across diverse fields rely heavily on effective data processing and model optimization. Combined Regularized Support Vector Machines (CR-SVMs) can effectively handle structural information among data features, but efficient algorithms for them are lacking when big data are stored in a distributed manner. To address this issue, we propose a unified optimization framework based on a consensus structure. This framework applies to various loss functions and combined regularization terms, and it can also be extended to non-convex regularization terms, showing strong scalability. Based on this framework, we develop a distributed parallel alternating direction method of multipliers (ADMM) algorithm to efficiently compute CR-SVMs when data are stored in a distributed manner. To ensure convergence of the algorithm, we also introduce the Gaussian back-substitution method. For completeness, we introduce a new model, the sparse group lasso support vector machine (SGL-SVM), and apply it to music information retrieval. Theoretical analysis confirms that the computational complexity of the proposed algorithm is not affected by the choice of regularization terms or loss functions, highlighting the universality of the parallel algorithm. Experiments on synthetic and Free Music Archive (FMA) datasets demonstrate the reliability, stability, and efficiency of the algorithm.
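
To illustrate what "group and within-group sparsity" means for the sparse group lasso penalty, the following is a minimal sketch of its standard proximal operator: elementwise soft-thresholding followed by group-wise shrinkage. It is not taken from the paper, and the paper's ADMM subproblems may be organized differently. The function names, the parameters `lam1`, `lam2`, and `step`, and the toy grouping are all assumptions made for illustration, and the groups are assumed not to overlap.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the proximal operator of t * ||x||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_sparse_group_lasso(v, groups, lam1, lam2, step=1.0):
    # Proximal operator of step * (lam1 * ||x||_1 + lam2 * sum_g ||x_g||_2)
    # for non-overlapping groups (hypothetical helper, not the paper's code).
    out = soft_threshold(np.asarray(v, dtype=float), step * lam1)
    for idx in groups:
        norm = np.linalg.norm(out[idx])
        # Shrink the whole group toward zero; drop it if its norm is small.
        scale = max(0.0, 1.0 - step * lam2 / norm) if norm > 0 else 0.0
        out[idx] = scale * out[idx]
    return out

# Toy example: six coefficients split into three hypothetical feature groups.
v = np.array([3.0, -0.5, 0.2, 0.1, -4.0, 2.0])
groups = [[0, 1, 2], [3], [4, 5]]
x = prox_sparse_group_lasso(v, groups, lam1=0.5, lam2=1.0)
print(x)
```

In this toy run the second group is removed entirely (group sparsity), while the first group keeps only one nonzero coefficient (within-group sparsity).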
