Optimality Conditions and Algorithms for Sparse Support Vector Machines with an l1 Regularization Term
Abstract

Support Vector Machine (SVM), one of the main methods of machine learning, is a powerful learning tool for classification and regression tasks and has attracted wide attention in fields such as image classification, pattern recognition, and disease diagnosis. In the SVM model, the L0/1 loss is the ideal loss function, and most existing loss functions are surrogates of it. Sparse optimization studies optimization problems with sparse structure; because the l1 norm induces good sparsity, redundant features can be removed through feature selection. Building on the L0/1 soft-margin loss model, this paper proposes an l1-norm sparse support vector machine based on the L0/1 loss (referred to as L0/1-SSVM). The existence of solutions of the model is proved, the KKT points and P-stationary points of the model are characterized, and the relationship between global optimal solutions and KKT points is established. An ADMM iterative framework is designed using the proximal operator of the l1 norm, and a convergence analysis shows that the algorithm converges to a P-stationary point.
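For concreteness, the display below sketches one plausible form of the L0/1-SSVM model and of the l1 proximal operator mentioned above; the precise formulation, the constants $\lambda$ and $C$, and the matrix notation are assumptions reconstructed from the abstract, not the paper's own statement of the model.

\[
  \min_{w\in\mathbb{R}^{n},\, b\in\mathbb{R}}\;
  \lambda\,\|w\|_{1} \;+\; C\,\big\|\big(\mathbf{1}-D(Aw+b\mathbf{1})\big)_{+}\big\|_{0},
\]

where $A=(x_{1},\dots,x_{m})^{\top}$ stacks the training samples, $D=\operatorname{diag}(y_{1},\dots,y_{m})$ holds the labels $y_{i}\in\{-1,+1\}$, $(\cdot)_{+}$ is the componentwise positive part, and $\|u_{+}\|_{0}$ counts the samples that violate the margin (the L0/1 soft-margin loss). The proximal operator of the l1 norm that drives the ADMM subproblem is the soft-thresholding map

\[
  \big[\operatorname{prox}_{\tau\|\cdot\|_{1}}(v)\big]_{i}
  \;=\; \operatorname{sign}(v_{i})\,\max\{\,|v_{i}|-\tau,\;0\,\},\qquad i=1,\dots,n.
\]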

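Purely as an illustration of where that soft-thresholding step sits inside an ADMM iteration, the sketch below solves the classical l1-regularized least-squares problem; it is not the paper's L0/1-SSVM algorithm (which replaces the quadratic loss with the L0/1 soft-margin loss and carries a bias term), and all function and parameter names are placeholders.

import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_lasso(A, b, lam=0.1, rho=1.0, n_iter=200):
    """ADMM for min_x 0.5*||A x - b||^2 + lam*||x||_1 (illustrative only)."""
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)                      # splitting variable carrying the l1 term
    u = np.zeros(n)                      # scaled dual variable
    AtA_rhoI = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(AtA_rhoI, Atb + rho * (z - u))   # smooth subproblem
        z = soft_threshold(x + u, lam / rho)                  # l1 proximal step
        u = u + x - z                                         # dual update
    return z

The vector returned by admm_lasso is sparse because of the soft-thresholding z-update; in the SVM setting the analogous update sparsifies the weight vector w, which is what enables feature selection.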
DOI: 10.48014/fcpm.20240411001
Article Type: Research Article
Received: 2024-04-11
Accepted: 2024-05-22
Published: 2024-09-28
Keywords: l1 norm, L0/1 loss function, sparse support vector machine, optimality conditions, ADMM algorithm
Authors: YANG Xiaohui, GAO Caixia*
Affiliation: School of Mathematical Sciences, Inner Mongolia University, Hohhot 010021, China
Citation: YANG Xiaohui, GAO Caixia. Optimality conditions and algorithms for sparse support vector machines with an l1 regularization term[J]. Frontiers of Chinese Pure Mathematics, 2024, 2(3): 16-24.