Sparse Linear Discriminant Analysis Based on lq Regularization

Linear discriminant analysis (LDA) plays an important role in feature extraction, dimensionality reduction, and classification. As science and technology advance, the data to be processed grow ever larger, and in high-dimensional settings LDA faces two problems: the projected data lack interpretability, since each discriminant direction is a linear combination of all p features, and the within-class covariance matrix becomes singular. LDA admits three distinct formulations: the multivariate Gaussian model, Fisher's discriminant problem, and the optimal scoring problem. To address the two problems above, this paper establishes a model for computing the kth discriminant component. The model first transforms the original Fisher discriminant formulation of LDA: it replaces the within-class covariance matrix with a diagonal estimate of the within-class variances, which overcomes the singularity problem, and projects the problem onto an orthogonal projection space so that the orthogonality constraints can be dropped. An lq-norm regularization term is then added to enhance interpretability, achieving both dimensionality reduction and classification. Finally, an iterative algorithm for solving the model is given together with a convergence analysis: the sequence generated by the algorithm is shown to be descending, and it converges to a local minimum of the problem from any initial point.
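The abstract does not state the model explicitly, so the following is only a plausible formalization of the kth-component problem it describes, in notation of my own choosing ($\hat{\Sigma}_b$, $\hat{D}_w$, $P_k$, and $\lambda$ are assumptions, not symbols taken from the paper):

```latex
% Sketch of the k-th sparse discriminant problem (notation assumed, not from the paper):
%   \hat{\Sigma}_b : between-class scatter estimate
%   \hat{D}_w      : diagonal estimate of the within-class variances
%                    (replaces the possibly singular within-class covariance)
%   P_k            : projection onto the orthogonal complement of the
%                    first k-1 discriminant directions (removes the
%                    explicit orthogonality constraints)
\begin{equation*}
  \hat{\beta}_k \in \arg\max_{\beta \in \mathbb{R}^p}
  \; \beta^{\top} P_k \hat{\Sigma}_b P_k \beta
  \;-\; \lambda \lVert \beta \rVert_q^q
  \quad \text{subject to} \quad
  \beta^{\top} \hat{D}_w \beta \le 1 ,
  \qquad 0 < q \le 1 .
\end{equation*}
```

The lq penalty drives entries of $\beta$ to zero, which is what gives the projected data their interpretability.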
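To illustrate the diagonal within-class estimate mentioned above, here is a minimal NumPy sketch (the function name and interface are mine, not the paper's) that computes a between-class scatter estimate together with the diagonal within-class variance matrix; the latter is invertible whenever every feature has positive within-class variance, even when p exceeds the sample size:

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class scatter and the diagonal within-class estimate.

    The diagonal matrix Dw = diag(within-class variances) stands in for
    the full within-class covariance, sidestepping its singularity in
    high dimensions. Illustrative sketch, not the paper's algorithm.
    """
    n, p = X.shape
    mu = X.mean(axis=0)                      # overall mean
    Sb = np.zeros((p, p))                    # between-class scatter
    Sw = np.zeros((p, p))                    # full within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)                 # class mean
        Sb += len(Xc) * np.outer(mc - mu, mc - mu) / n
        Sw += (Xc - mc).T @ (Xc - mc) / n
    Dw = np.diag(np.diag(Sw))                # keep only the variances
    return Sb, Dw
```

With `Dw` diagonal, the constraint in the Fisher-type problem reduces to a weighted norm bound, which is what makes a coordinate-wise treatment of the lq penalty tractable.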