Journal of Computer Applications, 2024, Vol. 44, Issue (7): 2073-2079. DOI: 10.11772/j.issn.1001-9081.2023070923
Received: 2023-07-11
Revised: 2023-09-19
Accepted: 2023-09-20
Online: 2023-10-26
Published: 2024-07-10
Contact: Weihua ZHAO
About author: GE Junchi, born in 1993 in Nantong, Jiangsu, M. S. candidate. His research interests include statistical learning and high-dimensional data analysis.
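Abstract:
Distance Weighted Discrimination (DWD) is a widely used classification model for matrix data, but its performance degrades markedly when the data are severely contaminated by noise. Robust Principal Component Analysis (RPCA), which separates a data matrix into a low-rank structure and a sparse part, has become one of the effective ways to address this problem. A Robust Distance Weighted Discrimination model for matrix data (RDWD-2D) is therefore proposed. Specifically, the model performs RPCA on the data matrices in a supervised manner and recovers and classifies the clean data simultaneously. Experimental results on the MNIST and COIL20 datasets show that, for matrix data with noise contamination or missing entries, RDWD-2D achieves the best data recovery and the highest classification accuracy compared with models such as DWD-2D and RPCA+DWD. RDWD-2D is also fairly robust to the degree of data contamination.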
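The low-rank plus sparse separation that the abstract attributes to RPCA can be illustrated with a minimal sketch. The code below is the standard unsupervised principal component pursuit, solved with an inexact augmented Lagrange multiplier loop; it is not the supervised RDWD-2D model proposed in the paper. The function names `soft_threshold` and `rpca_pcp`, the default weight `lam`, the penalty schedule (`mu`, `rho`), and the synthetic test matrix are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise shrinkage: the proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rpca_pcp(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into a low-rank part L and a sparse part S.

    Approximately solves  min ||L||_* + lam * ||S||_1  s.t.  M = L + S
    with an inexact augmented Lagrange multiplier iteration.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))   # common default weight (assumption)
    norm_M = np.linalg.norm(M, "fro")
    mu = 1.25 / np.linalg.norm(M, 2)     # initial penalty (heuristic choice)
    rho = 1.5                            # penalty growth factor (heuristic choice)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                 # Lagrange multiplier
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse update: entrywise soft thresholding
        S = soft_threshold(M - L + Y / mu, lam / mu)
        # Multiplier update on the constraint residual
        R = M - L - S
        Y = Y + mu * R
        mu *= rho
        if np.linalg.norm(R, "fro") <= tol * norm_M:
            break
    return L, S


# Synthetic check: rank-2 matrix with 5% of entries grossly corrupted
rng = np.random.default_rng(0)
L0 = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
S0 = np.zeros((60, 60))
mask = rng.random((60, 60)) < 0.05
S0[mask] = rng.choice([-5.0, 5.0], size=mask.sum())

L_hat, S_hat = rpca_pcp(L0 + S0)
rel_err = np.linalg.norm(L_hat - L0) / np.linalg.norm(L0)
print("relative error of recovered low-rank part:", rel_err)
```

In a two-stage pipeline such as the RPCA+DWD baseline named in the abstract, a decomposition of this kind would presumably be applied before the classifier is trained, whereas the abstract describes RDWD-2D as doing recovery and classification jointly.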
Junchi GE, Weihua ZHAO. Distance weighted discriminant analysis based on robust principal component analysis for matrix data[J]. Journal of Computer Applications, 2024, 44(7): 2073-2079.
Binary classification task | Pixel contamination rate/% | SVM | DWD-1D | DWD-2D | RPCA+DWD | RDWD-2D |
---|---|---|---|---|---|---|
Digit 1 / Digit 7 | 0 | 95.37 | 95.89 | 98.46 | 97.93 | 98.37 |
Digit 1 / Digit 7 | 5 | 95.07 | 95.46 | 97.84 | 98.06 | 98.54 |
Digit 1 / Digit 7 | 10 | 94.83 | 95.02 | 94.32 | 97.62 | 97.89 |
Digit 1 / Digit 7 | 15 | 92.65 | 94.58 | 93.47 | 96.78 | 97.52 |
Digit 1 / Digit 7 | 20 | 87.28 | 94.29 | 92.81 | 95.94 | 97.63 |
Digit 1 / Digit 7 | 25 | 85.76 | 92.83 | 93.44 | 96.19 | 97.07 |
Digit 1 / Digit 7 | 30 | 82.42 | 91.16 | 90.23 | 96.43 | 97.21 |
Digit 1 / Digit 7 | 35 | 79.36 | 88.78 | 89.05 | 93.11 | 95.87 |
Digit 1 / Digit 7 | 40 | 77.29 | 80.48 | 86.27 | 92.72 | 95.53 |
Digit 3 / Digit 8 | 0 | 96.12 | 94.95 | 98.01 | 96.44 | 97.23 |
Digit 3 / Digit 8 | 5 | 95.63 | 95.42 | 95.76 | 97.18 | 97.29 |
Digit 3 / Digit 8 | 10 | 93.81 | 94.78 | 95.09 | 96.33 | 96.42 |
Digit 3 / Digit 8 | 15 | 90.33 | 93.21 | 92.45 | 97.02 | 97.59 |
Digit 3 / Digit 8 | 20 | 88.11 | 92.87 | 93.06 | 96.82 | 96.91 |
Digit 3 / Digit 8 | 25 | 87.28 | 89.37 | 91.38 | 95.89 | 96.73 |
Digit 3 / Digit 8 | 30 | 84.29 | 87.53 | 88.46 | 95.27 | 95.74 |
Digit 3 / Digit 8 | 35 | 81.57 | 86.39 | 87.85 | 94.03 | 96.06 |
Digit 3 / Digit 8 | 40 | 81.23 | 83.49 | 88.02 | 94.11 | 96.25 |
Tab. 1 Comparison of classification accuracy of different models (%)
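One rough way to read the robustness reported in Tab. 1 is to compare each model's accuracy at 0% and 40% pixel contamination. The short sketch below does that arithmetic with values copied from the table. The "accuracy drop" summary and the variable names are illustrative choices, not a metric defined in the paper.

```python
# Accuracy (%) at 0% and 40% pixel contamination, copied from Tab. 1
models = ["SVM", "DWD-1D", "DWD-2D", "RPCA+DWD", "RDWD-2D"]
table = {
    "Digit 1 / Digit 7": {
        0: [95.37, 95.89, 98.46, 97.93, 98.37],
        40: [77.29, 80.48, 86.27, 92.72, 95.53],
    },
    "Digit 3 / Digit 8": {
        0: [96.12, 94.95, 98.01, 96.44, 97.23],
        40: [81.23, 83.49, 88.02, 94.11, 96.25],
    },
}

# Drop in percentage points between the clean and the most contaminated setting
drops = {
    task: {m: round(c - d, 2) for m, c, d in zip(models, rows[0], rows[40])}
    for task, rows in table.items()
}

for task, drop in drops.items():
    print(task)
    for model, value in drop.items():
        print(f"  {model}: {value:.2f} points lost")
```

On both tasks the smallest drop belongs to RDWD-2D, which is consistent with the abstract's robustness claim. At 0% contamination, however, DWD-2D is slightly ahead of RDWD-2D in the table.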