Journal of Computer Applications, 2023, Vol. 43, Issue (9): 2657-2664. DOI: 10.11772/j.issn.1001-9081.2022091404
2022 10th CCF Conference on Big Data
Tian HE1, Zongxin SHEN1, Qianqian HUANG2, Yanyong HUANG1
Received: 2022-09-20
Revised: 2022-10-27
Accepted: 2022-11-03
Online: 2023-09-10
Published: 2023-09-10
Contact: Yanyong HUANG
About author: HE Tian, born in 1995, from Chongqing, M. S. His research interests include data mining.
Tian HE, Zongxin SHEN, Qianqian HUANG, Yanyong HUANG. Adaptive learning-based multi-view unsupervised feature selection method[J]. Journal of Computer Applications, 2023, 43(9): 2657-2664.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022091404
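| Symbol | Meaning |
|---|---|
| | Data of the v-th view in the dataset |
| | The i-th sample of the v-th view in the dataset |
| Y | Fuzzy membership matrix |
| | Fuzzy membership degree of the i-th sample to the k-th cluster |
| n | Number of samples |
| m | Number of views |
| c | Number of classes |
| S | Sample similarity matrix |
| | Element in the i-th row and j-th column of the sample similarity matrix S |
| | Number of features in the v-th view |
| | View weight of the v-th view |
| | Feature weight vector of the v-th view |
| α | Hyperparameter controlling the sparsity of the similarity matrix S |
| β | Hyperparameter controlling the sparsity of the feature weight vectors |
| | Hyperparameter for fusing the two clustering processes |
| γ | Hyperparameter controlling view sparsity |
| | Hyperparameter controlling the number of connected components in the similarity matrix S |
Tab. 1 Symbols and their meanings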
Dataset | Views | Samples | Classes | Feature dimensions |
---|---|---|---|---|
Yale | 4 | 165 | 15 | 256/256/256/256 |
MSRCV1 | 6 | 210 | 7 | 1 203/48/512/100/256/210 |
Politics | 7 | 348 | 7 | 1 047/1 051/348/348/348/348/348 |
Politicsuk | 3 | 419 | 5 | 2 879/419/419 |
WikipediaArticles | 2 | 693 | 10 | 128/10 |
Rugby | 2 | 854 | 15 | 854/854 |
WebKB | 2 | 1 051 | 2 | 2 949/334 |
Caltech101-7 | 6 | 1 474 | 7 | 48/40/254/1 984/512/928 |
Tab. 2 Statistics of datasets
Dataset | All-feature | ALMUFS | ACSL | ASVM | CGMV-UFS | OMVFS | LS |
---|---|---|---|---|---|---|---|
Yale | 44.65±4.01 | 36.24±2.68* | 41.49±4.51* | 36.93±2.64* | 36.38±3.35* | 39.82±4.64* | |
MSRCV1 | 68.70±8.21* | 72.83±8.41 | 63.17±9.02* | 55.56±6.87* | 61.98±7.28* | 68.27±8.84 | |
Politics | 49.83±6.89* | 63.65±9.02 | 50.76±4.82* | 47.84±4.80* | 44.01±3.94* | 42.27±4.21* | |
Politicsuk | 72.11±7.15 | 63.52±6.26* | 52.00±5.03* | 48.23±2.03* | 56.78±7.73* | 59.90±4.77* | |
WikipediaArticles | 23.71±1.36* | 45.00±3.55 | 25.83±2.28* | 20.34±1.05* | 27.19±1.95* | 25.91±3.34* | |
Rugby | 44.31±6.55* | 52.08±5.16 | 36.89±5.79* | 39.48±4.94* | 27.70±3.81* | 40.27±6.01* | |
WebKB | 78.01±0.18 | 63.46±0.83* | 66.37±2.79* | 66.64±0.74* | 66.56±1.60* | 66.06±1.70* | |
Caltech101-7 | 51.93±6.67 | 54.21±7.60 | 52.42±6.49 | 50.60±7.36* | 50.66±4.88 | 52.27±5.87 | |
Tab. 3 ACC results with feature selection ratio of 0.4
Dataset | All-feature | ALMUFS | ACSL | ASVM | CGMV-UFS | OMVFS | LS |
---|---|---|---|---|---|---|---|
Yale | 50.06±4.05 | 38.63±2.61* | 44.90±3.72* | 39.56±2.15* | 39.48±2.81* | 42.34±3.73* | |
MSRCV1 | 72.58±6.94* | 76.82±6.07 | 68.57±7.14* | 59.44±5.62* | 66.22±5.57* | 72.88±6.71* | |
Politics | 53.00±5.52* | 72.02±6.29 | 53.00±4.00* | 50.52±4.47* | 47.08±2.94* | 46.40±4.00* | |
Politicsuk | 78.71±11.60 | 63.12±3.53* | 59.36±4.21* | 53.01±3.40* | 63.19±7.00* | 63.39±2.68* | |
WikipediaArticles | 25.83±1.45* | 46.72±2.94 | 26.66±1.83* | 21.72±0.92* | 29.60±1.99* | 27.17±2.67* | |
Rugby | 47.70±5.35* | 56.14±4.28 | 40.87±4.40* | 42.35±4.27* | 33.52±2.59* | 44.82±5.11* | |
WebKB | 87.45±0.38 | 65.57±0.55* | 68.25±3.90* | 67.71±0.50* | 67.63±1.17* | 66.70±1.09* | |
Caltech101-7 | 55.49±5.27 | 57.35±5.19 | 54.36±5.59* | 53.41±6.28* | 54.58±3.23* | 56.29±4.39 | |
Tab. 4 F-measure results with feature selection ratio of 0.4
1 | LIU H, MOTODA H. Feature Selection for Knowledge Discovery and Data Mining, SECS 454[M]. New York: Springer, 1998: 1-10. 10.1007/978-1-4615-5689-3 |
2 | LI J D, CHENG K W, WANG S H, et al. Feature selection: a data perspective[J]. ACM Computing Surveys, 2017, 50(6): No.94. 10.1145/3136625 |
3 | STAŃCZYK U, JAIN L C. Feature selection for data and pattern recognition: an introduction[M]// Feature Selection for Data and Pattern Recognition, SCI 584. Berlin: Springer, 2015: 1-7. 10.1007/978-3-662-45620-0_1 |
4 | MOHAMAD M A, HASSAN H, NASIEN D, et al. A review on feature extraction and feature selection for handwritten character recognition[J]. International Journal of Advanced Computer Science and Applications, 2015, 6(2): 204-212. 10.14569/ijacsa.2015.060230 |
5 | FENG L, CAI L, LIU Y, et al. Multi-view spectral clustering via robust local subspace learning[J]. Soft Computing, 2017, 21(8): 1937-1948. 10.1007/s00500-016-2120-3 |
6 | CAI J, LUO J W, WANG S L, et al. Feature selection in machine learning: a new perspective[J]. Neurocomputing, 2018, 300: 70-79. 10.1016/j.neucom.2017.11.077 |
7 | NGUYEN B H, XUE B, ZHANG M J. A survey on swarm intelligence approaches to feature selection in data mining[J]. Swarm and Evolutionary Computation, 2020, 54: No.100663. 10.1016/j.swevo.2020.100663 |
8 | ZEBARI R R, ABDULAZEEZ A M, ZEEBAREE D Q, et al. A comprehensive review of dimensionality reduction techniques for feature selection and feature extraction[J]. Journal of Applied Science and Technology Trends, 2020, 1(2): 56-70. 10.38094/jastt1224 |
9 | LUALDI M, FASANO M. Statistical analysis of proteomics data: a review on feature selection[J]. Journal of Proteomics, 2019, 198: 18-26. 10.1016/j.jprot.2018.12.004 |
10 | CHANDRA B, GUPTA M. An efficient statistical feature selection approach for classification of gene expression data[J]. Journal of Biomedical Informatics, 2011, 44(4): 529-535. 10.1016/j.jbi.2011.01.001 |
11 | HE X F, CAI D, NIYOGI P. Laplacian score for feature selection[C]// Proceedings of the 18th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2005: 507-514. |
12 | ZHAO Z A, LIU H. Spectral Feature Selection for Data Mining[M]. New York: Chapman & Hall, 2011: 1-110. |
13 | ZHAO Z, WANG L, LIU H. Efficient spectral feature selection with minimum redundancy[C]// Proceedings of the 24th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2010: 673-678. 10.1609/aaai.v24i1.7671 |
14 | WANG Z, FENG Y F, QI T, et al. Adaptive multi-view feature selection for human motion retrieval[J]. Signal Processing, 2016, 120: 691-701. 10.1016/j.sigpro.2014.11.015 |
15 | TANG J L, HU X, GAO H J, et al. Unsupervised feature selection for multi-view data in social media[C]// Proceedings of the 2013 SIAM International Conference on Data Mining. Philadelphia, PA: SIAM, 2013: 270-278. 10.1137/1.9781611972832.30 |
16 | FENG Y F, XIAO J, ZHUANG Y T, et al. Adaptive unsupervised multi-view feature selection for visual concept recognition[C]// Proceedings of the 2012 Asian Conference on Computer Vision, LNCS 7724. Berlin: Springer, 2013: 343-357. |
17 | DONG X, ZHU L, SONG X M, et al. Adaptive collaborative similarity learning for unsupervised multi-view feature selection[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2018: 2064-2070. 10.24963/ijcai.2018/285 |
18 | HOU C P, NIE F P, TAO H, et al. Multi-view unsupervised feature selection with adaptive similarity and view weight[J]. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(9): 1998-2011. 10.1109/tkde.2017.2681670 |
19 | SHAO W X, HE L F, LU C T, et al. Online unsupervised multi-view feature selection[C]// Proceedings of the IEEE 16th International Conference on Data Mining. Piscataway: IEEE, 2016: 1203-1208. 10.1109/icdm.2016.0160 |
20 | TANG C, CHEN J J, LIU X W, et al. Consensus learning guided multi-view unsupervised feature selection[J]. Knowledge-Based Systems, 2018, 160: 49-60. 10.1016/j.knosys.2018.06.016 |
21 | BEZDEK J C, EHRLICH R, FULL W. FCM: the fuzzy c-means clustering algorithm[J]. Computers and Geosciences, 1984, 10(2/3): 191-203. 10.1016/0098-3004(84)90020-7 |
22 | TANG C, ZHENG X, LIU X W, et al. Cross-view locality preserved diversity and consensus learning for multi-view unsupervised feature selection[J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34(10): 4705-4716. 10.1109/tkde.2020.3048678 |
23 | MIAO J Y, YANG T J, SUN L J, et al. Graph regularized locally linear embedding for unsupervised feature selection[J]. Pattern Recognition, 2022, 122: No.108299. 10.1016/j.patcog.2021.108299 |
24 | ZHU X F, LI X L, ZHANG S C, et al. Robust joint graph sparse coding for unsupervised spectral feature selection[J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(6): 1263-1275. 10.1109/tnnls.2016.2521602 |
25 | SOLORIO-FERNÁNDEZ S, CARRASCO-OCHOA J A, MARTÍNEZ-TRINIDAD J F. A review of unsupervised feature selection methods[J]. Artificial Intelligence Review, 2020, 53(2): 907-948. 10.1007/s10462-019-09682-y |
26 | LUO M N, NIE F P, CHANG X J, et al. Adaptive unsupervised feature selection with structure regularization[J]. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(4): 944-956. 10.1109/tnnls.2017.2650978 |
27 | BELHUMEUR P N, HESPANHA J P, KRIEGMAN D J. Eigenfaces vs. fisherfaces: recognition using class specific linear projection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(7): 711-720. 10.1109/34.598228 |
28 | LEE Y J, GRAUMAN K. Foreground focus: unsupervised learning from partially matching images[J]. International Journal of Computer Vision, 2009, 85(2): 143-166. 10.1007/s11263-009-0252-y |
29 | GREENE D, CUNNINGHAM P. Producing a unified graph representation from multiple social network views[C]// Proceedings of the 5th Annual ACM Web Science Conference. New York: ACM, 2013: 118-121. 10.1145/2464464.2464471 |
30 | YANG Y, WANG H. Multi-view clustering: a survey[J]. Big Data Mining and Analytics, 2018, 1(2): 83-107. 10.26599/bdma.2018.9020003 |
31 | ZONG L, ZHANG X, LIU X, et al. Multi-view clustering on data with partial instances and clusters[J]. Neural Networks, 2020, 129: 19-30. 10.1016/j.neunet.2020.05.021 |
32 | SINDHWANI V, NIYOGI P, BELKIN M. Beyond the point cloud: from transductive to semi-supervised learning[C]// Proceedings of the 22nd International Conference on Machine Learning. New York: ACM, 2005: 824-831. 10.1145/1102351.1102455 |
33 | DUECK D, FREY B J. Non-metric affinity propagation for unsupervised image categorization[C]// Proceedings of the IEEE 11th International Conference on Computer Vision. Piscataway: IEEE, 2007: 1-8. 10.1109/iccv.2007.4408853 |