Quantum K-Means algorithm based on Hamming distance

doi:10.11772/j.issn.1001-9081.2022091469

Journal of Computer Applications, 2023, Vol. 43, Issue (8): 2493-2498. DOI: 10.11772/j.issn.1001-9081.2022091469

Jing ZHONG, Chen LIN, Zhiwei SHENG, Shibin ZHANG

School of Cybersecurity, Chengdu University of Information Technology, Chengdu Sichuan 610225, China

Received: 2022-10-08 Revised: 2023-02-09 Accepted: 2023-02-10 Online: 2023-08-07 Published: 2023-08-10
Contact: Zhiwei SHENG
About author: ZHONG Jing, born in 1997 in Neijiang, Sichuan, M. S. candidate. Her research interests include quantum clustering algorithms and quantum fuzzy algorithms.
LIN Chen, born in 1990 in Zhangye, Gansu, Ph. D., lecturer. His research interests include quantum fault-tolerant computing and quantum machine learning.
ZHANG Shibin, born in 1970 in Chongqing, Ph. D., professor, CCF senior member. His research interests include quantum computing and quantum secure communication, network and information security, artificial intelligence and system security, and blockchain technology and applications.
Supported by:
National Natural Science Foundation of China (62076042)

Abstract:

The K-Means algorithm typically uses Euclidean distance to measure the similarity between data points when processing large-scale heterogeneous data, but this approach suffers from low efficiency and high computational complexity. Inspired by the significant advantages of Hamming distance in computing data similarity, a Quantum K-Means algorithm based on Hamming distance (QKMH) was proposed. First, the data were prepared as quantum states, and the quantum Hamming distance was used to compute the similarity between each point to be clustered and the K cluster centers. Then, Grover’s minimum search algorithm was improved to find the cluster center closest to the point to be clustered. Finally, these steps were repeated until the specified number of iterations was reached or the cluster centers no longer changed. Using the quantum simulation framework Qiskit, the proposed algorithm was validated on the MNIST handwritten digit dataset and compared with several traditional and improved methods. Experimental results show that the F1 score of the QKMH algorithm is 10 percentage points higher than that of the Manhattan distance-based quantum K-Means algorithm and 4.6 percentage points higher than that of the latest optimized Euclidean distance-based quantum K-Means algorithm. In addition, the calculated time complexity of the QKMH algorithm is lower than that of the compared algorithms.
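The three steps in the abstract (distance computation, nearest-center search, repetition until the centers stop changing) can be illustrated with a purely classical analogue. The sketch below is not the authors' quantum circuit: it assumes binary feature vectors, replaces the quantum Hamming distance and the Grover-based minimum search with ordinary NumPy operations, and uses a bitwise majority vote as the center update, which is an assumption made here for illustration. The names `hamming_distance`, `hamming_kmeans`, and `init_centers` are hypothetical.

```python
import numpy as np


def hamming_distance(a, b):
    """Count the positions at which two equal-length binary vectors differ."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))


def hamming_kmeans(points, k, max_iters=20, init_centers=None, seed=0):
    """Classical stand-in for the loop described in the abstract.

    points       : (n, d) array of 0/1 values
    k            : number of clusters
    init_centers : optional (k, d) array; random data points are used if None
    Returns (labels, centers).
    """
    points = np.asarray(points)
    if init_centers is None:
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(points), size=k, replace=False)
        centers = points[idx].copy()
    else:
        centers = np.array(init_centers).copy()

    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iters):
        # Step 1: Hamming distance from every point to every center, shape (n, k).
        dists = (points[:, None, :] != centers[None, :, :]).sum(axis=2)
        # Step 2: nearest center (classical argmin instead of Grover minimum search).
        labels = dists.argmin(axis=1)
        # Center update by bitwise majority vote (assumption, not from the source).
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):
                new_centers[j] = (members.mean(axis=0) >= 0.5).astype(points.dtype)
        # Step 3: stop when the centers no longer change.
        if np.array_equal(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Passing `init_centers` explicitly makes a run deterministic, which is convenient for checking small examples by hand; with random initialization the result depends on `seed`.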

Key words: quantum machine learning, quantum algorithm, quantum K-Means algorithm, Hamming distance, Grover’s search algorithm

CLC number:

TP393.1

Jing ZHONG, Chen LIN, Zhiwei SHENG, Shibin ZHANG. Quantum K-Means algorithm based on Hamming distance[J]. Journal of Computer Applications, 2023, 43(8): 2493-2498.

Figures/Tables (5)

Fig. 1 Overall circuit diagram of quantum Hamming distance

Fig. 2 Detailed circuit diagram of controllable incremental circuit

Fig. 3 Improved addition circuit diagram

Fig. 4 Visualization of three groups of digit clustering results

Tab. 1 Evaluation metric results of the compared quantum K-Means algorithms

Method	Sample	Purity	ARI	Precision	Recall	F1
Method A	0 and 1	0.81	0.68	0.81	0.81	0.81
	3 and 7	0.86	0.78	0.86	0.86	0.86
	6 and 9	0.90	0.89	0.90	0.90	0.90
Method B	0 and 1	0.80	0.68	0.80	0.80	0.80
	3 and 7	0.83	0.72	0.83	0.83	0.83
	6 and 9	0.85	0.73	0.85	0.85	0.85
Method C	0 and 1	0.85	0.73	0.85	0.85	0.85
	3 and 7	0.89	0.77	0.89	0.89	0.89
	6 and 9	0.90	0.82	0.90	0.90	0.90
Method D	0 and 1	0.88	0.78	0.88	0.88	0.88
	3 and 7	0.92	0.85	0.92	0.92	0.92
	6 and 9	0.98	0.96	0.98	0.98	0.98
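
The table can be related to the percentage-point gains quoted in the abstract with a short calculation. The sketch below assumes those gains are differences between per-method mean F1 values over the three digit pairs; this listing does not say which of Methods A to D corresponds to which algorithm, so any mapping is an inference. The names `f1_scores`, `mean_f1`, and `gain_in_points` are hypothetical.

```python
# F1 values copied from Tab. 1, ordered by digit pair: (0, 1), (3, 7), (6, 9).
f1_scores = {
    "Method A": [0.81, 0.86, 0.90],
    "Method B": [0.80, 0.83, 0.85],
    "Method C": [0.85, 0.89, 0.90],
    "Method D": [0.88, 0.92, 0.98],
}


def mean_f1(method):
    """Average F1 of one method over the three digit pairs."""
    values = f1_scores[method]
    return sum(values) / len(values)


def gain_in_points(method, baseline):
    """Difference in mean F1 between two methods, in percentage points."""
    return 100 * (mean_f1(method) - mean_f1(baseline))


if __name__ == "__main__":
    for baseline in ("Method A", "Method B", "Method C"):
        print(baseline, round(gain_in_points("Method D", baseline), 2))
```

Under this averaging assumption, Method D exceeds Method B by about 10 points and Method C by about 4.7 points, which is consistent with the 10 and 4.6 percentage points stated in the abstract if Method D is QKMH, Method B is the Manhattan distance-based algorithm, and Method C is the optimized Euclidean distance-based algorithm.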
