Parallel discovery of fake plates based on historical automatic number plate recognition data

doi:10.11772/j.issn.1001-9081.2016.03.864

Journal of Computer Applications, 2016, Vol. 36, Issue 3: 864-870.

LI Yue^1,2, LIU Chen^1,2

1. Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, North China University of Technology, Beijing 100144, China;
2. Cloud Computing Research Center, North China University of Technology, Beijing 100144, China

Received: 2015-09-02 Revised: 2015-10-01 Online: 2016-03-10 Published: 2016-03-17
Corresponding author: LI Yue
About the authors: LI Yue (born 1992), female, from Changge, Henan, is a master's student whose main research interests are data integration and cloud computing. LIU Chen (born 1980), male, from Laiwu, Shandong, is an associate research fellow with a Ph.D. and a CCF member; his main research interests are stream data integration and analysis, and cloud computing.
Supported by:
This work is partially supported by the Scientific Research Common Program of Beijing Municipal Commission of Education (KM201310009003), the Project of Construction of Innovative Teams and Teacher Career Development for Universities and Colleges under Beijing Municipality (IDHT20130502), and the Training Plan of Top Young Talents in North China University of Technology ("An Incremental Approach to Instant Discovery of Data Correlations Among Multi-Source and Large-scale Sensor Data").

Abstract

Abstract: Existing approaches to detecting fake plate vehicles are costly and inefficient. To address this, a parallel detection approach called TP-Finder is proposed, based on historical Automatic Number Plate Recognition (ANPR) datasets. TP-Finder implements a data partition strategy based on integer partition, which effectively handles the data skew problem that arises in parallel processing of large-scale data and significantly improves the performance of fake plate vehicle detection. In addition, a fake plate vehicle query system was implemented on top of TP-Finder; it accurately presents the historical trajectories of all suspected fake plate vehicles. Finally, the performance of TP-Finder was verified experimentally on a real traffic dataset from a city. The results show that, compared with the default MapReduce partition strategy, the TP-Finder partition strategy yields a performance improvement of up to 20%.

Key words: fake plate vehicle, Automatic Number Plate Recognition (ANPR) dataset, data skew, data partition, MapReduce
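
The data skew problem mentioned in the abstract can be illustrated with a small sketch. It assumes ANPR records are grouped by plate number and that a reducer's cost is roughly the number of records it receives. It compares hash-based assignment (the kind of key hashing a default partitioner relies on, with `zlib.crc32` as a deterministic stand-in) against a simple greedy balancing heuristic. The greedy heuristic is only an illustration of balanced partitioning, not the integer-partition strategy of TP-Finder, whose details are not given on this page. All plate names, counts, and function names are hypothetical.

```python
import heapq
import zlib


def hash_partition_loads(group_sizes, num_reducers):
    """Records per reducer when each key goes to hash(key) % num_reducers."""
    loads = [0] * num_reducers
    for key, size in group_sizes.items():
        # crc32 is a deterministic stand-in for a key hash (assumption).
        reducer = zlib.crc32(key.encode("utf-8")) % num_reducers
        loads[reducer] += size
    return loads


def balanced_assignment(group_sizes, num_reducers):
    """Greedy heuristic: largest group first, always to the lightest reducer."""
    heap = [(0, reducer) for reducer in range(num_reducers)]
    heapq.heapify(heap)
    assignment = {}
    # Sort by descending size; break ties by key so the result is stable.
    ordered = sorted(group_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for key, size in ordered:
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + size, reducer))
    return assignment


def reducer_loads(assignment, group_sizes, num_reducers):
    """Total records each reducer receives under a given assignment."""
    loads = [0] * num_reducers
    for key, reducer in assignment.items():
        loads[reducer] += group_sizes[key]
    return loads


# Hypothetical record counts per plate; one plate dominates the dataset.
group_sizes = {"PLATE-A": 9, "PLATE-B": 3, "PLATE-C": 3, "PLATE-D": 3}
assignment = balanced_assignment(group_sizes, 2)
balanced = reducer_loads(assignment, group_sizes, 2)  # [9, 9]
hashed = hash_partition_loads(group_sizes, 2)  # depends on the hash values
```

With the greedy heuristic both reducers receive 9 records, whereas the hash-based loads depend entirely on where the keys happen to land, which is how a few heavily observed plates can overload a single reducer.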

CLC number: TP274

LI Yue, LIU Chen. Parallel discovery of fake plates based on historical automatic number plate recognition data[J]. Journal of Computer Applications, 2016, 36(3): 864-870.
