LIU Qing, FU Yinjin, NI Guiqiang, MEI Jianmin. Distributed deduplication storage system based on Hadoop platform[J]. Journal of Computer Applications, 2016, 36(2): 330-335.