[1] 庞俊,谷峪,许嘉,等.相似性连接查询技术研究进展[J].计算机科学与探索,2013,7(1):1-13.(PANG J, GU Y, XU J, et al. Research advance on similarity join queries[J]. Journal of Frontiers of Computer Science and Technology, 2013, 7(1):1-13.) [2] 庞俊,于戈,许嘉,等.基于MapReduce框架的海量数据相似性连接研究进展[J].计算机科学,2015,42(1):1-5.(PANG J, YU G, XU J, et al. Similarity joins on massive data based on MapReduce framework[J]. Computer Science, 2015, 42(1):1-5.) [3] SHIM K, SRIKANT R, AGRAWAL R. High-dimensional similarity joins[J]. IEEE Transactions on Knowledge and Data Engineering, 2002, 14(1):156-171. [4] BÖHM C, BRAUNMVLLER B, KREBS F, et al. Epsilon grid or-der:an algorithm for the similarity join on massive high-dimensional data[C]//Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. New York:ACM, 2001:379-388. [5] KALASHNIKOV D. Super-EGO:fast multi-dimensional similarity join[J]. The VLDB Journal, 2013, 22(4):561-585. [6] DEAN J, GHEMAWAT S. MapReduce:simplified data processing on large clusters[C]//Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation. San Francisco:USENIX Association, 2004:137-150. [7] SEIDL T, FRIES S, BODEN B. MR-DSJ:distance-based self-join for large-scale vector data analysis with MapReduce[C]//Proceedings of the 15th BTW Conference on Database Systems for Business, Technology, and Web. Berlin:Springer, 2013:37-56. [8] FRIES S, BODEN B, STEPIEN G, et al. PHiDJ:parallel similarity self-join for high-dimensional vector data with MapReduce[C]//Proceedings of the 30th IEEE International Conference on Data Engineering. Piscataway, NJ:IEEE, 2014:796-807. [9] LUO W, TAN H, MAO H, et al. Efficient similarity joins on massive high-dimensional datasets using MapReduce[C]//Proceedings of the 13th IEEE International Conference on Mobile Data Management. Piscataway, NJ:IEEE, 2012:1-10. [10] LU W, SHEN Y, CHEN S, et al. Efficient processing of k nearest neighbor joins using MapReduce[J]. Proceedings of the VLDB Endowment, 2012, 5(10):1016-1027. [11] ZHANG C, LI F, JESTES J. Efficient parallel kNN joins for large data in MapReduce[C]//Proceedings of the 15th International Conference on Extending Database Technology. New York:ACM, 2012:38-49. [12] VERNICA R, CAREY M, LI C. Efficient parallel set-similarity joins using MapReduce[C]//Proceedings of the ACM SIGMOD International Conference on Management of Data. New York:ACM, 2010:495-506. [13] RONG C, LU W, WANG X, et al. Efficient and scalable processing of string similarity join[J]. IEEE Transactions on Knowledge and Data Engineering, 2013, 25(10):2217-2230. [14] ELSAYED T, LIN J, OARD D. Pairwise document similarity in large collections with MapReduce[C]//Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA:Association for Computer Linguistics, 2008:265-268. [15] METWALLY A, FALOUTSOS C. V-SMART-Join:a scalable MapReduce framework for all-pair similarity joins of multisets and vectors[C]//Proceedings of the VLDB Endowment, 2012, 5(8):704-715. [16] BARAGLIA R, MORALES G, LUCCHESE C. Document similarity self-join with MapReduce[C]//Proceedings of the 10th IEEE International Conference on Data Mining. Piscataway, NJ:IEEE, 2010:731-736. [17] 刘义,陈荦,景宁,等.海量空间数据的并行Top-k连接查询[J].计算机研究与发展,2011,48(z2):163-172.(LIU Y, CHEN L, JING N, et al. Parallel top-k spatial join query processing on massive spatial data[J]. Journal of Computer Research and Development, 2011, 48(z2):163-172.) [18] 雷斌,许嘉,谷峪,等.概率数据上基于EMD距离的并行Top-k相似性连接算法[J].软件学报,2013,24(S2):188-199.(LEI B, XU J, GU Y, et al. Parallel top-k similarity join algorithm on large probabilistic data based on earth mover's distance[J]. Journal of Software, 2013, 24(S2):188-199.) [19] HUANG J, ZHANG R, BUYYA R, et al. MELODY-JOIN:efficient earth mover's distance similarity joins using MapReduce[C]//Proceedings of the 30th IEEE International Conference on Data Engineering. Piscataway, NJ:IEEE, 2014:808-819. |