[1] HICKOK D, LESNIAK D, ROWE M. File type detection technology[EB/OL].[2015-10-10]. http://www.micsymposium.org/mics_2005/papers/paper7.pdf. [2] McDANIEL M, HEYDARI M H. Content based file type detection algorithms[C]//Proceedings of the 36th Annual Hawaii International Conference on System Sciences. Washington, DC:IEEE Computer Society, 2003:332a. [3] LI W, WANG K, STOLFO S J, et al. Fileprints:identifying file types by n-gram analysis[C]//Proceedings of the 6th IEEE Systems, Man and Cybernetics Information Assurance Workshop. Piscataway, NJ:IEEE, 2005:64-71. [4] 胡元, 石冰.基于区域划分的KNN文本快速分类算法研究[J]. 计算机科学, 2012, 39(10):182-186.(HU Y, SHI B. Fast KNN text classification algorithm based on area division[J]. Computer Science, 2012, 39(10):182-186.) [5] SONG Q, NI J, WANG G. A fast clustering-based feature subset selection algorithm for high-dimensional data[J]. IEEE Transactions on Knowledge and Data Engineering, 2013, 25(1):1-14. [6] 张永, 孟晓飞, 基于投影追踪的KNN文本分类算法的加速策略[J]. 科学技术与工程, 2014, 36(14):92-96.(ZHANG Y, MENG X F. Accelerated K-nearest neighbors text classification algorithm[J]. Science Technology and Engineering,2014, 36(14):92-96.) [7] 郑洁, 罗军勇, 卢斌.基于统计特征值的文件类型识别算法[J]. 计算机工程, 2007, 33(1):142-144.(ZHENG J, LUO J Y, LU B. Documents type identification based on statistical characteristic[J]. Computer Engineering, 2007, 33(1):142-144.) [8] 史淼, 刘锋.基于PCA和KNN混合算法的文本分类方法[J]. 电脑知识与技术, 2015, 11(10):169-171.(SHI M, LIU F. A hybrid algorithm for text classification based PCA and KNN[J]. Journal of Computer Knowledge and Technology, 2015, 11(10):169-171.) [9] 陈振洲, 李磊, 姚正安.基于SVM的特征加权KNN算法[J]. 中山大学学报(自然科学版), 2005, 44(1):17-20.(CHEN Z Z, LI L, YAO Z A. Feature-weighted K-nearest neighbor algorithm with SVM[J]. Journal of Acta Scientiarum Naturalium Universitatis Sunyatseni, 2005, 44(1):17-20.) [10] 沈志斌, 白清源.基于加权修正的KNN文本分类算法[C]//第二十五届中国数据库学术会议论文集. 重庆:计算机科学, 2008, 38(10A):220-225.(SHEN Z B, BAI Q Y. KNN text classification method based weight modify[C]//NDBC 2008:Proceedings of the 25th National DataBase Conference. Chongqing:Computer Science, 2008, 38(10A):220-225.) [11] 曹鼎, 罗军勇, 尹美娟.基于变长元组的文件类型识别算法[J]. 计算机应用, 2011, 31(7):1894-1900.(CAO D, LUO J Y, YIN M J. Variable length gram based file type identification algorithm[J]. Journal of Computer Applications, 2011, 31(7):1894-1900). [12] PANG G, JIN H, JIANG S. CenKNN:a scalable and effective text classifier[J]. Data Mining and Knowledge Discovery, 2015, 29(3):593-625. [13] EVENSEN J D, LINDAHL S, GOODWIN M. File-type detection using naive Bayes and n-gram analysis[EB/OL].[2015-10-10]. http://ojs.bibsys.no/index.php/NISK/article/download/99/88. [14] AHMED I, LHEE K S, SHIN H, et al. Content-based file-type identification using cosine similarity and a divide-and-conquer approach[J]. IETE Technical Review, 2010, 27(6):465-477. |