[1] MA J G. Procuratorial Big Data[M]. Beijing: China Procuratorial Press, 2017: 17-23.
[2] ZHANG N, PU Y, YANG S, et al. An ontological Chinese legal consultation system[J]. IEEE Access, 2017, 5: 18250-18261.
[3] CASARI A, ZHENG A. Feature Engineering for Machine Learning[M]. Sebastopol, CA: O'Reilly Media, 2018: 247-251.
[4] LI C L, SU Y C, LIN T W, et al. Combination of feature engineering and ranking models for paper-author identification in KDD Cup 2013[C]// Proceedings of the 2013 KDD Cup 2013 Workshop. New York: ACM, 2013: Article No. 2.
[5] XU Y, HONG K, TSUJII J, et al. Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries[J]. Journal of the American Medical Informatics Association, 2012, 19(5): 824-832.
[6] GALGANI F, COMPTON P, HOFFMANN A. LEXA: building knowledge bases for automatic legal citation classification[J]. Expert Systems with Applications, 2015, 42(17): 6391-6407.
[7] SALTON G, WONG A, YANG C S. A vector space model for automatic indexing[J]. Communications of the ACM, 1975, 18(11): 613-620.
[8] HAMMOUDA K, KAMEL M. Phrase-based document similarity based on an index graph model[C]// Proceedings of the 2002 IEEE International Conference on Data Mining. Washington, DC: IEEE Computer Society, 2002: 203-210.
[9] BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3(4/5): 993-1022.
[10] ROITBLAT H L, KERSHAW A, OOT P. Document categorization in legal electronic discovery: computer classification vs. manual review[J]. Journal of the Association for Information Science and Technology, 2010, 61(1): 70-80.
[11] NOORTWIJK K V, NOORTWIJK K C. Automatic document classification in integrated legal content collections[C]// Proceedings of the 16th International Conference on Artificial Intelligence and Law. New York: ACM, 2017: 129-134.
[12] SULEA O, ZAMPIERI M, MALMASI S, et al. Exploring the use of text classification in the legal domain[J/OL]. arXiv Preprint, 2017, 2017: arXiv:1710.09306[2017-10-25]. https://arxiv.org/abs/1710.09306.
[13] SARIC F, DALBELO B, MOENS M F, et al. Multi-label classification of Croatian legal documents using EuroVoc thesaurus[EB/OL]. [2018-03-20]. http://core.ac.uk/download/pdf/34600531.pdf.
[14] BAJWA I S, KARIM F, NAEEM M A, et al. A semi-supervised approach for catchphrase classification in legal text documents[J]. Journal of Computers, 2017, 12(5): 451-461.
[15] SILVESTRO L D, SPAMPINATO D, TORRISI A. Automatic classification of legal textual documents using C4.5[EB/OL]. [2018-03-20]. http://www.ittig.cnr.it/Ricerca/Testi/Spampinato-Di_Silvestro-Torrisi2009.pdf.
[16] KUSNER M J, SUN Y, KOLKIN N I, et al. From word embeddings to document distances[C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 957-966.
[17] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[J/OL]. arXiv Preprint, 2013, 2013: arXiv:1301.3781(2013-01-16)[2013-09-07]. https://arxiv.org/abs/1301.3781.
[18] MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems. New York: Curran Associates, 2013: 3111-3119.
[19] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C]// Proceedings of the 13th European Conference on Computer Vision. London: Springer, 2014: 818-833.
[20] GOMEZ-PEREZ A, FERNANDEZ-LOPEZ M, CORCHO O. Ontological Engineering[M]. London: Springer, 2004: 173-182.
[21] SUN J J. Jieba Chinese word segmentation tool[CP/OL]. (2018-01-21)[2018-06-25]. https://github.com/fxsjy/jieba.
[22] LEVENSHTEIN V I. Binary codes capable of correcting deletions, insertions, and reversals[J]. Soviet Physics Doklady, 1966, 10(8): 707-710.