Abstract: Mainstream sequence labeling methods are currently based on Recurrent Neural Networks (RNNs). To address the limitations of RNNs in sequence labeling, an improved multilayer Bidirectional Long Short-Term Memory (BLSTM) network was proposed. Each BLSTM layer performs an information fusion operation, so that its output contains more contextual information. In addition, a method for performing Chinese word segmentation and punctuation prediction jointly was proposed. Experiments on public datasets show that the improved multilayer BLSTM model improves the classification accuracy of both Chinese word segmentation and punctuation prediction. When both tasks must be accomplished, the joint method greatly reduces the complexity of the system. The new model and the joint method can also be applied to other sequence labeling problems.
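As an illustration of how two tasks can be cast as a single sequence labeling problem, the following minimal sketch merges a segmentation tag and a punctuation tag into one label per character. The abstract does not specify the labeling scheme, so the B/M/E/S tag set, the `tag-punct` label format, and the function names `bmes_tags`, `joint_labels`, and `decode` are all assumptions made for illustration, not the authors' method.

```python
def bmes_tags(word):
    """Return one segmentation tag per character (assumed B/M/E/S scheme)."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]


def joint_labels(words, punct_after):
    """Build (character, joint_label) pairs for a segmented sentence.

    words: list of words with punctuation already stripped.
    punct_after: one entry per word, the punctuation mark that follows
        the word, or "O" when no punctuation follows.
    """
    if len(words) != len(punct_after):
        raise ValueError("words and punct_after must have the same length")
    pairs = []
    for word, punct in zip(words, punct_after):
        tags = bmes_tags(word)
        for i, (ch, tag) in enumerate(zip(word, tags)):
            # Hypothetical convention: only the last character of a word
            # carries the punctuation label; all others get "O".
            p = punct if i == len(word) - 1 else "O"
            pairs.append((ch, f"{tag}-{p}"))
    return pairs


def decode(pairs):
    """Recover the word and punctuation sequence from joint labels."""
    out, current = [], ""
    for ch, label in pairs:
        tag, punct = label.split("-", 1)
        current += ch
        if tag in ("E", "S"):  # a word ends at this character
            out.append(current)
            current = ""
            if punct != "O":
                out.append(punct)
    return out


if __name__ == "__main__":
    pairs = joint_labels(["我们", "喜欢", "自然语言", "处理"], ["O", "O", "O", "。"])
    for ch, label in pairs:
        print(ch, label)
    print(decode(pairs))
```

Under this assumed scheme, a single tagger predicts one label per character, and both the word boundaries and the punctuation marks are read off the same output sequence, which is one way a joint method can avoid running two separate systems.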
李雅昆, 潘晴, Everett X. WANG. 基于改进的多层BLSTM的中文分词和标点预测[J]. 计算机应用, 2018, 38(5): 1278-1282.
LI Yakun, PAN Qing, WANG Feng. Joint Chinese word segmentation and punctuation prediction based on improved multilayer BLSTM network. Journal of Computer Applications, 2018, 38(5): 1278-1282.