Appraisal expression recognition is a key task in sentiment analysis. Because labeled corpora are scarce, most previous work on appraisal expression recognition has relied on manually constructed rules and templates. To reduce the effort of labeling training data and to exploit the information in unlabeled corpora, a new co-training based algorithm is proposed that uses a large unlabeled corpus together with only a small labeled corpus. The algorithm builds on Tri-training and combines Support Vector Machine (SVM), Maximum Entropy (MaxEnt) and Conditional Random Field (CRF) classifiers to classify candidate appraisal expressions. In a comparison with the earlier single-classifier algorithms, the Tri-training based algorithm effectively improves the performance of appraisal expression recognition in subjective sentences.
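The Tri-training idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three toy classifiers (`NearestCentroid`, `KNearest`) are hypothetical stand-ins for SVM, MaxEnt and CRF, the numeric vectors stand in for features of candidate appraisal expressions, and the bootstrap sampling and error-rate checks of the original Tri-training algorithm are omitted for brevity. The sketch only shows the core loop: each classifier is retrained on the labeled data plus the unlabeled examples on which the other two classifiers agree, and the final label is a majority vote.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


class NearestCentroid:
    """Toy classifier (assumption): predict the label with the closest class mean."""

    def fit(self, X, y):
        grouped = {}
        for x, label in zip(X, y):
            grouped.setdefault(label, []).append(x)
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda l: sq_dist(x, self.centroids[l]))


class KNearest:
    """Toy classifier (assumption): majority label among the k closest points."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.data = list(zip(X, y))
        return self

    def predict(self, x):
        nearest = sorted(self.data, key=lambda item: sq_dist(x, item[0]))[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]


def tri_train(classifiers, X_lab, y_lab, X_unlab, max_rounds=10):
    """Simplified Tri-training loop over exactly three classifiers."""
    assert len(classifiers) == 3
    for clf in classifiers:
        clf.fit(X_lab, y_lab)
    previous = [None, None, None]
    for _ in range(max_rounds):
        pseudo = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # Classifier i learns from examples on which the other two agree.
            agreed = []
            for x in X_unlab:
                label = classifiers[j].predict(x)
                if label == classifiers[k].predict(x):
                    agreed.append((x, label))
            pseudo.append(agreed)
        if pseudo == previous:
            break  # pseudo-labels are stable, so retraining would change nothing
        # Retrain every classifier on labeled data plus its pseudo-labeled set.
        for clf, agreed in zip(classifiers, pseudo):
            clf.fit(list(X_lab) + [x for x, _ in agreed],
                    list(y_lab) + [label for _, label in agreed])
        previous = pseudo
    return classifiers


def vote(classifiers, x):
    """Final prediction: majority vote of the three classifiers."""
    return Counter(clf.predict(x) for clf in classifiers).most_common(1)[0][0]


# Toy usage: four labeled points, four unlabeled points.
X_lab = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
y_lab = ["neg", "neg", "pos", "pos"]
X_unlab = [(0.1, 0.2), (0.3, 0.2), (0.8, 0.9), (0.7, 0.8)]
models = tri_train([NearestCentroid(), KNearest(1), KNearest(3)],
                   X_lab, y_lab, X_unlab)
print(vote(models, (0.1, 0.1)), vote(models, (0.95, 0.9)))
```

Using three different learners, rather than one learner trained on three bootstrap samples, is one way to obtain the disagreement between classifiers that Tri-training depends on; whether the paper also resamples the labeled data is not stated in the abstract.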