[1] FARHADI A, HEJRATI M, SADEGHI M A, et al. Every picture tells a story: generating sentences from images[C]// Proceedings of the 2010 European Conference on Computer Vision, LNCS 6314. Berlin: Springer, 2010: 15-29.
[2] JIANG Y, COSTELLO P, FANG F, et al. A gender- and sexual orientation-dependent spatial attentional effect of invisible images[J]. Proceedings of the National Academy of Sciences of the United States of America, 2006, 103(45): 17048-17052.
[3] BAHRAMI B, LAVIE N, REES G. Attentional load modulates responses of human primary visual cortex to invisible stimuli[J]. Current Biology, 2007, 17(6): 509-513.
[4] LU J S, XIONG C M, PARIKH D, et al. Knowing when to look: adaptive attention via a visual sentinel for image captioning[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 3242-3250.
[5] XU K, BA J, KIROS R, et al. Show, attend and tell: neural image caption generation with visual attention[C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 2048-2057.
[6] MAO J H, XU W, YANG Y, et al. Explain images with multimodal recurrent neural networks[EB/OL]. (2014-10-04)[2020-07-28]. http://arxiv.org/pdf/1410.1090.pdf.
[7] VINYALS O, TOSHEV A, BENGIO S, et al. Show and tell: a neural image caption generator[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3156-3164.
[8] ANDERSON P, HE X D, BUEHLER C, et al. Bottom-up and top-down attention for image captioning and visual question answering[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6077-6086.
[9] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778.
[10] 赵宏, 王乐, 王伟杰. 基于BiLSTM-CNN串行混合模型的文本情感分析[J]. 计算机应用, 2020, 40(1): 16-22. (ZHAO H, WANG L, WANG W J. Text sentiment analysis based on serial hybrid model of bi-directional long short-term memory and convolutional neural network[J]. Journal of Computer Applications, 2020, 40(1): 16-22.)
[11] WU J H, ZHENG H, ZHAO B, et al. Large-scale datasets for going deeper in image understanding[C]// Proceedings of the 2019 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2019: 1480-1485.
[12] PAPINENI K, ROUKOS S, WARD T, et al. BLEU: a method for automatic evaluation of machine translation[C]// Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2002: 311-318.
[13] DENKOWSKI M, LAVIE A. Meteor Universal: language specific translation evaluation for any target language[C]// Proceedings of the 9th Workshop on Statistical Machine Translation. Stroudsburg, PA: Association for Computational Linguistics, 2014: 376-380.
[14] LIN C Y. ROUGE: a package for automatic evaluation of summaries[C]// Proceedings of the 2004 ACL Workshop on Text Summarization. Stroudsburg, PA: Association for Computational Linguistics, 2004: 74-81.
[15] VEDANTAM R, ZITNICK C L, PARIKH D. CIDEr: consensus-based image description evaluation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 4566-4575.
[16] 程俊华, 曾国辉, 鲁敦科, 等. 基于Dropout的改进卷积神经网络模型平均方法[J]. 计算机应用, 2019, 39(6): 1601-1606. (CHENG J H, ZENG G H, LU D K, et al. Improved convolutional neural network model averaging method based on Dropout[J]. Journal of Computer Applications, 2019, 39(6): 1601-1606.)
[17] KINGMA D P, BA J L. Adam: a method for stochastic optimization[EB/OL]. (2017-01-30)[2020-08-03]. https://arxiv.org/pdf/1412.6980.pdf.
[18] 马书磊, 张国宾, 焦阳, 等. 一种改进的全局注意机制图像描述方法[J]. 西安电子科技大学学报, 2019, 46(2): 17-22. (MA S L, ZHANG G B, JIAO Y, et al. Improved method for image caption with global attention mechanism[J]. Journal of Xidian University, 2019, 46(2): 17-22.)