[1] YANG Z, HE X, GAO J, et al. Stacked attention networks for image question answering[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE, 2016:21-29. [2] MACHAJDIK J, HANBURY A. Affective image classification using features inspired by psychology and art theory[C]//Proceedings of the 18th ACM International Conference on Multimedia. New York:ACM, 2010:83-92. [3] CHEN L, ZHANG H, XIAO J, et al. SCA-CNN:spatial and channel-wise attention in convolutional networks for image captioning[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE, 2017:6298-6306. [4] CHEN T, SALAHELDEEN H M, HE X N, et al. VELDA:relating an image tweet's text and images[C]//Proceedings of the 29th AAAI Conference on Artificial Intelligence. Palo Alto, CA:AAAI Press, 2015:30-36. [5] HU X, TANG J, GAO H, et al. Unsupervised sentiment analysis with emotional signals[C]//Proceedings of the 22nd International Conference on World Wide Web. New York:ACM, 2013:607-618. [6] DAVIDOV D, TSUR O, RAPPOPORT A. Enhanced sentiment learning using twitter hashtags and smileys[C]//Proceedings of the 23rd International Conference on Computational Linguistics:Posters. Stroudsburg, PA:Association for Computational Linguistics, 2010:241-249. [7] TANG D, QIN B, LIU T. Document modeling with gated recurrent neural network for sentiment classification[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Processing. Stroudsburg, PA:Association for Computational Linguistics, 2015:1422-1432. [8] ZHANG X, ZHAO J, LeCUN Y. Character-level convolutional networks for text classification[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge:MIT Press, 2015:649-657. [9] LANG P J. A bio-informational theory of emotional imagery[J]. Psychophysiology, 1979, 16(6):495-512. [10] KANG H. Affective content detection using HMMs[C]//Proceedings of the 11th ACM International Conference on Multimedia. New York:ACM, 2003:259-262. [11] WANG W, HE Q H. A survey on emotional semantic image retrieval[C]//Proceedings of the 15th IEEE International Conference on Image Processing. Piscataway:IEEE, 2008:117-120. [12] BORTH D, JI R, CHEN T, et al. Large-scale visual sentiment ontology and detectors using adjective noun pairs[C]//Proceedings of the 21st ACM International Conference on Multimedia. New York:ACM, 2013:223-232. [13] YOU Q, LUO J, JIN H, et al. Robust image sentiment analysis using progressively trained and domain transferred deep networks[C]//Proceedings of the 29th AAAI conference on Artificial Intelligence. Palo Alto, CA:AAAI Press, 2015:381-388. [14] RAO T, LI X, XU M. Learning multi-level deep representations for image emotion classification[J]. Neural Processing Letters, 2020, 51(3):2043-2061. [15] PÉREZ-ROSAS V, MIHALCEA R, MORENCY L P. Utterancelevel multimodal sentiment analysis[C]//Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA:Association for Computational Linguistics, 2013:973-982. [16] PORIA S, CHATURVEDI I, CAMBRIA E, et al. Convolutional MKL based multimodal emotion recognition and sentiment analysis[C]//Proceedings of the IEEE 16th International Conference on Data Mining. Piscataway:IEEE, 2016:439-448. [17] WANG M, CAO D, LI L, et al. Microblog sentiment analysis based on cross-media bag-of-words model[C]//Proceedings of the 2014 International Conference on Internet Multimedia Computing and Service. New York:ACM, 2014:76-80. [18] CAO D, JI R, LIN D, et al. A cross-media public sentiment analysis system for microblog[J]. Multimedia Systems, 2016, 22(4):479-486. [19] YOU Q, CAO L, JIN H, et al. Robust visual-textual sentiment analysis:When attention meets tree-structured recursive neural networks[C]//Proceedings of the 24th ACM International Conference on Multimedia. New York:ACM, 2016:1008-1017. [20] XU J, HUANG F, ZHANG X, et al. Visual-textual sentiment classification with bi-directional multi-level attention networks[J]. Knowledge-Based Systems, 2019, 178:61-73. [21] XU N, MAO W J, CHEN G D. Multi-interactive memory network for aspect based multimodal sentiment analysis[C]//Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA:AAAI Press, 2019:371-378. [22] YU J F, JIANG J, XIA R. Entity-sensitive attention and fusion network for entity-level multimodal sentiment classification[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, 28:429-439. [23] TRUONG Q T, LAUW H W. VistaNet:visual aspect attention network for multimodal sentiment analysis[C]//Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA:AAAI Press, 2019:305-312. [24] CAI Y T, CAI H Y, WAN X J. Multi-modal sarcasm detection in twitter with hierarchical fusion model[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA:Association for Computational Linguistics, 2019:2506-2515. [25] ZHANG H Z, LUO Y, AI Q M, et al. Look, read and feel:benchmarking ads understanding with multimodal multitask learning[C]//Proceedings of the 28th ACM International Conference on Multimedia. New York:ACM, 2020:430-438. [26] PENNINGTON J, SOCHER R, MANNING C D. GloVe:global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA:Association for Computational Linguistics, 2014:1532-1543. [27] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C]//Proceedings of the 2014 European Conference on Computer Vision, LNCS 8689. Cham:Springer, 2014:818-833. [28] LU D, NEVES L, CARVALHO V, et al. Visual attention model for name tagging in multimodal social media[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA:Association for Computational Linguistics, 2018:1990-1999. [29] VADICAMO L, CARRARA F, CIMINO A, et al. Cross-media learning for image sentiment analysis in the wild[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops. Piscataway:IEEE, 2017:308-317. [30] NIU T, ZHU S A, PANG L, et al. Sentiment analysis on multiview social data[C]//Proceedings of the 2016 International Conference on Multimedia Modeling, LNCS 9517. Cham:Springer, 2016:15-27. |