MATCH: multimodal stock prediction framework integrating time-frequency features and hybrid text

doi:10.11772/j.issn.1001-9081.2025080955

Journal of Computer Applications, 2026, Vol. 46, Issue (2): 427-436.

• Artificial intelligence


Hanyue WEI, Chenjuan GUO, Jieyuan MEI, Jindong TIAN, Peng CHEN, Ronghui XU, Bin YANG

School of Data Science and Engineering, East China Normal University, Shanghai 200062, China

Received: 2025-08-20 Revised: 2025-09-11 Accepted: 2025-10-10 Online: 2026-03-02 Published: 2026-02-10
Contact: Chenjuan GUO
About the authors:
WEI Hanyue, born in 1999, from Suzhou, Jiangsu, M.S. candidate. Her research interests include financial time series forecasting and multimodal forecasting.
GUO Chenjuan, born in 1982, from Tieling, Liaoning, Ph.D., professor, CCF member. Her research interests include spatio-temporal data analysis and artificial intelligence. cjguo@dase.ecnu.edu.cn
MEI Jieyuan, born in 2000, from Jinhua, Zhejiang, M.S. candidate. His research interests include time series forecasting.
TIAN Jindong, born in 2000, from Nanchong, Sichuan, Ph.D. candidate, CCF member. His research interests include spatio-temporal data analysis and knowledge-data dual-driven models.
CHEN Peng, born in 1999, from Langxi, Anhui, Ph.D. candidate. His research interests include time series analysis.
XU Ronghui, born in 1999, from Guilin, Guangxi, Ph.D. candidate. Her research interests include spatio-temporal data analysis and multimodal learning.
YANG Bin, born in 1982, from Xi'an, Shaanxi, Ph.D., professor, CCF member. His research interests include artificial intelligence, data analysis and decision intelligence.
Supported by:
National Natural Science Foundation of China (62372179)

Abstract

Existing stock prediction models mostly rely on a single modality and overlook inter-industry linkage effects and information heterogeneity. Although some studies introduce a text modality, they still struggle with the time-lag and multi-granularity problems caused by modality heterogeneity. To address this, MATCH (Multimodal stock prediction frAmework inTegrating time-frequenCy features and Hybrid text), a multimodal fusion framework for stock prediction that integrates heterogeneous information across modalities, is proposed. First, a Mixture of Experts (MoE) pretraining strategy builds a dedicated pretrained representation model for each industry, so that matching expert networks are selected dynamically during prediction and industry feature information is injected. Second, a frequency-domain decomposition and hierarchical fusion mechanism uses a dual-stream pretraining architecture to obtain representations of high-frequency future fluctuations and low-frequency future trends; these representations interact cross-modally with text information at different time scales, which captures market dynamics more precisely and enables effective interaction between time series and text in multi-granularity scenarios. MATCH is compared with mainstream methods such as ESTIMATE (Efficient STock Integration with teMporal generative filters and wavelet hypergraph ATtEntions) and PatchTST on two real-world stock datasets, S&P 500 and CMIN-US. On S&P 500, MATCH improves the Sharpe Ratio (SR) by 50.5% over the second-best baseline, Adv-ALSTM; on the more challenging CMIN-US dataset, MATCH improves SR by 2.35% and achieves the best results on all other metrics. These gains indicate that MATCH offers a novel and efficient solution for multimodal data fusion in finance.
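The abstract describes splitting price dynamics into low-frequency trends and high-frequency fluctuations but does not spell out the transform. The sketch below is only an illustration of the general idea, assuming a plain discrete Fourier transform with a hard cutoff; the function names and the `cutoff` parameter are hypothetical and are not taken from the paper.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(n^2); adequate for a short toy series."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_by_frequency(series, cutoff):
    """Split a series into a low-frequency part (frequency index <= cutoff)
    and a high-frequency part (all remaining bins).

    `cutoff` is a hypothetical knob chosen for illustration only.
    """
    n = len(series)
    spectrum = dft(series)
    low_spec, high_spec = [], []
    for k, coeff in enumerate(spectrum):
        freq_index = min(k, n - k)  # bins k and n-k carry the same frequency
        if freq_index <= cutoff:
            low_spec.append(coeff)
            high_spec.append(0j)
        else:
            low_spec.append(0j)
            high_spec.append(coeff)
    return idft(low_spec), idft(high_spec)


# Toy series: one slow wave (frequency 1) plus a weaker fast wave (frequency 8).
n = 32
series = [
    math.sin(2 * math.pi * t / n) + 0.2 * math.sin(2 * math.pi * 8 * t / n)
    for t in range(n)
]
trend, fluctuation = split_by_frequency(series, cutoff=2)
```

Because the two masks are complementary, `trend` and `fluctuation` add back up to the input series. In this toy case `trend` recovers the slow wave and `fluctuation` the fast one.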

Key words: financial time series, multimodal, Mixture of Experts (MoE) model, pretrained model, time-frequency analysis


CLC Number: TP311.13

Hanyue WEI, Chenjuan GUO, Jieyuan MEI, Jindong TIAN, Peng CHEN, Ronghui XU, Bin YANG. MATCH: multimodal stock prediction framework integrating time-frequency features and hybrid text[J]. Journal of Computer Applications, 2026, 46(2): 427-436.




Dataset	Number of stocks	Time-series source	Text source	Training period	Inference period	Test period
S&P 500	87	Yahoo Finance	Twitter	2014-01-01 to 2014-12-31	2015-01-01 to 2015-10-01	2015-10-01 to 2015-12-31
CMIN-US	110	Google Finance	Yahoo	2018-01-02 to 2020-10-15	2020-10-16 to 2021-03-11	2021-03-12 to 2021-12-31


Dataset	S&P 500					CMIN-US
Model	IC (↑)	RIC (↑)	ICIR (↑)	RICIR (↑)	SR (↑)	IC (↑)	RIC (↑)	ICIR (↑)	RICIR (↑)	SR (↑)
Autoformer	0.0722	0.0597	0.9141	0.7885	0.3515	0.0387	0.0453	0.7173	0.6091	0.2843
DLinear	0.0596	0.0455	0.6730	0.5379	0.4174	0.0215	0.0390	0.5181	0.5028	0.3440
PatchTST	0.0707	0.0542	0.7972	0.6668	0.7456	0.0494	0.0329	0.6088	0.6162	0.6891
iTransformer	0.0436	0.0257	0.5450	0.3525	0.7619	0.0351	0.0276	0.6155	0.5054	0.7320
Adv-ALSTM	0.0714	0.0361	0.8281	0.8718	0.9010	0.0663	0.0362	0.8291	0.7116	0.8854
ESTIMATE	0.0763	0.1153	0.9017	1.0006	0.8943	0.0721	0.0361	0.7354	0.8241	0.7633
MATCH	0.0901	0.0869	1.1519	1.1023	1.3560	0.0819	0.0493	0.8577	0.8311	0.9062
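The SR improvements quoted in the abstract can be re-derived from the SR columns of this table. The short check below copies the MATCH and Adv-ALSTM values from the table; the helper name `relative_improvement` is just an illustrative choice.

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# SR values copied from the results table (MATCH vs. the second-best model, Adv-ALSTM).
sharpe_ratios = {
    "S&P 500": {"MATCH": 1.3560, "Adv-ALSTM": 0.9010},
    "CMIN-US": {"MATCH": 0.9062, "Adv-ALSTM": 0.8854},
}

gains = {
    dataset: round(relative_improvement(values["MATCH"], values["Adv-ALSTM"]), 2)
    for dataset, values in sharpe_ratios.items()
}
print(gains)
```

The computed gains come out at about 50.5% on S&P 500 and 2.35% on CMIN-US, matching the figures stated in the abstract.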


Dataset	S&P 500					CMIN-US
Model	IC (↑)	RIC (↑)	ICIR (↑)	RICIR (↑)	SR (↑)	IC (↑)	RIC (↑)	ICIR (↑)	RICIR (↑)	SR (↑)
MATCH-v0	0.0454	0.0264	0.5824	0.3077	0.9083	0.0517	0.0258	0.6019	0.6984	0.7457
MATCH-v1	0.0829	0.0812	1.0125	1.0031	1.1824	0.0679	0.0471	0.8422	0.7589	0.7395
MATCH-v2	0.0858	0.0801	1.4422	1.2471	0.9283	0.0744	0.0467	0.8621	0.8033	0.7920
MATCH	0.0901	0.0869	1.1519	1.1023	1.3560	0.0819	0.0493	0.8577	0.8311	0.9062
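The tables report IC, RIC, ICIR and RICIR without defining them on this page. The sketch below assumes definitions that are common in quantitative finance (daily cross-sectional Pearson correlation for IC, the same correlation on ranks for RIC, and mean divided by sample standard deviation for the two ratios); the paper's exact formulas may differ, and the toy numbers are invented purely for illustration.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank positions 0..n-1 in ascending order (assumes no ties, for brevity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result


def information_ratio(daily_values):
    """Mean of the daily values divided by their sample standard deviation (assumed definition)."""
    return mean(daily_values) / stdev(daily_values)


# Invented toy data: 3 trading days, 4 stocks per day (predicted score, realized return).
predicted = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.1, 0.3, 0.2], [0.3, 0.2, 0.1, 0.4]]
realized = [[0.01, 0.02, 0.03, 0.05], [0.02, -0.01, 0.03, 0.00], [-0.02, 0.01, 0.03, 0.00]]

ics = [pearson(p, r) for p, r in zip(predicted, realized)]
rank_ics = [pearson(ranks(p), ranks(r)) for p, r in zip(predicted, realized)]

ic, ric = mean(ics), mean(rank_ics)
icir, ricir = information_ratio(ics), information_ratio(rank_ics)
```

Under these assumed definitions, a higher value is better for every metric, which is consistent with the (↑) markers in the table headers.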

