Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (2): 427-436.DOI: 10.11772/j.issn.1001-9081.2025080955

• Artificial intelligence •

MATCH: multimodal stock prediction framework integrating time-frequency features and hybrid text

Hanyue WEI, Chenjuan GUO, Jieyuan MEI, Jindong TIAN, Peng CHEN, Ronghui XU, Bin YANG

  1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2025-08-20 Revised: 2025-09-11 Accepted: 2025-10-10 Online: 2026-03-02 Published: 2026-02-10
  • Contact: Chenjuan GUO
  • About author: WEI Hanyue, born in 1999, M. S. candidate. Her research interests include financial time series forecasting, multimodal forecasting.
    GUO Chenjuan, born in 1982, Ph. D., professor. Her research interests include spatio-temporal data analysis, artificial intelligence. cjguo@dase.ecnu.edu.cn
    MEI Jieyuan, born in 2000, M. S. candidate. His research interests include time series forecasting.
    TIAN Jindong, born in 2000, Ph. D. candidate. His research interests include spatio-temporal data analysis, knowledge-data dual-driven models.
    CHEN Peng, born in 1999, Ph. D. candidate. His research interests include time series analysis.
    XU Ronghui, born in 1999, Ph. D. candidate. Her research interests include spatio-temporal data analysis, multimodal learning.
    YANG Bin, born in 1982, Ph. D., professor. His research interests include artificial intelligence, data analysis, decision intelligence.
  • Supported by:
    National Natural Science Foundation of China (62372179)

融合时频特征与混合文本的多模态股票预测框架MATCH

魏涵玥, 郭晨娟, 梅杰源, 田锦东, 陈鹏, 徐榕荟, 杨彬

  1. 华东师范大学 数据科学与工程学院,上海 200062
  • 通讯作者: 郭晨娟
  • 作者简介:魏涵玥(1999—),女,江苏苏州人,硕士研究生,主要研究方向:金融时序预测、多模态预测
    郭晨娟(1982—),女,辽宁铁岭人,教授,博士,CCF会员,主要研究方向:时空数据分析、人工智能 cjguo@dase.ecnu.edu.cn
    梅杰源(2000—),男,浙江金华人,硕士研究生,主要研究方向:时序预测
    田锦东(2000—),男,四川南充人,博士研究生,CCF会员,主要研究方向:时空数据分析、知识-数据双驱动模型
    陈鹏(1999—),男,安徽郎溪人,博士研究生,主要研究方向:时序分析
    徐榕荟(1999—),女,广西桂林人,博士研究生,主要研究方向:时空数据分析、多模态学习
    杨彬(1982—),男,陕西西安人,教授,博士,CCF会员,主要研究方向:人工智能、数据分析、决策智能。
  • 基金资助:
    国家自然科学基金资助项目(62372179)

Abstract:

Existing stock prediction models rely mainly on a single modality and ignore inter-industry linkage effects and information heterogeneity. Although some studies have introduced a textual modality, they still struggle with the time-lag and multi-granularity problems caused by modality heterogeneity. Therefore, MATCH (Multimodal stock prediction frAmework inTegrating time-frequenCy features and Hybrid text), a multimodal fusion framework for stock prediction that effectively integrates heterogeneous information across modalities, was proposed. On the one hand, a Mixture of Experts (MoE) pretraining strategy was designed to build a dedicated pretrained representation model for each industry, so that matched expert networks were selected dynamically during prediction and industry feature information was incorporated. On the other hand, a frequency-domain decomposition and hierarchical fusion mechanism was designed to jointly model temporal patterns at multiple frequencies: a dual-stream pretraining architecture was used to obtain representations of high-frequency future fluctuations and low-frequency future trends, and these representations interacted cross-modally with text information at different time scales, thereby capturing market dynamics more precisely and enabling effective interaction between time series and text in multi-granularity scenarios. MATCH was compared with mainstream methods such as ESTIMATE (Efficient STock Integration with teMporal generative filters and wavelet hypergraph ATtEntions) and PatchTST on two real-world stock datasets, S&P 500 and CMIN-US. Experimental results show that on the S&P 500 dataset, MATCH improves the Sharpe Ratio (SR) by 50.5% over the second-best baseline, Adv-ALSTM; on the more challenging CMIN-US dataset, MATCH improves SR by 2.35% and achieves the best results on all other metrics. These results indicate that MATCH provides a novel and efficient solution for multimodal data fusion in finance.

Key words: financial time series, multimodal, Mixture of Experts (MoE) model, pretrained model, time-frequency analysis
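
To make the idea of separating low-frequency trends from high-frequency fluctuations concrete, the following is a minimal sketch that splits a series with a plain discrete Fourier transform. It is only an illustration: the abstract does not specify how MATCH performs its frequency-domain decomposition, and the function names (`dft`, `inverse_dft_real`, `split_low_high`), the hard `cutoff` threshold, and the toy price series are all assumptions introduced here.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft_real(spectrum):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_low_high(series, cutoff):
    """Split a series into a low-frequency part and a high-frequency residual.

    `cutoff` (a hypothetical parameter) is the largest frequency index kept
    in the low-frequency part; everything above it goes to the residual.
    """
    n = len(series)
    spectrum = dft(series)
    # Bins k and n - k carry the same frequency, so they are kept together.
    low_spectrum = [
        coeff if min(k, n - k) <= cutoff else 0j
        for k, coeff in enumerate(spectrum)
    ]
    low = inverse_dft_real(low_spectrum)
    high = [x - l for x, l in zip(series, low)]
    return low, high


# Toy series (made up for illustration): a slow trend plus a fast oscillation.
n = 64
trend = [10.0 + math.sin(2 * math.pi * t / n) for t in range(n)]
ripple = [0.2 * math.sin(2 * math.pi * 16 * t / n) for t in range(n)]
prices = [a + b for a, b in zip(trend, ripple)]

low, high = split_low_high(prices, cutoff=4)
```

In this toy setting the low-frequency output recovers the slow trend and the residual recovers the fast oscillation; a framework such as the one described would then learn separate representations for each stream rather than use the raw components directly.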

摘要:

现有股票预测模型多基于单一模态,忽视了行业间的联动效应与信息异质性;部分研究虽引入了文本模态,但在处理模态异构所导致的时滞性和多粒度等问题上仍存在不足。因此,提出面向股票市场的融合时频特征与混合文本的多模态股票预测框架MATCH(Multimodal stock prediction frAmework inTegrating time-frequenCy features and Hybrid text)。一方面,设计混合专家(MoE)预训练策略为每个行业构建特定的预训练表征模型,在预测过程中动态选择匹配的专家网络,并注入行业特征信息;另一方面,设计频域分解与层次化融合机制,通过双流预训练架构获取高频未来波动和低频未来趋势的表征,把它们与不同时间尺度的文本信息进行跨模态交互,更精准地捕捉市场动态变化,并实现多粒度场景下的时序与文本有效交互。在2个真实股票数据集S&P 500和CMIN-US上,MATCH与ESTIMATE(Efficient STock Integration with teMporal generative filters and wavelet hypergraph ATtEntions)和PatchTST等主流方法进行对比的实验结果显示,在S&P 500数据集上相较次优基线模型Adv-ALSTM,MATCH的夏普比率(SR)提升了50.5%;在更具有挑战性的CMIN-US数据集上,MATCH的SR提升了2.35%,其余指标均取得了最佳成绩。MATCH预测性能提升明显可为金融多模态数据融合提供新颖且高效的解决方案。

关键词: 金融时间序列, 多模态, 混合专家模型, 预训练模型, 时频分析

CLC Number: