Journal of Computer Applications (《计算机应用》), 2024, Vol. 44, Issue (1): 300-310. DOI: 10.11772/j.issn.1001-9081.2023010028

• Frontier and comprehensive applications •

Integrated deep reinforcement learning portfolio model

Jie LONG1, Liang XIE1, Haijiao XU2

  1. College of Science, Wuhan University of Technology, Wuhan, Hubei 430070, China
  2. College of Computer Science, Guangdong University of Education, Guangzhou, Guangdong 510303, China
  • Received: 2023-01-11 Revised: 2023-04-22 Accepted: 2023-04-24 Online: 2023-06-06 Published: 2024-01-10
  • Contact: Liang XIE
  • About the authors: LONG Jie, born in 1996 in Suining, Sichuan, M. S. candidate. His research interests include deep learning and data mining.
    XIE Liang (first contact), born in 1987 in Jingzhou, Hubei, Ph. D., associate professor. His research interests include machine learning, data mining and multimedia retrieval.
    XU Haijiao, born in 1972 in Changde, Hunan, Ph. D., senior engineer. His research interests include big data, deep learning and software engineering.
  • Supported by:
    Natural Science Foundation of Guangdong Province (2020A1515011208); Basic and Applied Basic Research Project of Guangzhou Basic Research Education Plan (202102080353); Natural Science Characteristic Innovation Project of Ordinary Colleges and Universities in Guangdong Province (2019KTSCX117)

Abstract:

The portfolio problem is a hot topic in quantitative trading. Existing deep reinforcement learning-based portfolio models can neither adapt their trading strategies to the market nor make effective use of supervised information. To address these shortcomings, an Integrated Deep Reinforcement Learning Portfolio Model (IDRLPM) is proposed. Firstly, a multi-agent method is used to construct multiple base agents, each with a reward function designed for a different trading style, so that the agents represent different trading strategies. Secondly, an ensemble learning method is used to fuse the features of the base agents' policy networks, yielding an integrated agent that adapts to the market environment. Then, a trend prediction network based on the Convolutional Block Attention Module (CBAM) is embedded in the integrated agent, and its output guides the integrated policy network in adaptively selecting trading proportions. Finally, by alternating supervised deep learning and reinforcement learning during training, IDRLPM makes effective use of the supervised information in the training data to improve profitability. On the constituent stocks of the Shanghai Stock Exchange (SSE) 50 index and of the China Securities Index (CSI) 500, IDRLPM achieves Sharpe Ratios (SR) of 1.87 and 1.88 and Cumulative Returns (CR) of 2.02 and 1.34, respectively; compared with the Ensemble Deep Reinforcement Learning (EDRL) trading model, SR improves by 105% and 55%, and CR by 124% and 79%. The experimental results show that IDRLPM can effectively solve the portfolio problem.

Key words: deep reinforcement learning, portfolio model, ensemble learning, Convolutional Block Attention Module (CBAM), trend prediction
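
The abstract describes base agents with different trading styles whose policies are combined under the guidance of a trend prediction. The sketch below is only a toy illustration of that idea, not the model in the paper: it blends the output weights of two agents with a convex combination, whereas the paper fuses policy-network features and uses a CBAM-based trend network, neither of which is reproduced here. The reward shapes, the `penalty` value, the function names and the `up_probability` signal are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax: maps scores to weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def return_seeking_reward(portfolio_return, volatility):
    """Hypothetical 'aggressive' style: reward is the raw portfolio return.

    `volatility` is ignored; it is kept only so both rewards share a signature.
    """
    return portfolio_return


def risk_averse_reward(portfolio_return, volatility, penalty=0.5):
    """Hypothetical 'conservative' style: return minus a volatility penalty."""
    return portfolio_return - penalty * volatility


def fuse_weights(aggressive_weights, conservative_weights, up_probability):
    """Convex blend of two agents' portfolio weights.

    `up_probability` stands in for a trend predictor's output in [0, 1]
    (an assumption): the higher it is, the more the aggressive agent counts.
    """
    if len(aggressive_weights) != len(conservative_weights):
        raise ValueError("both agents must cover the same assets")
    if not 0.0 <= up_probability <= 1.0:
        raise ValueError("up_probability must lie in [0, 1]")
    return [
        up_probability * a + (1.0 - up_probability) * c
        for a, c in zip(aggressive_weights, conservative_weights)
    ]


# Usage: three assets, made-up scores for the two agents.
aggressive = softmax([2.0, 0.5, 0.1])    # concentrated on the first asset
conservative = softmax([0.3, 0.3, 0.3])  # equal weights
blended = fuse_weights(aggressive, conservative, up_probability=0.8)
```

Because both inputs are valid weight vectors and the blend is convex, the fused weights still sum to 1, so they can be read directly as portfolio proportions.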

CLC number:
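
The abstract reports results as Sharpe Ratio (SR), Cumulative Return (CR) and relative improvement over EDRL. The abstract does not state how these are computed, so the following is a minimal sketch under common conventions that are assumptions here: daily simple returns, a zero risk-free rate by default, annualization over 252 trading days, and CR as the compounded return over the whole period.

```python
import math
import statistics

TRADING_DAYS_PER_YEAR = 252  # assumption: annualization factor, not given in the abstract


def cumulative_return(daily_returns):
    """Compounded return over the whole period: prod(1 + r_t) - 1."""
    wealth = 1.0
    for r in daily_returns:
        wealth *= 1.0 + r
    return wealth - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Annualized Sharpe ratio: mean excess return over its sample std. dev."""
    excess = [r - risk_free_daily for r in daily_returns]
    std = statistics.stdev(excess)  # sample standard deviation (n - 1 denominator)
    if std == 0.0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / std * math.sqrt(TRADING_DAYS_PER_YEAR)


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`; 0.5 means an improvement of 50%."""
    return (new - baseline) / baseline
```

With these helpers, a statement such as "SR improves by 105%" corresponds to `relative_improvement(sr_model, sr_baseline)` being about 1.05, where both arguments come from the same test period.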