Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (5): 1647-1654. DOI: 10.11772/j.issn.1001-9081.2022040542

• Frontier and comprehensive applications

Ultra-short-term photovoltaic power prediction by deep reinforcement learning based on attention mechanism

Zhengkai DING1,2, Qiming FU1,2, Jianping CHEN2,3,4, You LU1,2, Hongjie WU1, Nengwei FANG4, Bin XING4

  1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  2. Jiangsu Key Laboratory of Intelligent Building Energy Efficiency (Suzhou University of Science and Technology), Suzhou, Jiangsu 215009, China
  3. School of Architecture and Urban Planning, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  4. Chongqing Industrial Big Data Innovation Center Company Limited, Chongqing 400707, China
  • Received: 2022-04-21 Revised: 2022-06-13 Accepted: 2022-06-15 Online: 2022-07-01 Published: 2023-05-10
  • Contact: Qiming FU
  • About the authors: DING Zhengkai, born in 1996, M.S. candidate. His research interests include deep reinforcement learning and building intelligence.
    FU Qiming, born in 1985, Ph. D., associate professor. His research interests include reinforcement learning, pattern recognition, building energy saving.
    CHEN Jianping, born in 1963, Ph. D., professor. His research interests include building energy saving, intelligent information processing.
    LU You, born in 1977, Ph. D., associate professor. His research interests include next generation network architecture, cloud computing and big data, blockchain.
    WU Hongjie, born in 1977, Ph. D., professor. His research interests include artificial intelligence, data mining, bioinformatics, industrial internet.
    FANG Nengwei, born in 1980, M. S. His research interests include industrial big data.
    XING Bin, born in 1962, Ph. D. His research interests include industrial big data analysis, intelligent manufacturing, industrial mechanism model, artificial intelligence.
  • Supported by:
    National Key Research and Development Program of China (2020YFC2006602); National Natural Science Foundation of China (62102278); University Natural Science Foundation of Jiangsu Province (21KJA520005); Key Research and Development Program of Jiangsu Province (BE2020026); Natural Science Foundation of Jiangsu Province (BK20190942)

Ultra-short-term photovoltaic power prediction combining attention mechanism and deep reinforcement learning

DING Zhengkai1,2, FU Qiming1,2, CHEN Jianping2,3,4, LU You1,2, WU Hongjie1, FANG Nengwei4, XING Bin4

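  1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009
  2. Jiangsu Key Laboratory of Intelligent Building Energy Efficiency (Suzhou University of Science and Technology), Suzhou, Jiangsu 215009
  3. School of Architecture and Urban Planning, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009
  4. Chongqing Industrial Big Data Innovation Center Company Limited, Chongqing 400707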
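  • Corresponding author: FU Qiming
  • About the authors: DING Zhengkai (born 1996), male, from Yancheng, Jiangsu, M.S. candidate. Research interests: deep reinforcement learning, building intelligence.
    FU Qiming (born 1985), male, from Huai'an, Jiangsu, associate professor, Ph.D., CCF member. Research interests: reinforcement learning, pattern recognition, building energy saving. fqm_1@mail.usts.edu.cn
    CHEN Jianping (born 1963), male, from Nanjing, Jiangsu, professor, Ph.D. Research interests: building energy saving, intelligent information processing.
    LU You (born 1977), male, from Suzhou, Jiangsu, associate professor, Ph.D. Research interests: next generation network architecture, cloud computing and big data, blockchain.
    WU Hongjie (born 1977), male, from Suzhou, Jiangsu, professor, Ph.D. Research interests: artificial intelligence, data mining, bioinformatics, industrial internet.
    FANG Nengwei (born 1980), male, from Beijing, M.S. Research interests: industrial big data.
    XING Bin (born 1962), male, from Beijing, Ph.D. Research interests: industrial big data analysis, intelligent manufacturing, industrial mechanism model, artificial intelligence.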
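  • Supported by:
    National Key Research and Development Program of China (2020YFC2006602); National Natural Science Foundation of China (62102278); University Natural Science Foundation of Jiangsu Province (21KJA520005); Key Research and Development Program of Jiangsu Province (BE2020026); Natural Science Foundation of Jiangsu Province (BK20190942)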

Abstract:

To address the low accuracy of traditional PhotoVoltaic (PV) power prediction models, which are affected by random power fluctuation and tend to ignore important information, the ADDPG and ARDPG models were proposed by combining the attention mechanism with Deep Deterministic Policy Gradient (DDPG) and Recurrent Deterministic Policy Gradient (RDPG), respectively, and a PV power prediction framework was built on this basis. Firstly, the original PV power data and meteorological data were normalized, and the PV power prediction problem was modeled as a Markov Decision Process (MDP), with the historical power data and current meteorological data used as the states of the MDP. Secondly, the attention mechanism was added to the Actor networks of DDPG and RDPG to assign different weights to the components of the state and thus highlight critical information, and the Deep Reinforcement Learning (DRL) agents learned this critical information by interacting with the historical data. Finally, the MDP was solved to obtain the optimal policy and make accurate predictions. Experimental results on the DKASC, Alice Springs PV system data show that ADDPG and ARDPG achieve the best results in Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and coefficient of determination (R²). The proposed models can therefore effectively improve the prediction accuracy of PV power, and can also be extended to other prediction fields such as grid prediction and wind power generation prediction.
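The state construction and attention weighting described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the sample readings, the hand-picked attention scores (standing in for scores that an Actor network would learn), and the reward defined as negative absolute error are all assumptions, since the abstract does not specify them.

```python
import math

def min_max_normalize(values):
    """Scale a list of floats to [0, 1]; a constant series maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def build_state(power_history, weather_now):
    """MDP state: recent normalized power readings followed by current weather features."""
    return list(power_history) + list(weather_now)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(state, scores):
    """Re-weight each state component by its softmax attention weight."""
    weights = softmax(scores)
    return [w * s for w, s in zip(weights, state)], weights

def reward(predicted, actual):
    """Assumed reward (not from the paper): negative absolute prediction error."""
    return -abs(predicted - actual)

# Hypothetical data, for illustration only.
power = min_max_normalize([0.0, 1.2, 2.4, 3.0])   # made-up PV power readings
weather = [0.5, 0.8]                               # made-up normalized weather features
state = build_state(power[-3:], weather)
scores = [0.1, 0.2, 1.5, 0.3, 0.9]                 # stand-in for learned attention scores
weighted_state, weights = attend(state, scores)
```

In an actual DDPG or RDPG agent the attention scores would be produced by trainable layers inside the Actor network, and the action (the predicted power) would be computed from the weighted state.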

Key words: deep reinforcement learning, attention mechanism, PhotoVoltaic (PV) power prediction, Deep Deterministic Policy Gradient (DDPG), Recurrent Deterministic Policy Gradient (RDPG)

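Abstract (translated from the Chinese version):

To address the low prediction accuracy of traditional photovoltaic (PV) power prediction models, which are affected by the random fluctuation of power and tend to ignore important information, the attention mechanism was combined with Deep Deterministic Policy Gradient (DDPG) and Recurrent Deterministic Policy Gradient (RDPG), respectively, to propose the ADDPG and ARDPG models, and a PV power prediction framework was proposed on this basis. Firstly, the original PV power data and meteorological data were standardized, and the PV power prediction problem was modeled as a Markov Decision Process (MDP), with the historical power data and current meteorological data serving as the states of the MDP. Then, the attention mechanism was added to the Actor networks of DDPG and RDPG to assign different weights to the individual components of the state so as to highlight important and critical information, and the critical information in the data was learned through the interaction between the deep reinforcement learning agents and the historical data. Finally, the MDP problem was solved to obtain the optimal policy and make accurate predictions. Experimental results on the DKASC, Alice Springs PV system data show that ADDPG and ARDPG achieve the best results in Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and coefficient of determination (R²). The proposed models can thus effectively improve the prediction accuracy of PV power, and can also be extended to other prediction fields such as grid prediction and wind power generation prediction.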
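The three evaluation metrics named in the abstract (RMSE, MAE and R²) follow their standard definitions, which can be written as the minimal sketch below. The function names and the sample values are hypothetical and are not taken from the paper's experiments.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: sqrt of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (undefined if y_true is constant)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
scores = {"RMSE": rmse(y_true, y_pred), "MAE": mae(y_true, y_pred), "R2": r2(y_true, y_pred)}
```

Lower RMSE and MAE and a higher R² indicate a better fit, which is the sense in which the abstract reports the "best results".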

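Key words: deep reinforcement learning, attention mechanism, photovoltaic power prediction, deep deterministic policy gradient, recurrent deterministic policy gradient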
