Journal of Computer Applications, 2023, Vol. 43, Issue (S1): 95-99. DOI: 10.11772/j.issn.1001-9081.2022050736

• Artificial Intelligence •

Cigarette tar index prediction model based on improved Wide&Deep

Tao ZHOU1,2, Lihua XIE1, Xiaofei WANG3

  1. Shifang Cigarette Factory, China Tobacco Sichuan Industry Limited Liability Company, Shifang Sichuan 618400, China
    2. Information Center, China Tobacco Sichuan Industry Limited Liability Company, Chengdu Sichuan 610020, China
    3. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610041, China
  • Received: 2022-05-23 Revised: 2022-06-15 Accepted: 2022-06-17 Online: 2023-07-04 Published: 2023-06-30
  • Contact: Lihua XIE
  • About the authors: Tao ZHOU (born 1974), male, from Shifang, Sichuan, senior engineer. Research interests: big data analysis, intelligent manufacturing.
    Lihua XIE (born 1995), male, from Jianyang, Sichuan, assistant engineer. Research interests: information technology, intelligent manufacturing. sctobaccoxlh@163.com
    Xiaofei WANG (born 1997), male, from Cili, Hunan, master's degree candidate. Research interests: machine learning, recommendation algorithms.
  • Supported by:
    Western Young Scholars Project of the Chinese Academy of Sciences (RRJZ2021003)

Abstract:

Historical cigarette data used for tar index prediction are small in sample size and high in dimension, which lowers the prediction accuracy of models. To address this problem, a cigarette tar index prediction model based on an improved Wide&Deep model is proposed. First, multiple machine learning models make predictions on the data samples, and their outputs serve as new features. These new features are then fed into the Wide side of the Wide&Deep model, while fusion features are constructed and fed into the Deep side. On the Deep side, second-order features and an attention mechanism are used to build an attention feature intersection layer, which produces high-order feature combinations and improves prediction accuracy. Experimental results show that, compared with the unimproved Wide&Deep model, the proposed model reduces Mean Absolute Error (MAE) by 23.4% and Root Mean Square Error (RMSE) by 21.8%; compared with an improved Wide&Deep model that extracts features with a convolutional neural network, it reduces MAE by 15.0% and RMSE by 16.4%. The proposed model thus effectively improves the accuracy of the cigarette tar index prediction task.
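The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy base models, the parameter layout, the choice of "fusion features" as raw features concatenated with base-model predictions, and the AFM-style attention pooling over pairwise products are all assumptions made here for the example.

```python
import math
import random
from itertools import combinations


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def wide_features(x, base_models):
    """Stacking step: each base model's prediction becomes one Wide-side feature."""
    return [model(x) for model in base_models]


def attention_cross(embeddings, attn_vector):
    """Second-order crossing with attention pooling (assumed AFM-style).

    Builds the element-wise product of every pair of embeddings, scores each
    product against attn_vector, and returns the attention-weighted sum.
    """
    pairs = [[a * b for a, b in zip(ei, ej)]
             for ei, ej in combinations(embeddings, 2)]
    weights = softmax([dot(attn_vector, p) for p in pairs])
    dim = len(attn_vector)
    pooled = [sum(w * p[k] for w, p in zip(weights, pairs)) for k in range(dim)]
    return pooled, weights


def predict(x, base_models, params):
    """Wide output plus Deep output plus bias (untrained, forward pass only)."""
    wide = wide_features(x, base_models)
    # Assumption: "fusion features" = raw features + base-model predictions.
    fused = x + wide
    # Scale one embedding vector per fused feature by the feature's value.
    embeddings = [[value * e for e in emb]
                  for value, emb in zip(fused, params["embed"])]
    pooled, _ = attention_cross(embeddings, params["attn"])
    return dot(params["w_wide"], wide) + dot(params["w_deep"], pooled) + params["bias"]


def init_params(n_raw, n_base, dim, seed=0):
    """Hypothetical random initialisation; a real model would learn these."""
    rng = random.Random(seed)
    n_fused = n_raw + n_base
    return {
        "embed": [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_fused)],
        "attn": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
        "w_wide": [1.0 / n_base] * n_base,
        "w_deep": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
        "bias": 0.0,
    }


# Toy stand-ins for trained regressors (illustrative only).
base_models = [
    lambda x: sum(x) / len(x),
    lambda x: max(x),
    lambda x: min(x),
]
sample = [0.2, 0.5, 0.1, 0.9]
params = init_params(n_raw=len(sample), n_base=len(base_models), dim=4)
prediction = predict(sample, base_models, params)
```

In practice the base models would be fitted regressors and all parameters would be trained jointly; the sketch only shows where the stacked predictions enter the Wide side and where the attention-weighted second-order terms enter the Deep side.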

Key words: machine learning, Wide&Deep model, small sample, index prediction, feature intersection, cigarette tar
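For readers who want to reproduce the kind of comparison reported in the abstract, the two error metrics and the percentage reduction between a baseline and an improved model can be computed as in the sketch below. The numbers in the example are made up for illustration and are not data from the paper; the helper name `relative_reduction` is likewise hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Illustrative values only (not taken from the paper).
y_true = [10.0, 11.0, 12.0]
baseline_pred = [10.5, 10.0, 12.5]
improved_pred = [10.25, 10.5, 12.25]

mae_drop = relative_reduction(mae(y_true, baseline_pred), mae(y_true, improved_pred))
rmse_drop = relative_reduction(rmse(y_true, baseline_pred), rmse(y_true, improved_pred))
print(f"MAE reduced by {mae_drop:.1f}%, RMSE reduced by {rmse_drop:.1f}%")
```

A figure such as "MAE reduced by 23.4%" is read the same way: the improved model's MAE is 23.4% lower than the baseline model's MAE on the same test samples.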

CLC Number: