Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (9): 2762-2767.DOI: 10.11772/j.issn.1001-9081.2019122249

• Frontier & interdisciplinary applications •

Intelligent house price evaluation model based on ensemble LightGBM and Bayesian optimization strategy

GU Tong1,2, XU Guoliang2, LI Wanlin2, LI Jiahao1,2, WANG Zhiyuan2, LUO Jiangtao2   

  1. College of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China;
    2. Electronic Information and Networking Research Institute, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received:2020-01-09 Revised:2020-02-25 Online:2020-09-10 Published:2020-05-14
  • Supported by:
    This work is partially supported by the Ministry of Education-China Mobile Scientific Research Fund (MCM20170203), the Chongqing Natural Science Foundation (cstc2018jcyjAX0587), and the Chongqing Technology Innovation and Application Demonstration (Industry Key Research and Development) Project (cstc2018jszx-cyzdX0124).

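  • Corresponding author: XU Guoliang
  • About the authors: GU Tong (born 1995), male, from Nanchong, Sichuan, M.S. candidate; research interests: machine learning, data mining. XU Guoliang (born 1973), male, from Jinhua, Zhejiang, professor, Ph.D.; research interests: photoelectric sensing and detection, communication network design and planning, big data analysis and mining. LI Wanlin (born 1963), male, from Guang'an, Sichuan, professor, doctoral supervisor, Ph.D.; research interests: next-generation network technology, autonomous driving, Internet of Vehicles, mobile big data. LI Jiahao (born 1994), male, from Yongchuan, Chongqing, M.S. candidate; research interest: data mining. WANG Zhiyuan (born 1995), male, from Zhumadian, Henan, M.S. candidate; research interest: data mining. LUO Jiangtao (born 1971), male, from Zhengzhou, Henan, professor, doctoral supervisor, Ph.D.; research interests: mobile big data, next-generation network technology, communication network testing and optimization.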

Abstract: Traditional house price evaluation methods rely on a single data source, depend heavily on subjective experience, and idealize the factors they consider. To address these problems, an intelligent evaluation method based on multi-source data and ensemble learning is proposed. First, a feature set is constructed from multi-source data, and the optimal feature subset is extracted using the Pearson correlation coefficient and sequential forward selection. Then, multiple Light Gradient Boosting Machines (LightGBMs) are trained on the constructed features and combined with the Bagging ensemble strategy, and the model is optimized with a Bayesian optimization algorithm. Finally, the method is applied to the problem of house price evaluation. Experimental results on a real house price dataset show that, compared with traditional models such as Support Vector Machine (SVM) and random forest, the new model with ensemble learning and Bayesian optimization improves the evaluation accuracy by 3.15%, and 84.09% of its evaluation results have a percent error within 10%. The proposed model is therefore well suited to intelligent house price evaluation and yields more accurate results.
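
Two of the quantities named in the abstract can be illustrated with a minimal sketch: the Pearson correlation coefficient used as a first feature filter, and the share of evaluation results whose percent error falls within 10%. This is not the authors' implementation. The function names, the 0.3 correlation threshold, and the toy data are all assumptions made for illustration, and the sketch does not cover sequential forward selection, the Bagging of LightGBMs, or Bayesian optimization.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def filter_features(features, target, threshold=0.3):
    """Keep feature names whose |r| with the target reaches the threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]


def share_within(actual, predicted, tolerance=0.10):
    """Fraction of predictions whose absolute percent error is <= tolerance."""
    hits = sum(abs(p - a) / abs(a) <= tolerance
               for a, p in zip(actual, predicted))
    return hits / len(actual)


# Toy data (invented for illustration only).
prices = [100.0, 150.0, 200.0, 250.0]
features = {
    "area": [50.0, 75.0, 100.0, 125.0],  # moves linearly with price
    "noise": [1.0, -1.0, -1.0, 1.0],     # zero covariance with price
}

selected = filter_features(features, prices)
predicted = [105.0, 120.0, 210.0, 251.0]
hit_rate = share_within(prices, predicted)

print(selected)
print(hit_rate)
```

Under this reading, the 84.09% figure reported in the abstract corresponds to the value returned by `share_within` with a tolerance of 0.10, computed over the test set.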

Key words: multi-source data, feature selection, Light Gradient Boosting Machine (LightGBM), ensemble learning, Bayesian optimization, intelligent evaluation of house price

CLC Number: