Journal of Computer Applications (sole official website) ›› 2024, Vol. 44 ›› Issue (4): 1099-1106. DOI: 10.11772/j.issn.1001-9081.2023050557

• Artificial Intelligence •

Re-weighted adversarial variational autoencoder and its application in industrial causal effect estimation

Zongyu LI1,2, Siwei QIANG3, Xiaobo GUO3, Zhenfeng ZHU1,2

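  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
    2. Institute of Information Science, Beijing Jiaotong University, Beijing 100044
    3. MYbank, Ant Group, Beijing 100020
  • Received: 2023-05-10 Revised: 2023-07-18 Accepted: 2023-07-24 Online: 2023-08-03 Published: 2024-04-10
  • Corresponding author: Zhenfeng ZHU
  • About the authors: LI Zongyu (born 1998), male, from Hengshui, Hebei, M.S. candidate. Research interests: causal effect estimation and causal reasoning.
    QIANG Siwei (born 1989), male, from Wuhan, Hubei, M.S. candidate. Research interests: causal inference, machine learning, and data mining.
    GUO Xiaobo (born 1986), male, from Luancheng, Hebei, Ph.D. candidate. Research interests: Internet finance, machine learning, and data mining.
    ZHU Zhenfeng (born 1974), male, from Jixi, Heilongjiang, professor, doctoral supervisor, Ph.D. Research interests: computer vision, machine learning, and artificial intelligence. zfzhu@bjtu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61976018)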

Re-weighted adversarial variational autoencoder and its application in industrial causal effect estimation

Zongyu LI1,2, Siwei QIANG3, Xiaobo GUO3, Zhenfeng ZHU1,2

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
    2. Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China
    3. Ant Group MYbank, Beijing 100020, China
  • Received: 2023-05-10 Revised: 2023-07-18 Accepted: 2023-07-24 Online: 2023-08-03 Published: 2024-04-10
  • Contact: Zhenfeng ZHU
  • About the authors: LI Zongyu, born in 1998, M.S. candidate. His research interests include causal effect estimation and causal inference.
    QIANG Siwei, born in 1989, M.S. candidate. His research interests include causal inference, machine learning, and data mining.
    GUO Xiaobo, born in 1986, Ph.D. candidate. His research interests include Internet finance, machine learning, and data mining.
    ZHU Zhenfeng, born in 1974, Ph.D., professor. His research interests include computer vision, machine learning, and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61976018)

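Abstract:

Counterfactual prediction and selection bias are major challenges in causal effect estimation. To effectively characterize the complex confounding distribution of latent covariates while improving the generalization of counterfactual prediction, a re-weighted adversarial variational autoencoder network (RVAENet) model is proposed for industrial causal effect estimation. To debias the confounding distribution, the model borrows the idea of domain adaptation and uses an adversarial learning mechanism to balance the distribution of the latent-variable representations obtained by the variational autoencoder (VAE); on this basis, it learns sample propensity weights to re-weight the samples, further narrowing the distribution gap between the treatment group and the control group. Experimental results show that, in two scenarios of a real-world industrial dataset, the area under the uplift curve (AUUC) of the proposed model improves on that of TEDVAE (Treatment Effect with Disentangled VAE) by 15.02% and 16.02%, respectively; on public datasets, the proposed model generally achieves the best results for average treatment effect (ATE) and precision in estimation of heterogeneous effect (PEHE).

Key words: causal effect estimation, re-weighting, variational autoencoder, counterfactual prediction, selection bias, causal learning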

Abstract:

Counterfactual prediction and selection bias are major challenges in causal effect estimation. To effectively represent the complex confounding distribution of latent covariates and to enhance the generalization ability of counterfactual prediction, a Re-weighted adversarial Variational AutoEncoder Network (RVAENet) model was proposed for industrial causal effect estimation. To debias the confounding distribution, the idea of domain adaptation was adopted, and an adversarial learning mechanism was used to balance the distribution of the latent-variable representations obtained by the Variational AutoEncoder (VAE). On this basis, sample propensity weights were learned to re-weight the samples, further reducing the distribution difference between the treatment group and the control group. The experimental results show that, in two scenarios of the real-world industrial dataset, the Area Under Uplift Curve (AUUC) of the proposed model is improved by 15.02% and 16.02%, respectively, compared with TEDVAE (Treatment Effect with Disentangled VAE). On the public datasets, the proposed model generally achieves the best results for Average Treatment Effect (ATE) and Precision in Estimation of Heterogeneous Effect (PEHE).
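As a rough illustration of the quantities named in the abstract, the following sketch computes the ATE error and the root PEHE from per-sample treatment effects, together with textbook inverse-propensity weights. It rests on several assumptions: the function names (`ate_error`, `sqrt_pehe`, `ipw_weights`) are hypothetical, the metric definitions are the commonly used ones rather than formulas quoted from the paper, and the weighting shown is standard inverse-propensity weighting with fixed propensities, not the learned weights of RVAENet.

```python
import math


def ate_error(tau_true, tau_pred):
    """Absolute gap between the true and the estimated average treatment effect."""
    ate_true = sum(tau_true) / len(tau_true)
    ate_pred = sum(tau_pred) / len(tau_pred)
    return abs(ate_true - ate_pred)


def sqrt_pehe(tau_true, tau_pred):
    """Root mean squared error of the per-sample treatment effect estimates."""
    squared = [(t - p) ** 2 for t, p in zip(tau_true, tau_pred)]
    return math.sqrt(sum(squared) / len(squared))


def ipw_weights(treatment, propensity, eps=1e-6):
    """Textbook inverse-propensity weights (an assumption, not the paper's method).

    treatment[i] is 1 for the treatment group and 0 for the control group;
    propensity[i] is an estimate of P(treatment = 1 | covariates).
    """
    weights = []
    for t, p in zip(treatment, propensity):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid division by zero
        weights.append(1.0 / p if t == 1 else 1.0 / (1.0 - p))
    return weights


if __name__ == "__main__":
    # Toy per-sample effects, made up purely for illustration.
    true_effects = [1.0, 2.0, 3.0]
    predicted_effects = [1.0, 2.0, 6.0]
    print("ATE error:", ate_error(true_effects, predicted_effects))
    print("sqrt(PEHE):", sqrt_pehe(true_effects, predicted_effects))
    print("IPW weights:", ipw_weights([1, 0], [0.8, 0.8]))
```

Both metrics require the true per-sample effects, which are normally available only on simulated or semi-synthetic benchmarks. That may be why the abstract reports AUUC rather than ATE or PEHE for the industrial scenarios.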

Key words: causal effect estimation, re-weighting, Variational AutoEncoder (VAE), counterfactual prediction, selection bias, causal learning

CLC Number: