Hybrid Causal Model Learning Algorithm Based on Relational Data

doi:10.11772/j.issn.1001-9081.2025111450

Journal of Computer Applications

Received: 2025-12-08 Revised: 2026-01-31 Accepted: 2026-02-03 Online: 2026-02-10 Published: 2026-02-10
Supported by:
National Natural Science Foundation of China

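Lin YAN¹,², Yuhua QIAN³, Saixiong LIU⁴, Jue LI¹

1. Institute of Big Data Science and Industry, Shanxi University
2. Shanxi Key Laboratory of Evolutionary Science Intelligence
3. Shanxi University
4. Institute of Big Data Science and Industry, Shanxi University; Shanxi Key Laboratory of Evolutionary Science Intelligence, Shanxi University

Corresponding author: Yuhua QIAN
Funding:
Key Program of the National Natural Science Foundation of China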

Abstract

Relationships in the real world involve interactions among multiple entity types, and the Relational Causal Model (RCM) provides a clear depiction of such relationships. Learning causal relationships under relational causal models is important for supporting business decision-making in complex scenarios. Most existing algorithms rely on an oracle for relational conditional independence to discover causal relationships and therefore cannot learn from relational data. Algorithms that do learn causal dependencies from relational data typically adopt constraint-based approaches, whose performance is limited by finite sample sizes, resulting in relatively low recall and F1 score. To address these issues, a hybrid algorithm combining constraint-based and score-based methods (RCSH) was proposed. Undirected dependencies were first identified using a heuristic algorithm, and an undirected relational causal model was constructed. The Relational Bivariate Orientation (RBO) rule was then applied to orient the model. With the search space restricted in this way, a greedy hill-climbing algorithm was employed to mitigate the low sensitivity of existing algorithms to long relational paths and multi-attribute dependencies under limited sample sizes. In comparison experiments on synthetic datasets, RCSH improved recall by approximately 12.8% and F1 score by approximately 3.31% over Robust Relational Causal Discovery (RRCD), and its performance rose steadily as the dataset size increased. Furthermore, the applicability and effectiveness of RCSH were validated on real-world datasets.

Keywords: relational causal model, structure learning, causal discovery, relational data, hybrid algorithm
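The score-based phase described in the abstract can be illustrated with a minimal sketch of greedy hill climbing over a restricted search space. This is not the authors' RCSH implementation: it works on flat variable names rather than relational dependencies across entity types, and the names `has_cycle`, `neighbours`, `hill_climb`, and `toy_score` are hypothetical. The skeleton stands in for the undirected dependencies found by the constraint-based phase, `fixed` stands in for edges already oriented (for example by RBO), and the score is a toy placeholder for a data-driven score.

```python
from typing import Callable, FrozenSet, Iterable, Iterator, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def has_cycle(edges: Iterable[Edge]) -> bool:
    """Return True if the directed graph given by `edges` contains a cycle."""
    children = {}
    for cause, effect in edges:
        children.setdefault(cause, set()).add(effect)
    state = {}  # node -> 1 while on the DFS stack, 2 once finished

    def visit(node: str) -> bool:
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        found = any(visit(child) for child in children.get(node, ()))
        state[node] = 2
        return found

    return any(visit(node) for node in list(children))


def neighbours(current: FrozenSet[Edge],
               skeleton: Set[FrozenSet[str]],
               fixed: FrozenSet[Edge]) -> Iterator[FrozenSet[Edge]]:
    """Yield graphs one add/delete/reverse move away, staying inside the skeleton."""
    covered = {frozenset(edge) for edge in current}
    for pair in skeleton:
        if pair not in covered:  # add an edge only where the skeleton allows it
            a, b = sorted(pair)
            yield current | {(a, b)}
            yield current | {(b, a)}
    for edge in current - fixed:  # pre-oriented edges are never touched
        yield current - {edge}
        yield (current - {edge}) | {(edge[1], edge[0])}


def hill_climb(skeleton: Set[FrozenSet[str]],
               score: Callable[[FrozenSet[Edge]], float],
               fixed: FrozenSet[Edge] = frozenset()) -> FrozenSet[Edge]:
    """Greedily apply the best-scoring acyclic move until no move improves the score."""
    current = frozenset(fixed)
    best = score(current)
    while True:
        moves = [g for g in neighbours(current, skeleton, fixed) if not has_cycle(g)]
        if not moves:
            return current
        # Ties are broken by the sorted edge list so the search is deterministic.
        challenger = max(moves, key=lambda g: (score(g), sorted(g)))
        if score(challenger) <= best:
            return current
        current, best = challenger, score(challenger)


# Toy usage. The score below is a hypothetical stand-in that rewards edges from
# a known target graph; a real score would be computed from relational data.
TARGET = {("A", "B"), ("B", "C")}


def toy_score(graph: FrozenSet[Edge]) -> float:
    return sum(1 if edge in TARGET else -1 for edge in graph)


skeleton = {frozenset(("A", "B")), frozenset(("B", "C"))}
learned = hill_climb(skeleton, toy_score, fixed=frozenset({("A", "B")}))
print(sorted(learned))
```

Restricting moves to the skeleton is what keeps the search small: no edge between `A` and `C` is ever considered here, because the constraint-based phase did not report that pair as dependent.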

CLC Number: TP311

Lin YAN, Yuhua QIAN, Saixiong LIU, Jue LI. Hybrid causal model learning algorithm based on relational data [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025111450.
