Traffic prediction based on spatio-temporal bottleneck attention enhanced by pre-trained language model

doi:10.11772/j.issn.1001-9081.2026010088

Abstract

To address the high computational complexity and the difficulty of modeling complex spatio-temporal dependencies in large-scale traffic flow prediction, a spatio-temporal bottleneck attention model enhanced by a Pre-trained Language Model (PLM), termed STBA-PLM, was proposed. First, traffic flow data were mapped into spatio-temporally aware node representations through spatio-temporal tokenization. Then, a bottleneck attention encoding module integrating spatial regions and temporal patterns was constructed to compress large-scale node-level representations into constant-size regional representations, thereby decoupling the PLM input sequence length from the number of nodes. On this basis, the compressed regional representations were fed into the PLM for deep feature extraction, and a symmetric decoder was employed to reconstruct the node-level representations. Meanwhile, a temporal difference iteration module was designed to explicitly model the differences between historical and future temporal patterns, enhancing the model's capability for temporal pattern learning. Experimental results on the PEMS04, PEMS07, and PEMS08 datasets show that STBA-PLM outperforms models such as PDFormer, SFADNet, and ST-LLM in both prediction accuracy and computational efficiency. Compared with ST-LLM, a PLM-based model that focuses more on spatial relationships, the proposed model reduces RMSE, MAE, and MAPE by at least 2.6%, 4.1%, and 9.6%, respectively, and when the number of nodes reaches 1000, its training time is reduced by 93.3%. These results demonstrate that the proposed model reduces computational complexity while still capturing the spatio-temporal dependencies and temporal patterns in traffic data, which makes it suitable for large-scale road network traffic flow prediction tasks.
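The compression step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact form of the bottleneck attention, so the code below assumes a simple variant: a fixed set of region queries cross-attends over all node tokens (encoder), and the node tokens cross-attend back over the region tokens (symmetric decoder). The names `region_queries`, `encode`, `decode`, the sizes `NUM_REGIONS` and `DIM`, and the absence of learned projections are all illustrative assumptions, not the authors' implementation. It is written in dependency-free Python so that it runs as is.

```python
import math
import random

def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def softmax_rows(scores):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in scores:
        peak = max(row)
        exps = [math.exp(s - peak) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def cross_attention(queries, keys_values):
    """Single-head scaled dot-product cross-attention (no learned projections)."""
    dim = len(queries[0])
    scores = matmul(queries, transpose(keys_values))
    scaled = [[s / math.sqrt(dim) for s in row] for row in scores]
    return matmul(softmax_rows(scaled), keys_values)

def random_matrix(rows, cols, rng):
    """Create a rows x cols matrix of standard normal samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
NUM_REGIONS, DIM = 8, 16  # hypothetical sizes, not taken from the paper
# Stand-in for learned region queries (assumption: random values for illustration).
region_queries = random_matrix(NUM_REGIONS, DIM, rng)

def encode(node_tokens):
    """Compress N node tokens into NUM_REGIONS region tokens."""
    return cross_attention(region_queries, node_tokens)

def decode(node_tokens, region_tokens):
    """Map region tokens back to one output row per node."""
    return cross_attention(node_tokens, region_tokens)

if __name__ == "__main__":
    for num_nodes in (100, 1000):
        nodes = random_matrix(num_nodes, DIM, rng)
        regions = encode(nodes)
        # A PLM would process `regions` at this point; that step is omitted here.
        restored = decode(nodes, regions)
        print(num_nodes, len(regions), len(restored))
```

In this sketch the region token count stays at `NUM_REGIONS` whether there are 100 or 1000 nodes, so any sequence model placed between `encode` and `decode` sees a fixed-length input, while the two cross-attention steps scale linearly with the number of nodes.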

CLC Number: TP391.4

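柴艺函, 王语雁, 杜圣东, 胡节. Traffic prediction based on spatio-temporal bottleneck attention enhanced by pre-trained language model [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2026010088.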

