Remote sensing image building extraction network based on dual promotion of semantic and detailed features

doi:10.11772/j.issn.1001-9081.2024030387

Abstract

Accurate extraction of edge information is crucial for building segmentation. Current approaches often simply fuse multi-scale detailed features with semantic features, or design complex loss functions to steer the network toward edge information, and they largely ignore the mutual promotion between semantic and detailed features. To address this issue, a remote sensing image building extraction network based on dual promotion of semantic and detailed features was developed. The proposed network follows a U-Net-like structure. Shallow high-resolution detailed feature maps were extracted in the encoder, and a deep Semantic and Detail Feature dual Facilitation (SDFF) module was embedded in the backbone network in the decoder, so that the network extracts both semantic and detailed features well. Channel fusion was then performed on the semantic and detailed features and combined with edge loss supervision on images of varying resolutions, which enhanced both the network's ability to extract building details and its generalization. Experimental results demonstrate that, compared with mainstream methods such as U-Net and the Dual-Stream Detail-Concerned Network (DSDCNet), the proposed network achieves superior semantic segmentation results on the WHU dataset and the Massachusetts buildings (Massachusetts) dataset, preserving building edge features better and effectively improving building segmentation accuracy in remote sensing images.
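To make the idea of edge loss supervision concrete, the following is a minimal sketch of an edge-weighted binary cross-entropy. It is not the loss used in the paper, whose exact form is not given on this page. The function names `edge_map` and `edge_weighted_bce`, the 4-neighbourhood edge definition, and the `edge_weight` value are all assumptions chosen for illustration, and the sketch covers a single resolution only.

```python
import math

def edge_map(mask):
    """Mark building pixels that touch a non-building pixel (4-neighbourhood).

    Assumption: this simple boundary definition stands in for whatever
    edge extraction the paper actually uses.
    """
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Neighbours outside the image are ignored.
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 0:
                    edges[y][x] = 1
                    break
    return edges

def edge_weighted_bce(pred, mask, edge_weight=4.0, eps=1e-7):
    """Weighted mean of per-pixel BCE; edge pixels get weight 1 + edge_weight.

    `edge_weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    edges = edge_map(mask)
    total, weight_sum = 0.0, 0.0
    for y, row in enumerate(mask):
        for x, target in enumerate(row):
            p = min(max(pred[y][x], eps), 1.0 - eps)  # clamp to avoid log(0)
            bce = -(target * math.log(p) + (1 - target) * math.log(1.0 - p))
            weight = 1.0 + edge_weight * edges[y][x]
            total += weight * bce
            weight_sum += weight
    return total / weight_sum

# Toy example: a 3x3 building in a 5x5 tile.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
perfect = [[float(v) for v in row] for row in mask]
edge_error = [row[:] for row in perfect]
edge_error[1][1] = 0.5        # uncertain prediction on a corner (edge) pixel
interior_error = [row[:] for row in perfect]
interior_error[2][2] = 0.5    # same uncertainty on the interior pixel

loss_edge = edge_weighted_bce(edge_error, mask)
loss_interior = edge_weighted_bce(interior_error, mask)
```

In this toy case the same prediction error costs more on a boundary pixel than on the interior pixel, which is the behaviour an edge-supervised loss is meant to encourage. Supervision at several resolutions could be approximated by evaluating such a loss on downsampled masks and summing the terms, but that step is not shown here.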

Key words: remote sensing image, feature fusion, edge extraction, deep learning, semantic segmentation

CLC Number:

TP751.1

Yang ZHOU, Hui LI. Remote sensing image building extraction network based on dual promotion of semantic and detailed features[J]. Journal of Computer Applications, 2025, 45(4): 1310-1316.

Figures/Tables 11

Fig. 1 Structure of DMFFNet

Fig. 2 Structure of high-resolution detail extraction module

Fig. 3 Structure of ASPP module

Fig. 4 Structure of semantic and detail feature dual facilitation module

Fig. 5 Structure of feature fusion module

Fig. 6 Edge loss weight

Tab. 1 Comparison of results of different networks on WHU dataset

Network	Precision/%	Recall/%	F1/%	IoU/%
U-Net [23]	93.36	94.02	93.68	88.13
HRNet [24]	94.10	93.12	93.15	88.72
PSPNet [25]	94.25	94.06	94.12	88.85
DSDCNet [17]	95.57	95.43	95.50	91.39
DMFFNet	95.90	95.51	95.71	91.76
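For readers unfamiliar with the four columns, the sketch below shows how precision, recall, F1 and IoU are conventionally computed for the building class from a predicted and a ground-truth binary mask. The helper names `confusion_counts` and `building_metrics` and the toy masks are hypothetical, and the page does not state whether the paper pools pixels over the whole test set or averages per image.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives
    for the building class (label 1) over two equally sized masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    return tp, fp, fn

def building_metrics(tp, fp, fn):
    """Return precision, recall, F1 and IoU as fractions in [0, 1]."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}

# Toy 4x4 example (not data from the paper).
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],   # one false positive at the right edge
    [0, 0, 1, 0],   # one missed building pixel
    [0, 0, 0, 0],
]

counts = confusion_counts(pred, truth)
metrics = building_metrics(*counts)
```

Multiplying each value by 100 gives percentages comparable in form to the table entries.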

Fig. 7 Extraction results of different models on WHU dataset

Tab. 2 Comparison of results of different networks on Massachusetts dataset

Model	Precision/%	Recall/%	F1/%	IoU/%
U-Net	82.54	77.92	80.61	66.89
HRNet	83.31	84.41	83.86	72.20
PSPNet	85.26	82.72	83.97	72.37
DSDCNet	85.38	84.45	84.91	73.78
DMFFNet	86.01	85.11	85.59	74.76
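As a reading aid for the F1 and IoU columns: when both metrics are computed from the same pooled pixel counts (true positives TP, false positives FP, false negatives FN), one determines the other. The derivation below is standard; applying it to the DMFFNet row is only a rough consistency check, since the page does not say how the paper aggregates its scores.

```latex
\begin{align}
F_1 &= \frac{2\,TP}{2\,TP + FP + FN}, \qquad
\mathrm{IoU} = \frac{TP}{TP + FP + FN} \\
2 - F_1 &= \frac{2\,(TP + FP + FN)}{2\,TP + FP + FN}
\;\Longrightarrow\;
\mathrm{IoU} = \frac{F_1}{2 - F_1} \\
\text{DMFFNet (Tab. 2):}\quad
\frac{0.8559}{2 - 0.8559} &= \frac{0.8559}{1.1441} \approx 0.7481
\end{align}
```

The result, about 74.81%, is close to the reported IoU of 74.76%. The small gap may come from rounding or from per-image averaging, neither of which can be confirmed from this page.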

Fig. 8 Extraction results of different models on Massachusetts dataset

Tab. 3 Results of ablation experiments on Massachusetts dataset

No.	DRM	SDFF	Loss1	Loss2	F1/%	IoU/%
1	√	×	√	×	84.91	73.78
2	√	×	×	√	85.22	73.97
3	×	√	√	×	85.38	74.22
4	×	√	×	√	85.59	74.76

References 25

1	SWAN B, LAVERDIERE M, YANG H L, et al. Iterative Self-Organizing SCEne-LEvel Sampling (ISOSCELES) for large-scale building extraction [J]. GIScience and Remote Sensing, 2022, 59(1): 1-16.
2	WANG L, FANG S, MENG X, et al. Building extraction with Vision Transformer [J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: No.5625711.
3	WEI S, ZHANG T, JI S, et al. BuildMapper: a fully learnable framework for vectorized building contour extraction [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2023, 197: 87-104.
4	WEI S, ZHANG T, YU D, et al. From lines to polygons: polygonal building contour extraction from high-resolution remote sensing imagery [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2024, 209: 213-232.
5	LI X, YAO X, FANG Y. Building-A-Nets: robust building extraction from high-resolution remote sensing images with adversarial networks [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2018, 11(10): 3680-3687.
6	ZORZI S, FRAUNDORFER F. Regularization of building boundaries in satellite images using adversarial and regularized losses [C]// Proceedings of the 2019 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE, 2019: 5140-5143.
7	DING L, TANG H, LIU Y, et al. Adversarial shape learning for building extraction in VHR remote sensing images [J]. IEEE Transactions on Image Processing, 2022, 31: 678-690.
8	QIN X, ZHANG Z, HUANG C, et al. BASNet: boundary-aware salient object detection [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7471-7481.
9	CHEN S, SHI W, ZHOU M, et al. CGSANet: a contour-guided and local structure-aware encoder-decoder network for accurate building extraction from very high-resolution remote sensing imagery [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022, 15: 1526-1542.
10	GUO H, SHI Q, DU B, et al. Scene-driven multitask parallel attention network for building extraction in high-resolution remote sensing images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(5): 4287-4306.
11	BISCHKE B, HELBER P, FOLZ J, et al. Multi-task learning for segmentation of building footprints with deep neural networks [C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 1480-1484.
12	ZHU Q, LIAO C, HU H, et al. MAP-Net: multiple attending path neural network for building footprint extraction from remote sensed imagery [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(7): 6169-6181.
13	LI X H, BAI X C, LI Z J, et al. High-resolution image building extraction based on multi-level feature fusion network [J]. Geomatics and Information Science of Wuhan University, 2022, 47(8): 1236-1244.
14	YANG X Y, WANG X L. Building segmentation model of remote sensing image combining multiscale attention and edge supervision [J]. Laser and Optoelectronics Progress, 2021, 59(22): No.2228004.
15	JIN S, GUAN M, BIAN Y C, et al. Building extraction from remote sensing images based on improved U-Net [J]. Laser and Optoelectronics Progress, 2023, 60(4): No.0401002.
16	DENG W, SHI Q, LI J. Attention-gate-based encoder-decoder network for automatical building extraction [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 2611-2620.
17	ZHANG Z E, PAN J, SHU Q D. Building extraction based on dual-stream detail-concerned network [J]. Geomatics and Information Science of Wuhan University, 2024, 49(3): 376-388.
18	HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
19	TIAN Z, HE T, SHEN C, et al. Decoders matter for semantic segmentation: data-dependent decoding enables flexible feature aggregation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 3121-3130.
20	WEI S, JI S, LU M. Toward automatic building footprint delineation from aerial images using CNN and regularization [J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(3): 2178-2189.
21	JI S, WEI S, LU M. Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set [J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(1): 574-586.
22	MNIH V. Machine learning for aerial image labeling [D]. Toronto: University of Toronto, 2013.
23	RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation [C]// Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 9351. Cham: Springer, 2015: 234-241.
24	SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5686-5696.
25	ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6230-6239.
