To address edge information loss and incomplete segmentation of large lesions in endoscopic semantic segmentation networks, a Boundary-Cross Supervised semantic Segmentation Network (BCS-SegNet) with Decoupled Residual Self-Attention (DRA) is proposed. Firstly, DRA is introduced to strengthen the network's ability to learn relationships between spatially distant lesion regions. Secondly, a Cross Level Fusion (CLF) module is constructed to combine multi-level feature maps within the encoder in a pairwise manner, fusing image details with semantic information at low computational cost. Finally, a multi-directional, multi-scale 2D Gabor transform is used to extract edge information, and spatial attention is used to weight the edge features in the feature maps; the weighted edge features supervise the decoding process of the segmentation network and thereby improve pixel-level intra-class segmentation consistency. Experimental results show that BCS-SegNet achieves an mIoU (mean Intersection over Union) of 84.27% and a Dice coefficient of 90.68% on the ISIC2018 dermoscopy dataset, and an mIoU of 79.24% and a Dice coefficient of 87.91% on the Kvasir-SEG/CVC-ClinicDB colonoscopy datasets. On the self-built esophageal endoscopy dataset, BCS-SegNet achieves an mIoU of 82.73% and a Dice coefficient of 90.84%; this mIoU is 3.30% higher than that of U-Net and 4.97% higher than that of UCTransNet. Visually, the proposed network produces more complete segmentation regions and clearer edge details.
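As an illustration of the edge-extraction step, the following is a minimal sketch of a multi-directional, multi-scale 2D Gabor filter bank in plain Python. It is not the paper's implementation: the function names (`gabor_kernel`, `gabor_bank`, `correlate_valid`, `edge_map`), the odd-symmetric (sine-phase) carrier, the kernel size, the number of orientations, the wavelengths, and the ratio `sigma = 0.56 * wavelength` are all assumptions chosen for the example. The spatial-attention weighting and the decoder supervision described above are not shown.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5):
    """Odd-symmetric (sine-phase) 2D Gabor kernel as a size x size list of lists.

    Assumed parameterisation: Gaussian envelope with spread `sigma` and aspect
    ratio `gamma`, sinusoidal carrier of period `wavelength` along direction `theta`.
    `size` is expected to be odd so the kernel has a centre pixel.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinate frame by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma * sigma))
            # The sine carrier makes the kernel antisymmetric, so it sums to
            # (numerically) zero and gives no response on flat regions.
            row.append(envelope * math.sin(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def gabor_bank(size=7, n_orientations=4, wavelengths=(3.0, 6.0)):
    """Build one kernel per (scale, orientation) pair; all defaults are illustrative."""
    bank = []
    for lam in wavelengths:
        for k in range(n_orientations):
            theta = k * math.pi / n_orientations
            bank.append(gabor_kernel(size, sigma=0.56 * lam, theta=theta, wavelength=lam))
    return bank


def correlate_valid(image, kernel):
    """Plain 2D cross-correlation without padding (output shrinks by size - 1)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            acc = 0.0
            for u in range(k):
                for v in range(k):
                    acc += image[i + u][j + v] * kernel[u][v]
            row.append(acc)
        out.append(row)
    return out


def edge_map(image, bank):
    """Per-pixel maximum absolute response over all scales and orientations."""
    responses = [correlate_valid(image, kern) for kern in bank]
    h, w = len(responses[0]), len(responses[0][0])
    return [[max(abs(r[i][j]) for r in responses) for j in range(w)] for i in range(h)]
```

Applied to a grayscale image stored as a list of rows, `edge_map(image, gabor_bank())` gives a map that is close to zero in uniform regions and large where intensity changes across any of the sampled orientations. In practice a library routine such as an FFT-based or GPU convolution would replace the nested loops; they are kept here only so the sketch has no dependencies.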