The proliferation of multimodal harmful content on social media severely harms public interests and disrupts social order, highlighting the urgent need for effective methods to detect such content. Existing research relies on pre-trained models to extract and fuse multimodal features, often neglecting the limitations of general semantics in harmful content detection tasks and failing to consider the complex, dynamic combinations in which harmful content appears. Therefore, a multimodal harmful content detection method based on weakly supervised modality semantic enhancement (weak-S) was proposed. In this method, weakly supervised modality information was introduced to facilitate the harmful semantic alignment of multimodal features, and a multimodal gated integration mechanism based on low-rank bilinear pooling was designed to differentiate the contributions of different information sources. Experimental results show that the proposed method improves the F1 score by 2.2 and 3.2 percentage points on the Harm-P and MultiOFF datasets, respectively, outperforming State-Of-The-Art (SOTA) models and validating the significance of weakly supervised modality semantics in multimodal harmful content detection. Additionally, the proposed method shows improved generalization on multimodal exaggeration detection tasks.
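The gated integration mechanism mentioned above can be sketched roughly as follows. This is a minimal NumPy illustration of low-rank bilinear pooling with a sigmoid gate over two modalities, not the authors' implementation: all dimensions, weight names (`U`, `V`, `P`, `Wx`, `Wy`, `Wg`), and the exact gating form are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def low_rank_bilinear_pool(x, y, U, V, P):
    """Low-rank bilinear pooling: project each modality into a shared
    rank-r space, combine by element-wise product, then project to the
    output dimension (a low-rank factorization of full bilinear pooling)."""
    return P @ (np.tanh(U @ x) * np.tanh(V @ y))

# Toy dimensions; all weights are random placeholders for illustration.
d_text, d_img, rank, d_out = 8, 6, 4, 5
x = rng.normal(size=d_text)   # stand-in text feature
y = rng.normal(size=d_img)    # stand-in image feature
U = rng.normal(size=(rank, d_text))
V = rng.normal(size=(rank, d_img))
P = rng.normal(size=(d_out, rank))
Wx = rng.normal(size=(d_out, d_text))
Wy = rng.normal(size=(d_out, d_img))
Wg = rng.normal(size=(d_out, d_out))
bg = rng.normal(size=d_out)

# Joint feature from low-rank bilinear pooling of the two modalities.
joint = low_rank_bilinear_pool(x, y, U, V, P)

# A sigmoid gate derived from the joint feature weighs each modality's
# contribution to the fused representation.
g = 1.0 / (1.0 + np.exp(-(Wg @ joint + bg)))
fused = g * (Wx @ x) + (1.0 - g) * (Wy @ y)
```

The gate `g` lies in (0, 1) per dimension, so the fused vector is a convex combination of the projected text and image features, letting the model shift weight between modalities per example.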