Image data in real business scenarios typically exhibits rich content and complex distortions, which poses a great challenge to the generalization ability of objective Image Quality Assessment (IQA) algorithms. To address this problem, a No-Reference IQA (NR-IQA) algorithm was proposed, composed of three sub-networks: a Feature Extraction Network (FEN), a Feature Fusion Network (FFN), and an Adaptive Prediction Network (APN). Firstly, the global view, local patch, and saliency view of a sample were fed into the FEN together, and the global distortion, local distortion, and saliency features were extracted by a Swin Transformer. Then, a cascaded Transformer encoder was used to fuse the global and local distortion features and explore the latent correlation patterns between them. Inspired by the human visual attention mechanism, the saliency features were used in the FFN to activate the attention module, enabling the module to pay extra attention to visually salient regions and thereby improving the semantic parsing ability of the algorithm. Finally, the prediction score was computed by a dynamically constructed Multi-Layer Perceptron (MLP) regression network. Experimental results on mainstream synthetic and real-world distortion datasets show that, compared with the DSMix (Distortion-induced Sensitivity map-guided Mixed augmentation) algorithm, the proposed algorithm improves the Spearman Rank-order Correlation Coefficient (SRCC) by 4.3% on the TID2013 dataset and the Pearson Linear Correlation Coefficient (PLCC) by 1.4% on the KonIQ dataset. The proposed algorithm also demonstrates excellent generalization ability and interpretability: it can effectively handle the complex distortions found in business scenarios and make adaptive predictions according to the individual characteristics of each sample.
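To make the described three-branch pipeline (FEN → FFN → APN) concrete, the following is a minimal PyTorch sketch. All module names, dimensions, and the form of the saliency gating are illustrative assumptions rather than the paper's implementation: the Swin Transformer backbone is replaced by a small stand-in CNN so the sketch runs without pretrained weights, and the dynamically constructed regression network is simplified to a plain MLP head.

```python
# Minimal sketch of the three-branch NR-IQA pipeline described in the abstract.
# Assumptions are marked in comments; this is not the authors' implementation.

import torch
import torch.nn as nn


class StandInBackbone(nn.Module):
    """Placeholder for the shared Swin Transformer feature extractor (FEN)."""

    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.GELU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.GELU(),
        )

    def forward(self, x):                      # (B, 3, H, W)
        f = self.net(x)                        # (B, dim, H/4, W/4)
        return f.flatten(2).transpose(1, 2)    # (B, N, dim) token sequence


class FeatureFusionNetwork(nn.Module):
    """FFN: cascaded Transformer encoder over concatenated global/local tokens,
    with a saliency-driven gate standing in for the attention activation."""

    def __init__(self, dim=128, layers=2):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.gate = nn.Linear(dim, dim)  # assumed form of the saliency gating

    def forward(self, f_global, f_local, f_saliency):
        tokens = torch.cat([f_global, f_local], dim=1)   # fuse the two views
        fused = self.encoder(tokens)                     # explore correlations
        # Saliency features modulate the fused tokens (visual-attention cue).
        sal = torch.sigmoid(self.gate(f_saliency.mean(dim=1, keepdim=True)))
        return (fused * sal).mean(dim=1)                 # (B, dim) descriptor


class NRIQAModel(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.fen = StandInBackbone(dim)
        self.ffn = FeatureFusionNetwork(dim)
        # APN stand-in: a fixed MLP head; the paper's per-sample dynamic
        # construction of the regression network is not reproduced here.
        self.apn = nn.Sequential(
            nn.Linear(dim, 64), nn.GELU(), nn.Linear(64, 1))

    def forward(self, x_global, x_local, x_saliency):
        fg = self.fen(x_global)
        fl = self.fen(x_local)
        fs = self.fen(x_saliency)
        return self.apn(self.ffn(fg, fl, fs)).squeeze(-1)  # quality score


if __name__ == "__main__":
    model = NRIQAModel()
    g = torch.randn(2, 3, 64, 64)   # global view
    p = torch.randn(2, 3, 64, 64)   # local patch
    s = torch.randn(2, 3, 64, 64)   # saliency view
    print(model(g, p, s).shape)     # torch.Size([2])
```

The sketch keeps the abstract's structure visible: one shared backbone processes all three views, the encoder fuses global and local tokens before the saliency gate is applied, and a regression head maps the pooled descriptor to a scalar score.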