Graph-structured data are widely found in the real world, but they often face a shortage of labeled data in practical applications. Methods for Few-Shot Learning (FSL) on graph data aim to classify data with only a few labeled samples. Although these methods perform well in Few-Shot Node Classification (FSNC) tasks, the following problems remain: high-quality labeled data are difficult to obtain, generalization ability is insufficient during parameter initialization, and the topological information in the graph is not fully exploited. To address these problems, a Few-Shot Node Classification model based on Graph Data Augmentation (GDA-FSNC) was proposed. GDA-FSNC consists of four modules: a graph data pre-processing module based on structural similarity, a parameter initialization module, a parameter fine-tuning module, and an adaptive pseudo-label generation module. In the graph data pre-processing module, an adjacency matrix enhancement method based on structural similarity was used to obtain more graph structural information. In the parameter initialization module, to enhance the diversity of information during model training, a mutual teaching-based data augmentation method was used so that each model learns different patterns and features from the other models. In the adaptive pseudo-label generation module, an appropriate pseudo-label generation technique was selected automatically according to the characteristics of each dataset, thereby generating high-quality pseudo-label data. Experimental results on seven real datasets show that the proposed model outperforms state-of-the-art FSL models such as Meta-GNN, GPN (Graph Prototypical Network), and IA-FSNC (Information Augmentation for Few-Shot Node Classification) in classification accuracy. For example, compared to the baseline model IA-FSNC, the classification accuracy of the proposed model is improved by at least 0.27 percentage points in the 2-way 1-shot setting on the small datasets and by at least 2.06 percentage points in the 5-way 1-shot setting on the large datasets. Thus, GDA-FSNC has better classification performance and generalization ability in few-shot scenarios.
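As an illustration of the pre-processing idea, here is a minimal sketch (an assumption about the general approach, not the paper's exact method): edges are added between nodes whose neighborhood structures are highly similar, measured by the cosine similarity of adjacency rows; the function name and threshold are illustrative.

```python
# Sketch: adjacency augmentation via structural (neighborhood) similarity.
import numpy as np

def augment_adjacency(A: np.ndarray, threshold: float = 0.8) -> np.ndarray:
    """Add edges between structurally similar nodes (illustrative)."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                      # avoid division by zero
    S = (A / norms) @ (A / norms).T              # pairwise cosine similarity
    np.fill_diagonal(S, 0.0)                     # ignore self-similarity
    return np.maximum(A, (S >= threshold).astype(A.dtype))

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(augment_adjacency(A))
```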
An infrared small target detection method based on information compensation was proposed to address the problem that infrared small targets are prone to losing texture detail information during network iteration, which decreases the accuracy of target localization and contour segmentation. Firstly, an Image Feature Extraction (IFE) module was used to encode the shallow details and deep semantic features of the infrared image. Secondly, a Multi-level Information Compensation (MIC) module was constructed to compensate the down-sampled features in the encoding stage by aggregating features from adjacent levels. Thirdly, a Global Target Response (GTR) module was introduced to compensate for the locality of convolution by incorporating the global contextual information of the feature map. Finally, an Asymmetric Cross-Fusion (ACF) module was constructed to fuse shallow and deep features, thereby preserving texture and positional information during target decoding and achieving the detection of infrared small targets. Experimental results of training and testing on the publicly available NUAA-SIRST (Nanjing University of Aeronautics and Astronautics-Single-frame InfraRed Small Target) and NUDT-SIRST (National University of Defense Technology-Single-frame InfraRed Small Target) mixed datasets show that compared to methods such as UIUNet (U-Net in U-Net Network), LSPM (Local Similarity Pyramid Modules), and DNANet (Dense Nested Attention Network), the proposed method achieves improvements of 9.2, 8.9, and 5.5 percentage points in Intersection over Union (IoU), respectively, and 6.0, 5.4, and 3.1 percentage points in F1-Score, respectively. These results demonstrate that the proposed method enables accurate detection and effective segmentation of small targets in complex infrared background images.
Aiming at the obstacle avoidance and trajectory smoothness problems of multi-robot path following and formation in crowd environments, a multi-robot path following and formation algorithm based on deep reinforcement learning was proposed. Firstly, a pedestrian danger priority mechanism was established and combined with reinforcement learning to design a danger awareness network that enhances the safety of the multi-robot formation. Subsequently, a virtual robot was introduced as the reference target for the robots, transforming path following into tracking control of the virtual robot and thereby enhancing the smoothness of the robot trajectories. Finally, quantitative and qualitative analyses were conducted through simulation experiments to compare the proposed algorithm with existing ones. The experimental results show that compared with existing point-to-point path following algorithms, the proposed algorithm has excellent obstacle avoidance performance in crowd environments and ensures the smoothness of multi-robot motion trajectories.
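The virtual-robot idea can be illustrated with a minimal sketch (assumed geometry, illustrative names): a virtual leader advances along the reference path at a given speed, and each real robot tracks the leader's pose plus a fixed formation offset, yielding smooth reference targets.

```python
# Sketch: virtual-robot reference generation for formation tracking.
import numpy as np

def virtual_robot_targets(path, t, speed, offsets):
    """Interpolate the virtual robot's pose on the path at time t and
    return one target point per robot (virtual pose + formation offset)."""
    seg = np.diff(path, axis=0)
    seg_len = np.linalg.norm(seg, axis=1)
    s = min(speed * t, seg_len.sum())            # arc length travelled so far
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])
    i = min(int(np.searchsorted(cum, s, side="right") - 1), len(seg) - 1)
    virtual = path[i] + (s - cum[i]) / seg_len[i] * seg[i]
    return [virtual + o for o in offsets]

path = np.array([[0.0, 0.0], [5.0, 0.0], [5.0, 5.0]])
offsets = [np.array([0.0, 1.0]), np.array([0.0, -1.0])]
print(virtual_robot_targets(path, t=3.0, speed=1.0, offsets=offsets))
```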
To address the large scale variation caused by different distances between the monitoring camera and the crowd in crowd analysis tasks, a crowd counting algorithm with multi-scale fusion based on the normal inverse Gamma distribution, named MSF (Multi-Scale Fusion crowd counting) algorithm, was proposed. Firstly, common features were extracted with a traditional backbone, and pedestrian information at different scales was obtained with the multi-scale information extraction module. Secondly, each scale network contained a crowd density estimation module and an uncertainty estimation module for evaluating the reliability of that scale's prediction results. Finally, more accurate density regression results were obtained in the multi-scale prediction fusion module by dynamically fusing the multi-scale predictions according to their reliability. The experimental results show that after extending the existing method Congested Scene Recognition Network (CSRNet) with multi-scale trusted fusion, the Mean Absolute Error (MAE) and Mean Squared Error (MSE) of crowd counting on the UCF-QNRF dataset are decreased significantly by 4.43% and 1.37%, respectively, which verifies the rationality and effectiveness of the MSF algorithm. In addition, unlike existing methods, the MSF algorithm not only predicts the crowd density but also provides the reliability of the prediction during the deployment stage, so that areas predicted inaccurately can be flagged in time in practical applications, reducing the risk of wrong predictions in subsequent analysis tasks.
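One plausible form of the reliability-based fusion (an assumption; the paper derives reliability from a normal inverse Gamma model) is inverse-variance weighting of the per-scale density maps:

```python
# Sketch: fuse per-scale density predictions by their estimated reliability.
import numpy as np

def fuse_multiscale(densities, variances, eps=1e-6):
    """densities, variances: lists of HxW maps, one per scale."""
    densities = np.stack(densities)              # (S, H, W)
    weights = 1.0 / (np.stack(variances) + eps)  # higher variance -> lower weight
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * densities).sum(axis=0)     # trusted fusion result

d1, d2 = np.full((2, 2), 1.0), np.full((2, 2), 3.0)
v1, v2 = np.full((2, 2), 0.1), np.full((2, 2), 0.9)  # scale 1 is more reliable
print(fuse_multiscale([d1, d2], [v1, v2]))           # closer to d1 than d2
```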
Interstitial Lung Disease (ILD) segmentation labels are highly costly, leading to small sample sizes in existing datasets and resulting in poor performance of trained models. To address this issue, a segmentation algorithm for ILD based on multi-task learning was proposed. Firstly, a multi-task segmentation model was constructed based on U-Net. Then, generated lung segmentation labels were used as auxiliary task labels for multi-task learning. Finally, the multi-task loss functions were weighted dynamically to balance the losses of the primary task and the auxiliary task. Experimental results on a self-built ILD dataset show that the Dice Similarity Coefficient (DSC) of the multi-task segmentation model reaches 82.61%, which is 2.26 percentage points higher than that of U-Net. These results demonstrate that the proposed algorithm can improve the segmentation performance for ILD and can assist clinicians in ILD diagnosis.
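A minimal sketch of one common dynamic weighting scheme (homoscedastic-uncertainty weighting; the paper's exact scheme may differ) for balancing the primary ILD loss and the auxiliary lung loss:

```python
# Sketch: learnable dynamic weighting of two task losses.
import torch

log_var_main = torch.zeros(1, requires_grad=True)  # learnable log-variances
log_var_aux = torch.zeros(1, requires_grad=True)

def weighted_total(loss_main: torch.Tensor, loss_aux: torch.Tensor) -> torch.Tensor:
    # Each task loss is scaled by exp(-log_var) and regularized by log_var,
    # so the balance between tasks is learned during training.
    return (torch.exp(-log_var_main) * loss_main + log_var_main
            + torch.exp(-log_var_aux) * loss_aux + log_var_aux)

total = weighted_total(torch.tensor(0.8), torch.tensor(0.3))
total.backward()
print(total.item(), log_var_main.grad, log_var_aux.grad)
```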
Existing learning-based single-image deraining networks mostly focus on the effect of rain streaks on visual imaging while ignoring the effect of fog caused by the increased air humidity in rainy environments, which leads to problems such as low generation quality and blurred texture details in derained images. To address these problems, an asymmetric unsupervised end-to-end image deraining network model was proposed. It mainly consists of a rain-fog removal network, a rain-fog feature extraction network, and a rain-fog generation network, which form two data-domain mapping conversion modules: Rain-Clean-Rain and Clean-Rain-Clean. The three sub-networks constitute two parallel transformation paths: the rain removal path and the rain-fog feature extraction path. In the rain-fog feature extraction path, a rain-fog-aware extraction network based on global and local attention mechanisms was proposed to learn rain-fog related features by exploiting the global self-similarity and local discrepancy of rain-fog features. In the rain removal path, a rainy image degradation model and the extracted rain-fog related features were introduced as prior knowledge to enhance the ability of rain-fog image generation, so as to constrain the rain-fog removal network and improve its mapping conversion capability from the rain data domain to the rain-free data domain. Extensive experiments on different rain image datasets show that compared to the state-of-the-art deraining method CycleDerain, the proposed model improves the Peak Signal-to-Noise Ratio (PSNR) by 31.55% on the synthetic rain-fog dataset HeavyRain. The proposed model can adapt to different rainy scenarios, has better generalization, and can better recover the details and texture information of images.
For the channel access and resource allocation problem in the p-persistent Mobile Ad hoc NETwork (MANET), an adaptive channel access and resource allocation algorithm with low complexity was proposed. Firstly, considering the characteristics of MANET, an optimization problem was formulated to maximize the channel utility of each node. Secondly, the formulated problem was transformed into a Markov decision process, and the state, action, and reward function were defined. Finally, the network parameters were trained based on policy gradient to optimize the competition probability, the priority growth factor, and the number of communication nodes. Simulation results indicate that the proposed algorithm can significantly improve the performance of the p-persistent CSMA (Carrier Sense Multiple Access) protocol. Compared with the schemes with fixed competition probability and predefined p-value, the proposed algorithm improves the channel utility by 45% and 17%, respectively. The proposed algorithm also achieves higher channel utility than the scheme with a fixed number of communication nodes when the total number of nodes is less than 35. Most importantly, as the packet arrival rate increases, the proposed algorithm can fully utilize the channel resource and reduce the idle periods of time slots.
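A minimal sketch of the policy-gradient idea (the toy reward and parameterization are assumptions, not the paper's formulation): REINFORCE adjusts the competition probability p toward the value maximizing the single-transmission success reward, which for n contending nodes is p = 1/n.

```python
# Sketch: REINFORCE update of the competition probability in p-persistent access.
import numpy as np

rng = np.random.default_rng(0)
theta, lr, n_nodes = 0.0, 0.05, 10

for episode in range(2000):
    p = 1.0 / (1.0 + np.exp(-theta))                  # competition probability
    transmitters = rng.random(n_nodes) < p
    reward = 1.0 if transmitters.sum() == 1 else 0.0  # success iff exactly one sends
    grad_logp = transmitters.sum() - n_nodes * p      # d/dtheta of sum of log-probs
    theta += lr * reward * grad_logp                  # policy-gradient step

print("learned p ~", 1.0 / (1.0 + np.exp(-theta)))    # approaches 1/n_nodes
```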
At present, most accelerated Magnetic Resonance Imaging (MRI) reconstruction algorithms reconstruct undersampled amplitude images and use real-valued convolution for feature extraction, without considering that MRI data are inherently complex-valued, which limits the ability to extract features from complex MRI data. To improve the feature extraction ability for single-slice complex MRI data and thus reconstruct single-slice MRI images with clearer details, a Complex Convolution Dual-Domain Cascade Network (ComConDuDoCNet) was proposed. The original undersampled MRI data was used as input, and Residual Feature Aggregation (RFA) blocks were used to alternately extract the dual-domain features of the MRI data, ultimately reconstructing Magnetic Resonance (MR) images with clear texture details. Complex convolution was used as the feature extractor of each RFA block. The two domains were cascaded through the Fourier transform or its inverse, and a data consistency layer was added to achieve data fidelity. Extensive experiments were conducted on a publicly available knee joint dataset. Comparisons with the Dual-task Dual-domain Network (DDNet) under three different sampling masks with a sampling rate of 20% show that: under the two-dimensional Gaussian sampling mask, the proposed algorithm decreases the Normalized Root Mean Square Error (NRMSE) by 13.6%, increases the Peak Signal-to-Noise Ratio (PSNR) by 4.3%, and increases the Structural SIMilarity (SSIM) by 0.8%; under the Poisson sampling mask, it decreases NRMSE by 11.0%, increases PSNR by 3.5%, and increases SSIM by 0.1%; under the radial sampling mask, it decreases NRMSE by 12.3%, increases PSNR by 3.8%, and increases SSIM by 0.2%. The experimental results show that ComConDuDoCNet, combining complex convolution and dual-domain learning, can reconstruct MR images with clearer details and more realistic visual effects.
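The complex convolution can be realized with two real-valued convolutions, following the standard construction (assumed here to match the paper's usage): (W_r + iW_i) * (x_r + ix_i) = (W_r*x_r - W_i*x_i) + i(W_r*x_i + W_i*x_r).

```python
# Sketch: complex 2D convolution built from two real convolutions.
import torch
import torch.nn as nn

class ComplexConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, padding=0):
        super().__init__()
        self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)
        self.conv_i = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)

    def forward(self, x_r, x_i):
        real = self.conv_r(x_r) - self.conv_i(x_i)
        imag = self.conv_r(x_i) + self.conv_i(x_r)
        return real, imag

conv = ComplexConv2d(1, 8, kernel_size=3, padding=1)
x_r, x_i = torch.randn(2, 1, 32, 32), torch.randn(2, 1, 32, 32)
real, imag = conv(x_r, x_i)
print(real.shape, imag.shape)  # torch.Size([2, 8, 32, 32]) twice
```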
A method based on Siamese network and Transformer was proposed to address the low accuracy of infrared dim small target tracking. First, a multi-feature extraction cascading module was constructed to separately extract the deep features of the infrared dim small target template frame and the search frame, and concatenate them with their corresponding Histogram of Oriented Gradients (HOG) features along the feature dimension. Second, a Transformer with a multi-head attention mechanism was introduced to perform cross-correlation operations between the template feature map and the search feature map, generating a response map. Finally, the target's center position in the image and the regression bounding box were obtained through the response map upsampling network and the bounding box prediction network, completing the tracking of infrared dim small targets. Test results on a dataset of 13 655 infrared images show that compared with the KeepTrack tracking method, the success rate is improved by 5.9 percentage points and the precision by 1.8 percentage points; compared with the TransT (Transformer Tracking) method, the success rate is improved by 14.2 percentage points and the precision by 14.6 percentage points. These results prove that the proposed method tracks infrared dim small targets in complex backgrounds more accurately.
When multiple agents perform path finding in a large-scale warehousing environment, existing algorithms suffer from agents easily falling into congestion areas and from long solution times. In response to these problems, an improved Conflict-Based Search (CBS) algorithm was proposed. Firstly, the existing single warehousing environment modeling method was optimized: on the basis of traditional grid-based modeling, which makes path conflicts easy to resolve, a hybrid grid-heat map modeling method was proposed, in which congestion areas in the warehouse are located through a heat map, thereby addressing the issue of multiple agents easily falling into congestion areas. Then, an improved CBS algorithm was employed to solve Multi-Agent Path Finding (MAPF) problems in large-scale warehousing environments. Finally, a Heat Map for Explicit Estimation Conflict-Based Search (HM-EECBS) algorithm was proposed. Experimental results show that on the warehouse-20-40-10-2-2 large map set with 500 agents, compared with the Explicit Estimation Conflict-Based Search (EECBS) algorithm and the Lazy Constraints Addition for MAPF (LaCAM) algorithm, the HM-EECBS algorithm reduces the solution time by about 88% and 73%, respectively; when 5% or 10% of the warehouse area is congested, the success rate of the HM-EECBS algorithm is increased by about 49% and 20%, respectively, which illustrates that the proposed algorithm is suitable for solving MAPF problems in large-scale and congested warehousing and logistics environments.
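A minimal sketch of the heat map construction (assumed details: the heat value of a cell is how often it appears in planned agent paths, and cells above a percentile threshold are flagged as congested):

```python
# Sketch: locate congestion areas in a grid warehouse via a path heat map.
import numpy as np

def congestion_heatmap(paths, grid_shape, percentile=90):
    heat = np.zeros(grid_shape, dtype=int)
    for path in paths:                       # each path: list of (row, col)
        for r, c in path:
            heat[r, c] += 1
    threshold = np.percentile(heat[heat > 0], percentile)
    return heat, heat >= threshold           # boolean congestion mask

paths = [[(0, 0), (0, 1), (1, 1)], [(1, 0), (1, 1), (1, 2)], [(0, 1), (1, 1)]]
heat, congested = congestion_heatmap(paths, (2, 3))
print(heat)
print(congested)                             # cell (1, 1) stands out
```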
A complex causal relationship extraction model based on prompt enhancement and Bi-Graph ATtention network (BiGAT), PE-BiGAT (Prompt Enhancement and Bi-Graph Attention Network), was proposed to address the insufficiency of external information and the forgetting in information transmission caused by the high density and long sentence patterns of complex causal sentences. Firstly, the result entities were extracted from the sentence and combined with a prompt learning template to form the prompt information, which was then enhanced through an external knowledge base. Then, the prompt information was input into the BiGAT, the attention layer was combined with syntactic and semantic dependency graphs, and the biaffine attention mechanism was used to alleviate feature overlapping and enhance the model's perception of relational features. Finally, all causal entities in the sentence were predicted iteratively by the classifier, and all causal pairs in the sentence were analyzed through a scoring function. Experimental results on the SemEval-2010 task 8 and AltLex datasets show that compared with RPA-GCN (Relationship Position and Attention-Graph Convolutional Network), the proposed model improves the F1 score by 1.65 percentage points, with improvements of 2.16 and 4.77 percentage points on chain causal and multi-causal sentences, confirming that the proposed model has an advantage in dealing with complex causal sentences.
Aiming at the drawbacks of the standard Grey Wolf Optimizer (GWO) algorithm, such as slow convergence and easily falling into local optima, an improved Grey Wolf Optimizer with Two Headed Wolves guide (GWO-THW) algorithm was proposed based on a dual nonlinear convergence factor strategy. Firstly, chaotic Cubic mapping was used to initialize the population, improving the uniformity and diversity of the population distribution, and the wolves were divided into hunter wolves and scout wolves according to the average fitness value. Different convergence factors were applied to the two types of wolves to search for and round up prey under the leadership of their respective leader wolves. Secondly, an adaptive weight factor for position updating was designed to improve the search speed and accuracy. Meanwhile, a Levy flight strategy was employed to randomly update the positions of wolves, allowing them to jump out of local optima when no prey is found within a certain period of time. Ten benchmark functions were selected to test the performance and effectiveness of GWO-THW. Experimental results show that compared with the standard GWO and related variants, GWO-THW achieves higher optimization accuracy and faster convergence on eight benchmark functions; in particular, on multi-peak functions it converges to the ideal optimal value within 200 iterations, indicating that GWO-THW has better optimization performance.
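A minimal sketch of two ingredients (interpretations of the description above; the exact chaotic map and factor shapes are assumptions): cubic chaotic initialization, plus a pair of nonlinear convergence factors, one per wolf group.

```python
# Sketch: cubic chaotic initialization and dual nonlinear convergence factors.
import numpy as np

def cubic_chaos_init(n, dim, lo, hi, x0=0.3):
    """Fill the population with iterates of a cubic chaotic map on [-1, 1]."""
    x, samples = x0, []
    for _ in range(n * dim):
        x = 4.0 * x ** 3 - 3.0 * x            # one common cubic chaotic map
        samples.append(x)
    u = (np.array(samples).reshape(n, dim) + 1.0) / 2.0
    return lo + u * (hi - lo)                 # rescale to the search bounds

def convergence_factors(t, t_max):
    a_hunter = 2.0 * (1.0 - (t / t_max) ** 2)       # decays slowly, then fast
    a_scout = 2.0 * (1.0 - np.sqrt(t / t_max))      # decays fast, then slowly
    return a_hunter, a_scout

print(cubic_chaos_init(n=4, dim=2, lo=-5.0, hi=5.0))
print(convergence_factors(t=50, t_max=200))
```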
The constraint handling strategies of existing constrained multi-objective algorithms fail to solve problems with large infeasible regions effectively, resulting in population stagnation at the edges of infeasible regions. Moreover, discontinuous constrained problems place higher demands on the global search ability and diversity maintenance of the algorithms. To solve these problems, a Constrained Multi-Objective Evolutionary Algorithm based on Multi-Stage Search (CMOEA-MSS) was proposed, which uses different search strategies in three stages. In the first stage, to make the population cross large infeasible regions and approximate the Pareto front quickly, a convergence indicator was used to guide the population search without considering the constraints. In the second stage, a set of uniformly distributed weight vectors was utilized to maintain population diversity, and an improved epsilon constraint handling strategy was presented to retain high-quality solutions in infeasible regions. In the third stage, the constraint dominance principle was adopted, and the search preference shifted to the feasible regions to ensure the feasibility of the final solution set. CMOEA-MSS was compared with NSGA-Ⅱ+ARSBX (Nondominated Sorting Genetic Algorithm Ⅱ using Adaptive Rotation-based Simulated Binary crossover) and other algorithms on the MW and DASCMOP test sets. Experimental results show that CMOEA-MSS obtains the best IGD (Inverted Generational Distance) values on seven test problems and the best HV (HyperVolume) values on five test problems on the MW test set, and obtains the best IGD values on three test problems, the second-best IGD values on two test problems, and the best HV values on five test problems on the DASCMOP test set. It can be seen that CMOEA-MSS has obvious advantages in solving discontinuous and multi-modal constrained multi-objective problems.
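The improved epsilon strategy is not detailed here, but a standard epsilon constraint comparison (assumed to be close in spirit) looks as follows: solutions whose constraint violation is within epsilon are compared by Pareto dominance; otherwise the less-violating one wins.

```python
# Sketch: epsilon constraint handling for a minimization problem.
import numpy as np

def dominates(f1, f2):
    f1, f2 = np.asarray(f1), np.asarray(f2)
    return bool(np.all(f1 <= f2) and np.any(f1 < f2))

def epsilon_better(f1, cv1, f2, cv2, eps):
    """Return True if solution 1 is preferred over solution 2."""
    if cv1 <= eps and cv2 <= eps:       # both "feasible enough": use objectives
        return dominates(f1, f2)
    if cv1 == cv2:
        return dominates(f1, f2)
    return cv1 < cv2                    # otherwise prefer smaller violation

print(epsilon_better([1.0, 2.0], 0.05, [2.0, 3.0], 0.08, eps=0.1))  # True
print(epsilon_better([1.0, 2.0], 0.50, [2.0, 3.0], 0.08, eps=0.1))  # False
```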
Deep learning based algorithms such as YOLO (You Only Look Once) and Faster Region-Convolutional Neural Network (Faster R-CNN) require a huge amount of training data to ensure model precision, yet in many scenarios data are difficult to obtain and labeling is costly; due to the lack of massive training data, the detection range is also limited. Aiming at these problems, a few-shot object detection algorithm based on Siamese network, namely SiamDet, was proposed with the purpose of training an object detection model with a certain generalization ability from a few annotated images. Firstly, a Siamese network based on depthwise separable convolution was proposed, and a feature extraction network ResNet-DW was designed to solve the overfitting problem caused by insufficient samples. Secondly, the object detection algorithm SiamDet was proposed based on the Siamese network, and a Region Proposal Network (RPN) was introduced on top of ResNet-DW to locate objects of interest. Thirdly, binary cross-entropy loss was introduced for training, and a contrastive training strategy was used to increase the distinction among categories. Experimental results show that SiamDet has good detection ability for few-shot objects, improving AP50 by 4.1% on MS-COCO 20-way 2-shot and by 2.6% on PASCAL VOC 5-way 5-shot compared with the second-best algorithm, DeFRCN (Decoupled Faster R-CNN).
The Multi-Object Tracking (MOT) task needs to track multiple objects at the same time and ensure the continuity of object identities. To solve problems in the current MOT process such as object occlusion, object ID Switch (IDSW), and object loss, the Transformer-based MOT model was improved, and a multi-object tracking method based on a dual-decoder Transformer was proposed. Firstly, a set of trajectories was generated by model initialization in the first frame, and in each subsequent frame, attention was used to establish the associations between frames. Secondly, a dual decoder was used to correct the tracked object information: one decoder detected the objects, and the other tracked them. Thirdly, histogram template matching was applied to find lost objects after tracking was completed. Finally, the Kalman filter was utilized to track and predict occluded objects, and the occluded results were associated with newly detected objects to ensure the continuity of the tracking results. In addition, on the basis of TrackFormer, the modeling of appearance statistics and motion features was added to realize the fusion between different structures. Experimental results on the MOT17 dataset show that compared with TrackFormer, the proposed algorithm increases the IDentity F1 Score (IDF1) by 0.87 percentage points, increases the Multiple Object Tracking Accuracy (MOTA) by 0.41 percentage points, and reduces the number of IDSWs by 16.3%. The proposed method also achieves good results on the MOT16 and MOT20 datasets. Consequently, the proposed method can effectively deal with object occlusion, maintain object identity information, and reduce object identity loss.
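A minimal sketch of the occlusion handling step (a textbook constant-velocity Kalman filter; the state layout and noise values are illustrative): predict while the object is occluded, then correct when a matching detection reappears.

```python
# Sketch: constant-velocity Kalman filter carrying an occluded track forward.
import numpy as np

dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)   # we observe (x, y) only
Q, R = np.eye(4) * 0.01, np.eye(2) * 1.0

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.array([0.0, 0.0, 1.0, 0.5]), np.eye(4)
for _ in range(3):                           # object occluded for 3 frames
    x, P = predict(x, P)
x, P = update(x, P, np.array([3.2, 1.4]))    # re-detected: associate and correct
print(x)
```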
Magnetic Resonance Imaging (MRI) is widely used in the diagnosis of complex diseases because of its non-invasiveness and good soft tissue contrast. Since MRI is slow, most acceleration is currently achieved by highly undersampling the Magnetic Resonance (MR) signals in k-space. However, representative algorithms often suffer from blurred details when reconstructing highly undersampled MR images. Therefore, a highly undersampled MR image reconstruction algorithm based on Residual Graph Convolutional Neural nETwork (RGCNET) was proposed. Firstly, auto-encoding technology and a Graph Convolutional neural Network (GCN) were used to build the generator. Secondly, the undersampled image was input into the feature extraction (encoder) network to extract bottom-layer features. Thirdly, the high-level features of MR images were extracted by the GCN block. Fourthly, the initial reconstructed image was generated through the decoder network. Finally, the final high-resolution reconstructed image was obtained through a dynamic game between the generator and the discriminator. Test results on the FastMRI dataset show that at 10%, 20%, 30%, 40% and 50% sampling rates, compared with the spatial orthogonal attention mechanism based MRI reconstruction algorithm SOGAN (Spatial Orthogonal attention Generative Adversarial Network), the proposed algorithm decreases the Normalized Root Mean Square Error (NRMSE) by 3.5%, 26.6%, 23.9%, 13.3% and 14.3%, increases the Peak Signal-to-Noise Ratio (PSNR) by 1.2%, 8.7%, 6.9%, 2.9% and 3.2%, and increases the Structural SIMilarity (SSIM) by 0.8%, 2.9%, 1.5%, 0.5% and 0.5%, respectively. Subjective observation also proves that the proposed algorithm can preserve more details and produce more realistic visual effects.
A Filtered Back-Projection (FBP) ultrasonic tomography reconstruction algorithm based on sparse representation was proposed to address the difficulty of traditional ultrasonic Lamb waves in detecting and vividly describing delamination defects in composite materials. Firstly, the Lamb wave time-of-flight signals in the defective composite plate were used as the projection values; since the one-dimensional Fourier transform of a projection is equivalent to a slice of the two-dimensional Fourier transform of the original image, the FBP reconstructed image was obtained by convolving the projections with the filter function and back-projecting along different directions. Then, a sparse super-resolution model was constructed and jointly trained on dictionaries of low-resolution and high-resolution image blocks, so as to strengthen the sparse similarity between low- and high-resolution blocks and real image blocks, and a complete dictionary was constructed from the low- and high-resolution blocks. Finally, the images obtained by FBP were substituted into the constructed dictionary to obtain complete high-resolution images. Experimental results show that compared with the linear interpolation and bicubic spline interpolation algorithms, the proposed algorithm improves the Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Edge Structural Similarity (ESSIM) of the reconstructed image by 9.22%, 2.90%, and 80.77%, and by 4.75%, 1.52%, and 16.5%, respectively. The proposed algorithm can detect delamination defects in composite materials, improve the resolution of the obtained defect images, and enhance their edge details.
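For reference, a textbook FBP sketch with a ramp filter (the paper applies the same principle to Lamb-wave time-of-flight projections; the array sizes and toy sinogram below are illustrative):

```python
# Sketch: filtered back-projection with a ramp filter on a toy sinogram.
import numpy as np

def fbp(sinogram, angles_deg):
    """sinogram: (n_angles, n_det); returns a reconstructed square image."""
    n_angles, n_det = sinogram.shape
    freqs = np.fft.fftfreq(n_det)                # ramp filter in Fourier domain
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1)
                                   * np.abs(freqs), axis=1))
    grid = np.arange(n_det) - n_det / 2.0        # image coordinates
    X, Y = np.meshgrid(grid, grid)
    recon = np.zeros((n_det, n_det))
    for proj, theta in zip(filtered, np.deg2rad(angles_deg)):
        t = X * np.cos(theta) + Y * np.sin(theta) + n_det / 2.0
        idx = np.clip(t.astype(int), 0, n_det - 1)
        recon += proj[idx]                       # back-project along direction
    return recon * np.pi / n_angles

sino = np.zeros((180, 64))
sino[:, 32] = 1.0                      # toy projections of a central point
img = fbp(sino, np.arange(180))
print(img.shape, np.unravel_index(img.argmax(), img.shape))  # peak near center
```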
Aiming at the problems of the Multi-scale Generative Adversarial Networks Image Inpainting algorithm (MGANII), such as unstable training in the process of image inpainting, poor structural consistency, and insufficient details and textures in the inpainted image, an image inpainting algorithm based on a multi-scale generative adversarial network with multi-feature fusion was proposed. Firstly, to address poor structural consistency and insufficient details and textures, a Multi-Feature Fusion Module (MFFM) was introduced into the traditional generator, and a perception-based feature reconstruction loss function was introduced to improve the feature extraction ability of the dilated convolutional network, thereby supplying more details and texture features for the inpainted image. Then, a perception-based feature matching loss function was introduced into the local discriminator to enhance its discrimination ability, thereby improving the structural consistency of the inpainted image. Finally, a risk penalty term was introduced into the adversarial loss function to satisfy the Lipschitz continuity condition, so that the network converges rapidly and stably during training. On the CelebA dataset, the proposed multi-feature fusion image inpainting algorithm converges faster than MGANII. Meanwhile, the Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) of the images inpainted by the proposed algorithm are improved by 0.45% to 8.67% and 0.88% to 8.06%, respectively, compared with those of the baseline algorithms, and the Fréchet Inception Distance (FID) of the inpainted images is reduced by 36.01% to 46.97% compared with the baseline algorithms. Experimental results show that the inpainting performance of the proposed algorithm is better than that of the baseline algorithms.
Aiming at the low efficiency, low accuracy, and excessive occupancy of human resources of current China Customs risk control methods, as well as the requirement to deploy miniaturized intelligent classification algorithms, a customs risk control method based on an improved Butterfly Feedback neural Network Version 2 (BFNet-V2) was proposed. Firstly, the Filling in Code (FC) algorithm was used to realize the semantic replacement of customs tabular data with analog images. Then, the analog image data was trained with the BFNet-V2, whose regular structure is composed of left and right links, different convolution kernels and blocks, and small-block designs, with residual short paths added to alleviate overfitting and gradient vanishing. Finally, a Historical momentum Adaptive moment estimation algorithm (H-Adam) was proposed to optimize the gradient descent process, achieve better adaptive learning rate adjustment, and classify the customs data. Xception (eXtreme inception), Mobile Network (MobileNet), Residual Network (ResNet), and Butterfly Feedback neural Network (BF-Net) were selected as baseline network structures for comparison. The Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves of BFNet-V2 enclose those of the baseline network structures. Taking Transfer Learning (TL) as an example, compared with the four baseline network structures, the classification accuracy of BFNet-V2 increases by 4.30%, 4.34%, 4.10%, and 0.37%, respectively. In classifying real-label data, the misjudgment rate of BFNet-V2 is reduced by 70.09%, 57.98%, 58.36%, and 10.70%, respectively. Compared with eight shallow and deep learning classification methods, the accuracies on three datasets increase by more than 1.33%. The proposed method can realize automatic classification of tabular data and improve the efficiency and accuracy of customs risk control.
Aiming at the poor detection of tail categories by current image defect detection models on long-tail defect datasets, a GGW-DND Loss (Gradient-Guide Weighted-Deferred Negative Gradient decay Loss) was proposed. First, the positive and negative gradients were re-weighted according to the cumulative gradient ratio of each classification node in the detector, reducing the suppression of the tail classifiers. Then, once the model had been optimized to a certain stage, the negative gradient generated by each node was sharply reduced to enhance the generalization ability of the tail classifiers. Experimental results on a self-made image defect dataset and NEU-DET (NEU surface defect database for Defect dEtection Task) show that for tail categories, the mean Average Precision (mAP) of the proposed loss is higher than that of Binary Cross Entropy Loss (BCE Loss) by 32.02 and 7.40 percentage points, respectively, and higher than that of EQL v2 (EQualization Loss v2) by 2.20 and 0.82 percentage points, respectively, verifying that the proposed loss effectively improves the detection performance for tail categories.
Concerning the high cost of massive data storage and the low efficiency of data traceability verification in Internet of Things (IoT) systems, a trusted data traceability method based on the Merkle Mountain Range (MMR), named MMRBCV (Merkle Mountain Range BlockChain Verification), was proposed. Firstly, the Inter-Planetary File System (IPFS) was used to store the IoT data. Secondly, consortium blockchains and private blockchains were adopted to design a double-blockchain structure for reliable recording of the data flow process. Finally, based on the MMR, a block structure was constructed to realize rapid verification by lightweight IoT nodes in the process of data traceability. Experimental results show that MMRBCV reduces the amount of data downloaded during data tracing, and that the data verification time is related to the structure of the MMR: when the MMR forms a perfect binary tree, the verification time is short. When the block height is 200 000, MMRBCV's maximum verification time is about 10 ms, which is about 72% shorter than that of Simplified Payment Verification (SPV) (about 36 ms), indicating that the proposed method improves the verification efficiency effectively.
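A minimal sketch of the verification idea (a simplified MMR: hash a leaf up an audit path to a peak, then "bag" the peaks into a root; the hashing scheme is an assumption): a light node only downloads the path and the peaks, not the whole structure.

```python
# Sketch: MMR-style inclusion verification for a lightweight node.
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def verify(leaf: bytes, path, peaks, root: bytes) -> bool:
    """path: list of (sibling_hash, sibling_is_left) up to one peak."""
    node = h(leaf)
    for sibling, sibling_is_left in path:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
    if node not in peaks:
        return False
    bagged = peaks[0]                     # bag peaks left-to-right
    for p in peaks[1:]:
        bagged = h(bagged, p)
    return bagged == root

# Tiny 2-leaf mountain: single peak = h(h(a), h(b)); root = bagged peak.
a, b = b"tx-a", b"tx-b"
peak = h(h(a), h(b))
print(verify(a, [(h(b), False)], [peak], peak))   # True
```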
As the core of steel production, the hot rolling process demands strict production continuity and involves complex production technology. The random arrival of rush orders and urgent delivery requirements adversely affect production continuity and quality stability. Aiming at this kind of dynamic event of rush order insertion, a hot rolling rescheduling optimization method was proposed. Firstly, the influence of the order disturbance factor on the scheduling scheme was analyzed, and a mathematical model of hot rolling rescheduling was established with the objective of minimizing the weighted sum of order tardiness and slab jump penalties. Then, an Estimation of Distribution Algorithm (EDA) for hot rolling rescheduling was designed: for the insertion of rush orders, an integer encoding scheme based on the insertion position was proposed, a probability model based on the characteristics of the mathematical model was designed, and a fitness function based on penalty values was defined considering the objectives and constraints comprehensively. The feasibility and validity of the model and the algorithm were verified by simulation experiments on actual production data.
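A schematic EDA sketch (the encoding, penalty values, and update rule are placeholders for the paper's model): a probability vector over candidate insertion positions is sampled, and probability mass is shifted toward the lowest-penalty samples.

```python
# Sketch: estimation of distribution algorithm over rush-order insertion positions.
import numpy as np

rng = np.random.default_rng(1)
n_positions, pop_size, elite, alpha = 8, 30, 10, 0.3
penalty = rng.uniform(1.0, 10.0, n_positions)      # toy penalty per position
prob = np.full(n_positions, 1.0 / n_positions)     # initial probability model

for gen in range(20):
    samples = rng.choice(n_positions, size=pop_size, p=prob)
    best = samples[np.argsort(penalty[samples])[:elite]]   # lowest penalties
    freq = np.bincount(best, minlength=n_positions) / elite
    prob = (1 - alpha) * prob + alpha * freq       # update the probability model

print("best insertion position:", prob.argmax(), "==", penalty.argmin())
```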
To solve the problems of image detail loss and unclear texture caused by interference factors such as noise, imaging technology, and imaging principles in medical Magnetic Resonance Imaging (MRI), a multi-receptive-field generative adversarial network for medical MRI image super-resolution reconstruction was proposed. Firstly, the multi-receptive-field feature extraction block was used to obtain global feature information of the image under different receptive fields. To avoid the loss of detailed texture caused by receptive fields that are too small or too large, each set of features was divided into two groups: one was used to feed back global feature information under receptive fields of different scales, and the other was used to enrich the local detailed texture information of the next set of features. Then, the multi-receptive-field feature extraction blocks were used to construct feature fusion groups, and a spatial attention module was added to each feature fusion group to adequately capture the spatial feature information of the image, reducing the loss of shallow and local features in the network and achieving more realistic image details. Secondly, the gradient map of the low-resolution image was converted into a gradient map of the high-resolution image to assist the super-resolution reconstruction. Finally, the restored gradient map was integrated into the super-resolution branch to provide structural prior information for the reconstruction, which is helpful for generating high-quality super-resolution images. The experimental results show that compared with the Structure-Preserving Super-Resolution with gradient guidance (SPSR) algorithm, the proposed algorithm improves the Peak Signal-to-Noise Ratio (PSNR) by 4.8%, 2.7%, and 3.5% at the ×2, ×3, and ×4 scales, respectively, and the reconstructed medical MRI images have richer texture details and more realistic visual effects.
To solve problems such as high time cost, inaccuracy, and the influence of parameter settings on algorithm performance when optimizing the parameters of a Convolutional Neural Network (CNN) by traditional manual methods, a variable Convolutional AutoEncoder (CAE) method based on Teaching-Learning-Based Optimization (TLBO) was proposed. In the algorithm, a variable-length individual encoding strategy was designed to quickly construct CAE structures and stack CAEs into a CNN. In addition, the structural information of excellent individuals was fully utilized to guide the algorithm toward more promising regions of the search space, thereby improving performance. Experimental results show that the classification accuracy of the proposed algorithm reaches 89.84% on medical image classification problems, which is higher than those of traditional CNNs and similar neural networks. The proposed algorithm solves medical image classification problems by optimizing the CAE structure and stacking CAEs into a CNN, effectively improving the classification accuracy.
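For reference, a canonical TLBO loop on a toy objective (the paper applies the same teacher and learner phases to variable-length CAE encodings; the objective and sizes here are illustrative):

```python
# Sketch: teacher phase + learner phase of TLBO, minimizing a toy objective.
import numpy as np

rng = np.random.default_rng(0)
n, dim = 20, 5
pop = rng.uniform(-5, 5, (n, dim))
f = lambda x: np.sum(x ** 2, axis=-1)               # stand-in fitness (minimize)

for it in range(100):
    fit = f(pop)
    teacher = pop[fit.argmin()]
    # Teacher phase: move learners toward the teacher, away from the mean.
    Tf = rng.integers(1, 3)                          # teaching factor in {1, 2}
    new = pop + rng.random((n, dim)) * (teacher - Tf * pop.mean(axis=0))
    pop = np.where(f(new)[:, None] < fit[:, None], new, pop)
    # Learner phase: each learner moves relative to a random peer.
    j = rng.permutation(n)
    fit = f(pop)
    step = np.where((fit < fit[j])[:, None], pop - pop[j], pop[j] - pop)
    new = pop + rng.random((n, dim)) * step
    pop = np.where(f(new)[:, None] < fit[:, None], new, pop)

print("best fitness:", f(pop).min())                 # approaches 0
```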
Because traditional developer recommendation methods focus on analyzing developers' professional abilities and their interaction information with tasks while ignoring collaboration among developers, a developer recommendation method based on the Environment-Class, Agent, Role, Group, and Object (E-CARGO) model was proposed. Firstly, the collaborative development process was described as a role-based collaboration problem and modeled with the E-CARGO model, combining the characteristics of collaborative development. Then, a fuzzy judgment matrix was established by the Fuzzy Analytic Hierarchy Process (FAHP) method to obtain the developer ability index weights, and the indices were weighted and summed to obtain the set of historical comprehensive ability evaluations of the developers. Finally, in view of the uncertainty and dynamics of the developers' comprehensive ability evaluations, cloud model theory was used to analyze the set of historical evaluations to obtain each developer's competence for each task, and the CPLEX optimization package was used to solve the developer recommendation problem. Experimental results show that the proposed method can obtain the best developer recommendation results within an acceptable time, which verifies its effectiveness.
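A minimal sketch of the weighting-and-scoring step (plain AHP geometric-mean weights are used here in place of the fuzzy variant, FAHP; the comparison values and ability scores are illustrative):

```python
# Sketch: derive index weights from a pairwise comparison matrix, then
# score each developer as the weighted sum of ability indices.
import numpy as np

# Pairwise comparisons of 3 ability indices (illustrative values).
C = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
gm = C.prod(axis=1) ** (1.0 / C.shape[1])   # row geometric means
weights = gm / gm.sum()                     # normalized index weights

abilities = np.array([[0.9, 0.6, 0.7],      # one row per developer
                      [0.7, 0.9, 0.8]])
scores = abilities @ weights                # comprehensive ability evaluation
print(weights, scores, "recommend developer", scores.argmax())
```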
Because of the huge modal difference between cross-modal person re-identification images, most existing methods use pixel alignment and feature alignment to realize image matching. To further improve the accuracy of matching the two modal images, a multi-input dual-stream network model based on a dynamic dual-attention mechanism was designed. Firstly, by adding images of the same person taken by different cameras to each training batch, the neural network was able to learn sufficient feature information from a limited number of samples. Secondly, the gray-scale images obtained by homogeneous augmentation were used as an intermediate bridge to retain the structural information of the visible light images while eliminating their color information; the use of gray-scale images weakened the network's dependence on color and thereby strengthened its ability to mine structural information. Finally, a Weighted Six-Directional triple Ranking (WSDR) loss suitable for images of three modalities was proposed, which makes full use of cross-modal triple relationships under different viewing angles, optimizes the relative distances between multi-modal features, and improves robustness to modal changes. Experimental results on the SYSU-MM01 dataset show that the proposed model increases the evaluation indexes Rank-1 and mean Average Precision (mAP) by 4.66 and 3.41 percentage points, respectively, compared to the Dynamic Dual-attentive AGgregation (DDAG) learning model.
Aiming at the problems of current industrial valve identification methods, such as the high miss rate of overlapping targets, low detection precision, poor target encapsulation, and inaccurate positioning of circle centers, a valve identification method based on double detection was proposed. Firstly, data augmentation was used to expand the samples in a lightweight way. Then, Spatial Pyramid Pooling (SPP) and a Path Aggregation Network (PAN) were added on the basis of a deep convolutional network; at the same time, the anchor boxes were adjusted and the loss function was improved to extract the valve prediction boxes. Finally, the Circle Hough Transform (CHT) method was used to identify the valves within the prediction boxes a second time, accurately locating the valve regions. The proposed method was compared with the original You Only Look Once (YOLO)v3, YOLOv4, and the traditional CHT methods, and the detection results were evaluated jointly by precision, recall, and coincidence degree. Experimental results show that the average precision and recall of the proposed method reach 97.1% and 94.4%, respectively, which are 2.9 and 1.8 percentage points higher than those of the original YOLOv3 method. In addition, the proposed method improves the target encapsulation and the positioning accuracy of the target center: the Intersection Over Union (IOU) between the corrected box and the ground-truth box reaches 0.95, which is 0.05 higher than that of the traditional CHT method. The proposed method improves the success rate of target capture while improving identification accuracy, and has practical value in real applications.
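A minimal sketch of the second detection stage (an assumed usage of OpenCV's circle Hough transform inside each prediction box; the parameter values are illustrative):

```python
# Sketch: refine valve centers with the circle Hough transform per box.
import cv2
import numpy as np

def refine_valves(gray: np.ndarray, boxes):
    """gray: uint8 image; boxes: list of (x1, y1, x2, y2) prediction boxes."""
    results = []
    for x1, y1, x2, y2 in boxes:
        roi = gray[y1:y2, x1:x2]
        circles = cv2.HoughCircles(roi, cv2.HOUGH_GRADIENT, dp=1.2,
                                   minDist=20, param1=100, param2=30,
                                   minRadius=5, maxRadius=0)
        if circles is not None:
            cx, cy, r = circles[0][0]            # strongest circle in the box
            results.append((x1 + cx, y1 + cy, r))  # back to image coordinates
    return results

img = np.zeros((100, 100), np.uint8)
cv2.circle(img, (50, 50), 20, 255, 2)            # synthetic valve contour
print(refine_valves(img, [(20, 20, 80, 80)]))    # ~ (50.0, 50.0, 20.0)
```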
Most existing directed graph clustering algorithms are based on the assumption of an approximately linear relationship between nodes in vector space, ignoring the non-linear correlations between nodes. To address this problem, a directed graph clustering algorithm based on Kernel Nonnegative Matrix Factorization (KNMF) was proposed. First, the adjacency matrix of the directed graph was projected into a kernel space by a kernel learning method, and the node similarity in both the original and kernel spaces was constrained by a specific regularization term. Second, the objective function of a graph-regularized kernel asymmetric NMF algorithm was formulated, and a clustering algorithm was derived by gradient descent under non-negativity constraints. By modeling the non-linear relationships between nodes with kernel learning and considering the directionality of node links, the algorithm accurately reveals the potential structural information in the directed graph. Finally, experimental results on the Patent Citation Network (PCN) dataset show that when the number of clusters is 2, the proposed algorithm improves the Davies-Bouldin (DB) index by about 0.25 and the Distance-based Quality Function (DQF) by about 8% compared with the comparison algorithm, achieving better clustering quality.
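A simplified kernelized-NMF sketch (a Gaussian kernel over adjacency rows factorized symmetrically as K ≈ HHᵀ; the paper's algorithm additionally handles link direction and graph regularization):

```python
# Sketch: symmetric NMF of a kernel matrix for graph clustering.
import numpy as np

rng = np.random.default_rng(0)

def kernel_nmf_cluster(A, k, sigma=1.0, iters=300, beta=0.5, eps=1e-9):
    d2 = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)   # pairwise sq. distances
    K = np.exp(-d2 / (2 * sigma ** 2))                    # kernel of adjacency rows
    H = rng.random((A.shape[0], k))
    for _ in range(iters):            # multiplicative symmetric-NMF updates
        H *= 1 - beta + beta * (K @ H) / (H @ (H.T @ H) + eps)
    return H.argmax(axis=1)           # cluster assignment per node

A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 0, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], float)
print(kernel_nmf_cluster(A, k=2))     # the two blocks emerge as two clusters
```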
Aiming at the sharp increase of data on the cloud caused by the development and popularization of cloud native technology, as well as the performance and stability bottlenecks of the technology, a Haystack-based storage system was proposed. With optimizations in service discovery, automatic fault tolerance, and caching mechanisms, the system is better suited to cloud native business and meets the growing, high-frequency file storage and read/write requirements of the data acquisition, storage, and analysis industries. The object storage model used by the system supports massive file storage with high-frequency reads and writes. A simple and unified application interface is provided for business systems using the storage system, a file caching strategy is applied to improve resource utilization, and the rich automated tool chain of Kubernetes is adopted to make this storage system easier to deploy, easier to expand, and more stable than other storage systems. Experimental results indicate that the proposed storage system improves performance and stability compared with current mainstream object storage and file systems in the case of large-scale fragmented data storage with more reads than writes.