Text-based person retrieval aims to identify a specific person using textual descriptions as queries. The existing state-of-the-art methods typically design multiple alignment mechanisms to achieve correspondence among cross-modal data at both global and local levels, but they neglect the mutual influence among these mechanisms. To address this, a multi-granularity shared semantic center association mechanism was proposed to explore the promoting and inhibiting effects between global and local alignments. Firstly, a multi-granularity cross-alignment module was introduced to enhance image-sentence and local region-word interactions, achieving multi-level alignment of the cross-modal data in a joint embedding space. Then, a shared semantic center was established to serve as a learnable semantic hub, and associations among global and local features were used to enhance semantic consistency among different alignment mechanisms and promote the collaborative effect of global and local features. In the shared semantic center, the local and global cross-modal similarity relationships among image and text features were calculated, providing a complementary measure from both global and local perspectives and maximizing positive effects among multiple alignment mechanisms. Finally, experiments were carried out on the CUHK-PEDES dataset. Results show that, compared to the baseline method, the proposed method improves the Rank-1 by 8.69 percentage points and the mean Average Precision (mAP) by 6.85 percentage points. The proposed method also achieves excellent performance on the ICFG-PEDES and RSTPReid datasets, significantly surpassing all the compared methods.
Remote sensing data have high spatio-temporal correlation and complex surface features, which makes the privacy protection of the data challenging. As a distributed learning method with the goal of protecting data privacy of the participants, federated learning provides an effective solution to overcome the challenges faced by remote sensing data privacy protection. However, during the training phase of federated learning models, malicious attackers may infer private information of the participants through inversion, leading to the disclosure of sensitive information. Aiming at the privacy leakage problem of remote sensing data in federated learning training, a federated learning privacy protection scheme based on local differential privacy was proposed. Firstly, the model was pre-trained, the layer importance of the model was calculated, and the privacy budget was allocated reasonably based on the layer importance. Then, local differential privacy protection was achieved by performing a crop transformation on the model update and performing adaptive random disturbance on the crop value. Finally, model correction was employed to further improve the model performance when the aggregated disturbance was updated. Theoretical analysis and simulation results show that the proposed scheme can not only provide appropriate differential privacy protection for each participant and prevent inferring privacy sensitive information through inversion effectively, but also outperform the segmentation mechanism-based disturbance scheme in accuracy on three remote sensing datasets by 3.28 to 3.93 percentage points. It can be seen that the proposed scheme guarantees model performance effectively while ensuring privacy.
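The clip-then-perturb step described above can be illustrated with a minimal sketch. It is not the scheme from the abstract: the proportional budget split in `allocate_budget`, the per-coordinate clipping bound, and the plain Laplace mechanism in `perturb_layer` are all assumptions chosen for illustration, and the adaptive perturbation and model correction steps are not modelled.

```python
import random


def allocate_budget(importances, total_epsilon):
    """Split total_epsilon across layers in proportion to importance scores.

    Proportional allocation is an assumption; the abstract only says the
    budget is allocated based on layer importance.
    """
    total = sum(importances)
    return [total_epsilon * w / total for w in importances]


def clip(values, bound):
    """Clip every coordinate of an update into [-bound, bound]."""
    return [max(-bound, min(bound, v)) for v in values]


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_layer(values, bound, epsilon, rng):
    """Clip one layer's update, then add Laplace noise to each coordinate.

    Each clipped coordinate has range 2 * bound. The layer budget is divided
    evenly over its coordinates (basic sequential composition).
    """
    clipped = clip(values, bound)
    eps_per_coord = epsilon / len(clipped)
    scale = 2.0 * bound / eps_per_coord
    return [v + laplace_noise(scale, rng) for v in clipped]
```

In this sketch a layer with a larger importance score receives a larger share of the budget and therefore less noise; whether the proposed scheme allocates in the same direction is not stated in the abstract.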
Federated Learning (FL) has emerged as a promising method for training machine learning models on decentralized edge devices while protecting data privacy. However, FL systems are susceptible to Byzantine attacks, which means that a malicious client compromises the integrity of the global model. Moreover, some existing defense methods have large computational overheads. To address the above problems, an adaptive defense mechanism, namely FedAud, was proposed, which aims to reduce computational overhead of the server while ensuring robustness of the FL system against Byzantine attacks. An anomaly detection module and a reputation mechanism were integrated by FedAud to adjust the defense strategy dynamically based on historical model updates. Experimental results of FedAud evaluated using MNIST and CIFAR-10 datasets under various attack scenarios and defense methods demonstrate that FedAud reduces the execution frequency of defense methods effectively, thereby alleviating the computational burden of the server and enhancing FL efficiency, particularly in scenarios of defense methods with high computational overheads or long training cycles. Furthermore, FedAud maintains model accuracy and even improves model performance in certain cases, verifying its effectiveness in real FL deployments.
Federated Learning (FL) has experienced rapid development due to its advantages in distributed structure and privacy security. However, the fairness issues caused by large-scale FL affect the sustainability of FL systems. In response to the fairness issues in FL, recent research on fairness in FL was reviewed systematically and analyzed in depth. Firstly, the workflow and definitions of FL were explained, and biases and fairness concepts in FL were summarized. Secondly, commonly used datasets in fairness research of FL were detailed, and the challenges faced by fairness research were discussed. Finally, the advantages, disadvantages, applicable scenarios, and experimental settings of relevant research work were summed up from four aspects: data source selection, model optimization, contribution evaluation, and incentive mechanism, and the future research directions and trends in fairness of FL were discussed.
Since the lack of inductive bias in Vision Transformer (ViT) makes it hard to learn meaningful visual representations on relatively small-scale datasets, an unsupervised person re-identification method based on self-distilled vision Transformer was proposed. Firstly, because of the modular architecture of ViT, the feature generated by any intermediate block has the same dimension, so the output of a randomly selected intermediate Transformer block was fed into the classifier to obtain prediction results. Secondly, by minimizing the Kullback-Leibler divergence between the output distribution of the randomly selected intermediate classifier and that of the final classifier, the classification prediction results of the intermediate block were constrained to be consistent with the results of the final classifier, and a self-distillation loss function was constructed on this basis. Finally, the model was optimized by jointly minimizing the cluster-level contrastive loss, instance-level contrastive loss, and self-distillation loss. Besides, by providing soft supervision from the final classifier to the intermediate block, inductive bias was introduced to the ViT model effectively, so that the model was able to learn more robust and generalized visual representations. Compared to Transformer-based Object Re-IDentification Self-Supervised Learning (TransReID-SSL), the proposed method improves the mean Average Precision (mAP) and Rank-1 by 1.2 and 0.8 percentage points respectively on the Market-1501 dataset, and by 3.4 and 3.1 percentage points respectively on the MSMT17 dataset. Experimental results demonstrate that the proposed method can increase the unsupervised person re-identification precision effectively.
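The self-distillation term can be illustrated with a small sketch that compares the class distribution of an intermediate block with that of the final classifier. The direction of the divergence, KL(final || intermediate), and the helper names `softmax`, `kl_divergence`, and `self_distillation_loss` are assumptions; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def self_distillation_loss(final_logits, intermediate_logits):
    """KL(final || intermediate): the final classifier acts as the teacher
    and the randomly selected intermediate block as the student (assumed)."""
    teacher = softmax(final_logits)
    student = softmax(intermediate_logits)
    return kl_divergence(teacher, student)
```

The loss is zero when both classifiers produce the same distribution and grows as the intermediate prediction drifts away from the final one.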
Existing methods for age estimation typically employ ordinal regression based on Convolutional Neural Network (CNN). However, when predicting adjacent ages, CNN has difficulty capturing global feature representations, resulting in a decrease in prediction accuracy. To solve this problem, an age estimation method combining an enhanced CloFormer model with ordinal regression was proposed. Compared to traditional CNN-based ordinal regression, CloFormer can better utilize the self-attention mechanism to capture relationships between different regions in an image, thereby improving the learning of feature differences between adjacent ages. In the proposed method, firstly, the CloFormer model was optimized, and then the optimized CloFormer model was combined with ordinal regression to better utilize the age sequence information, achieving more precise age estimation. Subsequently, through end-to-end optimization training of the improved CloFormer model and ordinal regression model, the proposed method was able to better learn the relationships between facial features and age sequences. Finally, comparative experiments were conducted on multiple publicly available datasets. Experimental results show that on the CACD, AFAD, and UTKFace datasets, the Root Mean Square Error (RMSE) of the proposed method is 7.36, 4.62, and 8.28, respectively. In comparison to existing age estimation methods such as Ordinal Regression with CNN (OR-CNN) and COnsistent RAnk Logits (CORAL), the RMSEs are reduced by 0.25 and 0.05 respectively on the CACD dataset, 0.18 and 0.03 respectively on the AFAD dataset, and 0.97 and 0.53 respectively on the UTKFace dataset, illustrating that the proposed method has better age estimation results.
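As a minimal sketch of how ordinal regression turns binary outputs into an age, the snippet below uses the decoding rule common to rank-based methods such as OR-CNN and CORAL, together with the RMSE metric reported above. The function names and the `min_age` offset are illustrative assumptions, not details from the abstract.

```python
import math


def decode_age(probabilities, min_age=0, threshold=0.5):
    """Ordinal decoding: entry k is the predicted P(age > min_age + k).

    The predicted age is min_age plus the number of binary tasks whose
    probability exceeds the threshold.
    """
    return min_age + sum(1 for p in probabilities if p > threshold)


def rmse(predicted, actual):
    """Root Mean Square Error between predicted and true ages."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `decode_age([0.9, 0.8, 0.6, 0.3, 0.1], min_age=20)` counts three binary tasks above the threshold and returns 23.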
Federated learning is a distributed learning approach for solving the data sharing problem and privacy protection problem in machine learning, in which multiple parties jointly train a machine learning model while protecting the privacy of their data. However, there are security threats inherent in federated learning, which makes federated learning face great challenges in practical applications. Therefore, analyzing the attacks faced by federated learning and the corresponding defensive measures is crucial for the development and application of federated learning. First, the definition, process and classification of federated learning were introduced, together with the attacker model in federated learning. Then, the possible attacks in terms of both robustness and privacy of federated learning systems were described, along with the corresponding defense measures. Furthermore, the shortcomings of the defense schemes were pointed out. Finally, a secure federated learning system was envisioned.
Focused shape restoration realizes 3D shape reconstruction by modeling the potential relationship between scene depth and defocus blur. However, the existing 3D shape reconstruction network cannot effectively utilize the sequential correlation of image sequences for representation learning. Therefore, a depth network framework based on spatial correlation features of multi-depth image sequences, namely 3D Spatial Correlation Horizon Analysis Model (3D SCHAM), was proposed for 3D shape reconstruction, by which not only the edge features could be accurately captured from the focus region to the defocus region in a single image frame, but also the spatial dependence features between different image frames could be utilized effectively. Firstly, the temporal continuous model for 3D shape reconstruction was constructed by constructing a network with composite extension of depth, width and receptive field to determine the single point depth results. Secondly, an attention module based on spatial correlation was introduced to fully learn the spatial dependence relationships of “adjacency” and “distance” between frames. In addition, residual-reversal bottleneck was used for resampling to maintain semantic richness across scales. Experimental results on the DDFF 12-Scene real scene dataset show that, compared with the DfFintheWild model, the accuracy of the 3D SCHAM model at the three thresholds $1.25$, $1.25^2$, and $1.25^3$ is improved by 15.34%, 3.62% and 0.86% respectively, verifying the robustness of 3D SCHAM in real scenes.
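The accuracy at the thresholds $1.25$, $1.25^2$, and $1.25^3$ is assumed here to be the standard depth-estimation threshold accuracy: the share of pixels whose predicted-to-true depth ratio stays below the threshold. A minimal sketch of that metric, with a hypothetical function name and strictly positive depths assumed:

```python
def threshold_accuracy(predicted, ground_truth, power=1, base=1.25):
    """Fraction of pixels with max(pred / gt, gt / pred) < base ** power.

    Depth values are assumed to be strictly positive.
    """
    limit = base ** power
    hits = 0
    for p, g in zip(predicted, ground_truth):
        ratio = max(p / g, g / p)
        if ratio < limit:
            hits += 1
    return hits / len(ground_truth)
```

Raising `power` loosens the tolerance, which is why improvements usually shrink from the first threshold to the third, as in the figures reported above.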
Aiming at the problems of excessive added noise scale and error accumulation during iteration in existing dynamic social network privacy protection, a method named PGU-DNDP (Partial Graph Updating in Dynamic social Network based on Differential Privacy) was proposed. Firstly, the update sequences in the network snapshot graph set were collected through a temporal trade-off dynamic community discovery algorithm. Secondly, a static graph publishing method was used to obtain the initial generated graph. Finally, based on the generated graph of the previous moment and the update sequence of the current moment, the partial graph update was completed. The partial update method could reduce the excessive noise caused by the full graph perturbation and optimize the time cost, thus avoiding the intensive situation of synthetic graph. In addition, an edge updating strategy was proposed in the partial update, which combined the adaptive perturbation with a downsampling mechanism to reduce the cumulative error in the iterative process through privacy amplification, thus improving the synthetic graph accuracy effectively. Experimental results on three synthetic datasets and two real-world dynamic datasets show that PGU-DNDP can ensure the privacy requirements of dynamic social networks while retaining higher data utility than mainstream static graph generation method PrivGraph (differentially Private Graph data publication by exploiting community information).
As a distributed machine learning method, federated learning can fully exploit the value in the data while protecting data privacy. However, as the traditional federated learning training method only selects participating clients randomly, it is difficult to adapt to Non-Independent and Identically Distributed (Non-IID) datasets. To solve the problems of low accuracy and slow convergence of federated learning models under Non-IID data, a Federated learning Client Selection method based on Label Classification (FedLCCS) was proposed. Firstly, the client dataset labels were classified and sorted according to the frequency statistics results. Then, clients with high-frequency labels were selected to participate in training. Finally, models with different accuracy were obtained by adjusting the method's own parameters. Experimental results on the MNIST, Fashion-MNIST and CIFAR-10 datasets show that the two baseline methods, Federated Averaging (FedAvg) and Federated Proximal (FedProx), perform better after being combined with FedLCCS than the original ones under the initial dataset label selection ratio. The minimum accuracy improvements are 9.13 and 6.53 percentage points, the minimum convergence speed improvements are 57.41% and 18.52%, and the minimum running time reductions are 7.60% and 17.62%. These results verify that FedLCCS can optimize the accuracy, convergence speed and running efficiency of federated models, and can train models with different accuracy to meet diversified demands.
Unsupervised anomaly detection methods based on feature embedding often use patch-level features to localize anomalies. Patch-level features are competitive in image-level anomaly detection tasks, but suffer from insufficient accuracy in pixel-level localization. To address this issue, MemAD, a pixel-level anomaly detection method composed of a multi-scale memory bank and a segmentation network, was proposed. Firstly, a pre-trained feature extraction network was used to extract features from normal samples in the training set, thereby constructing a normal sample feature memory bank at three scales. Then, during the training of the segmentation network, difference features between simulated pseudo-anomaly sample features and the nearest normal sample features in the memory bank were calculated, thereby further guiding the segmentation network to learn how to locate anomalous pixels. Experimental results show that MemAD achieves image-level and pixel-level AUC (Area Under the Receiver Operating Characteristic curve) of 0.980 and 0.974 respectively on MVTec AD (MVTec Anomaly Detection) dataset, outperforming most existing methods and confirming the accuracy of the proposed method in pixel-level anomaly localization.
Malicious traffic detection is one of the key technologies to deal with network security challenges. Aiming at the problems of insufficient local labeled data and degradation of co-trained model performance due to non-Independent and Identical Distribution (non-IID) when using federated learning for malicious traffic detection, a semi-supervised federated learning-based malicious traffic detection model was constructed. The proposed model was trained effectively by information extracted from unlabeled data with the help of semi-supervised learning techniques of pseudo-labeling and consistency regularization terms. At the same time, a nonlinear function was designed to dynamically adjust the weights of the client's local supervised and unsupervised losses during aggregation to make full use of unlabeled data and improve accuracy of the model. To reduce the impact of non-IID problems on performance of the global model, a federated aggregation algorithm FedLD (Federated-Loss-Data) was proposed, which adaptively adjusted the weights of different client models in the global model aggregation process through a weight calculation method that combined training loss and data volume. Experimental results show that on NSL-KDD dataset, the proposed model can achieve higher detection accuracy when labeled data is limited. Compared with the baseline model FedSem (Federated Semi-supervised), the proposed model has the detection accuracy increased by 4.11 percentage points, and the recall in Normal, Denial-of-Service (DoS), Probe and other categories also increased by 1.65 to 7.66 percentage points, verifying that the proposed model is more suitable for applications in the field of malicious traffic detection.
Federated Learning (FL) is a distributed machine learning approach that allows different participants to train a machine learning model collaboratively using their respective local datasets, addressing issues such as data islands and user privacy protection. However, due to the inherent distributed nature of FL, it is more susceptible to backdoor attacks, posing greater challenges in practical applications of FL. Therefore, a deep understanding of backdoor attacks and defense methods in the FL environment is crucial for the advancement of this field. Firstly, the definition, process, and classification of FL, as well as the definition of backdoor attacks, were introduced. Then, both backdoor attacks and defense schemes in the FL environment were described and analyzed in detail. Moreover, comparisons of backdoor attacks and defense methods were conducted. Finally, the future development of backdoor attacks and defense methods in the FL environment was discussed.
In the era of digital economy, data publication plays a crucial role in data sharing. Histogram data publication is a common method for data publication. However, histogram data publication faces privacy leakage issues. To address this concern, research has been conducted on histogram data publication methods based on Differential Privacy (DP). Firstly, a brief description of DP and histogram properties was provided, along with the research of the past five years, both at home and abroad, on histogram publication methods for static datasets and streaming data, and the balance among the grouping number and types of histograms, noise and grouping errors in static data, as well as the privacy budget allocation problem, were discussed. Secondly, the issues of data sampling, data prediction, and sliding windows for dynamic data grouping were explored. Additionally, DP histogram publication methods oriented to interval tree structures were investigated, in which the original data was transformed into tree structures, and noise addition for tree-structured data, tree-structure based optimization, and privacy budget allocation for tree structures were discussed. Moreover, the feasibility and privacy aspects of published histogram data, as well as the issues of query range and accuracy of published histogram data, were discussed. Finally, relevant algorithms were compared and their advantages and disadvantages were summarized, quantitative analysis and applicable scenarios for some algorithms were provided, and the future research directions of DP-based histograms in various data scenarios were discussed.
Existing robotic grasping operations are usually performed under well-illuminated conditions with clear object details and high regional contrast. However, under low-light conditions caused by night and occlusion, where the objects’ visual features are weak, the detection accuracies of existing robotic grasp detection models decrease dramatically. In order to improve the representation ability of sparse and weak grasp features in low-light scenarios, a grasp detection model incorporating a visual feature enhancement mechanism was proposed, which used the visual enhancement sub-task to impose feature enhancement constraints on grasp detection. In the grasp detection module, a U-Net-like encoder-decoder structure was adopted to achieve efficient feature fusion. In the low-light enhancement module, the texture and color information was extracted from the local and global levels respectively, thereby balancing the object details and visual effect in feature enhancement. In addition, two low-light grasp datasets called low-light Cornell dataset and low-light Jacquard dataset were constructed as new benchmark datasets of low-light grasp and used to conduct the comparative experiments. Experimental results show that the accuracies of the proposed low-light grasp detection model are 95.5% and 87.4% on the benchmark datasets respectively, which are 11.1 and 1.2 percentage points higher on the low-light Cornell dataset and 5.5 and 5.0 percentage points higher on the low-light Jacquard dataset than those of the existing grasp detection models, including Generative Grasping Convolutional Neural Network (GG-CNN) and Generative Residual Convolutional Neural Network (GR-ConvNet), indicating that the proposed model has good grasp detection performance.
LifeLong Learning (LLL), as an emerging method, breaks the limitations of traditional machine learning and gives models the ability to accumulate, optimize and transfer knowledge in the learning process like human beings. In recent years, with the wide application of deep learning, more and more studies have attempted to solve the catastrophic forgetting problem in deep neural networks and escape the stability-plasticity dilemma, as well as to apply LLL methods to a wide variety of real-world scenarios to promote the development of artificial intelligence from weak to strong. Aiming at the field of computer vision, firstly, LLL methods were classified into four types in image classification tasks: data-driven methods, optimization process based methods, network structure based methods and knowledge combination based methods. Then, typical applications of LLL methods in other visual tasks and related evaluation indicators were introduced. Finally, the deficiencies of LLL methods at the current stage were discussed, and the future development directions of LLL methods were proposed.
To address the problem that the development of machine learning requires a large number of real datasets with both data security and availability, an improved K-anonymity privacy protection algorithm based on Random Forest (RF) was proposed, namely RFK-anonymity privacy protection. Firstly, the sensitivity of each attribute value was predicted by the RF algorithm. Secondly, the attribute values were clustered according to different sensitivities by using the k-means clustering algorithm, and the data was hidden to different degrees by using the K-anonymity algorithm according to the sensitivity clusters of the attributes. Finally, data tables with different hiding degrees were selected by different users according to their needs. Experimental results show that on the Adult dataset, compared with the data processed by the K-anonymity algorithm, the accuracies of the data processed by the RFK-anonymity privacy protection algorithm are increased by 0.5 and 1.6 percentage points at thresholds of 3 and 4, respectively; compared with the data processed by the (p, α, k)-anonymity algorithm, the accuracies of the data processed by the proposed algorithm are improved by 0.4 and 1.9 percentage points at thresholds of 4 and 5, respectively. It can be seen that the RFK-anonymity privacy protection algorithm can effectively improve the availability of data on the basis of protecting the privacy and security of data, and it is more suitable for classification and prediction in machine learning.
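For readers unfamiliar with the K-anonymity property that the algorithm builds on, the sketch below checks whether a table is K-anonymous and shows one simple generalization step. It illustrates only the basic property, not the RF-based sensitivity prediction or the clustering; the record layout (dictionaries with an integer `age` field) and both function names are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width):
    """Hide an exact age behind a bucket label such as '30-39' (width=10)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}
```

A wider bucket hides more detail and makes the table easier to anonymize, which mirrors the trade-off between hiding degree and data availability discussed above.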
In response to the inability of existing 3D shape reconstruction models to effectively fuse global spatio-temporal information, a Depth Focus Volume (DFV) module was proposed to retain the transition information of focus and defocus. On this basis, a Global Spatio-Temporal Feature Coupling (GSTFC) model was proposed to extract local and global spatio-temporal feature information of multi-depth-of-field image sequences. Firstly, the 3D-ConvNeXt module and 3D convolutional layer were interspersed in the shrinkage path to capture multi-scale local spatio-temporal features. Meanwhile, the 3D-SwinTransformer module was added to the bottleneck module to capture the global correlations of local spatio-temporal features of multi-depth-of-field image sequences. Then, the local spatio-temporal features and global correlations were fused into global spatio-temporal features through the adaptive parameter layer, and these features were input into the expansion path to guide the generation of the focus volume. Finally, the sequence weight information of the focus volume was extracted by DFV and the transition information of focus and defocus was retained to obtain the final depth map. Experimental results show that GSTFC decreases the Root Mean Square Error (RMSE) by 12.5% on the FoD500 dataset compared with the state-of-the-art All-in-Focus Depth Net (AiFDepthNet) model, and retains more depth-of-field transition relationships compared with the traditional Robust Focus Volume Regularization in Shape from Focus (RFVR-SFF) model.
Drug synthesis reactions, especially asymmetric reactions, are key components of modern pharmaceutical chemistry. Chemists have invested considerable manpower and resources to identify various chemical reaction patterns in order to achieve efficient synthesis and asymmetric catalysis. Recent research on quantum mechanical computing and machine learning algorithms in this field has shown the great potential of accurate virtual screening and of learning from existing drug synthesis reaction data by computers. However, the existing methods use only a small amount of single-modal data and, because of insufficient data, can only apply common machine learning methods. This hinders their universal application in a wider range of scenarios. Therefore, two screening models of drug synthesis reaction integrating multimodal data were proposed for virtual screening of reaction yield and enantioselectivity. At the same time, a 3D conformation descriptor based on Boltzmann distribution was also proposed to combine the 3D spatial information of molecules with quantum mechanical properties. These two multimodal data fusion models were trained and verified in two representative organic synthesis reactions (C-N cross coupling reaction and N,S-acetal formation). The $R^2$ (R-squared) of the former is increased by more than 1 percentage point compared with those of the baseline methods in most data splits, and the Mean Absolute Error (MAE) of the latter is decreased by more than 0.5 percentage points compared with those of the baseline methods in most data splits. It can be seen that the models based on multimodal data fusion perform well in different tasks of organic reaction screening.
Existing machine learning-based methods for Distributed Denial-of-Service (DDoS) attack detection face increasing detection difficulty and cost as network traffic grows more complex and data structures keep expanding. To address these issues, a random forest DDoS attack detection method that integrates feature selection was proposed. In this method, the mean impurity algorithm based on Gini coefficient was used as the feature selection algorithm to reduce the dimensionality of DDoS abnormal traffic samples, thereby reducing training cost and improving training accuracy. Meanwhile, the feature selection algorithm was embedded into the single base learner of random forest, and the feature subset search range was reduced from all features to the features corresponding to a single base learner, which improved the coupling of the two algorithms and improved the model accuracy. Experimental results show that, under the premise of limiting decision tree number and training sample size, the model trained by the proposed method has a recall increased by 21.8 percentage points and an F1-score increased by 12.0 percentage points compared to the model before improvement, and both of them are also better than those of the traditional random forest detection scheme.
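The Gini-based scoring that underlies this kind of feature selection can be sketched as follows. The snippet computes the Gini impurity of a label set and the impurity decrease produced by one threshold split on one feature; averaging such decreases over the splits of a tree is one common way to rank features. This is a generic illustration with hypothetical function names, not the exact selection procedure of the proposed method.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(values, labels, threshold):
    """Weighted Gini decrease from splitting samples on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted
```

A feature whose split separates attack traffic from normal traffic cleanly yields a large decrease, while an uninformative feature yields a decrease near zero and would be dropped.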
Multi-modal medical images can provide clinicians with rich information of target areas (such as tumors, organs or tissues). However, effective fusion and segmentation of multi-modal images is still a challenging problem due to the independence and complementarity of multi-modal images. Traditional image fusion methods have difficulty in addressing this problem, leading to widespread research on deep learning-based multi-modal medical image segmentation algorithms. The multi-modal medical image segmentation task based on deep learning was reviewed in terms of principles, techniques, problems, and prospects. Firstly, the general theory of deep learning and multi-modal medical image segmentation was introduced, including the basic principles and development processes of deep learning and Convolutional Neural Network (CNN), as well as the importance of the multi-modal medical image segmentation task. Secondly, the key concepts of multi-modal medical image segmentation were described, including data dimension, preprocessing, data enhancement, loss function, and post-processing. Thirdly, different multi-modal segmentation networks based on different fusion strategies were summarized and analyzed. Finally, several common problems in medical image segmentation were discussed, and a summary and prospects for future research were given.
At present, most deep learning models have difficulty dealing with the classification of bird sound under complex background noise. Because bird sound has the continuity characteristic in time domain and high-low characteristic in frequency domain, a fusion model of homologous spectrogram features was proposed for bird sound classification under complex background noise. Firstly, Convolutional Neural Network (CNN) was used to extract Mel-spectrogram features of bird sound. Then, the time domain and frequency domain dimensions of the same Mel-spectrogram feature were compressed to 1 by specific convolution and down-sampling operations, so that the frequency domain feature with only high-low characteristics and the time domain feature with only continuous characteristics were obtained. Following the same operations used to extract the frequency domain and time domain features, the features of the Mel-spectrogram were extracted in both time domain and frequency domain, and the time-frequency domain features with continuity and high-low characteristics were obtained. Then the self-attention mechanism was applied to the obtained time domain, frequency domain and time-frequency domain features, strengthening their own characteristics. Finally, the results of these three homologous spectrogram features after decision fusion were used for bird sound classification. The proposed model was used for audio classification of 8 bird species from the Xeno-canto website, and achieved the best result in the comparison experiment with a Mean Average Precision (MAP) of 0.939. The experimental results show that the proposed model can address the poor classification of bird sound under complex background noise.
To address the problem that the improved federated average algorithm based on the analytic hierarchy process is affected by subjective factors when calculating data quality, an improved federated weighted average algorithm was proposed to process multi-source data from the perspective of data quality. Firstly, the training samples were divided into pre-training samples and pre-testing samples. Then, the accuracy of the initial global model on the pre-training data was used as the quality weight of the data source. Finally, the quality weight was introduced into the federated average algorithm to update the weights of the global model. The simulation results show that the model trained by the improved federated weighted average algorithm achieves higher accuracy than the model trained by the traditional federated average algorithm, with improvements of 1.59% and 1.24% on equally divided and unequally divided datasets respectively. At the same time, compared with the traditional multi-party data retraining method, the accuracy of the proposed model is slightly lower, but the security of the data and the model is improved.
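The aggregation step can be sketched as follows. This is a minimal illustration, not the paper's exact update rule: it assumes that each data source contributes a flat parameter vector and that the quality weights (for example, measured accuracies) are simply normalized to sum to 1. The function name `quality_weighted_average` is hypothetical.

```python
import numpy as np

def quality_weighted_average(client_params, quality_weights):
    """Aggregate client parameter vectors using normalized quality weights.

    client_params: list of equally shaped parameter arrays, one per data source.
    quality_weights: one non-negative score per data source (assumed here to be
    the accuracy measured for that source).
    """
    weights = np.asarray(quality_weights, dtype=float)
    if weights.sum() <= 0:
        raise ValueError("quality weights must sum to a positive value")
    # Normalization is an assumption; the abstract does not specify it.
    weights = weights / weights.sum()
    stacked = np.stack([np.asarray(p, dtype=float) for p in client_params])
    # Weighted sum over the client axis.
    return np.tensordot(weights, stacked, axes=1)
```

With equal quality weights this reduces to an unweighted mean of the client parameters, so the quality scores only matter when the data sources differ.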
In order to protect data privacy while ensuring data availability in clustering analysis, a privacy protection clustering scheme based on the Local Differential Privacy (LDP) technique, called LDPK-Prototypes (LDP K-Prototypes), was proposed. Firstly, the hybrid dataset was encoded by users. Then, a random response mechanism was used to disturb the sensitive data, and after collecting the users’ disturbed data, the third party recovered the original dataset to the maximum extent. After that, the K-Prototypes clustering algorithm was performed. In the clustering process, the initial clustering center was determined by the dissimilarity measure method, and the distance calculation formula was redefined by the entropy weight method. Theoretical analysis and experimental results show that compared with the ODPC (Optimizing and Differentially Private Clustering) algorithm based on the Centralized Differential Privacy (CDP) technique, the proposed scheme improves the average accuracy on the Adult and Heart datasets by 2.95% and 12.41% respectively, effectively improving the clustering usability. Meanwhile, LDPK-Prototypes expands the difference between data, effectively avoids local optima, and improves the stability of the clustering algorithm.
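The perturb-then-recover idea can be illustrated with classic binary randomized response. This is a sketch under assumptions: the abstract does not give the encoding or the exact mechanism used by LDPK-Prototypes, so one-bit inputs, the privacy budget `epsilon`, and all function names below are illustrative.

```python
import math
import random

def keep_probability(epsilon):
    """Probability of reporting the true bit: e^eps / (1 + e^eps)."""
    return 1.0 / (1.0 + math.exp(-epsilon))

def perturb_bits(bits, epsilon, rng):
    """User side: keep each bit with probability p, flip it otherwise."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]

def estimate_frequency(reported_bits, epsilon):
    """Collector side: unbiased estimate of the true fraction of 1s."""
    p = keep_probability(epsilon)
    observed = sum(reported_bits) / len(reported_bits)
    # E[observed] = f * p + (1 - f) * (1 - p); solve for f.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

A smaller `epsilon` flips more bits and gives stronger privacy, at the cost of a noisier estimate on the collector side.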
To address the high time cost of manual detection and the insufficient precision of current detection methods for elongated pavement distress, a two-stage elongated pavement distress detection method named Epd RCNN (Elongated pavement distress Region-based Convolutional Neural Network) was proposed. Based on the weak semantic characteristics and abnormal geometric properties of the distress, the method can accurately locate and classify it. Firstly, for the weak semantic characteristics of elongated pavement distress, a backbone network that reused low-level features and repeatedly fused the features of different stages was proposed. Secondly, in the training process, high-quality positive samples for network training were generated by an anchor box mechanism conforming to the geometric property distribution of the distress. Then, the distress bounding boxes were predicted on a single high-resolution feature map, and a parallel cascaded dilated convolution module was applied to this feature map to improve its multi-scale feature representation ability. Finally, for region proposals of different shapes, the region proposal features conforming to the distress geometric properties were extracted by the proposal feature improvement module composed of deformable Region of Interest Pooling (RoI Pooling) and a spatial attention module. Experimental results show that the proposed method has a mean Average Precision (mAP) of 0.907 on images with sufficient illumination, an mAP of 0.891 on images with illumination problems and a comprehensive mAP of 0.899, indicating that the proposed method has good detection performance and robustness to illumination.
Differentiable ARchiTecture Search (DARTS) can design neural network architectures efficiently and automatically. However, there is a wide performance gap between the way its super network is constructed and the way its derivation strategy is designed. To solve this problem, a differentiable neural architecture search algorithm with a constraint in an optimized search space was proposed. Firstly, the training process of the super network was analyzed by using the architecture parameters associated with the candidate operations as quantitative indicators, and it was found that the invalid candidate operation none occupied the architecture parameter with the maximum weight in the derived architecture, which caused the architectures obtained by the algorithm to perform poorly. To address this, an optimized search space was proposed. Then, the difference between the super network of DARTS and the derived architecture was analyzed, the architecture entropy was defined based on the architecture parameters, and this architecture entropy was used as a constraint on the objective function of DARTS, so as to push the super network to narrow its difference from the derivation strategy. Finally, experiments were conducted on the CIFAR-10 dataset. The experimental results show that the architecture searched by the proposed algorithm achieved 97.17% classification accuracy, outperforming the comparison algorithms overall in accuracy, parameter quantity and search time. The proposed algorithm is effective and improves the classification accuracy of the searched architecture on the CIFAR-10 dataset.
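One plausible reading of "architecture entropy" is sketched below. The abstract does not give the definition, so this assumes the Shannon entropy of the softmax over each edge's architecture parameters, summed over edges; the function name and the `(num_edges, num_operations)` layout are illustrative.

```python
import numpy as np

def architecture_entropy(alpha):
    """Sum of per-edge Shannon entropies of softmax(alpha).

    alpha: array of shape (num_edges, num_operations) holding the architecture
    parameters of the super network (assumed layout).
    """
    alpha = np.asarray(alpha, dtype=float)
    # Numerically stable softmax over the candidate operations of each edge.
    shifted = alpha - alpha.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Small constant avoids log(0) for operations with vanishing probability.
    return float(-(probs * np.log(probs + 1e-12)).sum())
```

Under this assumption, adding the entropy as a penalty to the search objective pushes each edge toward a near one-hot choice, which is closer to the discrete architecture that is eventually derived.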
Parallel computing methods can be used to validate the topology of polygons stored in the simple feature model. A parallel algorithm for validating the topology of such polygons was designed and implemented. The algorithm modified the master-slave strategy according to the characteristics of topology validation and generated threads in the master processor to implement task parallelism. In this way, the running time of computing and writing topology errors was hidden. MPI and PThread were used to combine processes and threads. The land use data of 5 cities in Jiangsu, China, was used to evaluate the performance of the algorithm. The tests show that the parallel algorithm can validate the topology of massive polygons stored in the simple feature model correctly and efficiently. Compared with the master-slave strategy, the speedup of the algorithm increases by 20%.
To obtain a reasonable deployment and a communication relay link model for Unmanned Aerial Vehicles (UAVs), and to extend the data transmission distance, the Improved Bellman-Ford (IBF) algorithm and the Improved Dijkstra Algorithm (IDA) were proposed, taking into account communication blind areas and the limited number of available UAVs. The UAV deployment problem was modeled as an All Hop Optimal Path (AHOP) problem, in which the IBF algorithm was used to generate a set of reachable records, and the solutions were obtained by accessing the records in reverse. IDA then changed the connection weights of the edges in each iteration and found the path that decreased the number of hops of the relay link, thereby obtaining a feasible solution to the UAV relay deployment problem. The simulation analysis shows that IBF and IDA can provide effective solutions for relay link deployment, and that the time performance of the proposed algorithms is superior to that of the Bellman-Ford (BF) algorithm.
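The record-then-backtrack idea can be illustrated with a standard hop-limited Bellman-Ford sketch. This is the textbook dynamic program, not the paper's IBF algorithm: it assumes directed edges with non-negative weights, and the function names are hypothetical.

```python
import math

def all_hop_shortest_paths(num_nodes, edges, source, max_hops):
    """Return dist and pred tables indexed by hop budget.

    dist[h][v] is the least cost from source to v using at most h edges.
    pred[h][v] is the previous node when the last edge is used at budget h,
    or None when the value is inherited from budget h - 1.
    edges: list of (u, v, weight) directed edges.
    """
    dist = [[math.inf] * num_nodes for _ in range(max_hops + 1)]
    pred = [[None] * num_nodes for _ in range(max_hops + 1)]
    dist[0][source] = 0.0
    for h in range(1, max_hops + 1):
        dist[h] = dist[h - 1][:]
        for u, v, w in edges:
            if dist[h - 1][u] + w < dist[h][v]:
                dist[h][v] = dist[h - 1][u] + w
                pred[h][v] = u
    return dist, pred

def backtrack(pred, source, target, hops):
    """Recover the path for a given hop budget by reading the records in reverse."""
    path, node = [target], target
    for h in range(hops, 0, -1):
        if pred[h][node] is not None:
            node = pred[h][node]
            path.append(node)
    return path[::-1] if node == source else None
```

In a relay setting, each intermediate node on the recovered path would correspond to one relay position, so the hop budget bounds the number of UAVs required.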