Classic Federated Learning (FL) algorithms struggle to achieve good results in scenarios where data is highly heterogeneous. Personalized FL (PFL) addresses the problem of data heterogeneity in federated learning by "tailoring" a dedicated model for each client; this yields good model performance but, at the same time, makes it difficult to extend federated learning to new clients. To address both the performance and the scalability challenges in PFL, FedDual, an FL model with a dual-stream neural network structure, was proposed. By adding an encoder that analyzes the personalized characteristics of each client, the model achieves the performance of personalized models while remaining easy to extend to new clients. Experimental results show that, compared with the classic Federated Averaging (FedAvg) algorithm, FedDual noticeably improves accuracy on datasets such as MNIST and FashionMNIST, and improves accuracy by more than 10 percentage points on the CIFAR10 dataset. Moreover, FedDual achieves "plug and play" for new clients without loss of accuracy, solving the problem of poor scalability to new clients.
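The abstract does not give implementation details of the dual-stream structure; the following is a minimal PyTorch-style sketch of the general idea, assuming a shared feature stream that is averaged across clients and a locally kept personalization encoder. All class names, layer sizes, and the averaging helper are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class DualStreamNet(nn.Module):
    """Illustrative dual-stream classifier: a globally shared stream plus a
    personalization encoder (names and sizes are assumptions, not FedDual's)."""
    def __init__(self, in_dim=784, hidden=128, num_classes=10):
        super().__init__()
        # Stream 1: shared feature extractor, aggregated across clients (FedAvg-style).
        self.shared_stream = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Stream 2: personalization encoder kept locally to capture
        # client-specific data characteristics.
        self.personal_encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
        )
        # Classifier head fed by both streams.
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):
        x = x.flatten(1)
        z_shared = self.shared_stream(x)
        z_personal = self.personal_encoder(x)
        return self.head(torch.cat([z_shared, z_personal], dim=1))

def average_shared(state_dicts):
    """Average only the shared-stream parameters of several clients (simplified)."""
    avg = {}
    for key in state_dicts[0]:
        if key.startswith("shared_stream"):
            avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(0)
    return avg
```

Under this reading, a new client could reuse the aggregated shared stream as-is and only fit its own personalization encoder, which is one plausible way the "plug and play" property described above could be realized.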
The time slicing method used for a dynamic network greatly influences the accuracy of community evolution analysis. Because communities vary nonlinearly with time and network topology, both the existing uniform time slicing method and the nonuniform time slicing method based on network topology variance are unsatisfactory at capturing community evolution events. Therefore, a nonuniform time slicing method based on the prediction of community variance was proposed, in which community variance is quantified as the difference between the community modularity that the updated network is expected to achieve and the modularity obtained by directly applying the community detection results of the network before the change. Firstly, a prediction model of community modularity was established on the basis of time series analysis. Secondly, the expected community modularity of the updated network was predicted with this model, yielding the predicted value of community variance. Finally, once the predicted value exceeded a preset threshold, a new time slice was generated. Experimental results on two real network datasets show that, compared with the traditional uniform time slicing method and the nonuniform time slicing method based on network topology variance, on the dynamic network dataset Arxiv HEP-PH the proposed method identifies community disappearance events 1.10 days and 1.30 days earlier, respectively, identifies community forming events 8.34 days and 3.34 days earlier, respectively, and increases the total numbers of identified community shrinking and growing events by 10 and 1, respectively. On the Sx-MathOverflow dataset, the proposed method identifies community disappearance events 3.30 days and 1.80 days earlier, identifies community forming events 6.41 days and 2.97 days earlier, and increases the total numbers of identified community shrinking and growing events by 15 and 7, respectively.
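The abstract specifies the slicing criterion but not the concrete prediction model; the sketch below, in Python with networkx, illustrates the overall loop under the assumption that a simple moving-average forecast stands in for the paper's time-series prediction model. Function names, the threshold value, and the forecasting rule are assumptions for illustration only.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

def slice_stream(edge_batches, threshold=0.05, history=3):
    """Illustrative nonuniform time slicing driven by predicted community variance."""
    G = nx.Graph()
    slices, mod_series, partition = [], [], None
    for t, batch in enumerate(edge_batches):
        G.add_edges_from(batch)                       # apply this step's edge updates
        if partition is None:                         # initialize the first slice
            partition = list(greedy_modularity_communities(G))
            mod_series.append(modularity(G, partition))
            slices.append(t)
            continue
        # Modularity of the updated network under the *old* (stale) partition.
        cover = [set(c) for c in partition]
        new_nodes = set(G) - set().union(*cover)
        if new_nodes:
            cover.append(new_nodes)                   # park unseen nodes in a catch-all community
        stale_mod = modularity(G, cover)
        # Forecast the modularity the updated network is expected to reach
        # (moving average as a stand-in for the time-series prediction model).
        expected_mod = sum(mod_series[-history:]) / len(mod_series[-history:])
        predicted_variance = expected_mod - stale_mod
        if predicted_variance > threshold:            # community structure has drifted enough
            partition = list(greedy_modularity_communities(G))
            mod_series.append(modularity(G, partition))
            slices.append(t)                          # start a new time slice here
    return slices
```

The key point the sketch captures is that community detection is re-run only when the predicted variance crosses the threshold, so slice boundaries follow the pace of community change rather than a fixed clock.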
Spam recognition is one of the main tasks in natural language processing. Traditional methods are based on text features or word frequency, and their recognition accuracy mainly depends on the presence or absence of specific keywords. When the spam contains no such keywords, or the keywords are recognized incorrectly, traditional methods perform poorly. Therefore, neural network-based methods were proposed, and recognition training and testing were conducted on complex spam. The spam that cannot be recognized by traditional methods was collected, and the same amount of normal information was randomly selected, from spam message, advertisement, and spam email datasets, to form three new datasets without duplicate data. Three models based on convolutional neural networks and recurrent neural networks were proposed and tested on the three new datasets for spam recognition. Experimental results show that the neural network-based models learn better semantic features from the text and achieve accuracies of more than 98% on all three datasets, significantly higher than those of traditional methods such as Naive Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM). The results also show that different neural networks are suited to classifying texts of different lengths: models composed of recurrent neural networks are good at recognizing sentence-length text, models composed of convolutional neural networks are good at recognizing paragraph-length text, and models combining both are good at recognizing chapter-length text.
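The abstract does not describe the three architectures in detail; the following PyTorch sketch shows generic versions of the two building blocks it mentions, an RNN classifier for sentence-length text and a CNN classifier for paragraph-length text (a third, combined model would chain the two). All hyperparameters and names are illustrative assumptions, not the paper's models.

```python
import torch
import torch.nn as nn

class RNNTextClassifier(nn.Module):
    """Sentence-length texts: a bidirectional LSTM over word embeddings."""
    def __init__(self, vocab_size, emb_dim=128, hidden=64, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, tokens):                  # tokens: (batch, seq_len) int ids
        out, _ = self.rnn(self.emb(tokens))
        return self.fc(out.mean(dim=1))         # average the hidden states over time

class CNNTextClassifier(nn.Module):
    """Paragraph-length texts: parallel 1-D convolutions with max-pooling."""
    def __init__(self, vocab_size, emb_dim=128, channels=64, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, channels, k) for k in (3, 4, 5))
        self.fc = nn.Linear(3 * channels, num_classes)

    def forward(self, tokens):
        x = self.emb(tokens).transpose(1, 2)    # (batch, emb_dim, seq_len)
        feats = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))
```

The intuition matches the length-dependence reported above: recurrent layers track word order within a sentence, while convolutional filters pick up local keyword-like n-gram patterns that matter more in longer passages.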