Most Read articles

    Federated learning survey: concepts, technologies, applications and challenges
    Tiankai LIANG, Bi ZENG, Guang CHEN
    Journal of Computer Applications    2022, 42 (12): 3651-3662.   DOI: 10.11772/j.issn.1001-9081.2021101821
    Abstract (968)   HTML (13)   PDF (2464KB) (603)

    Against a background of growing emphasis on data right confirmation and privacy protection, federated learning, a new machine learning paradigm, can address the data island problem and protect privacy without exposing the data of any participant. Because modeling methods based on federated learning have become mainstream and achieve good results, it is worthwhile to summarize and analyze the concepts, technologies, applications and challenges of federated learning. Firstly, the development of machine learning and the inevitability of the emergence of federated learning were elaborated, and the definition and classification of federated learning were given. Secondly, the three federated learning methods currently recognized by the industry (horizontal federated learning, vertical federated learning and federated transfer learning) were introduced and analyzed. Thirdly, concerning privacy protection in federated learning, the common existing privacy protection technologies were summarized. In addition, the recent mainstream open-source frameworks were introduced and compared, and the application scenarios of federated learning were given. Finally, the challenges and future research directions of federated learning were discussed.

    Stock market volatility prediction method based on graph neural network with multi-attention mechanism
    Xiaohan LI, Jun WANG, Huading JIA, Liu XIAO
    Journal of Computer Applications    2022, 42 (7): 2265-2273.   DOI: 10.11772/j.issn.1001-9081.2021081487
    Abstract (564)   HTML (11)   PDF (2246KB) (198)

    The stock market is an essential part of the financial market, so the study of stock market volatility plays a significant role in controlling financial market risks effectively and improving returns on investment, and it has attracted widespread attention from both academia and industry. However, the stock market is affected by many factors, and its information is multi-source and heterogeneous, which makes it challenging to mine and fuse these data efficiently. To fully explain the influence of different information and information interaction on price changes in the stock market, a graph neural network based on multi-attention mechanism was proposed to predict price fluctuation in the stock market. First of all, the relationship dimension was introduced to construct heterogeneous subgraphs for the transaction data and news text of the stock market, and the multi-attention mechanism was adopted to fuse the graph data. Then, the graph neural network Gated Recurrent Unit (GRU) was applied to perform graph classification. On this basis, the volatility of three important indexes was predicted: Shanghai Composite Index, Shanghai and Shenzhen 300 Index, and Shenzhen Component Index. Experimental results show that, from the perspective of heterogeneous information characteristics, the news information of the stock market has a lagged influence on stock volatility compared with the transaction data; from the perspective of heterogeneous information fusion, compared with algorithms such as Support Vector Machine (SVM), Random Forest (RF) and Multiple Kernel k-Means (MKKM) clustering, the proposed method improves prediction accuracy by 17.88, 30.00 and 38.00 percentage points respectively. In addition, a quantitative investment simulation was performed according to the trading strategy of the model.

    Multiscale residual UNet based on attention mechanism to realize breast cancer lesion segmentation
    Shengqin LUO, Jinyi CHEN, Hongjun LI
    Journal of Computer Applications    2022, 42 (3): 818-824.   DOI: 10.11772/j.issn.1001-9081.2021040948
    Abstract (543)   HTML (29)   PDF (1860KB) (144)

    To address the characteristics of breast cancer lesions in Magnetic Resonance Imaging (MRI), such as varied shapes and sizes and fuzzy boundaries, an algorithm based on a multiscale residual U Network (UNet) with attention mechanism was proposed to avoid mis-segmentation and improve segmentation accuracy. Firstly, multiscale residual units were used to replace two adjacent convolution blocks in the down-sampling process of UNet, so that the network could pay more attention to differences in shape and size. Then, in the up-sampling stage, layer-crossed attention was used to guide the network to focus on the key regions, avoiding the mis-segmentation of healthy tissues. Finally, to enhance the ability to represent lesions, atrous spatial pyramid pooling was introduced into the network as a bridging module. Compared with UNet, the proposed algorithm improved the Dice coefficient, Intersection over Union (IoU), SPecificity (SP) and ACCuracy (ACC) by 2.26, 2.11, 4.16 and 0.05 percentage points, respectively. Experimental results show that the algorithm can improve the segmentation accuracy of lesions and effectively reduce the false positive rate of imaging diagnosis.
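
The four metrics named in this abstract can be illustrated with a minimal sketch that uses the standard confusion-count definitions on flat binary masks. The function names and the convention of returning 1.0 for an empty denominator are assumptions made for this illustration and are not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Count (TP, FP, FN, TN) over two equal-length flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1          # lesion pixel predicted as lesion
        elif p and not t:
            fp += 1          # healthy pixel predicted as lesion
        elif t:
            fn += 1          # lesion pixel missed
        else:
            tn += 1          # healthy pixel predicted as healthy
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return Dice, IoU, specificity (SP) and accuracy (ACC) as fractions."""
    tp, fp, fn, tn = confusion_counts(pred, truth)

    def ratio(num, den):
        # Assumed convention: an empty denominator counts as a perfect score.
        return num / den if den else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sp": ratio(tn, tn + fp),
        "acc": ratio(tp + tn, tp + fp + fn + tn),
    }
```

Specificity depends only on the healthy (negative) pixels, so a gain in SP corresponds directly to fewer healthy pixels being marked as lesion.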

    Stock trend prediction method based on temporal hypergraph convolutional neural network
    Xiaojie LI, Chaoran CUI, Guangle SONG, Yaxi SU, Tianze WU, Chunyun ZHANG
    Journal of Computer Applications    2022, 42 (3): 797-803.   DOI: 10.11772/j.issn.1001-9081.2021050748
    Abstract (542)   HTML (33)   PDF (742KB) (269)

    Traditional stock prediction methods are mostly based on time-series models, which ignore the complex relations among stocks, and the relations often exceed pairwise connections, such as stocks in the same industry or multiple stocks held by the same fund. To solve this problem, a stock trend prediction method based on temporal HyperGraph Convolutional neural Network (HGCN) was proposed, and a hypergraph model based on financial investment facts was constructed to fit multiple relations among stocks. The model was composed of two major components: Gated Recurrent Unit (GRU) network and HGCN. GRU network was used for performing time-series modeling on historical data to capture long-term dependencies. HGCN was used to model high-order relations among stocks to learn intrinsic relation attributes, and introduce the multiple relation information among stocks into traditional time-series modeling for end-to-end trend prediction. Experiments on real dataset of China A-share market show that compared with existing stock prediction methods, the proposed model improves prediction performance, e.g. compared with the GRU network, the proposed model achieves the relative increases in ACC and F1_score of 9.74% and 8.13%, respectively, and is more stable. In addition, the simulation back-testing results show that the trading strategy based on the proposed model is more profitable, with an annual return of 11.30%, which is 5 percentage points higher than that of Long Short-Term Memory (LSTM) network.
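
The idea that one relation can link more than two stocks can be sketched with a hypergraph incidence matrix, in which each hyperedge (an industry or a fund) is a column and each stock is a row. The stock names, group names and helper functions below are hypothetical and only illustrate the data structure; they are not the construction used in the paper.

```python
def build_incidence(stocks, hyperedges):
    """Return H with H[i][j] = 1 if stock i belongs to hyperedge j, else 0."""
    index = {s: i for i, s in enumerate(stocks)}
    H = [[0] * len(hyperedges) for _ in stocks]
    for j, members in enumerate(hyperedges.values()):
        for s in members:
            H[index[s]][j] = 1
    return H


def degrees(H):
    """Node degree = hyperedges per stock; edge degree = stocks per hyperedge."""
    node_deg = [sum(row) for row in H]
    edge_deg = [sum(col) for col in zip(*H)]
    return node_deg, edge_deg


# Hypothetical example: one industry groups three stocks, one fund holds two.
stocks = ["A", "B", "C", "D"]
hyperedges = {"industry:banking": ["A", "B", "C"], "fund:X": ["B", "D"]}
H = build_incidence(stocks, hyperedges)
node_deg, edge_deg = degrees(H)
```

A pairwise graph would need three separate edges to express the banking group, whereas the hypergraph keeps it as a single column of `H`.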

    Safety helmet wearing detection algorithm based on improved YOLOv5
    Jin ZHANG, Peiqi QU, Cheng SUN, Meng LUO
    Journal of Computer Applications    2022, 42 (4): 1292-1300.   DOI: 10.11772/j.issn.1001-9081.2021071246
    Abstract (511)   HTML (23)   PDF (7633KB) (286)

    To address the strong interference and low detection precision of existing safety helmet wearing detection, a safety helmet detection algorithm based on an improved YOLOv5 (You Only Look Once version 5) model was proposed. Firstly, to handle safety helmets of different sizes, the K-Means++ algorithm was used to redesign the sizes of the anchor boxes and match them to the corresponding feature layers. Secondly, a multi-spectral channel attention module was embedded in the feature extraction network so that the network could learn the weight of each channel autonomously and enhance information propagation between features, thereby strengthening the network's ability to distinguish foreground from background. Finally, images of different sizes were input randomly during training iterations to enhance the generalization ability of the algorithm. Experimental results show that, on the self-built safety helmet wearing detection dataset, the proposed algorithm reaches a mean Average Precision (mAP) of 96.0%, an Average Precision (AP) of 96.7% for workers wearing safety helmets, and an AP of 95.2% for workers without safety helmets. Compared with the YOLOv5 algorithm, the proposed algorithm increases the mAP of safety helmet wearing detection by 3.4 percentage points, and it meets the accuracy requirement of safety helmet wearing detection in construction scenarios.
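
The anchor redesign step can be sketched as K-Means++ clustering over labelled box sizes. The abstract does not state the distance measure, so the use of 1 - IoU between (width, height) pairs below is an assumption borrowed from common YOLO practice, and all function names are hypothetical.

```python
import random


def iou_wh(a, b):
    """IoU of two boxes given as (width, height), both anchored at the origin."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_seeds(boxes, k, rng):
    """K-Means++ seeding: draw each new centre with probability proportional
    to the squared distance (here 1 - IoU) to the nearest chosen centre."""
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(1 - iou_wh(b, c) for c in centres) ** 2 for b in boxes]
        centres.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centres


def cluster_anchors(boxes, k, iters=20, seed=0):
    """Return k anchor sizes sorted by area.

    Assumes `boxes` contains at least k distinct (width, height) pairs.
    """
    rng = random.Random(seed)
    centres = kmeanspp_seeds(boxes, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            nearest = max(range(k), key=lambda i: iou_wh(b, centres[i]))
            groups[nearest].append(b)
        # Move each centre to the mean size of its group; keep it if empty.
        centres = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return sorted(centres, key=lambda c: c[0] * c[1])
```

Sorting by area makes it straightforward to assign the smallest anchors to the highest-resolution feature layer and the largest to the coarsest one.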

    Research progress on binary code similarity search
    Bing XIA, Jianmin PANG, Xin ZHOU, Zheng SHAN
    Journal of Computer Applications    2022, 42 (4): 985-998.   DOI: 10.11772/j.issn.1001-9081.2021071267
    Abstract (509)   HTML (102)   PDF (841KB) (453)

    With the rapid development of Internet of Things (IoT) and industrial Internet, cyberspace security research has received increasing attention from industry and academia. Because the source code cannot be obtained, binary code similarity search has become a key technology for vulnerability mining and malware code analysis. Firstly, the basic concepts of binary code similarity search and the framework of a binary code similarity search system were introduced. Secondly, the development status of syntax similarity search, semantic similarity search and pragmatic similarity search for binary code was discussed. Then, the existing solutions were summarized and compared from the perspectives of binary hash, instruction sequence, graph structure, basic block semantics, feature learning, debugging information recovery and advanced semantic recognition of functions. Finally, the future development directions and prospects of binary code similarity search were discussed.

    Time series classification by LSTM based on multi-scale convolution and attention mechanism
    Yinglü XUAN, Yuan WAN, Jiahui CHEN
    Journal of Computer Applications    2022, 42 (8): 2343-2352.   DOI: 10.11772/j.issn.1001-9081.2021061062
    Abstract (508)   HTML (44)   PDF (711KB) (269)

    The multi-scale features of time series contain abundant category information whose importance for classification varies. However, existing univariate time series classification models conventionally extract series features by convolutions with a fixed kernel size, and therefore cannot acquire and focus on important multi-scale features effectively. To solve this problem, a Multi-scale Convolution and Attention mechanism (MCA) based Long Short-Term Memory (LSTM) model (MCA-LSTM) was proposed, which can focus on and fuse important multi-scale features to achieve more accurate classification. In this structure, LSTM controls the transmission of series information through memory cells and gate mechanism and fully extracts the correlation information of the time series; the Multi-scale Convolution Module (MCM) extracts the multi-scale features of the series through Convolutional Neural Networks (CNNs) with different kernel sizes; and the Attention Module (AM) fuses the channel information to obtain the importance of features and assign attention weights, which enables the network to focus on important time series features. Experimental results on 65 univariate time series datasets of the UCR archive show that, compared with the state-of-the-art time series classification methods Unsupervised Scalable Representation Learning-FordA (USRL-FordA), Unsupervised Scalable Representation Learning-Combined (1-Nearest Neighbor) (USRL-Combined (1-NN)), Omni-Scale Convolutional Neural Network (OS-CNN), Inception-Time and Robust Temporal Feature Network for time series classification (RTFN), MCA-LSTM reduces the Mean Error (ME) by 7.48, 9.92, 2.43, 2.09 and 0.82 percentage points, respectively, and achieves the highest Arithmetic Mean Rank (AMR) and Geometric Mean Rank (GMR), which are 2.14 and 3.23 respectively. These results demonstrate the effectiveness of MCA-LSTM in the classification of univariate time series.

    Survey on imbalanced multi‑class classification algorithms
    Mengmeng LI, Yi LIU, Gengsong LI, Qibin ZHENG, Wei QIN, Xiaoguang REN
    Journal of Computer Applications    2022, 42 (11): 3307-3321.   DOI: 10.11772/j.issn.1001-9081.2021122060
    Abstract (490)   HTML (62)   PDF (1861KB) (356)

    Imbalanced data classification is an important research topic in machine learning, but most existing imbalanced data classification algorithms focus on binary classification, and there are relatively few studies on imbalanced multi-class classification. However, datasets in practical applications usually have multiple classes and imbalanced data distribution, and the diversity of classes further increases the difficulty of imbalanced data classification, so the multi-class classification problem has become a research topic that urgently needs to be solved. The imbalanced multi-class classification algorithms proposed in recent years were reviewed. According to whether a decomposition strategy was adopted, imbalanced multi-class classification algorithms were divided into decomposition methods and ad-hoc methods. Furthermore, according to the decomposition strategy adopted, the decomposition methods were divided into two frameworks: One Vs. One (OVO) and One Vs. All (OVA). According to the technologies used, the ad-hoc methods were divided into data-level methods, algorithm-level methods, cost-sensitive methods, ensemble methods and deep network-based methods. The advantages and disadvantages of these methods and their representative algorithms were systematically described, the evaluation indicators of imbalanced multi-class classification methods were summarized, the performance of the representative methods was analyzed in depth through experiments, and the future development directions of imbalanced multi-class classification were discussed.
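
The two decomposition frameworks can be sketched by enumerating the binary subproblems each one creates. This is a minimal illustration with hypothetical helper names, not code from the survey.

```python
from itertools import combinations


def ovo_tasks(classes):
    """One Vs. One: one binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


def ova_tasks(classes):
    """One Vs. All: one binary task per class, against all remaining classes."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def relabel_ova(labels, positive):
    """Relabel a multi-class label list for a single OVA task (1 = positive)."""
    return [1 if y == positive else 0 for y in labels]
```

For k classes, OVO yields k(k-1)/2 tasks and OVA yields k. OVA merges every other class into the negative side, which tends to make each binary task more imbalanced than the original data.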

    Review of image classification algorithms based on convolutional neural network
    Changqing JI, Zhiyong GAO, Jing QIN, Zumin WANG
    Journal of Computer Applications    2022, 42 (4): 1044-1049.   DOI: 10.11772/j.issn.1001-9081.2021071273
    Abstract (470)   HTML (45)   PDF (605KB) (327)

    Convolutional Neural Network (CNN) is currently one of the important research directions in computer vision based on deep learning. It performs well in applications such as image classification, image segmentation and target detection, and its powerful feature learning and feature representation capabilities are increasingly valued by researchers. However, CNN still has problems such as incomplete feature extraction and overfitting to training samples. To address these issues, the development of CNN, the classical CNN models and their components were introduced, and methods for solving the above issues were provided. By reviewing the current status of research on CNN models in image classification, suggestions were provided for the further development and research directions of CNN.

    Research progress of blockchain‑based federated learning
    Rui SUN, Chao LI, Wei WANG, Endong TONG, Jian WANG, Jiqiang LIU
    Journal of Computer Applications    2022, 42 (11): 3413-3420.   DOI: 10.11772/j.issn.1001-9081.2021111934
    Abstract (466)   HTML (12)   PDF (1086KB) (321)

    Federated Learning (FL) is a novel privacy-preserving learning paradigm that can keep users' data local. As research on FL progresses, its shortcomings, such as single point of failure and lack of credibility, are gradually gaining attention. In recent years, blockchain technology, which originated from Bitcoin, has developed rapidly; it pioneers the construction of decentralized trust and provides a new possibility for the development of FL. The existing research works on blockchain-based FL were reviewed, and the frameworks for blockchain-based FL were compared and analyzed. Then, the key issues of FL solved by combining blockchain with FL were discussed. Finally, the application prospects of blockchain-based FL were presented in various fields, such as Internet of Things (IoT), Industrial Internet of Things (IIoT), Internet of Vehicles (IoV) and medical services.

    Review of applications of natural language processing in text sentiment analysis
    Yingjie WANG, Jiuqi ZHU, Zumin WANG, Fengbo BAI, Jian GONG
    Journal of Computer Applications    2022, 42 (4): 1011-1020.   DOI: 10.11772/j.issn.1001-9081.2021071262
    Abstract (456)   HTML (58)   PDF (783KB) (286)

    Text sentiment analysis has gradually become an important part of Natural Language Processing (NLP) in the fields of systematic recommendation and acquisition of user sentiment information, as well as public opinion reference for governments and enterprises. The methods in the field of sentiment analysis were compared and summarized through literature research. Firstly, a literature investigation of sentiment analysis methods was carried out along the dimensions of time and method. Then, the main methods and application scenarios of sentiment analysis were summarized and compared. Finally, the advantages and disadvantages of each method were analyzed. The analysis shows that, across different task scenarios, there are three main kinds of sentiment analysis methods: those based on emotion dictionary, those based on machine learning and those based on deep learning, and that methods mixing multiple strategies have become the trend of improvement. The literature investigation also shows that there is still room for improvement in the techniques and methods of text sentiment analysis, and that it has a large market and good development prospects in e-commerce, psychotherapy and public opinion monitoring.

    Network representation learning model based on node attribute bipartite graph
    Le ZHOU, Tingting DAI, Chun LI, Jun XIE, Boce CHU, Feng LI, Junyi ZHANG, Qiao LIU
    Journal of Computer Applications    2022, 42 (8): 2311-2318.   DOI: 10.11772/j.issn.1001-9081.2021060972
    Abstract (456)   HTML (118)   PDF (843KB) (360)

    Reasoning and computation on graph-structured data is an important task. The main challenge of this task is how to represent graph-structured knowledge so that machines can easily understand and use graph-structured data. A comparison of the existing representation learning models shows that models based on random walk methods are likely to ignore the special effect of attributes on the association between nodes. Therefore, a hybrid random walk method based on node adjacency and attribute association was proposed. Firstly, the attribute weights were calculated through the common attribute distribution among adjacent nodes, and the sampling probability from the node to each attribute was obtained. Then, the network information was extracted from adjacent nodes and from non-adjacent nodes with common attributes respectively. Finally, the network representation learning model based on node attribute bipartite graph was constructed, and the node vector representations were obtained by learning from the above sampling sequences. Experimental results on the Flickr, BlogCatalog and Cora public datasets show that the average Micro-F1 of node classification with the node vector representations obtained by the proposed model is 89.38%, which is 2.02 percentage points higher than that of GraphRNA (Graph Recurrent Networks with Attributed random walk) and 21.12 percentage points higher than that of the classical DeepWalk. At the same time, a comparison of different random walk methods shows that increasing the sampling probabilities of attributes that promote node association can improve the information contained in the sampling sequence.

    Transformer based U-shaped medical image segmentation network: a survey
    Journal of Computer Applications    DOI: 10.11772/j.issn.1001-9081.2022040530
    Accepted: 26 July 2022

    Performance interference analysis and prediction for distributed machine learning jobs
    Hongliang LI, Nong ZHANG, Ting SUN, Xiang LI
    Journal of Computer Applications    2022, 42 (6): 1649-1655.   DOI: 10.11772/j.issn.1001-9081.2021061404
    Abstract (411)   HTML (91)   PDF (1121KB) (379)

    By analyzing the problem of job performance interference in distributed machine learning, it is found that performance interference is caused by the uneven allocation of GPU resources, such as memory overload and bandwidth competition. To this end, a mechanism for quickly predicting performance interference between jobs was designed and implemented, which can adaptively predict the degree of job interference according to the given GPU parameters and job types. First, the GPU parameters and interference rates during the operation of distributed machine learning jobs were obtained through experiments, and the influences of various parameters on performance interference were analyzed. Second, several GPU parameter-interference rate models were established by using multiple prediction technologies, and their job interference rate errors were analyzed. Finally, an adaptive job interference rate prediction algorithm was proposed to automatically select the prediction model with the smallest error for a given equipment environment and job set, so as to predict the job interference rates quickly and accurately. Experiments with five commonly used neural network tasks were designed on two GPU devices, and the results were analyzed. The results show that the proposed Adaptive Interference Prediction (AIP) mechanism can quickly complete prediction model selection and performance interference prediction without any pre-assumed information; it takes less than 300 s and achieves a prediction error rate in the range of 2% to 13%, so it can be applied to scenarios such as job scheduling and load balancing.
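
The selection step described here, choosing the candidate prediction model with the smallest error, can be sketched as follows. The candidate models, the single input parameter, the sample values and the use of mean absolute error are all assumptions made for illustration; the abstract does not specify them.

```python
def mean_abs_error(model, samples):
    """Mean absolute error of `model` over (gpu_parameter, interference_rate) pairs."""
    return sum(abs(model(x) - y) for x, y in samples) / len(samples)


def select_model(candidates, samples):
    """Return the name of the candidate with the smallest error, plus all errors."""
    errors = {name: mean_abs_error(m, samples) for name, m in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors


# Hypothetical candidates mapping one GPU parameter to an interference rate.
candidates = {
    "constant": lambda x: 0.10,
    "linear": lambda x: 0.05 + 0.5 * x,
}
# Hypothetical measurements: (GPU parameter value, observed interference rate).
samples = [(0.1, 0.10), (0.2, 0.15), (0.4, 0.25)]

best, errors = select_model(candidates, samples)
```

Because selection is driven only by measured error, the same routine can be rerun for each new device and job set without any prior assumption about which model fits.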

    Popular science text classification model enhanced by knowledge graph
    Wangjing TANG, Bin XU, Meihan TONG, Meihuan HAN, Liming WANG, Qi ZHONG
    Journal of Computer Applications    2022, 42 (4): 1072-1078.   DOI: 10.11772/j.issn.1001-9081.2021071278
    Abstract (391)   HTML (36)   PDF (1056KB) (192)

    Popular science text classification aims to classify popular science articles according to the popular science classification system. Popular science articles often exceed 1 000 words, which makes it hard for a model to focus on key points and causes poor classification performance in traditional models. To address this problem, a long text classification model that combines a knowledge graph to perform two-level screening was proposed to reduce the interference of topic-irrelevant information and improve classification performance. First, a four-step method was used to construct a knowledge graph for the domains of popular science. Then, this knowledge graph was used as a distance monitor to filter out irrelevant information by training sentence filters. Finally, the attention mechanism was used to further filter the information of the filtered sentence set, completing the attention-based topic classification model. Experimental results on the constructed Popular Science Classification Dataset (PSCD) show that the text classification model enhanced with domain knowledge graph information has a higher F1-Score. Compared with the TextCNN model and the BERT (Bidirectional Encoder Representations from Transformers) model, the proposed model increases the F1-Score by 2.88 percentage points and 1.88 percentage points respectively, verifying the effectiveness of the knowledge graph for long text information screening.

    Chinese event detection based on data augmentation and weakly supervised adversarial training
    Ping LUO, Ling DING, Xue YANG, Yang XIANG
    Journal of Computer Applications    2022, 42 (10): 2990-2995.   DOI: 10.11772/j.issn.1001-9081.2021081521
    Abstract (384)   HTML (37)   PDF (720KB) (220)

    The existing event detection models rely heavily on human-annotated data, and supervised deep learning models for event detection task often suffer from over-fitting when there is only limited labeled data. Methods of replacing time-consuming human annotation data with auto-labeled data typically rely on sophisticated pre-defined rules. To address these issues, a BERT (Bidirectional Encoder Representations from Transformers) based Mix-text ADversarial training (BMAD) method for Chinese event detection was proposed. In the proposed method, a weakly supervised learning scene was set on the basis of data augmentation and adversarial learning, and a span extraction model was used to solve event detection task. Firstly, to relieve the problem of insufficient data, various data augmentation methods such as back-translation and Mix-Text were applied to augment data and create weakly supervised learning scene for event detection. And then an adversarial training mechanism was applied to learn with noise and improve the robustness of the whole model. Several experiments were conducted on commonly used real-world dataset Automatic Context Extraction (ACE) 2005. The results show that compared with algorithms such as Nugget Proposal Network (NPN), Trigger-aware Lattice Neural Network (TLNN) and Hybrid-Character-Based Neural Network (HCBNN), the proposed method has the F1 score improved by at least 0.84 percentage points.

    New computing power network architecture and application case analysis
    Zheng DI, Yifan CAO, Chao QIU, Tao LUO, Xiaofei WANG
    Journal of Computer Applications    2022, 42 (6): 1656-1661.   DOI: 10.11772/j.issn.1001-9081.2021061497
    Abstract (376)   HTML (45)   PDF (1584KB) (166)

    With the proliferation of Artificial Intelligence (AI) computing power to the edge of the network and even to terminal devices, the computing power network of end-edge-supercloud collaboration has become the best computing solution. The emerging opportunities have spawned the deep integration of end-edge-supercloud computing and the network. However, several aspects of a complete integrated system remain unsolved, including adaptability, flexibility, and valuability. Therefore, a blockchain-assisted computing power network for ubiquitous AI, named ACPN, was proposed. In ACPN, the end-edge-supercloud collaboration provides the infrastructure for the framework; the computing power resource pool formed by the infrastructure provides safe and reliable computing power for users; the network satisfies users' demands by scheduling resources; and the neural network and execution platform in the framework provide interfaces for AI task execution. At the same time, the blockchain guarantees the reliability of resource transactions and encourages more computing power contributors to join the platform. This framework provides adaptability for users of the computing power network, flexibility for resource scheduling of networked computing power, and valuability for computing power providers. The new computing power network architecture was described clearly through a case.

    Survey of event extraction
    Chunming MA, Xiuhong LI, Zhe LI, Huiru WANG, Dan YANG
    Journal of Computer Applications    2022, 42 (10): 2975-2989.   DOI: 10.11772/j.issn.1001-9081.2021081542
    Abstract (367)   HTML (71)   PDF (3054KB) (268)

    Event extraction is the task of extracting the events that a user is interested in from unstructured information and presenting them to the user in a structured way. It has a wide range of applications in information collection, information retrieval, document synthesis, and question answering. Overall, event extraction algorithms can be divided into four categories: pattern matching algorithms, trigger lexical methods, ontology-based algorithms, and cutting-edge joint model methods. In research, different evaluation methods and datasets can be used as needed, and different event representation methods are also relevant to event extraction research. By task type, meta-event extraction and subject event extraction are the two basic tasks of event extraction. Meta-event extraction has three kinds of methods, based on pattern matching, machine learning and neural network respectively, while subject event extraction has two, based on the event framework and on ontology respectively. Event extraction research has achieved excellent results in single languages such as Chinese and English, but cross-language event extraction still faces many problems. Finally, the related works of event extraction were summarized and future research directions were discussed in order to provide guidelines for subsequent research.

    Lightweight object detection algorithm based on improved YOLOv4
    Zhifeng ZHONG, Yifan XIA, Dongping ZHOU, Yangtian YAN
    Journal of Computer Applications    2022, 42 (7): 2201-2209.   DOI: 10.11772/j.issn.1001-9081.2021050734
    Abstract (355)   HTML (8)   PDF (5719KB) (299)

    The YOLOv4 (You Only Look Once version 4) object detection network has a complex structure and many parameters, requires a high-end configuration for training, and achieves low Frames Per Second (FPS) in real-time detection. To solve these problems, a lightweight object detection algorithm based on YOLOv4, named ML-YOLO (MobileNetv3Lite-YOLO), was proposed. Firstly, MobileNetv3 was used to replace the backbone feature extraction network of YOLOv4, which greatly reduced the number of backbone parameters through the depthwise separable convolution in MobileNetv3. Then, a simplified weighted Bi-directional Feature Pyramid Network (Bi-FPN) structure was used to replace the feature fusion network of YOLOv4, so that the object detection accuracy was optimized by the attention mechanism in Bi-FPN. Finally, the final prediction box was generated through the YOLOv4 decoding algorithm to realize object detection. Experimental results on the VOC (Visual Object Classes) 2007 dataset show that the mean Average Precision (mAP) of the ML-YOLO algorithm reaches 80.22%, which is 3.42 percentage points lower than that of the YOLOv4 algorithm and 2.82 percentage points higher than that of the YOLOv5m algorithm. At the same time, the model size of the ML-YOLO algorithm is only 44.75 MB, which is 199.54 MB smaller than that of the YOLOv4 algorithm and only 2.85 MB larger than that of the YOLOv5m algorithm. These results show that the proposed ML-YOLO model greatly reduces the model size compared with the YOLOv4 model while maintaining high detection accuracy, indicating that the proposed algorithm can meet the lightweight and accuracy requirements of mobile or embedded devices for object detection.
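
The parameter saving attributed to depthwise separable convolution can be checked with a short calculation using the standard parameter-count formulas with bias terms ignored. The layer shape in the test values (3x3 kernel, 256 input channels, 512 output channels) is a hypothetical example and is not a layer taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For the hypothetical 3x3 layer with 256 input and 512 output channels, the standard form needs 1 179 648 weights and the separable form 133 376, a reduction of roughly 8.8 times for that layer alone.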

    Federated learning algorithm for communication cost optimization
    Sai ZHENG, Tianrui LI, Wei HUANG
    Journal of Computer Applications    2023, 43 (1): 1-7.   DOI: 10.11772/j.issn.1001-9081.2021122054
    Abstract (327)   HTML (17)   PDF (934KB) (209)

    Federated Learning (FL) is a machine learning setting that can protect data privacy; however, high communication cost and client heterogeneity hinder its large-scale deployment. To solve these two problems, a federated learning algorithm for communication cost optimization was proposed. First, the server received the generative models from the clients and used them to generate simulated data. Then, the server trained the global model on the simulated data and sent it to the clients, and each client obtained its final model by fine-tuning the global model. The proposed algorithm needs only one round of communication between the clients and the server, and addresses client heterogeneity through the fine-tuning of the client models. Experiments were carried out on the MNIST and CIFAR-10 datasets with 20 clients. The results show that, while ensuring accuracy, the proposed algorithm reduces the amount of communication data to 1/10 of that of the Federated Averaging (FedAvg) algorithm on the MNIST dataset and to 1/100 of that of FedAvg on the CIFAR-10 dataset.

    Survey of clustering based on deep learning
    Yongfeng DONG, Yahan DENG, Yao DONG, Yacong WANG
    Journal of Computer Applications    2022, 42 (4): 1021-1028.   DOI: 10.11772/j.issn.1001-9081.2021071275
    Abstract (324)   HTML (41)   PDF (623KB) (218)

    Clustering is a technique for discovering the internal structure of data and is a basic problem in many data-driven applications. Clustering performance depends largely on the quality of the data representation. In recent years, deep learning has been widely used in clustering tasks because of its powerful feature extraction ability, with the aim of learning better feature representations and significantly improving clustering performance. Firstly, the traditional clustering tasks were introduced. Then, the representative clustering methods based on deep learning were introduced according to network structure, the existing problems were pointed out, and the applications of deep learning based clustering in different fields were presented. Finally, the development of deep learning based clustering was summarized and its prospects were discussed.

    Adversarial example generation method based on image flipping transform
    Bo YANG, Hengwei ZHANG, Zheming LI, Kaiyong XU
    Journal of Computer Applications    2022, 42 (8): 2319-2325.   DOI: 10.11772/j.issn.1001-9081.2021060993
    Abstract (320)   HTML (46)   PDF (1609KB) (202)

    Deep neural networks are vulnerable to adversarial example attacks. Adversarial examples cause deep neural networks to misclassify by adding human-imperceptible perturbations to the original images, which poses a security threat to these networks. Therefore, before deep neural networks are deployed, adversarial attack is an important method for evaluating model robustness. However, under the black-box setting, the attack success rates of adversarial examples need to be improved; that is, the transferability of adversarial examples needs to be increased. To address this issue, an adversarial example generation method based on image flipping transform, namely FT-MI-FGSM (Flipping Transformation Momentum Iterative Fast Gradient Sign Method), was proposed. Firstly, from the perspective of data augmentation, the original input image was flipped randomly in each iteration of the adversarial example generation process. Then, the gradient of the transformed image was calculated. Finally, the adversarial examples were generated based on this gradient, so as to alleviate overfitting during adversarial example generation and improve the transferability of adversarial examples. In addition, the method of attacking ensemble models was used to further enhance transferability. Extensive experiments on the ImageNet dataset demonstrated the effectiveness of the proposed algorithm. Compared with I-FGSM (Iterative Fast Gradient Sign Method) and MI-FGSM (Momentum I-FGSM), FT-MI-FGSM improves the average black-box attack success rate on adversarially trained networks by 26.0 and 8.4 percentage points, respectively, under the ensemble model attack setting.
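
    The three steps above can be written as one update rule. The abstract does not give the formula, so the following is a hedged sketch: it takes the standard MI-FGSM momentum update and inserts a random flipping transform T before the gradient is computed. The symbols (loss J, true label y, decay factor \mu, step size \alpha, perturbation budget \epsilon) are assumptions, not notation from the paper.

```latex
\begin{aligned}
g_{t+1} &= \mu \, g_t +
  \frac{\nabla_x J\bigl(T(x_t^{adv}),\, y\bigr)}
       {\bigl\lVert \nabla_x J\bigl(T(x_t^{adv}),\, y\bigr) \bigr\rVert_1}, \\
x_{t+1}^{adv} &= \operatorname{Clip}_{x}^{\epsilon}
  \bigl\{ x_t^{adv} + \alpha \cdot \operatorname{sign}(g_{t+1}) \bigr\}
\end{aligned}
```

    With T fixed to the identity this reduces to MI-FGSM, and additionally setting \mu = 0 gives I-FGSM, the two baselines named in the abstract.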

    Machine reading comprehension model based on event representation
    Yuanlong WANG, Xiaomin LIU, Hu ZHANG
    Journal of Computer Applications    2022, 42 (7): 1979-1984.   DOI: 10.11772/j.issn.1001-9081.2021050719
    Abstract (313)   HTML (67)   PDF (916KB) (265)

    To truly understand a piece of text, it is very important to grasp the main clues of the original text during reading comprehension. For questions about main clues in machine reading comprehension, a machine reading comprehension method based on event representation was proposed. Firstly, the textual event graph, covering the representation of events, the extraction of event elements and the extraction of event relations, was extracted from the reading material by clue phrases. Secondly, taking into account the time elements and emotional elements of events and the importance of each word in the document, the TextRank algorithm was used to select the events related to the clues. Finally, the answers to the questions were constructed from the selected clue events. Experimental results show that, on a test set composed of 339 collected clue questions, the proposed method outperforms the sentence ranking method based on the TextRank algorithm on the BiLingual Evaluation Understudy (BLEU) and Consensus-based Image Description Evaluation (CIDEr) evaluation indexes. Specifically, the BLEU-4 index is increased by 4.1 percentage points and the CIDEr index is increased by 9 percentage points.

    Time series prediction model based on multimodal information fusion
    Minghui WU, Guangjie ZHANG, Canghong JIN
    Journal of Computer Applications    2022, 42 (8): 2326-2332.   DOI: 10.11772/j.issn.1001-9081.2021061053
    Abstract (312)   HTML (39)   PDF (658KB) (221)

    Traditional single-factor methods cannot make full use of the information related to a time series, so their predictions have poor accuracy and reliability. To address this problem, a time series prediction model based on multimodal information fusion, namely Skip-Fusion, was proposed to fuse the text data and numerical data in multimodal data. Firstly, different types of text data were encoded by the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and one-hot encoding. Then, a single vector representation of the fused multi-text features was obtained by using the pre-trained model based on the global attention mechanism. After that, the obtained single vector representation was aligned with the numerical data in time order. Finally, the text and numerical features were fused through the Temporal Convolutional Network (TCN) model, and the shallow and deep features of the multimodal data were fused again through skip connections. In experiments on a stock price series dataset, the Skip-Fusion model obtains 0.492 on the Root Mean Square Error (RMSE) and 0.930 on the daily Return (R), which are better than the results of the existing single-modal and multimodal fusion models. Experimental results also show that the Skip-Fusion model obtains a goodness of fit of 0.955 on R-squared, indicating that it can effectively carry out multimodal information fusion and has high prediction accuracy and reliability.
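
    Two of the reported measures, RMSE and R-squared, have standard definitions that are easy to state in code. The sketch below uses those textbook definitions; the function names and the sample values are illustrative assumptions and are unrelated to the paper's stock price data.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: sqrt of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up series for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```

    A lower RMSE and an R-squared closer to 1 both indicate a better fit, which is the sense in which the abstract's 0.492 and 0.955 are read.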

    Research on Bloom filter: a survey
    Wendi HUA, Yuan GAO, Meng LYU, Ping XIE
    Journal of Computer Applications    2022, 42 (6): 1729-1747.   DOI: 10.11772/j.issn.1001-9081.2021061392
    Abstract (311)   HTML (22)   PDF (3209KB) (85)

    Bloom Filter (BF) is a binary vector data structure based on a hashing strategy. Because it shares hash collisions, errs in only one direction (false positives but no false negatives) and answers queries in constant time, BF is often used to represent membership and as an “accelerator” for membership query operations. As the best mathematical tool to solve the membership query problem in computer engineering, BF has been widely used and developed in network engineering, storage systems, databases, file systems, distributed systems and other fields. In the past few years, in order to adapt to various hardware environments and application scenarios, a large number of BF variants and optimization schemes based on changing the structure and optimizing the algorithm have appeared. With the development of the big data era, improving the characteristics and operation logic of BF has become an important direction of membership query research.
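
    The basic structure described above fits in a few lines. This is a minimal sketch of a plain Bloom filter, not one of the surveyed variants: the class name, the use of seeded SHA-256 as the hash family, and the sizes in the usage lines are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: k hash positions per item in an m-bit vector."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item: str):
        # Derive k positions by hashing the item with k different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(num_bits=1024, num_hashes=3)
bf.add("alice")
print(bf.might_contain("alice"))
```

    An added item is always reported as present, while an absent item is reported as present only when all of its positions happen to have been set by other items; that is the one-way misjudgment the abstract refers to.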

    Improved federated weighted average algorithm
    Changyin LUO, Junyu WANG, Xuebin CHEN, Chundi MA, Shufen ZHANG
    Journal of Computer Applications    2022, 42 (4): 1131-1136.   DOI: 10.11772/j.issn.1001-9081.2021071264
    Abstract (301)   HTML (13)   PDF (468KB) (142)

    Aiming at the problem that the improved federated average algorithm based on analytic hierarchy process is affected by subjective factors when calculating data quality, an improved federated weighted average algorithm was proposed to process multi-source data from the perspective of data quality. Firstly, the training samples were divided into pre-training samples and pre-testing samples. Then, the accuracy of the initial global model on the pre-training data was used as the quality weight of the data source. Finally, the quality weight was introduced into the federated average algorithm to re-update the weights of the global model. The simulation results show that the model trained by the improved federated weighted average algorithm achieves higher accuracy than the model trained by the traditional federated average algorithm, with improvements of 1.59% and 1.24% on equally divided and unequally divided datasets, respectively. At the same time, compared with the traditional multi-party data retraining method, although the accuracy of the proposed model is slightly reduced, the security of the data and the model is improved.
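
    The final step, introducing a per-source quality weight into the averaging, can be sketched as a normalized weighted mean of client parameters. This is an illustration under assumptions, not the authors' exact update: the function name is hypothetical, parameters are flattened into plain lists, and the weights stand in for the accuracy-based quality weights described above.

```python
def quality_weighted_average(client_params, quality_weights):
    """Average client parameter vectors, weighting each client by its quality."""
    total = sum(quality_weights)
    if total <= 0:
        raise ValueError("quality weights must sum to a positive value")
    dim = len(client_params[0])
    return [
        sum(w * params[i] for params, w in zip(client_params, quality_weights)) / total
        for i in range(dim)
    ]


# Two hypothetical clients; the second data source is trusted three times as much.
clients = [[1.0, 2.0], [3.0, 4.0]]
weights = [1.0, 3.0]
print(quality_weighted_average(clients, weights))
```

    With equal weights this reduces to a plain mean of the client parameters, so the quality weights only shift the global model toward sources that score better.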

    Semi-supervised representation learning method combining graph auto-encoder and clustering
    Hangyuan DU, Sicong HAO, Wenjian WANG
    Journal of Computer Applications    2022, 42 (9): 2643-2651.   DOI: 10.11772/j.issn.1001-9081.2021071354
    Abstract (300)   HTML (50)   PDF (1000KB) (273)

    Node labels are a widespread form of supervision information in complex networks, and they play an important role in network representation learning. Based on this fact, a Semi-supervised Representation Learning method combining Graph Auto-Encoder and Clustering (GAECSRL) was proposed. Firstly, the Graph Convolutional Network (GCN) and the inner product function were used as the encoder and the decoder respectively, and the graph auto-encoder was constructed to form an information dissemination framework. Then, a k-means clustering module was added to the low-dimensional representation generated by the encoder, so that the training process of the graph auto-encoder and the category classification of the nodes formed a self-supervised mechanism. Finally, the category classification of the low-dimensional representation of the network was guided by the discriminant information of the node labels. The network representation generation, category classification, and the training of the graph auto-encoder were built into a unified optimization model, and an effective network representation that integrates node label information was obtained. In the simulation experiment, the GAECSRL method was used for node classification and link prediction tasks. Experimental results show that, compared with DeepWalk, node2vec, learning Graph Representations with global structural information (GraRep), Structural Deep Network Embedding (SDNE) and Planetoid (Predicting labels and neighbors with embeddings transductively or inductively from data), GAECSRL increases the Micro-F1 index by 0.9 to 24.46 percentage points and the Macro-F1 index by 0.76 to 24.20 percentage points in the node classification task; in the link prediction task, GAECSRL increases the AUC (Area Under Curve) index by 0.33 to 9.06 percentage points, indicating that the network representations obtained by GAECSRL effectively improve the performance of node classification and link prediction tasks.
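
    The inner product decoder mentioned above can be sketched in isolation. The abstract does not spell out the decoder, so this follows the common graph auto-encoder form, in which the predicted link probability between two nodes is the sigmoid of the inner product of their embeddings; the function name and the toy embeddings are assumptions.

```python
import math


def inner_product_decoder(embeddings):
    """Return the matrix of sigmoid(z_i . z_j) for all node pairs."""
    def sigmoid(x: float) -> float:
        return 1.0 / (1.0 + math.exp(-x))

    return [
        [sigmoid(sum(a * b for a, b in zip(z_i, z_j))) for z_j in embeddings]
        for z_i in embeddings
    ]


# Toy 2-dimensional embeddings for three nodes (illustrative values only).
z = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
for row in inner_product_decoder(z):
    print([round(p, 3) for p in row])
```

    The decoder has no trainable parameters of its own, so all learning pressure falls on the encoder that produces the embeddings.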

    Network embedding method based on multi-granularity community information
    Jun HU, Zhengkang XU, Li LIU, Fujin ZHONG
    Journal of Computer Applications    2022, 42 (3): 663-670.   DOI: 10.11772/j.issn.1001-9081.2021040790
    Abstract (295)   HTML (57)   PDF (758KB) (237)

    Most of the existing network embedding methods preserve only the local structure information of the network and ignore other potential information in the network. In order to preserve the community information of the network and reflect the multi-granularity characteristics of the network community structure, a network Embedding method based on Multi-Granularity Community information (EMGC) was proposed. Firstly, the network’s multi-granularity community structure was obtained, and the node embedding and the community embedding were initialized. Then, according to the node embedding at the previous level of granularity and the community structure at the current level of granularity, the community embedding was updated and the corresponding node embedding was adjusted. Finally, the node embeddings under different community granularities were concatenated to obtain a network embedding that fuses the community information of different granularities. Experiments were carried out on four real network datasets. Compared with the methods that do not consider community information (DeepWalk, node2vec) and the methods that consider single-granularity community information (ComE, GEMSEC), EMGC generally achieves a better AUC value on link prediction and a better F1 score on node classification. The experimental results show that EMGC can effectively improve the accuracy of subsequent link prediction and node classification.

    Traffic sign detection algorithm based on improved attention mechanism
    Xinyu ZHANG, Sheng DING, Zhipei YANG
    Journal of Computer Applications    2022, 42 (8): 2378-2385.   DOI: 10.11772/j.issn.1001-9081.2021061005
    Abstract (287)   HTML (24)   PDF (1664KB) (197)

    In some scenes, the low resolution, coverage and other environmental factors of traffic signs lead to missed and false detections in object detection tasks. Therefore, a traffic sign detection algorithm based on an improved attention mechanism was proposed. First of all, because damage, lighting and other environmental impacts on traffic signs lower the image resolution and thus limit the image feature information that the network can extract, an attention module was added to the backbone network to enhance the key features of the object area. Secondly, because the local features of adjacent channels in the feature map are correlated owing to the overlap of their receptive fields, a one-dimensional convolution of size k was used to replace the fully connected layer in the channel attention module, so as to aggregate information from different channels and reduce the number of additional parameters. Finally, the receptive field module was introduced in the medium- and small-scale feature layers of the Path Aggregation Network (PANet) to enlarge the receptive field of the feature map, fuse the context information of the object area and improve the network’s ability to detect traffic signs. Experimental results on the CSUST Chinese Traffic Sign Detection Benchmark (CCTSDB) dataset show that the proposed improved You Only Look Once v4 (YOLOv4) algorithm introduces only a small number of parameters and that its detection speed is not much different from that of the original algorithm. The mean Average Precision (mAP) reaches 96.88%, an increase of 1.48%. Compared with the lightweight network YOLOv5s, the single-frame detection time of the proposed algorithm is 10 ms longer, but its mAP is 3.40 percentage points higher and its speed reaches 40 frame/s, indicating that the algorithm fully meets the real-time requirements of object detection.
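
    The second step, a one-dimensional convolution of size k across channels in place of a fully connected layer, can be sketched on a vector of per-channel descriptors. The abstract does not give the module's exact form, so this sketch assumes globally average-pooled channel descriptors, zero padding, an odd kernel size and a sigmoid gate; the function name and all values are hypothetical.

```python
import math


def channel_attention_1d(channel_means, kernel):
    """Gate each channel using a size-k 1D convolution over its neighbours."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel size must be odd so the window can be centred")
    pad = k // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    gates = []
    for c in range(len(channel_means)):
        # Weighted sum over the k channels centred on channel c.
        s = sum(kernel[i] * padded[c + i] for i in range(k))
        gates.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate in (0, 1)
    return gates


# Hypothetical descriptors for four channels and a shared kernel with k = 3.
print(channel_attention_1d([0.2, 1.5, -0.3, 0.8], [0.5, 1.0, 0.5]))
```

    The kernel is shared across all channels, so the layer adds only k weights regardless of the channel count, whereas a fully connected layer grows with the square of the channel count.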

    Decision optimization of traffic scenario problem based on reinforcement learning
    Fei LUO, Mengwei BAI
    Journal of Computer Applications    2022, 42 (8): 2361-2368.   DOI: 10.11772/j.issn.1001-9081.2021061012
    Abstract (282)   HTML (15)   PDF (735KB) (120)

    The traditional reinforcement learning algorithm has limitations in convergence speed and solution accuracy when solving the taxi path planning problem and the traffic signal control problem in traffic scenarios. Therefore, an improved reinforcement learning algorithm was proposed to solve problems of this kind. Firstly, by applying the optimized Bellman equation and the Speedy Q-Learning (SQL) mechanism, and introducing experience pool technology and direct strategy, an improved reinforcement learning algorithm, namely Generalized Speedy Q-Learning with Direct Strategy and Experience Pool (GSQL-DSEP), was proposed. Then, the GSQL-DSEP algorithm was applied to optimize the path length in the taxi path planning decision problem and the total waiting time of vehicles in the traffic signal control problem. The error of the GSQL-DSEP algorithm is at least 18.7% lower than those of algorithms such as Q-learning, SQL, Generalized Speedy Q-Learning (GSQL) and Dyna-Q, the decision path length determined by GSQL-DSEP is at least 17.4% shorter than those determined by the compared algorithms, and the total waiting time of vehicles determined by GSQL-DSEP in the traffic signal control problem is reduced by up to 51.5% compared with those determined by the compared algorithms. Experimental results show that the GSQL-DSEP algorithm has advantages over the compared algorithms in solving traffic scenario problems.
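
    For orientation, the baseline that GSQL-DSEP is compared against can be sketched together with an experience pool. This is plain tabular Q-learning with replay, named as such; it is not the GSQL-DSEP update, whose details the abstract does not give. The function names, the state and action labels, and the hyperparameters are assumptions.

```python
import random
from collections import defaultdict, deque


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One standard Q-learning step toward reward + gamma * max_a Q(s', a)."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


def replay(q, pool, actions, batch_size, rng):
    """Re-apply the update to transitions sampled from the experience pool."""
    batch = rng.sample(list(pool), min(batch_size, len(pool)))
    for state, action, reward, next_state in batch:
        q_update(q, state, action, reward, next_state, actions)


actions = ["left", "right"]
q_table = defaultdict(float)          # unseen state-action pairs start at 0
pool = deque(maxlen=1000)             # bounded experience pool
pool.append(("s0", "right", 1.0, "s1"))
replay(q_table, pool, actions, batch_size=4, rng=random.Random(0))
print(q_table[("s0", "right")])
```

    Replaying stored transitions lets each interaction with the environment contribute to several updates, which is the usual motivation for an experience pool.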

    Real‑time detection method of traffic information based on lightweight YOLOv4
    Keyou GUO, Xue LI, Min YANG
    Journal of Computer Applications    2023, 43 (1): 74-80.   DOI: 10.11772/j.issn.1001-9081.2021101849
    Abstract (276)   HTML (5)   PDF (3019KB) (204)

    Aiming at the problem of vehicle object detection in daily road scenes, a real-time detection method of traffic information based on lightweight YOLOv4 (You Only Look Once version 4) was proposed. Firstly, a multi-scene and multi-period vehicle object dataset was constructed and preprocessed by the K-means++ algorithm. Secondly, a lightweight YOLOv4 detection model was proposed, in which the backbone network was replaced by MobileNet-v3 to reduce the number of model parameters, and depthwise separable convolution was introduced to replace the standard convolution in the original network. Finally, in combination with the label smoothing and cosine annealing algorithms, the activation function Leaky Rectified Linear Unit (LeakyReLU) was used to replace the original activation function in the shallow layers of MobileNet-v3 in order to improve the convergence of the model. Experimental results show that the lightweight YOLOv4 has a weight file of 56.4 MB, a detection rate of 85.6 FPS (Frames Per Second), and a detection precision of 93.35%, verifying that the proposed method can provide a reference for real-time traffic information detection and its applications in real road scenes.
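
    Two of the training techniques named above, cosine annealing and label smoothing, have widely used textbook forms that can be sketched directly. The abstract does not state the schedule length, learning rates or smoothing factor, so every value and function name below is an assumption for illustration.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Decay the learning rate from lr_max to lr_min along a half cosine."""
    cos_term = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos_term)


def smooth_labels(one_hot, epsilon=0.1):
    """Move epsilon of the probability mass from the true class to all classes."""
    num_classes = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / num_classes for y in one_hot]


# Hypothetical 100-step schedule and a 4-class target.
for step in (0, 50, 100):
    print(step, cosine_annealing_lr(step, 100, lr_max=0.01))
print(smooth_labels([0.0, 1.0, 0.0, 0.0]))
```

    The smoothed target still sums to 1, but the true class no longer demands a probability of exactly 1, which discourages over-confident predictions.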

    Animation video generation model based on Chinese impressionistic style transfer
    Wentao MAO, Guifang WU, Chao WU, Zhi DOU
    Journal of Computer Applications    2022, 42 (7): 2162-2169.   DOI: 10.11772/j.issn.1001-9081.2021050836
    Abstract (268)   HTML (4)   PDF (5691KB) (60)

    At present, the Generative Adversarial Network (GAN) has been used for image animation style transformation. However, most of the existing GAN-based animation generation models focus on extracting and generating the realistic styles of Japanese and American animations. Very little attention is paid to the transfer of the impressionistic style in Chinese-style animations, which limits the application of GAN in the domestic animation production market. To solve this problem, a new Chinese-style animation GAN model, namely Chinese Cartoon GAN (CCGAN), was proposed for the automatic generation of animation videos with Chinese impressionistic style by integrating Chinese impressionistic style into the GAN model. Firstly, by adding inverted residual blocks into the generator, a lightweight deep neural network model was constructed to reduce the computational cost of video generation. Secondly, in order to extract and transfer the characteristics of Chinese impressionistic style, such as sharp image edges, abstract content structure and stroke lines with ink texture, the gray-scale style loss and color reconstruction loss were constructed in the generator to constrain the high-level semantic consistency in style between the real images and the Chinese-style sample images. Moreover, in the discriminator, the gray-scale adversarial loss and edge-promoting adversarial loss were constructed to constrain the reconstructed image to maintain the same edge characteristics as the sample images. Finally, the Adam algorithm was used to minimize the above loss functions to realize style transfer, and the reconstructed images were combined into video. Experimental results show that, compared with current representative style transfer models such as CycleGAN and CartoonGAN, the proposed CCGAN can effectively learn the Chinese impressionistic style from Chinese-style animations such as Chinese Choir and significantly reduce the computational cost, indicating that CCGAN is suitable for the rapid generation of large quantities of animation videos.

    Ship detection algorithm based on improved RetinaNet
    Wenjun FAN, Shuguang ZHAO, Lizheng GUO
    Journal of Computer Applications    2022, 42 (7): 2248-2255.   DOI: 10.11772/j.issn.1001-9081.2021050831
    Abstract (265)   HTML (6)   PDF (4946KB) (76)   PDF (mobile) (3371KB) (48)

    At present, target detection technology based on deep learning has achieved remarkable results in ship detection in Synthetic Aperture Radar (SAR) images. However, the detection of small target ships and densely arranged ships near shore remains poor. To solve this problem, a new ship detection algorithm based on improved RetinaNet was proposed. On the basis of the traditional RetinaNet algorithm, firstly, the convolution in the residual blocks of the feature extraction network was changed to grouped convolution, thereby increasing the network width and improving the feature extraction ability of the network. Then, the attention mechanism was added in the last two stages of the feature extraction network to make the network focus more on the target area and improve the target detection ability. Finally, Soft Non-Maximum Suppression (Soft-NMS) was added to the algorithm to reduce the missed detection rate for densely arranged ships near shore. Experimental results on the High-Resolution SAR Images Dataset (HRSID) and the SAR Ship Detection Dataset (SSDD) show that the proposed algorithm effectively improves the detection of small target ships and near-shore ships, and is superior in detection precision and speed to current excellent object detection models such as Faster Region-based Convolutional Neural Network (R-CNN), You Only Look Once version 3 (YOLOv3) and CenterNet.
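
    The reason Soft-NMS helps with densely arranged ships can be seen in a small sketch. Instead of deleting every box that overlaps the current best detection, it lowers the overlapping box's score. The version below uses the Gaussian decay variant; the abstract does not say which variant or parameters the authors used, so sigma, the score threshold, the function names and the sample boxes are assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of discarding boxes."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in candidates:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:  # drop only near-zero scores
                rescored.append((box, score))
        candidates = rescored
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for box, score in soft_nms(boxes, scores):
    print(box, round(score, 3))
```

    In this example the second box overlaps the first heavily, so its score is reduced rather than removed; with hard NMS at a typical IoU threshold it would be discarded, which is how adjacent ships get missed.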

    Solving dynamic traveling salesman problem by deep reinforcement learning
    Haojie CHEN, Jiangting FAN, Yong LIU
    Journal of Computer Applications    2022, 42 (4): 1194-1200.   DOI: 10.11772/j.issn.1001-9081.2021071253
    Abstract (264)   HTML (9)   PDF (795KB) (113)

    Designing a unified solution to combinatorial optimization problems that does not rely on hand-designed heuristic algorithms has become a research hotspot in the field of machine learning. At present, mature technologies mainly target static combinatorial optimization problems, while combinatorial optimization problems with dynamic changes are not fully solved. To address this, a lightweight model called Dy4TSP (Dynamic model for Traveling Salesman Problems) was proposed, which combines the multi-head-attention mechanism with distributed reinforcement learning to solve the traveling salesman problem on a dynamic graph. Firstly, the node representation vector from the graph convolution neural network was processed by the prediction network based on the multi-head-attention mechanism. Then, the distributed reinforcement learning algorithm was used to quickly predict the possibility that each node in the graph was output as the optimal solution, and the optimal solution space of the problems under different possibilities was comprehensively explored. Finally, the action decision sequence that could meet the specific reward function in real time was generated by the trained model. The model was evaluated on three typical combinatorial optimization problems, and the experimental results show that the solution qualities of the proposed model are 0.15 to 0.37 units higher than those of the open source solver LKH3 (Lin-Kernighan-Helsgaun 3), and are significantly better than those of the latest algorithms such as Graph Attention Network with Edge Embedding (EGATE). The proposed model can reach an optimal path gap of 0.1 to 1.05 in other dynamic traveling salesman problems, and the results are slightly better.

    Partially explainable non-negative matrix tri-factorization algorithm based on prior knowledge
    Lu CHEN, Xiaoxia ZHANG, Hong YU
    Journal of Computer Applications    2022, 42 (3): 671-675.   DOI: 10.11772/j.issn.1001-9081.2021040927
    Abstract (260)   HTML (25)   PDF (600KB) (156)

    Non-negative Matrix Tri-Factorization (NMTF) is an important part of the latent factor model. Because this algorithm decomposes the original data matrix into three mutually constrained latent factor matrices, it has been widely used in research fields such as recommender systems and transfer learning. However, there has been no research on the interpretability of non-negative matrix tri-factorization. From this view, by regarding user comment text as prior knowledge, the Partially Explainable Non-negative Matrix Tri-Factorization (PE-NMTF) algorithm was designed based on prior knowledge. Firstly, sentiment analysis technology was used to extract the emotional polarity preferences from user comment text. Then, the objective function and updating formula of the non-negative matrix tri-factorization algorithm were changed to embed the prior knowledge into the algorithm. Finally, a large number of experiments were carried out on the Yelp and Amazon datasets for the cold start task of the recommender system and on the AwA and CUB datasets for the image zero-shot task, comparing the proposed algorithm with the non-negative matrix factorization and non-negative matrix tri-factorization algorithms. The experimental results show that the proposed algorithm performs well on RMSE (Root Mean Square Error), NDCG (Normalized Discounted Cumulative Gain), NMI (Normalized Mutual Information), and ACC (ACCuracy), verifying the feasibility and effectiveness of non-negative matrix tri-factorization with prior knowledge.
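
    For reference, the decomposition into three latent factor matrices is usually written as the following optimization problem. This is the standard unmodified NMTF objective, given as a sketch; the abstract does not state how PE-NMTF changes it, and the symbols (data matrix X, factors F, S, G, and their dimensions) are assumptions rather than the paper's notation.

```latex
\min_{F \ge 0,\; S \ge 0,\; G \ge 0}
  \bigl\lVert X - F S G^{\top} \bigr\rVert_F^2,
\qquad
X \in \mathbb{R}^{m \times n},\;
F \in \mathbb{R}^{m \times k},\;
S \in \mathbb{R}^{k \times l},\;
G \in \mathbb{R}^{n \times l}
```

    In a recommender setting, F and G are typically read as user and item factors, while S links the two latent spaces.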

    Review of mechanical fault diagnosis technology based on convolutional neural network
    Zumin WANG, Zhihao ZHANG, Jing QIN, Changqing JI
    Journal of Computer Applications    2022, 42 (4): 1036-1043.   DOI: 10.11772/j.issn.1001-9081.2021071266
    Abstract (259)   HTML (15)   PDF (532KB) (148)

    Traditional mechanical fault diagnosis methods have difficulty dealing with the uncertainty of manual feature extraction, so a large number of deep learning feature extraction methods have been proposed, greatly promoting the development of mechanical fault diagnosis. As a typical representative of deep learning, the convolutional neural network has made significant progress in image classification, target detection, image semantic segmentation and other fields, and there is also a large body of literature on it in the field of mechanical fault diagnosis. To further understand how the convolutional neural network is applied to mechanical fault diagnosis, on the basis of the published literature, the relevant theories of the convolutional neural network were first briefly introduced, and then its applications in mechanical fault diagnosis were summarized from aspects such as data input type, transfer learning, and prediction. Finally, the development directions of the convolutional neural network and its applications in mechanical fault diagnosis were discussed.

    Knowledge graph embedding model based on improved Inception structure
    Xiaopeng YU, Ruhan HE, Jin HUANG, Junjie ZHANG, Xinrong HU
    Journal of Computer Applications    2022, 42 (4): 1065-1071.   DOI: 10.11772/j.issn.1001-9081.2021071265
    Abstract (254)   HTML (23)   PDF (570KB) (84)

    Knowledge Graph Embedding (KGE) maps entities and relationships into a low-dimensional continuous vector space and uses machine learning methods to implement relational data applications such as knowledge analysis, reasoning, and completion. In methods represented by ConvE (Convolution Embedding), a Convolutional Neural Network (CNN) is applied to knowledge graph embedding to capture the interaction information of entities and relationships, but standard convolution has insufficient ability to capture feature interaction information, and its feature expression ability is low. Aiming at the problem of insufficient feature interaction ability, an improved Inception structure was proposed, based on which a knowledge graph embedding model named InceE was constructed. Firstly, hybrid dilated convolution was used to replace standard convolution to improve the ability to capture feature interaction information. Secondly, the residual network structure was used to reduce the loss of feature information. Experiments were carried out on the Kinship, FB15k and WN18 datasets to verify the effectiveness of InceE on link prediction. Compared with the ArcE and QuatRE models on the Kinship and FB15k datasets, the Hit@1 of InceE increased by 1.6 and 1.5 percentage points; compared with ConvE on the three datasets, the Hit@1 of InceE increased by 6.3, 20.8, and 1.0 percentage points. The experimental results show that InceE has a stronger ability to capture feature interaction information.
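
    The Hit@1 figures quoted above are instances of the Hits@k link prediction metric, which is simple to compute once each test triple has a rank for its correct entity. The sketch below uses the common definition; the function name and the sample ranks are made up for illustration and are not results from the paper.

```python
def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Rank of the correct entity for five hypothetical test triples (1 is best).
ranks = [1, 3, 2, 1, 10]
print(hits_at_k(ranks, 1), hits_at_k(ranks, 3))
```

    Hit@1 is the strictest setting, since only a correct entity ranked first counts, so gains on it reflect better top-ranked predictions rather than merely better candidate lists.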

    Handwritten English text recognition based on convolutional neural network and Transformer
    Xianjie ZHANG, Zhiming ZHANG
    Journal of Computer Applications    2022, 42 (8): 2394-2400.   DOI: 10.11772/j.issn.1001-9081.2021091564
    Abstract (248)   HTML (32)   PDF (703KB) (161)

    Handwritten text recognition technology can transcribe handwritten documents into editable digital documents. However, due to varied writing styles, ever-changing document structures and the low accuracy of recognition based on character segmentation, handwritten English text recognition based on neural networks still faces many challenges. To solve these problems, a handwritten English text recognition model based on Convolutional Neural Network (CNN) and Transformer was proposed. Firstly, CNN was used to extract features from the input image. Then, the features were input into the Transformer encoder to obtain a prediction for each frame of the feature sequence. Finally, the Connectionist Temporal Classification (CTC) decoder was used to obtain the final prediction result. Extensive experiments were conducted on the public Institut für Angewandte Mathematik (IAM) handwritten English word dataset. Experimental results show that the model obtains a Character Error Rate (CER) of 3.60% and a Word Error Rate (WER) of 12.70%, which verifies the feasibility of the proposed model.
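
    The last pipeline step and the two reported metrics can be sketched in a few lines. The code below assumes greedy (best-path) CTC decoding, which is one common choice; the abstract does not say which decoding strategy the authors used. The function names, the blank index 0 and the toy alphabet are hypothetical. CER and WER are computed as edit distance divided by reference length, over characters and words respectively.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    decoded = []
    previous = None
    for index in best:
        # A symbol is emitted only when it differs from the previous frame
        # and is not the blank symbol.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            current_row.append(min(
                previous_row[j] + 1,                               # deletion
                current_row[j - 1] + 1,                            # insertion
                previous_row[j - 1] + (ref_item != hyp_item),      # substitution
            ))
        previous_row = current_row
    return previous_row[-1]


def error_rate(reference, hypothesis):
    """Edit distance normalised by reference length (CER on strings, WER on word lists)."""
    return edit_distance(reference, hypothesis) / len(reference)


def cer(reference_text, hypothesis_text):
    return error_rate(reference_text, hypothesis_text)


def wer(reference_text, hypothesis_text):
    return error_rate(reference_text.split(), hypothesis_text.split())
```

    For example, frames whose best indices are 1, 1, 0, 1, 2, 2, 0 with the alphabet `["", "a", "b"]` decode to "aab": the blank between the two runs of index 1 is what allows the repeated letter to survive the merge step.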

    News topic text classification method based on BERT and feature projection network
    Haifeng ZHANG, Cheng ZENG, Lie PAN, Rusong HAO, Chaodong WEN, Peng HE
    Journal of Computer Applications    2022, 42 (4): 1116-1124.   DOI: 10.11772/j.issn.1001-9081.2021071257
    Abstract (247)   HTML (25)   PDF (1536KB) (125)

    To address the lack of standard words, fuzzy semantics and feature sparsity in news topic text, a news topic text classification method based on Bidirectional Encoder Representations from Transformers (BERT) and Feature Projection network (FPnet) was proposed. The method has two implementation modes. In mode 1, features were extracted by multiple fully connected layers from the BERT model's output for the news topic text, and the final extracted text features were purified by the feature projection method, thereby strengthening the classification effect. In mode 2, the feature projection network was fused into the hidden layers inside the BERT model, so that the classification features were enhanced and purified through hidden-layer feature projection. Experimental results on the Toutiao, Sohu News, THUCNews-L and THUCNews-S datasets show that both modes outperform the baseline BERT method in accuracy and macro-averaged F1 value, with the highest accuracies reaching 86.96%, 86.17%, 94.40% and 93.73% respectively, which proves the feasibility and effectiveness of the proposed method.
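
    The two evaluation metrics named in the abstract above have standard definitions, sketched below in plain Python. The function names and the toy labels in the usage note are illustrative assumptions; nothing here reproduces the paper's data or results. Macro-averaged F1 computes F1 per class and takes the unweighted mean, so rare news topics weigh as much as frequent ones.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)
```

    With true labels `["a", "a", "a", "b"]` and predictions `["a", "a", "b", "b"]`, class "a" has F1 = 0.8 and class "b" has F1 = 2/3, so the macro-averaged F1 is about 0.733 while accuracy is 0.75.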

    Multi-scale object detection algorithm based on improved YOLOv3
    Liying ZHANG, Chunjiang PANG, Xinying WANG, Guoliang LI
    Journal of Computer Applications    2022, 42 (8): 2423-2431.   DOI: 10.11772/j.issn.1001-9081.2021060984
    Abstract (247)   HTML (14)   PDF (1714KB) (143)

    To further improve the speed and precision of multi-scale object detection, and to address the missed, false and repeated detections caused by small objects, an object detection algorithm based on improved You Only Look Once v3 (YOLOv3) was proposed to realize automatic detection of multi-scale objects. Firstly, in the feature extraction network, the network structure was improved and the attention mechanism was introduced into the spatial dimensions of the residual module to focus on small objects. Then, Dense Convolutional Network (DenseNet) was used to fully integrate the shallow information of the network, and depthwise separable convolution was used to replace the normal convolution of the backbone network, thereby reducing the number of model parameters and improving the detection speed. In the feature fusion network, bidirectional fusion of shallow and deep features was realized through the bidirectional feature pyramid structure, and the 3-scale prediction was changed to 4-scale prediction, which improved the learning ability of multi-scale features. For the loss function, Generalized Intersection over Union (GIoU) was selected, so that the precision of identifying objects was increased and the object miss rate was reduced. Experimental results show that on Pascal VOC datasets, the mean Average Precision (mAP) of the improved YOLOv3 algorithm reaches 83.26%, which is 5.89 percentage points higher than that of the original YOLOv3 algorithm, and the detection speed of the improved algorithm reaches 22.0 frames/s. On the Common Objects in COntext (COCO) dataset, the mAP of the improved algorithm is 3.28 percentage points higher than that of the original YOLOv3 algorithm. At the same time, the mAP of the algorithm in multi-scale object detection has been improved, which verifies the effectiveness of the object detection algorithm based on the improved YOLOv3.
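
    The GIoU loss mentioned above follows a standard definition: GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the smallest axis-aligned box enclosing both boxes A and B, and the loss is 1 - GIoU. The sketch below assumes boxes given as `(x1, y1, x2, y2)` tuples with positive area; the function names are hypothetical and the code is not taken from the paper's implementation.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = area_a + area_b - intersection

    # Smallest box enclosing both inputs.
    enclosing = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = intersection / union
    return iou - (enclosing - union) / enclosing


def giou_loss(box_a, box_b):
    """Loss in [0, 2]: 0 for identical boxes, growing as the boxes move apart."""
    return 1.0 - giou(box_a, box_b)
```

    Unlike plain IoU, which is 0 for every pair of non-overlapping boxes, GIoU keeps decreasing as disjoint boxes move further apart, so the loss still provides a training signal when a predicted box misses its target entirely.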

2023 Vol.43 No.1

Current Issue
Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editors: SHEN Hengtao, XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803