Most Downloaded Articles

    In last 3 years
    Knowledge graph survey: representation, construction, reasoning and knowledge hypergraph theory
    TIAN Ling, ZHANG Jinchuan, ZHANG Jinhao, ZHOU Wangtao, ZHOU Xue
    Journal of Computer Applications    2021, 41 (8): 2161-2186.   DOI: 10.11772/j.issn.1001-9081.2021040662
    Abstract (2852)      PDF (2811KB) (3768)
    Knowledge Graph (KG) strongly supports research on knowledge-driven artificial intelligence. With this in mind, the existing technologies of knowledge graph and knowledge hypergraph were analyzed and summarized. First, starting from the definition and development history of knowledge graph, the classification and architecture of knowledge graph were introduced. Second, the existing knowledge representation and storage methods were explained. Then, based on the construction process of knowledge graph, several knowledge graph construction techniques were analyzed. In particular, for knowledge reasoning, an important part of knowledge graph, three typical approaches were analyzed: logic rule-based, embedding representation-based, and neural network-based. Furthermore, the research progress of knowledge hypergraph was introduced along with heterogeneous hypergraph. To effectively present and extract hyper-relational characteristics and to realize the modeling of hyper-relational data as well as fast knowledge reasoning, a three-layer architecture of knowledge hypergraph was proposed. Finally, the typical application scenarios of knowledge graph and knowledge hypergraph were summarized, and future research directions were discussed.
    Summarization of natural language generation
    LI Xueqing, WANG Shi, WANG Zhujun, ZHU Junwu
    Journal of Computer Applications    2021, 41 (5): 1227-1235.   DOI: 10.11772/j.issn.1001-9081.2020071069
    Abstract (2634)      PDF (1165KB) (3677)
    Natural Language Generation (NLG) technologies use artificial intelligence and linguistic methods to automatically generate understandable natural language texts. NLG reduces the difficulty of communication between humans and computers, is widely used in fields such as machine news writing and chatbots, and has become one of the research hotspots of artificial intelligence. Firstly, the current mainstream methods and models of NLG were listed, and their advantages and disadvantages were compared in detail. Then, for three NLG technologies (text-to-text, data-to-text and image-to-text), the application fields, existing problems and current research progress were summarized and analyzed respectively. Furthermore, the common evaluation methods for these generation technologies and their scopes of application were described. Finally, the development trends and research difficulties of NLG technologies were given.
    Review of event causality extraction based on deep learning
    WANG Zhujun, WANG Shi, LI Xueqing, ZHU Junwu
    Journal of Computer Applications    2021, 41 (5): 1247-1255.   DOI: 10.11772/j.issn.1001-9081.2020071080
    Abstract (2824)      PDF (1460KB) (3326)
    Causality extraction is a relation extraction task in Natural Language Processing (NLP) that mines event pairs with causality from text by constructing an event graph, and it plays an important role in applications in finance, security, biology and other fields. Firstly, concepts such as event extraction and causality were introduced, and the evolution of mainstream methods and the common datasets of causality extraction were described. Then, the current mainstream causality extraction models were listed. Based on a detailed analysis of pipeline-based models and joint extraction models, the advantages and disadvantages of the various methods and models were compared. Furthermore, the experimental performance and related experimental data of the models were summarized and analyzed. Finally, the research difficulties and key future research directions of causality extraction were given.
    Review of pre-trained models for natural language processing tasks
    LIU Ruiheng, YE Xia, YUE Zengying
    Journal of Computer Applications    2021, 41 (5): 1236-1246.   DOI: 10.11772/j.issn.1001-9081.2020081152
    Abstract (878)      PDF (1296KB) (3000)
    In recent years, deep learning technology has developed rapidly. In Natural Language Processing (NLP) tasks, as text representation technology has risen from the word level to the document level, unsupervised pre-training on large-scale corpora has been proven to effectively improve the performance of models on downstream tasks. Firstly, following the development of text feature extraction technology, typical models were analyzed at the word level and the document level. Secondly, the research status of current pre-trained models was analyzed in terms of two stages, the pre-training target task and the downstream application, and the characteristics of the representative models were summarized. Finally, the main challenges faced by the development of pre-trained models were summarized and future prospects were discussed.
    Survey of communication overhead of federated learning
    Xinyuan QIU, Zecong YE, Xiaolong CUI, Zhiqiang GAO
    Journal of Computer Applications    2022, 42 (2): 333-342.   DOI: 10.11772/j.issn.1001-9081.2021020232
    Abstract (1798)   HTML (289)    PDF (1356KB) (2320)

    Federated learning was proposed to resolve the conflict between the demand for data sharing and the requirements of privacy protection. As a distributed machine learning approach, federated learning requires a large number of model parameters to be exchanged between the participants and the central server, resulting in high communication overhead. At the same time, federated learning is increasingly deployed on mobile devices with limited communication bandwidth and limited power, and the limited network bandwidth together with the sharply rising number of clients makes the communication bottleneck worse. To address this bottleneck, the basic workflow of federated learning was analyzed first. Then, from the perspective of methodology, three mainstream types of methods, based respectively on reducing the frequency of model updates, model compression and client selection, as well as special methods such as model partition, were introduced, and an in-depth comparative analysis of specific optimization schemes was carried out. Finally, the development trends of research on federated learning communication overhead were summarized and discussed.
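
    Model compression, one of the method families named in this abstract, can be illustrated with top-k sparsification of a client update. The sketch below is only an illustration under assumptions: the function names `top_k_sparsify` and `densify`, the flat-list update format, and the choice of top-k itself are hypothetical and are not taken from the surveyed paper.

```python
def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries of a flat update as {index: value}."""
    if k <= 0:
        return {}
    # Rank indices by descending absolute value; ties go to the lower index.
    order = sorted(range(len(update)), key=lambda i: (-abs(update[i]), i))
    return {i: update[i] for i in sorted(order[:k])}


def densify(sparse, length):
    """Rebuild a dense vector on the server, filling dropped entries with zero."""
    dense = [0.0] * length
    for i, v in sparse.items():
        dense[i] = v
    return dense


# Hypothetical client update with five parameters; only two are transmitted.
update = [0.02, -0.9, 0.1, 0.5, -0.03]
sparse = top_k_sparsify(update, k=2)
restored = densify(sparse, len(update))
```

    Sending only `sparse` instead of `update` trades a small approximation error for fewer transmitted values per round; practical schemes often accumulate the dropped residual locally, which this sketch omits.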

    Federated learning survey: concepts, technologies, applications and challenges
    Tiankai LIANG, Bi ZENG, Guang CHEN
    Journal of Computer Applications    2022, 42 (12): 3651-3662.   DOI: 10.11772/j.issn.1001-9081.2021101821
    Abstract (2592)   HTML (160)    PDF (2464KB) (1781)

    Against the background of growing emphasis on data right confirmation and privacy protection, federated learning, as a new machine learning paradigm, can address the problems of data islands and privacy protection without exposing the data of any participant. Since modeling methods based on federated learning have become mainstream and achieved good results, it is worthwhile to summarize and analyze the concepts, technologies, applications and challenges of federated learning. Firstly, the development process of machine learning and the inevitability of the emergence of federated learning were elaborated, and the definition and classification of federated learning were given. Secondly, three federated learning methods currently recognized by the industry (horizontal federated learning, vertical federated learning and federated transfer learning) were introduced and analyzed. Thirdly, concerning the privacy protection issue of federated learning, the existing common privacy protection technologies were summarized. In addition, the recent mainstream open-source frameworks were introduced and compared, and the application scenarios of federated learning were given. Finally, the challenges and future research directions of federated learning were discussed.

    Data augmentation method based on improved deep convolutional generative adversarial networks
    GAN Lan, SHEN Hongfei, WANG Yao, ZHANG Yuejin
    Journal of Computer Applications    2021, 41 (5): 1305-1313.   DOI: 10.11772/j.issn.1001-9081.2020071059
    Abstract (1067)      PDF (1499KB) (1543)
    In order to ease the difficulty of training on small sample data in deep learning and to increase the training efficiency of DCGAN (Deep Convolutional Generative Adversarial Network), an improved DCGAN algorithm was proposed to augment small sample data. In the method, the Wasserstein distance was first used to replace the loss in the original model. Then, spectral normalization was added to the generation network and the discrimination network to obtain a stable network structure. Finally, the optimal noise input dimension of the samples was obtained by maximum likelihood estimation and experimental estimation, so that the generated samples became more diverse. Experimental results on three datasets (MNIST, CelebA and Cartoon) indicated that the improved DCGAN could generate samples with higher definition and a higher recognition rate than the original DCGAN. In particular, the average recognition rates on these datasets were improved by 8.1%, 16.4% and 16.7% respectively, and several definition evaluation indices on the datasets increased to different degrees, suggesting that the method can effectively realize small sample data augmentation.
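
    The replacement of the original GAN loss by a Wasserstein-distance loss can be sketched as follows. This is a minimal illustration assuming the standard Wasserstein GAN formulation, in which the discriminator acts as a critic that outputs unbounded scores; the paper's exact loss and training details are not given in the abstract, and the function names and sample scores here are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Critic minimizes E[D(fake)] - E[D(real)], i.e. scores real samples higher."""
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """Generator minimizes -E[D(fake)], i.e. pushes critic scores of fakes up."""
    return -mean(fake_scores)


# Hypothetical critic outputs for one mini-batch.
real_scores = [0.8, 1.2, 1.0]
fake_scores = [-0.5, 0.1, -0.2]
d_loss = critic_loss(real_scores, fake_scores)
g_loss = generator_loss(fake_scores)
```

    The negated critic loss estimates the Wasserstein distance only if the critic is kept approximately 1-Lipschitz; spectral normalization, which the abstract also mentions, is one common way to enforce that constraint.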
    Survey of unmanned aerial vehicle cooperative control
    MA Ziyu, HE Ming, LIU Zujun, GU Lingfeng, LIU Jintao
    Journal of Computer Applications    2021, 41 (5): 1477-1483.   DOI: 10.11772/j.issn.1001-9081.2020081314
    Abstract (669)      PDF (1364KB) (1482)
    Unmanned Aerial Vehicle (UAV) cooperative control means that a group of UAVs, relying on inter-aircraft communication and with swarm intelligence at the core, complete a common mission through rational division of labor and cooperation. A UAV swarm is a multi-agent system in which many UAVs with a certain degree of independence carry out various tasks based on local rules. Compared with a single UAV, a UAV swarm has great advantages such as high efficiency, high flexibility and high reliability. In view of the latest developments of UAV cooperative control technology in recent years, firstly, the application prospects of multi-UAV technology were illustrated with examples from civil and military use. Then, the differences and development statuses of the three mainstream cooperative control methods (consensus control, flocking control and formation control) were compared and analyzed. Finally, some suggestions on delay, obstacle avoidance and endurance in cooperative control were given to support future research and development of UAV cooperative control.
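
    Consensus control, the first of the three methods named in this abstract, can be illustrated with the classical discrete-time averaging rule, in which each agent moves toward the states of its neighbors. The sketch below is an assumption-laden toy: the four-UAV line topology, the scalar state (for example an altitude) and the step size are hypothetical and not drawn from the paper.

```python
def consensus_step(states, neighbors, step_size):
    """One synchronous update: x_i <- x_i + step_size * sum_j (x_j - x_i)."""
    return [
        x_i + step_size * sum(states[j] - x_i for j in neighbors[i])
        for i, x_i in enumerate(states)
    ]


# Hypothetical undirected line topology 0-1-2-3 using only local links.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
states = [0.0, 10.0, 20.0, 30.0]  # e.g. initial altitudes in metres

# The step size is kept below 1 / (maximum degree) = 0.5 so the iteration converges.
for _ in range(200):
    states = consensus_step(states, neighbors, step_size=0.3)
```

    Because the links are symmetric, the sum of the states is preserved at every step, so the agents agree on the average of their initial values using only neighbor-to-neighbor information.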
    Review of spatio-temporal trajectory sequence pattern mining methods
    KANG Jun, HUANG Shan, DUAN Zongtao, LI Yixiu
    Journal of Computer Applications    2021, 41 (8): 2379-2385.   DOI: 10.11772/j.issn.1001-9081.2020101571
    Abstract (945)      PDF (1204KB) (1475)
    With the rapid development of global positioning technology and mobile communication technology, huge amounts of trajectory data have appeared. These data truly reflect the moving patterns and behavior characteristics of moving objects in the spatio-temporal environment, and they contain a wealth of information of important application value for fields such as urban planning, traffic management, service recommendation and location prediction. The applications of spatio-temporal trajectory data in these fields usually rely on sequence pattern mining of the data. Spatio-temporal trajectory sequence pattern mining aims to find frequently occurring sequence patterns in a spatio-temporal trajectory dataset, such as location patterns (frequent trajectories, hot spots), activity periodic patterns and semantic behavior patterns, so as to mine hidden information in the spatio-temporal data. The research progress of spatio-temporal trajectory sequence pattern mining in recent years was summarized. Firstly, the data characteristics and applications of spatio-temporal trajectory sequences were introduced. Then, the mining process of spatio-temporal trajectory patterns was described, and the research situation in this field was introduced from the perspectives of mining location patterns, periodic patterns and semantic patterns based on spatio-temporal trajectory sequences. Finally, the problems existing in the current spatio-temporal trajectory sequence pattern mining methods were elaborated, and the future development trends of these methods were discussed.
    Path planning method of unmanned aerial vehicle based on chaos sparrow search algorithm
    TANG Andi, HAN Tong, XU Dengwu, XIE Lei
    Journal of Computer Applications    2021, 41 (7): 2128-2136.   DOI: 10.11772/j.issn.1001-9081.2020091513
    Abstract (1036)      PDF (1479KB) (1431)
    Focusing on the issues of heavy computation and difficult convergence in Unmanned Aerial Vehicle (UAV) path planning, a path planning method based on Chaos Sparrow Search Algorithm (CSSA) was proposed. Firstly, a two-dimensional task space model and a path cost model were established, and the path planning problem was transformed into a multi-dimensional function optimization problem. Secondly, the cubic mapping was used to initialize the population, and the Opposition-Based Learning (OBL) strategy was used to introduce elite particles, so as to enhance the diversity of the population and expand the search area. Then, the Sine Cosine Algorithm (SCA) was introduced, and a linearly decreasing strategy was adopted to balance the exploitation and exploration abilities of the algorithm. When the algorithm stagnated, the Gaussian walk strategy was adopted to help it escape the local optimum. Finally, the performance of the improved algorithm was verified on 15 benchmark test functions, and the algorithm was applied to the path planning problem. Simulation results show that CSSA has better optimization performance than the Particle Swarm Optimization (PSO) algorithm, Beetle Swarm Optimization (BSO) algorithm, Whale Optimization Algorithm (WOA), Grey Wolf Optimizer (GWO) algorithm and Sparrow Search Algorithm (SSA), and can quickly obtain a safe and feasible path that has optimal cost and satisfies the constraints, which demonstrates the effectiveness of the proposed method.
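
    The population initialization step (cubic mapping followed by opposition-based learning) might look roughly like the sketch below. It rests on assumptions not stated in the abstract: the cubic map is taken in the common form x_{n+1} = rho * x_n * (1 - x_n^2) with rho = 2.595, the opposite point is the standard lower + upper - x, and the function names, seed value and bounds are hypothetical. The paper's elite selection rule is not reproduced.

```python
def cubic_map_sequence(x0, length, rho=2.595):
    """Iterate x <- rho * x * (1 - x^2); values stay in (0, 1) for x0 in (0, 1)."""
    seq, x = [], x0
    for _ in range(length):
        x = rho * x * (1.0 - x * x)
        seq.append(x)
    return seq


def init_population(pop_size, dim, lower, upper, x0=0.3):
    """Scale a chaotic sequence into [lower, upper] to form the initial population."""
    chaos = cubic_map_sequence(x0, pop_size * dim)
    return [
        [lower + chaos[i * dim + d] * (upper - lower) for d in range(dim)]
        for i in range(pop_size)
    ]


def opposite(individual, lower, upper):
    """Opposition-based learning: reflect each coordinate within its bounds."""
    return [lower + upper - x for x in individual]


population = init_population(pop_size=5, dim=3, lower=-10.0, upper=10.0)
opposites = [opposite(ind, -10.0, 10.0) for ind in population]
```

    A typical next step would evaluate the cost of both `population` and `opposites` and keep the better half, which is one way the search area is widened before the main iterations begin.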
    Review of application analysis and research progress of deep learning in weather forecasting
    Runting DONG, Li WU, Xiaoying WANG, Tengfei CAO, Jianqiang HUANG, Qin GUAN, Jiexia WU
    Journal of Computer Applications    2023, 43 (6): 1958-1968.   DOI: 10.11772/j.issn.1001-9081.2022050745
    Abstract (1226)   HTML (91)    PDF (1570KB) (1423)

    With the advancement of technologies such as sensor networks and global positioning systems, the volume of meteorological data with both temporal and spatial characteristics has exploded, and research on deep learning models for Spatiotemporal Sequence Forecasting (STSF) has developed rapidly. The traditional machine learning methods long applied to weather forecasting are unsatisfactory at extracting the temporal correlations and spatial dependences of data, whereas deep learning methods can extract features automatically through artificial neural networks, effectively improving the accuracy of weather forecasting, and are very effective at encoding and modeling long-term spatial information. At the same time, deep learning models driven by observational data are combined with Numerical Weather Prediction (NWP) models based on physical theories to build hybrid models with higher prediction accuracy and longer prediction time. On this basis, the application analysis and research progress of deep learning in the field of weather forecasting were reviewed. Firstly, the deep learning problems in weather forecasting and the classical deep learning problems were compared in three aspects: data format, problem model and evaluation metrics. Then, the development history and application status of deep learning in weather forecasting were reviewed, and the latest progress in combining deep learning technologies with NWP was summarized and analyzed. Finally, future development directions and research focuses were discussed to provide a reference for future deep learning research in the field of weather forecasting.

    Review of deep learning-based medical image segmentation
    CAO Yuhong, XU Hai, LIU Sun'ao, WANG Zixiao, LI Hongliang
    Journal of Computer Applications    2021, 41 (8): 2273-2287.   DOI: 10.11772/j.issn.1001-9081.2020101638
    Abstract (1723)      PDF (2539KB) (1405)
    As a fundamental and key task in computer-aided diagnosis, medical image segmentation aims to accurately recognize target regions such as organs, tissues and lesions at the pixel level. Unlike natural images, medical images show high complexity in texture and have boundaries that are difficult to judge because of ambiguity, which results from the heavy noise caused by the limitations of imaging technology and equipment. Furthermore, annotating medical images depends heavily on the expertise and experience of experts, which leads to limited available annotations for training and to potential annotation errors. Because medical images suffer from ambiguous boundaries, limited annotated data and large annotation errors, it is a great challenge for auxiliary diagnosis systems based on traditional image segmentation algorithms to meet the demands of clinical applications. Recently, with the wide application of Convolutional Neural Network (CNN) in computer vision and natural language processing, deep learning-based medical segmentation algorithms have achieved tremendous success. Firstly, the latest research progress of deep learning-based medical image segmentation was summarized, including the basic architecture, loss function and optimization method of the medical image segmentation algorithms. Then, to address the limited availability of annotated medical image data, the mainstream semi-supervised research on medical image segmentation was summarized and analyzed. In addition, studies related to measuring the uncertainty of annotation errors were introduced. Finally, a summary and analysis of the characteristics of medical image segmentation, as well as its potential future trends, were presented.
    Review of online education learner knowledge tracing
    Journal of Computer Applications    DOI: 10.11772/j.issn.1001-9081.2023060852
    Online available: 07 November 2023

    Review of multi-modal medical image segmentation based on deep learning
    Meng DOU, Zhebin CHEN, Xin WANG, Jitao ZHOU, Yu YAO
    Journal of Computer Applications    2023, 43 (11): 3385-3395.   DOI: 10.11772/j.issn.1001-9081.2022101636
    Abstract (1513)   HTML (53)    PDF (3904KB) (1343)

    Multi-modal medical images can provide clinicians with rich information about target areas (such as tumors, organs or tissues). However, effective fusion and segmentation of multi-modal images is still a challenging problem due to the independence and complementarity of multi-modal images. Traditional image fusion methods have difficulty in addressing this problem, which has led to widespread research on deep learning-based multi-modal medical image segmentation algorithms. The multi-modal medical image segmentation task based on deep learning was reviewed in terms of principles, techniques, problems, and prospects. Firstly, the general theory of deep learning and multi-modal medical image segmentation was introduced, including the basic principles and development processes of deep learning and Convolutional Neural Network (CNN), as well as the importance of the multi-modal medical image segmentation task. Secondly, the key concepts of multi-modal medical image segmentation were described, including data dimension, preprocessing, data enhancement, loss function, and post-processing. Thirdly, different multi-modal segmentation networks based on different fusion strategies were summarized and analyzed. Finally, several common problems in medical image segmentation were discussed, and a summary and prospects for future research were given.

    Review of recommendation system
    Meng YU, Wentao HE, Xuchuan ZHOU, Mengtian CUI, Keqi WU, Wenjie ZHOU
    Journal of Computer Applications    2022, 42 (6): 1898-1913.   DOI: 10.11772/j.issn.1001-9081.2021040607
    Abstract (1670)   HTML (145)    PDF (3152KB) (1327)

    With the continuous development of network applications, network resources are growing exponentially and information overload is becoming increasingly serious, so how to efficiently obtain the resources that meet user needs has become one of the problems that trouble people. Recommendation systems can effectively filter massive information and recommend the resources that meet users' needs. The research status of recommendation systems was introduced in detail, including three traditional recommendation methods (content-based recommendation, collaborative filtering recommendation and hybrid recommendation), and the research progress of four common deep learning recommendation models, based on Convolutional Neural Network (CNN), Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Graph Neural Network (GNN), was analyzed in particular. The commonly used datasets in the recommendation field were summarized, and the differences between traditional recommendation algorithms and deep learning-based recommendation algorithms were analyzed and compared. Finally, the representative recommendation models in practical applications were summarized, and the challenges and future research directions of recommendation systems were discussed.

    Multimodal sentiment analysis based on feature fusion of attention mechanism-bidirectional gated recurrent unit
    LAI Xuemei, TANG Hong, CHEN Hongyu, LI Shanshan
    Journal of Computer Applications    2021, 41 (5): 1268-1274.   DOI: 10.11772/j.issn.1001-9081.2020071092
    Abstract (967)      PDF (960KB) (1323)
    To address the problem that cross-modality interaction and the contribution of each modality to the final sentiment classification result are not considered in multimodal sentiment analysis of video, a multimodal sentiment analysis model of Attention Mechanism based feature Fusion-Bidirectional Gated Recurrent Unit (AMF-BiGRU) was proposed. Firstly, Bidirectional Gated Recurrent Unit (BiGRU) was used to capture the interdependence between utterances in each modality and obtain the internal information of each modality. Secondly, through the cross-modality attention interaction network layer, the internal information of the modalities was combined with the interaction between modalities. Thirdly, an attention mechanism was introduced to determine the attention weight of each modality, and the features of the modalities were effectively fused together. Finally, the sentiment classification results were obtained through the fully connected layer and softmax layer. Experiments were conducted on the open CMU-MOSI (CMU Multimodal Opinion-level Sentiment Intensity) and CMU-MOSEI (CMU Multimodal Opinion Sentiment and Emotion Intensity) datasets. The experimental results show that compared with traditional multimodal sentiment analysis methods (such as Multi-Attention Recurrent Network (MARN)), the AMF-BiGRU model improves the accuracy and F1-Score on the CMU-MOSI dataset by 6.01% and 6.52% respectively, and the accuracy and F1-Score on the CMU-MOSEI dataset by 2.72% and 2.30% respectively. The AMF-BiGRU model can effectively improve the performance of multimodal sentiment classification.
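
    The third step, assigning an attention weight to each modality and fusing the features, can be sketched generically as a softmax over per-modality relevance scores followed by a weighted sum. This is not the AMF-BiGRU implementation: the dot-product scoring, the `query` vector, the two-dimensional features and the function names are all hypothetical stand-ins for learned components.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Weight each modality by softmax(dot(feature, query)) and sum the features."""
    names = list(features)
    scores = [sum(f * q for f, q in zip(features[n], query)) for n in names]
    weights = softmax(scores)
    fused = [
        sum(w * features[n][d] for w, n in zip(weights, names))
        for d in range(len(query))
    ]
    return dict(zip(names, weights)), fused


# Hypothetical per-modality feature vectors for one utterance.
features = {"text": [0.9, 0.1], "audio": [0.2, 0.4], "video": [0.5, 0.5]}
weights, fused = attention_fuse(features, query=[1.0, 0.0])
```

    In a trained model the scores would come from learned parameters, and the fused vector would feed the fully connected and softmax layers that produce the sentiment class.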
    Review of image classification algorithms based on convolutional neural network
    Changqing JI, Zhiyong GAO, Jing QIN, Zumin WANG
    Journal of Computer Applications    2022, 42 (4): 1044-1049.   DOI: 10.11772/j.issn.1001-9081.2021071273
    Abstract (2107)   HTML (180)    PDF (605KB) (1255)

    Convolutional Neural Network (CNN) is currently one of the important research directions in deep learning-based computer vision. It performs well in applications such as image classification, image segmentation and target detection, and its powerful feature learning and feature representation capabilities are increasingly valued by researchers. However, CNN still has problems such as incomplete feature extraction and overfitting in sample training. To address these issues, the development of CNN, the classical CNN models and their components were introduced, and methods for solving the above issues were provided. By reviewing the current status of research on CNN models in image classification, suggestions were provided for the further development and research directions of CNN.

    Review of applications of natural language processing in text sentiment analysis
    Yingjie WANG, Jiuqi ZHU, Zumin WANG, Fengbo BAI, Jian GONG
    Journal of Computer Applications    2022, 42 (4): 1011-1020.   DOI: 10.11772/j.issn.1001-9081.2021071262
    Abstract (2276)   HTML (185)    PDF (783KB) (1247)

    Text sentiment analysis has gradually become an important part of Natural Language Processing (NLP) in fields such as recommendation systems, the acquisition of user sentiment information, and public opinion reference for governments and enterprises. The methods in the field of sentiment analysis were compared and summarized through literature research. Firstly, a literature investigation of sentiment analysis methods was carried out along the dimensions of time and method. Then, the main methods and application scenarios of sentiment analysis were summarized and compared. Finally, the advantages and disadvantages of each method were analyzed. According to the analysis results, across different task scenarios there are three main sentiment analysis methods: those based on emotion dictionaries, those based on machine learning and those based on deep learning. Methods that mix multiple strategies have become the trend of improvement. The literature investigation shows that there is still room for improvement in the techniques and methods of text sentiment analysis, and that it has a large market and good development prospects in e-commerce, psychotherapy and public opinion monitoring.

    Text multi-label classification method incorporating BERT and label semantic attention
    Xueqiang LYU, Chen PENG, Le ZHANG, Zhi’an DONG, Xindong YOU
    Journal of Computer Applications    2022, 42 (1): 57-63.   DOI: 10.11772/j.issn.1001-9081.2021020366
    Abstract (1402)   HTML (72)    PDF (577KB) (1228)

    Multi-Label Text Classification (MLTC) is one of the important subtasks in the field of Natural Language Processing (NLP). In order to solve the problem of complex correlation between multiple labels, an MLTC method TLA-BERT was proposed by incorporating Bidirectional Encoder Representations from Transformers (BERT) and label semantic attention. Firstly, the contextual vector representation of the input text was learned by fine-tuning the self-coding pre-training model. Secondly, the labels were encoded individually by using Long Short-Term Memory (LSTM) neural network. Finally, the contribution of text to each label was explicitly highlighted with the use of an attention mechanism in order to predict the multi-label sequences. Experimental results show that compared with Sequence Generation Model (SGM) algorithm, the proposed method improves the F value by 2.8 percentage points and 1.5 percentage points on the Arxiv Academic Paper Dataset (AAPD) and Reuters Corpus Volume I (RCV1)-v2 public dataset respectively.

    Sequential multimodal sentiment analysis model based on multi-task learning
    ZHANG Sun, YIN Chunyong
    Journal of Computer Applications    2021, 41 (6): 1631-1639.   DOI: 10.11772/j.issn.1001-9081.2020091416
    Abstract (850)      PDF (1150KB) (1182)
    Considering the issues of unimodal feature representation and cross-modal feature fusion in sequential multimodal sentiment analysis, a multi-task learning based sentiment analysis model was proposed by incorporating the multi-head attention mechanism. Firstly, Convolutional Neural Network (CNN), Bidirectional Gated Recurrent Unit (BiGRU) and Multi-Head Self-Attention (MHSA) were used to realize the sequential unimodal feature representation. Secondly, the bidirectional cross-modal information was fused by multi-head attention. Finally, based on multi-task learning, sentiment polarity classification and sentiment intensity regression were added as auxiliary tasks to improve the overall performance of the main task of sentiment score regression. Experimental results demonstrate that, compared with the multimodal factorization model, the proposed model improves the accuracy of binary classification by 7.8 percentage points and 3.1 percentage points on the CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) and CMU Multimodal Opinion-level Sentiment Intensity (CMU-MOSI) datasets respectively. Therefore, the proposed model is applicable to sentiment analysis problems in multimodal scenarios, and can provide decision support for product recommendation, stock market forecasting, public opinion monitoring and other relevant applications.
    Survey of multimodal pre-training models
    Huiru WANG, Xiuhong LI, Zhe LI, Chunming MA, Zeyu REN, Dan YANG
    Journal of Computer Applications    2023, 43 (4): 991-1004.   DOI: 10.11772/j.issn.1001-9081.2022020296
    Abstract (1466)   HTML (131)    PDF (5539KB) (1163)    PDF (mobile) (3280KB) (91)

    By using complex pre-training targets and a large number of model parameters, Pre-Training Model (PTM) can effectively obtain rich knowledge from unlabeled data. However, the development of multimodal PTMs is still in its infancy. According to the modalities involved, most of the current multimodal PTMs were divided into image-text PTMs and video-text PTMs. According to the data fusion method, the multimodal PTMs were divided into two types: single-stream models and two-stream models. Firstly, the common pre-training tasks and the downstream tasks used in validation experiments were summarized. Secondly, the common models in the area of multimodal pre-training were sorted out, and the downstream tasks of each model, together with the performance and experimental data of the models, were listed in tables for comparison. Thirdly, the application scenarios of the M6 (Multi-Modality to Multi-Modality Multitask Mega-transformer) model, Cross-modal Prompt Tuning (CPT) model, VideoBERT (Video Bidirectional Encoder Representations from Transformers) model, and AliceMind (Alibaba’s collection of encoder-decoders from Mind) model in specific downstream tasks were introduced. Finally, the challenges and future research directions of multimodal PTM work were summarized.

    Light-weight road image semantic segmentation algorithm based on deep learning
    HU Die, FENG Ziliang
    Journal of Computer Applications    2021, 41 (5): 1326-1331.   DOI: 10.11772/j.issn.1001-9081.2020081181
    Abstract (461)   PDF (1085KB) (1100)
    Road image semantic segmentation models based on deep learning have a huge number of parameters and complex computation, which makes them unsuitable for deployment on mobile terminals for real-time segmentation. To solve this problem, a lightweight symmetric U-shaped encoder-decoder semantic segmentation network built with depthwise separable convolution, namely MUNet, was introduced. First, a U-shaped encoder-decoder network was designed; then, a sparse short connection design was added to the convolution blocks; finally, the attention mechanism and the Group Normalization (GN) method were introduced to reduce the number of model parameters and the amount of computation while improving segmentation accuracy. On the CamVid road image dataset, after 1 000 rounds of training, MUNet achieved a Mean Intersection over Union (MIoU) of 61.92% when the test images were cropped to 720×720. Experimental results show that, compared with common image semantic segmentation networks such as Pyramid Scene Parsing Network (PSPNet), RefineNet, Global Convolutional Network (GCN) and DeepLabv3+, MUNet has fewer parameters and less computation with better segmentation performance.
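
    For readers unfamiliar with the MIoU metric reported above, the following sketch computes it from a pixel-level confusion matrix. It shows the standard definition only; the function name and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions, and the paper's own evaluation code is not reproduced here.

```python
def mean_iou(confusion):
    """Mean Intersection over Union from a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # true class c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp     # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious)


# Two-class example: IoU is 3/5 for class 0 and 5/7 for class 1.
example = [[3, 1],
           [1, 5]]
example_miou = mean_iou(example)
```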
    Review of remote sensing image change detection
    REN Qiuru, YANG Wenzhong, WANG Chuanjian, WEI Wenyu, QIAN Yunyun
    Journal of Computer Applications    2021, 41 (8): 2294-2305.   DOI: 10.11772/j.issn.1001-9081.2020101632
    Abstract (1002)   PDF (1683KB) (1086)
    As a key technology of land use/land cover detection, change detection aims to detect the changed parts and their types in remote sensing data of the same region acquired in different periods. To address the problems of traditional change detection methods, such as heavy manual labor and poor detection results, a large number of change detection methods based on remote sensing images have been proposed. To deepen the understanding of change detection technology based on remote sensing images and to support further study of change detection methods, a comprehensive review was carried out by sorting, analyzing and comparing a large number of studies on change detection. Firstly, the development of change detection was described. Then, the research progress of change detection was summarized in detail from three aspects: data selection and preprocessing, change detection technology, and post-processing and precision evaluation, where change detection technology was summarized in terms of the analysis unit and the comparison method. Finally, the problems in each stage of change detection were summarized and future development directions were proposed.
    Semantic SLAM algorithm based on deep learning in dynamic environment
    ZHENG Sicheng, KONG Linghua, YOU Tongfei, YI Dingrong
    Journal of Computer Applications    2021, 41 (10): 2945-2951.   DOI: 10.11772/j.issn.1001-9081.2020111885
    Abstract (442)   PDF (1572KB) (1076)
    Moving objects in application scenes reduce the positioning accuracy and robustness of visual Simultaneous Localization And Mapping (SLAM) systems. To address this problem, a visual SLAM algorithm based on semantic information for dynamic environments was proposed. Firstly, the traditional visual SLAM front end was combined with the YOLOv4 object detection algorithm, so that the input image was semantically segmented while its ORB (Oriented FAST and Rotated BRIEF) features were extracted. Then, the object types were judged to obtain the regions of dynamic objects in the image, and the feature points located on dynamic objects were eliminated. Finally, the camera pose was solved by inter-frame matching between the remaining feature points and adjacent frames. Test results on the TUM dataset show that, in a highly dynamic environment, the pose estimation accuracy of this algorithm is 96.78% higher than that of ORB-SLAM2 (Oriented FAST and Rotated BRIEF SLAM2), and the average time per frame of the tracking thread is 0.065 5 s, the shortest among the compared SLAM algorithms for dynamic environments. These experimental results show that the proposed algorithm can achieve real-time, precise positioning and mapping in dynamic environments.
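
    The feature-point elimination step can be sketched as follows. This is an illustrative simplification, not the paper's implementation: it assumes that detections arrive as labelled axis-aligned boxes `(x_min, y_min, x_max, y_max)`, and the set `DYNAMIC_CLASSES` is a hypothetical choice of which object types count as dynamic.

```python
# Hypothetical set of object classes treated as dynamic.
DYNAMIC_CLASSES = {"person", "car"}


def in_box(point, box):
    """Return True if the (x, y) point lies inside the axis-aligned box."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def filter_static_keypoints(keypoints, detections):
    """Drop keypoints that fall on any detected dynamic object.

    keypoints:  list of (x, y) feature point coordinates
    detections: list of (label, box) pairs from an object detector
    """
    dynamic_boxes = [box for label, box in detections if label in DYNAMIC_CLASSES]
    return [kp for kp in keypoints
            if not any(in_box(kp, box) for box in dynamic_boxes)]


keypoints = [(10, 10), (50, 50), (90, 20)]
detections = [("person", (40, 40, 60, 60)), ("chair", (80, 10, 100, 30))]
static_points = filter_static_keypoints(keypoints, detections)
# The point on the person is removed; the point on the chair is kept.
```

    Only the remaining points would then be passed to inter-frame matching for pose estimation.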
    Review on privacy-preserving technologies in federated learning
    Teng WANG, Zheng HUO, Yaxin HUANG, Yilin FAN
    Journal of Computer Applications    2023, 43 (2): 437-449.   DOI: 10.11772/j.issn.1001-9081.2021122072
    Abstract (1389)   HTML (132)   PDF (2014KB) (1064)

    In recent years, federated learning has become a new way to solve the problems of data silos and privacy leakage in machine learning. A federated learning architecture does not require multiple parties to share data resources: participants only need to train local models on local data and periodically upload parameters to the server to update the global model, so that a machine learning model can be built on large-scale global data. The federated learning architecture is privacy-preserving by nature and is a new scheme for large-scale machine learning in the future. However, the parameter interaction mode of this architecture may lead to the disclosure of private data. At present, strengthening the privacy-preserving mechanism in the federated learning architecture has become a new research hotspot. Starting from the privacy disclosure problem in federated learning, the attack models and sensitive information disclosure paths in federated learning were discussed, and several types of privacy-preserving techniques in federated learning were highlighted and reviewed, such as those based on differential privacy, homomorphic encryption, and Secure Multiparty Computation (SMC). Finally, the key issues of privacy protection in federated learning were discussed, and future research directions were outlined.
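
    The parameter exchange described above can be made concrete with a small server-side aggregation sketch. The abstract only says that participants upload parameters to update the global model; weighting each client by its number of local samples, in the style of federated averaging, is an assumption made here for illustration, and no privacy-preserving mechanism is included.

```python
def federated_average(client_updates):
    """Aggregate client parameters into global parameters.

    client_updates: list of (num_samples, parameter_vector) pairs, where
    each parameter_vector is a list of floats of the same length.
    Each client is weighted by its share of the total samples (assumed rule).
    """
    total = sum(n for n, _ in client_updates)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_updates[0][1])
    global_params = [0.0] * dim
    for n, params in client_updates:
        if len(params) != dim:
            raise ValueError("all parameter vectors must have the same length")
        for i, value in enumerate(params):
            global_params[i] += (n / total) * value
    return global_params


# Client A holds 1 sample, client B holds 3, so B contributes 75% of the weight.
new_global = federated_average([(1, [0.0, 2.0]), (3, [4.0, 2.0])])
```

    Because the server sees each client's raw parameters in this scheme, it also shows where the disclosure risk discussed in the abstract arises.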

    Encrypted traffic classification method based on data stream
    GUO Shuai, SU Yang
    Journal of Computer Applications    2021, 41 (5): 1386-1391.   DOI: 10.11772/j.issn.1001-9081.2020071073
    Abstract (523)   PDF (948KB) (1063)
    To address the problems of fast classification and accurate identification of encrypted traffic in current networks, a new feature extraction method for data streams was proposed. Based on the characteristics of sequential data and the rules of the SSL (Secure Sockets Layer) handshake protocol, an end-to-end one-dimensional convolutional neural network model was adopted, and five-tuples were used to label the data streams. By selecting the data stream representation, the number of data packets, and the length of feature bytes, the key field positions for sample classification were located more accurately, and features with little impact on sample classification were removed. As a result, the 784 bytes used by a single data stream in the original input were reduced to 529 bytes, about 32% shorter than the original length, and 12 encrypted traffic service types were classified with an accuracy of 95.5%. These results show that the proposed method can reduce the dimension of the original input features and improve data processing efficiency while maintaining the accuracy achieved in current research.
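
    The sketch below illustrates two ideas from the abstract: grouping packets into data streams by five-tuple, and the arithmetic behind the reduction from 784 to 529 input bytes. The packet dictionary layout and the function names are hypothetical, and the simple truncate-or-pad step is only a stand-in: the paper selects bytes at key field positions, which is not reproduced here.

```python
from collections import defaultdict


def group_by_five_tuple(packets):
    """Concatenate payloads of packets that share the same five-tuple.

    Each packet is a dict with the (assumed) keys src_ip, src_port,
    dst_ip, dst_port, proto and payload (bytes). Streams are unidirectional.
    """
    flows = defaultdict(bytearray)
    for pkt in packets:
        key = (pkt["src_ip"], pkt["src_port"],
               pkt["dst_ip"], pkt["dst_port"], pkt["proto"])
        flows[key] += pkt["payload"]
    return flows


def to_fixed_length(data, length):
    """Truncate or zero-pad a byte string to exactly `length` bytes."""
    return bytes(data[:length]).ljust(length, b"\x00")


ORIGINAL_LEN = 784  # bytes per stream in the original input
REDUCED_LEN = 529   # bytes per stream after feature selection
reduction = (ORIGINAL_LEN - REDUCED_LEN) / ORIGINAL_LEN  # 255 / 784, roughly 0.325
```

    The reduction works out to 255 of 784 bytes, which is consistent with the figure of about 32% quoted above.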
    Transformer based U-shaped medical image segmentation network: a survey
    Liyao FU, Mengxiao YIN, Feng YANG
    Journal of Computer Applications    2023, 43 (5): 1584-1595.   DOI: 10.11772/j.issn.1001-9081.2022040530
    Abstract (1461)   HTML (64)   PDF (1887KB) (1062)

    U-shaped Network (U-Net) based on Fully Convolutional Network (FCN) is widely used as the backbone of medical image segmentation models, but Convolutional Neural Network (CNN) is not good at capturing long-range dependencies, which limits further performance improvement of segmentation models. To solve this problem, researchers have applied Transformer to medical image segmentation models to make up for the deficiency of CNN, and U-shaped segmentation networks combined with Transformer have become a hot research topic. After a detailed introduction of U-Net and Transformer, the related medical image segmentation models were categorized by the position of the Transformer module: only in the encoder or decoder, in both the encoder and decoder, as a skip connection, and others. The basic contents, design concepts and possible improvements of these models were discussed, and the advantages and disadvantages of placing Transformer in different positions were analyzed. The analysis shows that the biggest factor in deciding the position of Transformer is the characteristics of the target segmentation task, and that segmentation models combining Transformer with U-Net can make better use of the advantages of both CNN and Transformer to improve segmentation performance, which gives them great development prospects and research value.

    Weakly supervised fine-grained image classification algorithm based on attention-attention bilinear pooling
    LU Xinwei, YU Pengfei, LI Haiyan, LI Hongsong, DING Wenqian
    Journal of Computer Applications    2021, 41 (5): 1319-1325.   DOI: 10.11772/j.issn.1001-9081.2020071105
    Abstract (371)   PDF (1945KB) (1037)
    With the rapid development of artificial intelligence, the purpose of image classification is not only to identify the major categories of objects, but also to classify images of the same category into more detailed subcategories. In order to effectively discriminate small differences between categories, a fine-grained classification algorithm based on Attention-Attention Bilinear Pooling (AABP) was proposed. Firstly, the pre-trained Inception V3 model was applied to extract global image features, and the local attention regions on the feature maps were predicted with depthwise separable convolution. Then, the Weakly Supervised Data Augmentation Network (WS-DAN) was applied to feed the augmented images back into the network, so as to enhance the generalization ability of the network and prevent overfitting. Finally, linear fusion of the further extracted attention features was performed in the AABP network to improve classification accuracy. Experimental results show that this method achieves an accuracy of 88.51% and a top-5 accuracy of 97.65% on the CUB-200-2011 dataset, an accuracy of 89.77% and a top-5 accuracy of 99.27% on the Stanford Cars dataset, and an accuracy of 93.5% and a top-5 accuracy of 97.96% on the FGVC-Aircraft dataset.
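
    As background for the pooling step, the sketch below shows a generic bilinear pooling of attention maps with feature maps: each attention map weights every feature channel, and the result is averaged over spatial positions. It is not the AABP network itself, whose exact fusion is not given in the abstract; the function name, the flattened map layout and the use of average pooling are assumptions.

```python
def bilinear_attention_pooling(features, attentions):
    """Pool feature maps under each attention map.

    features:   list of C feature maps, each flattened to H*W floats
    attentions: list of M attention maps, each flattened to H*W floats
    Returns an M x C matrix whose entry (m, c) is the spatial average of
    attentions[m][p] * features[c][p] over all positions p.
    """
    num_positions = len(features[0])
    pooled = []
    for att in attentions:
        row = []
        for feat in features:
            row.append(sum(a * f for a, f in zip(att, feat)) / num_positions)
        pooled.append(row)
    return pooled


# Two feature channels and two attention maps over a 2x2 grid (flattened).
features = [[1.0, 2.0, 3.0, 4.0],
            [0.0, 1.0, 0.0, 1.0]]
attentions = [[1.0, 0.0, 0.0, 0.0],   # focuses on the first position only
              [0.5, 0.5, 0.5, 0.5]]   # uniform attention
part_features = bilinear_attention_pooling(features, attentions)
```

    Each row of the result describes one attended region, so regions that differ only subtly between subcategories get their own feature vector.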
    Embedded road crack detection algorithm based on improved YOLOv8
    Journal of Computer Applications    DOI: 10.11772/j.issn.1001-9081.2023050635
    Online available: 01 September 2023

    Fitness action recognition method based on human skeleton feature encoding
    GUO Tianxiao, HU Qingrui, LI Jianwei, SHEN Yanfei
    Journal of Computer Applications    2021, 41 (5): 1458-1464.   DOI: 10.11772/j.issn.1001-9081.2020071113
    Abstract (713)   PDF (1143KB) (1027)
    Fitness action recognition is the core of an intelligent fitness system. In order to improve the accuracy and speed of fitness action recognition and to reduce the influence of the global displacement of fitness actions on the recognition results, a fitness action recognition method based on human skeleton feature encoding was proposed. The method includes three steps. Firstly, a simplified human skeleton model was constructed, and the joint coordinates of the skeleton model were extracted through human pose estimation. Secondly, the action feature region was extracted by using the human central projection method to eliminate the influence of global displacement on action recognition. Finally, the feature region was encoded as a feature vector and input to a multi-classifier to recognize the action, and the length of the feature vector was optimized to improve the recognition rate and speed. Experimental results show that the proposed method achieves a recognition rate of 97.24% on a self-built fitness dataset with 28 types of fitness actions, which verifies its effectiveness in recognizing different types of fitness actions. On the public KTH and Weizmann datasets, the recognition rates of the proposed method are 91.67% and 90% respectively, higher than those of other similar methods.
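
    To illustrate why removing global displacement helps, the sketch below encodes 2D joint coordinates relative to their centroid, so that the same pose yields the same feature vector wherever the person stands in the frame. This is a simple stand-in and not the human central projection method named in the abstract, which is not specified there; the function name and the scale normalization are assumptions.

```python
def encode_skeleton(joints):
    """Encode 2D joints as a translation-invariant feature vector.

    joints: list of (x, y) joint coordinates from a pose estimator.
    The centroid is subtracted to remove global displacement, and the
    result is divided by the largest absolute coordinate (assumed
    normalization) before flattening to [x0, y0, x1, y1, ...].
    """
    n = len(joints)
    cx = sum(x for x, _ in joints) / n
    cy = sum(y for _, y in joints) / n
    centered = [(x - cx, y - cy) for x, y in joints]
    # Fall back to 1.0 when all joints coincide, to avoid division by zero.
    scale = max(max(abs(x), abs(y)) for x, y in centered) or 1.0
    return [value / scale for point in centered for value in point]


pose = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
shifted_pose = [(x + 10.0, y + 5.0) for x, y in pose]  # same pose, moved in the frame
vec_a = encode_skeleton(pose)
vec_b = encode_skeleton(shifted_pose)
# vec_a and vec_b match up to floating-point error.
```

    A classifier trained on such vectors only needs to learn pose differences, not positions in the image.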
2024 Vol.44 No.3

Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editor: SHEN Hengtao XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
Address:
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803
  028-85222239-803
Website: www.joca.cn
E-mail: bjb@joca.cn