
Table of Contents

    10 January 2018, Volume 38 Issue 1
    Effects of large-scale graph structural feature on partitioning quality
    LUO Xiaoxia, SI Fengwei, LUO Xiangyu
    2018, 38(1):  1-5.  DOI: 10.11772/j.issn.1001-9081.2017071967
    Abstract | PDF (805KB)
    Focusing on how the structural features of large-scale graphs affect partitioning quality, a method for describing these structural features through the vertex-degree distribution was proposed. Firstly, based on real graph data, several simulated datasets with the same numbers of vertices and edges but different structural features were generated. The validity of the method for describing the structural features of real large-scale graphs was verified through the similarity between the real graph and the simulated graphs, calculated by the proposed algorithm. Secondly, the relationship between the structural features of a graph and the quality of partitioning was verified by the Hash partitioning algorithm and the point-to-point exchange algorithm. When the point-to-point exchange algorithm was run for 50000 iterations, the number of crossed edges of a real graph with 6301 vertices and 20777 edges was reduced by 54.32% compared with the Hash partitioning algorithm. When two simulated graphs with entirely different structural features were partitioned, the numbers of crossed edges were 6233 and 316 respectively. The experimental results show that the point-to-point exchange algorithm can reduce the number of crossed edges, and that the greater the skew of the vertex-degree distribution, the smaller the number of crossed edges and the better the partitioning quality. Therefore, the structural features of large graphs affect the partitioning result, which lays the foundation for modeling the relationship between structural features and partitioning quality.
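The baseline in this abstract is a hash partitioner, with quality measured by "crossed edges" (edges whose endpoints land in different partitions). A minimal sketch of that metric, on a tiny made-up graph rather than the 6301-vertex graph from the paper:

```python
# Count crossed (cut) edges under a simple hash partitioning,
# the baseline the point-to-point exchange algorithm is compared against.

def hash_partition(vertices, num_parts):
    """Assign each vertex to a partition by hashing its id."""
    return {v: hash(v) % num_parts for v in vertices}

def crossed_edges(edges, assignment):
    """An edge is 'crossed' when its endpoints land in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

# Tiny illustrative graph (not the dataset used in the paper).
vertices = list(range(8))
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (0, 4)]
part = hash_partition(vertices, num_parts=2)
cut = crossed_edges(edges, part)
print(cut)  # number of edges whose endpoints fall in different partitions
```

An exchange-based refiner would then swap vertices between partitions whenever a swap lowers this count.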
    Multi-type task assignment and scheduling oriented to spatial crowdsourcing
    MAO Yingchi, MU Chao, BAO Wei, LI Xiaofang
    2018, 38(1):  6-12.  DOI: 10.11772/j.issn.1001-9081.2017071886
    Abstract | PDF (1060KB)
    Aiming at the quality and quantity of multi-type task completion in spatial crowdsourcing, a method of multi-type task assignment and scheduling was proposed. Firstly, in the task assignment process, by combining the characteristics of multi-type tasks and users in spatial crowdsourcing and improving the greedy allocation algorithm, a Distance ε based Assignment (ε-DA) algorithm was proposed, and tasks were assigned to nearby users in order to improve the quality of task completion. Secondly, the idea of Branch and Bound Schedule (BBS) was utilized: task sequences were scheduled according to their professional matching scores, and the best task sequence was found. Finally, since the branch-and-bound scheduling algorithm runs slowly, the Most Promising Branch Heuristic (MPBH) algorithm was presented, which achieves local optimization in each task allocation step. Compared with the branch-and-bound scheduling algorithm, the running speed of the proposed algorithm was increased by 30%. The experimental results show that the proposed method can improve the quality and quantity of task completion as well as the running speed and accuracy.
    Lightweight distributed social distance routing algorithm in mobile opportunistic network
    YUAN Peiyan, SONG Mingyang
    2018, 38(1):  13-19.  DOI: 10.11772/j.issn.1001-9081.2017071824
    Abstract | PDF (1213KB)
    In most previous routing algorithms, flooding is used to obtain auxiliary information, which wastes network resources. Motivated by this, a distributed social distance routing algorithm was proposed. Firstly, the stability and regularity of contacts between nodes were analyzed to determine friend relationships. Then, the social distance between nodes was constructed from the friend relationships. In addition, each node maintained a table recording the shortest social distance to every other node, and the minimum social distance was continually updated by exchanging and comparing the information in the tables. Constructing the social distance only requires exchanging information among friends instead of all nodes, which greatly reduces the amount of auxiliary-information exchange. Finally, since packets were forwarded to relay nodes with a smaller social distance to the destination, the delivery ratio could be significantly improved. The experimental results demonstrate that, compared with the Probabilistic Routing Protocol using History of Encounters and Transitivity (PRoPHET), the delivery ratio of the proposed algorithm is increased by about 3%, the packet transmission delay is reduced by about 27%, and the number of auxiliary-information exchanges is reduced by about 63%. Compared with the routing based on Betweenness centrality and Similarity (SimBet) algorithm, the delivery ratio is increased by about 11%, the packet transmission delay is basically equal, and the number of auxiliary-information exchanges is reduced by about 63%. Because of its good scalability, the social distance algorithm provides theoretical support for large-scale mobile opportunistic networks.
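The table-maintenance step described here is essentially a distributed Bellman-Ford relaxation restricted to friend links. A hedged sketch of that idea, with an illustrative four-node friend graph and unit hop cost (both assumptions, not from the paper):

```python
# Sketch of the distance-table idea: each node keeps its shortest known
# "social distance" to every other node and refines it by exchanging
# tables with friends only (a distributed Bellman-Ford relaxation).

friends = {            # undirected friendship links (illustrative)
    'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c'],
}

# Each node starts knowing only itself (distance 0).
tables = {n: {n: 0} for n in friends}

def exchange_round(tables, friends, hop_cost=1):
    """One round: every node merges its friends' tables, keeping minima."""
    updated = False
    for node in friends:
        for friend in friends[node]:
            for dest, dist in list(tables[friend].items()):
                cand = dist + hop_cost
                if cand < tables[node].get(dest, float('inf')):
                    tables[node][dest] = cand
                    updated = True
    return updated

while exchange_round(tables, friends):
    pass  # repeat until no table changes

print(tables['a']['d'])  # shortest social distance from a to d: 3 hops
```

In the routing step, a node holding a packet would compare its table entry for the destination against each encountered relay's entry and hand the packet to relays with a strictly smaller social distance.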
    New traffic classification method for imbalanced network data
    YAN Binghao, HAN Guodong, HUANG Yajing, WANG Xiaolong
    2018, 38(1):  20-25.  DOI: 10.11772/j.issn.1001-9081.2017071812
    Abstract | PDF (921KB)
    To solve the problem in traffic classification that Peer-to-Peer (P2P) traffic is much more abundant than non-P2P traffic, a new traffic classification method for imbalanced network data was presented. By introducing and improving the Synthetic Minority Over-sampling Technique (SMOTE) algorithm, a Mean SMOTE (M-SMOTE) algorithm was proposed to balance the traffic data. On this basis, three kinds of machine learning classifiers, Random Forest (RF), Support Vector Machine (SVM) and Back Propagation Neural Network (BPNN), were used to identify the various types of traffic. The theoretical analysis and simulation results show that, compared with the imbalanced state, the SMOTE algorithm improves the recognition accuracy of non-P2P traffic by 16.5 percentage points and raises the overall recognition rate of network traffic by 9.5 percentage points. Compared with the SMOTE algorithm, the M-SMOTE algorithm further improves the recognition rate of non-P2P traffic and the overall recognition rate of network traffic by 3.2 and 2.6 percentage points respectively. The experimental results show that imbalanced-data classification can effectively solve the problem of a low recognition rate of non-P2P traffic caused by excessive P2P traffic, and that the M-SMOTE algorithm achieves higher recognition accuracy than SMOTE.
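The abstract does not spell out the M-SMOTE formula; a hedged reading of "Mean SMOTE" is that, where classic SMOTE interpolates randomly between a minority sample and one neighbor, the mean variant synthesizes a sample as the mean of the sample and its k nearest minority neighbors. A sketch under that assumption (the paper's exact construction may differ):

```python
# Hedged sketch of a mean-of-neighbors SMOTE variant: synthesize a new
# minority sample as the centroid of a sample and its k nearest
# minority-class neighbors, instead of random pairwise interpolation.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def m_smote_sample(x, minority, k=3):
    """Create one synthetic sample from x and its k nearest minority neighbors."""
    neighbors = sorted((p for p in minority if p != x),
                       key=lambda p: euclidean(x, p))[:k]
    pts = [x] + neighbors
    return tuple(sum(c) / len(pts) for c in zip(*pts))

# Illustrative 2-D minority samples (not real traffic features).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = m_smote_sample((0.0, 0.0), minority, k=3)
print(synthetic)  # centroid of the four points: (0.5, 0.5)
```

Repeating this until the minority class matches the majority class in size yields the balanced training set fed to the RF/SVM/BPNN classifiers.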
    Technology on data forwarding and routing selection for software defined vehicular Ad Hoc network
    DONG Baihong, DENG Jian, ZHANG Dingjie, WU Weigang
    2018, 38(1):  26-30.  DOI: 10.11772/j.issn.1001-9081.2017071969
    Abstract | PDF (928KB)
    Since data forwarding in Vehicular Ad Hoc Networks (VANET) is inefficient, a data forwarding and routing selection technology based on Software Defined Networking (SDN) was proposed. Firstly, a hierarchical architecture of SDN-based VANET, which consisted of local controllers and a global controller, was used to separate data forwarding from transmission control and to decide the direction of data forwarding flexibly. Secondly, a vehicular routing mechanism for a single road section was designed, in which data could be transmitted stably by predicting vehicle positions and using a greedy strategy. Thirdly, to achieve path disjointness, which avoids bandwidth bottlenecks between multiple demands, a road-section routing mechanism combining Breadth First Search (BFS) and edge sets was proposed. Finally, compared with Ad Hoc On-demand Distance Vector (AODV) routing, the proposed algorithm could increase the data reception rate by 40% and reduce the average delay by 60%. The simulation results show that the proposed data forwarding and routing technology for software defined VANET can effectively improve the data delivery rate and reduce the average packet delay.
    Non-full sequence-based localization algorithm for 3D underwater sensor networks
    CHE Di, NIU Qiang
    2018, 38(1):  31-37.  DOI: 10.11772/j.issn.1001-9081.2017071968
    Abstract | PDF (1137KB)
    Aiming at the problems of low accuracy and high complexity of localization algorithms in three-dimensional space, a Non-Full Sequence-based Localization (NFSL) algorithm for 3D underwater sensor networks was proposed. Different from traditional sequence-based localization algorithms, NFSL considered the more realistic situation where the communication range of beacon nodes does not cover the entire network. Firstly, a 3D Voronoi diagram was used to divide the 3D localization space, and thus virtual beacon nodes as well as their rank sequences were determined. Secondly, the nearest beacon node was obtained according to the rank correlation coefficient between the unknown node's sequence, based on Received Signal Strength Indication (RSSI), and the beacon node's sequence, and the nearest-sequence table was constructed. Next, an algorithm for measuring the similarity of sequences of unequal lengths was designed and used to obtain the rank correlation coefficients between the non-full sequence of an unknown node and each sequence in the nearest-sequence table. Finally, the location of the unknown node was estimated by weighting with the rank correlation coefficients. In simulation experiments, the localization accuracy of NFSL was compared with that of DV-Hop and Centroid by varying the ratio of beacon nodes, the communication range, the total number of nodes and the network scale. The extensive simulation results verify the effectiveness of the proposed algorithm, and its localization accuracy improves significantly as the number of beacon nodes increases. Compared with traditional localization algorithms, the localization accuracy of NFSL is improved by as much as 23%.
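The core similarity measure in sequence-based localization is a rank correlation coefficient between two rank sequences. A sketch of Spearman's rho for equal-length sequences; NFSL's extension to unequal-length (non-full) sequences is the paper's own contribution and is not reproduced here:

```python
# Spearman rank correlation between two rank sequences: the building block
# that sequence-based localization schemes use to compare the RSSI ranking
# seen by an unknown node with each beacon's ranking.

def spearman_rho(seq_a, seq_b):
    """Spearman rank correlation of two equal-length rank sequences."""
    n = len(seq_a)
    assert n == len(seq_b) and n > 1
    d2 = sum((a - b) ** 2 for a, b in zip(seq_a, seq_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Ranks of the same 4 beacons as observed from two positions (illustrative).
print(spearman_rho([1, 2, 3, 4], [1, 2, 3, 4]))  # identical ranking -> 1.0
print(spearman_rho([1, 2, 3, 4], [4, 3, 2, 1]))  # reversed ranking -> -1.0
```

In the final step the coefficients serve directly as weights: candidate locations with rankings more similar to the unknown node's ranking contribute more to the position estimate.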
    Real-time processing system for automatic weather station data on Spark Streaming architecture
    ZHAO Wenfang, LIU Xulin
    2018, 38(1):  38-43.  DOI: 10.11772/j.issn.1001-9081.2017071903
    Abstract | PDF (1144KB)
    Aiming at the problems of the current data service for Automatic Weather Stations (AWS), including data processing delay, slow interactive response and low statistical efficiency, a new method based on Spark Streaming and HBase was proposed to process massive streaming AWS data by integrating a stream computing framework with a distributed database system. Flume was used for data collection, data processing was conducted by Spark Streaming, and data were stored in HBase. Within the Spark framework, two algorithms were designed: one for writing streaming AWS data into the HBase database, and the other for real-time statistical calculation of different observed AWS meteorological elements. Finally, a stable and highly efficient system for real-time acquisition, processing and statistics of AWS data was developed on the Cloudera platform. Based on comparative analysis and running monitoring, the performance of the system was confirmed, including low latency, high I/O efficiency, stable running status and excellent load balance. The experimental results show that the response time of Spark Streaming-based real-time operational processing of AWS data, including parallel data writing into HBase and HBase-based query and statistics on different meteorological elements, can reach the millisecond level. The system fully meets the needs of operational applications of AWS data and provides effective support for weather forecasting.
    Distributed fault localization framework for high performance computing
    GAO Jian, YU Kang, QING Peng, WEI Hongmei
    2018, 38(1):  44-49.  DOI: 10.11772/j.issn.1001-9081.2017071948
    Abstract | PDF (981KB)
    To solve the problems of high difficulty and poor real-time performance in fault localization for high performance computing systems, a Message-Passing based Fault Localization (MPFL) framework was proposed, which included Tree-based Fault Detection (TFD) and Tree-based Fault Analysis (TFA) algorithms. Firstly, when a parallel application was initialized, a Fault Localization Tree (FLT) was obtained by logically dividing all the nodes participating in the computation, and fault localization tasks were distributed to different nodes. Secondly, if the abnormal state of a node was detected by system components such as the message-passing library and the operating system, the TFD algorithm was used to analyze the FLT structure, and the node responsible for receiving the abnormal state was selected according to factors such as load balancing and performance cost. Finally, the node receiving the abnormal state used the TFA algorithm to reason about the fault. The TFA algorithm combined rule-based event correlation with lightweight active probing based on message passing, which improved the accuracy of fault analysis. An experimental evaluation was performed on a typical cluster, which demonstrated the capability of MPFL by locating simulated node shutdowns. The experimental results on the NPB-FT and NPB-IS benchmarks show that the MPFL framework performs well in both fault localization capability and cost saving.
    Load balancing mechanism for large-scale data access system
    ZHOU Yue, CHEN Qingkui
    2018, 38(1):  50-55.  DOI: 10.11772/j.issn.1001-9081.2017071836
    Abstract | PDF (978KB)
    The current load balancing algorithms for distributed systems have several problems: 1) the role of each node in the system is fixed, so the system is not adaptive; 2) the load balancing algorithms are not universal; 3) migrated tasks are too large and the load balancing cycle is too long. To solve these problems, a hybrid load balancing algorithm was proposed. Firstly, a distributed receiving system model was designed, in which system tasks were divided into three levels: the receiving level, the handling level and the storing level. In the receiving level, a custom transmission protocol was used to improve the reception capability of the system. Then, in the load balancing algorithm, a random load migration strategy was used: according to the status of the nodes, load tasks were migrated randomly, which solved the problems of long load balancing cycles and load migrating back. Finally, a distributed control node selection strategy was adopted to make the nodes adaptive. The experimental results show that the average delay in each layer of the system is at the millisecond level and load balancing takes less than 3 minutes, which proves that the proposed mechanism has a short load balancing cycle and fast response, and can improve the reception capability of a distributed system.
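The random-migration strategy can be pictured with a toy sketch: an overloaded node repeatedly hands a unit task to a randomly chosen under-loaded peer until it drops to the threshold. Node names, loads and threshold are illustrative assumptions, not values from the paper:

```python
# Sketch of a random load-migration pass: overloaded nodes shed unit tasks
# to randomly chosen under-loaded peers. Because targets that reach the
# threshold stop receiving, tasks do not immediately migrate back.
import random

def migrate(loads, threshold, rng):
    """One balancing pass: move unit tasks off overloaded nodes at random."""
    nodes = list(loads)
    for node in nodes:
        while loads[node] > threshold:
            under = [n for n in nodes if loads[n] < threshold]
            if not under:          # nowhere left to shed load
                return loads
            target = rng.choice(under)
            loads[node] -= 1
            loads[target] += 1
    return loads

rng = random.Random(42)  # fixed seed so the sketch is repeatable
loads = {"n1": 9, "n2": 1, "n3": 2}
balanced = migrate(loads, threshold=4, rng=rng)
print(balanced)  # all nodes end at the threshold: {'n1': 4, 'n2': 4, 'n3': 4}
```

Excluding nodes at or above the threshold from the candidate list is what prevents the "load moving back" oscillation the abstract mentions.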
    Malware detection approach based on non-user operating sequence
    LUO Wenshuang, CAO Tianjie
    2018, 38(1):  56-60.  DOI: 10.11772/j.issn.1001-9081.2017071835
    Abstract | PDF (1013KB)
    Considering the rapid growth of Android malware and the poor capability of existing malware detection, a static detection method based on non-user operation sequences was proposed. Firstly, the Application Programming Interface (API) call information of malware was extracted by reverse engineering analysis. Secondly, the malware's function-call graph was established by using a breadth-first traversal algorithm; then, non-user operation sequences were extracted from the function-call graph to form a malicious behavior database. Finally, for malware identification, the similarity between a detected sample and the non-user operation sequences in the malicious behavior database was calculated by using the edit distance algorithm. In the detection of 360 malicious samples and 300 normal samples, the proposed method reached a recall rate of 90.8% and an accuracy rate of 90.3%. Compared with the Android malware detection system Androguard, the recall rate of the proposed method increased by 30 percentage points in the detection of malicious samples; compared with the FlowDroid method, the precision rate increased by 11 percentage points in the detection of normal samples and the recall rate increased by 4.4 percentage points in the detection of malicious samples. The experimental results show that the proposed method improves the recall rate of malware detection and its overall detection effectiveness.
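The similarity step matches a sample's operation sequence against sequences in the malicious-behavior database via edit (Levenshtein) distance. A minimal sketch; the API names below are made up for illustration:

```python
# Classic dynamic-programming Levenshtein distance between two sequences,
# the similarity measure the abstract describes for matching a sample's
# API-call sequence against the malicious-behavior database.

def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution / match
    return dp[m][n]

# Hypothetical API-call sequences, not taken from the paper's database.
sample = ["getDeviceId", "openConnection", "sendTextMessage"]
known_malicious = ["getDeviceId", "sendTextMessage"]
print(edit_distance(sample, known_malicious))  # 1 (one extra call)
```

A sample whose distance to some database sequence falls below a threshold would then be flagged as malicious.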
    Ontology model for detecting Android implicit information flow
    LIU Qiyuan, JIAO Jian, CAO Hongsheng
    2018, 38(1):  61-66.  DOI: 10.11772/j.issn.1001-9081.2017071970
    Abstract | PDF (957KB)
    Concerning the problem that traditional information leakage detection technology cannot effectively detect implicit information leakage in Android applications, a reasoning method for Android Implicit Information Flow (IIF) combining a control structure ontology model and Semantic Web Rule Language (SWRL) inference rules was proposed. Firstly, the key elements that generate implicit information flow in control structures were analyzed and modeled to establish the control structure ontology model. Secondly, based on an analysis of the main causes of implicit information leakage, criterion rules for implicit information flow based on Strict Control Dependence (SCD) were given and converted into SWRL inference rules. Finally, control structure ontology instances and SWRL inference rules were imported into the inference engine Jess for reasoning. The experimental results show that the proposed method can deduce a variety of SCD-based implicit information flows of different natures, that the testing accuracy on the sample set is 83.3%, and that the reasoning time stays within a reasonable interval when the number of branches is limited. The proposed model can effectively assist traditional information leakage detection and improve its accuracy.
    Hierarchical (αij, k, m)-anonymity privacy preservation based on multiple sensitive attributes
    WANG Qiuyue, GE Lina, GENG Bo, WANG Lijuan
    2018, 38(1):  67-72.  DOI: 10.11772/j.issn.1001-9081.2017071863
    Abstract | PDF (1111KB)
    To resist the limitations of anonymizing a single sensitive attribute and the resulting association attacks, an (αij,k,m)-anonymity model based on a greedy algorithm was proposed. Firstly, the (αij,k,m)-anonymity model was designed to protect multi-sensitive-attribute information. Secondly, the model leveled the sensitive attributes according to their sensitive values: if there were m sensitive attributes, there were m tables. Thirdly, each level was assigned a specific value αij by the model. Finally, the (αij,k,m)-anonymity algorithm based on a greedy strategy was designed, and a local optimum method was adopted to implement the model, which improves the degree of data privacy protection. The proposed model was compared with three other models in terms of information loss, execution time and the sensitivity distance of equivalence classes. The experimental results show that, although the execution time of the proposed model is slightly longer than that of the compared models, its information loss is lower and its degree of privacy protection is higher. It can resist association attacks and protect data with multiple sensitive attributes.
    Improvement of differential privacy protection algorithm based on OPTICS clustering
    WANG Hong, GE Lina, WANG Suqing, WANG Liying, ZHANG Yipeng, LIANG Juncheng
    2018, 38(1):  73-78.  DOI: 10.11772/j.issn.1001-9081.2017071944
    Abstract | PDF (988KB)
    A clustering algorithm is used to preprocess personal privacy information in order to achieve differential privacy protection, which can reduce the reconstruction error caused by directly publishing histogram data, as well as the reconstruction error caused by different ways of combining histograms. Aiming at the sensitivity of the DP-DBSCAN (Differential Privacy-Density-Based Spatial Clustering of Applications with Noise) differential privacy algorithm to its input parameters, the density-based OPTICS (Ordering Points To Identify the Clustering Structure) algorithm was applied to differential privacy protection, and an improved differential privacy protection algorithm called DP-OPTICS (Differential Privacy-Ordering Points To Identify the Clustering Structure) was introduced. The sparse dataset was compressed, and two noise-adding approaches, noise of the same variance and noise of different variances, were compared. Considering the probability of the privacy information being broken by an attacker, the upper bound of the privacy parameter ε was determined, which effectively balanced the privacy of sensitive information against the usability of the data. The DP-OPTICS algorithm was compared with the differential privacy protection algorithm based on OPTICS clustering and with the DP-DBSCAN algorithm. DP-OPTICS is between the other two in time consumption, but with the same parameters its stability is the best of the three, so the improved DP-OPTICS differential privacy protection algorithm is generally feasible.
    Recaptured speech detection algorithm based on convolutional neural network
    LI Can, WANG Rangding, YAN Diqun
    2018, 38(1):  79-83.  DOI: 10.11772/j.issn.1001-9081.2017071896
    Abstract | PDF (838KB)
    Aiming at the problem that recaptured speech attacks on speaker recognition systems harm the rights and interests of legitimate users, a recaptured speech detection algorithm based on Convolutional Neural Network (CNN) was proposed. Firstly, the spectrograms of original speech and recaptured speech were extracted and input into the CNN for feature extraction and classification. Secondly, a new network architecture was constructed for the detection task, and the effect of spectrograms with different window shifts was discussed. Finally, cross-over experiments for various recapture and replay devices were conducted. The experimental results demonstrate that the proposed method can accurately discriminate whether the detected speech is recaptured or not, with a recognition rate of 99.26%. Compared with the mute-segment Mel-Frequency Cepstral Coefficient (MFCC) algorithm, the channel mode noise algorithm and the long window scale factor algorithm, the recognition rate is increased by about 26 percentage points, about 21 percentage points and about 0.35 percentage points respectively.
    Information extraction method of financial events based on lexical-semantic pattern
    LUO Ming, HUANG Hailiang
    2018, 38(1):  84-90.  DOI: 10.11772/j.issn.1001-9081.2017071678
    Abstract | PDF (1071KB)
    Information extraction is one of the most important tasks in natural language processing. Aiming at the difficulty of information extraction caused by linguistic diversity, ambiguity and structure, a hierarchical Lexical-Semantic Pattern (LSP) method for the extraction of financial events was proposed. Firstly, a financial event representation model was defined. Secondly, a word-vector method based on deep learning was used to automatically generate a dictionary of synonymous concepts. Finally, hierarchical LSPs based on finite state machines were used to extract various kinds of financial events. The experimental results show that the proposed method can accurately extract various kinds of financial events from financial news text; over the recognition of 26 types of financial events, the micro-average precision is 93.9%, the micro-average recall is 86.9%, and the micro-average F1 value reaches 90.3%.
    Salient object detection algorithm based on multi-task deep convolutional neural network
    YANG Fan, LI Jianping, LI Xin, CHEN Leiting
    2018, 38(1):  91-96.  DOI: 10.11772/j.issn.1001-9081.2017061633
    Abstract | PDF (1057KB)
    Current deep learning-based salient object detection algorithms fail to produce accurate object boundaries, which leaves the regions along object contours blurred and inaccurate. To solve this problem, a salient object detection algorithm based on a multi-task deep learning model was proposed. Firstly, based on a deep Convolutional Neural Network (CNN), a multi-task model was used to separately learn the region and boundary features of a salient object. Secondly, the detected object boundaries were used to produce a number of region candidates, which were then re-ranked and weighted by combining the results of salient region detection. Finally, the entire saliency map was extracted. The experimental results on three widely used benchmarks show that the proposed method achieves better accuracy: in terms of F-measure, it outperforms the compared deep learning-based algorithms by 1.9% on average, while lowering the Mean Absolute Error (MAE) by 12.6%.
    Uncertainty measurement and attribute reduction in incomplete neighborhood rough set
    YAO Sheng, WANG Jie, XU Feng, CHEN Ju
    2018, 38(1):  97-103.  DOI: 10.11772/j.issn.1001-9081.2017061372
    Abstract | PDF (1056KB)
    Focusing on the fact that existing attribute reduction algorithms are not suitable for incomplete data with both numerical and symbolic attributes, an extended incomplete neighborhood rough set model was proposed. Firstly, the distance between missing attribute values was defined so that incomplete data with mixed attributes could be handled, taking the probability distribution of the attribute values into account. Secondly, the concept of neighborhood mixed entropy was defined to evaluate the quality of attribute reduction, the relevant property theorems were proved, and an attribute reduction algorithm for incomplete neighborhood rough sets based on neighborhood mixed entropy was constructed. Finally, seven datasets were selected from the UCI repository for experiments, and the algorithm was compared with the Attribute Reduction of Dependency (ARD), Attribute Reduction of neighborhood Conditional Entropy (ARCE) and Attribute Reduction of Neighborhood Combination Measure (ARNCM) algorithms respectively. The theoretical analysis and experimental results show that, compared to the ARD, ARCE and ARNCM algorithms, the proposed algorithm reduces the number of attributes by about 1, 7 and 0 respectively, and improves the classification accuracy by about 2.5, 2.1 and 0.8 percentage points respectively. The proposed algorithm not only yields smaller attribute reducts, but also achieves higher classification accuracy.
    K-means clustering algorithm based on cluster degree and distance equilibrium optimization
    WANG Rihong, CUI Xingmei
    2018, 38(1):  104-109.  DOI: 10.11772/j.issn.1001-9081.2017071716
    Abstract | PDF (1104KB)
    To deal with the sensitivity of the traditional K-means algorithm to the selection of initial cluster centers, a K-Means clustering algorithm based on Cluster degree and Distance equalization optimization (K-MCD) was proposed. Firstly, initial cluster centers were selected based on the idea of "cluster degree". Secondly, a selection strategy that balances the total distance between cluster centers was applied to obtain the final initial cluster centers. Finally, the text set was vectorized, and the text cluster centers and the evaluation criteria for text clustering were reselected to perform text clustering analysis according to the optimized algorithm. Simulation experiments on text datasets were analyzed in terms of accuracy and stability. Compared with the K-means algorithm, the clustering accuracy of the K-MCD algorithm was improved by 18.6, 17.5, 24.3 and 24.6 percentage points respectively on four text sets, and the variance of the average number of evolutionary generations of the K-MCD algorithm was 36.99 percentage points lower than that of K-means. The experimental results show that the K-MCD algorithm can improve text clustering accuracy with good stability.
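The paper's exact "cluster degree" and distance-equalization formulas are not given in the abstract; a hedged sketch of the general idea is to pick the densest point as the first center and then repeatedly pick the point farthest from the centers chosen so far, so that centers are both dense and well separated. All data below are illustrative:

```python
# Hedged sketch of density-aware center initialization in the spirit of
# the abstract: densest point first ("cluster degree"), then points that
# maximize the minimum distance to already-chosen centers.
import math

def density(p, points, radius):
    """Number of other points within `radius` of p (a simple density proxy)."""
    return sum(1 for q in points if q != p and math.dist(p, q) <= radius)

def init_centers(points, k, radius):
    centers = [max(points, key=lambda p: density(p, points, radius))]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

# Two obvious blobs; a good initializer should put one center in each.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
centers = init_centers(points, k=2, radius=0.5)
print(centers)  # one center per blob
```

Standard K-means iterations would then start from these centers instead of random ones, which is what removes the sensitivity to initialization.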
    Self-training method based on semi-supervised clustering and data editing
    LYU Jia, LI Junnan
    2018, 38(1):  110-115.  DOI: 10.11772/j.issn.1001-9081.2017071721
    Abstract | PDF (885KB)
    To address the problems that the high-confidence unlabeled samples selected by the self-training method contain little information in each iteration, and that the self-training method easily mislabels unlabeled samples, a Naive Bayes self-training method based on semi-supervised clustering and data editing was proposed. Firstly, semi-supervised clustering was used to cluster a small number of labeled samples together with a large number of unlabeled samples, the unlabeled samples with high membership were chosen, and they were then classified by Naive Bayes. Secondly, a data editing technique was used to filter out unlabeled samples with high clustering membership that had been misclassified by Naive Bayes. This data editing technique filters noise by utilizing information from both labeled and unlabeled samples, solving the problem that the performance of traditional data editing techniques may degrade due to a lack of labeled samples. The effectiveness of the proposed algorithm was verified by comparative experiments on UCI datasets.
    Improved multi-objective A* algorithm based on random walk
    LIU Haohan, GUO Jingjing, LI Jianfu, HE Huaiqing
    2018, 38(1):  116-119.  DOI: 10.11772/j.issn.1001-9081.2017071899
    Abstract | PDF (638KB)
    Since the New Approach to Multi-Objective A* combined with a dimensionality reduction technique (NAMOAdr*) algorithm suffers from plateau exploration, a Random Walk assisted NAMOAdr* (RWNAMOAdr*) algorithm was proposed, which invokes a random walk procedure to find an exit (a label whose heuristic value is not dominated by that of the last expanded label) when NAMOAdr* is stuck on a plateau. To determine when the NAMOAdr* algorithm is stuck in plateau exploration, a plateau detection method was proposed: when the heuristic value of the expanded label is dominated by that of the last expanded label for m consecutive times, the NAMOAdr* algorithm is considered to have fallen into plateau exploration. In the experiments, randomly generated grids, a standard test platform for evaluating multi-objective search algorithms, were used. The experimental results reveal that, compared with the NAMOAdr* algorithm, the running time of the RWNAMOAdr* algorithm is reduced by 50.69% on average and its space consumption is reduced by about 10% on average, which provides support for accelerating multi-objective path searching in real life.
    Improved time dependent fast travel time algorithm by dynamically selecting heuristic values
    LI Jiajia, LIU Xiaojing, LIU Xiangyu, XIA Xiufeng, ZHU Rui
    2018, 38(1):  120-125.  DOI: 10.11772/j.issn.1001-9081.2017071670
    The existing TD-FTT (Time Dependent Fast Travel Time) algorithm for answering K Nearest Neighbors (KNN) queries in time-dependent road networks requires that the issuing time and the arrival time of a query fall in the same time interval, and it costs a long time in the preprocessing phase. To solve these problems, an Improved TD-FTT (ITD-FTT) algorithm based on dynamically selecting heuristic values was proposed. Firstly, in the preprocessing phase, a road network Gmin with the minimum cost for each time interval was created by using the time functions of edges. Secondly, a parallel method using a Network Voronoi Diagram (NVD) on the road network Gmin was used to compute the nearest neighbors of nodes to reduce the time cost. Finally, in the query phase, the heuristic value was dynamically selected to remove the time interval limitation by calculating the time interval of the current arrival time of nodes. The experimental results show that in the preprocessing phase, the time cost of ITD-FTT is reduced by 70.12% compared with TD-FTT. In the query phase, the number of traversed nodes of ITD-FTT is 46.52% and 16.63% lower than that of the TD-INE (Time Dependent Incremental Network Expansion) and TD-A (Time Dependent A star) algorithms respectively, and the response time of ITD-FTT is 47.76% and 18.24% lower than that of TD-INE and TD-A. The experimental results indicate that the ITD-FTT algorithm reduces the number of nodes in query expansion, decreases the time of searching the KNN results and improves the query efficiency.
    Dynamic chaotic ant colony system and its application in robot path planning
    LI Juan, YOU Xiaoming, LIU Sheng, CHEN Jia
    2018, 38(1):  126-131.  DOI: 10.11772/j.issn.1001-9081.2017061326
    To balance population diversity and convergence speed when an Ant Colony System (ACS) is applied to robot path planning, a dynamic chaos operator was introduced into the ACS. The core of the dynamic chaotic ACS is that a Logistic chaotic operator is added to the traditional ACS to increase population diversity and improve the quality of solutions. First, the chaotic operator was applied in the early iterations to adjust the global pheromone values on the paths, which increased the population diversity of the algorithm and prevented it from falling into locally optimal solutions. Then, in the later stage, the standard ACS was used to guarantee the convergence speed of the dynamic chaotic ACS. The experimental results show that, for the robot path planning problem, the dynamic chaotic ACS achieves better population diversity than the ACS, with higher solution quality and faster convergence. Compared with the Elitist Ant colony System (EAS) and the rank-based Ant System (ASrank), the dynamic chaotic ACS can balance the relationship between solution quality and convergence speed, and it can find better optimal solutions even in complex obstacle environments, thus improving the efficiency of mobile robot path planning.
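The Logistic chaotic operator mentioned above can be sketched as follows (a minimal illustration; the multiplicative perturbation form and the strength parameter are assumptions, not the paper's exact pheromone update rule):

```python
def logistic(x, mu=4.0):
    """One step of the Logistic chaotic map; mu = 4 gives fully chaotic behaviour."""
    return mu * x * (1.0 - x)

def perturb_pheromone(tau, x, strength=0.1):
    """Perturb a pheromone matrix (list of lists) with chaotic noise in the
    early iterations to preserve population diversity; returns (new_tau, new_x)
    so the chaotic state x can be carried across iterations."""
    new_tau = []
    for row in tau:
        new_row = []
        for t in row:
            x = logistic(x)
            # x in [0, 1] maps to a multiplicative factor in [1-strength, 1+strength]
            new_row.append(t * (1.0 + strength * (2.0 * x - 1.0)))
        new_tau.append(new_row)
    return new_tau, x
```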
    Polarimetric SAR image feature selection and multi-layer SVM classification using divisibility index
    LI Ping, XU Xin, DONG Hao, DENG Xu
    2018, 38(1):  132-136.  DOI: 10.11772/j.issn.1001-9081.2017071719
    Separability Index (SI) can be used to select effective classification features, but in the case of multi-dimensional features and well-separated ground objects, feature selection based on the separability index alone cannot effectively remove redundancy. Therefore, a method of feature selection and multi-layer Support Vector Machine (SVM) classification using the separability index and the Sequential Backward Selection (SBS) algorithm was proposed. Firstly, the classification objects and features were determined according to the SIs of all the ground objects under all the features, and then, based on the classification accuracies of the objects, the SBS algorithm was used to refine the selected features. Secondly, the features of the next ground objects were determined in turn by the separability indexes of the remaining objects and the SBS algorithm. Finally, the multi-layer SVM was used for classification. The experimental results show that the classification accuracy of the proposed method is improved by 2% compared with the multi-layer SVM classification method whose features are selected only based on the SI, the classification accuracy of all kinds of objects is higher than 86%, and the running time is half that of the original method.
    Chaotic crow search algorithm based on differential evolution strategy for solving discount {0-1} knapsack problem
    LIU Xuejing, HE Yichao, LU Fengjia, WU Congcong, CAI Xiufeng
    2018, 38(1):  137-145.  DOI: 10.11772/j.issn.1001-9081.2017061445
    Instances of the Discount {0-1} Knapsack Problem (D{0-1}KP) in which the weight coefficients and the value coefficients range over a large interval are difficult to solve by deterministic algorithms. To solve this problem, a Chaotic Crow Search Algorithm based on Differential Evolution strategy (DECCSA) was proposed. Firstly, the initial crow population was generated by chaotic mapping. Secondly, mixed coding and a Greedy Repair and Optimization Strategy (GROS) were used to solve the coding problem of D{0-1}KP. Finally, a Differential Evolution (DE) strategy was introduced to improve the convergence rate of the algorithm. The experimental results on four large-scale D{0-1}KP instances show that DECCSA outperforms the Genetic Algorithm (GA), the bacterial foraging optimization algorithm and the mutated bat algorithm, and it can obtain the optimal solution or a near-optimal solution, making it well suited to solving D{0-1}KP.
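The greedy repair-and-optimize idea behind GROS can be illustrated by the following simplified sketch, assuming each D{0-1}KP item group is encoded as a list of (weight, value) options with at most one option chosen per group; this is a hedged reconstruction for illustration, not the authors' implementation:

```python
def greedy_repair(selection, groups, capacity):
    """'groups' is a list of option lists [(weight, value), ...]; 'selection[i]'
    is the chosen option index in group i, or -1 for none.  Infeasible picks
    with the worst value/weight ratio are dropped until the capacity holds,
    then the densest feasible option of each empty group is greedily re-added."""
    sel = list(selection)

    def total_weight():
        return sum(groups[i][s][0] for i, s in enumerate(sel) if s >= 0)

    # repair: remove the worst-density pick while the capacity is exceeded
    while total_weight() > capacity:
        worst = min((i for i, s in enumerate(sel) if s >= 0),
                    key=lambda i: groups[i][sel[i]][1] / groups[i][sel[i]][0])
        sel[worst] = -1
    # optimize: greedily add the densest option of each empty group if it fits
    for i, s in enumerate(sel):
        if s < 0:
            best = max(range(len(groups[i])),
                       key=lambda j: groups[i][j][1] / groups[i][j][0])
            if total_weight() + groups[i][best][0] <= capacity:
                sel[i] = best
    return sel
```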
    Data compression and spatial indexing technology for massive 3D point cloud
    ZHAO Erping, LIU Wei, DANG Hong'en
    2018, 38(1):  146-151.  DOI: 10.11772/j.issn.1001-9081.2017061489
    Concerning the problems that the compression and spatial indexing of point cloud data in 3D models are inefficient and that two adjacent query windows are very likely to overlap during roaming, the methods of Adjacent Point Difference Progressive Compression (APDPC) and R-tree spatial indexing that remove redundancy by trimming overlapped regions were proposed. Firstly, the 3D model was spatially subdivided by an octree, the point cloud data managed by each leaf node was sorted by Morton code, the data was partitioned according to the outer-cube size of the R-tree leaf nodes, the differences between adjacent points in each block were calculated and progressively compressed block by block, and the data blocks were read in batches to create the R-tree. Secondly, the valid range of each query was calculated from the scope of the previous query window. Finally, a query method for point cloud data based on the R-tree index was given. The proposed method improved the compression rate of point cloud data by 26.61 percentage points and supported streaming transmission; meanwhile, it effectively reduced I/O overhead, improved query performance by 35.44%, and reduced data redundancy by 16.49 percentage points. The experimental results show that the proposed methods have obvious advantages in 3D virtual tourism, digital city and other systems.
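The Morton-code ordering used inside each octree leaf can be sketched as the classic bit-interleaving computation (shown here for 10-bit coordinates; the bit width is an assumption):

```python
def part1by2(n):
    """Spread the bits of a 10-bit integer so they occupy every third bit."""
    n &= 0x3FF
    n = (n | (n << 16)) & 0xFF0000FF
    n = (n | (n << 8)) & 0x0300F00F
    n = (n | (n << 4)) & 0x030C30C3
    n = (n | (n << 2)) & 0x09249249
    return n

def morton3d(x, y, z):
    """Interleave three 10-bit coordinates into a 30-bit Morton (Z-order) code,
    so that sorting by the code clusters spatially adjacent points."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)
```

Sorting points by `morton3d` before blocking keeps adjacent points close together, which is what makes the adjacent-point differences small and compressible.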
    Coupling similarity-based approach for categorizing spatial database query results
    BI Chongchun, MENG Xiangfu, ZHANG Xiaoyan, TANG Yanhuan, TANG Xiaoliang, LIANG Haibo
    2018, 38(1):  152-158.  DOI: 10.11772/j.issn.1001-9081.2017051219
    A common spatial query often leads to multiple query results because a spatial database usually contains a large amount of data. To deal with this problem, a new categorization approach for spatial database query results was proposed. The solution consists of two steps. In the offline step, the coupling relationship between spatial objects was evaluated by considering both the location proximity and the semantic similarity between them, and then a set of clusters over the spatial objects was generated by using a probability density-based clustering method, where each cluster represented one type of user requirement. In the online query step, for a given spatial query, a category tree was dynamically generated for the user by applying a modified C4.5 decision tree algorithm over the clusters, so that the user could easily select the subset of query results matching his/her needs by exploring the labels assigned to the intermediate nodes of the tree. The experimental results demonstrate that the proposed spatial object clustering method can efficiently capture both the semantic and location relationships between spatial objects, and that the query result categorization algorithm achieves good effectiveness with low search cost.
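The offline coupling evaluation can be illustrated by a simple weighted combination of location proximity and semantic similarity (the exponential distance decay, Jaccard similarity over keyword sets and the weight alpha are illustrative assumptions, not the paper's exact formulas):

```python
import math

def location_proximity(p, q, scale=1.0):
    """Distance-decayed proximity of two points (x, y), mapped into (0, 1]."""
    return math.exp(-math.dist(p, q) / scale)

def coupling_similarity(p, q, sp, sq, alpha=0.5, scale=1.0):
    """Weighted combination of location proximity and semantic similarity;
    sp and sq are keyword sets describing the two spatial objects."""
    semantic = len(sp & sq) / len(sp | sq) if sp | sq else 0.0
    return alpha * location_proximity(p, q, scale) + (1 - alpha) * semantic
```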
    Massive data analysis of power utilization based on improved K-means algorithm and cloud computing
    ZHANG Chengchang, ZHANG Huayu, LUO Jianchang, HE Feng
    2018, 38(1):  159-164.  DOI: 10.11772/j.issn.1001-9081.2017071660
    To cope with such difficulties as low mining efficiency and the large amount of data faced in mining residential electricity consumption data, massive power utilization data analysis based on an improved K-means algorithm and cloud computing was studied. Since the initial cluster centers and the value of K are difficult to determine in the traditional K-means algorithm, an improved K-means algorithm based on density was proposed. Firstly, the product of the sample density, the reciprocal of the average distance between samples in a cluster, and the distance between clusters was defined as the weight product, and the initial centers were determined successively according to the maximum-weight-product rule, which improved clustering accuracy. Secondly, the improved K-means algorithm was parallelized based on the MapReduce model, which improved clustering efficiency. Finally, a mining experiment on massive power utilization data was carried out on the electricity data of 400 households. Taking a family as a unit, such features as the electricity consumption rate during peak hours, the load rate, the valley load coefficient and the percentage of power utilization during normal hours were calculated, and feature vectors were established to cluster similar user types; the behavioral characteristics of each type of user were then analyzed. The experimental results on a Hadoop cluster show that the improved K-means algorithm operates stably and efficiently and can achieve a good clustering effect.
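The maximum-weight-product selection of initial centers might be sketched as follows (a speculative, non-parallel illustration only; the neighbourhood radius and the exact form of the weight product are assumptions):

```python
import math

def weight_product_centers(points, k, radius):
    """Pick k initial centres by a maximum-weight-product rule:
    weight = local density * (1 / mean distance to neighbours)
             * distance to the already-chosen centres."""
    def dist(p, q):
        return math.dist(p, q)

    centers = []
    for _ in range(k):
        best, best_w = None, -1.0
        for p in points:
            neigh = [q for q in points if q != p and dist(p, q) <= radius]
            if not neigh:
                continue  # isolated point: never a good initial centre
            density = len(neigh)
            mean_d = sum(dist(p, q) for q in neigh) / len(neigh)
            sep = min((dist(p, c) for c in centers), default=1.0)
            w = density * (1.0 / mean_d) * sep
            if w > best_w and p not in centers:
                best, best_w = p, w
        centers.append(best)
    return centers
```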
    General bound estimation method for pattern measures over uncertain datasets
    WANG Ju, LIU Fuxian, JIN Chunjie
    2018, 38(1):  165-170.  DOI: 10.11772/j.issn.1001-9081.2017061582
    Concerning the problem of bound estimation for pattern measures in constraint-based pattern mining, a general bound estimation method for pattern measures over uncertain datasets was proposed. According to the characteristics of uncertain transaction datasets with weights, a general estimation framework for common pattern measures was first designed. Secondly, a fast estimation method for the upper bound of pattern measures under the designed framework was provided. Finally, two commonly used pattern measures were introduced to verify the feasibility of the proposed method. In the experiments, the runtime and memory usage of the Potential High-Utility Itemsets UPper-bound-based mining (PHUI-UP) algorithm were compared under the transaction-weighted utilization bound, the proposed upper bound and the actual upper bound. The experimental results show that the proposed method can estimate the upper bound of pattern measures with less memory usage and runtime.
    Compression method for trajectory data based on prediction model
    CHEN Yu, JIANG Wei, ZHOU Ji'en
    2018, 38(1):  171-175.  DOI: 10.11772/j.issn.1001-9081.2017061411
    A Compression method for Trajectory data based on Prediction Model (CTPM) was proposed to improve the compression efficiency of massive trajectory data in road network environments. The temporal and spatial information of the trajectory data were compressed separately, so that the compressed trajectory data was lossless in the spatial dimension and had bounded error in the temporal dimension. In terms of space, the Prediction by Partial Matching (PPM) algorithm was used to predict the possible next position from the partial trajectory already driven, and the correctly predicted road segments were deleted to reduce the storage cost. In terms of time, a statistical traffic speed model for different time intervals was constructed according to the periodic features of traffic conditions to predict the time required for a moving object to enter the next road segment, and compression was performed by deleting the road segment information whose prediction error was smaller than a given threshold. In comparison experiments with the Paralleled Road-network-based trajectory comprESSion (PRESS) algorithm, the average compression ratio of CTPM was increased by 43% in space and 1.5% in time, and the temporal error was decreased by 9.5%. The experimental results show that the proposed algorithm can effectively reduce the compression time and compression error while improving the compression ratio.
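The PPM-based spatial prediction step can be approximated by a small context-counting predictor with back-off (a simplified stand-in for full PPM; the order k and the training interface are assumptions). If the predictor's top choice matches the actual next segment, that segment need not be stored:

```python
from collections import defaultdict

class SegmentPredictor:
    """Counts road-segment transitions conditioned on the last k segments."""
    def __init__(self, k=2):
        self.k = k
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, trajectory):
        for i in range(len(trajectory) - 1):
            ctx = tuple(trajectory[max(0, i - self.k + 1):i + 1])
            self.counts[ctx][trajectory[i + 1]] += 1

    def predict(self, context):
        ctx = tuple(context[-self.k:])
        # back off to shorter contexts, as PPM does, when the context is unseen
        while ctx and ctx not in self.counts:
            ctx = ctx[1:]
        if ctx not in self.counts:
            return None
        nxt = self.counts[ctx]
        return max(nxt, key=nxt.get)
```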
    Incremental frequent pattern mining algorithm for privacy-preserving
    ZHANG Yaling, WANG Ting, WANG Shangping
    2018, 38(1):  176-181.  DOI: 10.11772/j.issn.1001-9081.2017061617
    Aiming at the problems that most frequent pattern mining algorithms for privacy preserving scan the database multiple times and compare each record many times for counting, an Incremental Bitmap-based Randomized Response with Partial Hiding (IBRRPH) algorithm was proposed. Firstly, the bitmap technique was used to represent the transactions in the database, and the bitwise "and" operator was used to speed up support computation. Secondly, an incremental update model was introduced by analyzing incremental access relationships, so that previous mining results were reused to the maximum extent during incremental updating. Performance comparison experiments against the algorithm proposed by Gu et al. (GU C, ZHU B P, ZHANG J K. Improved algorithm of privacy preserving association rule mining. Journal of Nanjing University of Aeronautics & Astronautics, 2015, 47(1):119-124) were carried out with increments ranging from 1000 to 40000. The experimental results show that the efficiency of the IBRRPH algorithm is improved by more than 21% compared with the algorithm proposed by Gu et al.
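The bitmap representation and bitwise-AND support counting can be sketched with Python integers as bitmaps (illustrative only; the randomized-response and partial-hiding parts of IBRRPH are omitted):

```python
def build_bitmaps(transactions):
    """Represent each item's occurrences as one integer bitmap:
    bit i is set iff transaction i contains the item."""
    bitmaps = {}
    for i, t in enumerate(transactions):
        for item in t:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps

def support(itemset, bitmaps, n):
    """Support count via bitwise AND of the item bitmaps, replacing the
    repeated record-by-record comparisons of a database rescan."""
    acc = (1 << n) - 1
    for item in itemset:
        acc &= bitmaps.get(item, 0)
    return bin(acc).count("1")
```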
    k-CS algorithm: trajectory data privacy-preserving based on semantic location protection
    HUO Zheng, CUI Honglei, HE Ping
    2018, 38(1):  182-187.  DOI: 10.11772/j.issn.1001-9081.2017071676
    Since data utility is sharply reduced after privacy-preserving processing and several attack models, such as semantic location attacks and maximum moving speed attacks, cannot be resisted by traditional algorithms, a trajectory privacy-preserving algorithm based on semantic location protection under road network constraints, called the k-CS (k-Connected Sub-graph) algorithm, was proposed. Firstly, two attack models in road network space were formulated. Secondly, the privacy problem of semantic trajectories was defined as the k-CS anonymity problem, which was proven to be NP-hard. Finally, an approximation algorithm was proposed to cluster nodes in the road network into anonymity zones, and semantic locations were replaced with the corresponding anonymity zones. Experiments were conducted to compare the proposed algorithm with the classical (k,δ)-anonymity algorithm. The experimental results show that the k-CS algorithm outperforms the (k,δ)-anonymity algorithm in data utility, query error and runtime: the k-CS algorithm reduces information loss by about 20% and runtime by about 10% compared with the (k,δ)-anonymity algorithm.
    Malicious scanning protection technology based on OpenDayLight
    WU Ruohao, DONG Ping, ZHENG Tao
    2018, 38(1):  188-193.  DOI: 10.11772/j.issn.1001-9081.2017061527
    Aiming at the problem that Distributed Denial of Service (DDoS) attacks are difficult to detect and defend against before damage is done, a Control Real-time Defense Mechanism (CRDM) against malicious scanning based on Software Defined Network (SDN) was proposed. Firstly, the advantages of SDN over traditional networks in network layer protection were analyzed. Secondly, targeting the malicious scanning stage of network attacks, a CRDM for defending against malicious scanning was proposed. In CRDM, the Representational State Transfer (REST) APIs (Application Program Interfaces) provided by OpenDayLight (ODL) were used to build an external application that achieves detection, determination and blocking at the switch port. Finally, CRDM was implemented on the ODL platform, and the detection and defense scheme for malicious scanning was tested. The simulation results show that when a port is maliciously scanning the network, CRDM can disable the port in time and protect against malicious scanning attacks in real time, so that the destructive behavior of a DDoS attack is prevented before it starts.
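The port-level detection step could look like the following sliding-window counter (purely illustrative; the thresholds and the "disable the port" action are assumptions, and no actual ODL REST call is shown):

```python
import time
from collections import defaultdict

class ScanDetector:
    """Flags a switch port as scanning when it touches more than 'max_targets'
    distinct destination hosts within 'window' seconds."""
    def __init__(self, max_targets=20, window=5.0):
        self.max_targets = max_targets
        self.window = window
        self.events = defaultdict(list)  # port -> [(timestamp, dst), ...]

    def observe(self, port, dst, now=None):
        now = time.monotonic() if now is None else now
        self.events[port].append((now, dst))
        # keep only events inside the sliding window
        self.events[port] = [(t, d) for t, d in self.events[port]
                             if now - t <= self.window]
        targets = {d for _, d in self.events[port]}
        return len(targets) > self.max_targets  # True: disable the port
```

An external application would poll flow statistics, feed them to `observe`, and push a port-blocking flow rule when it returns True.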
    Cloud data assured deletion scheme based on key distribution and ciphertext sampling
    WANG Minshen, XIONG Jinbo, LIN Qian, WANG Lili
    2018, 38(1):  194-200.  DOI: 10.11772/j.issn.1001-9081.2017071751
    If cloud data is not deleted in time after expiration, it may lead to unauthorized access and privacy leakage. To address this issue, a cloud data assured deletion scheme based on key distribution and ciphertext sampling was proposed, composed of an encryption algorithm and a Distributed Hash Table (DHT) network. Firstly, the plaintext was encrypted into ciphertext, the ciphertext was sampled by a random sampling algorithm, and the resulting incomplete ciphertext was uploaded to the cloud. Secondly, the trust value of each node in the DHT network was evaluated, the encryption key was divided into subkeys by the Shamir secret sharing algorithm, and the subkeys were distributed to the nodes with high trust degrees. Finally, the encryption key was automatically deleted through the periodic self-updating function of the DHT network, and the ciphertext in the cloud was overwritten by uploading random data through the Hadoop Distributed File System (HDFS) interface. Assured deletion of cloud data was thus achieved by deleting both the encryption key and the ciphertext. The security analysis and performance analysis demonstrate that the proposed scheme is secure and efficient.
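The key distribution step relies on Shamir secret sharing, which can be sketched over a prime field as follows (a textbook implementation for illustration; the field size and share interface are assumptions, and a production scheme would use a vetted cryptographic library):

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime large enough for a 128-bit key

def split_secret(secret, n, k, rng=random):
    """Split 'secret' into n Shamir shares, any k of which reconstruct it."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over GF(PRIME)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Deleting any n-k+1 shares from the DHT nodes makes the key, and hence the sampled ciphertext, unrecoverable.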
    Detection of SQL injection behaviors for PHP applications
    ZHOU Ying, FANG Yong, HUANG Cheng, LIU Liang
    2018, 38(1):  201-206.  DOI: 10.11772/j.issn.1001-9081.2017071692
    The SQL (Structured Query Language) injection attack is a major threat to Web applications. Aiming at SQL injection behaviors in PHP (Hypertext Preprocessor) applications, a model for detecting SQL injection behaviors based on taint analysis was proposed. Firstly, the SQL statement was obtained when an SQL function was executed, and the identity information of the attacker was recorded through the PHP extension mechanism; based on this information, request logs were generated and used as the analysis source. Secondly, SQL parsing with taint marking was achieved based on SQL grammar analysis and the abstract syntax tree, and multiple features reflecting SQL injection behaviors were extracted by taint analysis. Finally, the random forest algorithm was used to identify malicious SQL requests. The experimental results indicate that the proposed model achieves an accuracy of 96.9%, which is 7.2 percentage points higher than that of regular-expression matching detection. The information acquisition module of the proposed model can be loaded as an extension into any PHP application; therefore, it is portable and applicable to security auditing and attack tracing.
    Controller deployment strategy based on delay optimization in software defined network
    FAN Zifu, YAO Jie, YANG Xianhui
    2018, 38(1):  207-211.  DOI: 10.11772/j.issn.1001-9081.2017071681
    Most controller deployment schemes in Software Defined Network (SDN) focus on the propagation delay under the normal network state, ignoring the effect of link failure states on delay. To solve this problem, a controller deployment scheme based on delay optimization was proposed. Firstly, based on the problem of minimizing the worst-case delay under the normal network state and multiple single-link failure states, the network state delay was taken as the new delay optimization objective to establish a controller deployment model. Secondly, two heuristic deployment algorithms were proposed to solve the above model: the Controller Placement Algorithm based on Greedy Algorithm (GA-CPA) and the Controller Placement Algorithm based on Particle Swarm Optimization (PSO-CPA). Finally, several real network topologies and datasets were chosen to measure the performance of the solutions. The simulation results show that the GA-CPA and PSO-CPA algorithms reduce the network state delay to different degrees, thus ensuring that the worst-case delay in most network states remains within a low range.
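The greedy flavour of GA-CPA can be illustrated by the following sketch, which repeatedly adds the controller site that most reduces the worst node-to-nearest-controller delay (an assumption-laden simplification that considers only one delay matrix and ignores link failure states):

```python
def greedy_placement(delay, k):
    """Greedy k-controller placement over a symmetric delay matrix:
    delay[i][j] is the propagation delay between nodes i and j.
    Returns the chosen sites and the resulting worst-case delay."""
    n = len(delay)
    placed = []

    def worst(cands):
        # worst delay from any node to its nearest candidate controller
        return max(min(delay[i][c] for c in cands) for i in range(n))

    for _ in range(k):
        best = min((c for c in range(n) if c not in placed),
                   key=lambda c: worst(placed + [c]))
        placed.append(best)
    return placed, worst(placed)
```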
    Statistical quality of service driven optimal power allocation in energy harvesting based wireless networks
    GAO Ya, ZHANG Hailin, LU Xiaofeng
    2018, 38(1):  212-216.  DOI: 10.11772/j.issn.1001-9081.2017071720
    An optimal power allocation scheme for energy harvesting based wireless networks was proposed, aiming to maximize the effective capacity under a statistical delay-bounded Quality of Service (QoS) constraint. First, the effective capacity maximization problem was formulated for energy harvesting wireless networks. Next, the problem was solved based on convex optimization theory. By analyzing the solution, it was found that when the QoS constraint is very loose, the optimal power allocation scheme converges to the water-filling scheme, while when the QoS constraint is very stringent, it converges to the channel inversion scheme. Then, the developed optimal power allocation scheme was theoretically analyzed by deriving the channel outage probability. Finally, the performance of the optimal power allocation scheme was tested. The experimental results show that the proposed scheme can obtain larger effective capacity than existing schemes.
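The water-filling scheme that the optimal allocation converges to under a loose QoS constraint can be sketched as follows (a standard bisection implementation over parallel channel gains, shown for illustration):

```python
def water_filling(gains, total_power, eps=1e-9):
    """Classic water-filling: allocate p_i = max(0, mu - 1/g_i) with the
    water level mu chosen by bisection so the powers sum to total_power."""
    lo, hi = 0.0, total_power + max(1.0 / g for g in gains)
    while hi - lo > eps:
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / g) for g in gains)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    return [max(0.0, lo - 1.0 / g) for g in gains]
```

Note how a channel with a very poor gain receives no power at all, the opposite of channel inversion, which spends most power on the worst channel.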
    Cluster-based resource allocation scheme in dense femtocell network
    JIN Yong, GONG Shengli
    2018, 38(1):  217-221.  DOI: 10.11772/j.issn.1001-9081.2017061512
    To mitigate femtocell downlink interference in densely deployed femtocell networks, a cluster-based resource allocation scheme was proposed. Firstly, Fractional Frequency Reuse (FFR) was used to divide all cells into different spatial regions, which suppressed the co-tier interference among macrocells and the cross-tier interference between the macrocell and femtocells. Secondly, all femtocells were divided into disjoint clusters based on graph theory and convex optimization theory, and then a channel allocation algorithm based on rate fairness was adopted to assign sub-channels to the femtocell user equipment, limiting the co-tier interference among femtocells. Finally, a distributed power control algorithm was used to adjust the power of the femtocells dynamically to improve system performance. The simulation results show that, compared with the conventional clustering method, the proposed algorithm improves the Signal-to-Interference plus Noise Ratio (SINR) and throughput, and reduces the probability of the system throughput falling below 4 Mb/s to 30%. Meanwhile, compared with the non-grouping algorithm, the fairness of the proposed algorithm is improved by 12%, giving users higher satisfaction.
    Multi-granularity topology-based stepwise refinement provenance method for wireless sensor networks
    KANG Zhaoling, XU Qinbao, WANG Changda
    2018, 38(1):  222-227.  DOI: 10.11772/j.issn.1001-9081.2017061627
    Focusing on the problem that session-based provenance schemes require all provenance sessions to be received correctly by the Base Station (BS) before decoding, which decreases their robustness, a stepwise-refinement provenance scheme was proposed. Firstly, a large Wireless Sensor Network (WSN) topology was divided, through quotient space dividing theory, into different granularities composed of a series of abstract nodes. Then, a dictionary-based provenance scheme was used to transmit provenance at the different granularity layers, so that the BS reconstructed the provenance from the coarse-grained layer to the fine-grained one and judged whether to discard the data according to the decoding results of the coarse-grained layer. The theoretical analysis, simulations and experimental results show that, in comparison with traditional provenance schemes, the average compression ratio of the proposed method is improved by 51.8% and its energy consumption is reduced by 50.5%.
    Time-frequency combination de-noising algorithm based on orthogonal frequency division multiplexing/offset quadrature amplitude modulation in power line communication system
    ZHENG Jianhong, ZHANG Heng, LI Fei, LI Xiang, DENG Zhan
    2018, 38(1):  228-232.  DOI: 10.11772/j.issn.1001-9081.2017071727
    Focusing on the issue that impulse noise in Power Line Communication (PLC) systems greatly degrades transmission performance and that most traditional de-noising algorithms cannot effectively suppress it, a time-frequency combined de-noising algorithm was proposed. Firstly, the impulse noise with large peaks in the time-domain received signal was detected and zeroed by selecting an appropriate threshold. Secondly, according to the symbols already decided in the frequency domain, the smaller impulse noise not eliminated in the time domain was reconstructed, and the accuracy of the noise reconstruction was improved by iteration. Finally, the reconstructed impulse noise was subtracted from the frequency-domain received signal. Simulation experiments were conducted over a power line multipath channel. Compared with traditional time-domain and frequency-domain de-noising algorithms, the proposed algorithm achieves performance gains of 2 dB and 0.5 dB respectively at a bit-error rate of 0.01, and the performance gap grows as the bit-error rate decreases. The simulation results show that the proposed time-frequency combined de-noising algorithm can improve the robustness of PLC systems against impulse noise.
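The first, time-domain stage (threshold detection and zeroing of large peaks) can be sketched as follows (the threshold is assumed to be given; the paper's rule for selecting it is not reproduced):

```python
def blank_impulses(samples, threshold):
    """Zero the received samples whose magnitude exceeds 'threshold';
    complex baseband samples are compared by absolute value."""
    return [0 if abs(s) > threshold else s for s in samples]
```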
    Image feature extraction method based on improved nonnegative matrix factorization with universality
    JIA Xu, SUN Fuming, LI Haojie, CAO Yudong
    2018, 38(1):  233-237.  DOI: 10.11772/j.issn.1001-9081.2017061394
    To improve the universality of image feature extraction, an image feature extraction method based on improved Nonnegative Matrix Factorization (NMF) was proposed. Firstly, considering the practical significance of the extracted image features, the NMF model was used to reduce the dimension of the image feature vectors. Secondly, in order to represent an image with a small number of coefficients, a sparsity constraint was added to the NMF model as one regularization term. Then, to give the optimized features better inter-class discrimination, a clustering property constraint was added to the NMF model as another regularization term. Finally, by optimizing the model with the gradient descent method, the best feature basis vectors and image feature vectors were acquired. The experimental results on three image databases show that the features extracted by the improved NMF model are more conducive to correct image classification or identification, with the False Accept Rate (FAR) and False Reject Rate (FRR) reduced to 0.021 and 0.025 respectively.
    Fence-like occlusion detection algorithm using super-pixel segmentation and graph cuts
    LIU Yu, JIN Weizheng, FAN Ci'en, ZOU Lian
    2018, 38(1):  238-245.  DOI: 10.11772/j.issn.1001-9081.2017071722
    Due to the limited angle of photography, some natural images are obscured by fence-like occlusions such as barbed wire, fences and glass joints. A novel fence-like occlusion detection algorithm was proposed to repair such images. Firstly, aiming at the drawbacks of existing fence detection algorithms that rely on single-pixel color features and fixed shape features, the image was divided into super-pixels and a joint color-texture feature was introduced to describe the super-pixel blocks. Thus a pixel classification problem was converted into a super-pixel classification problem, which suppressed the misclassification caused by local color changes. Secondly, the super-pixel blocks were classified by the graph cuts algorithm so that the mesh structure could extend along smooth edges without being restricted to a fixed shape, which improved the detection accuracy of irregularly shaped fence structures and avoided the manual input required by the algorithm of Farid et al. (FARID M S, MAHMOOD A, GRANGETTO M. Image de-fencing framework with hybrid inpainting algorithm. Signal, Image and Video Processing, 2016, 10(7): 1193-1201). Then, the new joint features were used to train a Support Vector Machine (SVM) classifier, which classified all remaining unclassified super-pixel blocks to obtain an accurate fence mask. Finally, the SAIST (Spatially Adaptive Iterative Singular-value Thresholding) inpainting algorithm was used to repair the image. In the experiments, the obtained fence mask retained more detail than that of the algorithm of Farid et al., and with the same inpainting algorithm the restoration effect was significantly improved. Using the same fence mask, images restored by the SAIST algorithm are 3.06 higher in Peak Signal-to-Noise Ratio (PSNR) and 0.02 higher in Structural SIMilarity (SSIM) than those restored by the algorithm of Farid et al.
When the SAIST inpainting algorithm was combined with the proposed fence detection algorithm, the overall repair results were significantly better than those of the algorithm of Farid et al. and the algorithm of Liu et al. (LIU Y Y, BELKINA T, HAYS J H, et al. Image de-fencing. Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2008: 1-8). The experimental results show that the proposed algorithm improves the detection accuracy of the fence mask and thus yields better reconstruction of fence-removed images.
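    The PSNR figures quoted above can be reproduced with the standard definition of the metric; a minimal Python sketch (not the authors' code):

```python
import numpy as np

def psnr(reference, restored, peak=255.0):
    """Peak Signal-to-Noise Ratio between a reference image and a restored one."""
    err = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if err == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / err)
```

A higher PSNR (in dB) indicates a restoration closer to the fence-free reference.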
    Improved algorithm of image super resolution based on residual neural network
    WANG Yining, QIN Pinle, LI Chuanpeng, CUI Yuhao
    2018, 38(1):  246-254.  DOI: 10.11772/j.issn.1001-9081.2017061461
    Abstract   PDF (1533KB)
    References | Related Articles | Metrics
    To efficiently improve the effect of image Super Resolution (SR), a multi-stage cascade residual convolutional neural network model was proposed. Firstly, a two-stage SR image reconstruction method was used: a 2-times SR image was reconstructed first, and the 4-times SR image was then reconstructed from it. Secondly, residual layers and skip connections were used in both stages to predict the texture information of the high-resolution space, and deconvolution layers were used to reconstruct the 2-times and 4-times SR images. Finally, two multi-task loss functions were constructed from the results of the two stages, and the loss of the first stage guided that of the second one, which accelerated the network training and enhanced the supervision of the network learning. The experimental results show that compared with the bilinear algorithm, bicubic algorithm, Super Resolution using Convolutional Neural Network (SRCNN) algorithm and Fast Super Resolution Convolutional Neural Network (FSRCNN) algorithm, the proposed model can better reconstruct the details and texture of images, avoids over-smoothing after iteration, and achieves higher Peak Signal-to-Noise Ratio (PSNR) and Mean Structural SIMilarity (MSSIM).
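    The two-stage supervision can be illustrated with a toy multi-task loss; `guide_weight` is a hypothetical balancing coefficient, not a parameter from the paper:

```python
import numpy as np

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def cascade_loss(sr_2x, hr_2x, sr_4x, hr_4x, guide_weight=0.5):
    """Multi-task loss for a two-stage cascade: the first-stage (2x) loss
    supervises the intermediate output while the second-stage (4x) loss
    drives the final reconstruction."""
    loss_2x = mse(sr_2x, hr_2x)   # guidance from the intermediate 2x output
    loss_4x = mse(sr_4x, hr_4x)   # final 4x reconstruction loss
    return guide_weight * loss_2x + loss_4x
```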
    Disparity map generation technology based on convolutional neural network
    ZHU Junpeng, ZHAO Hongli, YANG Haitao
    2018, 38(1):  255-259.  DOI: 10.11772/j.issn.1001-9081.2017071659
    Abstract   PDF (1010KB)
    References | Related Articles | Metrics
    Focusing on issues such as high cost, long time consumption and background holes in disparity maps in naked-eye 3D applications, a learning and prediction algorithm based on Convolutional Neural Network (CNN) was introduced. Firstly, the statistical regularities of a dataset were learned through training. Secondly, a depth map with continuous depth values was obtained by feeding the left view into the CNN for feature extraction and prediction. Finally, the right view was produced by superimposing stereo pairs rendered from the predicted depth map and the original view. The simulation results show that the pixel-wise reconstruction error of the proposed algorithm is 12.82% and 10.52% lower than that of the 3D horizontal disparity algorithm and the depth image-based rendering algorithm, respectively. In addition, the problems of background holes and background adhesion are greatly alleviated. The experimental results show that CNN can improve the image quality of disparity maps.
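    The final view-synthesis step is in the spirit of depth-image-based rendering; a simplified sketch that warps a left view by an integer disparity map (the paper's CNN is not reproduced here), with unfilled pixels left as the background holes mentioned above:

```python
import numpy as np

def render_right_view(left, disparity, hole_value=0):
    """Synthesize a right view by shifting each left-view pixel horizontally
    by its (integer) disparity; unmapped pixels remain as background holes."""
    h, w = left.shape[:2]
    right = np.full_like(left, hole_value)
    for y in range(h):
        for x in range(w):
            xr = x - int(disparity[y, x])
            if 0 <= xr < w:
                right[y, xr] = left[y, x]
    return right
```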
    Night-time vehicle detection based on Gaussian mixture model and AdaBoost
    CHEN Yan, YAN Teng, SONG Junfang, SONG Huansheng
    2018, 38(1):  260-263.  DOI: 10.11772/j.issn.1001-9081.2017071763
    Abstract   PDF (819KB)
    References | Related Articles | Metrics
    Focusing on the issue that the accuracy of night-time vehicle detection is relatively low, a method of accurately detecting night-time vehicles was proposed, which constructs a Gaussian Mixture Model (GMM) for the geometric relationship of headlights and an AdaBoost (Adaptive Boosting) classifier trained on inversely projected vehicle samples. Firstly, the inverse projection plane was set according to the spatial position of the headlights in the traffic scene, and the headlight areas were roughly located by image preprocessing. Secondly, the geometric relationship of the headlights in the inversely projected images was used to construct the GMM, and the headlights were initially matched. Finally, the vehicles were detected by the AdaBoost classifier trained on inversely projected vehicle samples. In comparison experiments with an AdaBoost classifier trained on the original images, the proposed method increased the detection rate by 1.93%, decreased the omission ratio by 17.83%, and decreased the false detection rate by 27.61%. Compared with the D-S (Dempster-Shafer) evidence theory method, the proposed method increased the detection rate by 2.03%, decreased the omission ratio by 7.58%, and decreased the false detection rate by 47.51%. The proposed method effectively improves detection accuracy, reduces the interference of ground reflections and shadows, and satisfies the reliability and accuracy requirements of night-time vehicle detection in traffic scenes.
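    The inverse projection step maps image pixels onto a reference plane; a minimal sketch using a plane homography, where `H` is a hypothetical matrix that would in practice be calibrated from the traffic scene geometry:

```python
import numpy as np

def inverse_project(points, H):
    """Map image points (N x 2) onto the inverse projection plane with a
    3x3 homography H, using homogeneous coordinates."""
    pts = np.hstack([points, np.ones((len(points), 1))])  # to homogeneous
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]                 # back to Cartesian
```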
    Natural scene text detection based on maximally stable extremal region in color space
    FAN Yihua, DENG Dexiang, YAN Jia
    2018, 38(1):  264-269.  DOI: 10.11772/j.issn.1001-9081.2017061389
    Abstract   PDF (1191KB)
    References | Related Articles | Metrics
    To solve the problem that text regions cannot be extracted well from low-contrast images by the traditional Maximally Stable Extremal Regions (MSER) method, a novel scene text detection method based on edge enhancement was proposed. Firstly, the MSER method was improved by the Histogram of Oriented Gradients (HOG): its robustness to low-contrast images was enhanced, and MSER was applied in color space. Secondly, a Bayesian model was used for character classification: three features with translation and rotation invariance, namely stroke width, edge gradient direction and inflexion points, were used to remove non-character regions. Finally, the characters were grouped into text lines by their geometric characteristics. The proposed algorithm was evaluated on standard benchmarks, such as International Conference on Document Analysis and Recognition (ICDAR) 2003 and ICDAR 2013. The experimental results demonstrate that the edge-enhanced MSER in color space can correctly extract text regions from complex and low-contrast images, and that the Bayesian classification can detect characters from a small training set with high recall. Compared with traditional MSER-based text detection methods, the proposed algorithm improves the detection rate and real-time performance of the system.
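    The last step, grouping characters into text lines by geometry, can be sketched as follows; the vertical-centre tolerance rule is illustrative, not the paper's exact criterion:

```python
def group_text_lines(boxes, y_tol=0.5):
    """Group character bounding boxes (x, y, w, h) into text lines when their
    vertical centres differ by less than y_tol times the mean box height."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[0]):  # scan left to right
        cy, h = box[1] + box[3] / 2.0, box[3]
        for line in lines:
            ly, lh = line[-1][1] + line[-1][3] / 2.0, line[-1][3]
            if abs(cy - ly) < y_tol * (h + lh) / 2.0:
                line.append(box)
                break
        else:
            lines.append([box])   # no compatible line: start a new one
    return lines
```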
    Evolution graph clustering algorithm of enterprise trust network based on fragment
    LU Zhigang, XIE Wanting
    2018, 38(1):  270-276.  DOI: 10.11772/j.issn.1001-9081.2017071726
    Abstract   PDF (1121KB)
    References | Related Articles | Metrics
    Concerning the identification and evolution of enterprise trust alliances in dynamic trust networks, a Graph Clustering (GC) algorithm based on fragments was proposed. Firstly, considering the time information of network evolution, the enterprise trust network was encoded. Secondly, an evaluation function of the coding cost for partitioning and representing the structure of the trust network was built: while a trust alliance remained stable, the trust network over that time period was compressed into one fragment; when the alliance changed, a new trust network fragment was created and its structure was re-divided. Finally, by minimizing the coding cost, the stable structures of the trust alliances and the timestamps of structural mutations were found. The experimental results indicate that the proposed algorithm can identify enterprise trust alliances and their mutations, and that its accuracy and operating efficiency are higher than those of classical community discovery algorithms.
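    The fragment idea can be illustrated with a toy coding-cost rule; the cost model here (size of the edge-set difference against a fixed penalty for opening a fragment) is a simplification of the paper's evaluation function:

```python
def segment_fragments(snapshots, new_fragment_cost=3):
    """Compress a sequence of edge-set snapshots into fragments: a snapshot
    joins the current fragment when encoding its difference from the
    fragment's first snapshot is cheaper than opening a new fragment."""
    fragments = []
    for edges in snapshots:
        if fragments:
            base = fragments[-1][0]
            diff_cost = len(base ^ edges)   # edges added or removed
            if diff_cost < new_fragment_cost:
                fragments[-1].append(edges)  # alliance stable: same fragment
                continue
        fragments.append([edges])            # structural mutation: new fragment
    return fragments
```

The index at which a new fragment starts plays the role of a structural-mutation timestamp.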
    Evaluation of sector dynamic traffic capacity based on controller's cognitive load and improved ant colony algorithm
    WANG Chao, ZHU Ming, WANG Min
    2018, 38(1):  277-283.  DOI: 10.11772/j.issn.1001-9081.2017061499
    Abstract   PDF (1149KB)
    References | Related Articles | Metrics
    The existing evaluations of dynamic traffic capacity do not consider the controller's cognitive load. In order to improve the accuracy of air traffic flow management, a new sector dynamic traffic capacity evaluation model based on the controller's cognitive load and an improved ant colony algorithm was constructed. Firstly, a dynamic flight constrained region model that can describe the dynamic influence factors of the sector was constructed, and dynamic control-guided path planning was realized by the improved ant colony algorithm, meeting the computation speed requirement of air traffic flow management. Then, the concept of control and guidance load intensity was proposed and used to construct the sector dynamic traffic capacity evaluation model, which extends the concept of the controller's total cognitive load. Finally, taking the Sanya control sector as an example, the dynamic traffic capacity of the sector at 9 moments in the next 2 hours was evaluated at 15-minute intervals. The verification results show that the dynamic traffic capacities obtained by the proposed model differ from the actual operating results by one aircraft, and the effect is satisfactory.
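    A toy ant colony search for a guidance path on a small weighted graph; the parameters and the pheromone update rule are illustrative, not the paper's improved algorithm:

```python
import random

def aco_shortest_path(graph, start, goal, n_ants=20, n_iters=30,
                      evaporation=0.5, seed=0):
    """Toy ant colony optimisation: ants walk from start to goal, choosing
    the next node in proportion to pheromone / edge cost, and good paths
    deposit pheromone inversely proportional to their total cost."""
    rng = random.Random(seed)
    tau = {(u, v): 1.0 for u in graph for v in graph[u]}  # pheromone levels
    best_path, best_cost = None, float("inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            node, path, cost = start, [start], 0.0
            while node != goal:
                choices = [v for v in graph[node] if v not in path]
                if not choices:
                    cost = float("inf")
                    break
                weights = [tau[(node, v)] / graph[node][v] for v in choices]
                node = rng.choices(choices, weights=weights)[0]
                cost += graph[path[-1]][node]
                path.append(node)
            if cost < best_cost:
                best_path, best_cost = path, cost
            if cost < float("inf"):  # deposit pheromone on completed paths
                for u, v in zip(path, path[1:]):
                    tau[(u, v)] = tau[(u, v)] * evaporation + 1.0 / cost
    return best_path, best_cost
```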
    Design of indoor mobile fire-extinguishing robot system based on wireless sensor network
    SHI Bing, DUAN Suolin, LI Ju, WANG Peng, ZHU Yifei
    2018, 38(1):  284-289.  DOI: 10.11772/j.issn.1001-9081.2017071757
    Abstract   PDF (956KB)
    References | Related Articles | Metrics
    Aiming at the problems that an indoor mobile fire-extinguishing robot cannot obtain comprehensive environmental information in time with only its on-board sensors and that remote network control is absent, a system architecture with remote network control based on Wireless Sensor Network (WSN) was proposed. Firstly, a WSN with mesh topology was built to collect indoor environmental information. Secondly, after analyzing the logic of the system components, a database and a Web server were implemented to provide browsing for remote clients. Finally, remote network control of the robot was achieved by developing client software with Socket communication. The test results show that the packet loss rate of the mesh topology without covering the gateway node is 2% at a 1.5 s sending interval, which is 67% lower than that of the tree topology in the same situation. With the proposed architecture, more comprehensive indoor environmental information is obtained, the packet loss rate of the WSN is reduced, and remote network control is realized.
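    The Socket-based remote control channel can be sketched with a simple newline-framed command protocol; the command format is hypothetical, and `socket.socketpair` stands in here for the real client/robot TCP connection:

```python
import socket

def send_command(conn, command):
    """Frame a control command with a newline and send it to the robot."""
    conn.sendall(command.encode() + b"\n")

def receive_command(conn):
    """Read one newline-terminated command (assumes short commands)."""
    data = b""
    while not data.endswith(b"\n"):
        chunk = conn.recv(64)
        if not chunk:
            break
        data += chunk
    return data.rstrip(b"\n").decode()

# A socketpair stands in for the client/robot TCP connection.
client, robot = socket.socketpair()
send_command(client, "FORWARD 0.5")
print(receive_command(robot))  # FORWARD 0.5
```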
    Design and implementation of positioning system for embedded aeronautical ground lights
    NIU Guochen, YUAN Jie, GU Runping
    2018, 38(1):  290-294.  DOI: 10.11772/j.issn.1001-9081.2017061497
    Abstract   PDF (760KB)
    References | Related Articles | Metrics
    To solve the light positioning problem for the cleaning system of embedded aeronautical ground lights in civil airports, a vision-based light positioning system was designed. Firstly, according to the installation standards, the characteristics of night-time lighting and the impacts of external interference, the geometry of the positioning system was modeled and the camera parameters were optimized, with the feasibility of the system taken into account at the same time. Then an improved Otsu algorithm (maximum between-class variance method) was proposed to achieve adaptive threshold segmentation, and the step distance method was used to calculate the centroids of the aeronautical ground lights. Finally, to improve the system accuracy, the system was calibrated after its errors had been analyzed. A positioning experiment for embedded aeronautical ground lights was carried out in a night-time environment. The experimental results show that the designed system achieves high positioning speed and high accuracy, with an average error of 16.3 mm, and that it has strong environmental adaptability. This visual positioning system can effectively meet the positioning demands of embedded aeronautical ground lights.
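    The adaptive segmentation step builds on the maximum between-class variance (Otsu) criterion; below is a standard implementation plus a simple centroid computation (the paper's improvement and its step distance method are not reproduced):

```python
import numpy as np

def otsu_threshold(img):
    """Maximum between-class variance (Otsu) threshold for an 8-bit image."""
    prob = np.bincount(img.ravel(), minlength=256) / img.size
    global_mean = np.dot(np.arange(256), prob)
    best_t, best_var = 0, -1.0
    w0 = mu0 = 0.0
    for t in range(256):
        w0 += prob[t]        # class-0 weight up to level t
        mu0 += t * prob[t]   # class-0 unnormalised mean
        if w0 in (0.0, 1.0):
            continue         # one class empty: variance undefined
        between = (global_mean * w0 - mu0) ** 2 / (w0 * (1.0 - w0))
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def light_centroid(img, threshold):
    """Centroid (y, x) of the pixels above the threshold."""
    ys, xs = np.nonzero(img > threshold)
    return ys.mean(), xs.mean()
```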
    Optimum feature selection based on genetic algorithm under Web spam detection
    WANG Jiaqing, ZHU Yan, CHEN Tung-shou, CHANG Chin-chen
    2018, 38(1):  295-299.  DOI: 10.11772/j.issn.1001-9081.2017061560
    Abstract   PDF (807KB)
    References | Related Articles | Metrics
    Focusing on the issue that the features used in Web spam detection are always high-dimensional and redundant, an Improved Feature Selection method Based on Information Gain and Genetic Algorithm (IFS-BIGGA) was proposed. Firstly, the features were ranked by Information Gain (IG), and a dynamic threshold was set to remove redundant features. Secondly, the chromosome encoding function was modified and the selection operator was improved in the Genetic Algorithm (GA); the Area Under the receiver operating characteristic Curve (AUC) of a Random Forest (RF) classifier was utilized as the fitness function to select features with a high degree of discrimination. Finally, the Optimal Minimum Feature Set (OMFS) was obtained by increasing the number of experimental iterations to counter the randomness of the algorithm. The experimental results show that, compared with the high-dimensional feature set, although the AUC under RF of OMFS decreases by 2%, the True Positive Rate (TPR) increases by 21%, the feature dimension is reduced by 92%, and the average detection time decreases by 83%; moreover, compared with the Traditional GA (TGA) and the Imperialist Competitive Algorithm (ICA), the F1 score under a Bayes Net (BN) classifier increases by 4.2% and 3.5% respectively. The experimental results show that IFS-BIGGA can effectively reduce the feature dimension and thus the calculation cost, and improves detection efficiency in practical Web spam detection.
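    The first stage ranks features by information gain; a minimal sketch of IG for discrete features, following the standard definition IG(Y; X) = H(Y) - Σ_x p(x) H(Y | X = x):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(feature, labels):
    """IG(Y; X): reduction in label entropy from knowing the feature value."""
    gain = entropy(labels)
    values, counts = np.unique(feature, return_counts=True)
    for v, c in zip(values, counts):
        gain -= (c / len(feature)) * entropy(labels[feature == v])
    return gain
```

Features whose IG falls below the dynamic threshold would be discarded before the GA stage.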
    Design of controller based on backstepping nonlinear pure feedback system
    JIA Fujin, JIANG Yuan
    2018, 38(1):  300-304.  DOI: 10.11772/j.issn.1001-9081.2017061365
    Abstract   PDF (666KB)
    References | Related Articles | Metrics
    To solve the problem that it is difficult to design a controller by previous coordinate transformations because of the non-affine structure of nonlinear pure feedback systems, a new coordinate transformation was proposed and a first-order control input auxiliary system was introduced. Firstly, a new state equation was derived from the new coordinate transformation. Secondly, a positive definite Lyapunov function was designed for each step of the backstepping method. Finally, the derivatives of the Lyapunov functions were made negative by designing virtual controllers and an auxiliary controller, so the tracking problem of nonlinear pure feedback systems was theoretically solved. The experimental results show that the designed auxiliary controller keeps the states of the nonlinear system globally bounded, the control output tracks the given signal, and the tracking error is asymptotically stable.
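    The backstepping idea can be checked numerically on the simplest strict-feedback example x1' = x2, x2' = u; a sketch with illustrative gains (not the paper's pure-feedback transformation or auxiliary-system construction):

```python
import math

def simulate_backstepping(t_end=10.0, dt=1e-3, k1=5.0, k2=5.0):
    """Backstepping tracking for x1' = x2, x2' = u with reference yd = sin(t).
    Virtual control alpha = -k1*e1 + yd' stabilises e1 = x1 - yd; actual
    control u = alpha' - e1 - k2*e2 stabilises e2 = x2 - alpha, giving
    V' = -k1*e1^2 - k2*e2^2 <= 0 for V = (e1^2 + e2^2)/2."""
    x1, x2, t = 0.5, 0.0, 0.0
    while t < t_end:
        yd, yd_d, yd_dd = math.sin(t), math.cos(t), -math.sin(t)
        e1 = x1 - yd
        alpha = -k1 * e1 + yd_d              # virtual controller for step 1
        e2 = x2 - alpha
        alpha_d = -k1 * (x2 - yd_d) + yd_dd  # derivative of alpha
        u = alpha_d - e1 - k2 * e2           # actual controller for step 2
        x1 += dt * x2                        # Euler integration
        x2 += dt * u
        t += dt
    return x1 - math.sin(t)                  # final tracking error
```

With these gains the tracking error decays exponentially, illustrating the asymptotic stability claimed above.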