Low-latency neighbor selection scheme for blockchain networks based on multi-stage propagation
Gongli LI, Xiaodi CHEN, Lu LI
Journal of Computer Applications    2025, 45 (12): 3939-3946.   DOI: 10.11772/j.issn.1001-9081.2024111678

Blockchain relies on an unstructured Peer-to-Peer (P2P) overlay network to propagate transactions and blocks. In this network structure, propagation is delayed and the long-tail propagation problem is significant, which leads to inconsistencies in the information stored by nodes, that is, blockchain forks. Forks not only waste computational resources across the entire blockchain network but also introduce a series of security issues. To reduce propagation delay in blockchain networks, a Neighbor Selection scheme based on Multi-stage Propagation (NSMP) was proposed to optimize the network topology through neighbor selection. Firstly, a node's outbound neighbors were divided into strong and weak propagators based on two factors, propagation ability and proximity, and different neighbor selection schemes were applied at different stages of network propagation, thereby reducing propagation hops and shortening propagation time while further alleviating the long-tail propagation problem present in both existing and default schemes. Then, the propagation ability of a node was quantified by a fitting function based on the node's local characteristics, and proximity information was quantified using the Ping protocol. Finally, the designed scheme was evaluated through simulation experiments with the network simulator SimBlock. Experimental results show that NSMP reduces the fork rate by 52.17% compared with the default scheme, demonstrating its feasibility and effectiveness. In addition, the simulation data were used to determine the optimal parameter setting for the distribution of neighbor-node proximity.
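As a rough illustration of the idea (not the authors' implementation), outbound neighbors can be split into strong and weak propagators by combining a propagation-ability score with a Ping-derived proximity measure. The weights, the median split, and the normalized fields below are hypothetical choices for the sketch:

```python
def classify_neighbors(neighbors, ability_weight=0.6, proximity_weight=0.4):
    """Split outbound neighbors into strong and weak propagators.

    Each neighbor is a dict with a normalized 'ability' score (higher is
    better) and a normalized 'rtt' from a Ping probe (lower is better).
    The combined score and the median split are illustrative choices.
    """
    scored = []
    for n in neighbors:
        score = ability_weight * n["ability"] + proximity_weight * (1.0 - n["rtt"])
        scored.append((score, n["id"]))
    scored.sort(reverse=True)
    half = len(scored) // 2
    strong = [nid for _, nid in scored[:half]]
    weak = [nid for _, nid in scored[half:]]
    return strong, weak

neighbors = [
    {"id": "A", "ability": 0.9, "rtt": 0.2},
    {"id": "B", "ability": 0.3, "rtt": 0.8},
    {"id": "C", "ability": 0.7, "rtt": 0.4},
    {"id": "D", "ability": 0.2, "rtt": 0.9},
]
strong, weak = classify_neighbors(neighbors)
```

Early-stage propagation would then prefer the `strong` list to cut hops, while later stages could use the `weak` list to reach long-tail nodes.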

Smart contract vulnerability detection method based on echo state network
Chunxia LIU, Hanying XU, Gaimei GAO, Weichao DANG, Zilu LI
Journal of Computer Applications    2025, 45 (1): 153-161.   DOI: 10.11772/j.issn.1001-9081.2024010025

Smart contracts on blockchain platforms are decentralized applications that provide secure and trusted services to multiple parties on the chain, and smart contract vulnerability detection helps ensure their security. However, existing detection methods suffer from insufficient feature learning and low detection accuracy when sample sizes are imbalanced and semantic information is incompletely mined; moreover, they cannot detect new vulnerabilities in contracts. To address these problems, a smart contract vulnerability detection method based on Echo State Network (ESN) was proposed. Firstly, different semantic and syntactic edges were learned on the basis of the contract graph, and feature vectors were obtained through Skip-Gram model training. Then, ESN was combined with transfer learning to transfer and extend to new contract vulnerabilities, thereby improving the vulnerability detection rate. Finally, experiments were conducted on a smart contract dataset collected from the Etherscan platform. Experimental results show that the accuracy, precision, recall, and F1-score of the proposed method reach 94.30%, 97.54%, 91.68%, and 94.52%, respectively. Compared with the Bidirectional Long Short-Term Memory (BLSTM) network and Bidirectional Long Short-Term Memory with ATTention mechanism (BLSTM-ATT), the proposed method improves accuracy by 5.93 and 11.75 percentage points, respectively, and achieves better vulnerability detection performance. Ablation experiments further validate the effectiveness of ESN for smart contract vulnerability detection.
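The core of an echo state network is a fixed random reservoir whose state is updated as x' = tanh(W_in·u + W·x); only a linear readout is trained. A minimal pure-Python sketch of the reservoir update (the sizes, seed, and scaling are arbitrary, and no readout training is shown):

```python
import math
import random

def make_esn(n_inputs, n_reservoir, seed=0, recurrent_scale=0.5):
    """Create a tiny echo state network: fixed random input and
    recurrent weight matrices; only a readout is trained in practice."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
            for _ in range(n_reservoir)]
    w = [[rng.uniform(-recurrent_scale, recurrent_scale)
          for _ in range(n_reservoir)] for _ in range(n_reservoir)]
    return w_in, w

def step(state, u, w_in, w):
    """One reservoir update: x' = tanh(W_in u + W x)."""
    new_state = []
    for i in range(len(state)):
        s = sum(w_in[i][j] * u[j] for j in range(len(u)))
        s += sum(w[i][j] * state[j] for j in range(len(state)))
        new_state.append(math.tanh(s))
    return new_state

w_in, w = make_esn(n_inputs=2, n_reservoir=4)
x = [0.0] * 4
for u in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    x = step(x, u, w_in, w)
```

In the paper's setting, the input `u` would be a contract feature vector from Skip-Gram training rather than the toy sequence used here.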

Multi-site wind speed prediction based on graph dynamic attention network
Bolu LI, Li WU, Xiaoying WANG, Jianqiang HUANG, Tengfei CAO
Journal of Computer Applications    2023, 43 (11): 3616-3624.   DOI: 10.11772/j.issn.1001-9081.2022111749

Spatio-temporal sequence prediction has a wide range of applications in fields such as transportation, meteorology, and smart cities. Station wind speed prediction, one of the main tasks in meteorological forecasting, requires learning the spatio-temporal characteristics of different data in combination with external factors such as precipitation and temperature. The irregular distribution of meteorological stations and the inherent intermittency of wind make high-accuracy wind speed prediction challenging. To account for the influence of multi-site spatial distribution on wind speed and obtain accurate and reliable predictions, a Graph-based Dynamic Switch-Attention Network (Graph-DSAN) wind speed prediction model was proposed. Firstly, the distances between sites were used to reconstruct the connections among them. Secondly, local sampling was used to model adjacency matrices of different sampling sizes, enabling the aggregation and transmission of information between neighboring nodes during graph convolution. Thirdly, the graph convolution results, processed by Spatio-Temporal Position Encoding (STPE), were fed into the Dynamic Attention Encoder (DAE) and Switch-Attention Decoder (SAD) for dynamic attention computation to extract spatio-temporal correlations. Finally, multi-step prediction was performed autoregressively. In wind speed prediction experiments on data from 15 sites in New York State, the model was compared with ConvLSTM, Graph Multi-Attention Network (GMAN), Spatio-Temporal Graph Convolutional Network (STGCN), Dynamic Switch-Attention Network (DSAN), and Spatial-Temporal Dynamic Network (STDN). The results show that the Root Mean Square Error (RMSE) of the Graph-DSAN model for 12 h prediction is reduced by 28.2%, 6.9%, 27.7%, 14.4%, and 8.9%, respectively, verifying the accuracy of Graph-DSAN in wind speed prediction.
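The first step, rebuilding site connections from pairwise distances, can be sketched as a distance-thresholded adjacency matrix. The radius criterion below is a hypothetical stand-in for the paper's local-sampling scheme:

```python
def build_adjacency(coords, radius):
    """Connect stations whose Euclidean distance is within `radius`;
    returns a symmetric 0/1 adjacency matrix with no self-loops."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            if (dx * dx + dy * dy) ** 0.5 <= radius:
                adj[i][j] = adj[j][i] = 1
    return adj

# three stations on a plane; only the first two are within range
stations = [(0, 0), (3, 4), (10, 0)]
adj = build_adjacency(stations, radius=6.0)
```

Varying `radius` (or the sample size) yields the family of adjacency matrices that graph convolution would aggregate over.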

Vehicle RKE two-factor authentication protocol resistant to physical cloning attack
Changgeng LIU, Yali LIU, Qipeng LU, Tao LI, Changlu LIN, Yi ZHU
Journal of Computer Applications    2023, 43 (11): 3375-3384.   DOI: 10.11772/j.issn.1001-9081.2022111802

Attackers can illegally open a vehicle by forging the Radio Frequency IDentification (RFID) signal sent by the vehicle remote key. Moreover, when the remote key is lost or stolen, an attacker can obtain the secret data inside it and clone a usable key, threatening the property and privacy of the vehicle owner. To address these problems, a Vehicle RKE Two-Factor Authentication (VRTFA) protocol for vehicle Remote Keyless Entry (RKE) that resists physical cloning attacks was proposed. The protocol is based on a Physical Unclonable Function (PUF) and biometric fingerprint feature extraction and recovery functions, so that the specific hardware structure of a legitimate vehicle remote key cannot be forged. At the same time, the biometric fingerprint factor was introduced to build a two-factor authentication protocol, thereby eliminating the security risk posed by vehicle remote key theft and further guaranteeing secure mutual authentication in the vehicle RKE system. Security analysis of the protocol with BAN logic shows that the VRTFA protocol resists malicious attacks such as forgery, desynchronization, replay, man-in-the-middle, physical cloning, and full key leakage attacks, and satisfies security attributes such as forward security, mutual authentication, data integrity, and untraceability. Performance analysis shows that the VRTFA protocol offers stronger security and privacy and better practicality than existing RFID authentication protocols.
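The two-factor idea can be illustrated with a toy verifier: one factor is a challenge-response from the key fob's PUF, the other a noisy fingerprint reading matched against an enrolled template within a tolerance. Everything below is a simplified stand-in, not the VRTFA protocol; in particular, a real PUF derives its response from unclonable hardware variation, which the hash here merely simulates:

```python
import hashlib

def puf_response(device_entropy, challenge):
    """Toy stand-in for a PUF: a real PUF's response comes from physical
    variation in the chip, not from a stored secret, which is what makes
    the key fob hardware impossible to clone."""
    return hashlib.sha256(device_entropy + challenge).digest()

def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def two_factor_verify(expected_puf, enrolled_fp, presented_puf,
                      presented_fp, fp_tolerance=8):
    """Accept only if the PUF response matches exactly AND the fingerprint
    reading is within a noise tolerance of the enrolled template."""
    return (presented_puf == expected_puf and
            hamming(enrolled_fp, presented_fp) <= fp_tolerance)

entropy, challenge = b"chip-variation", b"nonce-001"
enrolled = bytes([0b10101010] * 8)
noisy = bytes([0b10101010] * 7 + [0b10101011])  # 1 bit of sensor noise

ok = two_factor_verify(puf_response(entropy, challenge), enrolled,
                       puf_response(entropy, challenge), noisy)
bad = two_factor_verify(puf_response(entropy, challenge), enrolled,
                        puf_response(b"cloned-chip", challenge), noisy)
```

A cloned fob fails the PUF factor even with a perfect fingerprint, and a stolen fob fails the fingerprint factor.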

Efficient certificateless ring signature scheme based on elliptic curve
Xiuping ZHU, Yali LIU, Changlu LIN, Tao LI, Yongquan DONG
Journal of Computer Applications    2023, 43 (11): 3368-3374.   DOI: 10.11772/j.issn.1001-9081.2022111801

Ring signatures are widely used to protect user identities and data privacy because of their spontaneity and anonymity, while certificateless public key cryptography both solves the key escrow problem and removes the need for public key certificate management. Certificateless ring signatures combine the advantages of both and are therefore of broad research interest; however, most existing certificateless ring signature schemes rely on bilinear pairing and modular exponentiation, which are computationally expensive and inefficient. To improve the efficiency of the signing and verification stages, a new Efficient CertificateLess Ring Signature (ECL-RS) scheme was proposed based on elliptic curves, which offer low computational cost, high security, and good flexibility. The security of the ECL-RS scheme reduces to a discrete logarithm problem and a Diffie-Hellman problem, and the scheme is proved, under the Random Oracle Model (ROM), to resist public key substitution attacks and malicious key generation center attacks while providing unforgeability and anonymity. Performance analysis shows that the ECL-RS scheme requires only n+2 elliptic curve scalar multiplication and scalar addition operations (where n is the number of ring members) and n+3 one-way hash operations, giving it lower computational cost and higher efficiency while ensuring security.

Real-time SLAM algorithm with keyframes determined by inertial measurement unit
WEI Wenle, JIN Guodong, TAN Lining, LU Libin, CHEN Danqi
Journal of Computer Applications    2020, 40 (4): 1157-1163.   DOI: 10.11772/j.issn.1001-9081.2019081326
Due to the limited computational power of embedded processors, poor real-time performance has always been an urgent problem in practical applications of Visual-Inertial Simultaneous Localization And Mapping (VI-SLAM). Therefore, a real-time Simultaneous Localization And Mapping (SLAM) algorithm with keyframes determined by an Inertial Measurement Unit (IMU) was proposed, mainly divided into three threads: tracking, local mapping, and loop closing. Firstly, keyframes were determined adaptively by the tracking thread through IMU pre-integration, with the adaptive threshold derived from the result of visual-inertial tightly coupled optimization. Then, only the keyframes were tracked, thereby avoiding feature processing for all frames. Finally, a more accurate Unmanned Aerial Vehicle (UAV) pose was obtained by the local mapping thread through visual-inertial bundle adjustment in a sliding window, and a globally consistent trajectory and map were output by the loop closing thread. Experimental results on the EuRoC dataset show that the algorithm significantly reduces the time consumed by the tracking thread without loss of precision or robustness, and reduces the dependence of VI-SLAM on computing resources. In an actual flight test, the true trajectory of the drone, with scale information, was estimated accurately by the algorithm in real time.
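The keyframe decision can be sketched as a simple test on the IMU pre-integrated motion since the last keyframe. The fixed thresholds below are placeholders; in the paper the threshold is derived adaptively from the tightly coupled optimization:

```python
def is_keyframe(preintegrated_rotation_deg, preintegrated_translation_m,
                rot_threshold_deg, trans_threshold_m):
    """Flag a new keyframe when the IMU pre-integrated motion since the
    last keyframe exceeds either the rotation or translation threshold.
    Thresholds are fixed here for simplicity (the paper adapts them)."""
    return (preintegrated_rotation_deg >= rot_threshold_deg or
            preintegrated_translation_m >= trans_threshold_m)

# small motion -> skip the frame; large rotation -> promote to keyframe
skip = is_keyframe(2.0, 0.05, rot_threshold_deg=10.0, trans_threshold_m=0.5)
promote = is_keyframe(15.0, 0.05, rot_threshold_deg=10.0, trans_threshold_m=0.5)
```

Frames that fail the test bypass feature extraction entirely, which is where the tracking-thread time savings come from.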
Task scheduling strategy based on data stream classification in Heron
ZHANG Yitian, YU Jiong, LU Liang, LI Ziyang
Journal of Computer Applications    2019, 39 (4): 1106-1116.   DOI: 10.11772/j.issn.1001-9081.2018081848
In Heron, a new platform for big data stream processing, the round-robin algorithm is used for task scheduling by default; it considers neither the runtime state of the topology nor the impact of different communication modes among task instances on Heron's performance. To solve this problem, a task scheduling strategy based on Data Stream Classification in Heron (DSC-Heron) was proposed, consisting of a data stream classification algorithm, a data stream cluster allocation algorithm, and a data stream classification scheduling algorithm. Firstly, the instance allocation model of Heron was established to clarify the differences in communication overhead among the different communication modes of task instances. Secondly, data streams were classified according to their real-time sizes between task instances, based on the data stream classification model of Heron. Finally, the packing plan of Heron was constructed using interrelated high-frequency data streams as the basic scheduling units, transforming inter-node data streams into intra-node ones as much as possible to minimize communication cost. After running the SentenceWordCount, WordCount, and FileWordCount topologies in a Heron cluster with 9 nodes, the results show that, compared with the Heron default scheduling strategy, DSC-Heron achieves improvements of 8.35%, 7.07%, and 6.83% in system completion latency, inter-node communication overhead, and system throughput, respectively; in terms of load balancing, the standard deviations of CPU usage and memory usage of the working nodes are decreased by 41.44% and 41.23%, respectively. All experimental results show that DSC-Heron can effectively improve the performance of the topologies, with the most significant optimization effect on the FileWordCount topology, which is closest to a real application scenario.
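The core scheduling idea, classifying streams by rate and packing both endpoints of each high-frequency stream onto one node, can be sketched greedily. The threshold, rates, and round-robin fallback below are illustrative, not Heron's actual packing algorithm:

```python
def schedule_by_stream_class(streams, threshold):
    """Greedy sketch: streams with rate >= threshold are 'high-frequency';
    both endpoint tasks of each such stream are packed onto the same node,
    so heavy inter-node traffic becomes intra-node traffic."""
    placement = {}
    next_node = 0
    # place endpoints of high-frequency streams first, heaviest first
    for (a, b), rate in sorted(streams.items(), key=lambda kv: -kv[1]):
        if rate < threshold:
            continue
        node = placement.get(a, placement.get(b))
        if node is None:
            node = next_node
            next_node += 1
        placement.setdefault(a, node)
        placement.setdefault(b, node)
    # any remaining task gets its own node
    for t in sorted({t for pair in streams for t in pair}):
        if t not in placement:
            placement[t] = next_node
            next_node += 1
    return placement

streams = {("spout", "split"): 100.0, ("split", "count"): 90.0,
           ("count", "sink"): 5.0}
placement = schedule_by_stream_class(streams, threshold=50.0)
```

Here the two heavy streams chain `spout`, `split`, and `count` onto one node, while the light `sink` edge is allowed to cross nodes.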
Dynamic task dispatching strategy for stream processing based on flow network
LI Ziyang, YU Jiong, BIAN Chen, LU Liang, PU Yonglin
Journal of Computer Applications    2018, 38 (9): 2560-2567.   DOI: 10.11772/j.issn.1001-9081.2017122910
To address the problem that a sharp increase in data input rate raises computing latency and degrades real-time performance on big data stream processing platforms, a dynamic dispatching strategy based on flow networks was proposed and applied to the data stream processing platform Apache Flink. Firstly, a Directed Acyclic Graph (DAG) was transformed into a flow network by defining the capacity and flow of every edge, and a capacity detection algorithm was used to determine each edge's capacity. Secondly, a maximum flow algorithm was used to obtain the improved network and the optimization path, raising cluster throughput as the data input rate increases; meanwhile, the feasibility of the algorithm was shown by evaluating its time and space complexity. Finally, the influence of an important parameter on the algorithm's execution was discussed, and recommended parameter values for different types of jobs were obtained by experiments. The experimental results show that, compared with the original dispatching strategy of Apache Flink, the strategy improves throughput by more than 16.12% during phases of increasing data input rate across different types of benchmarks, so the dynamic dispatching strategy effectively improves cluster throughput under the task latency constraint.
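Once the job DAG is annotated with edge capacities, a standard maximum flow algorithm identifies the bottleneck and the augmenting paths. A self-contained Edmonds-Karp sketch on a toy four-operator DAG (the capacities are invented; the paper's capacity detection step would supply them):

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly find a shortest augmenting path by BFS
    and push the bottleneck capacity along it."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        parent = [-1] * n
        parent[source] = source
        q = deque([source])
        while q:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and capacity[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[sink] == -1:          # no augmenting path remains
            return total, flow
        # find the bottleneck along the discovered path
        bottleneck = float("inf")
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, capacity[u][v] - flow[u][v])
            v = u
        # push flow (and record reverse residual flow) along the path
        v = sink
        while v != source:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
        total += bottleneck

# toy operator DAG: node 0 = source, node 3 = sink
cap = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]
value, _ = max_flow(cap, 0, 3)
```

The max-flow value bounds the throughput the cluster can sustain; edges saturated in the final `flow` matrix mark the paths a dispatcher would prioritize.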
Proximal smoothing iterative algorithm for magnetic resonance image reconstruction based on Moreau-envelope
LIU Xiaohui, LU Lijun, FENG Qianjin, CHEN Wufan
Journal of Computer Applications    2018, 38 (7): 2076-2082.   DOI: 10.11772/j.issn.1001-9081.2017122980
To handle the two non-smooth regularization terms in sparse reconstruction of Magnetic Resonance Imaging (MRI) based on Compressed Sensing (CS), a new Proximal Smoothing Iterative Algorithm (PSIA) based on the Moreau envelope was proposed. Classical CS-based sparse MRI reconstruction minimizes an objective function that is a linear combination of three terms: the least-squares data fidelity term, the sparse regularization term in the wavelet domain, and the Total Variation (TV) regularization term. Firstly, proximal smoothing was applied to the wavelet regularization term in the objective function. Then, the linear combination of the data fidelity term and the smoothed wavelet regularization term was treated as a new, continuously differentiable convex function. Finally, PSIA was used to solve the new optimization problem. The proposed algorithm not only copes with both regularization constraints simultaneously, but also avoids the robustness problems caused by fixed weights. Experimental results on simulated phantom images and real MR images show that, compared with four classical sparse reconstruction algorithms, namely the Conjugate Gradient (CG) descent algorithm, the TV l1 Compressed MRI (TVCMRI) algorithm, the Reconstruction From Partial k-space algorithm (RecPF), and the Fast Composite Splitting Algorithm (FCSA), the proposed algorithm achieves better image signal-to-noise ratio, relative error, and structural similarity index, with complexity comparable to the fastest existing reconstruction algorithm, FCSA.
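The Moreau envelope replaces a non-smooth function g with a smooth lower approximation e_mu(x) = min_y g(y) + (x - y)^2/(2*mu), evaluated via the proximal operator. A one-dimensional illustration with g = |.| (whose envelope is the Huber function and whose prox is soft-thresholding); this is only a scalar analogue of the wavelet-domain operator in the paper:

```python
def prox_abs(x, mu):
    """Proximal operator of mu*|.|: soft-thresholding at level mu."""
    if x > mu:
        return x - mu
    if x < -mu:
        return x + mu
    return 0.0

def moreau_env_abs(x, mu):
    """Moreau envelope of |.| with parameter mu:
    e(x) = min_y |y| + (x - y)^2 / (2*mu)  (the Huber function).
    The minimizing y is exactly prox_abs(x, mu)."""
    y = prox_abs(x, mu)
    return abs(y) + (x - y) ** 2 / (2.0 * mu)

# quadratic near zero, linear (slope 1) in the tails
near_zero = moreau_env_abs(0.5, 1.0)   # 0.5**2 / 2 = 0.125
in_tail = moreau_env_abs(2.0, 1.0)     # 2.0 - 1.0/2 = 1.5
```

The smoothed term is differentiable everywhere, which is what lets PSIA fold it into the data fidelity term and apply gradient-based proximal iterations for the remaining TV term.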
Long text classification combined with attention mechanism
LU Ling, YANG Wu, WANG Yuanlun, LEI Zijian, LI Ying
Journal of Computer Applications    2018, 38 (5): 1272-1277.   DOI: 10.11772/j.issn.1001-9081.2017112652
News text usually consists of tens to hundreds of sentences; its large number of characters and substantial off-topic content hurt classification performance. In view of this problem, a long text classification method combined with an attention mechanism was proposed. Firstly, each sentence was represented by a paragraph vector, and a neural network attention model over paragraph vectors and text categories was constructed to compute sentence attention. Then, sentences were filtered according to their contribution to the category, measured by the mean square error of the sentence attention vector. Finally, a classifier based on Convolutional Neural Network (CNN) was constructed, taking the filtered text and the attention matrix as network inputs, with max pooling for feature filtering and random dropout to reduce over-fitting. Experiments were conducted on the Chinese news text classification dataset from one of the shared tasks of Natural Language Processing and Chinese Computing (NLP&CC) 2014. The proposed method achieved 80.39% accuracy on the filtered text, whose length was 82.74% of that of the unfiltered text, a considerable accuracy improvement of 2.1% over classification of the unfiltered text. The experimental results show that, combined with the attention mechanism, the proposed method can improve the accuracy of long text classification while achieving sentence-level information filtering.
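The filtering step can be sketched as dropping sentences whose attention score falls below a data-driven cutoff. Using the mean score as the cutoff is a simplified stand-in for the paper's mean-square-error criterion, and the scores below are invented:

```python
def filter_sentences(sentences, attention):
    """Keep sentences whose attention score is at least the mean score
    (a simple stand-in for the paper's attention-based contribution
    measure); the rest are treated as off-topic and dropped."""
    mean = sum(attention) / len(attention)
    return [s for s, a in zip(sentences, attention) if a >= mean]

sents = ["market rallies", "weather was nice", "stocks surge", "lunch menu"]
scores = [0.9, 0.1, 0.8, 0.2]
kept = filter_sentences(sents, scores)
```

The shortened text (here half the sentences) is what the CNN classifier would then consume, mirroring the 82.74% length reduction reported above.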
Task scheduling algorithm based on weight in Storm
LU Liang, YU Jiong, BIAN Chen, YING Changtian, SHI Kangli, PU Yonglin
Journal of Computer Applications    2018, 38 (3): 699-706.   DOI: 10.11772/j.issn.1001-9081.2017082125
Apache Storm, a typical big data stream computing platform, uses a round-robin scheduling algorithm as its default scheduler, which ignores the ubiquitous differences in computational and communication cost among different tasks and different data streams within one topology; optimization is therefore needed in terms of load balance and communication cost. To solve this problem, a Task Scheduling Algorithm based on Weight in Storm (TSAW-Storm) was proposed. In the algorithm, CPU occupation was taken as the weight of a task in a given topology, and the tuple rate between a pair of tasks was taken as the weight of a data stream. Tasks were then assigned gradually to the most suitable work node by maximizing the gain in data stream weight, transforming inter-node data streams into intra-node ones as much as possible while ensuring load balance, in order to reduce network overhead. Experimental results show that, compared with the Storm default scheduling algorithm in the WordCount benchmark with 8 work nodes, TSAW-Storm reduces latency and inter-node tuple rate by about 30.0% and 32.9%, respectively, and the standard deviation of CPU load across work nodes is only 25.8% of that of the default. Additionally, the online scheduler was deployed in a contrast experiment; the results show that TSAW-Storm reduces latency, inter-node tuple rate, and the standard deviation of CPU load by about 7.76%, 11.8%, and 5.93%, respectively, with only slight execution overhead compared with the online scheduler. The proposed algorithm can therefore reduce communication cost and effectively improve load balance, contributing to the efficient operation of Apache Storm.
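The weight-driven assignment can be sketched as a greedy loop: place tasks heaviest-first on whichever node maximizes the weight of data streams that become intra-node, subject to a CPU capacity cap. The weights, capacity, and tie-breaking below are illustrative, not the paper's exact algorithm:

```python
def greedy_assign(task_weights, stream_weights, nodes, cap):
    """Place tasks one by one (heaviest CPU weight first) on the node
    that maximizes the weight of data streams made intra-node, subject
    to a per-node CPU capacity `cap`."""
    placement = {}
    load = {n: 0.0 for n in nodes}
    for task in sorted(task_weights, key=lambda t: -task_weights[t]):
        best, best_gain = None, -1.0
        for n in nodes:
            if load[n] + task_weights[task] > cap:
                continue
            gain = sum(w for (a, b), w in stream_weights.items()
                       if (a == task and placement.get(b) == n) or
                          (b == task and placement.get(a) == n))
            if gain > best_gain:
                best, best_gain = n, gain
        if best is None:  # no node has room: fall back to least-loaded
            best = min(load, key=load.get)
        placement[task] = best
        load[best] += task_weights[task]
    return placement

tasks = {"spout": 2.0, "split": 2.0, "count": 2.0}
streams = {("spout", "split"): 10.0, ("split", "count"): 1.0}
placement = greedy_assign(tasks, streams, ["n1", "n2"], cap=4.0)
```

The heavy `spout`-`split` stream pulls those two tasks together, while the capacity cap forces `count` onto the other node, trading the light stream for load balance.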
Task scheduling strategy based on topology structure in Storm
LIU Su, YU Jiong, LU Liang, LI Ziyang
Journal of Computer Applications    2018, 38 (12): 3481-3489.   DOI: 10.11772/j.issn.1001-9081.2018040741
In order to solve the problems of high communication cost and unbalanced load in the default round-robin scheduling strategy of the Storm stream computing platform, a Task Scheduling Strategy based on Topology Structure (TS2) in Storm was proposed. Firstly, work nodes with sufficient available Central Processing Unit (CPU) resources were selected, and only one process was allocated to each work node, eliminating inter-process communication cost within nodes and optimizing process deployment. Then, the topology structure was analyzed, the component with the highest degree in the topology was found, and the threads of that component were assigned the highest priority. Finally, under the constraint of the maximum number of threads a node can carry, associated tasks were deployed to the same node as far as possible to reduce inter-node communication cost, improve cluster load balance, and optimize thread deployment. The experimental results show that, in terms of system latency, the average optimization rate of TS2 is 16.91% and 5.69% compared with the Storm default scheduling strategy and the offline scheduling strategy, respectively, effectively improving the real-time performance of the system. Additionally, compared with the Storm default scheduling strategy, TS2 reduces inter-node communication cost by 15.75% and improves average throughput by 14.21%.
Energy-efficient strategy for threshold control in big data stream computing environment
PU Yonglin, YU Jiong, WANG Yuefei, LU Liang, LIAO Bin, HOU Dongxue
Journal of Computer Applications    2017, 37 (6): 1580-1586.   DOI: 10.11772/j.issn.1001-9081.2017.06.1580
In the field of big data real-time analysis and computing, the importance of stream computing keeps growing, and so does the energy consumed by stream computing platforms. To address this problem, an Energy-efficient Strategy for Threshold Control (ESTC) was proposed, which changes how nodes process data in stream computing. First, the threshold of each work node was determined according to differences in system load. Second, according to the work node threshold, the system data stream was sampled randomly to determine the physical voltage used to adjust the system under different data processing situations. Finally, the system power was determined according to the different physical voltages. Experimental results and theoretical analysis show that in a stream computing cluster of 20 ordinary PCs, a system running ESTC saves about 35.2% more energy than the original system. In addition, the performance-to-energy ratio under ESTC is 0.0803 tuple/(s·J), versus 0.0698 tuple/(s·J) for the original system. Therefore, the proposed ESTC can effectively reduce energy consumption without affecting system performance.
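The load-threshold-to-voltage mapping can be sketched with a lookup plus the standard dynamic-power approximation P ≈ C·V²·f, which is why lowering voltage at low load saves energy. The threshold and voltage levels below are invented for illustration and are not ESTC's actual settings:

```python
def pick_voltage(load, thresholds, voltages):
    """Return the lowest voltage level whose load threshold covers the
    current load; `thresholds` and `voltages` are sorted ascending,
    with one extra top voltage for loads above every threshold."""
    for t, v in zip(thresholds, voltages):
        if load <= t:
            return v
    return voltages[-1]

def dynamic_power(voltage, freq, c=1.0):
    """Dynamic CMOS power approximation: P ~ C * V^2 * f."""
    return c * voltage ** 2 * freq

v_low = pick_voltage(0.3, thresholds=[0.5, 0.8], voltages=[0.9, 1.1, 1.3])
v_high = pick_voltage(0.95, thresholds=[0.5, 0.8], voltages=[0.9, 1.1, 1.3])
```

Because power grows quadratically with voltage, running at `v_low` during light load is where the reported energy savings would come from.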
Stance detection method based on entity-emotion evolution belief net
LU Ling, YANG Wu, LIU Xu, LI Yan
Journal of Computer Applications    2017, 37 (5): 1402-1406.   DOI: 10.11772/j.issn.1001-9081.2017.05.1402
To deal with stance detection for Chinese social network reviews that lack theme or emotion features, a stance detection method based on an entity-emotion evolution Bayesian belief network was proposed. Firstly, three types of domain-dependent entities, namely nouns, verb-object phrases, and verb-noun compound attributive-centered structures, were extracted. Domain-related emotion features were also extracted, and variable correlation strength was used as a constraint on learning the network structure. Then, a 2-dependence Bayesian network classifier was constructed to describe the dependencies among entities, stance, and emotion features, and the stance of a review was deduced from the combined conditions of entities and emotion features. Experiments were conducted on the Natural Language Processing & Chinese Computing 2016 (NLP&CC2016) dataset. The experimental results show that the average micro-F reaches 70.8%, and the average precision of FAVOR and AGAINST increases by 4.1 and 3.1 percentage points, respectively, over a Bayesian network classifier using emotion features only. The average micro-F over the 5 target datasets of the evaluation reaches 62.3%, exceeding the average level of the evaluation.
Chinese short text classification method by combining semantic expansion and convolutional neural network
LU Ling, YANG Wu, YANG Youjun, CHEN Menghan
Journal of Computer Applications    2017, 37 (12): 3498-3503.   DOI: 10.11772/j.issn.1001-9081.2017.12.3498
A Chinese news title usually consists of a single word to dozens of words; few characters and sparse features make it difficult to improve the accuracy of news title classification. To solve these problems, a new text semantic expansion method based on word embedding was proposed. Firstly, each news title was expanded into a triple consisting of the title, a subtitle, and keywords. The subtitle was constructed by combining synonyms of the title with part-of-speech filtering, and the keywords were extracted from the semantic composition of words in multi-scale sliding windows. Then, a Convolutional Neural Network (CNN) model was constructed to categorize the expanded text, with max pooling for feature filtering and random dropout to avoid overfitting. Finally, the double text formed by splicing the title and subtitle, and the multi-keyword set, were fed into the model respectively. Experiments were conducted on the news title classification dataset of Natural Language Processing & Chinese Computing 2017 (NLP&CC2017). The experimental results show that the classification precision of the combined model of triple expansion and CNN reaches 79.42% on 18 categories of news titles, 9.5% higher than the original CNN model without expansion, and keyword expansion improves the convergence rate of the model. The proposed triple expansion method and the constructed CNN model are thus verified to be effective.
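The expansion step can be sketched as turning a tokenized title into a (title, subtitle, keywords) triple: the subtitle substitutes synonyms, and the keywords are multi-scale sliding-window spans. The synonym map, English tokens, and window sizes below are illustrative, and the part-of-speech filtering and embedding-based composition from the paper are omitted:

```python
def sliding_keywords(words, window_sizes):
    """Multi-scale sliding windows over the title tokens; each window's
    token span becomes a candidate keyword phrase."""
    keywords = []
    for k in window_sizes:
        for i in range(len(words) - k + 1):
            keywords.append(" ".join(words[i:i + k]))
    return keywords

def expand_title(title_tokens, synonym_map, window_sizes=(1, 2)):
    """Expand a short title into a (title, subtitle, keywords) triple;
    the subtitle substitutes a synonym for each token that has one."""
    subtitle = [synonym_map.get(w, w) for w in title_tokens]
    return (" ".join(title_tokens),
            " ".join(subtitle),
            sliding_keywords(title_tokens, window_sizes))

triple = expand_title(["stocks", "surge", "today"],
                      {"surge": "rally", "stocks": "shares"})
```

The spliced title-plus-subtitle text and the keyword set are the two inputs the CNN model receives.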
Video recommendation algorithm based on clustering and hierarchical model
JIN Liang, YU Jiong, YANG Xingyao, LU Liang, WANG Yuefei, GUO Binglei, Liao Bin
Journal of Computer Applications    2017, 37 (10): 2828-2833.   DOI: 10.11772/j.issn.1001-9081.2017.10.2828
Concerning data sparseness, cold start, and poor user experience in recommendation systems, a video recommendation algorithm based on clustering and a hierarchical model was proposed to improve recommendation performance and user experience. Firstly, focusing on the user, similar users were obtained by Affinity Propagation (AP) clustering; the historical online video data of similar users was then collected to generate a video recommendation set. Secondly, the user's preference degree for a video was calculated and mapped to the tag weights of the video. Finally, a video recommendation list was generated by using an analytic hierarchy model to rank user preference for videos. The experimental results on the MovieLens Latest dataset and a YouTube video review text dataset show that the proposed algorithm performs well in terms of Root-Mean-Square Error (RMSE) and recommendation accuracy.
Dynamic data stream load balancing strategy based on load awareness
LI Ziyang, YU Jiong, BIAN Chen, WANG Yuefei, LU Liang
Journal of Computer Applications    2017, 37 (10): 2760-2766.   DOI: 10.11772/j.issn.1001-9081.2017.10.2760
Concerning unbalanced load and incomplete node evaluation in big data stream processing platforms, a dynamic load balancing strategy based on a load-awareness algorithm was proposed and applied to the data stream processing platform Apache Flink. Firstly, the computational delay of the nodes was obtained by depth-first search over the Directed Acyclic Graph (DAG) and used as the basis for evaluating node performance, on which the load balancing strategy was built. Secondly, load migration for data streams was implemented on top of a data block management strategy, and both global and local load optimization were achieved through feedback. Finally, the feasibility of the algorithm was shown by evaluating its time and space complexity, and the influence of important parameters on its execution was discussed. The experimental results show that the proposed algorithm increases task execution efficiency by optimizing load sharing between nodes, shortening task execution time by 6.51% on average compared with the traditional load balancing strategy of Apache Flink.
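The first step, obtaining each node's computational delay by depth-first search over the DAG, can be sketched as a memoized critical-path computation. The toy operator graph and delay values below are invented:

```python
def accumulated_delay(dag, node_delay):
    """Depth-first computation of the longest (critical-path) delay
    from each node to any sink of a DAG, with memoization."""
    memo = {}

    def dfs(u):
        if u in memo:
            return memo[u]
        downstream = max((dfs(v) for v in dag.get(u, [])), default=0.0)
        memo[u] = node_delay[u] + downstream
        return memo[u]

    for u in node_delay:
        dfs(u)
    return memo

dag = {"src": ["map"], "map": ["agg", "filter"], "agg": ["sink"],
       "filter": ["sink"], "sink": []}
delays = {"src": 1.0, "map": 2.0, "agg": 4.0, "filter": 1.0, "sink": 1.0}
total = accumulated_delay(dag, delays)
```

Nodes on the slowest path (here `src` through `agg`) would be the first candidates for load migration.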
Coordinator selection strategy based on RAMCloud
WANG Yuefei, YU Jiong, LU Liang
Journal of Computer Applications    2016, 36 (9): 2402-2408.   DOI: 10.11772/j.issn.1001-9081.2016.09.2402
Focusing on the issue that ZooKeeper cannot meet RAMCloud's requirements for low latency and fast recovery, a Coordinator Election Strategy (CES) based on RAMCloud was proposed. First of all, according to RAMCloud's network environment and the coordinator's own factors, the performance indexes of the coordinator were divided into two categories, individual indexes and coordinator indexes, and modeled separately. Next, RAMCloud operation was divided into an error-free running period and a data recovery period; a fitness function was built for each, and the two were merged into a total fitness function according to their time ratio. Lastly, based on the fitness values of the RAMCloud Backup Coordinators (RBCs), a new operator was proposed that combines randomness with the ability to select an ideal target: CES first eliminates poorly performing RBCs by screening, and as the range of choice narrows, selects the final RBC from the collection of ideal coordinators by roulette wheel. The experimental results show that, compared with other RBCs in an NS2 simulation environment, the coordinator selected by CES decreases latency by 19.35%, and compared with ZooKeeper in the RAMCloud environment, it reduces recovery time by 10.02%. In practical RAMCloud applications, the proposed CES can choose a coordinator with better performance, meeting the demands of low latency and fast recovery.
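The screen-then-roulette operator can be sketched directly: filter out candidates below a fitness cutoff, then draw from the survivors with probability proportional to fitness. The fitness values and cutoff below are hypothetical:

```python
import random

def select_coordinator(candidates, fitness, cutoff, seed=42):
    """Screen out candidates whose fitness falls below `cutoff`, then
    pick among the rest by roulette wheel (probability proportional
    to fitness). A seeded RNG keeps the sketch reproducible."""
    pool = [c for c in candidates if fitness[c] >= cutoff]
    if not pool:
        return None
    total = sum(fitness[c] for c in pool)
    r = random.Random(seed).uniform(0, total)
    acc = 0.0
    for c in pool:
        acc += fitness[c]
        if r <= acc:
            return c
    return pool[-1]

fit = {"rbc1": 0.9, "rbc2": 0.7, "rbc3": 0.1}
chosen = select_coordinator(["rbc1", "rbc2", "rbc3"], fit, cutoff=0.5)
```

The screening step guarantees a poorly performing backup (`rbc3` here) can never win the draw, while the roulette step preserves randomness among acceptable candidates.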
Reference | Related Articles | Metrics
Parallel access strategy for big data objects based on RAMCloud
CHU Zheng, YU Jiong, LU Liang, YING Changtian, BIAN Chen, WANG Yuefei
Journal of Computer Applications    2016, 36 (6): 1526-1532.   DOI: 10.11772/j.issn.1001-9081.2016.06.1526
Abstract682)      PDF (1195KB)(425)       Save
RAMCloud only supports small objects no larger than 1 MB. When an object larger than 1 MB needs to be stored in a RAMCloud cluster, it is rejected because of this size constraint, so big data objects cannot be stored in RAMCloud. To resolve this storage limitation, a parallel access strategy for big data objects based on RAMCloud was proposed. Firstly, the big data object was divided into several small data objects of at most 1 MB, and a data summary was created in the client. The small data objects produced in the client were then stored in the RAMCloud cluster through the parallel access strategy. In the reading stage, the data summary was read first, then the small data objects were read in parallel from the RAMCloud cluster according to the summary and merged back into the big data object. The experimental results show that, without altering the architecture of the RAMCloud cluster, the storage time of the proposed strategy reaches 16 to 18 μs and the reading time reaches 6 to 7 μs. Under the InfiniBand network framework, the speedup of the proposed parallel strategy increases almost linearly, allowing big data objects to be accessed rapidly and efficiently at the microsecond level, just like small data objects.
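The split/summary/merge flow described above can be sketched as follows. The `"key#i"` naming convention for sub-objects and the summary fields are hypothetical; in the real system the writes and reads of the parts would be issued in parallel against the cluster.

```python
# Sketch of the write/read path: split a big object into sub-objects of
# at most 1 MB, record them in a client-side summary, reassemble on read.

CHUNK = 1 << 20  # RAMCloud's 1 MB object limit

def split_object(key, data):
    offsets = range(0, len(data), CHUNK)
    parts = [f"{key}#{i}" for i in range(len(offsets))]     # hypothetical keys
    chunks = {p: data[off:off + CHUNK] for p, off in zip(parts, offsets)}
    summary = {"key": key, "size": len(data), "parts": parts}
    return summary, chunks          # chunks would be written in parallel

def merge_object(summary, store):
    # reads of the parts could likewise be issued in parallel
    return b"".join(store[part] for part in summary["parts"])

store = {}
data = b"x" * (2 * CHUNK + 100)
summary, chunks = split_object("video", data)
store.update(chunks)
print(len(chunks), merge_object(summary, store) == data)  # 3 True
```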
Reference | Related Articles | Metrics
Strategy for object index based on RAMCloud
WANG Yuefei, YU Jiong, LU Liang
Journal of Computer Applications    2016, 36 (5): 1222-1227.   DOI: 10.11772/j.issn.1001-9081.2016.05.1222
Abstract457)      PDF (876KB)(511)       Save
To improve memory utilization, RAMCloud moves objects in memory, which causes hash-based object localization to fail and lowers the efficiency of data search. Moreover, because the needed data cannot be located quickly during data recovery, the segments returned by individual backups cannot be organized effectively. To address these problems, a RAMCloud Global Key (RGK) and binary index trees were proposed. RGK consists of three parts: a master part, a segment part, and an object part. The first two parts constitute the Coordinator Index Key (CIK), with which, during recovery, the Coordinator Index Tree (CIT) can locate the master of a segment. The last two parts constitute the Master Index Key (MIK), with which the Master Index Tree (MIT) can retrieve objects quickly even after data has been moved in memory. Compared with a traditional RAMCloud cluster, the time to retrieve objects is clearly reduced as data throughput increases, and both the idle time of the coordinator and the log recombination time decline. The experimental results show that the global key, supported by the binary index trees, reduces both object retrieval time and recovery time.
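The three-part key layout and its two overlapping sub-keys can be illustrated as bit fields. The field widths below are invented for illustration; only the master/segment/object composition and the CIK/MIK split come from the abstract.

```python
# Hypothetical illustration of the three-part RAMCloud Global Key (RGK):
# [ master id | segment id | object id ]. CIK = first two fields (used by
# the Coordinator Index Tree), MIK = last two (used by the Master Index
# Tree). Bit widths are assumptions.

MASTER_BITS, SEGMENT_BITS, OBJECT_BITS = 16, 24, 24

def make_rgk(master, segment, obj):
    return (master << (SEGMENT_BITS + OBJECT_BITS)) | (segment << OBJECT_BITS) | obj

def cik(rgk):
    """Coordinator Index Key: master + segment fields."""
    return rgk >> OBJECT_BITS

def mik(rgk):
    """Master Index Key: segment + object fields."""
    return rgk & ((1 << (SEGMENT_BITS + OBJECT_BITS)) - 1)

key = make_rgk(master=3, segment=7, obj=42)
print(cik(key) == ((3 << SEGMENT_BITS) | 7))   # True
print(mik(key) == ((7 << OBJECT_BITS) | 42))   # True
```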
Reference | Related Articles | Metrics
Automatic short text summarization method based on multiple mapping
LU Ling, YANG Wu, CAO Qiong
Journal of Computer Applications    2016, 36 (2): 432-436.   DOI: 10.11772/j.issn.1001-9081.2016.02.0432
Abstract561)      PDF (860KB)(947)       Save
Traditional automatic text summarization generally imposes no word-count requirement, whereas many social network platforms limit the length of posts. Under such a word-count limit, traditional digest techniques can hardly achieve balanced performance on short texts. In view of this problem, a new automatic short text summarization method was proposed. Firstly, the values of relationship mapping, length mapping, title mapping and position mapping were calculated to form sets of candidate sentences. Secondly, the candidate sentence sets were mapped to the abstract sentence set by multiple mapping strategies according to a series of multiple mapping rules, and the recall ratio was increased by putting central sentences into the abstract sentence set. The experimental results show that multiple mapping achieves stable performance in short text summarization: the F-measures on the ROUGE-1 and ROUGE-2 tests are 0.49 and 0.35 respectively, better than the average level of the NLP&CC2015 evaluation, proving the effectiveness of the method.
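A much-simplified sketch of the idea is below: score each sentence with several mapping values (title overlap, position, length), then fill the summary greedily under a word-count limit. The specific scoring formulas and weights are invented and do not reproduce the paper's mapping rules.

```python
# Illustrative sketch (features and weights are assumptions): combine
# title, position, and length mapping values per sentence, then select
# sentences under a word-count limit, as short-text platforms require.

def summarize(sentences, title_words, limit):
    def score(idx, sent):
        words = set(sent.split())
        title_map = len(words & title_words) / (len(title_words) or 1)
        position_map = 1.0 if idx == 0 else 1.0 / (idx + 1)
        length_map = min(len(sent.split()) / 10.0, 1.0)
        return title_map + position_map + length_map

    ranked = sorted(enumerate(sentences), key=lambda p: score(*p), reverse=True)
    summary, count = [], 0
    for idx, sent in ranked:
        n = len(sent.split())
        if count + n <= limit:          # respect the word-count limit
            summary.append((idx, sent))
            count += n
    return [s for _, s in sorted(summary)]   # restore original order

sents = ["Alpha launches new chip", "The weather was fine",
         "Chip sales grew fast"]
print(summarize(sents, {"Alpha", "chip"}, limit=4))
```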
Reference | Related Articles | Metrics
News recommendation method by fusion of content-based recommendation and collaborative filtering
YANG Wu, TANG Rui, LU Ling
Journal of Computer Applications    2016, 36 (2): 414-418.   DOI: 10.11772/j.issn.1001-9081.2016.02.0414
Abstract895)      PDF (678KB)(1562)       Save
To solve the poor diversity of user interests in content-based news recommendation and the cold-start problem in hybrid recommendation, a news recommendation method fusing content-based recommendation and collaborative filtering was proposed. Firstly, the content-based method was used to identify the user's existing interests. Secondly, a similar-user group for the target user was found using a hybrid similarity measure combining content similarity and behavior similarity, and the user's potential interests were discovered by predicting the user's interest in feature words. Next, a user interest model with both personalization and diversity was obtained by fusing the user's existing and potential interests. Lastly, the recommendation list was output after computing the similarity between candidate news and the fused model. The experimental results show that, compared with content-based recommendation methods, the proposed method clearly increases the F-measure and diversity; it performs on par with the hybrid recommendation method, yet needs no time to accumulate user clicks on candidate news and thus has no cold-start problem.
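The hybrid similarity used to find the target user's neighbors can be sketched as a weighted mix of content similarity (interest-word vectors) and behavior similarity (co-clicked news). The 50/50 weighting and the toy profiles are assumptions, not the paper's parameters.

```python
# Sketch of a hybrid user similarity: cosine over interest-word vectors
# (content) blended with cosine over click vectors (behavior).

def cosine(a, b):
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def hybrid_similarity(u, v, alpha=0.5):   # alpha: assumed content weight
    content = cosine(u["interest"], v["interest"])
    behavior = cosine(u["clicks"], v["clicks"])
    return alpha * content + (1 - alpha) * behavior

u = {"interest": {"sports": 1, "tech": 2}, "clicks": {"n1": 1, "n2": 1}}
v = {"interest": {"tech": 2, "finance": 1}, "clicks": {"n2": 1, "n3": 1}}
print(round(hybrid_similarity(u, v), 3))  # 0.65
```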
Reference | Related Articles | Metrics
Optimization of spherical Voronoi diagram generating algorithm based on graphic processing unit
WANG Lei, WANG Pengfei, ZHAO Xuesheng, LU Lituo
Journal of Computer Applications    2015, 35 (6): 1564-1566.   DOI: 10.11772/j.issn.1001-9081.2015.06.1564
Abstract680)      PDF (612KB)(487)       Save

The spherical Voronoi diagram generating algorithm based on distance computation and comparison over the Quaternary Triangular Mesh (QTM) offers higher precision than the dilation algorithm, but its massive distance computation and comparison lead to low efficiency. To improve efficiency, the algorithm was implemented with Graphics Processing Unit (GPU) parallel computation, and then optimized with respect to its use of GPU shared memory, constant memory and registers. Finally, an experimental system was developed in C++ with Compute Unified Device Architecture (CUDA) to compare efficiency before and after the optimization. The experimental results show that reasonable use of the different GPU memories improves efficiency considerably, and that a higher speed-up ratio is obtained as the data scale grows.

Reference | Related Articles | Metrics
Energy-efficient strategy of distributed file system based on data block clustering storage
WANG Zhengying, YU Jiong, YING Changtian, LU Liang
Journal of Computer Applications    2015, 35 (2): 378-382.   DOI: 10.11772/j.issn.1001-9081.2015.02.0378
Abstract597)      PDF (766KB)(474)       Save

Concerning the low server utilization and complicated energy management caused by random block placement strategies in distributed file systems, a visiting-feature vector was built for each data block to characterize its random access behavior. The K-means algorithm was adopted to cluster the blocks according to these vectors, and the datanodes were divided into multiple regions to store the different block clusters. When the system load was low, the data blocks were dynamically reorganized according to the clustering results, so that unneeded datanodes could sleep to reduce energy consumption. The flexible setting of the inter-cluster distance parameter makes the strategy suitable for scenarios with different requirements on energy consumption and utilization. Mathematical analysis and experimental results show that, compared with hot-cold zoning strategies, the proposed method achieves higher energy-saving efficiency, reducing energy consumption by 35% to 38%.
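The clustering step can be sketched as below. Here each block's visiting-feature vector is reduced to just (access frequency, recency); the real feature set, the value of K, and the numbers are all assumptions for illustration.

```python
import random

# Sketch: cluster blocks by a toy visiting-feature vector with a small
# hand-rolled K-means, then each cluster would map to a storage zone.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # recompute centers; keep the old one if a cluster went empty
        centers = [tuple(sum(xs) / len(xs) for xs in zip(*c)) if c
                   else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hot blocks (frequent, recent) vs cold blocks (rare, stale) -- invented data
blocks = [(90, 1), (85, 2), (88, 1), (5, 40), (3, 55), (4, 60)]
centers, clusters = kmeans(blocks, k=2)
print(sorted(len(c) for c in clusters))  # the hot and cold groups separate
```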

Reference | Related Articles | Metrics
Data migration model based on RAMCloud hierarchical storage architecture
GUO Gang, YU Jiong, LU Liang, YING Changtian, YIN Lutong
Journal of Computer Applications    2015, 35 (12): 3392-3397.   DOI: 10.11772/j.issn.1001-9081.2015.12.3392
Abstract597)      PDF (878KB)(386)       Save
In order to store and access huge amounts of online data efficiently, a Migration Model based on Data Significance (MMDS) was proposed under the hierarchical storage architecture of RAMCloud. Firstly, the intrinsic importance of each data item was calculated from factors such as its size, time-based importance and total amount of user access. Secondly, the potential value of the data was evaluated by adopting user similarity from recommender systems and the importance ranking of the PageRank algorithm; the overall importance of the data was determined jointly by its intrinsic importance and its potential value. A data migration mechanism was then designed based on this importance measure. The experimental results show that the proposed model can identify the importance of data and place it hierarchically, improving the data access hit rate of the storage system compared with the Least Recently Used (LRU), Least Frequently Used (LFU) and Migration Strategy based on Data Value (MSDV) algorithms, thereby alleviating part of the storage pressure and improving data access performance.
Reference | Related Articles | Metrics
Video recommendation algorithm fusing comment analysis and latent factor model
YIN Lutong, YU Jiong, LU Liang, YING Changtian, GUO Gang
Journal of Computer Applications    2015, 35 (11): 3247-3251.   DOI: 10.11772/j.issn.1001-9081.2015.11.3247
Abstract572)      PDF (790KB)(693)       Save
Video recommendation is still confronted with challenges such as the lack of metadata for online videos and the difficulty of extracting features directly from multimedia data. Therefore, a Video Recommendation algorithm Fusing Comment analysis and Latent factor model (VRFCL) was proposed. Starting from video comments, it first analyzed the sentiment orientation of user comments on multiple videos, producing numeric values that represent each user's attitude towards the corresponding video. It then constructed a virtual rating matrix from these values, which alleviates data sparsity to some extent. Considering the diversity and high dimensionality of online videos, and in order to mine users' latent interest in them more deeply, the Latent Factor Model (LFM) was adapted to categorize online videos; LFM adds a latent category feature to the traditional dual user-item relationship of recommendation systems. A series of experiments on YouTube review data demonstrate the effectiveness of the VRFCL algorithm.
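The factorization stage can be sketched with a plain SGD latent factor model over the comment-derived virtual ratings. The ratings, hyperparameters, and plain-SGD formulation are assumptions; they illustrate the technique rather than reproduce VRFCL.

```python
import random

# Toy LFM sketch: factorize the virtual rating matrix (values derived
# from comment sentiment) so missing user-video affinities can be
# predicted. All numbers and hyperparameters are invented.

def lfm(ratings, n_users, n_items, k=2, lr=0.05, reg=0.01, epochs=200, seed=1):
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):          # regularized SGD update
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# virtual ratings (user, video, sentiment score in [0, 1]) -- made up
ratings = [(0, 0, 0.9), (0, 1, 0.8), (1, 0, 0.85), (1, 2, 0.2), (2, 2, 0.25)]
P, Q = lfm(ratings, n_users=3, n_items=3)
pred = sum(P[0][f] * Q[0][f] for f in range(2))
print(round(pred, 2))  # should approach the observed 0.9
```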
Reference | Related Articles | Metrics
Energy-efficient strategy for disks in RAMCloud
LU Liang YU Jiong YING Changtian WANG Zhengying LIU Jiankuang
Journal of Computer Applications    2014, 34 (9): 2518-2522.   DOI: 10.11772/j.issn.1001-9081.2014.09.2518
Abstract254)      PDF (777KB)(446)       Save

The emergence of RAMCloud has improved the user experience of Online Data-Intensive (OLDI) applications, but its energy consumption is higher than that of traditional cloud data centers. An energy-efficient strategy for disks under this architecture was put forward to solve the problem. Firstly, the fitness function and roulette-wheel selection of the genetic algorithm were introduced to choose energy-saving disks for persistent data backup; secondly, a reasonable buffer size was chosen to extend the average continuous idle time of disks, so that some of them could be put into standby during idle periods. The simulation results show that the proposed strategy saves about 12.69% of energy in a given RAMCloud system with 50 servers. The buffer size affects both the energy-saving effect and data availability, and must therefore be weighed carefully.

Reference | Related Articles | Metrics
Energy-efficient strategy for dynamic management of cloud storage replica based on user visiting characteristic
WANG Zhengying YU Jiong YING Changtian LU Liang BAN Aiqin
Journal of Computer Applications    2014, 34 (8): 2256-2259.   DOI: 10.11772/j.issn.1001-9081.2014.08.2256
Abstract398)      PDF (793KB)(579)       Save

For the low server utilization and serious waste of energy in cloud computing environments, an energy-efficient strategy for the dynamic management of cloud storage replicas based on user visiting characteristics was put forward. By transforming the study of user visiting characteristics into calculating the visiting temperature of each Block, a DataNode actively applies for sleep according to the global visiting temperature, thereby saving energy. The dormancy application and dormancy verification algorithms were given in detail, and the strategy for handling accesses that arrive during DataNode dormancy was described explicitly. The experimental results show that after adopting this strategy, 29%-42% of DataNodes can sleep, energy consumption is reduced by 31%, and server response time remains acceptable. The performance analysis shows that the proposed strategy effectively reduces energy consumption while guaranteeing data availability.
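The sleep decision can be sketched as below: a DataNode sums the visiting temperature of the Blocks it hosts and applies for dormancy when the total falls under a threshold. The exponential-decay temperature formula, half-life, and threshold are invented stand-ins for the paper's definitions.

```python
import math

# Hedged sketch of the dormancy decision. Temperature of a block decays
# with the age of each access (half-life and threshold are assumptions).

def block_temperature(access_times, now, half_life=3600.0):
    """Newer accesses contribute more; each halves per `half_life` seconds."""
    return sum(0.5 ** ((now - t) / half_life) for t in access_times)

def may_sleep(node_blocks, now, threshold=1.0):
    """node_blocks: {block_id: [access timestamps]}. True if the node's
    total visiting temperature is below the dormancy threshold."""
    total = sum(block_temperature(ts, now) for ts in node_blocks.values())
    return total < threshold

cold_node = {"blk1": [0.0], "blk2": [100.0]}    # accessed long ago
hot_node = {"blk1": [9990.0, 9995.0, 9999.0]}   # accessed just now
print(may_sleep(cold_node, now=10000.0), may_sleep(hot_node, now=10000.0))
```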

Reference | Related Articles | Metrics
Improvement and simulation of K-shortest-paths algorithm in international flight route network
HU Xin XU Tao DING Xialu LI Jianfu
Journal of Computer Applications    2014, 34 (4): 1192-1195.   DOI: 10.11772/j.issn.1001-9081.2014.04.1192
Abstract516)      PDF (654KB)(535)       Save

The K-Shortest-Paths (KSP) problem is an optimization issue in international flight route networks. Based on an analysis of the international flight route network and KSP algorithms, the typical Yen algorithm for solving the KSP problem was investigated. To address the large amount of time Yen's algorithm spends generating candidate paths, an improved Yen algorithm was proposed that adopts the heuristic strategy of the A* algorithm, which reduces the time to generate candidate paths and thereby improves search efficiency while shrinking the search scale. Simulation results on an international flight route network example show that the improved Yen algorithm can quickly solve the KSP problem; compared with the original Yen algorithm, its efficiency is increased by 75.19%, so it can provide decision support for international flight route optimization.
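The core of the speedup, replacing blind search with A* when generating each candidate deviation path, can be sketched as follows. The toy route network, leg costs, and heuristic values are invented; the heuristic must be an admissible lower bound on the remaining distance.

```python
import heapq

# A* shortest path, the building block the improved Yen algorithm uses
# for candidate-path generation. graph: {node: {neighbor: edge_cost}},
# h: admissible heuristic lower bound to the goal.

def a_star(graph, h, start, goal):
    frontier = [(h[start], 0.0, start, [start])]
    best = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in best and best[node] <= g:
            continue                      # already expanded more cheaply
        best[node] = g
        for nxt, w in graph[node].items():
            heapq.heappush(frontier, (g + w + h[nxt], g + w, nxt, path + [nxt]))
    return float("inf"), []

# Toy flight network (invented leg costs and heuristic values)
graph = {"PEK": {"DXB": 6, "SVO": 7}, "DXB": {"LHR": 7},
         "SVO": {"LHR": 4}, "LHR": {}}
h = {"PEK": 9, "DXB": 6, "SVO": 4, "LHR": 0}
cost, path = a_star(graph, h, "PEK", "LHR")
print(cost, path)  # 11.0 ['PEK', 'SVO', 'LHR']
```

In the improved Yen algorithm, each spur-path computation would call such an A* search instead of an uninformed shortest-path search, which is where the reported time savings come from.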

Reference | Related Articles | Metrics
Optimal storing strategy based on small files in RAMCloud
YING Changtian YU Jiong LU Liang LIU Jiankuang
Journal of Computer Applications    2014, 34 (11): 3104-3108.   DOI: 10.11772/j.issn.1001-9081.2014.11.3104
Abstract387)      PDF (782KB)(702)       Save

RAMCloud stores data in a log-segment structure. When a large number of small files are stored in RAMCloud, each small file occupies a whole segment, which leads to heavy fragmentation inside segments and low memory utilization. To solve this small-file problem, a storage optimization strategy based on file classification was proposed. Firstly, small files were classified into three categories: structurally related, logically related, and independent files. Before uploading, a merging algorithm and a grouping algorithm were used to process these files respectively. The experiments demonstrate that, compared with unoptimized RAMCloud, the proposed strategy improves memory utilization.
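The merging idea can be sketched as a simple first-fit-decreasing packing of small files into 1 MB segments, so each segment holds many files instead of one. The packing heuristic and file sizes are illustrative assumptions; only the 1 MB segment size is RAMCloud's.

```python
# Sketch: pack small files into 1 MB log segments (first-fit decreasing),
# reducing the one-file-per-segment fragmentation described above.

SEGMENT = 1 << 20  # 1 MB segment size

def pack_files(files):
    """files: list of (name, size). Returns list of segments,
    each a list of file names."""
    segments, free = [], []
    for name, size in sorted(files, key=lambda f: -f[1]):  # biggest first
        for i, room in enumerate(free):
            if size <= room:              # fits in an existing segment
                segments[i].append(name)
                free[i] -= size
                break
        else:                             # open a new segment
            segments.append([name])
            free.append(SEGMENT - size)
    return segments

files = [("a", 700_000), ("b", 300_000), ("c", 500_000), ("d", 200_000)]
segs = pack_files(files)
print(len(segs))  # 2 segments instead of 4 one-file segments
```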

Reference | Related Articles | Metrics