Search Result

Evaluation of training efficiency and training performance of graph neural network models based on distributed environment

Yinchuan TU, Yong GUO, Heng MAO, Yi REN, Jianfeng ZHANG, Bao LI

Journal of Computer Applications 2025, 45 (8): 2409-2420. DOI: 10.11772/j.issn.1001-9081.2024081140

With the rapid growth of graph data, Graph Neural Networks (GNNs) face computational and storage challenges when processing large-scale graph-structured data. Traditional stand-alone training can no longer cope with increasingly large datasets and complex GNN models. Distributed training is an effective way to address these problems thanks to its parallel computing power and scalability. However, existing evaluations of distributed GNN training have two shortcomings. First, they focus mainly on performance metrics such as model accuracy and efficiency metrics such as training time, and pay less attention to data processing efficiency and computational resource utilization. Second, algorithm efficiency is mostly evaluated on a single machine with one card or multiple cards, and evaluation methods for distributed environments remain relatively simple. To address these shortcomings, an evaluation method for model training in distributed scenarios was proposed, covering three aspects: evaluation metrics, datasets, and models. Following this method, three representative GNN models were selected, and distributed training experiments were conducted on four large open graph datasets with different data characteristics to collect and analyze the evaluation metrics. Experimental results show that model complexity, training time, computing node throughput, and computing Node Average Throughput Ratio (NATR) are all influenced by model architecture and data structure characteristics in distributed training; sample processing and data copying take up much of the training time, and the time one computing node spends waiting for other computing nodes cannot be ignored either; compared with stand-alone training, distributed training significantly reduces computing node throughput, so resource utilization in distributed systems needs further optimization. The proposed evaluation method provides a reference for optimizing GNN training performance in a distributed environment and establishes an experimental foundation for further model optimization and algorithm improvement.
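
The throughput metrics named in the abstract can be illustrated with a small sketch. The abstract does not give the formula for NATR, so the ratio below (mean per-node throughput in distributed training divided by stand-alone throughput) is an assumption made purely for illustration; `NodeRun`, `throughput`, and `average_throughput_ratio` are hypothetical names, not the authors' code.

```python
from dataclasses import dataclass


@dataclass
class NodeRun:
    """Hypothetical per-node measurements for one training epoch."""
    samples: int      # training samples processed by this node
    seconds: float    # wall-clock time the node spent on the epoch


def throughput(run: NodeRun) -> float:
    """Samples processed per second on one computing node."""
    return run.samples / run.seconds


def average_throughput_ratio(nodes: list[NodeRun], standalone: NodeRun) -> float:
    """Mean per-node throughput divided by the stand-alone throughput.

    ASSUMPTION: this ratio is an illustrative stand-in; the paper's exact
    NATR definition is not stated in the abstract.
    """
    mean_tp = sum(throughput(n) for n in nodes) / len(nodes)
    return mean_tp / throughput(standalone)


if __name__ == "__main__":
    # Made-up measurements, used only to exercise the functions.
    nodes = [NodeRun(samples=1000, seconds=2.0), NodeRun(samples=1000, seconds=4.0)]
    baseline = NodeRun(samples=2000, seconds=2.0)
    print(average_throughput_ratio(nodes, baseline))
```

Under this assumed definition, a value below 1 would mean that each node in the distributed run processes fewer samples per second than the stand-alone machine, which is the kind of effect the abstract reports.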

Audit log association rule mining based on improved Apriori algorithm

XU Kaiyong, GONG Xuerong, CHENG Maocai

Journal of Computer Applications 2016, 36 (7): 1847-1851. DOI: 10.11772/j.issn.1001-9081.2016.07.1847

To address the low level of intelligence of security audit systems and their poor utilization of audit logs, a security audit system based on association rule mining was proposed. The system makes full use of existing audit logs and applies data mining techniques to build a behavior pattern database of users and the system, so that abnormal situations are discovered in a timely manner and the security of the computer system is improved. An improved E-Apriori algorithm was also proposed, which narrows the scanning range of the transaction set, lowers the time complexity, and improves operating efficiency. The experimental results indicate that, in the security audit system based on association rule mining, the ability to identify attack types improves by up to 10%. The proposed E-Apriori algorithm clearly outperforms the traditional Apriori and FP-Growth algorithms, with a maximum improvement of 51%, especially on large sparse datasets.
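
For orientation, the baseline that the paper improves on can be sketched as follows. This is the classic Apriori frequent-itemset search, not the authors' E-Apriori (whose scan-narrowing step is not described in the abstract). The function name `apriori` and the sample event names are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classic Apriori: return {itemset: support count} for frequent itemsets.

    `min_support` is an absolute count of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one full scan of the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


if __name__ == "__main__":
    # Hypothetical audit-log sessions, each reduced to a set of event types.
    logs = [
        {"login_fail", "sudo", "ssh"},
        {"login_fail", "ssh"},
        {"login_fail", "sudo"},
        {"ssh", "sudo"},
        {"login_fail", "ssh", "sudo"},
    ]
    for itemset, count in apriori(logs, min_support=3).items():
        print(sorted(itemset), count)
```

The count step rescans every transaction at each level; according to the abstract, this scanning range is what E-Apriori narrows.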

Stateless communication mechanism between an IPv4 network and the IPv6 Internet

HAN Guoliang, SHENG Maojia, BAO Congxiao, LI Xing

Journal of Computer Applications 2015, 35 (8): 2113-2117. DOI: 10.11772/j.issn.1001-9081.2015.08.2113

In the IPv4/IPv6 transition process, some legacy IPv4 networks still need to communicate with the IPv6 Internet. A stateless communication mechanism between an IPv4 network and the IPv6 Internet, which complements the current IPv4/IPv6 translation framework, was therefore proposed. First, the communication procedures were demonstrated for two related scenarios: IPv6 Internet clients accessing IPv4 servers, and IPv4 clients accessing IPv6 Internet servers. Because the one-way IPv6-IPv4 address mapping function is the key component of the mechanism, its requirements and three quantitative criteria were discussed. Multiple hash functions were then compared and analyzed as candidates for the one-way mapping function, using real user data from large IPv6 websites and real IPv6 server addresses. The simulation results show that the FarmCity Hash function is suitable for deployment in both scenarios because it has a short average processing time, a low collision rate, and low reverse query complexity. The results also verify the validity of the stateless communication mechanism. Compared with current stateful communication mechanisms, the stateless mechanism has better scalability and traceability. Moreover, the capacity for bidirectional communication facilitates a smooth migration path towards the IPv6 Internet.
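
The idea of a hash-based one-way IPv6-IPv4 mapping, and of measuring its collision rate, can be sketched roughly as below. This is only an illustration under stated assumptions: SHA-256 from the Python standard library stands in for the hash functions the paper compares, the `10.0.0.0/8` pool is a hypothetical choice, and `map_ipv6_to_ipv4` and `collision_rate` are made-up names rather than the authors' design.

```python
import hashlib
import ipaddress


def map_ipv6_to_ipv4(ipv6: str, pool: str = "10.0.0.0/8") -> ipaddress.IPv4Address:
    """Statelessly map an IPv6 address to an IPv4 address inside `pool`.

    ASSUMPTION: SHA-256 is a stand-in hash; the pool prefix is hypothetical.
    No per-flow table is kept, so any translator computes the same result.
    """
    net = ipaddress.IPv4Network(pool)
    addr = ipaddress.IPv6Address(ipv6)  # normalizes equivalent spellings
    digest = hashlib.sha256(addr.packed).digest()
    offset = int.from_bytes(digest[:8], "big") % net.num_addresses
    return net.network_address + offset


def collision_rate(ipv6_addrs, pool: str = "10.0.0.0/8") -> float:
    """Fraction of distinct IPv6 inputs lost to collisions after mapping."""
    unique = {ipaddress.IPv6Address(a) for a in ipv6_addrs}
    if not unique:
        return 0.0
    mapped = {map_ipv6_to_ipv4(str(a), pool) for a in unique}
    return 1 - len(mapped) / len(unique)


if __name__ == "__main__":
    # Documentation-prefix addresses (2001:db8::/32), used only as sample input.
    sample = [f"2001:db8::{i:x}" for i in range(1, 1001)]
    print(map_ipv6_to_ipv4(sample[0]))
    print(collision_rate(sample))
```

Because the mapping depends only on the input address, it needs no shared state between translators. The smaller the IPv4 pool, the higher the expected collision rate, which is why collision rate appears among the evaluation criteria in the abstract.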
