Journal of Computer Applications, 2020, Vol. 40, Issue 4: 978-984. DOI: 10.11772/j.issn.1001-9081.2019081327

• Artificial intelligence •

Automatic smart contract classification model based on hierarchical attention mechanism and bidirectional long short-term memory neural network

WU Yuxin1, CAI Ting2,3, ZHANG Dabin1   

  1. School of Big Data and Computer Science, Guangdong Baiyun University, Guangzhou Guangdong 510450, China;
    2. School of Data and Computer Science, Sun Yat-sen University, Guangzhou Guangdong 510006, China;
    3. School of Big Data and Software Engineering, College of Mobile Telecommunications, Chongqing University of Posts and Telecommunications, Chongqing 401520, China
  • Received: 2019-08-01 Revised: 2019-09-05 Online: 2020-04-10 Published: 2019-09-29
  • Supported by:
    This work is partially supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-K201802401) and the 2018 Annual Scientific Research Program of Guangdong Baiyun University (2018BYKYK05).

基于层级注意力机制与双向长短期记忆神经网络的智能合约自动分类模型

吴雨芯1, 蔡婷2,3, 张大斌1   

  1. 广东白云学院 大数据与计算机学院, 广州 510450;
    2. 中山大学 数据科学与计算机学院, 广州 510006;
    3. 重庆邮电大学移通学院 大数据与软件学院, 重庆 401520
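  • Corresponding author: CAI Ting (蔡婷)
  • About the authors: WU Yuxin (吴雨芯), born in 1987, male, from Chongqing, lecturer, M.S.; main research interests: blockchain, deep learning, big data, artificial intelligence. CAI Ting (蔡婷), born in 1984, female, from Guangshui, Hubei, associate professor, Ph.D. candidate; main research interests: blockchain, Internet computing, network security models, control technology. ZHANG Dabin (张大斌), born in 1969, male, from Qianjiang, Hubei, professor, Ph.D., CCF member; main research interests: information forecasting and decision-making, data mining, business intelligence.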

Abstract: Smart contract applications on blockchain platforms are increasingly varied, which makes manually screening for suitable smart contract services increasingly difficult. To address this problem, an automatic smart contract classification model based on a hierarchical attention mechanism and a Bidirectional Long Short-Term Memory (Bi-LSTM) neural network was proposed, namely HANN-SCA (Hierarchical Attention Neural Network with Source Code and Account). Firstly, the Bi-LSTM network was used to model the smart contract from two perspectives simultaneously, source code and account information, so as to extract as much feature information about the contract as possible. The source code perspective focused on the semantic features of the code, and the account information perspective focused on the features of the account. Then, during feature learning, attention mechanisms were introduced at the word level and the sentence level to capture the words and sentences most important to smart contract classification. Finally, the code features and the account features were concatenated to generate a document-level feature representation of the smart contract, and the classification task was completed through a Softmax layer. Experimental results show that the classification accuracy of the HANN-SCA model on Dataset-E, Dataset-N and Dataset-EO reaches 93.1%, 91.7% and 92.1% respectively, clearly outperforming the traditional Support Vector Machine (SVM) model and other neural network benchmark models, and that the proposed model also has better stability and faster convergence.

Key words: smart contract classification, hierarchical attention mechanism, Bidirectional Long Short-Term Memory (Bi-LSTM) network, code semantic feature, account feature
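
The pipeline summarized in the abstract (word-level attention, sentence-level attention, concatenation of code and account features, Softmax) can be illustrated with a minimal sketch. This is not the authors' implementation: the Bi-LSTM hidden states are assumed to be given as plain lists of floats, the attention score is simplified to a dot product between tanh(h) and a context vector (no learned projection matrix), and all names (`attention_pool`, `classify`, `word_ctx`, `sent_ctx`, `acct_ctx`, `class_weights`) and numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Score each hidden state against a context vector, then return
    the attention-weighted sum of the states and the weights used.
    Simplification (assumption): score = tanh(h) . context."""
    scores = [
        sum(math.tanh(h_k) * c_k for h_k, c_k in zip(h, context))
        for h in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[k] for w, h in zip(weights, hidden_states))
        for k in range(dim)
    ]
    return pooled, weights


def classify(code_sentences, account_states,
             word_ctx, sent_ctx, acct_ctx, class_weights):
    """Hierarchical attention over code, attention over account
    features, concatenation, then a linear layer with softmax."""
    # Word level: each sentence (a list of word states) becomes one vector.
    sentence_vecs = [attention_pool(words, word_ctx)[0]
                     for words in code_sentences]
    # Sentence level: sentence vectors become one code feature vector.
    code_feat, _ = attention_pool(sentence_vecs, sent_ctx)
    # Account perspective: pooled separately.
    acct_feat, _ = attention_pool(account_states, acct_ctx)
    # Concatenate both perspectives into the document-level feature.
    doc = code_feat + acct_feat
    logits = [sum(w * x for w, x in zip(row, doc)) for row in class_weights]
    return softmax(logits)


# Toy input: 2 code "sentences" of 2-dimensional word states (made-up values).
code_sentences = [
    [[0.2, 0.1], [0.9, -0.3]],
    [[-0.5, 0.4], [0.1, 0.8], [0.3, 0.3]],
]
account_states = [[0.7, 0.2], [0.1, -0.6]]
probs = classify(
    code_sentences, account_states,
    word_ctx=[1.0, 0.5], sent_ctx=[0.3, 1.0], acct_ctx=[0.5, 0.5],
    class_weights=[[0.4, -0.2, 0.1, 0.3],
                   [-0.3, 0.5, 0.2, -0.1],
                   [0.1, 0.1, -0.4, 0.2]],
)
print(probs)  # three class probabilities that sum to 1
```

In a trained model the context vectors and classifier weights would be learned jointly with the Bi-LSTM; here they are fixed constants so that the data flow stays visible.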

