Journal of Computer Applications, 2017, Vol. 37, Issue (11): 3335-3338. DOI: 10.11772/j.issn.1001-9081.2017.11.3335


Session identification algorithm based on dynamic time threshold of adjacent requests

ZENG Ling1,2, XIAO Ruliang1,2   

  1. Faculty of Software, Fujian Normal University, Fuzhou, Fujian 350117, China;
  2. Fujian Provincial Engineering Research Center of Public Service Big Data Mining and Application, Fuzhou, Fujian 350117, China
  • Received:2017-05-19 Revised:2017-07-28 Online:2017-11-10 Published:2017-11-11
  • Supported by:
    This work is partially supported by the Key Project of the Fujian Scientific and Technological Plan (2016H6007) and the City-School Cooperation Project of Fuzhou (2016-G-40).

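  • Corresponding author: XIAO Ruliang (肖如良)
  • About the authors: ZENG Ling (曾令), born 1993, female, from Xiaogan, Hubei, M.S. candidate; main research interest: machine learning. XIAO Ruliang (肖如良), born 1966, male, from Loudi, Hunan, professor, Ph.D., senior member of CCF; main research interests: Web intelligent recommender systems, software engineering, and system virtualization.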

Abstract: To improve the efficiency of session sequence modeling in anomaly detection on big data platforms, a session identification algorithm based on Dynamic Adjustive Interval Time threShold of adjacent requests (DAITS) is proposed. First, a website page factor is combined with an average factor of users' page access time. Then, an appropriate weighting factor between the two is used to adjust the time threshold dynamically. Finally, sessions are divided according to whether the time threshold is exceeded. Experimental results show that, compared with traditional fixed-threshold methods, DAITS increases the precision of session identification by 14.8% and the recall by 13.2%; compared with existing methods that adjust the threshold dynamically, it increases precision by 6.2% and recall by 3.2%.

Key words: anomaly detection, session identification, session sequence, adjacent request, dynamic time threshold
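
The three steps named in the abstract (combine two factors, weight them, split on the resulting threshold) can be illustrated with a minimal Python sketch. The abstract does not give the DAITS formulas, so everything concrete here is an assumption made for illustration: the `Request` type, the `dynamic_threshold` and `split_sessions` helpers, the linear weighted combination, the base value of 600 seconds, and the sample factor values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    timestamp: float  # seconds; requests of one user, sorted by time
    page: str


def dynamic_threshold(base: float, page_factor: float,
                      user_factor: float, weight: float) -> float:
    # ASSUMPTION: a simple weighted mix of the two factors scales a base
    # threshold. The paper's actual formula is not given in the abstract.
    return base * (weight * page_factor + (1.0 - weight) * user_factor)


def split_sessions(requests: List[Request],
                   page_factors: Dict[str, float],
                   user_factor: float,
                   base: float = 600.0,
                   weight: float = 0.5) -> List[List[Request]]:
    """Start a new session whenever the gap between two adjacent requests
    exceeds the threshold computed for the earlier request's page."""
    sessions: List[List[Request]] = []
    current: List[Request] = []
    for req in requests:
        if current:
            prev = current[-1]
            # Pages without a known factor fall back to a neutral 1.0.
            limit = dynamic_threshold(
                base, page_factors.get(prev.page, 1.0), user_factor, weight)
            if req.timestamp - prev.timestamp > limit:
                sessions.append(current)
                current = []
        current.append(req)
    if current:
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    log = [Request(0, "/index"), Request(100, "/article"),
           Request(1500, "/index"), Request(1550, "/about")]
    factors = {"/index": 0.5, "/article": 2.0}  # hypothetical page factors
    for session in split_sessions(log, factors, user_factor=1.0):
        print([r.page for r in session])
```

In this sketch the threshold after `/article` is 900 seconds, so the 1400-second gap before the second `/index` request opens a new session, while a single fixed threshold of 1800 seconds would have kept all four requests together.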

CLC Number: