Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (4): 1056-1058.

• Information Security •

Use of picture log information in improving session identification quality

FAN Chunlong1, JIANG Hongfei2, LI Hua3

  1. Shenyang Institute of Aeronautical Engineering, New Campus, Shenyang, Liaoning
  2. Shenyang Institute of Aeronautical Engineering, Daoyi Campus, Shenyang, Liaoning
  3.
  • Received: 2009-10-12 Revised: 2009-12-03 Online: 2010-04-15 Published: 2010-04-01
  • Corresponding author: FAN Chunlong
  • Supported by:
    Liaoning Provincial Department of Education Fund

Abstract: Data preprocessing is the foundation of Web log mining, and session identification is a key step in data preprocessing whose quality strongly affects the mining results. Based on an analysis of existing session identification methods, this paper proposes using the image requests and similar log data that are normally discarded during data preprocessing, combined with an extended Web graph structure, to improve session identification quality in two respects: the page grouping rules and the path completion algorithm. Experiments confirm that the method is effective in improving session identification quality.

Key words: session identification, data preprocessing, Web graph structure, path completion, data cleaning
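
For readers unfamiliar with the preprocessing step the abstract refers to, the following is a minimal sketch of a conventional baseline, not the method proposed in the paper. It assumes log records are already parsed into dictionaries with hypothetical keys `ip`, `time`, and `url`, identifies users by IP address alone, and uses a 30-minute inactivity timeout, which is a common heuristic rather than a value taken from the paper. Image requests are set aside in a separate list instead of being thrown away, since that is the data the paper proposes to reuse; how the paper then applies them to page grouping and path completion is not shown here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed list of image suffixes; a real site may need a different set.
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".bmp", ".ico")
# Assumed inactivity threshold (a common heuristic, not taken from the paper).
SESSION_TIMEOUT = timedelta(minutes=30)


def split_records(records):
    """Separate page requests from image requests, keeping both lists."""
    pages, images = [], []
    for rec in records:
        path = rec["url"].split("?", 1)[0].lower()  # ignore the query string
        if path.endswith(IMAGE_SUFFIXES):
            images.append(rec)
        else:
            pages.append(rec)
    return pages, images


def identify_sessions(pages, timeout=SESSION_TIMEOUT):
    """Group page requests by IP, then split on gaps longer than `timeout`."""
    by_user = defaultdict(list)
    for rec in pages:
        by_user[rec["ip"]].append(rec)

    sessions = []
    for user_records in by_user.values():
        user_records.sort(key=lambda r: r["time"])
        current = [user_records[0]]
        for prev, rec in zip(user_records, user_records[1:]):
            if rec["time"] - prev["time"] > timeout:
                sessions.append(current)  # gap too long: close the session
                current = []
            current.append(rec)
        sessions.append(current)
    return sessions


# Illustrative, made-up log records.
t0 = datetime(2009, 10, 12, 9, 0, 0)
records = [
    {"ip": "10.0.0.1", "time": t0, "url": "/index.html"},
    {"ip": "10.0.0.1", "time": t0 + timedelta(seconds=1), "url": "/img/logo.gif"},
    {"ip": "10.0.0.1", "time": t0 + timedelta(minutes=5), "url": "/news.html"},
    {"ip": "10.0.0.1", "time": t0 + timedelta(minutes=50), "url": "/index.html"},
    {"ip": "10.0.0.2", "time": t0 + timedelta(minutes=1), "url": "/index.html"},
]
pages, images = split_records(records)
sessions = identify_sessions(pages)
print([len(s) for s in sessions], len(images))
```

In this baseline the `images` list plays no further role, which is the information loss the abstract points to.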