Journal of Computer Applications (计算机应用), 2009, Vol. 29, Issue 09: 2348-2350.

• Information Security •

Text watermarking algorithm based on text features

斯琴1,张力2,廉德亮2   

  1. Shenzhen University
  2.
  • Received: 2009-03-24 Revised: 2009-05-15 Online: 2009-11-10 Published: 2009-09-01
  • Corresponding author: 斯琴
  • Supported by:
    National-level fund

Abstract: Format-based text watermarking algorithms are not robust against format attacks, while natural-language-based text watermarking algorithms are comparatively difficult to implement. A text zero-watermarking algorithm based on word frequency is therefore proposed. The text is segmented into words and the frequency of each word is computed. The words whose frequencies fall within a preset threshold range are then extracted in sequence as the text feature, and the text feature, watermark, and secret key are registered in an intellectual property rights (IPR) information database. Watermark detection is blind. The algorithm was applied to Chinese and English documents containing images and other multimedia content, and the experimental results show that it is robust against attacks such as cutting, pasting, and reordering of content.

Key words: text watermarking, text feature, feature extraction, word frequency, word segmentation
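
The word-frequency feature extraction and registration outlined in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the regex tokenizer (a stand-in for real word segmentation, which Chinese text would require), the function names `extract_feature`, `register`, and `detect`, the default threshold range, the in-memory dictionary playing the role of the IPR information database, and the overlap-ratio similarity used for detection are all assumptions. The abstract does not say how the secret key is used, so the sketch merely stores it.

```python
import re
from collections import Counter


def extract_feature(text, low, high):
    """Return words whose frequency lies in [low, high], in order of first appearance."""
    # Assumption: a simple regex tokenizer replaces proper word segmentation.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    feature, seen = [], set()
    for word in words:
        if low <= counts[word] <= high and word not in seen:
            seen.add(word)
            feature.append(word)
    return feature


def register(registry, doc_id, text, watermark, key, low=2, high=3):
    """Record feature, watermark and key; the text itself is never modified."""
    registry[doc_id] = {
        "feature": extract_feature(text, low, high),
        "watermark": watermark,
        "key": key,  # stored only; its role is not specified in the abstract
        "range": (low, high),
    }


def detect(registry, doc_id, suspect_text):
    """Blind check: needs only the registry entry and the suspect text."""
    record = registry[doc_id]
    low, high = record["range"]
    registered = set(record["feature"])
    if not registered:
        return 0.0, record["watermark"]
    found = set(extract_feature(suspect_text, low, high))
    # Assumption: similarity is the fraction of registered words recovered.
    return len(registered & found) / len(registered), record["watermark"]


registry = {}
original = "the cat sat on the mat. the dog sat on the log. a bird sang."
register(registry, "doc-1", original, watermark="owner-42", key="secret")

# Reordering sentences leaves word counts unchanged, so the feature survives.
reordered = "a bird sang. the dog sat on the log. the cat sat on the mat."
similarity, watermark = detect(registry, "doc-1", reordered)
```

Because the feature depends only on word counts, moving sentences around does not change it, which is consistent with the robustness to reordering reported in the abstract. Cutting or pasting text shifts the counts, so a practical detector would presumably accept a similarity below 1.0; the acceptance threshold is left open here.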

CLC number: