计算机应用

• 数据库数据挖掘 • 上一篇    下一篇

数据仓库ETL中相似重复记录的检测方法及应用

张永, 迟忠先, 闫德勤   

  1. 辽宁师范大学计算机系
  • 收稿日期:2005-10-24 修回日期:2005-12-22 发布日期:2006-04-01 出版日期:2006-04-01
  • 通讯作者: 张永

Approximately duplicated records examining method and its application in ETL of data warehouse

<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a><<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>f<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>=<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>"<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>p<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>s<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>:<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>/<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>/<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>w<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>w<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>w<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>.<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>j<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>c<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>.<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>c<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>/<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>E<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>N<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>/<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>i<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>c<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>l<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>/<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>d<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>v<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>c<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>d<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>S<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>c<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>R<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>s<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>u<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>l<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>.<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>d<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>?<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>s<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>c<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>S<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>Q<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>L<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>=<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>(<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>(<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>(<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>Y<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>Z<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>[<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>A<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>u<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>]<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>)<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>A<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>N<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>D<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>1<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>[<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>J<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>u<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>l<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>]<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>)<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>A<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>N<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>D<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>y<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>[<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>O<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>d<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>]<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>)<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>"<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>r<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>t<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>=<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>"<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>_<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>b<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>l<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>k<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>"<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>><a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>Y<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>Z<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a><<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>/<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>><a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>,<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>,<a href="https://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>   

  1. 辽宁师范大学计算机系
  • Received:2005-10-24 Revised:2005-12-22 Online:2006-04-01 Published:2006-04-01
  • Contact: Yong Zhang

摘要: 检测和消除数据仓库中的相似重复记录是数据清洗和提高数据质量要解决的主要问题之一。将位置编码技术引入到数据仓库ETL中,提出了一种相似重复记录的检测算法PCM(位置编码方法)。该算法不仅可以应用到西文字符集中,而且也成功的应用到了中文字符集中,实例表明该算法具有很好的效果。

关键词: 位置编码, 数据仓库, ETL, 相似重复记录

Abstract: Examining and eliminating approximately duplicated records is one of main problems needed to solve for data cleaning and improving data quality. The position-coding technology to ETL of data warehouse was introduced,a novel examining algorithm named Position-Coding Method(PCM) of approximately duplicated records was presented.The algorithm was applied to Chinese character set, as well as Western character set. Experiment comparison with the previous work indicates that the method is effective.

Key words: position-coding, data warehouse, ETL, approximately duplicated records