Lin, ZJ; King, I; Lyu, MR PageSim: A novel link-based similarity measure for the world wide web 2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS) 687-693, 2006

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Tue Apr 29 16:46:36 EDT 2008


Email Address: zjlin at cse.cuhk.edu.hk

URL: http://www2006.org/programme/files/pdf/p36.pdf

Author(s): Lin, ZJ (Lin, Zhenjiang); King, I (King, Irwin); Lyu, MR (Lyu, 
Michael R.) 

Title: PageSim: A novel link-based similarity measure for the world wide 
web 

Source: 2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, 
(WI 2006 MAIN CONFERENCE PROCEEDINGS) 687-693, 2006 

Language: English 

Document Type: Article 

Conference Title: IEEE/WIC/ACM International Conference on Web 
Intelligence 

Conference Date: DEC 18-22, 2006 

Conference Location: Hong Kong, PEOPLES R CHINA 

Conference Sponsors: IEEE, WIC, ACM, Hong Kong Baptist Univ 

Abstract: The need to measure the similarity between web pages 
arises in many applications on the Web, such as web search engines and 
web document classification. In view of the unique characteristics of 
the Web, which is huge, rapidly growing, highly dynamic, and untrustworthy, 
we propose a novel link-based similarity measure called PageSim. Based on 
the strategy of PageRank score propagation, PageSim is efficient, 
scalable, stable, and "fairly" robust, and is therefore applicable to the 
Web. We present the intuitions behind the PageSim model and outline the 
model with mathematical definitions. We also suggest a pruning technique 
for efficient computation of PageSim scores, and conduct experiments to 
illustrate the effectiveness and special features of PageSim. 

Addresses: Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong 
Peoples R China. 

Reprint Address: Lin, ZJ, Chinese Univ Hong Kong, Dept Comp Sci & Engn, 
Shatin, Hong Kong Peoples R China. 

Cited Reference Count: 18 

Publisher Name: IEEE COMPUTER SOC 

Publisher Address: 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, 
CA 90720-1264 USA 

ISBN: 978-0-7695-2747-5 

Source Item Page Count: 7 

Subject Category: Computer Science, Artificial Intelligence; Computer 
Science, Information Systems 

ISI Document Delivery No.: BFY84 

ARASU A
ACM T INTERNET TECHN 1 : 2 2001 

BREWINGTON B
WWW 00 : 2000 

FLAKE GW
KDD C : 150 2000 

GYONGYI Z
1 INT WORKSH ADV INF 2005 

HENZINGER MR
SIGIR FORUM 36 : 11 2002 

JEH G
KDD 02 : 538 2002 

JOACHIMS T
ICML 97 : 143 1997 

KESSLER M
AM DOCUMENTATION 14 : 1963 

LAWRENCE S
INTELLIGENCE 11 : 32 2000 

LIBENNOWELL D
12 ANN ACM INT C INF 2003 556 

LU W
CASCON 01 : 11 2001 

NEWMAN MEJ
EUR PHYS J B 38 : 321 2004 

NG AY
SIGIR FOR ACM SPEC I : 258 2001 

PAGE L
PAGERANK CITATION RA : 1998 

RISVIK KM
Search engines and Web dynamics 
COMPUTER NETWORKS-THE INTERNATIONAL JOURNAL OF COMPUTER AND 
TELECOMMUNICATIONS NETWORKING 39 : 289 2002 

SALTON G
AUTOMATIC TEXT PROCE : 1989 

SMALL H
J AM SOC INFORM SCI 24 : 265 1973 

WU B
WWW 05 : 820 2005 


