Thelwall M. "Can Google's PageRank be used to find the most important academic Web pages?" J Doc 59(2):205-217 2003

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Tue May 25 10:34:07 EDT 2004


Mike Thelwall : m.thelwall at wlv.ac.uk


TITLE       Can Google's PageRank be used to find the most important
            academic Web pages?
AUTHOR      Thelwall M
JOURNAL     JOURNAL OF DOCUMENTATION 59 (2): 205-217 2003

Document type: Article     Language: English     Cited References: 32
Times Cited: 0

Abstract:
Google's PageRank is an influential algorithm that uses a model of Web use
that is dominated by its link structure in order to rank pages by their
estimated value to the Web community. This paper reports on the outcome of
applying the algorithm to the Web sites of three national university
systems in order to test whether it is capable of identifying the most
important Web pages. The results are also compared with simple inlink
counts. It was discovered that the highest inlinked pages do not always
have the highest PageRank, indicating that the two metrics are genuinely
different, even for the top pages. More significantly, however, internal
links dominated external links for the high ranks in either method and
superficial reasons accounted for high scores in both cases. It is
concluded that PageRank is not useful for identifying the top pages in a
site and that it must be combined with powerful text matching techniques
in order to get the quality of information retrieval results provided by
Google.

Author Keywords:
Internet, universities, information retrieval, algorithms, effectiveness

KeyWords Plus:
IMPACT FACTORS, CRAWLER

Addresses:
Thelwall M, Wolverhampton Univ, Sch Comp & Informat Technol, Wolverhampton,
England

Publisher:
EMERALD GROUP PUBLISHING LTD, 60/62 TOLLER LANE, BRADFORD BD8 9BY, W
YORKSHIRE, ENGLAND

IDS Number:
730YD

ISSN:
0022-0418

Cited Author            Cited Work                Volume    Page  Year
 BHARAT K              10 INT WORLD WID WEB                            2001
 BRIN S                COMPUT NETWORKS ISDN          30       107      1998
 BRODER A              COMPUT NETW                   33       309      2000
 GAO J                 TREC10 WEB TRACK EXP                            2001
 GLASER J              SCIENTOMETRICS                52       411      2001
 GOODRUM AA            INFORM PROCESS MANAG          37       661      2001
 GOOGLE                GOOGL TECHN                                     2002
 HAVELIWALA T          EFFICIENT COMPUTATIO                            1999
 HAWKING D             INFORMATION TECHNOLO                   307      2000
 HEYDON A              WORLD WIDE WEB                 2       219      1999
 INGWERSEN P           J DOC                         54       236      1998
 KLEINBERG JM          J ACM                         46       604      1999
 LARSON RR             ASIS 96                                         1996
 LEYDESDORFF L         CYBERMETRICS                   4                2000
 LIFANTSEV M           P INT C INT COMP                       143      2000
 NG AY                 P 24 ANN INT ACM SIG                   258      2001
 PAGE B                6285999                       US                1998
 RAFIEI D              COMPUT NETW                   33       823      2000
 RICHARDSON M          NEURAL INFORMATION P                            2001
 ROUSSEAU R            CYBERMETRICS                   1                1997
 SMITH A               SCIENTOMETRICS                54                2002
 SMITH AG              J DOC                         55       577      1999
 SULLIVAN D            GOOGLE TOPS SEARCH H                            2002
 THELWALL M            J AM SOC INF SCI TEC          52      1157      2001
 THELWALL M            J DOC                         58        60      2002
 THELWALL M            J DOC                         57       177      2001
 THELWALL M            J DOCUMENTATION                                 2001
 THELWALL M            J INFORM SCI                  27       319      2001
 THELWALL M            ONLINE INFORMATION R          26                2002
 THELWALL M            ONLINE INFORMATION R          26       124      2002
 THELWALL M            PUBLICLY ACCESSIBLE                             2001
 XI W                  TREC 2001                              686      2001


