Thelwall M, Wilkinson D "Finding similar academic Web sites with links, bibliometric couplings and colinks" Information Processing & Management 40(3):515-526 May 2004
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Fri Jul 2 14:27:33 EDT 2004
Mike Thelwall: m.thelwall at wlv.ac.uk
David Wilkinson: d.wilkinson at wlv.ac.uk
TITLE Finding similar academic Web sites with links, bibliometric
couplings and colinks
AUTHOR Thelwall M, Wilkinson D
JOURNAL INFORMATION PROCESSING & MANAGEMENT 40 (3): 515-526 MAY 2004
Document type: Article Language: English Cited References: 49
Times Cited: 0
Abstract:
A common task in both Webmetrics and Web information retrieval is to
identify a set of Web pages or sites that are similar in content. In this
paper we assess the extent to which links, colinks and couplings can be used
to identify similar Web sites. As an experiment, a random sample of 500
pairs of domains from the UK academic Web was taken and human assessments
of site similarity, based upon content type, were compared against ratings
for the three concepts. The results show that using a combination of all
three gives the highest probability of identifying similar sites, but
surprisingly this was only a marginal improvement over using links alone.
Another unexpected result was that high values for either colink counts or
couplings were associated with only a small increased likelihood of
similarity. The principal advantage of using couplings and colinks was found
to be greater coverage in terms of a much larger number of pairs of sites
being connected by these measures, instead of increased probability of
similarity. In information retrieval terminology, this is improved recall
rather than improved precision. (C) 2003 Elsevier Ltd. All rights reserved.
Author Keywords:
document clustering, webmetrics, Web information retrieval
KeyWords Plus:
SCIENCE, DEPARTMENTS, INFORMATION, COCITATION, IMPACT
Addresses:
Thelwall M, Wolverhampton Univ, Sch Comp & Informat Technol, Wulfruna St,
Wolverhampton WV1 1SB, England
Wolverhampton Univ, Sch Comp & Informat Technol, Wolverhampton WV1 1SB, England
Publisher:
PERGAMON-ELSEVIER SCIENCE LTD, THE BOULEVARD, LANGFORD LANE, KIDLINGTON,
OXFORD OX5 1GB, ENGLAND
IDS Number:
818PX
ISSN:
0306-4573
Cited Author Cited Work Volume Page Year
AGUILLO IF ONLINE INFORMATION 9 239 1998
ALMIND TC J DOC 53 404 1997
ARASU A ACM T INTERNET TECHN 1 2 2001
BJORNEBORN L P 12 ACM C HYP HYP 133 2001
BJORNEBORN L SHARED OUTLINKS WEBO 2001
BORGMAN CL ANNU REV INFORM SCI 36 3 2002
BRIN S COMPUT NETWORKS ISDN 30 107 1998
BRODER A COMPUT NETW 33 309 2000
CAWKELL T ASIS MONOGRAPH SERIE 177 2000
CHAKRABARTI S STRUCTURE BROAD TOPI 2002
CHEN C INFORMATION VISUALIS 1999
CHEN CM INTERACT COMPUT 10 353 1998
CHU H J ED LIB INFORMATION 43 110 2002
CRONIN B J INFORM SCI 27 1 2001
DEARING R REPORT NATL COMMITTE 1997
FLAKE GW COMPUTER 35 66 2002
GAO J TREC 10 WEB TRACK EX 2001
GARRIDO M CYBERACTIVISM ONLINE 165 2003
GLANZEL W SCIENTOMETRICS 50 199 2001
HAVELIWALA TH SCALABLE TECHNIQUES 2000
INGWERSEN P J DOC 54 236 1998
KLEINBERG JM J ACM 46 604 1999
LI XM SCIENTOMETRICS 57 239 2003
NG AY P 17 INT JOINT C ART 903 2001
NG AY P 24 ANN INT ACM SIG 258 2001
PARK HW J AM SOC INF SCI TEC 53 592 2002
PENNOCK DM P NATL ACAD SCI USA 99 5207 2002
PIROLLO P CHI 96 P C HUM FACT 118 1996
POLANCO X CLUSTERING MAPPING W 2001
ROGERS R SCI CULTURE 11 191 2002
ROUSSEAU R CYBERMETRICS 1 1997
SALTON G INTRO MODERN INFORMA 1983
SCHVANEVELDT RW PSYCHOL LEARN MOTIV 24 249 1989
SMALL H J AM SOC INFORM SCI 50 799 1999
SMALL H J AM SOC INFORM SCI 24 265 1973
SMALL H SCIENTOMETRICS 38 275 1997
TANG R IN PRESS DISCIPLINAR
THELWALL M IN PRESS J DOCUMENTA 59
THELWALL M INTERNET RES 12 124 2002
THELWALL M J AM SOC INF SCI TEC 53 995 2002
THELWALL M J AM SOC INF SCI TEC 52 1157 2001
THELWALL M J DOC 58 563 2002
THELWALL M J INFORM SCI 27 319 2001
THELWALL M J INFORM SCI 27 393 2001
THELWALL M PUBLICLY ACCESSIBLE 2001
THELWALL M SCIENTOMETRICS 55 335 2002
THOMAS O J INFORM SCI 26 421 2000
WATTS DJ NATURE 393 440 1998
WHITE HD J AM SOC INFORM SCI 32 163 1981