Lin, J (Lin, Jimmy) PageRank without hyperlinks: Reranking with PubMed related article networks for biomedical text retrieval BMC BIOINFORMATICS, 9: Art. No. 270 JUN 6 2008

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Tue Aug 5 14:33:46 EDT 2008


E-mail Address: jimmylin at umd.edu 

Author(s): Lin, J (Lin, Jimmy) 

Title: PageRank without hyperlinks: Reranking with PubMed related article 
networks for biomedical text retrieval 

Source: BMC BIOINFORMATICS, 9: Art. No. 270 JUN 6 2008 

Language: English 

Document Type: Article 

Abstract: Background: Graph analysis algorithms such as PageRank and HITS 
have been successful in Web environments because they are able to extract 
important inter-document relationships from manually-created hyperlinks. 
We consider the application of these techniques to biomedical text 
retrieval. In the current PubMed (R) search interface, a MEDLINE (R) 
citation is connected to a number of related citations, which are in turn 
connected to other citations. Thus, a MEDLINE record represents a node in 
a vast content-similarity network. This article explores the hypothesis 
that these networks can be exploited for text retrieval, in the same 
manner as hyperlink graphs on the Web.
Results: We conducted a number of reranking experiments using the TREC 
2005 genomics track test collection in which scores extracted from 
PageRank and HITS analysis were combined with scores returned by an 
off-the-shelf retrieval engine. Experiments demonstrate that incorporating 
PageRank scores yields significant improvements in terms of standard 
ranked-retrieval metrics.
Conclusion: The link structure of content-similarity networks can be 
exploited to improve the effectiveness of information retrieval systems. 
These results generalize the applicability of graph analysis algorithms to 
text retrieval in the biomedical domain. 
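
The reranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how PageRank scores were combined with retrieval scores, so the linear interpolation, the min-max normalization, the damping factor of 0.85, and the names pagerank, min_max and rerank below are all assumptions made for illustration, not the author's method. The toy graph and scores are invented and do not come from the paper.

```python
from typing import Dict, List


def pagerank(graph: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over an adjacency list (node -> linked nodes)."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Mass held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for source, targets in graph.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; all-equal scores map to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def rerank(retrieval_scores: Dict[str, float],
           graph: Dict[str, List[str]],
           weight: float = 0.2) -> List[str]:
    """Order retrieved documents by an interpolated score.

    combined = (1 - weight) * retrieval + weight * pagerank  (assumed form)
    """
    pr = pagerank(graph)
    pr_hits = {doc: pr.get(doc, 0.0) for doc in retrieval_scores}
    ret_norm = min_max(retrieval_scores)
    pr_norm = min_max(pr_hits)
    combined = {doc: (1.0 - weight) * ret_norm[doc] + weight * pr_norm[doc]
                for doc in retrieval_scores}
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical related-citation links: each citation points to its neighbours.
related = {"a": ["b"], "b": ["c"], "c": ["b"], "d": ["b"]}
# Hypothetical scores from a retrieval engine for three retrieved citations.
engine_scores = {"a": 3.0, "b": 2.0, "c": 1.0}

print(rerank(engine_scores, related, weight=0.0))
print(rerank(engine_scores, related, weight=1.0))
```

Setting weight to 0 reproduces the retrieval engine's ordering, and setting it to 1 ranks purely by PageRank; intermediate values trade off the two signals.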

Addresses: Natl Lib Med, Natl Ctr Biotechnol Informat, Bethesda, MD 20894 
USA; Univ Maryland, The iSch, College Pk, MD 20742 USA 

Reprint Address: Lin, J, Natl Lib Med, Natl Ctr Biotechnol Informat, 
Bethesda, MD 20894 USA. 

Cited Reference Count: 25 

Times Cited: 0 

Publisher: BIOMED CENTRAL LTD 

Publisher Address: CURRENT SCIENCE GROUP, MIDDLESEX HOUSE, 34-42 CLEVELAND 
ST, LONDON W1T 4LB, ENGLAND 

ISSN: 1471-2105 

DOI: 10.1186/1471-2105-9-270 

29-char Source Abbrev.: BMC BIOINFORMATICS 

ISO Source Abbrev.: BMC Bioinformatics 

Source Item Page Count: 12 

Subject Category: Biochemical Research Methods; Biotechnology & Applied 
Microbiology; Mathematical & Computational Biology 

ISI Document Delivery No.: 326JC 

ABDI H
ENCY MEASUREMENT STA : 103 2007 

AMATI G
Probabilistic models of information retrieval based on measuring the 
divergence from randomness 
ACM TRANSACTIONS ON INFORMATION SYSTEMS 20 : 357 2002
 
CLEVERDON CW
ASLIB CRANFIELD RES : 1968 

DIAZ F
Regularizing query-based retrieval scores 
INFORMATION RETRIEVAL 10 : 531 DOI 10.1007/s10791-007-9034-8 2007 

ERKAN G
P EMNLP 2004 : 365 2004 

HARMAN DK
TREC EXPT EVALUATION : 21 2005 

HEARST MA
P 19 ANN INT ACM SIG : 76 1996 

HERSH WR
P 14 TEXT RETRIEVAL : 2005 

HUANG X
P 14 TEXT RETRIEVAL : 2005 

KLEINBERG JM
Authoritative sources in a hyperlinked environment 
JOURNAL OF THE ACM 46 : 604 1999 

KURLAND O
P SIGIR 05 : 306 2005 

LEUSKI A
P 10 INT C INF KNOWL : 33 2001 

LIN J
PubMed related articles: a probabilistic topic-based model for content 
similarity 
BMC BIOINFORMATICS 8 : ARTN 423 2007 

LIN J
INFORM PROC IN PRESS : 2008 

LIN J
P 31 ANN INT ACM SIG : 2008 

LIN YJ
A document clustering and ranking system for exploring MEDLINE citations 
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION 14 : 651 DOI 
10.1197/jamia.M2215 2007 

LIU X
P 27 ANN INT ACM SIG : 186 2004 

MIHALCEA R
P 42 ANN M ASS COMP : 170 2004 

PAGE L
SIDLWP19990120 STANF : 1999 

PIROLLI P
Information foraging 
PSYCHOLOGICAL REVIEW 106 : 643 1999 

SHAFFER JP
MULTIPLE HYPOTHESIS-TESTING 
ANNUAL REVIEW OF PSYCHOLOGY 46 : 561 1995 

SMUCKER M
P 29 ANN INT ACM SIG : 461 2006 

VANRIJSBERGEN CJ
INFORM RETRIEVAL : 1979 

VOORHEES EM
P 8 ANN INT ACM SIGI : 188 1985 

WILBUR WJ
THE EFFECTIVENESS OF DOCUMENT NEIGHBORING IN SEARCH ENHANCEMENT 
INFORMATION PROCESSING & MANAGEMENT 30 : 253 1994 


