Lin, J (Lin, Jimmy) PageRank without hyperlinks: Reranking with PubMed related article networks for biomedical text retrieval BMC BIOINFORMATICS, 9: Art. No. 270 JUN 6 2008
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Tue Aug 5 14:33:46 EDT 2008
E-mail Address: jimmylin at umd.edu
Author(s): Lin, J (Lin, Jimmy)
Title: PageRank without hyperlinks: Reranking with PubMed related article
networks for biomedical text retrieval
Source: BMC BIOINFORMATICS, 9: Art. No. 270 JUN 6 2008
Language: English
Document Type: Article
Abstract: Background: Graph analysis algorithms such as PageRank and HITS
have been successful in Web environments because they are able to extract
important inter-document relationships from manually-created hyperlinks.
We consider the application of these techniques to biomedical text
retrieval. In the current PubMed (R) search interface, a MEDLINE (R)
citation is connected to a number of related citations, which are in turn
connected to other citations. Thus, a MEDLINE record represents a node in
a vast content-similarity network. This article explores the hypothesis
that these networks can be exploited for text retrieval, in the same
manner as hyperlink graphs on the Web.
Results: We conducted a number of reranking experiments using the TREC
2005 genomics track test collection in which scores extracted from
PageRank and HITS analysis were combined with scores returned by an
off-the-shelf retrieval engine. Experiments demonstrate that incorporating
PageRank scores yields significant improvements in terms of standard
ranked-retrieval metrics.
Conclusion: The link structure of content-similarity networks can be
exploited to improve the effectiveness of information retrieval systems.
These results generalize the applicability of graph analysis algorithms to
text retrieval in the biomedical domain.
Addresses: Natl Lib Med, Natl Ctr Biotechnol Informat, Bethesda, MD 20894
USA; Univ Maryland, The iSch, College Pk, MD 20742 USA
Reprint Address: Lin, J, Natl Lib Med, Natl Ctr Biotechnol Informat,
Bethesda, MD 20894 USA.
Cited Reference Count: 25
Times Cited: 0
Publisher: BIOMED CENTRAL LTD
Publisher Address: CURRENT SCIENCE GROUP, MIDDLESEX HOUSE, 34-42 CLEVELAND
ST, LONDON W1T 4LB, ENGLAND
ISSN: 1471-2105
DOI: 10.1186/1471-2105-9-270
29-char Source Abbrev.: BMC BIOINFORMATICS
ISO Source Abbrev.: BMC Bioinformatics
Source Item Page Count: 12
Subject Category: Biochemical Research Methods; Biotechnology & Applied
Microbiology; Mathematical & Computational Biology
ISI Document Delivery No.: 326JC
Cited References:
ABDI H
ENCY MEASUREMENT STA : 103 2007
AMATI G
Probabilistic models of information retrieval based on measuring the
divergence from randomness
ACM TRANSACTIONS ON INFORMATION SYSTEMS 20 : 357 2002
CLEVERDON CW
ASLIB CRANFIELD RES : 1968
DIAZ F
Regularizing query-based retrieval scores
INFORMATION RETRIEVAL 10 : 531 DOI 10.1007/s10791-007-9034-8 2007
ERKAN G
P EMNLP 2004 : 365 2004
HARMAN DK
TREC EXPT EVALUATION : 21 2005
HEARST MA
P 19 ANN INT ACM SIG : 76 1996
HERSH WR
P 14 TEXT RETRIEVAL : 2005
HUANG X
P 14 TEXT RETRIEVAL : 2005
KLEINBERG JM
Authoritative sources in a hyperlinked environment
JOURNAL OF THE ACM 46 : 604 1999
KURLAND O
P SIGIR 05 : 306 2005
LEUSKI A
P 10 INT C INF KNOWL : 33 2001
LIN J
PubMed related articles: a probabilistic topic-based model for content
similarity
BMC BIOINFORMATICS 8 : ARTN 423 2007
LIN J
INFORM PROC IN PRESS : 2008
LIN J
P 31 ANN INT ACM SIG : 2008
LIN YJ
A document clustering and ranking system for exploring MEDLINE citations
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION 14 : 651 DOI
10.1197/jamia.M2215 2007
LIU X
P 27 ANN INT ACM SIG : 186 2004
MIHALCEA R
P 42 ANN M ASS COMP : 170 2004
PAGE L
SIDLWP19990120 STANF : 1999
PIROLLI P
Information foraging
PSYCHOLOGICAL REVIEW 106 : 643 1999
SHAFFER JP
Multiple hypothesis-testing
ANNUAL REVIEW OF PSYCHOLOGY 46 : 561 1995
SMUCKER M
P 29 ANN INT ACM SIG : 461 2006
VANRIJSBERGEN CJ
INFORM RETRIEVAL : 1979
VOORHEES EM
P 8 ANN INT ACM SIGI : 188 1985
WILBUR WJ
The effectiveness of document neighboring in search enhancement
INFORMATION PROCESSING & MANAGEMENT 30 : 253 1994