Torvik, VI; Weeber, M; Swanson, DR; Smalheiser, NR. "A probabilistic similarity metric for Medline records: A model for author name disambiguation" JASIST 56 (2). JAN 15 2005. p.140-158
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Mon Feb 21 14:09:45 EST 2005
E-mail Addresses:
V. Torvik : vtorvik at uic.edu
M. Weeber : marc at weeber.net
D.R. Swanson : dswanson at uchicago.edu
N.R. Smalheiser : smalheiser at psych.uic.edu
TITLE: A probabilistic similarity metric for Medline records: A
model for author name disambiguation (Article, English)
AUTHOR: Torvik, VI; Weeber, M; Swanson, DR; Smalheiser, NR
SOURCE: JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
AND TECHNOLOGY 56 (2). JAN 15 2005. p.140-158
ABSTRACT: We present a model for estimating the probability that a
pair of author names (sharing last name and first initial), appearing on two
different Medline articles, refer to the same individual. The model uses a
simple yet powerful similarity profile between a pair of articles, based on
title, journal name, coauthor names, medical subject headings (MeSH),
language, affiliation, and name attributes (prevalence in the literature,
middle initial, and suffix). The similarity profile distribution is computed
from reference sets consisting of pairs of articles containing almost
exclusively author matches versus nonmatches, generated in an unbiased
manner. Although the match set is generated automatically and might contain
a small proportion of nonmatches, the model is quite robust against
contamination with nonmatches. We have created a free, public service
("Author-ity": http://arrowsmith.psych.uic.edu) that takes as input an
author's name given
on a specific article, and gives as output a list of all articles with that
(last name, first initial) ranked by decreasing similarity, with match
probability indicated.
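
The abstract does not give the model's formulas, so the following is only a
rough sketch of the general idea: build a similarity profile for a pair of
articles, estimate how often each profile occurs in a match reference set and
in a nonmatch reference set, and combine the two with Bayes' rule. The three
profile fields, the additive smoothing, the 0.5 prior, and every function and
field name below are assumptions made for illustration, not the authors'
implementation.

```python
from collections import Counter
from itertools import product

# Hypothetical, simplified profile space. The paper's profile also covers
# title words, MeSH, affiliation, and name attributes, all omitted here.
ALL_PROFILES = list(product(range(3), range(2), range(2)))

def similarity_profile(a, b):
    """Compare two article records field by field."""
    shared = len(set(a["coauthors"]) & set(b["coauthors"]))
    return (
        min(shared, 2),                       # 0, 1, or 2+ shared coauthor names
        int(a["journal"] == b["journal"]),    # same journal name?
        int(a["language"] == b["language"]),  # same language?
    )

def profile_distribution(pairs, alpha=1.0):
    """Smoothed relative frequency of each profile in a reference set of pairs."""
    counts = Counter(similarity_profile(a, b) for a, b in pairs)
    total = sum(counts.values()) + alpha * len(ALL_PROFILES)
    return {p: (counts[p] + alpha) / total for p in ALL_PROFILES}

def match_probability(a, b, p_match, p_nonmatch, prior=0.5):
    """Bayes' rule on the profile; the prior is an assumed placeholder."""
    x = similarity_profile(a, b)
    num = prior * p_match[x]
    return num / (num + (1.0 - prior) * p_nonmatch[x])

def rank_candidates(query, candidates, p_match, p_nonmatch, prior=0.5):
    """Return (probability, article) pairs sorted by decreasing match probability."""
    scored = [(match_probability(query, c, p_match, p_nonmatch, prior), c)
              for c in candidates]
    return sorted(scored, key=lambda pc: pc[0], reverse=True)
```

Given the two distributions, rank_candidates mirrors the service's output as
the abstract describes it: all candidate articles for a (last name, first
initial), ordered by decreasing match probability.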
Addresses: Torvik VI (reprint author), Univ Illinois, Dept Psychiat,
MC912, 1601 W Taylor St, Chicago, IL 60612 USA
Univ Illinois, Dept Psychiat, Chicago, IL 60612 USA
Univ Chicago, Div Humanities, Chicago, IL 60637 USA
Publisher: JOHN WILEY & SONS INC, 111 RIVER ST, HOBOKEN, NJ 07030 USA
Subject Category: COMPUTER SCIENCE, INFORMATION SYSTEMS; INFORMATION SCIENCE
& LIBRARY SCIENCE
IDS Number: 890JU
ISSN: 1532-2882
Cited References:
*NOM, 2001, NLM TECHN B
CHURCHES T, 2002, BMC MED INFORMATICS
FRENCH JC, 2000, J AM SOC INFORM SCI, V51, P774
GARFIELD E, 1979, CITATION INDEXING IT
GROSSMAN JW, 2002, CONGRESSUS NUMERANTI, V158, P201
HOLMES D, 2001, LIT LINGUISTIC COMPU, V16, P403
JAIN AK, 1988, ALGORITHMS CLUSTERIN
JONES KS, 2000, INFORM PROCESS MANAG, V36, P779
JUDSON DH, 2002, ANN M CLASS SOC N AM
KARYPIS G, 1999, IEEE COMPUT, V32, P68
LAWRENCE S, 1999, P 3 INT C AUT AG, P392
NEWMAN MEJ, 2001, PHYS REV E, V64
NOYONS ECM, 1999, J AM SOC INFORM SCI, V50, P115
ROBERTSON SE, 1977, J DOC, V33, P294
ROBERTSON T, 1988, ORDER RESTRICTED ST
RUIZPEREZ R, 2002, J MED LIBR ASSOC, V90, P411
SALTON G, 1975, COMMUN ACM, V18, P613
SWANSON DR, 1997, ARTIF INTELL, V91, P183
TASKAR B, 2001, P 17 INT JOINT C ART, P870
TORVIK VI, 2002, INFORMS J COMPUT, V14, P144
TORVIK VI, 2002, NIH HUM BRAIN PROJ A
TORVIK VI, 2003, INFORM SCIENCES, V151, P171
WARNER JW, 2001, P 1 ACM IEEE CS JOIN, P21
WILBUR WJ, 1996, COMPUT BIOL MED, V26, P209
WINKLER WE, 1995, BUSINESS SURVEY METH, P355
YU H, 2002, J AM MED INFORM ASSN, V9, P262