Torvik, VI; Weeber, M; Swanson, DR; Smalheiser, NR. "A probabilistic similarity metric for Medline records: A model for author name disambiguation" JASIST 56 (2). JAN 15 2005. p.140-158

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Mon Feb 21 14:09:45 EST 2005


E-mail Addresses:

V. Torvik       :  vtorvik at uic.edu
M. Weeber       :  marc at weeber.net
D.R. Swanson    :  dswanson at uchicago.edu
N.R. Smalheiser :  smalheiser at psych.uic.edu


TITLE:          A probabilistic similarity metric for Medline records: A
                model for author name disambiguation (Article, English)
AUTHOR:         Torvik, VI; Weeber, M; Swanson, DR; Smalheiser, NR
SOURCE:         JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
                AND TECHNOLOGY  56 (2). JAN 15 2005. p.140-158


ABSTRACT:       We present a model for estimating the probability that a
pair of author names (sharing last name and first initial), appearing on two
different Medline articles, refer to the same individual. The model uses a
simple yet powerful similarity profile between a pair of articles, based on
title, journal name, coauthor names, medical subject headings (MeSH),
language, affiliation, and name attributes (prevalence in the literature,
middle initial, and suffix). The similarity profile distribution is computed
from reference sets consisting of pairs of articles containing almost
exclusively author matches versus nonmatches, generated in an unbiased
manner. Although the match set is generated automatically and might contain
a small proportion of nonmatches, the model is quite robust against
contamination with nonmatches. We have created a free, public service
("Author-ity": http://arrowsmith.psych.uic.edu) that takes as input an
author's name given
on a specific article, and gives as output a list of all articles with that
(last name, first initial) ranked by decreasing similarity, with match
probability indicated.
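
As a rough illustration of the idea in the abstract, the sketch below builds
a similarity profile for a pair of article records and turns it into a match
probability with Bayes' rule, using profile frequencies counted in a match
reference set and a nonmatch reference set. It is not the authors' method:
the abstract does not give the estimation procedure, so the field names, the
cap on shared-item counts, the add-alpha smoothing, the prior, and the
function names are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical profile layout: three capped counts (0..2) and two yes/no flags.
N_PROFILES = 3 * 3 * 3 * 2 * 2  # number of distinct profiles under that layout


def similarity_profile(a, b):
    """Compare two article records (plain dicts) field by field."""
    def shared(key):
        # Number of shared items, capped at 2 to keep the profile space small.
        return min(len(set(a[key]) & set(b[key])), 2)

    return (
        shared("title_words"),
        shared("coauthors"),
        shared("mesh"),
        int(a["journal"] == b["journal"]),
        int(a["language"] == b["language"]),
    )


def match_probability(profile, match_profiles, nonmatch_profiles,
                      prior=0.5, alpha=1.0):
    """Bayes' rule with add-alpha smoothed profile frequencies (an assumption)."""
    def likelihood(profiles):
        count = Counter(profiles)[profile]
        return (count + alpha) / (len(profiles) + alpha * N_PROFILES)

    p_match = prior * likelihood(match_profiles)
    p_nonmatch = (1.0 - prior) * likelihood(nonmatch_profiles)
    return p_match / (p_match + p_nonmatch)
```

Ranking every candidate article for a given (last name, first initial) by
this probability would give an ordered list of the kind the abstract
describes. A realistic prior would depend on how common the name is, which
this sketch ignores.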

Addresses: Torvik VI (reprint author), Univ Illinois, Dept Psychiat,
MC912, 1601 W Taylor St, Chicago, IL 60612 USA
Univ Illinois, Dept Psychiat, Chicago, IL 60612 USA
Univ Chicago, Div Humanities, Chicago, IL 60637 USA


Publisher: JOHN WILEY & SONS INC, 111 RIVER ST, HOBOKEN, NJ 07030 USA
Subject Category: COMPUTER SCIENCE, INFORMATION SYSTEMS; INFORMATION SCIENCE
& LIBRARY SCIENCE

IDS Number: 890JU
ISSN: 1532-2882

Cited References:

   *NOM, 2001, NLM TECHN B
   CHURCHES T, 2002, BMC MED INFORMATICS
   FRENCH JC, 2000, J AM SOC INFORM SCI, V51, P774
   GARFIELD E, 1979, CITATION INDEXING IT
   GROSSMAN JW, 2002, CONGRESSUS NUMERANTI, V158, P201
   HOLMES D, 2001, LIT LINGUISTIC COMPU, V16, P403
   JAIN AK, 1988, ALGORITHMS CLUSTERIN
   JONES KS, 2000, INFORM PROCESS MANAG, V36, P779
   JUDSON DH, 2002, ANN M CLASS SOC N AM
   KARYPIS G, 1999, IEEE COMPUT, V32, P68
   LAWRENCE S, 1999, P 3 INT C AUT AG, P392
   NEWMAN MEJ, 2001, PHYS REV E, V64
   NOYONS ECM, 1999, J AM SOC INFORM SCI, V50, P115
   ROBERTSON SE, 1977, J DOC, V33, P294
   ROBERTSON T, 1988, ORDER RESTRICTED ST
   RUIZPEREZ R, 2002, J MED LIBR ASSOC, V90, P411
   SALTON G, 1975, COMMUN ACM, V18, P613
   SWANSON DR, 1997, ARTIF INTELL, V91, P183
   TASKAR B, 2001, P 17 INT JOINT C ART, P870
   TORVIK VI, 2002, INFORMS J COMPUT, V14, P144
   TORVIK VI, 2002, NIH HUM BRAIN PROJ A
   TORVIK VI, 2003, INFORM SCIENCES, V151, P171
   WARNER JW, 2001, P 1 ACM IEEE CS JOIN, P21
   WILBUR WJ, 1996, COMPUT BIOL MED, V26, P209
   WINKLER WE, 1995, BUSINESS SURVEY METH, P355
   YU H, 2002, J AM MED INFORM ASSN, V9, P262


