Ruths, D; Al Zamal, F. 2010. A Method for the Automated, Reliable Retrieval of Publication-Citation Records. PLOS ONE 5 (8): art. no.-e12133
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Tue Sep 21 13:23:54 EDT 2010
Ruths, D; Al Zamal, F. 2010. A Method for the Automated, Reliable Retrieval of
Publication-Citation Records. PLOS ONE 5 (8): art. no.-e12133.
Author Full Name(s): Ruths, Derek; Al Zamal, Faiyaz
Language: English
Document Type: Article
KeyWords Plus: INDEX
Abstract: Background: Publication records and citation indices are often used
to evaluate academic performance. For this reason, obtaining or computing
them accurately is important. This can be difficult, largely due to a lack of
complete knowledge of an individual's publication list and/or lack of time
available to manually obtain or construct the publication-citation record. While
online publication search engines have somewhat addressed these problems,
using raw search results can yield inaccurate estimates of publication-citation
records and citation indices.
Methodology: In this paper, we present a new, automated method that
produces estimates of an individual's publication-citation record from an
individual's name and a set of domain-specific vocabulary that may occur in the
individual's publication titles. Because this vocabulary can be harvested directly
from a research web page or online (partial) publication list, our method delivers
an easy way to obtain estimates of a publication-citation record and the
relevant citation indices. Our method works by applying a series of stringent
name and content filters to the raw publication search results returned by an
online publication search engine. In this paper, our method is run using Google
Scholar, but the underlying filters can be easily applied to any existing
publication search engine. When compared against a manually constructed data
set of individuals and their publication-citation records, our method provides
significant improvements over raw search results. The estimated publication-
citation records returned by our method have an average sensitivity of 98%
and specificity of 72% (in contrast to raw search result specificity of less than
10%). When citation indices are computed using these records, the estimated
indices are within 10% of the true value, whereas raw search results
overestimate them by 75% on average.
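
The filtering step described above can be illustrated with a minimal sketch.
This is not the authors' implementation: the Hit record layout, the function
names, the "Given Surname" author format and the one-shared-word content rule
are all assumptions made for illustration, and the real filters in the paper
may be considerably stricter.

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    """One raw search result (hypothetical record layout)."""
    title: str
    authors: list
    citations: int


def name_matches(author: str, surname: str, initial: str) -> bool:
    """Check an author string such as 'D Ruths' or 'Derek Ruths'.

    Assumes the surname is the last token and that at least one given
    name or initial is present; bare surnames are rejected.
    """
    tokens = re.findall(r"[a-z]+", author.lower())
    if len(tokens) < 2 or tokens[-1] != surname.lower():
        return False
    return tokens[0].startswith(initial.lower())


def title_matches(title: str, vocabulary: set) -> bool:
    """Content filter: the title must share at least one word with the vocabulary."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return bool(words & vocabulary)


def filter_hits(hits, surname, initial, vocabulary):
    """Keep only hits that pass both the name filter and the content filter."""
    vocab = {w.lower() for w in vocabulary}
    return [
        h for h in hits
        if any(name_matches(a, surname, initial) for a in h.authors)
        and title_matches(h.title, vocab)
    ]
```

Under these assumptions, a hit is kept only when one of its authors matches
the target name and its title contains a term from the harvested vocabulary,
so a namesake in another field or a different "J Ruths" is dropped.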
Conclusions: These results confirm that our method provides significantly
improved estimates over raw search results, and these can either be used
directly for large-scale (departmental or university) analysis or further refined
manually to quickly give accurate publication-citation records.
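
The evaluation figures quoted in the abstract can be made concrete with a
short sketch. It assumes that publications are compared as sets of
identifiers, that the negatives used for specificity are the raw search
results that do not belong to the individual, and that the h-index is one of
the citation indices of interest (the abstract does not name a specific
index). All function names are hypothetical.

```python
def sensitivity_specificity(raw, estimated, true):
    """Return (sensitivity, specificity) of an estimated record.

    raw       -- identifiers returned by the unfiltered search
    estimated -- identifiers kept after filtering
    true      -- identifiers in the manually constructed record
    """
    raw, estimated, true = set(raw), set(estimated), set(true)
    tp = len(estimated & true)        # true papers that were kept
    fn = len(true - estimated)        # true papers that were missed
    negatives = raw - true            # raw hits that are not the individual's
    tn = len(negatives - estimated)   # wrong hits correctly rejected
    fp = len(negatives & estimated)   # wrong hits that slipped through
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count < rank:
            break
        h = rank
    return h


def relative_error(estimate, truth):
    """Fractional deviation of an estimated index from its true value."""
    return abs(estimate - truth) / truth
```

With these definitions, an estimate "within 10% of the true value"
corresponds to relative_error(estimate, truth) <= 0.10.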
Addresses: [Ruths, Derek; Al Zamal, Faiyaz] McGill Univ, Dept Comp Sci,
Montreal, PQ, Canada
Reprint Address: Ruths, D, McGill Univ, Dept Comp Sci, Montreal, PQ, Canada.
E-mail Address: druths at ruthsresearch.org
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0012133
fulltext: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0012133