Demleitner M; Kurtz M; Accomazzi A; Gunther E; Grant CS; Murray SS; "Automated resolution of noisy bibliographic references" Classification, Clustering, and Data Mining Applications. 2004, p.521-530. Springer-Verlag Berlin, Berlin

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Tue Nov 16 12:42:11 EST 2004


Markus Demleitner : msdemlei at cl.uni-heidelberg.de

FULL TEXT AVAILABLE AT :
http://arxiv.org/PS_cache/cs/pdf/0401/0401028.pdf


Title    : Automated resolution of noisy bibliographic references
Author(s): Demleitner M; Kurtz M; Accomazzi A; Gunther E;
           Grant CS; Murray SS
Source   : Classification, Clustering, and Data Mining Applications. 2004,
           p.521-530. Springer-Verlag Berlin, Berlin


Language: English

Author Address : Demleitner M, Univ Heidelberg, Bergheimer Str 58,
Heidelberg, Germany

Summary : We describe a system used by the NASA Astrophysics Data System to
identify bibliographic references obtained from scanned article pages by OCR
methods with records in a bibliographic database.  We analyse the process
generating the noisy references and conclude that the three-step procedure
of correcting the OCR results, parsing the corrected string and matching it
against the database provides unsatisfactory results. Instead, we propose a
method that allows a controlled merging of correction, parsing and matching,
inspired by dependency grammars.  We also report on the effectiveness of
various heuristics that we have employed to improve recall.
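
The idea of merging parsing and matching can be illustrated with a small
sketch. This is not the ADS implementation and does not use dependency
grammars; it is a much simpler stand-in, assuming a toy in-memory database and
fuzzy comparison with Python's standard difflib module. Instead of first
correcting and parsing the OCR string, each database record looks for evidence
of its own fields directly in the raw string. The record contents, the field
names, the function names and the 0.75 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical toy records, invented for illustration only.
RECORDS = [
    {"id": "rec-a", "author": "Smith", "year": "1998", "volume": "501", "page": "12"},
    {"id": "rec-b", "author": "Jones", "year": "1998", "volume": "510", "page": "112"},
]

FIELDS = ("author", "year", "volume", "page")


def best_token_score(value, tokens):
    """Highest similarity (0..1) between a field value and any token."""
    return max(
        (SequenceMatcher(None, value.lower(), t.lower()).ratio() for t in tokens),
        default=0.0,
    )


def score_record(record, noisy_reference):
    """Mean support that the raw string gives to the record's fields.

    The string is only split into tokens; no token is ever committed to
    being "the author" or "the volume" before the comparison.
    """
    tokens = noisy_reference.replace(",", " ").replace(".", " ").split()
    return sum(best_token_score(record[f], tokens) for f in FIELDS) / len(FIELDS)


def resolve(noisy_reference, records, threshold=0.75):
    """Return the id of the best-supported record, or None if none is good enough."""
    if not records:
        return None
    best_score, best_id = max((score_record(r, noisy_reference), r["id"]) for r in records)
    return best_id if best_score >= threshold else None


# OCR-style damage: "m" read as "rn", digit "0" read as letter "O".
noisy = "Srnith, J. 1998, ApJ, 5O1, 12"
print(resolve(noisy, RECORDS))  # rec-a
```

Because the comparison is driven by the database records, an OCR error such as
"5O1" never has to be corrected in isolation: it only has to resemble the
volume of some candidate record more closely than it resembles the others.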


