Powley, B; Dale, R High accuracy citation extraction and named entity recognition for a heterogeneous corpus of academic papers PROC OF THE 2007 IEEE INTL CONF ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07) 119-124, 2007
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Tue Feb 19 11:44:43 EST 2008
Email address: bpowley at comp.mq.edu.au
Author(s): Powley, B (Powley, Brett); Dale, R (Dale, Robert)
Title: High accuracy citation extraction and named entity recognition for
a heterogeneous corpus of academic papers
Source: PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL
LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07) 119-124, 2007
Language: English
Document Type: Article
Conference Title: International Conference on Natural Language Processing
and Knowledge Engineering
Conference Date: AUG 30-SEP 01, 2007
Conference Location: Beijing, PEOPLES R CHINA
Conference Sponsors: IEEE Signal Proc Soc, Chinese Assoc Artificial
Intelligence, Chinese Informat Proc Soc China, IEEE Beijing Sect, Beijing
Univ Posts & Telecommun
Abstract: Citation indices are increasingly being used not only as
navigational tools for researchers, but also as the basis for measurement
of academic performance and research impact. This means that the
reliability of tools used to extract citations and construct such indices
is becoming more critical; however, existing approaches to citation
extraction still fall short of the high accuracy required if critical
assessments are to be based on them. In this paper, we present techniques
for high accuracy extraction of citations from academic papers, designed
for applicability across a broad range of disciplines and document styles.
We integrate citation extraction, reference parsing, and author named
entity recognition to significantly improve performance in citation
extraction, and demonstrate this performance on a cross-disciplinary
heterogeneous corpus. Applying our algorithm to previously unseen
documents, we demonstrate high F-measure performance of 0.98 for author
named entity recognition and 0.97 for citation extraction.
Reprint Address: Powley, B, Macquarie Univ, Ctr Language Technol, Sydney,
NSW 2109 Australia.
Publisher Name: IEEE
Publisher Address: 345 E 47TH ST, NEW YORK, NY 10017 USA
ISBN: 978-1-4244-1610-3
Cited Reference Count: 8
BERGMARK D
CSTR20001821 : 2000
BERGMARK D
SIGIR FORUM 35 : 2001
BESAGNI D
DOCUMENT ANAL RECOGN : 84 2003
GARFIELD E
CITATION INDEXES FOR SCIENCE - NEW DIMENSION IN DOCUMENTATION THROUGH
ASSOCIATION OF IDEAS
SCIENCE 122 : 108 1955
GIUFFRIDA G
DL 00 : 77 2000
POWLEY B
P 8 RIAO INT C LARG : 2007
SEYMORE K
AAAI 99 WORKSH MACH : 1999
TAKASU A
P 3 ACM IEEE CS JOIN : 2003