Sriphaew, K; Theeramunkong, T Quality evaluation for document relation discovery using citation information IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E90D (8): 1225-1234 AUG 2007
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Wed Jun 25 10:51:03 EDT 2008
E-mail Address: thanaruk at siit.tu.ac.th
Author(s): Sriphaew, K (Sriphaew, Kritsada); Theeramunkong, T
(Theeramunkong, Thanaruk)
Title: Quality evaluation for document relation discovery using citation
information
Source: IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E90D (8): 1225-1234
AUG 2007
Language: English
Document Type: Article
Author Keywords: document relations; frequent itemset mining; citation
matrix; quality evaluation; document relation evaluation
Abstract: Assessment of discovered patterns is an important issue in the
field of knowledge discovery. This paper presents an evaluation method
that utilizes citation (reference) information to assess the quality of
discovered document relations. With the concept of transitivity as
direct/indirect citations, a series of evaluation criteria is introduced
to define the validity of discovered relations. Two kinds of validity,
called soft validity and hard validity, are proposed to express the
quality of the discovered relations. For the purpose of impartial
comparison, the expected validity is statistically estimated based on the
generative probability of each relation pattern. The proposed evaluation
is investigated using more than 10,000 documents obtained from a research
publication database. With frequent itemset mining as a process to
discover document relations, the proposed method was shown to be a
powerful way to evaluate the relations in four aspects: soft/hard scoring,
direct/indirect citation, relative quality over the expected value, and
comparison to human judgment.
Addresses: Thammasat Univ, Sirindhorn Int Inst Technol, Sch Informat &
Comp Technol, Bangkok 10200, Thailand
Reprint Address: Sriphaew, K, Thammasat Univ, Sirindhorn Int Inst Technol,
Sch Informat & Comp Technol, 2 Prachan Rd, Bangkok 10200, Thailand.
Cited Reference Count: 19
Times Cited: 0
Publisher: IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG
Publisher Address: KIKAI-SHINKO-KAIKAN BLDG MINATO-KU SHIBAKOEN 3 CHOME,
TOKYO, 105, JAPAN
ISSN: 0916-8532
29-char Source Abbrev.: IEICE TRANS INFORM SYST
ISO Source Abbrev.: IEICE Trans. Inf. Syst.
Source Item Page Count: 10
Subject Category: Computer Science, Information Systems; Computer Science,
Software Engineering
ISI Document Delivery No.: 202WE
Cited References:
GANIZ M
LUCSE05027 : 2005
GORDON MD
Using latent semantic indexing for literature based discovery
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE 49 : 674 1998
HAN J
2000 ACM SIGMOD INT 2000 1
KESSLER MM
BIBLIOGRAPHIC COUPLING BETWEEN SCIENTIFIC PAPERS
AMERICAN DOCUMENTATION 14 : 10 1963
KLEINBERG J
ACM 46 : 604 1999
LINDSAY RK
Literature-based discovery by lexical statistics
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE 50 : 574 1999
MCCALLUM AK
BOW TOOLKIT STAT LAN : 1996
NANBA H
11 SIG CLASS RES WOR 2000 117
PAGE L
PAGERANK CITATION RA : 1998
PRATT W
P 16 NAT C ART INT : 80 1999
ROSCH E
PRINCIPLES CATEGORIZ : 27 1978
ROUSSEAU R
A classification of author co-citations: Definitions and search strategies
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY
55 : 513 DOI 10.1002/asi.10401 2004
SALTON G
INTRO MODERN INFORM : 1986
SALTON G
INTRO MODERN INFORM : 1983
SMALL H
J AM SOC INFORM SCI 42 : 676 1973
SRIPHAEW K
P 23 INT C ART INT A : 112 2005
SWANSON DR
MEDICAL LITERATURE AS A POTENTIAL SOURCE OF NEW KNOWLEDGE
BULLETIN OF THE MEDICAL LIBRARY ASSOCIATION 78 : 29 1990
SWANSON DR
PERSPECTIVES BIOL ME 30 : 1 1986
WHITE H
BIBLIOMETRICS ANN RE : 119 1989
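Illustrative note on the abstract: the abstract speaks of direct/indirect citations under transitivity and of soft versus hard validity, but it does not give the formal definitions. The sketch below is therefore only one plausible reading, not the authors' method. It assumes that citation direction is ignored, that an "order" is the maximum number of citation hops allowed, that hard validity requires every pair of documents in a discovered set to be linked, and that soft validity requires each document to be linked to at least one other member. All function and variable names are hypothetical.

```python
from itertools import combinations


def within_order(citations, a, b, order):
    """Return True if a and b are joined by a citation chain of at most
    `order` hops. Direction is ignored (an assumption of this sketch)."""
    # Build an undirected adjacency map from (citing, cited) pairs.
    neighbours = {}
    for citing, cited in citations:
        neighbours.setdefault(citing, set()).add(cited)
        neighbours.setdefault(cited, set()).add(citing)
    frontier, seen = {a}, {a}
    for _ in range(order):
        # Expand one hop, dropping documents already visited.
        frontier = {n for d in frontier for n in neighbours.get(d, ())} - seen
        if b in frontier:
            return True
        seen |= frontier
    return False


def hard_valid(citations, docset, order):
    # Assumed "hard" criterion: every pair in the set is linked.
    return all(within_order(citations, a, b, order)
               for a, b in combinations(docset, 2))


def soft_valid(citations, docset, order):
    # Assumed "soft" criterion: each document is linked to at least one other.
    return all(any(within_order(citations, a, b, order)
                   for b in docset if b != a)
               for a in docset)


# Toy citation data: A cites B, B cites C, D cites E.
citations = [("A", "B"), ("B", "C"), ("D", "E")]
docset = ["A", "B", "C"]
print(soft_valid(citations, docset, order=1))
print(hard_valid(citations, docset, order=1))
print(hard_valid(citations, docset, order=2))
```

In the toy data, A and C are joined only through B, so under these assumed definitions the set {A, B, C} passes the soft check with direct citations alone but passes the hard check only once second-order (indirect) citations are admitted.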