Visa A. Toivonen J. Vanharanta H. Back B. "Contents matching defined by prototypes: Methodology " J. Management Information Systems 18(4):87-100 Spring 2002
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Wed Aug 14 11:38:00 EDT 2002
Ari Visa : avisa at cs.tut.fi
Title : Contents matching defined by prototypes: Methodology
verification with books of the bible
Author : Visa A, Toivonen J, Vanharanta H, Back B
Journal : JOURNAL OF MANAGEMENT INFORMATION SYSTEMS 18 (4): 87-100 SPR
2002
Document type: Article
Language: English
Cited References: 21
Times Cited: 0
Abstract:
It is common that text documents are characterized and classified by key
words, index terms, or headings. We have developed a new methodology based
on prototype matching. The prototype is an interesting document or a part of
an extracted, interesting text. This prototype is matched with the existing
document database or with the monitored document flow. The claim is that the
new methodology is capable of extracting the contents of the document. To
verify this hypothesis, a test with the Bible was designed. Different
translations in English, Latin, Greek, and Finnish were selected as test
materials. Verification tests that included a search for the ten nearest
books to every book of the Bible were performed with a designed prototype
version of the software application. The test results are
reported in this paper.
Author Keywords:
bible, document classification, knowledge discovery, methodology, prototype
matching, text mining, verification
Addresses:
Abo Akad Univ, Turku, Finland
Publisher:
M E SHARPE INC, ARMONK
IDS Number:
532QG
ISSN:
0742-1222
Cited Author       Cited Work            Volume  Page  Year
BACK B             INT J ACCOUNTING      2       249   2001
BROWN G            DISCOURSE ANAL                      1983
CANGELOSI A        IEEE T EVOLUT COMPUT  5       93    2001
DEWEY M            CLASSIFICATION SUBJE                1876
DEWEY M            US BUREAU ED SPECIAL          623   1876
DRETSKE FI         KNOWLEDGE FLOW INFOR                1981
HARTER SP          ONLINE INFORMATION R                1986
KIRBY S            IEEE T EVOLUT COMPUT  5       102   2001
LAHTINEN T         THESIS U HELSINKI FI                2000
LYONS J            SEMANTICS             1             1977
MANNING CD         FDN STAT NATURAL LAN                1999
OARD DW            CSTR3643                            1996
SALTON G           AUTOMATIC TEXT PROCE                1989
SALTON G           COMMUN ACM            18      613   1975
SAVAGERUMBAUGH ES  BRAIN LANG            6       265   1978
SINCLAIR J         COLLINS COBUILD ENGL                1987
SOUKHANOV AH       WEBSTERS NEW RIVERSI                1984
SYKES JB           CONCISE OXFORD DICT                 1976
VISA A             P 34 ANN HAW INT C S                2001
VISA A             P SSGRR 2000 INT C A                2000
VISA A             THEORY TOOLS TECHN 3  4384    149   2001