Visa A. Toivonen J. Vanharanta H. Back B. "Contents matching defined by prototypes: Methodology verification with books of the bible" J. Management Information Systems 18(4):87-100 Spring 2002

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Wed Aug 14 11:38:00 EDT 2002


Ari Visa : avisa at cs.tut.fi

Title : Contents matching defined by prototypes: Methodology verification with books of the bible
Author  : Visa A, Toivonen J, Vanharanta H, Back B
Journal : JOURNAL OF MANAGEMENT INFORMATION SYSTEMS  18 (4): 87-100 SPR 2002

 Document type: Article   Language: English
Cited References: 21      Times Cited: 0


Abstract:
It is common that text documents are characterized and classified by key
words, index terms, or headings. We have developed a new methodology based
on prototype matching. The prototype is an interesting document or a part of
an extracted, interesting text. This prototype is matched with the existing
document database or with the monitored document flow. The claim is that the
new methodology is capable of extracting the contents of the document. To
verify this hypothesis, a test with the Bible was designed. Different
translations in English, Latin, Greek, and Finnish were selected as test
materials. Verification tests that included the search of the ten nearest
books to every book of the Bible were performed with a designed prototype
version of the software application. The test results are
reported in this paper.
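
To make the "ten nearest books" search concrete, here is a minimal sketch of
ranking documents by similarity to a prototype text. The abstract does not say
how the matching is computed, so the sketch substitutes plain word-count
vectors and cosine similarity; that choice, the function names
(word_counts, cosine_similarity, nearest_documents), and the toy corpus are
all assumptions for illustration, not the authors' method or data.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its alphabetic word tokens."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII ones,
    # so Latin, Greek, and Finnish text is tokenized the same way.
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_documents(prototype, corpus, k=10):
    """Return the k corpus entries most similar to the prototype text.

    corpus maps a document name to its text; the result is a list of
    (name, similarity) pairs sorted from most to least similar.
    """
    proto_vec = word_counts(prototype)
    scored = [(name, cosine_similarity(proto_vec, word_counts(text)))
              for name, text in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy corpus standing in for the books of the Bible.
corpus = {
    "doc_a": "in the beginning the earth was without form",
    "doc_b": "the beginning of the earth and the heavens",
    "doc_c": "blessed are the meek for they shall inherit",
}
ranked = nearest_documents("the earth in the beginning", corpus, k=2)
```

With k=10 and one entry per book, calling nearest_documents once for each
book's text would give the kind of ten-nearest-books listing the abstract
mentions, under the stand-in similarity measure assumed here.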

Author Keywords:
bible, document classification, knowledge discovery, methodology, prototype
matching, text mining, verification

Addresses:
Abo Akad Univ, Turku, Finland

Publisher:
M E SHARPE INC, ARMONK

IDS Number:
532QG

ISSN:
0742-1222

Cited Author            Cited Work                Volume   Page      Year

 BACK B                INT J ACCOUNTING               2     249      2001
 BROWN G               DISCOURSE ANAL                                1983
 CANGELOSI A           IEEE T EVOLUT COMPUT           5      93      2001
 DEWEY M               CLASSIFICATION SUBJE                          1876
 DEWEY M               US BUREAU ED SPECIAL                 623      1876
 DRETSKE FI            KNOWLEDGE FLOW INFOR                          1981
 HARTER SP             ONLINE INFORMATION R                          1986
 KIRBY S               IEEE T EVOLUT COMPUT           5     102      2001
 LAHTINEN T            THESIS U HELSINKI FI                          2000
 LYONS J               SEMANTICS 1                                   1977
 MANNING CD            FDN STAT NATURAL LAN                          1999
 OARD DW               CSTR3643                                      1996
 SALTON G              AUTOMATIC TEXT PROCE                          1989
 SALTON G              COMMUN ACM                    18     613      1975
 SAVAGERUMBAUGH ES     BRAIN LANG                     6     265      1978
 SINCLAIR J            COLLINS COBUILD ENGL                          1987
 SOUKHANOV AH          WEBSTERS NEW RIVERSI                          1984
 SYKES JB              CONCISE OXFORD DICT                           1976
 VISA A                P 34 ANN HAW INT C S                          2001
 VISA A                P SSGRR 2000 INT C A                          2000
 VISA A                THEORY TOOLS TECHN 3        4384     149      2001


