Manaris B, Pellicoro L, Pothering G, Hodges H "Investigating Esperanto's statistical proportions relative to other languages using neural networks and Zipf's law" Proc IASTED Intl Conf on Artificial Intelligence & Applications: 102-108, 2006
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Tue Jul 18 14:54:56 EDT 2006
Manaris at cs.cofc.edu
idpellic at edisto.cofc.edu
pother at cs.cofc.edu
Hodgesh at cofc.edu
Title: Investigating Esperanto's statistical proportions relative to other
languages using neural networks and Zipf's law
Author(s): Manaris B, Pellicoro L, Pothering G, Hodges H
Source: PROCEEDINGS OF THE IASTED INTERNATIONAL CONFERENCE ON ARTIFICIAL
INTELLIGENCE AND APPLICATIONS : 102-108, 2006
Editor(s): Devedzic V
Document Type: Article
Language: English
Cited References: 20
Conference Information: IASTED International Conference on Artificial
Intelligence and Applications
Innsbruck, AUSTRIA, FEB 13-16, 2006
IASTED
Abstract:
Esperanto is a constructed natural language, which was intended to be an
easy-to-learn lingua franca. Zipf's law models the statistical proportions
of various phenomena in human ecology, including natural languages. Given
Esperanto's artificial origins, one wonders how "natural" it appears,
relative to other natural languages, in the context of Zipf's law. To
explore this question, we collected a total of 283 books from six
languages: English, French, German, Italian, Spanish, and Esperanto. We
applied Zipf-based metrics on our corpus to extract distributions for word,
word distance, word bigram, word trigram, and word length for each book.
Statistical analyses show that Esperanto's statistical proportions are
similar to those of other languages. We then trained artificial neural
networks (ANNs) to classify books according to language. The ANNs achieved
high accuracy rates (86.3% to 98.6%). Subsequent analysis identified German
as having the most unique proportions, followed by Esperanto, Italian,
Spanish, English, and French. Analysis of misclassified patterns shows that
Esperanto's statistical proportions resemble mostly those of German and
Spanish, and least those of French and Italian.
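
As a rough illustration of what a Zipf-based metric on words might look like, the sketch below ranks words by frequency and fits a least-squares line to log(frequency) against log(rank); a slope near -1 corresponds to the classic Zipf distribution. The abstract does not specify how the authors computed their metrics, so this is only an assumption-laden sketch: the function names `rank_frequency` and `zipf_slope` are hypothetical, the tokenization rule is a guess, and only the word distribution is covered (not word distance, bigram, trigram, or word length).

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent.

    Assumption: a "word" is any run of Unicode letters, lowercased.
    The original study may have tokenized differently.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    `counts` must be sorted in descending order, so that the
    item at index i has rank i + 1.
    """
    if len(counts) < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Applied per book, `zipf_slope(rank_frequency(book_text))` would give one number for the word distribution; comparable numbers for the other distributions could then serve as inputs to a classifier, though whether the authors built their feature vectors this way is not stated here.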
Addresses: Manaris B (reprint author), Coll Charleston, Dept Comp Sci, 66
George St, Charleston, SC 29424 USA
Coll Charleston, Dept Comp Sci, Charleston, SC 29424 USA
Publisher: ACTA PRESS, BOX 3243 POSTAL STATION B, CALGARY, T2M 4L8, CANADA
IDS Number: BEA05
ISBN: 0-88986-556-6
CITED REFERENCES:
ADAMIC LA
Q J ELECT COMMERCE 1 : 5 2000
BAK P
title not available
PHYS REV LETT 59 : 381 1987
BOULTON M
ZAMENHOF CREATOR ESP : 1960
BURGOS JD
title not available
BIOSYSTEMS 39 : 227 1996
GELBUKH A
LECT NOTES COMPUTER : 2001
HALL I
THESIS U WAIKATO : 1998
HARLOW D
LITERATURO : 2005
KALDA J
ZIPFS LAW HUMAN HEAR : 2001
LI W
ZIPFS LAW : 2005
LI WT
title not available
J THEOR BIOL 219 : 539 2002
MACHADO P
Adaptive critics for evolutionary artists
APPLICATIONS OF EVOLUTIONARY COMPUTING 3005 : 437 2004
MANARIS B
title not available
COMPUT MUSIC J 29 : 55 2005
MANDELBROT B
FRACTAL GEOMETRY NAT : 1977
SALINGAROS NA
title not available
ENVIRON PLANN B 26 : 909 1999
SCHROEDER M
FRACTALS CHAOS POWER : 1991
SPEHAR B
title not available
COMPUT GRAPH-UK 27 : 813 2003
TAYLOR RP
title not available
NATURE 399 : 422 1999
VOSS RF
title not available
NATURE 258 : 317 1975
WITTEN IH
DATA MINING PRACTICA : 2005
ZIPF GK
HUMAN BEHAV PRINCIPL : 1949