Manaris B, Pellicoro L, Pothering G, Hodges H "Investigating Esperanto's statistical proportions relative to other languages using neural networks and Zipf's law" Proc Iasted Intl Conf on Artificial Intelligence & Applications : 102-108, 2006

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Tue Jul 18 14:54:56 EDT 2006


Manaris at cs.cofc.edu
idpellic at edisto.cofc.edu
pother at cs.cofc.edu
Hodgesh at cofc.edu


Title: Investigating Esperanto's statistical proportions relative to other
languages using neural networks and Zipf's law

Author(s): Manaris B, Pellicoro L, Pothering G, Hodges H

Source: PROCEEDINGS OF THE IASTED INTERNATIONAL CONFERENCE ON ARTIFICIAL
INTELLIGENCE AND APPLICATIONS : 102-108, 2006

Editor(s): Devedzic V
Document Type: Article
Language: English
Cited References: 20

Conference Information: IASTED International Conference on Artificial
Intelligence and Applications
Innsbruck, AUSTRIA, FEB 13-16, 2006
IASTED

Abstract:
Esperanto is a constructed natural language, which was intended to be an
easy-to-learn lingua franca. Zipf's law models the statistical proportions
of various phenomena in human ecology, including natural languages. Given
Esperanto's artificial origins, one wonders how "natural" it appears,
relative to other natural languages, in the context of Zipf's law. To
explore this question, we collected a total of 283 books from six
languages: English, French, German, Italian, Spanish, and Esperanto. We
applied Zipf-based metrics on our corpus to extract distributions for word,
word distance, word bigram, word trigram, and word length for each book.
Statistical analyses show that Esperanto's statistical proportions are
similar to those of other languages. We then trained artificial neural
networks (ANNs) to classify books according to language. The ANNs achieved
high accuracy rates (86.3% to 98.6%). Subsequent analysis identified German
as having the most unique proportions, followed by Esperanto, Italian,
Spanish, English, and French. Analysis of misclassified patterns shows that
Esperanto's statistical proportions resemble mostly those of German and
Spanish, and least those of French and Italian.
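
As a rough illustration of what a Zipf-based metric on word and word-bigram
distributions can look like, the sketch below ranks items by frequency and
fits a straight line to log(frequency) against log(rank). A slope near -1
with a high R^2 is the classic Zipfian pattern. This is a minimal sketch and
not the authors' implementation: the abstract does not spell out their exact
metrics, so the function names, the tokenizer, and the choice of a
least-squares slope with R^2 are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def words(text):
    """Lowercase the text and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def bigrams(tokens):
    """Pair each token with its successor."""
    return list(zip(tokens, tokens[1:]))


def rank_frequency(items):
    """Return item frequencies sorted from most to least common (rank 1 first)."""
    return sorted(Counter(items).values(), reverse=True)


def zipf_slope(frequencies):
    """Least-squares slope and R^2 of log(frequency) against log(rank)."""
    n = len(frequencies)
    if n < 2:
        raise ValueError("need at least two ranked items to fit a line")
    xs = [math.log(rank) for rank in range(1, n + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    # A flat distribution has no variance in y; treat the fit as exact.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r_squared
```

For one book, zipf_slope(rank_frequency(words(text))) would give a slope and
R^2 for the word distribution, and passing bigrams(words(text)) instead would
give the same pair for word bigrams. Such per-book numbers are the kind of
feature a classifier could be trained on.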


Addresses: Manaris B (reprint author), Coll Charleston, Dept Comp Sci, 66
George St, Charleston, SC 29424 USA
Coll Charleston, Dept Comp Sci, Charleston, SC 29424 USA

Publisher: ACTA PRESS, BOX 3243 POSTAL STATION B, CALGARY, T2M 4L8, CANADA
IDS Number: BEA05

ISBN: 0-88986-556-6

CITED REFERENCES:
ADAMIC LA
Q J ELECT COMMERCE 1 : 5 2000

BAK P
title not available
PHYS REV LETT 59 : 381 1987

BOULTON M
ZAMENHOF CREATOR ESP : 1960

BURGOS JD
title not available
BIOSYSTEMS 39 : 227 1996

GELBUKH A
LECT NOTES COMPUTER : 2001

HALL I
THESIS U WAIKATO : 1998

HARLOW D
LITERATURO : 2005

KALDA J
ZIPFS LAW HUMAN HEAR : 2001

LI W
ZIPFS LAW : 2005

LI WT
title not available
J THEOR BIOL 219 : 539 2002

MACHADO P
Adaptive critics for evolutionary artists
APPLICATIONS OF EVOLUTIONARY COMPUTING 3005 : 437 2004

MANARIS B
title not available
COMPUT MUSIC J 29 : 55 2005

MANDELBROT B
FRACTAL GEOMETRY NAT : 1977

SALINGAROS NA
title not available
ENVIRON PLANN B 26 : 909 1999

SCHROEDER M
FRACTALS CHAOS POWER : 1991

SPEHAR B
title not available
COMPUT GRAPH-UK 27 : 813 2003

TAYLOR RP
title not available
NATURE 399 : 422 1999

VOSS RF
title not available
NATURE 258 : 317 1975

WITTEN IH
DATA MINING PRACTICA : 2005

ZIPF GK
HUMAN BEHAV PRINCIPL : 1949


