Re: [SIGMETRICS] extensions of FullText.exe and Ti.ex for co-word analysis with provisions for stop word lists and word frequency lists

Jesper Wiborg Schneider JWS at DB.DK
Fri Mar 24 02:52:05 EST 2006


Dear Loet;

Many thanks for the program.
I've tried it out, but I run into difficulties: I'm unable to create the
.dat files for use with Pajek. I use Olle Persson's Bibexcel program to
extract words from titles. I then run a frequency analysis in order to
choose the words for the word.txt file in your program, one word per line.
Next I arrange all the paper titles in a text file, one title per line, and
save it with CR/LF line endings as text.txt for use with your program. I
then run the program. I do get dBase IV output files, but I do not get the
.dat files for use with Pajek. Where am I going wrong?
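
For illustration, the two input files described above could be generated as
in the following minimal sketch. It assumes only what this message states
(word.txt with one word per line, text.txt with one title per line, CR/LF
line endings). The sample words and titles, the helper names, and the
Latin-1 encoding are hypothetical, and whether the program requires anything
beyond this layout is not confirmed here.

```python
import tempfile
from pathlib import Path

# Hypothetical sample data; not taken from the original message.
TITLES = [
    "Mapping science with co-word analysis",
    "Co-word analysis of library science titles",
]
WORDS = ["analysis", "science", "titles"]


def write_crlf_lines(path, lines):
    """Write one item per line, ending every line with CR/LF."""
    # newline="" stops Python from translating the explicit "\r\n".
    # The Latin-1 encoding is an assumption about what the program accepts.
    with open(path, "w", encoding="latin-1", newline="") as handle:
        for line in lines:
            handle.write(line.strip() + "\r\n")


def has_clean_crlf(path):
    """Return True if the file ends in CR/LF and has no bare CR or LF."""
    data = Path(path).read_bytes()
    pairs = data.count(b"\r\n")
    return (
        data.endswith(b"\r\n")
        and data.count(b"\n") == pairs
        and data.count(b"\r") == pairs
    )


if __name__ == "__main__":
    folder = Path(tempfile.mkdtemp())
    write_crlf_lines(folder / "word.txt", WORDS)
    write_crlf_lines(folder / "text.txt", TITLES)
    for name in ("word.txt", "text.txt"):
        print(name, has_clean_crlf(folder / name))
```

Running has_clean_crlf on files exported from another tool is a quick way to
rule out mixed or Unix-style line endings as the cause of missing output.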

Could you provide an example of a word.txt and a text.txt file - perhaps I
can detect my errors?

Best wishes - Jesper Schneider

**********************************************
Jesper Wiborg Schneider, PhD, Assistant Professor
Department of Information Studies Royal School of Library & Information
Science
Sohngårdsholmsvej 2, DK-9000 Aalborg, DENMARK Tel. +45 98773041, Fax. +45
98151042
E-mail: jws at db.dk
Homepage: http://www2.db.dk/jws/home_dk.htm
**********************************************

-----Original message-----
From: Loet Leydesdorff [mailto:loet at LEYDESDORFF.NET]
Sent: 23 March 2006 10:32
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: [SIGMETRICS] extensions of FullText.exe and Ti.ex for co-word analysis
with provisions for stop word lists and word frequency lists


Dear colleagues,

While using the programs in class, it became clear that one needs a
provision to generate word frequency lists from the texts and to correct for
stop words. The latter facility has now been added to these programs (at
http://www.leydesdorff.net/software/ti/stopword.exe ) and the former is
provided as a hyperlink to TextSTAT-2 of the Dutch Linguistics Department of
the Free University of Berlin (at
http://www.niederlandistik.fu-berlin.de/textstat/software-en.html).

The programs themselves can be found at
http://www.leydesdorff.net/software/ti and
http://www.leydesdorff.net/software/fulltext , respectively. Advanced users
may note that one can also replace the cosine matrices with Pearson
correlation matrices by feeding the output file matrix.dbf into SPSS and
running the appropriate routines. In my opinion, convincing arguments have
been made for using the cosine as the similarity criterion for the
visualizations.
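
To make the difference between the two measures concrete, here is a minimal
sketch that computes both for a pair of word vectors. It assumes each word
is represented by a binary vector marking the titles in which it occurs;
that representation, the function names, and the sample titles are
illustrative assumptions, not a description of how the programs or SPSS
compute their matrices.

```python
import math


def occurrence_vector(word, titles):
    """Return 1 for each title containing the word, else 0."""
    return [1 if word in title.lower().split() else 0 for title in titles]


def cosine(x, y):
    """Cosine similarity: dot product divided by the vector lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centred_x = [a - mean_x for a in x]
    centred_y = [b - mean_y for b in y]
    return cosine(centred_x, centred_y)


if __name__ == "__main__":
    # Hypothetical titles, one per entry, as in text.txt.
    titles = [
        "mapping science with words",
        "words and citations in science",
        "citations in library journals",
    ]
    science = occurrence_vector("science", titles)
    words = occurrence_vector("words", titles)
    print("cosine :", cosine(science, words))
    print("pearson:", pearson(science, words))
```

The cosine does not subtract the mean, so the many zeros of a sparse
occurrence vector do not pull the value down the way they shift a Pearson
correlation; two words that never co-occur get a cosine of 0 but a negative
Pearson value.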

With best wishes,


Loet
________________________________

Loet Leydesdorff
Amsterdam School of Communications Research (ASCoR)
Kloveniersburgwal 48, 1012 CX Amsterdam
Tel.: +31-20- 525 6598; fax: +31-20- 525 3681
loet at leydesdorff.net ; http://www.leydesdorff.net/


