[Sigia-l] Cluster analysis algorithms for understanding card sorts

William Denton buff at pobox.com
Mon Sep 8 23:17:59 EDT 2003


Recently I did a card sort with sixteen people organizing 70 cards each.
As others have found, IBM's EZSort [1] crashes on this amount of data.  I
hacked up some Perl and made a matrix showing how closely each card is
related to every other.  It was revealing just to look at each term and
list the terms most commonly associated with it, without any larger
groupings.  Some clusters were easily seen when everyone associated A with
B and C, B with C and A, and C with B and A, for example.
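
For illustration, the kind of matrix described above can be sketched as
follows.  This is not the Perl script itself: it is a minimal Python
sketch, and the data layout (one list of groups per participant), the
function names cooccurrence and closest, and the sample cards A to E are
all assumptions made up for the example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(sorts):
    """Count how many participants put each pair of cards in the same group.

    sorts is a list with one entry per participant; each entry is a list
    of groups, and each group is a set of card names (assumed layout).
    """
    counts = Counter()
    for groups in sorts:
        for group in groups:
            # Sort the names so each pair is always stored as (low, high).
            for a, b in combinations(sorted(group), 2):
                counts[(a, b)] += 1
    return counts

def closest(card, counts, top=3):
    """List the cards most often grouped with the given card."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == card:
            related[b] += n
        elif b == card:
            related[a] += n
    return related.most_common(top)

# Hypothetical data: three participants sorting five cards.
sorts = [
    [{"A", "B", "C"}, {"D", "E"}],
    [{"A", "B"}, {"C", "D", "E"}],
    [{"A", "B", "C"}, {"D"}, {"E"}],
]
counts = cooccurrence(sorts)
print(closest("A", counts))
```

Dividing each count by the number of participants would give a
similarity between 0 and 1 for every pair of cards.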

This doesn't get me to a dendrogram, though, which is what I want: a tree
chart like EZSort generates.  I've looked around on the web and some
journal indexes and can't find anything that describes just how cluster
analysis is done in cases like this.  There are some textbooks I can try,
but I get the feeling they all concentrate on genetics.

Can anyone recommend something that describes algorithms for cluster
analysis and tree chart generation, suitable for use in a program?
The more directly applicable to information organization and card sorting,
the better.  If I can get a useful Perl script working, I'll make it
available.  There are some open source cluster analysis tools out there
([2] [3]) but nothing, apparently, aimed at us.
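
To make the question concrete, here is a minimal Python sketch of one
common approach, agglomerative clustering with average linkage, which
repeatedly merges the two closest clusters until a single tree remains.
Whether EZSort works this way is not something this sketch claims; the
linkage rule, the function names agglomerate, sim and show, and the pair
counts are all assumptions chosen for illustration.

```python
from itertools import combinations

def agglomerate(cards, similarity):
    """Build a tree by average-linkage agglomerative clustering.

    similarity(a, b) returns a number where higher means more closely
    related.  The result is a nested tuple (left, right, score), with
    card names as leaves.
    """
    # Each cluster is (tree so far, list of member cards).
    clusters = [(card, [card]) for card in cards]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            mi, mj = clusters[i][1], clusters[j][1]
            # Average similarity over all cross-cluster pairs.
            score = sum(similarity(a, b) for a in mi for b in mj) / (len(mi) * len(mj))
            if best is None or score > best[0]:
                best = (score, i, j)
        score, i, j = best
        merged = ((clusters[i][0], clusters[j][0], score),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

def show(tree, depth=0):
    """Print the tree as an indented outline, a crude text dendrogram."""
    if isinstance(tree, str):
        print("  " * depth + tree)
    else:
        left, right, score = tree
        print("  " * depth + f"+ {score:.2f}")
        show(left, depth + 1)
        show(right, depth + 1)

# Hypothetical pair counts: how many participants grouped each pair together.
pair_counts = {("A", "B"): 3, ("A", "C"): 2, ("B", "C"): 2,
               ("D", "E"): 2, ("C", "D"): 1, ("C", "E"): 1}

def sim(a, b):
    return pair_counts.get(tuple(sorted((a, b))), 0)

tree = agglomerate(["A", "B", "C", "D", "E"], sim)
show(tree)
```

Single linkage (take the maximum similarity between clusters) or complete
linkage (take the minimum) would drop into the same loop by changing
only the line that computes score.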

Thanks for any pointers.

Bill

[1] http://www-3.ibm.com/ibm/easy/eou_ext.nsf/Publish/410 (requires
    JavaScript; the download links don't show in Lynx)
[2] http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm
[3] http://search.cpan.org/search?query=algorithm::cluster
-- 
William Denton : Toronto, Canada : http://www.miskatonic.org/ : Caveat lector.



