lwaltman at FEW.EUR.NL
Wed Nov 5 11:29:16 EST 2008
Dear Leo and Loet,
Your paper contains some nice theoretical results. However, we have serious doubts about the way in which you use these results for visualization purposes. The general idea of your approach seems to be that a visualization of authors can best be made using the cosine as a similarity measure, but preferably in such a way that in the visualization authors are not connected if their Pearson correlation is below 0. But what is wrong with a Pearson correlation below 0? Consider two authors, A and B. Author A has, respectively, 101, 102, 103, and 104 cocitations with authors C, D, E, and F. Author B has, respectively, 104, 103, 102, and 101 cocitations with authors C, D, E, and F. Hence, the Pearson correlation for authors A and B equals -1. Consequently, according to your reasoning, there should be no connection between A and B. But why not? Authors A and B are very similar, since they have almost the same cocitation profile with authors C, D, E, and F. Therefore, there should definitely be a connection between authors A and B.
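The example above can be checked numerically. What follows is a minimal Python sketch, assuming the cocitation profiles are plain lists of counts; the helper names pearson and cosine are illustrative and are not taken from either paper.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and standard deviations, without the common 1/n factor
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

def cosine(x, y):
    """Salton's cosine of two equal-length profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Cocitations of authors A and B with authors C, D, E, and F
profile_a = [101, 102, 103, 104]
profile_b = [104, 103, 102, 101]

r = pearson(profile_a, profile_b)  # -1 up to floating-point rounding
c = cosine(profile_a, profile_b)   # 42020 / 42030, roughly 0.9998

print(f"Pearson r = {r:.4f}")
print(f"cosine    = {c:.4f}")
```

The two profiles differ by at most 3 cocitations out of roughly 100, so the cosine is close to 1, while the Pearson correlation, which looks only at deviations from each author's mean, is -1.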
In our opinion, it makes no sense to ask the question whether the Pearson correlation for two authors is above or below 0. This question is completely irrelevant for visualization purposes. We discuss this in detail in a paper recently published in JASIST (59(10):1653-1661, 2008, http://dx.doi.org/10.1002/asi.20872). By the way, this paper also contains an empirical example showing that sometimes the choice between the cosine and the Pearson correlation results in significantly different visualizations.
Ludo Waltman and Nees Jan van Eck
Ludo Waltman MSc
Erasmus School of Economics
Erasmus University Rotterdam
P.O. Box 1738
3000 DR Rotterdam
Tel: (+31) 10 4088938
Fax: (+31) 10 4089162
E-mail: lwaltman at few.eur.nl
From: ASIS&T Special Interest Group on Metrics [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Loet Leydesdorff
Sent: Wednesday 5 November 2008 11:15
To: SIGMETRICS at LISTSERV.UTK.EDU
Administrative info for SIGMETRICS (for example unsubscribe): http://web.utk.edu/~gwhitney/sigmetrics.html
The relation between Pearson's correlation coefficient r and Salton's cosine measure <http://www.leydesdorff.net/CosineVsPearson/index.htm>
Journal of the American Society for Information Science & Technology (forthcoming)
The relation between Pearson's correlation coefficient and Salton's cosine measure is revealed based on the different possible values of the division of the L1-norm and the L2-norm of a vector. These different values yield a sheaf of increasingly straight lines which form together a cloud of points, being the investigated relation. The theoretical results are tested against the author co-citation relations among 24 informetricians for whom two matrices can be constructed, based on co-citations: the asymmetric occurrence matrix and the symmetric co-citation matrix. Both examples completely confirm the theoretical results. The results enable us to specify an algorithm which provides a threshold value for the cosine above which none of the corresponding Pearson correlations would be negative. Using this threshold value can be expected to optimize the visualization of the vector space.
<click here for pdf> <http://www.leydesdorff.net/CosineVsPearson/CosineVsPearson.pdf>
Leo Egghe and Loet Leydesdorff