White HD "Author cocitation analysis and ...
Loet Leydesdorff
loet at LEYDESDORFF.NET
Thu Jan 15 11:33:20 EST 2004
Dear Steven and colleagues,
I have now read the paper by William P. Jones and George W. Furnas, JASIST,
38(6), 1987, 420-442, and it is really enlightening because they explain
the difference between the cosine and the Pearson correlation in a way that
I, as a non-mathematician, can clearly understand.
The Pearson correlation is just the cosine applied to the vectors after
each has been centered on its mean. Thus, while the cosine can be written:
cos(x,y) = Sigma (x * y) / {sqrt(Sigma x^2) * sqrt(Sigma y^2)}
the Pearson correlation is precisely the same formula with x
replaced by {x - mean(x)} and y by {y - mean(y)}. I had never seen
this connection. (I apologize for the notation in ASCII.)
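
To make the connection concrete, here is a minimal sketch in plain Python. The function names (cosine, center, pearson) and the two example vectors are my own illustrative choices, not taken from the Jones and Furnas paper. It computes the Pearson correlation simply by centering each vector on its mean and then calling the cosine.

```python
import math

def cosine(x, y):
    """Cosine between two equal-length vectors:
    Sigma(x*y) / (sqrt(Sigma x^2) * sqrt(Sigma y^2))."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def center(v):
    """Subtract the mean of v from each of its elements."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]

def pearson(x, y):
    """Pearson correlation, written as the cosine of the centered vectors."""
    return cosine(center(x), center(y))

# Hypothetical, skewed count vectors (made up for illustration only).
x = [12, 0, 3, 0, 1]
y = [9, 1, 0, 0, 2]

print("cosine :", cosine(x, y))
print("pearson:", pearson(x, y))
```

The two measures differ only in the centering step, so any difference between the printed values comes from subtracting the means.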
It follows clearly that the effects of this normalization will be
minimal if the distribution is normal, but the more the distribution
deviates from normal, the less the mean is meaningful as a parameter,
and the cosine then outperforms the Pearson. The authors put it as
follows: "The use of moment normalization in these measures introduces
additional potential drawbacks. (...) Moment normalization removes a
degree of freedom from the expressive power of query and object
vectors."
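
One way to read the lost "degree of freedom" (my interpretation, not necessarily the authors' exact meaning) is that centering discards the overall level of a vector: adding a constant to every element leaves the Pearson unchanged but does change the cosine. A minimal sketch in plain Python, with made-up vectors and an arbitrary shift of 10 chosen only for illustration:

```python
import math

def cosine(x, y):
    """Cosine between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) *
                  math.sqrt(sum(b * b for b in y)))

def pearson(x, y):
    """Pearson correlation as the cosine of the mean-centered vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return cosine([a - mx for a in x], [b - my for b in y])

# Hypothetical count vectors, for illustration only.
x = [5, 0, 1, 0]
y = [4, 1, 0, 0]

# Raise the overall level of x by an arbitrary constant.
x_shifted = [a + 10 for a in x]

print("cosine  before/after shift:", cosine(x, y), cosine(x_shifted, y))
print("pearson before/after shift:", pearson(x, y), pearson(x_shifted, y))
```

The cosine responds to the shift while the Pearson does not, because the centering removes it before the cosine is taken.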
Does all this lead to a recommendation to use the cosine matrix
instead of the Pearson matrix as input to (for example) factor analysis
in the case of non-normal distributions? Is there a statistician
listening on the list who can answer this question? Or does factor
analysis require parametric statistics as input? References?
(SPSS allows for the input of an external matrix in the multivariate
routines.)
With kind regards,
Loet
_____
Loet Leydesdorff
Amsterdam School of Communications Research (ASCoR)
Kloveniersburgwal 48, 1012 CX Amsterdam
Tel.: +31-20- 525 6598; fax: +31-20- 525 3681
loet at leydesdorff.net ; http://www.leydesdorff.net/
The Challenge of Scientometrics: http://www.upublish.com/books/leydesdorff-sci.htm
The Self-Organization of the Knowledge-Based Society: http://www.upublish.com/books/leydesdorff.htm