White HD "Author cocitation analysis and Pearson's r" Journal of the American Society for Information Science and Technology 54(13):1250-1259 November 2003

Loet Leydesdorff loet at LEYDESDORFF.NET
Thu Dec 4 03:41:36 EST 2003


> -----Original Message-----
> From: ASIS&T Special Interest Group on Metrics
> [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Eugene Garfield
> Sent: Monday, December 01, 2003 9:57 PM
> To: SIGMETRICS at LISTSERV.UTK.EDU
> Subject: [SIGMETRICS] White HD "Author cocitation analysis
> and Pearson's r" Journal of the American Society for
> Information Science and Technology 54(13):1250-1259 November 2003,
>
>
> Howard D. White : Howard.Dalby.White at drexel.edu
>
> TITLE    Author cocitation analysis and Pearson's r
>
> AUTHOR   White HD
>
> JOURNAL  JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION
>          SCIENCE AND TECHNOLOGY 54 (13): 1250-1259 NOV 2003

Dear Howard and colleagues,

I read this article with interest, and I agree that for most practical
purposes Pearson's r will do a job similar to Salton's cosine.
Nevertheless, the argument of Ahlgren et al. (2003) seems convincing to
me. Scientometric distributions are often highly skewed, and the mean on
which r depends can easily be distorted by the zeros. The cosine does
not use the mean and so elegantly avoids this problem.
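
To make the point about the zeros concrete, here is a minimal sketch in
Python. The two count vectors are made up for illustration and are not
data from any of the papers discussed; the function names are likewise
only my own. It compares both measures before and after adding authors
with whom neither of the two is cocited.

```python
import math

def cosine(x, y):
    """Salton's cosine: uses the raw values only, no mean involved."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson's r: the same formula applied to deviations from the mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    return cosine(dx, dy)

# Hypothetical cocitation counts of two authors with four other authors.
x = [4, 2, 5, 1]
y = [1, 5, 2, 4]

# The same profiles after six authors are added with whom neither is cocited.
x_padded = x + [0] * 6
y_padded = y + [0] * 6

print("cosine:", round(cosine(x, y), 4), "->", round(cosine(x_padded, y_padded), 4))
print("r     :", round(pearson(x, y), 4), "->", round(pearson(x_padded, y_padded), 4))
```

In this made-up case the cosine is unaffected by the added zeros, while
r changes from a negative to a positive value because the zeros pull
both means down.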

A disadvantage of the cosine (in comparison to r) may be that, with
non-negative counts, it cannot become negative to indicate
dissimilarity. This is particularly important for factor analysis. I
have thought about feeding the cosine matrix into the factor analysis
(SPSS allows a matrix to be imported for this analysis), but that seems
a bit tricky.
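
The difference in sign can be illustrated with a small sketch, again
with hypothetical profiles (the labels A, B, C and the counts are
assumptions for illustration only). It relies on the fact that r is the
cosine of the mean-centered vectors, so r can be negative even when all
counts, and hence all cosines, are non-negative.

```python
import math
from itertools import combinations

def cosine(x, y):
    """Salton's cosine of two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def centered(x):
    """Subtract the mean from every element."""
    mean = sum(x) / len(x)
    return [a - mean for a in x]

def pearson(x, y):
    """Pearson's r computed as the cosine of the centered vectors."""
    return cosine(centered(x), centered(y))

# Hypothetical non-negative cocitation profiles of three authors.
profiles = {
    "A": [4, 2, 5, 1],
    "B": [1, 5, 2, 4],
    "C": [3, 3, 4, 2],
}

# Pairwise values of both measures, keyed by the pair of labels.
cosines = {}
correlations = {}
for p, q in combinations(profiles, 2):
    cosines[(p, q)] = cosine(profiles[p], profiles[q])
    correlations[(p, q)] = pearson(profiles[p], profiles[q])
    print(p, q, "cosine =", round(cosines[(p, q)], 3),
          "r =", round(correlations[(p, q)], 3))
```

For these made-up counts every cosine is positive, whereas r is
negative for the pairs A-B and B-C.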

Caroline Wagner and I did a study on coauthorship relations entitled
"Mapping Global Science using International Coauthorships: A comparison
of 1990 and 2000" (Intern. J. of Technology and Globalization,
forthcoming) in which we used the same matrix for mapping using the
cosine (and then Pajek for the visualization) and for the factor
analysis using Pearson's r. The results are provided as factor plots in
the preprint version of the paper at
http://www.leydesdorff.net/sciencenets/mapping.pdf .

While the cosine maps exhibit the hierarchy by placing the central
cluster in the center (including the U.S.A. and some Western-European
countries), the factor analysis reveals the main structural axes of the
system as competitive relations between the U.S.A., U.K., and
continental Europe (Germany + Russia). The French system can be
considered as a fourth axis. These eigenvectors function as competitors
for collaboration with authors from other (smaller or more peripheral)
countries.

Thus, the two measures enable us to show different things: Salton's
cosine exhibits the hierarchy, and one might say that the factor
analysis on the basis of Pearson's r enables us to show the heterarchy
among competing axes in the system.

With kind regards,

Loet

Loet Leydesdorff
Amsterdam School of Communications Research (ASCoR)
Kloveniersburgwal 48, 1012 CX Amsterdam
Tel.: +31-20- 525 6598; fax: +31-20- 525 3681
loet at leydesdorff.net ; http://www.leydesdorff.net/

The Challenge of Scientometrics
<http://www.upublish.com/books/leydesdorff-sci.htm> ;
The Self-Organization of the Knowledge-Based Society
<http://www.upublish.com/books/leydesdorff.htm>



>
>
>  Document type: Article  Language: English
>  Cited References: 20  Times Cited: 0
>
> Abstract:
> In their article "Requirements for a cocitation similarity
> measure, with special reference to Pearson's correlation
> coefficient," Ahlgren, Jarneving, and Rousseau fault
> traditional author cocitation analysis (ACA) for using
> Pearson's r as a measure of similarity between authors
> because it fails two tests of stability of measurement. The
> instabilities arise when rs are recalculated after a first
> coherent group of authors has been augmented by a second
> coherent group with whom the first has little or no
> cocitation. However, AJ&R neither cluster nor map their data
> to demonstrate how fluctuations in rs will mislead the
> analyst, and the problem they pose is remote from both theory
> and practice in traditional ACA. By entering their own rs
> into multidimensional scaling and clustering routines, I show
> that, despite rs fluctuations, clusters based on it are much
> the same for the combined groups as for the separate groups.
> The combined groups when mapped appear as polarized clumps of
> points in two-dimensional space, confirming that differences
> between the groups have become much more important than
> differences within the groups-an accurate portrayal of what
> has happened to the data. Moreover, r produces clusters and
> maps very like those based on other coefficients that AJ&R
> mention as possible replacements, such as a cosine similarity
> measure or a chi square dissimilarity measure. Thus, r
> performs well enough for the purposes of ACA. Accordingly, I
> argue that qualitative information revealing why authors are
> cocited is more important than the cautions proposed in the
> AJ&R critique. I include notes on topics such as handling the
> diagonal in author cocitation matrices, lognormalizing data,
> and testing r for significance.
>
> KeyWords Plus:
> INTELLECTUAL STRUCTURE, SCIENCE
>
> Addresses:
> White HD, Drexel Univ, Coll Informat Sci & Technol, 3152
> Chestnut St, Philadelphia, PA 19104 USA Drexel Univ, Coll
> Informat Sci & Technol, Philadelphia, PA 19104 USA
>
> Publisher:
> JOHN WILEY & SONS INC, 111 RIVER ST, HOBOKEN, NJ 07030 USA
>
> IDS Number:
> 730VQ
>
>
>  Cited Author   Cited Work            Volume  Page  Year
>
>  AHLGREN P      J AM SOC INF SCI TEC      54   550  2003
>  BAYER AE       J AM SOC INFORM SCI       41   444  1990
>  BORGATTI SP    UCINET WINDOWS SOFTW                2002
>  BORGATTI SP    WORKSH SUNB 20 INT S                2000
>  DAVISON ML     MULTIDIMENSIONAL SCA                1983
>  EOM SB         J AM SOC INFORM SCI       47   941  1996
>  EVERITT B      CLUSTER ANAL                        1974
>  GRIFFITH BC    KEY PAPERS INFORMATI            R6  1980
>  HOPKINS FL     SCIENTOMETRICS             6    33  1984
>  HUBERT L       BRIT J MATH STAT PSY      29   190  1976
>  LEYDESDORFF L  INFORMERICS 87 88              105  1988
>  MCCAIN KW      J AM SOC INFORM SCI       41   433  1990
>  MCCAIN KW      J AM SOC INFORM SCI       37   111  1986
>  MCCAIN KW      J AM SOC INFORM SCI       35   351  1984
>  MULLINS NC     THEORIES THEORY GROU                1973
>  WHITE HD       BIBLIOMETRICS SCHOLA            84  1990
>  WHITE HD       J AM SOC INF SCI TEC      54   423  2003
>  WHITE HD       J AM SOC INFORM SCI       49   327  1998
>  WHITE HD       J AM SOC INFORM SCI       41   430  1990
>  WHITE HD       J AM SOC INFORM SCI       32   163  1981
>
>
> When responding, please attach my original message
> _______________________________________________________________
> Eugene Garfield, PhD.  email: garfield at codex.cis.upenn.edu
> home page: www.eugenegarfield.org
> Tel: 215-243-2205 Fax 215-387-1266
> President, The Scientist LLC. www.the-scientist.com
> Chairman Emeritus, ISI www.isinet.com
> Past President, American Society for Information Science and
> Technology (ASIS&T)  www.asis.org
> _______________________________________________________________
>
>
>
> ISSN:
> 1532-2882
>
