Re: [SIGMETRICS] skewed citation distributions should not be averaged
Bornmann, Lutz
lutz.bornmann at GV.MPG.DE
Wed Aug 31 15:00:14 EDT 2011
Dear David,
The mean is strongly influenced by publications with high citation
counts. The median - another measure of central tendency - is not. Whether
(a few) highly cited papers have such an effect can be checked by
comparing the mean with the median of the citations for a publication set.
If the mean is markedly higher than the median, highly cited papers are
driving the mean.
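A minimal sketch of this check in Python, assuming a small made-up list of citation counts (the numbers and the helper name `mean_median_gap` are hypothetical, chosen only for illustration):

```python
from statistics import mean, median

# Hypothetical citation counts for a small publication set (illustrative only).
citations = [0, 1, 1, 2, 3, 4, 5, 7, 12, 150]


def mean_median_gap(counts):
    """Return (mean, median, mean/median ratio) for a list of citation counts."""
    m = mean(counts)
    med = median(counts)
    # Guard against a zero median, which is common when many papers are uncited.
    ratio = m / med if med else float("inf")
    return m, med, ratio


m, med, ratio = mean_median_gap(citations)
print(f"mean={m:.2f} median={med:.2f} ratio={ratio:.2f}")

# Dropping the single most-cited paper shows how much it pulls the mean.
trimmed = sorted(citations)[:-1]
print(f"mean without top paper={mean(trimmed):.2f}")
```

A ratio well above 1 suggests that a few highly cited papers dominate the mean; what counts as "well above" is a judgment call that this sketch does not settle.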
Best,
Lutz
---------------------------------------
Dr. Dr. habil. Lutz Bornmann
Max Planck Society
Administrative Headquarters
Hofgartenstr. 8
80539 Munich
Tel.: 089/2108-1265
Email: bornmann at gv.mpg.de
WWW: www.lutz-bornmann.de
ResearcherID: http://www.researcherid.com/rid/A-3926-2008
________________________________
From: ASIS&T Special Interest Group on Metrics on behalf of David A.
Pendlebury
Sent: Wed 31.08.2011 20:34
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: Re: [SIGMETRICS] skewed citation distributions should not be
averaged
Dear Professor Leydesdorff,
Thank you for your reply.
I noticed your example of individuals at the University of Amsterdam in your
paper - and such small data sets are of course subject to many difficulties.
My question arose because of the strong statement -- without qualification --
in your paper:
"Citation distributions are so skewed that using the mean or any other
central tendency measure is ill-advised."
Best wishes, David
________________________________
From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Loet Leydesdorff
Sent: Wednesday, August 31, 2011 11:11 AM
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: Re: [SIGMETRICS] skewed citation distributions should not be
averaged
Dear David:
Wolfgang Glaenzel precisely defined the conditions:
either. This is a misbelief. According to the
central limit theorem, the distribution of the
means of random samples is approximately
normal for a large sample size, provided the
underlying distribution of the population is in
the domain of attraction of the Gaussian distribution.
In other words, sample means approach a
normal distribution regardless of the distribution
of the population if the number of observations
is large enough and the first statistical moments
are finite. Consequently, means and shares of
different samples drawn from the same populations
can be compared with each other and the
significance of the deviation can be determined.
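For reference, the classical (Lindeberg-Lévy) form of the central limit theorem can be written as below. The symbols X_i, \mu, \sigma^2 and n are notation introduced here and do not appear in the quoted text. In this form the requirement is a finite variance, that is, a finite second moment.

```latex
X_1, \dots, X_n \ \text{i.i.d.}, \quad
\mathbb{E}[X_i] = \mu, \quad
\operatorname{Var}(X_i) = \sigma^2 < \infty
\;\Longrightarrow\;
\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr) \xrightarrow{\;d\;} \mathcal{N}(0, \sigma^2),
\qquad
\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i .
```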
Gangan Prathap's contribution is interesting in this context because, using a
physical metaphor, he distinguished between "energy" and "exergy". The
difference (E - X), in his opinion, is "a kind of entropy"; indeed, "a kind
of" because the dimensionality of energy and entropy is different. If one
assumes "a kind of ideal gas," then one can compute with the mean. In
evaluation research, however, we do not have such a large number of
observations that the constraints can be neglected. There is no reason to
assume that the CLT applies. For example, there are principles in science such
as preferential attachment that operate against the assumption of a tendency to
the mean.
Rather than having to show this each time, one can use percentiles, which do
not require this assumption. The hundred percentiles can follow the citation
curve as a continuous variable ("quantiles"). One can use non-parametric
statistics (which have been available for 50 or so years) instead. Instead of
determining the deviation from the mean, one can test the observation against
the expectation (as when using chi-square). The specification of the
expectation can enrich the research design.
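As a rough illustration of testing observed against expected values with percentiles, here is a minimal Python sketch. It is not the I3 definition from the paper: the reference set, the unit's citation counts, the three percentile classes, and the helper names are all assumptions made for the example.

```python
from bisect import bisect_left


def percentile_rank(value, reference_sorted):
    """Percentage of reference papers cited strictly less often than `value`."""
    below = bisect_left(reference_sorted, value)
    return 100.0 * below / len(reference_sorted)


def chi_square(observed, expected):
    """Pearson chi-square statistic for observed vs expected class counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical reference set and unit under evaluation (illustrative only).
reference = sorted([0, 0, 1, 1, 2, 2, 3, 4, 5, 6,
                    8, 10, 12, 15, 20, 25, 30, 40, 60, 150])
unit = [1, 3, 12, 30, 60, 150, 40, 25]

ranks = [percentile_rank(c, reference) for c in unit]

# Assumed classes: bottom half, 50th-90th percentile, top 10 percent.
edges = [0, 50, 90, 100]
shares = [0.5, 0.4, 0.1]

observed = [0] * len(shares)
for r in ranks:
    for i in range(len(shares)):
        if edges[i] <= r < edges[i + 1]:
            observed[i] += 1

# Expectation: the unit's papers spread over the classes like the reference set.
expected = [s * len(unit) for s in shares]
print(observed, expected, round(chi_square(observed, expected), 2))
```

With expected counts this small the chi-square approximation is unreliable, so the statistic here only shows the mechanics. The choice of classes and of the expectation is exactly the part of the research design that has to be specified.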
Best wishes,
Loet
Means and shares are used as unbiased estimators
of the expected value and the corresponding
probabilities, respectively. Furthermore, in the
case of skewed discrete distributions the mean
value is superior to median. The underlying
methods of application of mathematical statistics
have been described, among others, by
Schubert and Glänzel (1983), Glänzel and Moed
(2002) and reliability-related statistics have been
regularly and successfully applied to bibliometrics
since. These statistical properties have severe
effects on ranking issues as well. Different
ranks can prove as ties because the underlying
indicator values might not differ significantly
(cf. Glänzel and Debackere 2007).
The myth of the inapplicability of Gaussian
statistics in a bibliometric context actually arose
from a misunderstanding, namely from the assumed
comparison of individual observations
with a standard. However, that is not what statistics
does.
--David Pendlebury
________________________________
From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Loet Leydesdorff
Sent: Tuesday, August 30, 2011 11:10 PM
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: [SIGMETRICS] skewed citation distributions should not be averaged
A Rejoinder on Energy versus Impact Indicators
<http://arxiv.org/abs/1108.5845>
Scientometrics (in press)
Citation distributions are so skewed that using the mean or any other central
tendency measure is ill-advised. Unlike G. Prathap's scalar measures (Energy,
Exergy, and Entropy or EEE), the Integrated Impact Indicator (I3) is based on
non-parametric statistics using the (100) percentiles of the distribution.
Observed values can be tested against expected ones; impact can be qualified
at the article level and then aggregated.
pdf available at http://arxiv.org/ftp/arxiv/papers/1108/1108.5845.pdf
** apologies for cross postings
________________________________
Loet Leydesdorff
Professor, University of Amsterdam
Amsterdam School of Communications Research (ASCoR)
Kloveniersburgwal 48, 1012 CX Amsterdam.
Tel. +31-20-525 6598; fax: +31-842239111
loet at leydesdorff.net; http://www.leydesdorff.net/
Visiting Professor, ISTIC, <http://www.istic.ac.cn/Eng/brief_en.html>
Beijing; Honorary Fellow, SPRU, <http://www.sussex.ac.uk/spru/> University of
Sussex