skewed citation distributions should not be averaged

Wed Aug 31 14:10:31 EDT 2011

Dear David:

Wolfgang Glaenzel precisely defined the conditions:

either. This is a misbelief. According to the central limit theorem, the
distribution of the means of random samples is approximately normal for a
large sample size, provided the underlying distribution of the population is
in the domain of attraction of the Gaussian distribution. In other words,
sample means approach a normal distribution regardless of the distribution
of the population if the number of observations is large enough and the
first statistical moments are finite. Consequently, means and shares of
different samples drawn from the same populations can be compared with each
other and the significance of the deviation can be determined.

Gangan Prathap’s contribution is interesting in this context because, using a
physical metaphor, he distinguished between “energy” and “exergy”. The
difference (E – X), in his opinion, is “a kind of entropy”; indeed, only “a
kind of”, because energy and entropy have different dimensions. If one
assumes “a kind of ideal gas,” then one can compute with the mean. In
evaluation research, however, we do not have so large a number of
observations that the constraints can be neglected, and so there is no reason
to assume that the CLT applies. For example, there are mechanisms in science,
such as preferential attachment, that operate against the assumption of a
tendency toward the mean.
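
A minimal sketch of this point in Python, under stated assumptions: the
exponential and Pareto populations, the shape parameter 1.2, and the sample
sizes are illustrative choices of mine, not estimates from citation data. The
Pareto population with shape 1.2 has a finite mean but infinite variance, so
it lies outside the domain of attraction of the Gaussian distribution, and
the means of samples drawn from it stay strongly skewed.

```python
import random
import statistics


def sample_means(draw, n_obs, n_samples, rng):
    """Return n_samples means, each computed over n_obs draws from draw(rng)."""
    return [
        statistics.fmean(draw(rng) for _ in range(n_obs))
        for _ in range(n_samples)
    ]


def skewness(values):
    """Population skewness: mean cubed z-score (about 0 for a normal shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(2011)  # fixed seed so that the run is repeatable

# Skewed population with finite variance: the CLT applies to its means.
light_means = sample_means(lambda r: r.expovariate(1.0), 200, 2000, rng)

# Heavy-tailed population (assumed shape 1.2): infinite variance,
# so the means do not tend toward a normal distribution.
heavy_means = sample_means(lambda r: r.paretovariate(1.2), 200, 2000, rng)

light_skew = skewness(light_means)
heavy_skew = skewness(heavy_means)
print(f"skewness of means, exponential population: {light_skew:.2f}")
print(f"skewness of means, Pareto(1.2) population: {heavy_skew:.2f}")
```

Whether a given citation distribution behaves more like the first or the
second population is an empirical question; the sketch only shows that a
large number of observations does not by itself settle it.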

Rather than demonstrating each time that the CLT applies, one can use
percentiles, which do not require that assumption. The hundred percentiles
can follow the citation curve as a continuous variable (“quantiles”). One can
use non-parametric statistics (which have been available for 50 or so years)
instead. Instead of determining the deviation from the mean, one can test the
observation against the expectation (as when using chi-square). The
specification of the expectation can enrich the research design.
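
As a rough illustration of testing observation against expectation, here is
a Python sketch under stated assumptions: the citation counts are invented,
the four quartile classes are a simplification I chose for brevity, and the
expectation specified here is that the unit's papers are spread over the
classes in the same proportions as the reference set. It is not the I3
computation from the paper.

```python
from bisect import bisect_left


def percentile_rank(value, reference_sorted):
    """Percentage of reference papers cited less often than `value`."""
    return 100.0 * bisect_left(reference_sorted, value) / len(reference_sorted)


def class_counts(citations, reference_sorted, bounds):
    """Count papers per percentile class; `bounds` are the lower class limits."""
    counts = [0] * (len(bounds) + 1)
    for c in citations:
        rank = percentile_rank(c, reference_sorted)
        counts[sum(rank >= b for b in bounds)] += 1
    return counts


def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical citation counts: a reference set and one unit under evaluation.
reference = sorted([0, 0, 0, 1, 1, 1, 2, 2, 3, 3,
                    4, 5, 6, 8, 10, 13, 18, 25, 40, 120])
unit = [6, 13, 25, 40, 2, 8, 120, 18]
bounds = [25, 50, 75]  # assumed quartile classes

observed = class_counts(unit, reference, bounds)

# Expectation: the unit mirrors the class shares of the reference set.
ref_counts = class_counts(reference, reference, bounds)
expected = [len(unit) * n / len(reference) for n in ref_counts]

chi2 = chi_square(observed, expected)
print("observed:", observed)
print("expected:", expected)
print(f"chi-square: {chi2:.3f} with {len(bounds)} degrees of freedom")
```

With so few papers the expected counts are small, so the chi-square
approximation is poor here; a real application would need larger sets or an
exact test.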

Best wishes, 

Loet

Means and shares are used as unbiased estimators of the expected value and
the corresponding probabilities, respectively. Furthermore, in the case of
skewed discrete distributions the mean value is superior to median. The
underlying methods of application of mathematical statistics have been
described, among others, by Schubert and Glänzel (1983), Glänzel and Moed
(2002) and reliability-related statistics have been regularly and
successfully applied to bibliometrics since. These statistical properties
have severe effects on ranking issues as well. Different ranks can prove as
ties because the underlying indicator values might not differ significantly
(cf. Glänzel and Debackere 2007).

The myth of the inapplicability of Gaussian statistics in a bibliometric
context actually arose from a misunderstanding, namely from the assumed
comparison of individual observations with a standard. However, that is not
what statistics does.

--David Pendlebury

  _____  

From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Loet Leydesdorff
Sent: Tuesday, August 30, 2011 11:10 PM
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: [SIGMETRICS] skewed citation distributions should not be averaged

A Rejoinder on Energy versus Impact Indicators
<http://arxiv.org/abs/1108.5845> 
Scientometrics (in press)

Citation distributions are so skewed that using the mean or any other
central tendency measure is ill-advised. Unlike G. Prathap's scalar measures
(Energy, Exergy, and Entropy or EEE), the Integrated Impact Indicator (I3)
is based on non-parametric statistics using the (100) percentiles of the
distribution. Observed values can be tested against expected ones; impact
can be qualified at the article level and then aggregated. 

pdf available at http://arxiv.org/ftp/arxiv/papers/1108/1108.5845.pdf 

** apologies for cross postings

  _____  

Loet Leydesdorff 

Professor, University of Amsterdam
Amsterdam School of Communications Research (ASCoR)
Kloveniersburgwal 48, 1012 CX Amsterdam.
Tel. +31-20-525 6598; fax: +31-842239111

loet at leydesdorff.net ; http://www.leydesdorff.net/
Visiting Professor, ISTIC, Beijing <http://www.istic.ac.cn/Eng/brief_en.html>;
Honorary Fellow, SPRU, University of Sussex <http://www.sussex.ac.uk/spru/>
