Loet Leydesdorff loet at LEYDESDORFF.NET
Sat Apr 19 03:07:58 EDT 2014

Dear Jesper: 

I found your recent paper in Scientometrics at
51-5 . Thank you for noting this; I’ll read it.


My contribution was not meant anti-intellectually; I just wished to
express that I like it when problems can be made empirical and critique
leads to alternative operationalizations. As noted, I therefore added in
response to your comments the effect sizes to so that colleagues could
see the counting rules (in addition to testing the differences in rankings
for statistical significance; the confidence intervals can be found at the
webpage of CWTS at ).


I agree that the specification of uncertainty is very central to the






Loet Leydesdorff 

Professor Emeritus, University of Amsterdam
Amsterdam School of Communications Research (ASCoR)

Honorary Professor, SPRU, University of

Guest Professor Zhejiang Univ., Hangzhou;
Visiting Professor, ISTIC,

Visiting Professor, Birkbeck, University of London;


From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Jesper Wiborg Schneider
Sent: Saturday, April 19, 2014 8:00 AM
Subject: Re: [SIGMETRICS] Papers


Dear Lutz and Loet,


As I said, this is not an issue of more references to me, and certainly not
of more perfunctory ones like those in this paper; they bring nothing.  Of
course I should NOT get credit for “introducing” effect sizes or CIs into our
field – this is ludicrous – but I was surprised to see that some of the
important “challenges” I bring forward, especially in the JoI paper (extended
recently in a Scientometrics paper, of whose existence I think Lutz is
aware), are basically ignored in this new work (or subsumed in a perfunctory
reference), which I presume is written for our community and thus goes into
“our literature” on this topic.


These issues cannot be brushed aside as “meta theoretical” and “repetitive”.
Loet, I certainly agree that the use of inferential statistics should be
related to the research questions, or rather to the design and settings used
to answer these questions, but you cannot – as you seem to want – remove the
“meta theoretical” questions underlying their use.  It is simply absurd in a
scholarly field to wish such things away.  They are there and you
have to confront them, like it or not.  There is more to statistical inference
than calculation.  Statistical inference is based on different theories and
assumptions.  Both are needed for understanding and interpretation, and
knowledge claims depend upon them.  When it comes to uncertainty, the issues
are so problematic and unresolved that most people just put the “binoculars
in front of the blind eye” and continue to use them and make claims based
upon them, which in many cases we need to consider as basically flawed or, as
Ioannidis has claimed, “most published research findings are false” – I
guess such practice causes the “repetitiveness” you dislike.


Yes, both of you responded to some of my criticisms in two brief letters,
basically arguing for the use of statistical significance tests in the
frequentist conception, supported by effect sizes and CIs.  Fair enough,
but what about the important issues of when to use them, how to interpret
the results they produce, etc.?  If you have read my papers you will notice
that I (also) endorse the use of effect sizes and that I argue that CIs are
superior to null hypothesis significance tests.  But you will also notice –
contrary to their endorsement in the current paper by Lutz – that I am also
highly critical of CIs and that I certainly do not see any point in using
them as pseudo significance tests, especially since they can relieve us of
the problematic null hypothesis.  These are not just my personal opinions;
they are legitimate claims which I think should be reflected upon.  Lutz and
his coauthor write in the paper:


“CIs provide a feel for the precision of measures. Put another way, they
show the range that the true value of the mean may plausibly fall in. For
example, if the observed mean was 40, the 95% CI might range between 35 and
45. So, while 40 is our “best guess” as to what the mean truly is, values
ranging between 35 and 45 are also plausible alternative values.”


I am not sure how people will perceive this, but in my reading the
definition is unclear on the basic facts about frequentist CIs – which also
happen to be their inherent weaknesses – namely that you cannot determine
whether the “true” value of the parameter lies within the one interval you
happen to calculate – it does or it does not – and therefore you need the
long-run experimental interpretation which, as I have argued, is seldom
addressed (especially not its consequences) and is not per se something that
gives meaning or works equally well in all fields and situations, especially
not in observational studies in the social sciences (they may work in
physics, biology or experimental psychology?).  I am afraid that the above
definition could leave readers with the impression that what we have is a
fixed interval with a certain probability (95%) of including the “true
parameter value” and we therefore have a feeling about the ‘uncertainty’ –
but this is not the case!
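
The long-run reading described above can be illustrated with a small
simulation. This is only a minimal sketch under idealized assumptions that
are not taken from any of the papers discussed: normally distributed data, a
known standard deviation, genuinely random sampling, and illustrative values
(true mean 40, sigma 10, n = 16). The function names are hypothetical.

```python
import random
import statistics


def ci_for_mean(sample, sigma, z=1.96):
    """z-based 95% interval for the mean, assuming sigma is known."""
    mean = statistics.fmean(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return mean - half_width, mean + half_width


def coverage(true_mean=40.0, sigma=10.0, n=16, repeats=10_000, seed=1):
    """Share of repeated random samples whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = ci_for_mean(sample, sigma)
        # Each single interval either contains the true mean or it does not.
        hits += low <= true_mean <= high
    return hits / repeats


if __name__ == "__main__":
    print(f"share of intervals covering the true mean: {coverage():.3f}")
```

The 95% describes the procedure over many repeated random samples; any one
computed interval simply contains the parameter or does not. Where there is
no random sampling to repeat, as in many observational settings, this
simulation has no obvious counterpart.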


As I said, I was surprised that in this new paper/chapter by Lutz, where the
aim I guess is to teach colleagues some better practices, none of the
contested issues brought forward are reflected upon or at least pointed to
so that readers are made aware of these issues (e.g., frequentist
interpretation, logical issues, randomness or rather lack of it, implausible
null hypotheses).  As I see it, these issues are not “resolved” in your
letters and I reacted because I thought that two perfunctory references, one
to me and one to you, were an “understatement” of these unresolved problems
in a paper where it is relevant to mention them.  I do not argue that each and
every question should be brought forward for discussion, but since CIs are
endorsed, their assumptions and difficult interpretations could have been
scrutinized more; but that is my opinion.  I can understand that you see
this differently – fair enough and let’s leave it there.


Kind regards Jesper




Jesper W. Schneider
Senior Researcher, PhD

Aarhus University
Business and Social Sciences
Danish Centre for Studies in Research & Research Policy,

Department of Political Science & Government

Bartholins Allé 7

building 1331, room 027

DK-8000 Aarhus C

T: +45 8716 5241



From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Bornmann, Lutz
Sent: 17 April 2014 09:47
Subject: Re: [SIGMETRICS] Papers


Dear Jesper,


I agree with Loet. It is not clear to me what you expect, Jesper. You are
cited in our papers, but I think it wouldn’t be appropriate to mention
that you were the first one who used effect size measures (confidence
intervals etc.) in bibliometrics. Many colleagues and I have used them for
many years. (I already used them in my master’s thesis at the end of the 1990s.)






From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Loet Leydesdorff
Sent: Thursday, April 17, 2014 8:15 AM
Subject: Re: [SIGMETRICS] Papers


Dear Jesper, 


Is there anything new to add to this debate? We thought that referencing the
argument would be sufficient in this context. 


At the time, we responded more fully in Bornmann & Leydesdorff (2013) and
Leydesdorff (2013), and added power analysis (Cohen, 1988) to the
statistical test of the Leiden (2011) rankings, available at (Leydesdorff & Bornmann, 2012) in
response to your contributions (Schneider 2012 and 2013). 


In my opinion, the issue of using significance testing, confidence
intervals, and/or power analysis is to be decided from the perspective of
the functionality of answering research questions. Otherwise, the debate
tends to remain meta-theoretical and one risks becoming repetitive.







Bornmann, L., & Leydesdorff, L. (2013). Statistical Tests and Research
Assessments: A comment on Schneider (2012). Journal of the American Society
for Information Science and Technology, 64(6), 1306-1308. 

Leydesdorff, L. (2013). Does the specification of uncertainty hurt the
progress of scientometrics? Journal of Informetrics, 7(2), 292-293. 

Leydesdorff, L., & Bornmann, L. (2012). Testing Differences Statistically
with the Leiden Ranking. Scientometrics, 92(3), 781-783.

Schneider, J. W. (2012). Testing University Rankings Statistically: Why this
Perhaps is not such a Good Idea after All. Some Reflections on Statistical
Power, Effect Size, Random Sampling and Imaginary Populations. In É.
Archambault, Y. Gingras & V. Larivière (Eds.), Science & Technology
Indicators (STI) 2012 (Vol. 2, pp. 719-732). Montréal: Université du Québec
à Montréal.

Schneider, J. W. (2013). Caveats for using statistical significance tests in
research assessments. Journal of Informetrics, 7(1), 50-62.


-----Original Message-----
From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Jesper Wiborg Schneider
Sent: Wednesday, April 16, 2014 8:55 PM
Subject: Re: [SIGMETRICS] Papers




Dear Lutz,


Interesting paper, the latter one, and interesting to see how the 'debate'
in our field is reflected in the references you and your coauthor give:


"In bibliometrics, it has been also recommended to go beyond statistical
significance testing (Bornmann & Leydesdorff, 2013; Schneider, 2012)."


I guess you can call this quote an understatement, at least from my
perspective. I do not think anyone recommended going 'beyond statistical
significance testing' in scientometrics/bibliometrics before I criticized
the current practice in 'Caveats for using statistical significance tests in
research assessments', first published on arXiv in 2011 and later, in
2013, in Journal of Informetrics.

In 2012, at the STI conference, I extended the critique in the paper you
mention in the quote, discussing one of your papers on university rankings
and exemplifying the use of effect sizes in relation to such rankings, in
fact the use of Cohen's h in relation to the proportion of top 10 percent
highly cited papers - basically the same example you bring forward in this

Only then - as far as I can follow the ever faster publishing chronology -
did you and other colleagues react to some of my criticisms, including an
endorsement of the use of effect sizes and CIs not visible until then.
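
For readers who have not met it, the Cohen's h mentioned above for comparing
two proportions (for example, two institutions' shares of top-10% papers) can
be sketched as follows. The proportions used here are hypothetical and are
not taken from the Leiden Ranking or from any of the papers in this thread.

```python
import math


def cohens_h(p1, p2):
    """Cohen's h: difference between arcsine-transformed proportions."""
    phi1 = 2 * math.asin(math.sqrt(p1))
    phi2 = 2 * math.asin(math.sqrt(p2))
    return phi1 - phi2


if __name__ == "__main__":
    # Hypothetical shares of top-10% papers for two institutions.
    h = cohens_h(0.15, 0.10)
    print(f"Cohen's h = {h:.3f}")
```

Cohen (1988) offers 0.2, 0.5 and 0.8 as rough conventions for small, medium
and large values of h; whether such benchmarks are meaningful for rankings is
part of what is being debated here.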

Now, I do not hunger for more references or the like, but I would appreciate
that when we in the community have a debate or thread, such a
debate/thread is outlined thoroughly and honestly in the review section -
that is the purpose of a review. This case is not the first one and it gives
one the impression that our literature is not read ... or worse ...? I am not
sure whether this paper is under review, but I guess my writing this mail is
the risk you run when announcing this on this list.


Kind regards Jesper






From: ASIS&T Special Interest Group on Metrics [SIGMETRICS at LISTSERV.UTK.EDU]
on behalf of Bornmann, Lutz [lutz.bornmann at GV.MPG.DE]

Sent: 16 April 2014 15:53


Subject: [SIGMETRICS] Papers


BRICS countries and scientific excellence: A bibliometric analysis of most
frequently-cited papers
Lutz Bornmann, Caroline Wagner, Loet Leydesdorff


(Submitted on 14 Apr 2014)


The BRICS countries (Brazil, Russia, India, China, and South Africa) are
noted for their increasing participation in science and technology. The
governments of these countries have been boosting their investments in
research and development to become part of the group of nations doing
research at a world-class level. This study investigates the development of
the BRICS countries in the domain of top-cited papers (top 10% and 1% most
frequently cited papers) between 1990 and 2010. To assess the extent to
which these countries have become important players on the top level, we
compare the BRICS countries with the top-performing countries worldwide. As
the analyses of the (annual) growth rates show, with the exception of
Russia, the BRICS countries have increased their output in terms of most
frequently-cited papers at a higher rate than the top-cited countries
worldwide. In a further step of analysis for this study, we generate
co-authorship networks among authors of highly cited papers for four time
points to view changes in BRICS participation (1995, 2000, 2005, and 2010).
Here, the results show that all BRICS countries succeeded in becoming part
of this network, whereby the Chinese collaboration activities focus on the





The substantive and practical significance of citation impact differences
between institutions: Guidelines for the analysis of percentiles using
effect sizes and confidence intervals
Richard Williams, Lutz Bornmann


(Submitted on 12 Apr 2014)


In our chapter we address the statistical analysis of percentiles: How
should the citation impact of institutions be compared? In educational and
psychological testing, percentiles are already used widely as a standard to
evaluate an individual's test scores - intelligence tests for example - by
comparing them with the percentiles of a calibrated sample. Percentiles, or
percentile rank classes, are also a very suitable method for bibliometrics
to normalize citations of publications in terms of the subject category and
the publication year and, unlike the mean-based indicators (the relative
citation rates), percentiles are scarcely affected by skewed distributions
of citations. The percentile of a certain publication provides information
about the citation impact this publication has achieved in comparison to
other similar publications in the same subject category and publication
year. Analyses of percentiles, however, have not always been presented in
the most effective and meaningful way. New APA guidelines (American
Psychological Association, 2010) suggest a lesser emphasis on significance
tests and a greater emphasis on the substantive and practical significance
of findings. Drawing on work by Cumming (2012) we show how examinations of
effect sizes (e.g. Cohen's d statistic) and confidence intervals can lead to
a clear understanding of citation impact differences.






Dr. Dr. habil. Lutz Bornmann

Division for Science and Innovation Studies Administrative Headquarters of
the Max Planck Society Hofgartenstr. 8

80539 Munich

Tel.: +49 89 2108 1265

Mobil: +49 170 9183667

