methodological problem

Ulf Sandström usandstrom at TELE2.SE
Sun Mar 9 11:14:17 EDT 2008


Dear colleagues,

We are trying to find a method for producing relevant productivity measures
for different fields of research. When we talk about field factors we look
for a method that considers the different conditions of production,
different journal sets, different acceptance behavior from journals,
different standards, and all factors that should be included in an
explanation as to why chemistry produces more papers per researcher than
engineering and math do. For this we need to classify articles into
different normalization groups. One strategy would, of course, be to use the
ISI subject codes, but we are not content with that. First, we need
aggregate groups; second, in our understanding the ISI subject categories are
not a perfect way of organizing scientific fields into macro classes. The
subject codes might be good for bibliographical purposes, but not for this
type of bibliometrics, and the critique of subject categories has been
discussed on this list several times.

Instead, we were thinking about using bibliographic coupling in order to
achieve distinct groups of fields through a clustering method. But, would we
get a result that took the different conditions into consideration with a
method of that type? Our answer to that question would be no. There are
different branches of science, basic and applied, that refer to more or
less the same references, but whose authors work within very different social
practices and under different conditions for productivity. This goes for citations
as well, so co-citations will not solve the problem.

Our question is therefore: how do we produce significant groups (fields) for
normalization of productivity?

We propose a new method and we invite colleagues to discuss the different
stages of our method and other problems related to the method. The main
point in the method is that we cluster journals according to the behavior of
researchers publishing in these journals. Our underlying assumption is that
researchers working under different conditions of productivity to a large
extent use different journals.

First, we use the Nordic countries as a reference base for publication
productivity at Swedish universities. We start by unifying the address
data in Web of Science downloads for all Nordic universities over a four-year
period (2003-2006). Next, we unify the author names at
all Nordic universities (54,000 unique names). We have to do this because
otherwise it would be hard to count publications per author, as people move
around and are at different universities and research institutes (and
companies) during the period. The author data is then used as a basis for a
clustering procedure where the clusters are made out of links between
journals according to the behavior of the Nordic authors. If the same author
publishes in two different journals a connection between the journals is
established. We use a variation of the clustering method proposed by Boyack
& Klavans in their 2007 Madrid ISSI conference paper. Accordingly, we use
the links between journals as input to VxOrd to produce coordinates. The
clustering is performed with the coordinates as variables. We use a cosine
normalization for the degree of linkage.
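
To make the linking step concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not our production code: the function name journal_links, the (author, journal) record format, and the exact cosine form c_ij / sqrt(n_i * n_j), with c_ij the number of authors publishing in both journals and n_i the number of distinct authors in journal i, are all assumptions made for the example. The layout step with VxOrd is not shown.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def journal_links(records):
    """Build cosine-normalized journal-journal links from (author, journal) pairs.

    `records` is a hypothetical input format: one (author_id, journal) pair
    per paper, with author names already unified.
    """
    journals_by_author = defaultdict(set)
    for author, journal in records:
        journals_by_author[author].add(journal)

    authors_per_journal = defaultdict(int)  # n_i: distinct authors in journal i
    shared = defaultdict(int)               # c_ij: authors publishing in both i and j
    for journals in journals_by_author.values():
        for journal in journals:
            authors_per_journal[journal] += 1
        # Every pair of journals used by the same author gets one link.
        for a, b in combinations(sorted(journals), 2):
            shared[(a, b)] += 1

    # Assumed cosine normalization: c_ij / sqrt(n_i * n_j).
    return {
        (a, b): count / sqrt(authors_per_journal[a] * authors_per_journal[b])
        for (a, b), count in shared.items()
    }


# Toy data (invented for illustration only).
records = [
    ("a1", "J Chem"), ("a1", "J Phys"),
    ("a2", "J Chem"), ("a2", "J Phys"),
    ("a3", "J Chem"), ("a3", "J Math"),
    ("a4", "J Math"),
]
links = journal_links(records)
```

In this toy example the pair (J Chem, J Phys) gets a stronger link than (J Chem, J Math), and J Math and J Phys get no link at all because no author publishes in both.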

One problem is the fact that not all journals have links. Those journals
that do not have at least 2 links to 2 other journals are taken out of the
clustering procedure. Around 2,500 journals out of 6,000 are taken out, but
these account for only 10% of the articles. We also take away the 80 most
linked journals and put them into the clustering procedure at the last
stage. So, we perform the clustering in two rounds and obtain approx. 130
clusters; at that stage we put in the 80 big journals, and in the last
clustering we obtain about 50 clusters. Lastly, we assign the missing 2,500
journals manually, based on subject category.
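
The selection of journals before clustering could be sketched as follows. Again this is only an illustration under assumptions: the name split_journals and its parameters are hypothetical, "at least 2 links to 2 other journals" is read here as "linked to at least two other journals", and "most linked" is measured by the number of distinct neighbouring journals. Journals with no links at all never appear in the link table and would have to be added to the manually assigned group separately.

```python
from collections import defaultdict


def split_journals(links, min_neighbors=2, n_hubs=80):
    """Split journals into core, hub, and weakly linked groups.

    `links` maps (journal_a, journal_b) pairs to a link weight.
    Returns (core, hubs, weak):
      weak - linked to fewer than `min_neighbors` other journals
             (left for manual assignment by subject category),
      hubs - the `n_hubs` most linked journals (added at the last stage),
      core - everything else (clustered first).
    """
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    weak = {j for j, ns in neighbors.items() if len(ns) < min_neighbors}
    connected = [j for j in neighbors if j not in weak]

    # Rank by number of distinct neighbours; ties broken alphabetically.
    ranked = sorted(connected, key=lambda j: (-len(neighbors[j]), j))
    hubs = set(ranked[:n_hubs])
    core = set(ranked[n_hubs:])
    return core, hubs, weak


# Toy link table (invented for illustration only).
toy_links = {
    ("A", "B"): 1.0, ("A", "C"): 1.0, ("A", "D"): 1.0,
    ("B", "C"): 1.0, ("D", "E"): 1.0,
}
core, hubs, weak = split_journals(toy_links, min_neighbors=2, n_hubs=1)
```

With the toy table, journal E has only one neighbour and is set aside, A is the single most linked journal and is held back, and B, C and D form the core that is clustered first.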

We are wondering whether there are any viable alternatives to this methodology,
or another strategy to achieve clusters (fields) of research communities
based on conditions for productivity.

If anyone would like more background to this question, we have an
article, Sandstrom, Ulf & Sandstrom, Erik (2007) "A metric for academic
performance applied to Australian universities 2001-2004", in the E-LIS
archive (ID-code: 11776).

It should be underlined that in that paper we followed a different strategy for
clustering of papers compared to the one we propose above. In the paper we
used the author behavior over ISI subject codes as the basis for clustering.

The article is a methodological attempt to show that all necessary
information on publication output from universities is already available
from the Web of Science. By using what could be called an iceberg
methodology we develop productivity figures per university and combine them
with field-normalized citation rates. The result is a "size-dependent"
indicator for universities in a specific country. In order to use the
proposed method you will have to decide on a reference base for the country
in focus; for Sweden we used the Nordic countries. The method was developed
for Sweden and is presented within the recently issued governmental white
paper on competitive funding or formula-based block-grants (see
<> at page 21-28). Just
as an illustration, in this article, we apply the method to Australian


In a series of papers published during the second half of the 1980s, the
Budapest group (Braun, Glänzel, Telcs and Schubert) proposed that
bibliometric distributions are to be characterized as Waring distributions.
We use their methodology in order to establish a reference value for
academic production within macro classes. From this we develop a combined
performance model for academic research and apply the model to Australian
research. This model takes advantage of, first, field-normalized publication
rates (the productivity dimension) and, second, field-normalized citation
rates (the quality dimension). Based on ISI data, the performance of
Australian universities is depicted in a more resource-efficient way than
in other models.


KEYWORDS: performance-based funding; formula-based funding; generalized
Waring distributions; bibliometrics


Best regards, 

Ulf Sandström

Linkoping University, Sweden

ulfsa at


Linköpings universitet                 Kungl Tekniska Högskolan

ISAK                                         Industriell dynamik

581 83 Linköping                        100 44 Stockholm


0708-137376                              08-790 9810


Visit my homepage


More information about the SIGMETRICS mailing list