Morris, Steven (BA) Steven.Morris at BAKERHUGHES.COM
Sat Jan 27 11:55:27 EST 2007


I agree that you'd probably only find a weak correlation between number
of references cited and citations received if you don't distinguish
between the type of paper (review or not) and the way it is used as a
reference (well-cited exemplar reference or not).

In my mind the relation is very much tied to the dynamics of specialty
growth.  In a recent paper [1] I asserted that after a discovery that
prompts the birth of a specialty, there is a period of rapid growth in
the specialty where scientists extend the discovery, and present
evidence to support those extensions. The discovery paper and other
early important papers become heavily cited 'exemplar references' during
this growth period. At the end of the growth period, 'consolidation'
review papers appear that codify and summarize the newly generated base
knowledge in the new specialty. These consolidation papers can become
highly cited exemplar references in the sense that they are cited as
summaries of collected base knowledge. Some of these reviews become
highly cited and some don't. I suspect it has to do with timing
(written at a point when the newly generated knowledge was ready to be
codified), quality and comprehensiveness, and the perceived authority of
the review author.

Given the growth and exemplar process described above, you'd expect the
following:

1) Discovery papers, written before all the base knowledge in the
specialty is generated, wouldn't cite many references, but would be
cited heavily. I think there is evidence out there that discovery papers
tend to have few references. I heard Kate McCain mention this once at a
conference ;-),   but I don't have a reference to support that. 

2) Consolidation papers, written to summarize base knowledge immediately
after initial growth, would cite many references and be cited heavily.
Here, the problem is that only some of the consolidation papers become
exceptionally heavily cited exemplar references (the winning reviews
that provide the first good consolidation of the new knowledge), while
others may just be cited at a 'normal' rate for reviews, which is
probably a greater rate than non-review papers.  

Some notes: 

1) There is certainly evidence that the mean number of references per
paper increases over time. I've read this in the literature (though I
can't recall where) and I've seen this in all specialty-specific data
sets where I've bothered to check it. I think this is a function of
specialty growth:  The network of base knowledge in the specialty gets
more intricate as the specialty grows and 'fills in the blanks', so
authors of later papers have to cite more 'marker references' (Hargens'
term [3]) to describe the position of the contribution of their papers
in the network of base knowledge in the specialty...    

2) There is a correlation between the mean number of references per
paper and the length of the papers. Evidence for this is given by
Abt [2]. So any correlations you find between number of references in the
paper and the number of citations it receives may be related to length
of papers. 

3) In my experience, the number of references per paper is log-normally
distributed, and the mode of that distribution varies from one specialty
to another.  Now, this fact
totally baffles me.  What social or cognitive process would cause this
distribution to appear?   Is it tied to the same process that governs
the distribution of length of papers? Some sort of proportional growth
process? It's a mystery wrapped in an enigma!  If you figure out what
generates that log-normal distribution, I'll send you a one pound bottle
of Tupelo honey as a prize....
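
One candidate for the "proportional growth process" mentioned above is a
multiplicative random process: if a reference list grows in steps, and each
step scales the current length by an independent random factor, then the log
of the final length is a sum of random terms and tends toward a normal
distribution. The sketch below only illustrates that mechanism. It is not a
claim about what actually generates the observed distribution, and the
function names, step count, and factor range are all assumptions made up for
the example.

```python
import math
import random
import statistics


def simulate_reference_counts(n_papers, n_steps=30, low=0.9, high=1.25, seed=0):
    """Hypothetical model: each paper's reference count starts at 1 and is
    multiplied by an independent random factor at every step."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_papers):
        size = 1.0
        for _ in range(n_steps):
            size *= rng.uniform(low, high)
        counts.append(size)
    return counts


def skewness(values):
    """Sample skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


counts = simulate_reference_counts(5000)
logs = [math.log(c) for c in counts]

# A log-normal variable is right-skewed on the raw scale and roughly
# symmetric after taking logs.
print("skewness of counts:     ", round(skewness(counts), 2))
print("skewness of log(counts):", round(skewness(logs), 2))
```

If real reference counts behave the same way (skewed on the raw scale,
symmetric in logs), that is consistent with a proportional growth process
but does not prove it, since other mechanisms can produce similar shapes.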

Some other notes: 

If you want to study the correlation of references per paper to
citations received, I suggest the following:

1) Gather specialty-specific collections of papers for your studies. The
heterogeneity in a large multiple-specialty study will totally screw up
the statistics...   You should get  about 1000 papers citing about
20,000 references for each specialty study...
2) Separate your references in the collection into 'exemplar' and
'non-exemplar'. You can do this by applying a citation threshold, see
3) Arrange the exemplar references serially by the order of their
appearance in the specialty.  I have some SQL queries I can send you for
doing this. 
4) Look for 'discovery' references at the beginning of this sequence,
and 'consolidation' references at the end of the sequence. 
5) Study the correlation for 6 classes of reference: 1- general
references, 2- general references less exemplar references, 3- discovery
exemplar references, 4- consolidation exemplar references, 5- general
review references, 6- general review references less exemplar



[1] Morris, S. A., 2005, "Manifestation of emerging specialties in
journal literature: a growth model of papers, references, exemplars,
bibliographic coupling, cocitation, and clustering coefficient
distribution", JASIST, 56(12), 1250-1273.
[2] Abt, H. A., 2000, "The reference-frequency relation in the physical
sciences", Scientometrics, 49(3), 443-451.
[3] Hargens, L. L., 2000, "Using the literature: Reference networks,
reference contexts, and the social structure of scholarship", American
Sociological Review, 65(6), 846-865.


Steven A. Morris, Ph.D
Electrical Engineer V, Technology Development Group
Houston Technology Center
2001 Rankin Road, Houston, Texas 77073
Office: 713-625-5055, Cell: 405-269-6576

-----Original Message-----
From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Stephen J Bensman
Sent: Saturday, January 27, 2007 8:30 AM
Subject: Re: [SIGMETRICS] question

It is well known that review articles summarizing research receive on
average more citations than other types of articles.  Your question is
considered in the book below:

Narin, F.  (1976).  Evaluative bibliometrics: The use of publication and
citation analysis in the evaluation of scientific activity.  Cherry
Hill, NJ: Computer Horizons, Inc.

Here Narin writes:

CHI (Narin, 1976, pp. 183-219) developed its "influence" method in a
report prepared for the National Science Foundation.  In this report it
described Garfield's impact factor as suffering from three basic faults
(p. 184).  First, although the impact factor corrects for journal size, it
does not correct for average length of articles, and this caused journals
that published longer articles, such as review journals, to have higher
impact factors.
My guess is that you would find no or low correlation between length of
reference lists and number of citations, but, if you used a chi-squared test
of independence, you would find a strong positive association, with review
articles dominant in the high reference/high citation cell.  As usual, it would be
best to do this test with well-defined subject sets rather than globally, to
reduce the influence of exogenous subject variables.  However, Narin seems to
have been of a different opinion with respect to correlation, so you might look
at what he did.
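
As a rough illustration of the chi-squared test of independence suggested
above, the sketch below computes the Pearson statistic for a 2x2 table of
papers split by reference-list length and citation count. The counts are
invented for the example, the high/low cutoffs are left unspecified, and the
function name is hypothetical. No continuity correction is applied.

```python
def chi_squared_2x2(table):
    """Pearson chi-squared statistic for a 2x2 contingency table given as
    [[a, b], [c, d]], without continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts of papers:
#          high citations, low citations
table = [[30, 20],   # long reference lists
         [10, 40]]   # short reference lists

stat = chi_squared_2x2(table)

# Critical value of the chi-squared distribution with 1 degree of freedom
# at the 5% significance level.
CRITICAL_5PCT_DF1 = 3.841

print("chi-squared statistic:", round(stat, 3))
print("reject independence at 5%:", stat > CRITICAL_5PCT_DF1)
```

A significant statistic says only that the two classifications are
associated; checking which cell carries the excess (for example, the long
list/high citation cell) still requires comparing observed and expected
counts cell by cell.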


Ronald Rousseau <ronald.rousseau at KHBO.BE> wrote (07:33:34 AM):

Subject:    [SIGMETRICS] question

Dear colleagues,

Is there a positive correlation between the length of the reference list
of a publication and the number of citations received? Is this true (or not)
in general, i.e. considering all types of publication? And what if one only
considers 'normal articles', that is, when reviews and letters (and other
communications) are not taken into account?

Can someone point me to a reference?



Ronald Rousseau
KHBO (Association K.U.Leuven)- Industrial Sciences and Technology
Zeedijk 101    B-8400  Oostende   Belgium
Guest Professor at the Antwerp University School for Library and
   Information Science (UA - IBW)
E-mail: ronald.rousseau at
web page:

