question

Mon Jan 29 02:26:38 EST 2007

Dear Morris,

The question of the relationship between the number of times a paper is
cited (citations received) and the number of references it contains
was first posed in the 1960s (Derek J. De Solla Price, Science 149,
510-515, 1965).
I have studied a sample of 467 papers published in Scientometrics from
1999 to 2003 and counted the citations they received until 2005. A
chi-square test of independence showed that the two indicators are
dependent (at the 0.01 level of significance). The linear correlation
coefficient between them turned out to be 0.799. For details, please
see the paper in the Proceedings of the International Workshop on
Webometrics, Informetrics and Scientometrics & Seventh Collnet
Meeting, 10-12 May 2006, Inist-Loria, Nancy, France.
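
For anyone who wants to try the same two checks on their own data, here
is a minimal Python sketch. The counts and the high/low cut-offs are
invented for illustration; they are not the 467-paper sample, and the
2x2 split is only one assumed way to build the contingency table.

```python
from math import sqrt

# Critical value of chi-square with 1 degree of freedom at the 0.01 level.
CHI2_CRIT_DF1_001 = 6.635


def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def chi_square_2x2(xs, ys, x_cut, y_cut):
    """Chi-square statistic for a 2x2 table of (low/high x) by (low/high y).

    Assumes every row and column of the table is non-empty; otherwise an
    expected count is zero and the statistic is undefined.
    """
    table = [[0, 0], [0, 0]]
    for x, y in zip(xs, ys):
        table[int(x > x_cut)][int(y > y_cut)] += 1
    n = len(xs)
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, table


# Hypothetical toy data: references per paper and citations received.
n_refs = [5, 8, 12, 15, 20, 25, 30, 40, 50, 60]
n_cites = [1, 0, 3, 2, 6, 5, 9, 12, 15, 20]

r = pearson_r(n_refs, n_cites)
chi2, table = chi_square_2x2(n_refs, n_cites, x_cut=22, y_cut=5)
print("r =", round(r, 3))
print("chi2 =", round(chi2, 3), "dependent at 0.01:", chi2 > CHI2_CRIT_DF1_001)
```

With so few toy papers the test has little power; a real sample would
also need expected cell counts large enough for the chi-square
approximation to hold.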
--------------
> Administrative info for SIGMETRICS (for example unsubscribe):
> http://web.utk.edu/~gwhitney/sigmetrics.html
> 
> Ronald,
> 
> I agree that you'd probably only find a weak correlation between the
> number of references cited and citations received if you don't
> distinguish papers by type (review or not) and by the way they are
> used as references (well-cited exemplar reference or not).
> 
> In my mind the relation is very much tied to the dynamics of
> specialty growth. In a recent paper [1] I asserted that after a
> discovery that prompts the birth of a specialty, there is a period of
> rapid growth in the specialty where scientists extend the discovery
> and present evidence to support those extensions. The discovery paper
> and other early important papers become heavily cited 'exemplar
> references' during this growth period. At the end of the growth
> period, 'consolidation' review papers appear that codify and summarize
> the newly generated base knowledge in the new specialty. These
> consolidation papers can become highly cited exemplar references in
> the sense that they are cited as summaries of collected base
> knowledge. Some of these reviews become highly cited, some don't. I
> suspect it has to do with timing (written at a point when the newly
> generated knowledge was ready to be codified), quality and
> comprehensiveness, and the perceived authority of the review author.
> 
> Given the growth and exemplar process described above, you'd expect
> the following:
> 
> 1) Discovery papers, written before all the base knowledge in the
> specialty is generated, wouldn't cite many references, but would be
> cited heavily. I think there is evidence out there that discovery
> papers tend to have few references. I heard Kate McCain mention this
> once at a conference ;-), but I don't have a reference to support
> that.
> 
> 2) Consolidation papers, written to summarize base knowledge
> immediately after initial growth, would cite many references and be
> cited heavily. Here, the problem is that only some of the
> consolidation papers become exceptionally heavily cited exemplar
> references (the winning reviews that provide the first good
> consolidation of the new knowledge), while others may just be cited at
> a 'normal' rate for reviews, which is probably a greater rate than
> non-review papers.
> 
> Some notes: 
> 
> 1) There is certainly evidence that the mean number of references per
> paper increases over time. I've read this in the literature (though I
> can't recall where) and I've seen this in all specialty-specific data
> sets where I've bothered to check it. I think this is a function of
> specialty growth: the network of base knowledge in the specialty gets
> more intricate as the specialty grows and 'fills in the blanks', so
> authors of later papers have to cite more 'marker references'
> (Hargens' term [3]) to describe the position of the contribution of
> their papers in the network of base knowledge in the specialty...
> 
> 2) There is a correlation between the mean number of references per
> paper and the length of the papers. Evidence for this is given by
> Abt [2]. So any correlations you find between number of references in
> the paper and the number of citations it receives may be related to
> length of papers.
> 
> 3) In my experience, I find that the number of references per paper
> is log-normally distributed and that the mode of that distribution
> varies from one specialty to another. Now, this fact totally baffles
> me. What social or cognitive process would cause this distribution to
> appear? Is it tied to the same process that governs the distribution
> of length of papers? Some sort of proportional growth process? It's a
> mystery wrapped in an enigma! If you figure out what generates that
> log-normal distribution, I'll send you a one pound bottle of Tupelo
> honey as a prize....
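
One rough way to check the log-normal claim on a collection is to take
logarithms of the reference counts and see whether they look normal. The
sketch below is only an illustration: the counts are synthetic (drawn
from a log-normal with made-up parameters), and the function names are
hypothetical helpers, not anything from the papers cited in this thread.

```python
import math
import random
import statistics


def fit_lognormal(ref_counts):
    """Estimate (mu, sigma) of log(count); papers with zero references are skipped."""
    logs = [math.log(c) for c in ref_counts if c > 0]
    return statistics.fmean(logs), statistics.stdev(logs)


def lognormal_mode(mu, sigma):
    """Mode of a log-normal distribution with log-mean mu and log-sd sigma."""
    return math.exp(mu - sigma ** 2)


def log_skewness(ref_counts):
    """Skewness of log(count); near 0 if the counts are roughly log-normal."""
    logs = [math.log(c) for c in ref_counts if c > 0]
    m = statistics.fmean(logs)
    s = statistics.pstdev(logs)
    return sum(((v - m) / s) ** 3 for v in logs) / len(logs)


# Synthetic stand-in for one specialty's reference counts (assumed parameters).
rng = random.Random(0)
synthetic_counts = [max(1, round(rng.lognormvariate(3.0, 0.5))) for _ in range(5000)]

mu, sigma = fit_lognormal(synthetic_counts)
print("mu =", round(mu, 3), "sigma =", round(sigma, 3))
print("estimated mode =", round(lognormal_mode(mu, sigma), 1))
print("skewness of logs =", round(log_skewness(synthetic_counts), 3))
```

Comparing the estimated mode across specialty collections would be one
way to look at the specialty-to-specialty variation mentioned above.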
> 
> Some other notes: 
> 
> If you want to study the correlation of references per paper to
> citations received, I suggest the following:
> 
> 1) Gather specialty-specific collections of papers for your studies.
> The heterogeneity in a large multiple-specialty study will totally
> screw up the statistics... You should get about 1000 papers citing
> about 20,000 references for each specialty study...
> 2) Separate your references in the collection into 'exemplar' and
> 'non-exemplar'. You can do this by applying a citation threshold; see
> [1].
> 3) Arrange the exemplar references serially by the order of their
> appearance in the specialty. I have some SQL queries I can send you
> for doing this.
> 4) Look for 'discovery' references at the beginning of this sequence,
> and 'consolidation' references at the end of the sequence.
> 5) Study the correlation for 6 classes of reference: 1- general
> references, 2- general references less exemplar references, 3-
> discovery exemplar references, 4- consolidation exemplar references,
> 5- general review references, 6- general review references less
> exemplar references.
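
Steps 2 to 4 above could be sketched in Python roughly as follows. This
is not the SQL mentioned in step 3, and everything named here is an
assumption for illustration: the Reference fields, the threshold value,
and the use of the year a reference is first cited as its 'order of
appearance'.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    ref_id: str
    times_cited: int       # citations received within the specialty collection
    first_cited_year: int  # assumed proxy for when it appeared in the specialty


def split_exemplars(refs, threshold):
    """Step 2: exemplars are references cited at least `threshold` times."""
    exemplars = [r for r in refs if r.times_cited >= threshold]
    non_exemplars = [r for r in refs if r.times_cited < threshold]
    return exemplars, non_exemplars


def order_by_appearance(exemplars):
    """Step 3: arrange exemplars serially by order of appearance."""
    return sorted(exemplars, key=lambda r: (r.first_cited_year, r.ref_id))


def sequence_ends(ordered, k):
    """Step 4: candidates for 'discovery' (first k) and 'consolidation' (last k)."""
    return ordered[:k], ordered[-k:]


# Hypothetical toy collection.
refs = [
    Reference("A", 50, 1990),
    Reference("B", 3, 1992),
    Reference("C", 40, 1988),
    Reference("D", 35, 1999),
]
exemplars, non_exemplars = split_exemplars(refs, threshold=20)
ordered = order_by_appearance(exemplars)
discovery, consolidation = sequence_ends(ordered, k=1)
print([r.ref_id for r in ordered])
```

The candidates at either end of the sequence would still need to be
checked by hand; position alone does not make a reference a discovery
paper or a consolidating review.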
> 
> Thanks,
> 
> Steve
> 
> [1] Morris, S. A., 2005, "Manifestation of emerging specialties in
> journal literature: a growth model of papers, references, exemplars,
> bibliographic coupling, cocitation, and clustering coefficient
> distribution", JASIST, 56(12), 1250-1273.
> [2] Abt, H. A., 2000, "The reference-frequency relation in the
> physical sciences", Scientometrics, 49(3), 443-451.
> [3] Hargens, L. L., 2000, "Using the literature: Reference networks,
> reference contexts, and the social structure of scholarship",
> American Sociological Review, 65(6), 846-865.
> 
>  
> 
> =================================================
> Steven A. Morris, Ph.D
> Electrical Engineer V, Technology Development Group
> Baker-Atlas/INTEQ
> Houston Technology Center
> 2001 Rankin Road, Houston, Texas 77073
> Office: 713-625-5055, Cell: 405-269-6576
> 
> 
> -----Original Message-----
> From: ASIS&T Special Interest Group on Metrics
> [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Stephen J Bensman
> Sent: Saturday, January 27, 2007 8:30 AM
> To: SIGMETRICS at LISTSERV.UTK.EDU
> Subject: Re: [SIGMETRICS] question
> 
> It is well known that review articles summarizing research receive on
> average more citations than other types of articles. Your question is
> considered in the book below:
> 
> Narin, F. (1976). Evaluative bibliometrics: The use of publication
> and citation analysis in the evaluation of scientific activity.
> Cherry Hill, NJ: Computer Horizons, Inc.
> 
> Here Narin writes:
> 
> CHI (Narin, 1976, pp. 183-219) developed its "influence" method in a
> report prepared for the National Science Foundation. In this report
> it criticized Garfield's impact factor as suffering from three basic
> faults (p. 184). First, although the impact factor corrects for
> journal size, it does not correct for average length of articles, and
> this caused journals that published longer articles, such as review
> journals, to have higher impact factors.
> 
> 
> 
> My guess is that you would find no or low correlation between length
> of reference lists and number of citations, but, if you used a
> chi-squared test of independence, you would find a strong positive
> association, with review articles dominant in the high reference/high
> citation cell. As usual, it would be best to do this test with
> well-defined subject sets rather than globally, to avoid the influence
> of exogenous subject variables. However, Narin seems to have been of a
> different opinion in respect to correlation, so you might look at what
> he did.
> 
> SB
> 
> 
> 
> 
> Ronald Rousseau <ronald.rousseau at KHBO.BE>@listserv.utk.edu> on
> 01/27/2007 07:33:34 AM
> 
> Please respond to ASIS&T Special Interest Group on Metrics
>        <SIGMETRICS at listserv.utk.edu>
> 
> Sent by:    ASIS&T Special Interest Group on Metrics
>        <SIGMETRICS at listserv.utk.edu>
> 
> 
> To:    SIGMETRICS at listserv.utk.edu
> cc:     (bcc: Stephen J Bensman/notsjb/LSU)
> 
> Subject:    [SIGMETRICS] question
> 
> Dear colleagues,
> 
> Is there a positive correlation between the length of a reference list
> of a publication and the number of citations received? Is this true
> (or not) in general, i.e. considering all types of publication? And
> what if one only considers 'normal articles', that is, when reviews
> and letters (and other short communications) are not taken into
> account?
> 
> Can someone point me to a reference?
> 
> Thanks!
> 
> Ronald
> 
> 
> --
> Ronald Rousseau
> KHBO (Association K.U.Leuven)- Industrial Sciences and Technology
> Zeedijk 101    B-8400  Oostende   Belgium
> Guest Professor at the Antwerp University School for Library and
> Information Science (UA - IBW)
> E-mail: ronald.rousseau at khbo.be
> web page:  http://users.telenet.be/ronald.rousseau
> 
> 
> 