Fwd: reply on OA to sigmetrics

Stevan Harnad harnad at ECS.SOTON.AC.UK
Fri Dec 8 09:33:22 EST 2006


This is submitted to Sigmetrics at the request of Henk Moed, whose
email address has changed, so he cannot post to Sigmetrics himself.
(I will reply shortly.) -- Stevan Harnad

Begin forwarded message:

>  Text to be submitted to Sigmetrics
>
>
> Dear Stevan,
>
> Below follow some replies to your comments on my preprint ‘The
> effect of 'Open Access' upon citation impact: An analysis of
> ArXiv's Condensed Matter Section’, available at
> http://arxiv.org/abs/cs.DL/0611060.
>
> Henk F. Moed
> Centre for Science and Technology Studies
> Leiden University, The Netherlands
> Moed at cwts.leidenuniv.nl
>
> 1. Early view effect
>
> In my case study on 6 journals in the field of condensed matter  
> physics, I concluded that the observed differences between the  
> citation age distributions of deposited and non-deposited ArXiv  
> papers can to a large extent – though not fully – be explained by  
> the publication delay of about six months of non-deposited articles  
> compared to papers deposited in ArXiv. This outcome provides  
> evidence for an early view effect upon citation impact rates, and  
> consequently upon ArXiv citation impact differentials (CID, my
> term) or Arxiv Advantage (AA, your term).
>
> You wrote: “The basic question is this: Once the AA has been  
> adjusted for the "head-start" component of the EA (by comparing  
> articles of equal age -- the age of Arxived articles being based on
> the date of deposit of the preprint rather than the date of  
> publication of the postprint), how big is that adjusted AA, at each  
> article age? For that is the AA without any head-start. Kurtz never  
> thought the EA component was merely a head start, however, for the  
> AA persists and keeps growing, and is present in cumulative  
> citation counts for articles at every age since Arxiving began”.
>
> Figure 2 in the interesting paper by Kurtz et al. (IPM, v. 41, p.  
> 1395-1402, 2005) does indeed show an increase in the very short  
> term average citation impact (my terminology; citations were  
> counted during the first 5 months after publication date) of papers  
> as a function of their publication date from 1996 onward. My
> interpretation of this figure is that it clearly shows that the  
> principal component of the early view effect is the head-start: it  
> reveals that the share of astronomy papers deposited in ArXiv (and  
> other preprint servers) increased over time. More and more papers  
> became available at the date of their submission to a journal,  
> rather than on their formal publication date. I therefore conclude  
> that their findings for astronomy are fully consistent with my  
> outcomes for journals in the field of condensed matter physics.
>
> 2. Quality bias
>
> You wrote: “The fact that highly-cited articles (Kurtz) and  
> articles by highly-cited authors (Moed) are more likely to be  
> Arxived certainly does not settle the question of cause and effect:  
> It is just as likely that better articles benefit more from
> Arxiving (QA) as that better authors/articles tend to
> Arxive/be-Arxived more (QB)”.
>
> I am fully aware that in this research context one cannot assess  
> whether authors publish their better papers in the ArXiv merely on  
> the basis of comparing citation rates of archived and non-archived  
> papers, and I mention this in my paper. Citation rates may be  
> influenced both by the ‘quality’ of the papers and by the access  
> modality (deposited versus non-deposited). This is why I estimated
> author prominence on the basis of the citation impact of their
> non-archived articles only. But even then I found evidence that
> prominent, influential authors (in the above sense) are  
> overrepresented in papers deposited in ArXiv.
>
> But I did more than that. I calculated Arxiv Citation Impact
> Differentials (CID, my term, or ArXiv Advantage, AA, your term) at  
> the level of individual authors. Next, I calculated the median CID  
> over authors publishing in a journal. How then do you explain my  
> empirical finding that for some authors the citation impact  
> differential (CID) or ArXiv Advantage is positive, for others it is  
> negative, while the median CID over authors does not significantly  
> differ from zero (according to a Sign test) for all journals  
> studied in detail except Physical Review B, for which it is only 5  
> per cent? If there is a genuine ‘OA advantage’ at stake, why then  
> does it for instance not lead to a significantly positive median  
> CID over authors? Therefore, my conclusion is that, controlling for  
> quality bias and early view effect, in the sample of 6 journals  
> analysed in detail in my study, there is no sign of a general ‘open  
> access advantage’ of papers deposited in ArXiv’s Condensed Matter  
> Section.
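
For readers unfamiliar with the Sign test mentioned above, here is a minimal sketch of how a two-sided exact sign test on per-author differentials could be computed. It is an illustration only, not Moed's actual computation: the function name `sign_test` and the CID values are hypothetical and are not taken from the preprint.

```python
from math import comb

def sign_test(differentials):
    """Two-sided exact sign test of H0: the median differential is zero.

    Returns (number positive, number negative, p-value).
    Zero differentials (ties) are dropped, as is conventional.
    """
    pos = sum(1 for d in differentials if d > 0)
    neg = sum(1 for d in differentials if d < 0)
    n = pos + neg
    if n == 0:
        return pos, neg, 1.0
    k = min(pos, neg)
    # Probability of k or fewer of the rarer sign under Binomial(n, 1/2)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return pos, neg, min(1.0, 2 * tail)

# Hypothetical per-author CID values (per cent), invented for illustration.
cid = [12.0, -8.5, 3.1, -4.0, 20.2, -15.3, 0.0, 6.7, -2.2, 1.4]
pos, neg, p = sign_test(cid)
print(pos, neg, round(p, 3))
```

With positive and negative differentials nearly balanced, as in this made-up sample, the p-value is large and the median cannot be distinguished from zero; a sample dominated by one sign would give a small p-value.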
>
> 3. Productive versus less productive authors
>
> My analysis of differences in Citation Impact differentials between  
> productive and less productive authors may seem “a little  
> complicated”. My point is that if one selects from a set of papers  
> deposited in ArXiv a paper authored by a junior (or less
> productive) scientist, the probability that this paper is
> co-authored by a senior (or more productive) author is higher than
> it is for a paper authored by a junior scientist but not deposited in
> ArXiv. Next, I found that papers co-authored by both productive and  
> less productive authors tend to have a higher citation impact than  
> articles authored solely by less productive authors, regardless of  
> whether these papers were deposited in ArXiv or not. These outcomes
> lead me to the conclusion that the observed higher CID for less
> productive authors compared to that of productive authors can be  
> interpreted as a quality bias.
>
> 4. General comments
>
> In the citation analysis by Kurtz et al. (2005), both the
> citation and target universe contain a set of 7 core journals in
> astronomy. They explain their finding of no apparent OA effect in
> their study of these journals by postulating that “essentially all
> astronomers have access to the core journals through existing
> channels”. In my study the target set consists of a limited number
> of core journals in condensed matter physics, but the citation
> universe is as large as the total Web of Science database,
> including also a number of more peripheral journals in the field.
> Therefore, my result is stronger than that obtained by Kurtz et
> al.: even in this much wider citation universe, I do not find
> evidence for an OA advantage effect.
>
> I realize that my study is a case study, examining in detail 6  
> journals in one subfield. I fully agree with your warning that one  
> should be cautious in generalizing conclusions from case studies,  
> and that results for other fields may be different. But it is  
> certainly not an unimportant case. It relates to a subfield in  
> physics, a discipline that your pioneering and stimulating work  
> (Harnad and Brody, D-Lib Mag., June 2004) has analysed as well at a  
> more aggregate level. I hope that more case studies will be carried  
> out in the near future, applying the methodologies I proposed in my  
> paper.
>
> From: ASIS&T Special Interest Group on Metrics on behalf of Stevan  
> Harnad
> Sent: Mon 11/20/2006 22:49
> To: SIGMETRICS at LISTSERV.UTK.EDU
> Subject: [SIGMETRICS] Self-Archiving Impact Advantage: Quality  
> Advantage or Quality Bias?
>
>     Self-Archiving Impact Advantage: Quality Advantage or Quality Bias?
>
>                  Stevan Harnad
>
>     SUMMARY: In astrophysics, Kurtz found that articles that were
>     self-archived by their authors in Arxiv were downloaded and cited
>     twice as much as those that were not. He traced this enhanced
>     citation impact to two factors: (1) Early Access (EA): The
>     self-archived preprint was accessible earlier than the publisher's
>     version (which is accessible to all research-active
>     astrophysicists as soon as it is published, thanks to Kurtz's ADS
>     system). (Hajjem, however, found that in other fields, which
>     self-archive only published postprints and do have
>     accessibility/affordability problems with the publisher's version,
>     self-archived articles still have enhanced citation impact.)
>     Kurtz's second factor was: (2) Quality Bias (QB), a selective
>     tendency for higher quality articles to be preferentially
>     self-archived by their authors, as inferred from the fact that the
>     proportion of self-archived articles turns out to be higher among
>     the more highly cited articles. (The very same finding is of
>     course equally interpretable as (3) Quality Advantage (QA), a
>     tendency for higher quality articles to benefit more than lower
>     quality articles from being self-archived.) In condensed-matter
>     physics, Moed has confirmed that the impact advantage occurs early
>     (within 1-3 years of publication). After article-age is adjusted
>     to reflect the date of deposit rather than the date of
>     publication, the enhanced impact of self-archived articles is
>     again interpretable as QB, with articles by more highly cited
>     authors (based only on their non-archived articles) tending to be
>     self-archived more. (But since the citation counts for authors and
>     for their articles are correlated, one would expect much the same
>     outcome from QA too.) The only way to test QA vs. QB is to compare
>     the impact of self-selected self-archiving with mandated
>     self-archiving (and no self-archiving). (The outcome is likely to
>     be that both QA and QB contribute, along with EA, to the impact
>     advantage.)
>
> Michael Kurtz's papers have confirmed that in astronomy/astrophysics
> (astro), articles that have been self-archived -- let's call this
> "Arxived" to mark it as the special case of depositing in the central
> Physics Arxiv -- are cited (and downloaded) twice as much as
> non-Arxived articles. Let's call this the "Arxiv Advantage" (AA).
> http://arxiv.org/
>
>     Henneken, E. A., Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant,
>     C., Thompson, D., and Murray, S. S. (2006) Effect of E-printing
>     on Citation Rates in Astronomy and Physics. Journal of Electronic
>     Publishing, Vol. 9, No. 2
>     http://arxiv.org/abs/cs/0604061
>
>     Henneken, E. A., Kurtz, M. J., Warner, S., Ginsparg, P.,
>     Eichhorn, G., Accomazzi, A., Grant, C. S., Thompson, D.,
>     Bohlen, E. and Murray, S. S. (2006) E-prints and Journal
>     Articles in Astronomy: a Productive Co-existence (submitted to
>     Learned Publishing)
>     http://arxiv.org/abs/cs/0609126
>
>     Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant, C. S.,
>     Demleitner, M., Murray, S. S. (2005) The Effect of Use and
>     Access on Citations. Information Processing and Management,
>     41 (6): 1395-1402
>     http://cfa-www.harvard.edu/~kurtz/kurtz-effect.pdf
>
> Kurtz analyzed AA and found that it consisted of at least 2
> components:
>
> (1) EARLY ACCESS (EA): There is no detectable AA for old articles in
> astro: AA occurs while an article is young (1-3 years). Hence astro
> articles that were made accessible as preprints before publication
> show more AA: This is the Early Access effect (EA). But EA alone does
> not explain why AA effects (i.e., enhanced citation counts) persist
> cumulatively and even keep growing, rather than simply being a
> phase-advancing of otherwise un-enhanced citation counts, in which
> case simply re-calculating an article's age so as to begin at
> preprint deposit time instead of publication time should eliminate
> all AA effects -- which it does not.
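
The age re-calculation described in (1) can be sketched as follows. This is a hedged illustration, not the procedure used by Kurtz or Moed: the helper names `months_between` and `citations_within`, the record layout, and all dates are hypothetical, and age is measured in whole calendar months for simplicity.

```python
from datetime import date

def months_between(start, end):
    """Whole calendar months from start to end (negative if end is earlier)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def citations_within(paper, window_months, from_deposit=True):
    """Count citations received in the first window_months of an article's life.

    With from_deposit=True the age clock of an Arxived article starts at the
    preprint deposit date; otherwise (or if there is no deposit date) it
    starts at the publication date.
    """
    start = paper["published"]
    if from_deposit and paper.get("deposited"):
        start = paper["deposited"]
    return sum(
        1 for cited_on in paper["citations"]
        if 0 <= months_between(start, cited_on) < window_months
    )

# Hypothetical record, invented for illustration (not data from any study).
arxived = {
    "deposited": date(2000, 1, 10),
    "published": date(2000, 7, 15),
    "citations": [date(2000, 3, 1), date(2000, 9, 1), date(2000, 12, 1),
                  date(2001, 2, 1), date(2001, 6, 1)],
}

by_deposit = citations_within(arxived, 12, from_deposit=True)
by_publication = citations_within(arxived, 12, from_deposit=False)
print(by_deposit, by_publication)
```

Comparing Arxived and non-Arxived articles with the deposit-date clock removes the pure head start; any difference in counts that remains at equal age is the part of AA that EA as a mere phase advance cannot account for.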
>
> (2) QUALITY BIAS (QB): (Kurtz called the second component
> "Self-Selection Bias" for quality, but I call it self-selection
> Quality Bias, QB): If we compare articles within roughly the same
> citation/quality bracket (i.e., articles having the same number of
> citations), the proportion of Arxived articles becomes higher in the
> higher citation brackets, especially the top 200 papers. Kurtz
> interprets this as resulting from authors preferentially Arxiving
> their higher-quality preprints (Quality Bias).
>
> Of course the very same outcome is just as readily interpretable as
> resulting from Quality Advantage (QA) (rather than Quality Bias (QB)):
> i.e., that the Arxiving benefits better papers more. (Making a
> low-quality paper more accessible by Arxiving it does not guarantee
> more citations, whereas making a high-quality paper more accessible
> is more likely to do so, perhaps roughly in proportion to its higher
> quality, allowing it to be used and cited more according to its
> merit, unconstrained by its accessibility/affordability.)
>
> There is no way, on the basis of existing data, to decide between QA
> and QB. The only way to measure their relative contributions would be
> to control the self-selection factor: randomly imposing Arxiving on
> half of an equivalent sample of articles of the same age (from
> preprinting age to 2-3 years postpublication, reckoning age from
> deposit date, to control also for age/EA effects), and comparing also
> with self-selected Arxiving.
>
> We are trying an approximation to this method, using articles
> deposited in Institutional Repositories of institutions that mandate
> self-archiving (and comparing their citation counts with those of
> articles from the same journal/issue that have not been
> self-archived), but the sample is still small and possibly
> unrepresentative, with many gaps and other potential liabilities. So
> a reliable estimate of the relative size of QA and QB still awaits
> future research, when self-archiving mandates will have become more
> widely adopted.
>
> Henk Moed's data on Arxiving in Condensed Matter physics (cond-mat)
> replicate Kurtz's findings in astro (and Davis/Fromerth's, in math):
>
>     Moed, H. F. (2006, preprint) The effect of 'Open Access' upon
>     citation impact: An analysis of ArXiv's Condensed Matter Section
>     http://arxiv.org/abs/cs.DL/0611060
>
>     Davis, P. M. and Fromerth, M. J. (2007) Does the arXiv lead to
>     higher citations and reduced publisher downloads for mathematics
>     articles? Scientometrics, accepted for publication.
>     http://arxiv.org/abs/cs.DL/0603056
>     See critiques:
>     http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/subject.html#5221
>     http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/5440.html
>
> Moed too has shown that in cond-mat the AA effect (which he calls CID
> "Citation Impact Differential") occurs early (1-3 years) rather than
> late (4-6 years), and that there is more Arxiving by authors of
> higher-quality (based on higher citation counts for their non-Arxived
> articles) than by lower-quality authors. But this too is just as
> readily interpretable as the result of QB or QA (or both): We would
> of course expect a high correlation between an author's individual
> articles' citation counts and the author's average citation count,
> whether the author's citation count is based on Arxived or
> non-Arxived articles. These are not independent variables.
>
> (Less easily interpretable -- but compatible with either QA or QB
> interpretations -- is Moed's finding of a smaller AA for the "more
> productive" authors. Moed's explanations in terms of co-authorships
> between more productive and less productive authors, senior and
> junior, seem a little complicated.)
>
> The basic question is this: Once the AA has been adjusted for the
> "head-start" component of the EA (by comparing articles of equal
> age -- the age of Arxived articles being based on the date of deposit
> of the preprint rather than the date of publication of the
> postprint), how big is that adjusted AA, at each article age? For
> that is the AA without any head-start. Kurtz never thought the EA
> component was merely a head start, however, for the AA persists and
> keeps growing, and is present in cumulative citation counts for
> articles at every age since Arxiving began. This non-EA AA is either
> QB or QA or both. (It also has an element of Competitive Advantage,
> CA, which would disappear once everything was self-archived, but
> let's ignore that for now.)
>
>     Harnad, S. (2005) OA Impact Advantage = EA + (AA) + (QB) + QA +
>     (CA) + UA. Preprint.
>     http://eprints.ecs.soton.ac.uk/12085/
>
>
