Fwd: reply on OA to sigmetrics
Stevan Harnad
harnad at ECS.SOTON.AC.UK
Fri Dec 8 09:33:22 EST 2006
This is submitted to Sigmetrics at the request of Henk Moed, whose
email account has changed, so he cannot post to Sigmetrics.
(I will reply shortly.) -- Stevan Harnad
Begin forwarded message:
> Text to be submitted to Sigmetrics
>
>
> Dear Stevan,
>
> Below follow some replies to your comments on my preprint ‘The
> effect of 'Open Access' upon citation impact: An analysis of
> ArXiv's Condensed Matter Section’, available at
> http://arxiv.org/abs/cs.DL/0611060.
>
> Henk F. Moed
> Centre for Science and Technology Studies
> Leiden University, The Netherlands
> Moed at cwts.leidenuniv.nl
>
> 1. Early view effect
>
> In my case study on 6 journals in the field of condensed matter
> physics, I concluded that the observed differences between the
> citation age distributions of deposited and non-deposited ArXiv
> papers can to a large extent – though not fully – be explained by
> the publication delay of about six months of non-deposited articles
> compared to papers deposited in ArXiv. This outcome provides
> evidence for an early view effect upon citation impact rates, and
> consequently upon ArXiv citation impact differentials (CID, my
> term) or Arxiv Advantage (AA, your term).
>
> You wrote: “The basic question is this: Once the AA has been
> adjusted for the "head-start" component of the EA (by comparing
> articles of equal age -- the age of Arxived articles being based on
> the date of deposit of the preprint rather than the date of
> publication of the postprint), how big is that adjusted AA, at each
> article age? For that is the AA without any head-start. Kurtz never
> thought the EA component was merely a head start, however, for the
> AA persists and keeps growing, and is present in cumulative
> citation counts for articles at every age since Arxiving began”.
>
> Figure 2 in the interesting paper by Kurtz et al. (IPM, v. 41, p.
> 1395-1402, 2005) does indeed show an increase in the very short
> term average citation impact (my terminology; citations were
> counted during the first 5 months after publication date) of papers
> as a function of their publication date as from 1996. My
> interpretation of this figure is that it clearly shows that the
> principal component of the early view effect is the head-start: it
> reveals that the share of astronomy papers deposited in ArXiv (and
> other preprint servers) increased over time. More and more papers
> became available at the date of their submission to a journal,
> rather than on their formal publication date. I therefore conclude
> that their findings for astronomy are fully consistent with my
> outcomes for journals in the field of condensed matter physics.
>
> 2. Quality bias
>
> You wrote: “The fact that highly-cited articles (Kurtz) and
> articles by highly-cited authors (Moed) are more likely to be
> Arxived certainly does not settle the question of cause and effect:
> It is just as likely that better articles benefit more from
> Arxiving (QA) as that better authors/articles tend to
> Arxive/be-Arxived more (QB)”
>
> I am fully aware that in this research context one cannot assess
> whether authors publish their better papers in the ArXiv merely on
> the basis of comparing citation rates of archived and non-archived
> papers, and I mention this in my paper. Citation rates may be
> influenced both by the ‘quality’ of the papers and by the access
> modality (deposited versus non-deposited). This is why I estimated
> author prominence on the basis of the citation impact of their
> non-archived articles only. But even then I found evidence that
> prominent, influential authors (in the above sense) are
> overrepresented in papers deposited in ArXiv.
>
> But I did more than that. I calculated Arxiv Citation Impact
> Differentials (CID, my term, or ArXiv Advantage, AA, your term) at
> the level of individual authors. Next, I calculated the median CID
> over authors publishing in a journal. How then do you explain my
> empirical finding that for some authors the citation impact
> differential (CID) or ArXiv Advantage is positive, for others it is
> negative, while the median CID over authors does not significantly
> differ from zero (according to a Sign test) for all journals
> studied in detail except Physical Review B, for which it is only 5
> per cent? If there is a genuine ‘OA advantage’ at stake, why then
> does it for instance not lead to a significantly positive median
> CID over authors? Therefore, my conclusion is that, controlling for
> quality bias and early view effect, in the sample of 6 journals
> analysed in detail in my study, there is no sign of a general ‘open
> access advantage’ of papers deposited in ArXiv’s Condensed Matter
> Section.
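
For readers who want to see what a Sign test on per-author differentials involves, here is a minimal sketch. It is not the implementation used in the study; the function name and the CID values are hypothetical, chosen only for illustration.

```python
from math import comb

def sign_test_p_value(differentials):
    """Two-sided exact sign test of H0: the median differential is zero."""
    # Exact zeros carry no sign information and are dropped.
    positives = sum(1 for d in differentials if d > 0)
    negatives = sum(1 for d in differentials if d < 0)
    n = positives + negatives
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # P(X <= k) under Binomial(n, 1/2), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical per-author CIDs (per cent); NOT data from the study.
author_cids = [12.0, -8.5, 3.1, -4.0, 20.2, -15.3, 0.0, 6.7, -2.2, 1.4]
p = sign_test_p_value(author_cids)
```

With roughly as many positive as negative differentials, as in this made-up sample, the test gives no grounds to reject a zero median.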
>
> 3. Productive versus less productive authors
>
> My analysis of differences in Citation Impact differentials between
> productive and less productive authors may seem “a little
> complicated”. My point is that if one selects from a set of papers
> deposited in ArXiv a paper authored by a junior (or less
> productive) scientist, the probability that this paper is
> co-authored by a senior (or more productive) author is higher than
> it is for a paper authored by a junior scientist but not deposited in
> ArXiv. Next, I found that papers co-authored by both productive and
> less productive authors tend to have a higher citation impact than
> articles authored solely by less productive authors, regardless of
> whether these papers were deposited in ArXiv or not. These outcomes
> lead me to the conclusion that the observed higher CID for less
> productive authors compared to that of productive authors can be
> interpreted as a quality bias.
>
> 4. General comments
>
> In the citation analysis by Kurtz et al. (2005), both the
> citation and target universe contain a set of 7 core journals in
> astronomy. They explain the finding of no apparent OA effect in
> their study of these journals by postulating that “essentially all
> astronomers have access to the core journals through existing
> channels”. In my study the target set consists of a limited number
> of core journals in condensed matter physics, but the citation
> universe is as large as the total Web of Science database,
> including also a number of more peripheral journals in the field.
> Therefore, my result is stronger than that obtained by Kurtz et
> al.: even in this much wider citation universe, I do not find
> evidence for an OA advantage effect.
>
> I realize that my study is a case study, examining in detail 6
> journals in one subfield. I fully agree with your warning that one
> should be cautious in generalizing conclusions from case studies,
> and that results for other fields may be different. But it is
> certainly not an unimportant case. It relates to a subfield in
> physics, a discipline that your pioneering and stimulating work
> (Harnad and Brody, D-Lib Mag., June 2004) has analysed as well at a
> more aggregate level. I hope that more case studies will be carried
> out in the near future, applying the methodologies I proposed in my
> paper.
>
> From: ASIS&T Special Interest Group on Metrics on behalf of Stevan
> Harnad
> Sent: Mon 11/20/2006 22:49
> To: SIGMETRICS at LISTSERV.UTK.EDU
> Subject: [SIGMETRICS] Self-Archiving Impact Advantage: Quality
> Advantage or Quality Bias?
>
> Self-Archiving Impact Advantage: Quality Advantage or Quality
> Bias?
>
> Stevan Harnad
>
> SUMMARY: In astrophysics, Kurtz found that articles that were
> self-archived by their authors in Arxiv were downloaded and cited
> twice as much as those that were not. He traced this enhanced
> citation impact to two factors: (1) Early Access (EA): The
> self-archived preprint was accessible earlier than the publisher's
> version (which is accessible to all research-active astrophysicists
> as soon as it is published, thanks to Kurtz's ADS system). (Hajjem,
> however, found that in other fields, which self-archive only
> published postprints and do have accessibility/affordability
> problems with the publisher's version, self-archived articles still
> have enhanced citation impact.) Kurtz's second factor was: (2)
> Quality Bias (QB), a selective tendency for higher quality articles
> to be preferentially self-archived by their authors, as inferred
> from the fact that the proportion of self-archived articles turns
> out to be higher among the more highly cited articles. (The very
> same finding is of course equally interpretable as (3) Quality
> Advantage (QA), a tendency for higher quality articles to benefit
> more than lower quality articles from being self-archived.) In
> condensed-matter physics, Moed has confirmed that the impact
> advantage occurs early (within 1-3 years of publication). After
> article-age is adjusted to reflect the date of deposit rather than
> the date of publication, the enhanced impact of self-archived
> articles is again interpretable as QB, with articles by more highly
> cited authors (based only on their non-archived articles) tending
> to be self-archived more. (But since the citation counts for
> authors and for their articles are correlated, one would expect
> much the same outcome from QA too.) The only way to test QA vs. QB
> is to compare the impact of self-selected self-archiving with
> mandated self-archiving (and no self-archiving). (The outcome is
> likely to be that both QA and QB contribute, along with EA, to the
> impact advantage.)
>
> Michael Kurtz's papers have confirmed that in astronomy/astrophysics
> (astro), articles that have been self-archived -- let's call this
> "Arxived" to mark it as the special case of depositing in the central
> Physics Arxiv -- are cited (and downloaded) twice as much as
> non-Arxived articles. Let's call this the "Arxiv Advantage" (AA).
> http://arxiv.org/
>
> Henneken, E. A., Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant,
> C., Thompson, D., and Murray, S. S. (2006) Effect of E-printing
> on Citation Rates in Astronomy and Physics. Journal of Electronic
> Publishing, Vol. 9, No. 2
> http://arxiv.org/abs/cs/0604061
>
> Henneken, E. A., Kurtz, M. J., Warner, S., Ginsparg, P., Eichhorn,
> G., Accomazzi, A., Grant, C. S., Thompson, D., Bohlen, E. and
> Murray, S. S. (2006) E-prints and Journal Articles in Astronomy: a
> Productive Co-existence (submitted to Learned Publishing)
> http://arxiv.org/abs/cs/0609126
>
> Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant, C. S., Demleitner,
> M., Murray, S. S. (2005) The Effect of Use and Access on Citations.
> Information Processing and Management, 41 (6): 1395-1402
> http://cfa-www.harvard.edu/~kurtz/kurtz-effect.pdf
>
> Kurtz analyzed AA and found that it consisted of at least 2 components:
>
> (1) EARLY ACCESS (EA): There is no detectable AA for old articles in
> astro: AA occurs while an article is young (1-3 years). Hence astro
> articles that were made accessible as preprints before publication
> show more AA: This is the Early Access effect (EA). But EA alone does
> not explain why AA effects (i.e., enhanced citation counts) persist
> cumulatively and even keep growing, rather than simply being a
> phase-advancing of otherwise un-enhanced citation counts, in which
> case simply re-calculating an article's age so as to begin at
> preprint deposit time instead of publication time should eliminate
> all AA effects -- which it does not.
>
> (2) QUALITY BIAS (QB): (Kurtz called the second component
> "Self-Selection Bias" for quality, but I call it self-selection
> Quality Bias, QB): If we compare articles within roughly the same
> citation/quality bracket (i.e., articles having the same number of
> citations), the proportion of Arxived articles becomes higher in the
> higher citation brackets, especially the top 200 papers. Kurtz
> interprets this as resulting from authors preferentially Arxiving
> their higher-quality preprints (Quality Bias).
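
The bracket comparison described above can be sketched as follows. This is only an illustration under assumed inputs: the record fields, the bracket edges, and the sample data are hypothetical and are not taken from Kurtz's analysis.

```python
def arxived_share_by_bracket(articles, bracket_edges):
    """Share of Arxived articles within each citation-count bracket.

    Each bracket is the half-open interval [low, high) of citation counts.
    Empty brackets are omitted from the result.
    """
    shares = {}
    for low, high in zip(bracket_edges, bracket_edges[1:]):
        in_bracket = [a for a in articles if low <= a["citations"] < high]
        if in_bracket:
            arxived = sum(1 for a in in_bracket if a["arxived"])
            shares[(low, high)] = arxived / len(in_bracket)
    return shares

# Hypothetical records; NOT real citation data.
articles = [
    {"citations": 1, "arxived": False},
    {"citations": 3, "arxived": False},
    {"citations": 4, "arxived": True},
    {"citations": 8, "arxived": False},
    {"citations": 12, "arxived": True},
    {"citations": 15, "arxived": True},
    {"citations": 40, "arxived": True},
    {"citations": 60, "arxived": True},
]
shares = arxived_share_by_bracket(articles, [0, 5, 20, 100])
```

A share that rises with the bracket shows only the association between Arxiving and citation counts; by itself it says nothing about the direction of cause and effect.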
>
> Of course the very same outcome is just as readily interpretable as
> resulting from Quality Advantage (QA) (rather than Quality Bias (QB)):
> i.e., that the Arxiving benefits better papers more. (Making a
> low-quality paper more accessible by Arxiving it does not guarantee
> more citations, whereas making a high-quality paper more accessible
> is more likely to do so, perhaps roughly in proportion to its higher
> quality, allowing it to be used and cited more according to its
> merit, unconstrained by its accessibility/affordability.)
>
> There is no way, on the basis of existing data, to decide between QA
> and QB. The only way to measure their relative contributions would be
> to control the self-selection factor: randomly imposing Arxiving on
> half of an equivalent sample of articles of the same age (from
> preprinting age to 2-3 years postpublication, reckoning age from
> deposit date, to control also for age/EA effects), and comparing also
> with self-selected Arxiving.
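
The random-assignment step of such a design might look like the sketch below. It covers only the split into an imposed-Arxiving half and a control half; the function name, the seed, and the article identifiers are hypothetical, and the later comparison of citation counts (and of the self-selected group) is left out.

```python
import random

def assign_arxiving(article_ids, seed=0):
    """Randomly split articles into an imposed-Arxiving half and a control half."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(article_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # With an odd number of articles the control group gets the extra one.
    return set(shuffled[:half]), set(shuffled[half:])

# Hypothetical article identifiers for a sample of equal-age articles.
imposed, control = assign_arxiving(range(10))
```

Because the assignment ignores article quality, any remaining citation difference between the two halves could not be attributed to authors' self-selection.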
>
> We are trying an approximation to this method, using articles
> deposited in Institutional Repositories of institutions that mandate
> self-archiving (and comparing their citation counts with those of
> articles from the same journal/issue that have not been
> self-archived), but the sample is still small and possibly
> unrepresentative, with many gaps and other potential liabilities. So
> a reliable estimate of the relative size of QA and QB still awaits
> future research, when self-archiving mandates will have become more
> widely adopted.
>
> Henk Moed's data on Arxiving in Condensed Matter physics (cond-mat)
> replicates Kurtz's findings in astro (and Davis/Fromerth's, in math):
>
> Moed, H. F. (2006, preprint) The effect of 'Open Access' upon
> citation impact: An analysis of ArXiv's Condensed Matter Section
> http://arxiv.org/abs/cs.DL/0611060
>
> Davis, P. M. and Fromerth, M. J. (2007) Does the arXiv lead to
> higher citations and reduced publisher downloads for mathematics
> articles? Scientometrics, accepted for publication.
> http://arxiv.org/abs/cs.DL/0603056
> See critiques:
> http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/subject.html#5221
> http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/5440.html
>
> Moed too has shown that in cond-mat the AA effect (which he calls CID
> "Citation Impact Differential") occurs early (1-3 years) rather than
> late (4-6 years), and that there is more Arxiving by authors of
> higher-quality (based on higher citation counts for their non-Arxived
> articles) than by lower-quality authors. But this too is just as
> readily interpretable as the result of QB or QA (or both): We would
> of course expect a high correlation between an author's individual
> articles' citation counts and the author's average citation count,
> whether the author's citation count is based on Arxived or
> non-Arxived articles. These are not independent variables.
>
> (Less easily interpretable -- but compatible with either QA or QB
> interpretations -- is Moed's finding of a smaller AA for the "more
> productive" authors. Moed's explanations in terms of co-authorships
> between more productive and less productive authors, senior and
> junior, seem a little complicated.)
>
> The basic question is this: Once the AA has been adjusted for the
> "head-start" component of the EA (by comparing articles of equal age
> -- the age of Arxived articles being based on the date of deposit of
> the preprint rather than the date of publication of the postprint),
> how big is that adjusted AA, at each article age? For that is the AA
> without any head-start. Kurtz never thought the EA component was
> merely a head start, however, for the AA persists and keeps growing,
> and is present in cumulative citation counts for articles at every
> age since Arxiving began. This non-EA AA is either QB or QA or both.
> (It also has an element of Competitive Advantage, CA, which would
> disappear once everything was self-archived, but let's ignore that
> for now.)
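
The age adjustment in question can be illustrated with a small sketch: age is reckoned from the deposit date for Arxived articles and from the publication date otherwise, and citations are then counted within the same age window for both. The record layout, the function names, and the dates are hypothetical, chosen only to show the bookkeeping.

```python
from datetime import date

def months_between(start, end):
    """Difference in calendar months, ignoring the day of the month."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def citations_within(article, max_age_months):
    """Count citations received within max_age_months of first availability."""
    # Age starts at deposit for Arxived articles, else at publication.
    origin = article["deposited"] or article["published"]
    return sum(
        1 for cited_on in article["citations"]
        if 0 <= months_between(origin, cited_on) < max_age_months
    )

# Hypothetical articles; NOT real data.
arxived = {
    "deposited": date(2000, 1, 15),
    "published": date(2000, 7, 15),
    "citations": [date(2000, 3, 1), date(2000, 9, 1), date(2001, 2, 1)],
}
journal_only = {
    "deposited": None,
    "published": date(2000, 7, 15),
    "citations": [date(2000, 9, 1), date(2001, 2, 1), date(2001, 8, 1)],
}
adjusted_arxived = citations_within(arxived, 12)
adjusted_journal_only = citations_within(journal_only, 12)
```

In this made-up case the two articles have the same count at equal age, so the whole difference at any calendar date would be head-start; an adjusted AA exists only if the counts still differ after this alignment.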
>
> Harnad, S. (2005) OA Impact Advantage = EA + (AA) + (QB) + QA +
> (CA) + UA. Preprint.
> http://eprints.ecs.soton.ac.uk/12085/
>
>