Open access to research worth £1.5bn a year

Thu Sep 29 16:50:55 EDT 2005

On Thu, 29 Sep 2005, Peter Banks wrote:

> omits the Anderson paper, which did not show an effect.

Steve Hitchcock will shortly include that paper (many thanks for pointing it
out), duly noting that it is based on only one journal and one 3-year sample
from several years ago, may be comparing articles with different selection
criteria, seems to compare online with non-online rather than OA
with non-OA, and is refuted by the preponderance of subsequent evidence based
on much larger and wider samples (but, as you correctly note, much of it not
yet published, hence not yet peer-reviewed).

> The claim seems to come mainly from the paper of Brody et
> al (Citation Impact of Open Access Articles vs. Articles
> available only through subscription ("Toll-Access"),
> http://citebase.eprints.org/isi_study/). I think these authors have
> generalized their findings far beyond what the data can support. They
> are convincing when they stick to the core Physics/Mathematics papers
> in the ArXiv database, but not when they try to apply their method
> to other disciplines.

It is not Brody et al. that apply the method to other disciplines (in their
published work to date) but Hajjem et al., in their not yet published work.
Brody et al. published only on the Arxiv-based comparisons.

> Their method is to compare citations for papers in ArXiv vs. those
> that are not. Outside of the core physics and mathematics
> literature, however, ArXiv contains very few papers from disciplines
> like medicine or social sciences. For most fields, Brody finds
> that the number of papers that are OA are less than 1%--sometimes
> much less than 1%.

(1) I note that Dr. Banks seems to be ready to take unpublished results at
face value when they are congenial to his preferences: Brody et al.'s data
from other disciplines are on their unrefereed data site, not in their
refereed paper.

    http://citebase.eprints.org/isi_study/

(2) The unpublished data on other disciplines from the Brody et al. site
are merely pilot data, based on tiny samples. They have since been superseded
by far larger systematic sampling at the Hajjem et al. data site, and there
the proportions of self-archived papers in other disciplines are found to be
much higher (though this too is probably an underestimate, as will be
discussed in the published version).

    http://www.crsc.uqam.ca/lab/chawki/ch.htm

> For these papers, which are likely to be highly
> specialized and relatively obscure, self-archiving probably does
> have a large effect. One can not conclude, however, that the same
> effect would occur for a widely-read, widely cited journal, like
> Pediatrics or Diabetes Care. I suspect--and the Anderson paper may
> hint at this--that the more widely read the journal, the less the
> citation advantage for OA.

We will take this as a prediction, and I will ask Chawki Hajjem to check your
journals in particular, as special cases, and compare them to other
biomedical and non-biomedical journals.

But your assumptions about the overall proportions are contradicted by
the evidence. I will add, however, that so far the *size* of the average
OA/nonOA advantage in biomedicine -- though always present, as in other
fields -- is lower than in other fields, hovering at about 20% rather
than the 50-250% elsewhere, though it is higher in some biomed subfields. I
don't know the reason for this. I don't know whether it will hold up
with still larger samples, but it does not appear to be because of a
systematically lower self-archiving rate in biomedicine.
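
To make the percentages above concrete, here is a minimal sketch of how
such an advantage could be computed. It assumes the advantage is the
relative difference between the mean citation counts of OA and non-OA
articles; that definition, the function name and the citation counts are
all illustrative assumptions, not the method or data of the studies
under discussion.

```python
from statistics import mean


def oa_advantage_percent(oa_citations, non_oa_citations):
    """Percent by which mean OA citations exceed mean non-OA citations.

    Assumed definition, for illustration only.
    """
    oa_mean = mean(oa_citations)
    non_oa_mean = mean(non_oa_citations)
    if non_oa_mean == 0:
        raise ValueError("non-OA mean is zero; relative advantage undefined")
    return 100.0 * (oa_mean - non_oa_mean) / non_oa_mean


# Hypothetical citation counts, invented for this example.
sample_oa = [6, 6, 6]
sample_non_oa = [5, 5, 5]
print(oa_advantage_percent(sample_oa, sample_non_oa))
```

On this definition, a 20% advantage means that OA articles average six
citations for every five received by comparable non-OA articles.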

> What we need to study other disciplines are archives like
> ArXiv. Perhaps there are others in certain fields that could be
> mined for research.

No, we don't need to study centralised archives like Arxiv in
other disciplines, because since Arxiv (1991) there has been the
OAI-interoperability protocol (1999) which effectively made all
distributed archives equivalent and interoperable.  What is not clear,
however, is how much of the 15% self-archiving our robots are picking
up in their web-trawls comes from arbitrary websites and how much from
OAI-compliant Institutional Repositories. This too will be analysed,
to test for any differential trends.
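
As a rough illustration of what OAI interoperability means in practice:
every OAI-compliant archive answers the same kind of harvesting request,
so one harvester can query them all. The sketch below only builds such a
request URL. The base URL is a hypothetical placeholder; the verb and
metadataPrefix values are standard OAI-PMH ones.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hypothetical repository endpoint, for illustration only.
print(oai_list_records_url("http://repository.example.org/oai"))
```

The same request shape works for any compliant repository, whether
central or institutional; only the base URL changes.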

Stevan Harnad