> Elsevier said that citation rates of their journals had gone
> up considerably because of the increased access through widespread
> online availability of their journals...
> Online availability clearly increased the IF [journal citation
> impact factor]. In the FUTON subcategory, there was an IF gradient 
> favoring journals with freely available articles. ..."
> I think it is quite obvious why sources available with open access 
> will be used and cited more often than others...
> So the usefulness of open access is a matter of daily experience, 
> not so much of academic discussions whether there is any empirical 
> proof for a citation advantage of open access that may be isolated 
> by eliminating all possible confounders...
> That open access leads to more visibility and thereby potentially
> more citations is trivial, but this relative open access advantage
> will vary from journal to journal... 
> Due to the multitude of possible confounding factors I would not 
> believe any of the figures calculated by Stevan Harnad as the 
> cumulated lost impact, or conversely, the possible gain.

I couldn't quite follow the logic of this posting. It seemed to be
saying that, yes, there is evidence that OA increases impact, it is even
trivially obvious, but, no, we cannot estimate how much, because there
are possible confounding factors and the size of the increase varies.

All studies have found that the size of the OA impact differential varies
from field to field, journal to journal, and year to year. The range
of variation is from +25% to over +250%. But the differential
is always positive, and mostly quite sizeable. That is why I chose a
conservative overall estimate of +50% for the potential gain in impact if it
were not just the current 15% of research that was being made OA, but
also the remaining 85%. (If you think 50% is not conservative enough, use
the lower-bound 25%: You'll still find a substantial potential impact
gain/loss. If you think self-selection accounts for half the gain, split
it in half again: there's still plenty of gain, once you multiply by
85% of total citations.)
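
The back-of-envelope arithmetic in the paragraph above can be sketched as
follows. This is only an illustration under stated assumptions: the function
name and its parameters are hypothetical, and the gain is treated as a uniform
fraction of a per-article citation baseline, which is a simplification.

```python
def potential_citation_gain(oa_advantage, non_oa_share=0.85, causal_fraction=1.0):
    """Rough potential gain from making the remaining non-OA articles OA.

    oa_advantage    -- assumed OA citation differential (0.50 means +50%)
    non_oa_share    -- share of research not yet OA (85% in the text above)
    causal_fraction -- share of the differential assumed to be causal rather
                       than self-selection (1.0 = all of it, 0.5 = half)

    The result is a fraction of the current per-article citation baseline.
    Treating every article alike is an assumption made for illustration.
    """
    return oa_advantage * causal_fraction * non_oa_share


# The three scenarios mentioned in the text: (advantage, causal fraction).
scenarios = {
    "conservative +50%": (0.50, 1.0),
    "lower-bound +25%": (0.25, 1.0),
    "lower bound, half self-selection": (0.25, 0.5),
}

for label, (advantage, causal) in scenarios.items():
    gain = potential_citation_gain(advantage, causal_fraction=causal)
    print(f"{label}: {gain:.1%}")
```

On these assumptions the three scenarios come out at 42.5%, roughly 21%, and
roughly 11% respectively: smaller each time, but never negligible.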

An interesting question that has since arisen (and could be answered by
similar studies) is this:

    Since it is known that (in science) the top 10% of articles published
    receive 90% of the total citations made (Seglen 1992), to what
    extent is the top 10% of articles published over-represented among
    the c. 15% of articles that are being spontaneously made OA by their
    authors today?

It is a logical possibility that all or most of the top 10% are already
among the 15% that are being made OA: I rather doubt it; but it would
be worth checking whether it is so. If it did turn out to be so, then
reaching 100% OA would be far less urgent and important than I had argued,
and OA mandates would likewise be less important.
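
For anyone who wants to run that check, here is a minimal sketch of the
comparison, assuming one has a citation count and an OA flag for each article
in a sample. The function name, the input format and the toy data are all
hypothetical; nothing here is drawn from an actual study.

```python
def oa_share_in_top(citations, is_oa, top_fraction=0.10):
    """Compare the OA share among top-cited articles with the overall OA share.

    citations    -- list of citation counts, one per article
    is_oa        -- parallel list of 0/1 flags (1 = article is OA)
    top_fraction -- which slice counts as "top" (10% by default)

    Returns (oa_share_among_top, oa_share_overall).
    """
    n = len(citations)
    k = max(1, round(n * top_fraction))  # size of the top slice
    # Article indices ordered from most to least cited.
    ranked = sorted(range(n), key=lambda i: citations[i], reverse=True)
    top = ranked[:k]
    top_share = sum(is_oa[i] for i in top) / k
    overall_share = sum(is_oa) / n
    return top_share, overall_share


# Toy data (invented): 20 articles with 1..20 citations, 3 of them OA (15%).
toy_citations = list(range(1, 21))
toy_is_oa = [1 if c in (5, 10, 20) else 0 for c in toy_citations]

top_share, overall_share = oa_share_in_top(toy_citations, toy_is_oa)
print(f"OA share among top 10%: {top_share:.0%}; overall: {overall_share:.0%}")
```

If the OA share among the top 10% turned out to be close to 100%, that would
be the scenario in which most of the top-cited articles are already OA; a
share only moderately above the overall 15% would indicate over-representation
but leave most of the top 10% still non-OA.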

The empirical studies of the relation between OA and impact have been
mostly motivated by the objective of accelerating the growth of OA -- and
thereby the growth of research usage and impact. Those who are confident
that the OA impact differential is merely or largely a non-causal
self-selection bias are encouraged to demonstrate that that is the case.

Note very carefully, though, that the observed correlation between OA
and citations takes the form of a correlation between the number of OA
articles, relative to non-OA articles, at each citation level. The more
highly cited an article, the more likely it is OA. This is true within
journals, and within and across years, in every field tested.

And this correlation can arise because more-cited articles are more
likely to be made OA *or* because articles that are made OA are more
likely to be cited (or both -- which is what I think is in reality
the case). It is certainly *not* the case that self-selection is the
default or null hypothesis, and that those who interpret the effect as
OA causing the citation increase hence have the burden of proof: The
situation is completely symmetric numerically; so your choice between the
two hypotheses is not based on the numbers, but on other considerations,
such as prima facie plausibility -- or financial interest.

Until and unless it is shown empirically that today's OA 15% already
contains all or most of the top-cited 10% (and hence 90% of what
researchers cite), I think it is a much more plausible interpretation
of the existing findings that OA is a cause of the increased usage and
citations, rather than just a side-effect of them, and hence that there
is usage and impact to be gained by providing and mandating OA. (I can
quite understand why those who have a financial interest in its being
otherwise [Craig et al. 2007] might prefer the other interpretation,
but clearly prima facie plausibility cannot be their justification.)

I also think that 50% of total citations is a plausible overall estimate
of the potential gain from OA, as long as it is clearly understood that
the 50% gain does not apply to every article made OA. Many articles
are not found useful enough to cite no matter how accessible you make
them. The 50% citation gain will mostly accrue to the top 10% of articles,
as citations always do (though OA will no doubt also help to remedy some
inequities and will sometimes help some neglected gems to be discovered
and used more widely). In other words, the OA advantage to an article
will be roughly proportional to that article's intrinsic citation value
(independent of OA).

Other interesting questions: The top-cited articles are not evenly
distributed among journals. The top journals tend to get the top-cited
articles. It is also unlikely that journal subscriptions are evenly
distributed among journals: The top journals are likely to be subscribed
to more, and are hence more accessible.

So if someone is truly interested in these questions (as I am not!),
they might calculate a "toll-accessibility index" (TAI) for each article,
based on the number of researchers/institutions that have toll access to
the journal in which that article is published. An analysis of covariance
can then be done to see whether and how much the OA citation advantage
is reduced if one controls for the article's TAI. (I suspect the answer
will be: somewhat, but not much.)
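
A minimal sketch of what such an analysis might look like, assuming each
article comes with a citation count, an OA flag and a TAI value. The TAI is
only proposed above, so its scale here is invented, as are the function names
and the toy data. The adjusted OA effect is the OA coefficient in a linear
model of citations on OA plus TAI, obtained by first regressing TAI out of
both variables (which yields the same coefficient as fitting the full model).

```python
from statistics import mean


def residuals(x, y):
    """Residuals of a least-squares fit of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]


def oa_effect(citations, is_oa, tai=None):
    """OA citation advantage, optionally controlling for TAI.

    Without tai: difference in mean citations, OA minus non-OA.
    With tai: the OA coefficient after removing the linear effect of TAI
    from both citations and the OA flag.
    """
    if tai is None:
        oa = [c for c, f in zip(citations, is_oa) if f]
        non_oa = [c for c, f in zip(citations, is_oa) if not f]
        return mean(oa) - mean(non_oa)
    res_cit = residuals(tai, citations)
    res_oa = residuals(tai, is_oa)
    return sum(a * b for a, b in zip(res_oa, res_cit)) / sum(a * a for a in res_oa)


# Toy data (invented): citations = 2 + 3*OA + 0.5*TAI, with OA more common
# among high-TAI articles, so the raw difference overstates the OA effect.
toy_tai = [1, 2, 3, 4, 5, 6]
toy_oa = [0, 0, 1, 0, 1, 1]
toy_cit = [2 + 3 * o + 0.5 * t for o, t in zip(toy_oa, toy_tai)]

print("unadjusted OA effect:", oa_effect(toy_cit, toy_oa))
print("TAI-adjusted OA effect:", oa_effect(toy_cit, toy_oa, toy_tai))
```

In the toy data the adjusted effect recovers the built-in value of 3, while
the unadjusted difference is larger because OA and TAI are correlated. Real
citation counts are highly skewed, so an actual analysis would probably need
a transformation or a count model rather than this plain linear fit.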

Stevan Harnad

