Whether Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research

On 7-Jan-10, at 6:50 AM, Philip Davis wrote:

> An interesting bit of research, although I have some methodological  
> concerns about how you treat the data, which may explain some  
> inconsistent and counter-intuitive results, see:
> http://j.mp/8LK57u
> A technical response addressing the methodology is welcome.
Thanks for the feedback. We reply to the three points of substance, in  
order of importance:

(1) LOG RATIOS: We analyzed log citation ratios to adjust for  
departures from normality. Logs were used to normalize the citations  
and attenuate distortion from high values. This approach loses some  
values when the log transformation makes the denominator zero, but  
despite these lost data, the t-test results were significant, and were  
further confirmed by our second, logistic regression analysis. Moed's  
(2007) point was about (non-log) ratios that were not used in this  
study. We used the ratio of log citations and not the log of citation  
ratios. When we compare log3/log2 with log30/log20, we don't compare  
percentages with percentages (60% with 14%) because the citation  
values are transformed or normalized: the higher the citations, the  
stronger the normalisation. It is highly unlikely that any of this  
would introduce a systematic bias in favor of OA, but if the referees  
of the paper should call for a "simpler and more elegant" analysis to  
make sure, we will be glad to perform it.
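
To make the distinction concrete, here is a minimal sketch of the two
quantities for the example pairs above. It is an illustration only, not
the analysis code used in the paper; the function names are hypothetical,
and treating counts below 1 or a zero log denominator as missing is an
assumption about how lost values would be handled. Both pairs have the
same raw ratio (1.5), hence the same log of the ratio, but their ratios
of logs differ (about 1.58 versus about 1.14).

```python
import math


def ratio_of_logs(oa_cites, non_oa_cites):
    """Return log(oa_cites) / log(non_oa_cites), or None when undefined.

    The value is undefined when either count is below 1 (no usable log)
    or when non_oa_cites == 1, because log(1) == 0 gives a zero denominator.
    The log base cancels out in the ratio, so natural logs are used.
    """
    if oa_cites < 1 or non_oa_cites < 1:
        return None
    denominator = math.log(non_oa_cites)
    if denominator == 0:
        return None
    return math.log(oa_cites) / denominator


def log_of_ratio(oa_cites, non_oa_cites):
    """Return log(oa_cites / non_oa_cites), or None when a count is below 1."""
    if oa_cites < 1 or non_oa_cites < 1:
        return None
    return math.log(oa_cites / non_oa_cites)


if __name__ == "__main__":
    # (3, 2) and (30, 20) are the pairs discussed in the text;
    # (5, 1) illustrates a value lost to a zero denominator.
    for oa, non_oa in [(3, 2), (30, 20), (5, 1)]:
        print(oa, non_oa, ratio_of_logs(oa, non_oa), log_of_ratio(oa, non_oa))
```

The same raw ratio is thus weighted less at higher citation counts,
which is the attenuation of high values described above.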

(2) Effect Size: The size of the OA Advantage varies greatly from year  
to year and field to field. We reported this in Hajjem et al. (2005),  
stressing that the important point is that there is virtually always a  
positive OA Advantage, absent only when the sample is too small or  
the effect is measured too early (as in Davis et al.'s 2008 study). The  
consistently bigger OA Advantage in physics (Brody & Harnad 2004) is  
almost certainly an effect of the Early Access factor, because in  
physics, unlike in most other disciplines (apart from computer science  
and economics), authors tend to make their unrefereed preprints OA  
well before publication. (This too might be a good practice to  
emulate, for authors desirous of greater research impact.)

(3) Mandated OA Advantage? Yes, the fact that the citation advantage  
of mandated OA was slightly greater than that of self-selected OA is  
surprising, and if it proves reliable, it is interesting and worthy of  
interpretation. We did not interpret it in our paper, because it was  
the smallest effect, and our focus was on testing the  
Self-Selection/Quality-Bias hypothesis, according to which mandated OA  
should have  
little or no citation advantage at all, if self-selection is a major  
contributor to the OA citation advantage.

Our sample was 2002-2006. We are now analyzing 2007-2008. If there is  
still a statistically significant OA advantage for mandated OA over  
self-selected OA in this more recent sample too, a potential  
explanation is the inverse of the Self-Selection/Quality-Bias  
hypothesis (which, by the way, we do think is one of the several  
factors that contribute to the OA Advantage, alongside the other  
contributors: Early Advantage, Quality Advantage, Competitive  
Advantage, Download Advantage, Arxiv Advantage, and probably others).  
http://openaccess.eprints.org/index.php?/archives/29-guid.html

The Self-Selection/Quality-Bias (SSQB) consists of better authors  
being more likely to make their papers OA, and/or authors being more  
likely to make their better papers OA, because they are better, hence  
more citeable. The hypothesis we tested was that all or most of the  
widely reported OA Advantage across all fields and years is just due  
to SSQB. Our data show that it is not, because the OA Advantage is no  
smaller when it is mandated. If it turns out to be reliably bigger,  
the most likely explanation is a variant of the "Sitting Pretty" (SP)  
effect, whereby some of the more comfortable authors have said that  
the reason they do not make their articles OA is that they think they  
have enough access and impact already. Such authors do not self- 
archive spontaneously. But when OA is mandated, their papers reap the  
extra benefit of OA, with its Quality Advantage (for the better, more  
citeable papers). In other words, if SSQB is a bias in favor of OA on  
the part of some of the better authors, mandates reverse an SP bias  
against OA on the part of others of the better authors. Spontaneous,  
unmandated OA would be missing the papers of these SP authors.  
http://www.eprints.org/openaccess/self-faq/#29.Sitting

There may be other explanations too. But we think any explanation at  
all is premature until it is confirmed that this new mandated OA  
advantage is indeed reliable and replicable. Phil further singles out  
the fact that the mandate advantage is present in the middle citation  
ranges and not the top and bottom. Again, it seems premature to  
interpret these minor effects whose reliability is unknown, but if  
forced to pick an interpretation now, we would say it was because the  
"Sitting Pretty" authors may be the middle-range authors rather than  
the top ones...

Yassine Gargouri, Chawki Hajjem, Vincent Lariviere, Yves Gingras, Les  
Carr, Tim Brody, Stevan Harnad

Brody, T. and Harnad, S. (2004) Comparing the Impact of Open Access  
(OA) vs. Non-OA Articles in the Same Journals. D-Lib Magazine 10(6).  
http://eprints.ecs.soton.ac.uk/10207/

Davis, P.M., Lewenstein, B.V., Simon, D.H., Booth, J.G., Connolly,  
(2008) Open access publishing, article downloads, and citations:  
randomised controlled trial. British Medical Journal 337:a568  
http://www.bmj.com/cgi/reprint/337/jul31_1/a568

Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year Cross- 
Disciplinary Comparison of the Growth of Open Access and How it  
Increases Research Citation Impact. IEEE Data Engineering Bulletin  
28(4) 39-47. http://eprints.ecs.soton.ac.uk/11688/

Moed, H. F. (2007) The effect of 'Open Access' upon citation impact:  
An analysis of ArXiv's Condensed Matter Section. Journal of the  
American Society for Information Science and Technology 58(13)  
2145-2156 http://arxiv.org/abs/cs/0611060

> Stevan Harnad wrote:
>> Self-Selected or Mandated, Open Access Increases Citation Impact for
>> Higher Quality Research
>> http://arxiv.org/abs/1001.0361
>> Yassine Gargouri, Chawki Hajjem, Vincent Lariviere, Yves Gingras, Les
>> Carr, Tim Brody, Stevan Harnad
>> ABSTRACT: Articles whose authors make them Open Access (OA) by
>> self-archiving them online are cited significantly more than articles
>> accessible only to subscribers. Some have suggested that this "OA
>> Advantage" may not be causal but just a self-selection bias, because
>> authors preferentially make higher-quality articles OA. To test this
>> we compared self-selective self-archiving with mandatory
>> self-archiving for a sample of 27,197 articles published 2002-2006 in
>> 1,984 journals. The OA Advantage proved just as high for both.
>> Logistic regression showed that the advantage is independent of other
>> correlates of citations (article age; journal impact factor; number of
>> co-authors, references or pages; field; article type; or country) and
>> greatest for the most highly cited articles. The OA Advantage is real,
>> independent and causal, but skewed. Its size is indeed correlated with
>> quality, just as citations themselves are (the top 20% of articles
>> receive about 80% of all citations). The advantage is greater for the
>> more citeable articles, not because of a quality bias from authors
>> self-selecting what to make OA, but because of a quality advantage,
>> from users self-selecting what to use and cite, freed by OA from the
>> constraints of selective accessibility to subscribers only.
>> http://eprints.ecs.soton.ac.uk/18346/

