Quality Bias vs Quality Advantage

Stevan Harnad harnad at ECS.SOTON.AC.UK
Tue Feb 20 12:43:15 EST 2007

On Tue, 20 Feb 2007, Franck Laloe wrote:

> The Open Access Citation Advantage: Quality Advantage Or Quality Bias?
> http://openaccess.eprints.org/index.php?/archives/191-guid.html
> I have now read this text. Interesting!
> A few comments:
> 1) First, a detail. The text seems to imply that 
> what is called "Quality advantage" takes place 
> because each author selects his/her best articles 
> and posts only these on an open archive. 

Dear Franck:

Just the opposite! That is the Quality Bias (QB), not the Quality
Advantage (QA). (The Quality Advantage is the tendency for the better
articles to benefit more from OA: A bad article will not be cited even
if one thrusts it in every user's face! And the majority of articles
are not cited.)

> My perception is different; I would say that, 
> statistically, among my colleagues the more 
> dynamic, productive, etc. are more inclined to 
> use open archives, and that they are also the 
> more cited authors. 

The Quality Bias (QB) applies at both levels: A selective tendency for the
better authors to self-archive, which translates into a tendency for the
better articles to be self-archived, as well as a selective tendency to
self-archive one's better articles.

My colleague Michael Kurtz has even suggested that the best rationale for
OA self-archiving might be that the better researchers do it! ("Follow
the Leader!") But if it were true that their doing it has no causal
benefits, and is merely a superstitious bias, then there is no point in
the lesser researchers (or any researchers, for that matter) doing it --
any more than there is any point in lesser researchers smoking the same
brand of cigarettes as the better ones!

> A person selection in 
> disciplines and institutions, rather than a 
> selection article by article if you like.

I quite agree that Quality Bias is more likely to be a between-author
bias than a within-author bias. Henk Moed has found that authors don't
self-archive *all* their papers, but that there is an impact advantage
even for the non-self-archived papers of self-archiving authors.

> 2) The yellow/blue table with the equation AOO = 
> EA+QA+... is a little bit confusing when it uses 
> the notion of 100%OA. One should distinguish 
> between 100% in the discipline, or 100% in the 
> institution of the author, or even 100%  for an 
> isolated author who is a fan of the open 
> archives. For instance, CA disappears only if 
> 100%OA exists in the discipline, but has nothing 
> to do with 100% in the institution of the author, etc.

You are quite right: 100% OA is used in two senses: (i) 100% of an
individual institution's output and (ii) 100% of the entire discipline's
output. The Competitive Advantage (CA) will only vanish when 100% of a
discipline's output is OA. An individual institution continues to enjoy
a CA even after it has reached 100% OA for its own output, as long as
other institutions have not yet done so!

I suspect that the Competitive Advantage will be the biggest start-up
incentive toward accelerating and mandating OA self-archiving.

But even after an entire discipline (or all disciplines) have reached
100% OA, there will still be (1) the Quality Advantage (QA: the better
articles will have more usage and impact than they would have had
without OA), (2) the Usage Advantage (UA: more downloading and reading,
even without citations, than there would have been before OA) and (3) the
Early Advantage (EA: the earlier the postprint, or even the preprint,
is made OA, the more impact). CA, however, will be gone at 100%
(discipline-level) OA, and so will QB.

> In passing, I note that EA also disappears in 
> this case, while it is not mentioned.

I am afraid you are mistaken about that: Not only QA and UA but also
EA remain at 100% OA; it is only CA and QB that vanish: Even when
there is 100% (discipline-wide) OA self-archiving of the peer-reviewed
postprint from the moment of acceptance for publication, there is still
the pre-refereeing preprint, and in principle that (less reliable, but
sometimes valuable) route of posting early drafts can stretch back quite
a bit before publication, giving some researchers an extra advantage
(and risk!) if they post earlier...

> I note in passing that all the advantages studied 
> in this text would disappear in case of high 
> percentage of OAP (OA publishing, what Stevan 
> calls gold); fortunately, other advantages would subsist.

Gold OA publishing -- which basically amounts to the publisher archiving
the paper instead of the author himself -- can confer all the advantages
of Green OA self-archiving (provided the accepted, peer-reviewed draft is
made OA immediately upon acceptance). In that respect there is absolutely
no difference between Green and Gold OA with regard to QA, QB, CA or
UA. But EA (the Early Advantage) pertains only to author self-archiving:
Journals are not in a position to "publish" drafts of papers they have
not yet accepted for publication! Only authors can do (or authorise) that.

> 3) Now the interpretation of the data. This is 
> the most difficult part; it is very delicate to 
> prove something from the effect of mandate.
> (i) For instance, one could argue that the author 
> selection effect described in my 1) takes place 
> at the level of the institutions: the best ones 
> with the best researches are more inclined to 
> push their scientist (or force them) to deposit 
> their production. If this is true (I am not 
> arguing it is in reality, I do not know), all 
> the calculation is biassed.

You are quite right that *if* the Universities of Southampton, Minho, Tasmania,
and QUT, plus CERN, are already among the best in the world, then that
explains why their articles have higher citation counts than articles
published in the same journal and year from all other institutions.

But if the reason why Southampton, Minho, Tasmania, etc., and not other
institutions, already have self-archiving mandates is instead that OA
activists like Tom Cochrane and Paula Callan (pro-VC and librarian of
QUT), Eloy Rodrigues (library director of Minho), Arthur Sale (Tasmania),
and Jens Vigen and Joanne Yeomans (CERN) have managed to persuade their
institutions to mandate OA self-archiving, and to help ensure that the
mandates are complied with, then their OA advantage is rather unlikely
to be just corollary to a tautological Quality Bias (QB)...

(I invite you to ponder, in a relaxed moment, the question of why the University
of Southampton's web-impact "G-Factor" is the UK's 3rd highest, and 25th
in the world, ahead of (for example) Columbia University and Yale: Proud
as I am of Southampton's excellent research, I cannot help thinking that
this high impact has a good deal to do with Southampton ECS's self-archiving
mandate, the first in the world. And if you consult the ECS download logs,
you will find that the lion's share of the hits are indeed to ECS's mandated
EPrints Archive.)


(By the way, those who are absolutely bent on invoking the epicycles
of the Quality Bias interpretation have another obvious means of saving
their increasingly far-fetched theory: Since none of the mandated
institutions have as yet quite reached 100% OA (though CERN is close),
one could argue that their "OA Advantage" is really just a result of
a self-selective non-compliance bias: Their worst authors are the ones
not yet complying with the mandate!)

> (ii) More serious is the fact that, even in the 
> presence of mandate/strong encouragement, such as 
> in the case of Wellcome Trust and NIH, the collection 
> proportion is around 10%, not more. So, at least 
> for the moment, there is still room for an 
> enormous bias of the conclusion by "quality advantage".

Again, I think you mean QB not QA here, and it sounds as if you're indeed
invoking the Ptolemaic interpretation I mentioned above.

So let me point out that our analyses are *not* based on the Wellcome Mandate,
for which there are as yet no data, nor, a fortiori, on the NIH "public
access policy," which is not even a mandate, has a compliance rate of 4%
and is now universally acknowledged to be a failure because it did not
mandate self-archiving.

Our data are based on the first 5 institutional mandates: The Southampton
ECS and CERN Institutional Repositories' percentage OA is much closer
to 100% than 10%; in the other three (Minho, QUT, Tasmania) the
figures for the past two years are a good deal higher than 10% too! 10%
is more like the %OA for unmandated Institutional Repositories.

Best wishes,

Stevan Harnad
American Scientist Open Access Forum

Chaire de recherche du Canada           Professor of Cognitive Science
Ctr. de neuroscience de la cognition    Dpt. Electronics & Computer Science
Université du Québec à Montréal         University of Southampton
Montréal, Québec                        Highfield, Southampton
Canada  H3C 3P8                         SO17 1BJ United Kingdom
http://www.crsc.uqam.ca/                http://www.ecs.soton.ac.uk/~harnad/

