Open Access Week: Series of reports on OA

Éric Archambault eric.archambault at SCIENCE-METRIX.COM
Thu Oct 23 10:12:34 EDT 2014

Apologies for cross-posting

As part of Open Access Week 2014, a series of six reports on open access, produced for the European Commission (EC), was posted yesterday on the Science-Metrix website:

These reports were produced as part of the EC's efforts to monitor the development of open access (OA) availability of peer-reviewed papers, in addition to examining policies to promote OA data and scientific publications.

The core report in the series provides definitions for OA scientific papers to address some of the shortcomings of existing definitions which are far too incomplete to grasp the full spectrum of situations encountered while measuring OA availability.

The following definitions are suggested:

A: Access-can be open (free), restricted or paid; with unrestricted or restricted usage rights; quality controlled or not; pre-print (pre-refereeing), post-print (post-refereeing), or published version (with final copy editing and page layout); immediate or delayed; permanent or transient.

OA: Open Access-freely available online to all.

IOA: Ideal OA-free; quality controlled (peer-reviewed or editorially controlled); with unrestricted usage rights (e.g. CC BY); in final, published form; immediate; permanent.

RA: Restricted Access-access restricted to members of a group, club, or society.

PA: Paid Access-access restricted by a pay wall; includes subscription access, licensed access, and pay-to-view access.

Restricted OA-free but with download restrictions (e.g. registration required, restricted to manual download, HTML-only as opposed to a self-contained format such as PDF) or restricted re-use rights (e.g. CC NC).

Green OA-OA provided before or immediately after publication by author self-archiving.

Gold OA-immediate OA provided by a publisher, sometimes with a paid publication fee. Note that several Gold journals have rights restrictions: they are Gold ROA. For example, of the 38% of journals listed in the DOAJ that use a Creative Commons licence, only 53% use the CC-BY licence that would allow them to qualify for the IOA definition above (Herb, 2014).
Gold OA Journal-journal offering immediate cover-to-cover access.
Gold OA Article-immediately accessible paper appearing in a Gold journal, or in a PA journal (the latter is also sometimes referred to as hybrid open access).

ROA: Robin Hood OA or Rogue OA-Available for free in spite of restrictions, usage rights, or copyrights (overriding RA, PA, Restricted OA). As the publishers' copyright policies and self-archiving rules are compiled by the University of Nottingham in the SHERPA/RoMEO database, Rogue OA is synonymous with Robin Hood OA.

DOA: Delayed OA-access after a delay period or embargo.
Delayed Green OA-free online access provided by the author after a delay (due to the author's own delay in making the paper available for free) or embargo period (typically imposed by the publisher).
Delayed Gold OA-free online access provided by the publisher after a delay (e.g. change of policy that makes contents available for free) or embargo period.
Delayed Gold OA Journal-Journal offering cover-to-cover access after an embargo period or after a delay.
Delayed Gold OA Article-Paper appearing in a Gold journal or in a PA journal (the latter is also sometimes referred to as hybrid open access) which is available after an embargo period or after a delay.

TOA: Transient OA-free online access during a certain time.
Transient Green OA-free online access provided by the author for a certain time which then disappears. Note that a substantial part of Green OA could be Transient Green OA due to the unstable nature of the internet, websites, and institutional repositories, many of which are not updated or maintained after a period of time and are therefore susceptible to deletion in subsequent institutional website overhauls. There are also integrator repositories that can change access rules, for example after being acquired by a third party.
Transient Gold OA-free but temporary online access provided by the publisher, instead of permanent. Sometimes appears as part of a promotion. Note that some Gold journals and articles become paid access after a certain time, because of revised strategies by a publisher or because they are sold to another publisher who introduces paid access.

Looking forward, we need to understand these various forms of OA availability. It was beyond the scope of this project to measure all these forms, but it is an essential element to address. For example, Robin Hood OA has hardly been measured and is somewhat of a taboo subject. Transiency is another ill-understood subject that should be addressed through fundamental questions such as: what percentage of OA papers are transient, and why is this occurring?

Relative to these definitions, the report has shortcomings. In the present reports, the following operational definitions were used to perform measurement:

Green OA: refers to papers which are self-archived by authors and available on institutional repositories as listed in OpenDOAR and/or in ROAR. Listings in OpenDOAR and ROAR which correspond to known Gold OA Journals were set aside. Aggregator sites such as CiteSeerX were not considered here since, even though they accept article submissions, they do not constitute a repository in the classical sense. Likewise, articles in the main PubMed Central sites were not counted as Green as they have curtailed usage rights or limited download rights.[3] Because it is commonly difficult to determine whether a paper was self-archived before, at the same time as, or after publication, and also how long it will be available on the internet, Green OA includes Green OA, Delayed Green and Transient Green. Note that some of these articles may not respect restrictions placed by journal publishers (many of whose rules can be found on SHERPA/RoMEO)[4], so the category contains a certain number of Robin Hood OA papers. Finally, only articles which could be downloaded without user registration were considered.

Gold Journal OA: refers to papers appearing in journals listed in the Directory of Open Access Journals (DOAJ)[5] and on the PubMed Central list of journals.[6] When a paper is published during the first year that a journal appears in the DOAJ, it is not counted. This is a conservative decision, taken because one cannot determine whether a journal started publishing Gold articles early or late during the year. For PubMed Central, only open access journals with full participation and immediate access were considered to be Gold, hence all journals with an embargo and in the 'NIH Portfolio' were not considered. Thus, this category covers articles appearing in Gold journals and excludes delayed Gold as well as piecemeal Gold (Gold articles in paid access journals, also called hybrid OA).

Other OA: refers to pretty much everything that could be found on the web by a determined researcher and downloaded for free and which was not part of the Green and Gold operational definitions above. This comprises articles appearing in journals with an embargo period (Delayed Gold OA); articles appearing on authors' webpages and elsewhere (both Green OA and Rogue OA); articles appearing on aggregator sites such as ResearchGate and CiteSeerX in addition to PubMed Central. The category comprises both transiently and permanently accessible items as there are no reliable ways to ascertain at measurement time whether an item will be permanently accessible or not.

Total OA: The mutually exclusive sum of Green OA, Gold Journal OA, and Other OA.
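The mutually exclusive sum can be illustrated with a minimal Python sketch. This is not the procedure used in the reports: the paper identifiers are made up, the function name is hypothetical, and the precedence order (Gold Journal first, then Green, then Other) is an assumption chosen only so that a paper found through several routes is counted once.

```python
def total_oa(gold_journal, green, other):
    """Count each paper once even if it was found through several routes.

    Assumed precedence (hypothetical): Gold Journal, then Green, then Other.
    """
    gold_only = set(gold_journal)
    green_only = set(green) - gold_only
    other_only = set(other) - gold_only - green_only
    counts = {
        "gold_journal": len(gold_only),
        "green": len(green_only),
        "other": len(other_only),
    }
    # Because the three groups no longer overlap, the total is a plain sum.
    counts["total"] = sum(counts.values())
    return counts


# Made-up paper identifiers, for illustration only.
gold = {"p1", "p2"}
green = {"p2", "p3", "p4"}
other = {"p4", "p5"}
result = total_oa(gold, green, other)
```

Whatever precedence is chosen, the total equals the number of distinct papers across the three groups; only the split between categories changes.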

These definitions, though they made sense from an operational point-of-view, are inadequate for the future. They were used in response to comments received on last year's series of reports. They were a stopgap measure and reflected what could be done on the project's budget and with the tools available. More detailed work is required, preferably on a large scale such as was done in this study (sample larger than 1 million randomly selected articles).

An important aspect of the study, which we hope will be followed by other metrology undertakings on OA availability, is the combination of: 1) large-scale measurement to reduce statistical error; 2) a calibration sample to determine the adjustment needed, by precisely measuring the recall and precision of the large-scale measurement apparatus; 3) application of the calibration to the measured quantities. With hindsight, the second step is a weak point of the study, as the calibration sample was too small (500) and added an error of ± 4.5 percentage points. The manual calibration should use closer to 10,000 randomly selected papers to establish a gold standard and reduce the additional error to about 1 percentage point (this is a simplified discussion; please see report D1.8 for a more elaborate one).
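The effect of calibration sample size, and the adjustment itself, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the calculation in report D1.8: it assumes a normal-approximation margin of error for a proportion at 95% confidence with the worst case p = 0.5, and a simple precision/recall correction. The function names and the example counts are hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


def calibrated_count(measured, precision, recall):
    """Adjust a measured count: drop false positives, then scale up for misses."""
    return measured * precision / recall


small = margin_of_error(500)     # roughly 0.044, i.e. about 4.4 percentage points
large = margin_of_error(10_000)  # roughly 0.0098, i.e. about 1 percentage point

# Hypothetical numbers: 400,000 papers flagged as OA by the automated
# measurement, with precision 0.99 and recall 0.80 from a calibration sample.
adjusted = calibrated_count(400_000, precision=0.99, recall=0.80)
```

Under these assumptions a sample of 500 gives a margin of about 4.4 percentage points, close to the figure quoted above, and a sample of 10,000 brings it down to about 1 point. The exact figure in the report may rest on a different formula or confidence level.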

Discussion of the characteristics of the data source is also essential. We need a more in-depth understanding of OA availability per country. I strongly suspect that countries that are poorly covered by WoS and Scopus have a greater propensity to diffuse knowledge openly (and more so for the former, which partly explains why measuring OA with WoS provides lower scores). Combining WoS with a lack of calibration for recall and precision can lead to a very serious underestimation of OA availability (missing more than 40% of the actual count of all peer-reviewed papers). It is likely that this study also underestimates OA availability because of the inadequate coverage of non-English-language scientific literature in Scopus.

Another important contribution of the report is the examination of the scientific impact of OA vs. non-OA literature with three scores: 1) normalised impact of all literature (=1.0); 2) normalised impact of OA literature; 3) normalised impact of non-OA literature. The one-million-article sample shows the deleterious effect, on average, of not espousing an OA diffusion strategy. Data are also presented by broad field of knowledge and show that green OA is king for impact, yet even the younger (on average) gold journals are showing greater impact than the more established (on average) subscription-based journals in several fields. Carefully designed studies that control for embargo are required to understand how DOA papers are disadvantaged in terms of scientific impact relative to immediate OA.

These results are presented at length in the report which can be downloaded from here:

A review of OA policies for scientific publication can be found here:

A review of OA policies for scientific data can be found here:

A comparative analysis of OA policies for scientific publications and data can be found here:

A synthesis report on OA availability and policies can be found here:

Finally, the short version of this synthesis can be found here:

Have a great Open Access Week and we hope you will appreciate these weekend readings.

Yours sincerely

Eric Archambault, Ph.D.
President and CEO | Président-directeur général
Brussels | Montréal | Washington
1335, Mont-Royal E
Montréal, QC  H2J 1Y6

T. 1.514.495.6505 x.111
F. 1.514.495.6523
E-mail: eric.archambault at



[3] The PubMed Central site mentions 'You may NOT use any kind of automated process to download articles in bulk from the main PMC site. PMC will block the access of any user who is found to be violating this policy'. See




More information about the SIGMETRICS mailing list