How to Compare IRs and CRs dwojick at HUGHES.NET
Sun Feb 10 07:02:42 EST 2008


My point is that one should not consider (and design) OA in isolation. 
OA should be viewed as part of a systematic change in the way we do 
science. Or, to put it another way, OA has to be justified in terms of 
the benefits it will provide. OA is disruptive and costly so the 
benefits must be correspondingly great. 

The benefits of OA in science lie in increased efficiency of 
communication. What I call better, faster science. But access is only 
part of the communication process. I am working on the other part -- 
getting the stuff to the people who need it as efficiently as possible 
(findability). My point is that my part of the system has something to 
say about your part. Less metaphorically, OA design issues like IR 
versus CR need to consider the delivery (or findability) issue, and may 
even be determined by it. 

My specific point was that your IR solution to OA looks like it 
creates problems with my delivery solution. Perhaps we can discuss this.

As for the research, it was very preliminary. We just took one issue 
of each of several major journals, in physics and chemistry, and 
manually (intelligently) searched the web for each article. Starting by 
author typically worked better than by title or text. We got a good 
success rate. I should point out that much, perhaps most, of the 
web-available science is not on Google. It is in the deep web.


----Original Message----
From: harnad at ECS.SOTON.AC.UK
Date: 02/09/2008 7:01 PM
Subj: Re: [SIGMETRICS] How to Compare IRs and CRs

On Sat, 9 Feb 2008, David E. Wojick wrote:

> I disagree Steve (and I am doing staff work for the US Federal
> Interagency Working Group that is grappling with these issues).

Which issues? OA's target content is the 2.5 million annual articles
published in the planet's 25,000 peer-reviewed journals, across all
scholarly and scientific disciplines, in all languages.

> Mind you I am all for OA, but integrating all the web accessible 
> science is far from trivial.

I agree. But (1) OA is not about integrating all of web accessible
science; nor is it (2) only about science; nor is it (3) about making
all science web-accessible.

It's first and foremost about making the 2.5 million annual articles
published in the planet's 25,000 peer-reviewed journals, across all
scholarly and scientific disciplines, in all languages, freely
accessible on the web.

> Google, Google Scholar,,,
> etc., each have large, irrational hunks. It is far from clear that 
> tens of thousands of independent IRs are going to help.

OA is not about adding tens of thousands of empty IRs to existing web
content. It is about getting the 2.5 million annual articles published
in the planet's 25,000 peer-reviewed journals, across all scholarly and
scientific disciplines, in all languages, into their authors' OA IRs.

> Also, journal articles are not my favorite content, because they tend
> to be one to two years after the research and are too short.

But journal articles are OA's target content. And OA means getting them
freely accessible online immediately upon acceptance for publication,
not 1-2 years afterward.

> I prefer conference presentations, reports, even awards and news,
> to journals. We are trying to speed up science and journals are the
> tail end of research.

Those are all fine, and welcome in IRs, over and above OA's target
content; but OA's target content -- the 2.5 million annual articles
published in the planet's 25,000 peer-reviewed journals, across all
scholarly and scientific disciplines, in all languages -- is OA's
immediate priority.

> So OA is a worthy cause but only a small part of the policy
> picture. Findability of key information is the core issue.

Findability may be a problem for other causes, but it is not a problem
for OA (which is the only cause I am talking about). Absence, not
findability, is OA's problem.

> BTW I did some research that suggests that 60-80% of the journal literature,
> or something roughly equivalent, is findable for free if you poke 
> long enough, in some disciplines anyway.

I would be very interested to see that research, to find out in what
fields that is true, and in what time-slice. I am aware of a few fields
(mostly in physics) where it is true, but always happy to learn of more.
Our robot studies, across fields and years, find 5% to 15% of content,
depending on field (and that's using google).

Stevan Harnad

> David Wojick
>> On Sat, 9 Feb 2008, dwojick at wrote:
>>> Steve, I am concerned when you say the following --
>>>> "It's from the local repositories that the local produce can then be
>>>> "harvested" (the limitations of a mixed metaphor!) to some central
>>>> site, if desired, or just straight to an indexer like Google 
>>>> or Citebase."
>>> OA in 10's of 1,000's of IRs is virtually worthless without some 
>>> good, central, global, search capability. How to build this 
>>> is far from clear.
>>> David Wojick
>> The answer is as simple as it is certain: OA's problem today is 
>> not *search*.
>> What is missing is 85+% of OA's target content (2.5M annual articles
>> in 25K peer-reviewed journals), not the means of searching it! 
>> Search power -- both implemented and under development -- is orders of
>> magnitude richer than the OA database for which it is intended.
>> Figure out a way to fill all the world's university IRs with 100% of
>> their annual article output, and the rest is a piece of cake.
>> Keep fussing about the dessert when there's still no main course, and
>> you have a recipe for prolonging the hunger of your esteemed guests
>> even longer than they've already endured it (for over a decade and a
>> half to date).
>> (The way is already figured out, by the way: it's the institutional
>> Green OA Self-Archiving Mandate. What still needs effort is getting 
>> universities to go ahead and adopt them, instead of waiting 
>> while fussing instead about preservation, copyright, publishing 
>> -- and improved search engines!)
>> Stevan Harnad
>> If you have adopted or plan to adopt a policy of providing Open Access
>> to your own research article output, please describe your policy 
>>    BOAI-1 ("Green"): Publish your article in a suitable toll-access journal
>> OR
>>    BOAI-2 ("Gold"): Publish your article in an open-access journal if
>>    a suitable one exists.
>> AND
>>    in BOTH cases self-archive a supplementary version of your article
>>    in your own institutional repository.
> -- 
> "David E. Wojick, PhD" <WojickD at>
> Senior Consultant for Innovation
> Office of Scientific and Technical Information
> US Department of Energy
> 391 Flickertail Lane, Star Tannery, VA 22654 USA
> 540-858-3136
> provides my bio and 
past client list.
presents some of my own research on information structure and dynamics.
