Interoperability - subject classification/terminology

Stevan Harnad harnad at ECS.SOTON.AC.UK
Tue Nov 18 12:48:59 EST 2003


On Tue, 18 Nov 2003, Franklin, Rosemary (franklra) wrote:

> The only quibble with your bet is that humanities scholars/researchers often
> work in the realm of abstract (soft) ideas and arguments which are not so
> easily searched and retrieved, while the sciences are concrete (hard) with
> data and vocabulary more easily discovered.  How do you search nuances?

I don't know of any evidence that inverted full-text boolean search
is any less effective in one field than another. (Does anyone have any
such evidence?)
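
To make the term concrete, here is a minimal sketch of boolean search
over an inverted full-text index. It is an illustration under
assumptions, not a description of any particular OAI search service: the
three-document corpus, the document ids, and the function names
(tokenize, build_inverted_index, search_and, search_or, search_not) are
all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of ids of documents whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_and(index, *terms):
    """Ids of documents that contain every one of the terms."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, *terms):
    """Ids of documents that contain at least one of the terms."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


def search_not(index, all_ids, term):
    """Ids of documents that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())


# Hypothetical toy corpus: two humanities titles and one science title.
docs = {
    "h1": "Nuance and irony in the late novels of Henry James",
    "h2": "The ethics of argument: nuance in Kantian moral philosophy",
    "s1": "Kinetics of protein folding: data from stopped-flow experiments",
}
index = build_inverted_index(docs)

print(sorted(search_and(index, "nuance", "irony")))    # ['h1']
print(sorted(search_or(index, "nuance", "kinetics")))  # ['h1', 'h2', 's1']
print(sorted(search_not(index, docs, "nuance")))       # ['s1']
```

The index treats every text the same way, whatever its field; whether the
results are equally useful across fields is the empirical question raised
above.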

Stevan Harnad

> -----Original Message-----
> From: Stevan Harnad [mailto:harnad at ecs.soton.ac.uk]
> Sent: Friday, November 14, 2003 12:07 PM
> To: BOAI Forum
> Cc: september98-forum at amsci-forum.amsci.org
> Subject: [BOAI] Re: Interoperability - subject classification/terminology
>
>
> On Thu, 13 Nov 2003, Franklin, Rosemary (franklra) wrote:
>
> > Generally you are searching in natural language, depending on the fields
> > tagged and how the file is organized.  Portals such as the HUMBUL site and
> > others organized around broad subject areas are value-added OAI searching
> > and have controlled vocabulary added, or they are in the process of
> > adding.
>
> I would like to make a bet about values that will prove to be worth and not
> worth adding to a full-text corpus of refereed research journal articles.
> (Note that this bet pertains *only* to the refereed journal article corpus,
> but that does include all disciplines, including the humanities):
>
> Until and unless XML tagging of the full-texts themselves prevails -- a
> desirable outcome that is largely independent of the urgent goal of open
> access -- nothing will come even close to matching (let alone beating)
> the power of boolean search over the inverted full-texts, google-style
> (but restricted to the OAI-compliant domain).
>
> Please remember that most researchers currently search their abstracts
> databases and their toll-access journal content databases without the help
> of any subject classification taxonomies. This will continue to be the case
> for the open-access full-text database, once it grows to a significant
> size. Journal articles --
> especially when they include inverted full-text -- are not, and never
> were, searched via prepackaged subject classifications or taxonomies
> or aggregations. And even those taxonomies and aggregations that exist
> were generated by machine analysis of the database rather than by human
> classification. (In other words, they were generated by "semantic-web"
> -- i.e., syntactic-web! -- computations on the full-text database.)
>
> See Subject Thread:
>     "Interoperability - subject classification/terminology"
>     http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2384.html
>
> I know that especially in the humanities, many scholars and librarians are
> betting otherwise. It will be interesting to see what the outcome turns out
> to be.
>
> But let it be stressed again: This has nothing to do with open access,
> except inasmuch as it is extremely important not to hold back open access
> for even one microsecond in order to wait for classification/taxonomy
> values to be added -- any more than open access should be delayed in any
> way to wait for preservation values to be added.
>
> The intuitive point to keep in mind is that we are talking about OAI
> eprint space, not google space. Needle/haystack problems in google space
> vanish when it is contracted to just the OAI eprint subspace. OAI eprint
> space consists of the yearly 2,500,000 articles in the planet's 24,000
> peer-reviewed journals in all fields and languages, before (preprints) and
> after peer review (postprints).
>
> http://www.eprints.org/self-faq/#What-is-Eprint
>
> Stevan Harnad
>
> NOTE: Complete archive of the ongoing discussion of providing open
> access to the peer-reviewed research literature online is available at
> the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):
>     http://amsci-forum.amsci.org/archives/september98-forum.html
>     http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/index.html
>     Posted discussion to: september98-forum at amsci-forum.amsci.org
>
> Dual Open-Access Strategy:
>     BOAI-2 ("gold"): Publish your article in a suitable open-access
>             journal whenever one exists.
>     BOAI-1 ("green"): Otherwise, publish your article in a suitable
>             toll-access journal and also self-archive it.
>     http://www.soros.org/openaccess/read.shtml
>     http://www.ecs.soton.ac.uk/~harnad/Temp/berlin.htm
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0026.gif
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0021.gif
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0024.gif
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0028.gif
>



More information about the SIGMETRICS mailing list