Open access?

Fil Menczer fil at INDIANA.EDU
Thu Apr 12 00:27:38 EDT 2012


On Wed, Apr 11, 2012 at 1:53 PM, Stephen J Bensman <notsjb at lsu.edu> wrote:
> Filippo,
> Three questions back to you.  First, if I understand you correctly, then Google has a contract with Elsevier that allows its spiders access to the Elsevier publication database.  If so, then I not only stand corrected but can rely more on the accuracy of Google Scholar data.  In general, my research has shown that Google Scholar totally validates the findings of Eugene Garfield on scientific journals in his Law of Concentration and the importance of review journals.  Google Scholar and WoS validate each other, which I find comforting.  However, GS retrieves data in a fashion that provides insights that neither WoS nor Scopus can provide, because it retrieves further down the authorship structure and from places these do not visit.  If GS can fully access Elsevier journals, then why is that fearsome "web crawlers verboten" sign posted on the Elsevier SciVerse Web site?

Because they do not want just anyone to obtain their data by crawling
and scraping their website. As they write, "customers or commercial
entities are not allowed to "deep index" Elsevier Internet files
except on a contractual basis where indexing rights have been
defined". So the policy is directed at entities that, unlike Google
Scholar, do not have a contract allowing them access.

> Second, what the hell is an "API."

An Application Programming Interface is a way to obtain data from a
web service, programmatically. See:
http://en.wikipedia.org/wiki/Application_programming_interface
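
As a rough illustration of what "programmatically" means here, the
sketch below builds a request URL, sends it to a web service, and
parses the structured reply. The endpoint, the parameter names, and
the shape of the JSON reply are all made up for the example; they do
not correspond to Google Scholar, Scopus, or Scholarometer.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only; not a real service.
BASE_URL = "https://api.example.org/v1/citations"


def build_query_url(author, max_results=10):
    """Encode the query parameters into a request URL."""
    params = {"author": author, "max_results": max_results}
    return BASE_URL + "?" + urlencode(params)


def parse_response(body):
    """Turn the JSON text returned by the service into (title, citations) pairs.

    Assumes a reply of the form {"results": [{"title": ..., "citations": ...}]}.
    """
    records = json.loads(body)["results"]
    return [(r["title"], r["citations"]) for r in records]


def fetch_citations(author):
    """Send the request and parse the reply (needs network access)."""
    with urlopen(build_query_url(author)) as reply:
        return parse_response(reply.read().decode("utf-8"))
```

The point is that a program, rather than a person typing into a search
box, issues the query and receives data in a form it can process
directly, with no need to scrape web pages.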

> Third, if Google Scholar can access the Elsevier publication database, then who in the hell needs Scopus, which costs a bundle?  LSU has better places to spend its limited money than a redundant database.

Google Scholar allows end-users to query its index but does not give
third parties the right to
crawl/scrape/download/store/duplicate/mirror/index its data. As I
mentioned, this is because of contractual obligations with publishers.
This is also why they do not provide an API (which would make it easy
for third parties and researchers to get their data), and why services
such as Scholarometer and PoP are client-based. A server-based service
would be blocked as infringing on GS terms of service. The Scholarometer
service allows end-users to share the data they get from GS with the
community, which is done through the Scholarometer API. On the other
hand, with Scopus, for instance, you pay to get access to the data.

I hope this helps. All the best,

Fil -- bit.ly/filmenczer



More information about the SIGMETRICS mailing list