Coming Attractions: Garfield, Narin, & PageRank

Stephen J Bensman notsjb at LSU.EDU
Fri Dec 13 12:56:11 EST 2013


Jeroen,
I have posted the paper on arXiv, and it will be ready for viewing on Monday, December 16, at 1:00 GMT.  If you are in a hurry, I can send you a PDF if you give me your individual e-mail address.  I do not want to clutter up the list's mailboxes with my musings.  You can read it and make up your own mind.  I would be very interested in your reaction.  One reason for posting on arXiv is to confront and correct issues before submitting to a journal.  That way you can at least confront your accusers instead of facing a referee with dictatorial powers.  That is like participating in Pickett's Charge.  See URL below:

http://en.wikipedia.org/wiki/Pickett's_Charge

As for my reaction, I will just say that working at the cutting edge is not pleasant.  Nobody really knows anything, including me, and everybody is opinionated as hell.

SB

From: ASIS&T Special Interest Group on Metrics [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Bosman, J.M.
Sent: Thursday, December 12, 2013 11:59 PM
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: Re: [SIGMETRICS] Coming Attractions: Garfield, Narin, & PageRank

Dear Stephen,

Although I do not quite get what you are trying to tell me in your reaction, I await your paper with interest.

Best,
Jeroen

On 12 Dec. 2013 at 21:46, "Stephen J Bensman" <notsjb at LSU.EDU> wrote:
Jeroen,
Thank you for the comment.  With any luck I should have it posted tomorrow.

I have a doctorate in history from one of the top universities in the US in this field, the University of Wisconsin at Madison (Go Badgers!).  I have gone through the historical documentation, and the evidence is overwhelming.  Moreover, the damn thing works: I got the results that would be predicted by Garfield's law of concentration and his view of the importance of review journals.  I have also done a Google Cites on myself, and it captures me quite well, pretty much replicating what ISI cites say about me.  I was amazed that Google did it in about 45 seconds.  The option is that I can make it public, or I can keep it private, but Harzing's program yields the same results.  You cannot investigate GS without her program.

Page and Brin are astute businessmen, and you surely do not think that they would waste the time and money developing a different algorithm just for GS.  If you think that, then I have a bridge that I can sell you.  They run their operation with cheap computers that you can buy off the shelf at Wal-Mart.

GS has an advantage in that it retrieves from institutional repositories, and it seems that institutional repositories are replacing journals as the main source of developing science information.  Physics has been that way for a long time.  Journals are really just archival.  All universities, including LSU, are developing institutional repositories for this reason.

GS threatens many vested interests, and I have been catching a lot of flak.  It is amazing how little understood GS is, almost deliberately so.  This is my way of throwing down the gauntlet.

Respectfully,

SB


From: ASIS&T Special Interest Group on Metrics [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Bosman, J.M.
Sent: Thursday, December 12, 2013 1:56 PM
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: Re: [SIGMETRICS] Coming Attractions: Garfield, Narin, & PageRank

I am looking forward to your paper, but I wonder how much really is known about ranking in Scholar. Google states about it "Google Scholar aims to rank documents the way researchers do, weighing the full text of each document, where it was published, who it was written by, as well as how often and how recently it has been cited in other scholarly literature". There is so much vagueness here.
First, the bulk of GS search results do not have full text; they are 'citation' records parsed from the reference lists of primary indexed papers. Second, it says that Google weighs who wrote an article, but how is that measured? Citations indeed play an important role that cannot be switched off, and that results in rankings that for most searches give you old stuff. The majority of GS users do not realize that and thus often miss out on the latest research findings.
I also wonder whether Google uses the same kind of PageRank in GS but then with citation numbers. Or do they also take into account weblinks to the various versions of papers? If so, how are these link numbers for various versions added up or corrected for? And if Google uses PageRank based on weblinks to papers combined with PageRank-type relevance based on citations, how does the resulting hybrid ranking work? Of course that is a company secret they won't share, but I wonder if there is anybody who has deduced how things really work behind the scenes of GS? I will add anything I learn here to our GS guide at http://libguides.library.uu.nl/googlescholar_en

Best,
Jeroen
---------
Jeroen Bosman
Utrecht University Library



On 12 Dec. 2013 at 16:29, "Stephen J Bensman" <notsjb at LSU.EDU> wrote:
I will soon be posting on arXiv an article entitled "Eugene Garfield, Francis Narin, and PageRank:  the Theoretical Bases of the Google Search Engine."  Below is its abstract:

Abstract
This paper presents a test of the validity of using Google Scholar (GS) to evaluate the publications of researchers.  It does this by first comparing the theoretical premises on which the GS search engine PageRank algorithm operates to those on which Garfield based his theory of citation indexing.  It finds that the basic premise is the same, i.e., that subject sets of relevant documents are defined semantically better by linkages than by words.  Google incorporated this premise into PageRank, amending it with the addition of the citation influence method developed by Francis Narin and the staff of Computer Horizons, Inc. (CHI).  This method weighted more heavily citations from documents which themselves were more heavily cited.  Garfield himself essentially had also incorporated this method into his theory of citation indexing by restricting as far as possible the coverage of the Science Citation Index (SCI) to a small multidisciplinary core of journals most heavily cited.  Stealing a page from Garfield's book, the paper presents a test of the validity of GS by tracing its citations to the h-index works of 5 Nobel laureates in chemistry (the discipline in which Garfield began his pioneering research) with Anne-Wil Harzing's revolutionary Publish-or-Perish (PoP) software that has established bibliographic and statistical control over the GS database.  Most of these works were journal articles, and the rankings of the journals in which they appeared by both total cites (TC) and impact factor (IF) at the time of their publication were analyzed.  The results conformed to the findings of Garfield through citation analysis, confirming his law of concentration and view of the importance of review articles.  As a byproduct of this finding, it is shown that Narin had totally misunderstood and mishandled citations from review journals.
The evidence of this paper is conclusive:  Garfield's theory of citation indexing and PageRank validate each other, and Eugene Garfield is the grandfather of the Web search engine.
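
To make concrete the idea in the abstract that citations from heavily cited documents are weighted more heavily, here is a minimal sketch in Python of a PageRank-style power iteration over a toy citation graph. It is an illustration only, not Google's or CHI's actual implementation, which the thread itself notes is not public. The function name citation_influence, the damping value 0.85, the iteration count, and the six-document example graph are all assumptions made for this sketch.

```python
def citation_influence(citations, damping=0.85, iterations=100):
    """Score documents so that a citation counts for more when the
    citing document is itself highly scored.

    `citations` maps each document to the list of documents it cites.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every document that cites or is cited.
    docs = sorted(set(citations) | {d for cited in citations.values() for d in cited})
    n = len(docs)
    score = {d: 1.0 / n for d in docs}

    for _ in range(iterations):
        # Every document receives a small baseline share.
        new_score = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            cited = citations.get(d, [])
            if cited:
                # A citing document splits its score among the works it cites.
                share = damping * score[d] / len(cited)
                for c in cited:
                    new_score[c] += share
            else:
                # A document that cites nothing spreads its score evenly,
                # so the scores keep summing to 1.
                share = damping * score[d] / n
                for c in docs:
                    new_score[c] += share
        score = new_score
    return score


# Toy graph: D and F are each cited exactly once, but D's citer (C) is
# itself cited twice, while F's citer (E) is never cited.
example_citations = {
    "A": ["C"],
    "B": ["C"],
    "C": ["D"],
    "E": ["F"],
}

scores = citation_influence(example_citations)
for doc in sorted(scores, key=scores.get, reverse=True):
    print(doc, round(scores[doc], 4))
```

In this toy graph D and F have the same raw citation count of one, yet D ends up with the higher score because its single citation comes from a document that is itself cited. A plain citation count cannot make that distinction.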

I will post this article as soon as my wife finishes her proofreading and copyediting.  I will inform you when it has been posted, but I wanted to get out as soon as possible the basic findings of the paper.

Respectfully,


Stephen J Bensman, Ph.D.
LSU Libraries
Louisiana State University
Baton Rouge, LA 70803
USA


