A new metrics-related book focused on academic search engines
David Wojick
dwojick at CRAIGELLACHIE.US
Fri Oct 10 10:41:22 EDT 2014
Dear Stephen,
In what sense are links semantic? I do not understand the concept. I think
of semantics as the science of the meaning of words.
David
At 10:37 AM 10/10/2014, you wrote:
>Administrative info for SIGMETRICS (for example unsubscribe):
>http://web.utk.edu/~gwhitney/sigmetrics.html
>Jeroen,
>This is a revolution with deep roots. Garfield laid out the main premise
>of the Google search engine in an article he published in Science in 1955
>on citation indexing. It is an accelerating revolution that now is
>reaching warp speed.
>
>The main reason Google delivers more relevant sets than Microsoft is that
>it semantically works by links and not words. This enables it to take
>advantage of the power-law linkage structure of the WWW to zero in on the
>most important and relevant documents.
>
>I wish to hell that arXiv would finally post our working paper, where we
>prove all this with economics Nobelists. Then I can vet our theories.
>
>Respectfully,
>
>Stephen J Bensman, Ph.D
>LSU Libraries
>Louisiana State University
>Baton Rouge, LA 70803
>
>PS I am a historian by training, and there is nothing that is outdated for
>me. Older, highly cited stuff is of the greatest interest, for we may be
>looking at the influence of time and the degree of incorporation.
>
>From: ASIS&T Special Interest Group on Metrics
>[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Bosman, J.M. (Jeroen)
>Sent: Thursday, October 09, 2014 4:41 PM
>To: SIGMETRICS at LISTSERV.UTK.EDU
>Subject: Re: [SIGMETRICS] A new metrics-related book focused on academic
>search engines
>
>Stephen,
>
>Thanks for your insightful elaboration. The ideas stem from about 1935
>(Otlet), 1945 (Bush) and 1955 (Garfield), the implementation from the
>early sixties in SCI, further ideas in 1976 (Narin) and 1989 (Berners-Lee)
>and Google elaborated on that in 1996 with PageRank and a hybrid. So I
>doubt that the revolution takes just a decade. It already has taken some
>decades and will take some more decades, for the change is not restricted
>to discovery but includes distribution as well, just as with the printing
>press and scholarly journal. So probably the 'revolution' will only be
>complete when at some point in the future the academic book, journal and
>paper are replaced by instant production/publication/discovery, for
>instance in a smart nanopublications type of way? Also I think that for
>the system to collapse Google Scholar is not a conditio sine qua non.
>ArXiv (1991) and Citeseer (1998) are way older than GS and together they
>have revolutionized search and distribution more than GS has done, albeit
>in a much more restricted field of physics and information science.
>
>On a less theoretical note, you say that MAS has been proven wrong and
>Google Scholar may be right. But every other day I have to tell my
>students that in order to get relevant stuff they need to use GS pubyear
>filters, because if they don't they will end up using highly cited but
>outdated stuff. Over 95% of my students (>500 each year) had never
>realised this! By the way, I am not saying that MAS does a better job in
>this respect and I am a fan of Google Scholar.
>
>Best,
>Jeroen Bosman
>@jeroenbosman
>
>On 9 Oct 2014, at 22:27, "Stephen J Bensman"
><notsjb at LSU.EDU> wrote:
>
>Jeroen
>Here is a summary of what I think that we are involved in with academic
>search engines:
>
>Academic search engines are an extremely complex topic, since we are now
>engaged in an information revolution on the same scale as the invention of
>the printing press in the 15th century and the scientific journal in the
>17th century, except what was accomplished took centuries then, and we
>will do it in a decade or so now. One facet of this information
>revolution is that what was once semantically defined by words is now
>semantically defined by linkages. On top of it, this information
>revolution is entwined with a scientific revolution on the power-law
>distributional structure of nature and society that was launched as a
>result of the development of the World Wide Web.
>
>Given the complexity of this thing, we need some sort of standardization,
>so we can better deal with it. There has to be some sort of agreement on
>what is right and what is wrong. MAS seems to be based on a system (number
>of word tokens in a given document) that was proven wrong and ineffective in
>semantically defining relevant document sets. For me it is very hard to
>grasp that a Googlebot crawled out of a garage in Palo Alto in 2004, and
>suddenly an entire system began to collapse and be replaced by something
>else. This took less than 10 years. The Chinese have a curse about
>living in interesting times, and our times are sure interesting in this sense.
>
>Respectfully,
>
>Stephen J Bensman
>LSU Libraries
>Louisiana State University
>Baton Rouge, LA 70803
>USA
>
>
>
>
>From: ASIS&T Special Interest Group on Metrics
>[mailto:SIGMETRICS at LISTSERV.UTK.EDU]
>On Behalf Of Bosman, J.M. (Jeroen)
>Sent: Thursday, October 09, 2014 2:40 PM
>To: SIGMETRICS at LISTSERV.UTK.EDU
>Subject: Re: [SIGMETRICS] A new metrics-related book focused on academic
>search engines
>
>Isidro, Stephen, Enrique,
>
>Thanks. I already downloaded the book and started reading. However, I do
>not applaud the fact that MAS is coming to a standstill. I think it offers
>some very nice options and even unique things (AFAIK) such as the citation
>contexts. I also do not understand why it is necessary to have a single
>standard in order to be able to assess how the WWW revolutionizes the
>scholarly information system. Stephen, could you elaborate on why you
>think that is necessary? Could that assessment not include various
>parallel lines of development of these systems? And perhaps we already
>need an addendum to the book with today's news of the launch of Paperity.
>
>Best,
>Jeroen
>
>
>
>
>
>On 9 Oct 2014, at 18:23, "Stephen J Bensman"
><notsjb at LSU.EDU> wrote:
>Enrique,
>Thank you for this information. It simplifies matters. At least MAS no
>longer needs to be taken into account, and we can focus on Google
>Scholar. If we are going to make assessments on how the WWW is
>revolutionizing the scientific/scholarly information system, we have to
>have a single standard, and that is Google. The problems are complex
>enough without the need to compare competitive systems. Life was better
>and easier when the SCI was the single standard just as it was when peer
>ratings were the only standard.
>
>SB.
>
>
>
>From: ASIS&T Special Interest Group on Metrics
>[mailto:SIGMETRICS at LISTSERV.UTK.EDU]
>On Behalf Of Enrique Orduña
>Sent: Thursday, October 09, 2014 9:47 AM
>To: SIGMETRICS at LISTSERV.UTK.EDU
>Subject: Re: [SIGMETRICS] A new metrics-related book focused on academic
>search engines
>
>Dear friends,
>
>Interesting issues all of them. And of course I already purchased a copy
>of Ortega's book :)
>
>As regards Microsoft Academic Search, and PoP software, we must take into
>account that MAS is completely outdated. This issue is detected by Ortega
>in his book. Moreover, it was published by the EC3 Research group by means of a
>working paper a few months ago. A more in-depth analysis has been performed,
>which has been recently accepted for publication, where we study this drop
>of coverage according to disciplines, universities and journals.
>
>Therefore, MAS cannot be used now for quantitative purposes. Additionally,
>the MAS API does not work properly with queries that return hit count
>estimates surpassing 1,000 results. We can also add the sometimes
>unknown legal considerations in the reuse of Bing results due to Microsoft
>copyright.
>
>Finally, some official voices from Microsoft announced that MAS results
>will be integrated into Bing results, in an ongoing process.
>
>As regards Google Scholar, as Isidro said, the "site" command may be used both
>in Google and Google Scholar. But be careful, because search commands are
>changing in Scholar. For example, the combination of "site" and "filetype"
>stopped working. In any case, the site command in Google and Bing sometimes
>gets us unexpected results in terms of coverage.
>
>Best,
>
>Enrique
>
>On Thu, Oct 9, 2014 at 4:32 PM, Stephen J Bensman
><notsjb at lsu.edu> wrote:
>
>Isidro,
>Thanks for the information. I am looking forward to hearing from
>Jose. He and I are already in close contact on these matters. I
>definitely want you two to vet the paper we have done. It should be ready
>soon. I screwed up in posting it on arXiv, and it may take a while to
>correct my stupidity of submitting the damn thing multiple times, because
>I did not know what I was doing.
>
>You have already answered one of my questions. The former Yahoo research
>engine was based upon AltaVista, which defined documentary sets by
>words. It was this system that Page tested and rejected as delivering
>incoherent, irrelevant sets. Instead Page incorporated Garfield's theory
>of citation indexing, which defines relevant sets by linkages. He
>strengthened this by also incorporating Narin's influential method. Doing
>this delivered clearer, more relevant sets than AltaVista. Multiple
>linkages are better at semantically defining sets than multiple token
>words. If your book presents these facts, then I can strangle Microsoft
>Academic in its cradle, as Churchill once said of a certain political
>system that now seems to have come back into vogue.
>
>I hope to get the book and hear from Jose.
>
>Respectfully,
>
>Stephen J Bensman, Ph.D
>LSU Libraries
>Louisiana State University
>Baton Rouge, LA 70803
>USA
>
>
>
>-----Original Message-----
>From: ASIS&T Special Interest Group on Metrics
>[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Isidro F. Aguillo
>Sent: Thursday, October 09, 2014 9:07 AM
>To: SIGMETRICS at LISTSERV.UTK.EDU
>Subject: Re: [SIGMETRICS] A new metrics-related book focused on academic
>search engines
>
>Dear Stephen,
>
>Ooops!
>
>Sorry, I am not the author of the book. It was written by my collaborator
>and friend José Luis Ortega, also in this forum, so you can expect an
>answer from him soon.
>
>But I can give a few hints to some of your questions. Bing is using the
>technology of the former Yahoo search engine. I do not know exactly the
>way Bing works, but my feeling is they are using visits as the main criterion.
>Probably there are far more variables involved, but the number of visits plays
>a similar role to links in Google's PageRank. Of course, it is also
>possible that links are taken into account.
>
>Microsoft Academic Search is a completely different animal. Really it is a
>traditional bibliographic database, but I must recognize that although
>they are using h-index, I am unable to understand the rankings they
>publish. To my knowledge, MAS and Bing are completely independent
>products. On the contrary, Google and Google Scholar are closely interlinked.
>
>Regarding web indicators I use number of webpages under different levels
>of web addresses, like for example number of webpages in the webservers of
>your university:
>
>site:lsu.edu
>
>This syntax is valid for Google, Bing and even Google Scholar.
>
>Best regards,
>
>
>
>On 09/10/2014 15:36, Stephen J Bensman wrote:
> >
> > Isidro,
> > Thanks for writing this book-- Academic Search Engines: A Quantitative
> Outlook. I am having LSU Libraries buy a copy of it, so you have sold at
> least one. I hope that you have discussed the differences between how
> the Google and Microsoft search engines operate. I understand how
> PageRank operates, but I do not understand how Bing operates. All I know
> is that you obtain much better results with Google than with Microsoft,
> which seems to be quite new. I have tested them both.
> >
> > For your information, Harzing has now interfaced her PoP program with
> Microsoft Academic as well as Google Scholar. Now you can really run
> comparative tests between Google and Microsoft. You seem to get better
> results with her PoP than with the Microsoft Academic site itself. At
> least her rankings are much better, although it is quite obvious from her
> program that Microsoft coverage is much weaker.
> >
> > As a matter of curiosity, what metric did you use to measure the
> quantitative aspects? You cannot use standard bibliographic
> classifications such as number of books, journals, journal articles,
> working papers, etc. etc., because I do not think that either Google or
> Microsoft can identify these. The Web has no authority structure
> whatever. You are not dealing with OCLC WorldCat. It must be something
> like megabytes of data or something like that.
> >
> > We are finishing a paper on how Google Scholar operates. I'd like you
> to vet it when we have it ready.
> >
> > Respectfully,
> >
> > Stephen J Bensman, Ph.D.
> > LSU Libraries
> > Louisiana State University
> > Baton Rouge, LA 70803
> > USA
> >
> >
> > -----Original Message-----
> > From: ASIS&T Special Interest Group on Metrics
> > [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Isidro F. Aguillo
> > Sent: Wednesday, October 08, 2014 6:27 AM
> > To: SIGMETRICS at LISTSERV.UTK.EDU
> > Subject: [SIGMETRICS] A new metrics-related book focused on academic
> > search engines
> >
> > José Luis Ortega. Academic Search Engines: A Quantitative Outlook.
> > Elsevier, 2014. Chandos Information Professional Series ISBN
> > 1780634722, 9781780634722
> >
> >
> > http://store.elsevier.com/Academic-Search-Engines/Jose-Luis-Ortega/isbn-9781843347910/
> >
> >
> > Academic Search Engines intends to run through the current panorama of
> the academic search engines through a quantitative approach that analyses
> the reliability and consistency of these services. The objective is to
> describe the main characteristics of these engines, to highlight their
> advantages and drawbacks, and to discuss the implications of these new
> products in the future of scientific communication and their impact on
> the research measurement and evaluation. In short, Academic Search
> Engines presents a summary view of the new challenges that the Web set to
> the scientific activity through the most novel and innovative searching
> services available on the Web.
> >
> > Key Features:
> > · This is the first approach to analyze search engines exclusively
> addressed to the research community in an integrative handbook.
> > · This book is not merely a description of the web functionalities of
> these services; it is a scientific review of the most outstanding
> characteristics of each platform, discussing their significance with
> recent investigations.
> > · This book introduces an original methodology based on a quantitative
> analysis of the covered data through the extensive use of crawlers and
> harvesters which allow going in depth into how these engines are working.
> >
> > José Luis Ortega (CCHS-CSIC) is a web researcher in the Spanish
> National Research Council (CSIC). He achieved a fellowship in the
> Cybermetrics Lab of the CSIC, where he finished his doctoral studies
> (2003-8). In 2005, he was employed by the Virtual Knowledge Studio of the
> Royal Netherlands Academy of Sciences and Arts, and in 2008 he took up a
> position as information scientist in the CSIC. He now continues his
> collaboration with the Cybermetrics Lab in research areas such as
> webometrics, web usage mining, visualization of information, academic
> search engines and social networks for scientists.
> >
>
>
>--
>
>************************************
>Isidro F. Aguillo, HonDr.
>The Cybermetrics Lab, IPP-CSIC
>Grupo Scimago
>Madrid. SPAIN
>
>isidro.aguillo at csic.es
>ORCID 0000-0001-8927-4873
>ResearcherID: A-7280-2008
>Scholar Citations SaCSbeoAAAAJ
>Twitter @isidroaguillo
>Rankings Web webometrics.info
>************************************
>
>