Does Google Scholar contain all highly cited documents (1950-2013)?

Alberto Martín Martín albertomartin101 at GMAIL.COM
Fri Oct 31 08:02:06 EDT 2014

Dear colleagues, continuing the path our group started back in 2008 to
unveil the inner workings of Google Scholar and test its potential as a tool
for research evaluation, we are pleased to present our latest work, the
result of five months of arduous effort: a study about
the highly cited documents according to Google Scholar for the period
1950-2013.

The objective is to determine whether it is possible to accurately identify
all highly cited documents in Google Scholar. We present the top 25, as well
as the top 1% most cited documents (a total of 640), from a sample of 64,000
documents collected from Google Scholar, which are also made available to
the community in the supplementary materials.

After describing various aspects of these documents like their languages,
the file format in which they are made available, and how many are freely
accessible, we try to answer some questions that currently hang over Google
Scholar’s head like a sword of Damocles and could determine its acceptance
as a reliable tool for scientific evaluation. With such a large and pertinent
sample, the like of which has never been used in any similar studies, we
believe we can give solid answers (although admittedly not definitive) to
the questions that have been recently discussed in various scientific
forums. The questions are these:

- How many of the highly cited documents indexed by GS are also
indexed by WoS?

- Is there a correlation between the number of citations that these
highly cited documents have received in GS and the number of citations they
have received in WoS?

- How many versions of these highly cited documents has GS detected?

- Is there a correlation between the number of versions GS has
detected for these documents, and the number of citations they have received?

- Is there a correlation between the number of versions GS has
detected for these documents, and their position in the search engine
result pages?

- Is there some relation between the positions these documents occupy
in the search engine result pages, and the number of citations they have
received?

You may access the full text of this document at the following link:

Best regards,

Alberto Martín, Enrique Orduña, Juan Manuel Millán & Emilio Delgado

EC3: Evaluación de la Ciencia y de la Comunicación Científica

Universidad de Granada and Universidad Politécnica de Valencia

More information about the SIGMETRICS mailing list