[Sigmetrics] 1findr: research discovery & analytics platform

Emilio Delgado López-Cózar edelgado at ugr.es
Thu Apr 26 14:51:42 EDT 2018


 

First of all, I would like to congratulate the team behind _1FINDR_
for releasing this new product. New scientific information systems with
an open approach that make their resources available to the scientific
community are always welcome. A few days ago another system was launched
(_LENS_ https://www.lens.org), and a few weeks before that, _DIMENSIONS_
(https://app.dimensions.ai). The landscape of scientific information
systems is becoming increasingly populated. Everyone is
moving: new platforms with new features, new actors with new ideas, and
old actors trying to adapt rather than die...

In order to build a solid idea of what _1FINDR_ is and how it has been
constructed, I would like to ask some questions, since I haven't found
their answers on the website:

What is the main source of the metadata for those 89,561,005 articles?
Is it perhaps _CROSSREF_?
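
(For readers who want to check the Crossref hypothesis themselves:
Crossref's public REST API reports its total record count. A minimal
sketch in Python against Crossref's documented /works route - no API key
needed, and the comparison is only suggestive:)

    import requests

    # Ask for zero rows; the response metadata still carries the total
    # number of works Crossref indexes.
    resp = requests.get("https://api.crossref.org/works",
                        params={"rows": 0}, timeout=30)
    print(resp.json()["message"]["total-results"])

If that figure is in the neighbourhood of 89,561,005, Crossref alone
could plausibly account for the bulk of the records.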

How have peer-reviewed journals been identified?

Are all document types in these journals covered? Editorial material,
news, comments, book reviews...

How have OA versions of the documents been identified?

How have articles been categorised by subject?

To what extent has _GOOGLE SCHOLAR_ data been used to build _1FINDR_?

We think this information will help assess
exactly what 1findr offers that is not offered by other platforms.

Kind regards,

--- 

Emilio Delgado López-Cózar
Facultad de Comunicación y Documentación
Universidad de Granada
http://scholar.google.com/citations?hl=es&user=kyTHOh0AAAAJ
https://www.researchgate.net/profile/Emilio_Delgado_Lopez-Cozar
http://googlescholardigest.blogspot.com.es

Dubitando ad veritatem pervenimus (Cicero, De officiis. A. 451...)
Contra facta non argumenta
A fructibus eorum cognoscitis eos (Matthew 7:16)

On 2018-04-25 22:50, Kevin Boyack wrote:

> Éric, 
> 
> … and thanks to you for being so transparent about what you're doing!
> 
> Kevin 
> 
> FROM: SIGMETRICS ON BEHALF OF Éric Archambault
> SENT: Wednesday, April 25, 2018 9:05 AM
> TO: Anne-Wil Harzing ; sigmetrics at mail.asis.org
> SUBJECT: Re: [Sigmetrics] 1findr: research discovery & analytics
> platform
> 
> Anne-Wil, 
> 
> Thank you so much for this review. We need that kind of feedback to
> prioritize development.
> 
> Thanks a lot for the positive comments. We are happy that they reflect
> our design decisions. Now, onto the niggles (all fair points about the
> current version).

> 
> An important distinction of our system - at this stage of development -
> is that our emphasis is on scholarly/scientific/research work published
> in peer-reviewed/quality-controlled journals (e.g. we don't index trade
> journals and popular science magazines such as New Scientist - not a
> judgment on quality, many of them are stunningly good, they are just
> not the type we focus on for now). This stems from work conducted
> several years ago for the European Commission. We got a contract at
> Science-Metrix to measure the proportion of articles published in
> peer-reviewed journals that are available in OA. We discovered
> (discovery being a big term considering what follows) that 1) OA
> articles were hard to find and count (the numerator in the percentage),
> and 2) there wasn't a database that comprised all peer-reviewed
> journals (the denominator in the percentage). Consequently, we had to
> work by sampling, but hard-core bibliometricians like us at
> Science-Metrix like the idea of working on population-level
> measurement. At Science-Metrix, our bibliometric company, we have been
> using licensed bibliometric versions of the Web of Science and Scopus.
> Great tools, very high-quality data (obvious to anyone who has worked
> on big bibliographic metadata), extensive coverage and loads of
> high-quality, expensive-to-implement smart enrichment. However, when
> measuring, we noticed, as did many others, that the databases
> emphasized Western production to the detriment of the Global South and
> emerging countries, especially in Asia, and even the old Cold War foe
> in which the West lost interest after the fall of the Wall. 1findr
> addresses this - it aims to find as much OA as possible and to index
> everything peer-reviewed and academic-level published in journals. We
> aim to expand to other types of content with a rationally designed
> indexing strategy, but this is what we are obstinately focusing on for
> now.
> 
> -We are working on linking all the papers within 1findr with
> references/citations. This will create the first rationally designed
> citation network: from all peer-reviewed journals to all peer-reviewed
> journals, regardless of language, country or field of research (we
> won't get there easily or soon). We feel this is a scientifically sound
> way to measure. Conferences and books are also important, but
> currently, when we take them into account in citations, we have
> extremely non-random lumps of indexed material, and no one can say what
> the effect on measured citations is. My educated guess is that this is
> extremely biased - book coverage is extremely linguistically biased,
> and conference proceedings indexing is extremely field-biased
> (proportionately far more computer science and engineering than other
> fields). If we want to turn scientometrics into a proper science, we
> need proper measurement tools. This is the long-term direction of
> 1findr. It won't remain solely in the discovery field; it will become a
> scientifically designed tool to measure research, with clearly
> documented strengths and weaknesses.
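> 
> (To make the measurement point concrete: citation counts always depend
> on which source set supplies the citing records. A minimal sketch with
> hypothetical data - not 1findr code - counting citations only over
> edges that stay inside a fixed index:)
> 
>     # Hypothetical records: DOI -> DOIs it cites. Targets outside the
>     # index (e.g. "10.9/outside") are never counted, which is why the
>     # composition of the index shapes every metric.
>     references = {
>         "10.1/a": ["10.1/b", "10.9/outside"],
>         "10.1/b": ["10.1/a"],
>         "10.1/c": ["10.1/a", "10.1/b"],
>     }
>     counts = {doi: 0 for doi in references}
>     for cited_list in references.values():
>         for cited in cited_list:
>             if cited in counts:
>                 counts[cited] += 1
>     print(counts)  # {'10.1/a': 2, '10.1/b': 2, '10.1/c': 0}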
> 
> -We still need to improve our coverage of OA. Though we find twice as
> many freely downloadable papers in journals as Dimensions does,
> ImpactStory finds OA copies for about 8% of papers with a DOI for which
> we haven't found a copy yet (one reason we have more OA as a percentage
> of journal articles is that 1findr finds much OA for articles without
> DOIs). We are working on characterizing a sample of papers which are
> not OA on the 1findr side, but which ImpactStory finds in OA. A glimpse
> at the data reveals that some of these are false positives, but some of
> them reflect approaches used by ImpactStory that we have not yet
> implemented (Heather and Jason are smart, and we can all learn from
> them - thanks to their generosity). There are also transient problems
> we experienced while building 1findr. For example, at the moment we
> have challenges with our existing Wiley dataset, and we need to update
> our harvester for Wiley's site. It would be nice to have their
> collaboration, but they have been ignoring my emails for the last two
> months… A shame, as we're only making their papers more discoverable
> and helping users worldwide find papers for which article processing
> charges were paid. We need the cooperation of publishers to do justice
> to the wealth of their content, especially hybrid OA papers.
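> 
> (For anyone wanting to replicate this kind of comparison: ImpactStory's
> Unpaywall service exposes a public REST API keyed by DOI. A minimal
> sketch; the email parameter is required by Unpaywall, and the DOI is an
> arbitrary example:)
> 
>     import requests
> 
>     def oa_status(doi, email="you@example.org"):
>         """Ask Unpaywall whether a DOI has a known OA copy."""
>         url = f"https://api.unpaywall.org/v2/{doi}"
>         rec = requests.get(url, params={"email": email}, timeout=30).json()
>         best = rec.get("best_oa_location") or {}
>         return rec.get("is_oa", False), best.get("url")
> 
>     print(oa_status("10.1038/nature12373"))  # e.g. (True, 'https://...')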
> 
> -We know we have several papers displaying a "404". We are improving
> the oaFindr link resolver built into 1findr to reduce this. We also
> need to scan more frequently for changes (we have to be careful there,
> as we don't want to overwhelm servers; many of the servers we harvest
> from are truly slow and we want to be nice guys), and we need to
> continue to implement smarter mechanisms to avoid 404s. The transiency
> of OA is a huge challenge. We have addressed several of the issues, but
> this takes time, and our team is finite in size and, as you note, faces
> several challenges and big ambitions at the same time.
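> 
> (As an illustration of that trade-off: a harvester that re-checks
> stored OA links has to throttle itself per host. A minimal sketch with
> a fixed per-host delay - not 1findr's actual resolver:)
> 
>     import time
>     import requests
>     from urllib.parse import urlparse
> 
>     def recheck(urls, delay=5.0):
>         """HEAD-check OA links, pausing between hits to the same host."""
>         last_hit = {}  # host -> time of our previous request there
>         dead = []
>         for url in urls:
>             host = urlparse(url).netloc
>             wait = delay - (time.monotonic() - last_hit.get(host, 0.0))
>             if wait > 0:
>                 time.sleep(wait)  # be nice to slow repository servers
>             last_hit[host] = time.monotonic()
>             try:
>                 r = requests.head(url, allow_redirects=True, timeout=30)
>                 if r.status_code == 404:
>                     dead.append(url)
>             except requests.RequestException:
>                 dead.append(url)  # unreachable; recheck later
>         return dead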
> 
> -We are rewriting our "help" center. Please be aware that using no
> quotes does full stemming; using single quotes also does stemming, but
> the words need to appear in the same order in the results. Double
> quotes should be used for non-stemmed, exact matches. This is a really
> powerful way of searching.
> 
> Fuel cell = finds articles with fuel and cell(s)
> 
> 'fuel cell' = finds articles with both fuel cell and fuel cells
> 
> "fuel cell" = finds articles strictly with fuel cell (won't return
> articles containing only fuel cells)
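> 
> (For the technically curious: this three-level scheme maps naturally
> onto search engines that index both analyzed and exact versions of a
> field, as Elasticsearch does. A hypothetical sketch of the three query
> shapes - the field names are invented, and this is not 1findr's actual
> implementation:)
> 
>     # "text.stemmed" would use a stemming analyzer, "text.exact" a
>     # plain one; both names are hypothetical.
>     def build_query(q):
>         if q.startswith('"') and q.endswith('"'):
>             # exact phrase, no stemming
>             return {"match_phrase": {"text.exact": q.strip('"')}}
>         if q.startswith("'") and q.endswith("'"):
>             # stemmed, but tokens must appear in order
>             return {"match_phrase": {"text.stemmed": q.strip("'")}}
>         # stemmed bag of words, any order
>         return {"match": {"text.stemmed": q}}
> 
>     for q in ['fuel cell', "'fuel cell'", '"fuel cell"']:
>         print(build_query(q))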
> 
> Once again, thanks for the review, and apologies for the lengthy reply.
> 
> Éric
> 
> ERIC ARCHAMBAULT, PHD
> 
> CEO | Chef de la direction
> 
> C. 1.514.518.0823
> 
> eric.archambault at science-metrix.com [16]
> 
> science-metrix.com [17] & 1science.com [18]
> 
> FROM: SIGMETRICS <sigmetrics-bounces at asist.org [19]> ON BEHALF OF
> Anne-Wil Harzing
> SENT: April-24-18 5:11 PM
> TO: sigmetrics at mail.asis.org [20]
> SUBJECT: Re: [Sigmetrics] 1findr: research discovery & analytics
> platform
> 
> Dear all,
> 
> I was asked (with a very short time-frame) to comment on 1Findr for an
> article in Nature (which I am not sure has actually appeared). I was
> given temporary login details for the Advanced interface.
> 
> As "per normal" with these kinds of requests, only one of my comments
> was actually used. So I am posting all of them here in case they are of
> use to anyone (and to Eric and his team in fine-tuning the system).
> 
> ================ 
> 
> As I had a very limited amount of time to provide my comments, I tried
> out 1Findr by searching for my own name (I have about 150 publications
> including journal articles, books, book chapters, software, web
> publications and white papers) and some key terms in my own field
> (international management).
> 
> WHAT I LIKE
> 
> Simple and intuitive user interface with fast response to search
> requests, much faster than some competitor products, where the website
> can take ages to load. The flexibility of the available search options
> clearly reflects the fact that this is an offering built by people with
> a background in scientometrics.
> 
> A search for my own name showed that coverage at the author level is
> good: it finds more of my publications than both the Web of Science and
> Scopus, but fewer than Google Scholar and Microsoft Academic. It is
> approximately on par with CrossRef and Dimensions, though all three
> services (CrossRef, Dimensions and 1Findr) have unique publications
> that the others don't cover.
>

> As far as I could assess, topic searches worked well with flexible
> options to search in title, keywords and abstracts. However, I have not
> tried these in detail.
> 
> Provides a very good set of subjects for filtering searches that - for
> the disciplines I can evaluate - shows much better knowledge of
> academic disciplines and disciplinary boundaries than is reflected in
> some competitor products. I particularly like the fact that there is
> more differentiation in the Applied Sciences, the Economic and Social
> Sciences and the Arts & Humanities than in some other databases. This
> was sorely needed.
> 
> There is a quick summary of Altmetrics such as tweets, Facebook
> postings and Mendeley readers. Again, I like the fact that a simple
> presentation is used, rather than the "bells & whistles" approach with
> the flashy graphics of some other providers. This keeps the website
> snappy and provides an instant overview.
> 
> There is good access to OA versions and a "1-click" download of all
> available OA versions [for a maximum of 40 publications at once, as
> this is the upper limit of the number of records on a page]. I like the
> fact that it finds OA versions from my personal website
> (www.harzing.com [21]) as well as OA versions in university
> repositories and gold OA versions. However, it doesn't find all OA
> versions of my papers (see dislikes below).
> 
> WHAT I DISLIKE
> 
> Although I like the fact that 1Findr doesn't try to be anything and
> everything, leading to a cluttered user interface, for me the fact that
> it doesn't offer citation metrics limits its usefulness. Although I
> understand its focus is on finding literature (which is fair enough),
> many academics - rightly or wrongly - use citation scores to decide
> which articles to prioritize for downloading and reading.
> 
> The fact that it doesn't yet find all Open Access versions that Google
> Scholar and Microsoft Academic do. All my publications are available in
> OA on my website, but 1Findr does not seem to find all of them. 1Findr
> also doesn't seem to source OA versions from ResearchGate. Also,
> several OA versions resulted in a _"404. The requested resource is not
> found."_

> 
> The fact that it only seems to cover journal articles. None of my
> books, book chapters, software, white papers or web publications were
> found. Although a focus on peer-reviewed work is understandable, I
> think coverage of books and book chapters is essential, and services
> like Google Scholar, Microsoft Academic and CrossRef do cover books.
> 
> NIGGLES
> 
> There are duplicate results for quite a few of my articles, usually
> "poorer" versions (i.e. without full text/abstract/altmetric scores).
> It would be good if the duplicates could be removed and only the "best"
> version kept.
> 
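> (One simple way such deduplication is often done - a hypothetical
> sketch, not a description of 1Findr's pipeline - is to group records on
> a normalized key and keep the most complete one:)
> 
>     def dedupe(records):
>         """Keep, per (title, year), the record with the most metadata."""
>         def key(rec):
>             return (rec["title"].lower().strip(), rec.get("year"))
>         def richness(rec):
>             # crude completeness score: count of filled useful fields
>             return sum(1 for f in ("abstract", "fulltext_url", "altmetrics")
>                        if rec.get(f))
>         best = {}
>         for rec in records:
>             k = key(rec)
>             if k not in best or richness(rec) > richness(best[k]):
>                 best[k] = rec
>         return list(best.values())
> 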
> Automatic stemming of searches is awkward if you try to search for
> author names in the "general" search (as many users will do). In my
> case (Harzing), it results in hundreds of articles on the Harz
> mountains obscuring all of my output.
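> 
> (This is exactly what a standard stemmer does to the name - a quick
> demonstration with NLTK's Porter stemmer, assuming the nltk package is
> installed:)
> 
>     from nltk.stem import PorterStemmer
> 
>     stem = PorterStemmer().stem
>     # "-ing" is stripped as if it were a verb suffix, so the surname
>     # collapses onto the mountain range.
>     print(stem("harzing"), stem("harz"))  # harz harz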
> 
> Preferred search syntax should be clearer, as many users will search
> for authors with initials only (as this is what works best in other
> databases). In 1Findr this yields very few results, as there are
> "exact" matches only, whereas in other databases initial searches are
> interpreted as initial + wildcard.
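> 
> (In other words, a query like "a harzing" gets expanded roughly as
> follows - a toy illustration of initial-plus-wildcard matching, not any
> database's real code:)
> 
>     import re
> 
>     def initials_to_pattern(query):
>         """Treat single-letter tokens as name prefixes."""
>         parts = [t + r"[\w-]*" if len(t) == 1 else t
>                  for t in query.lower().split()]
>         return re.compile(" ".join(parts))
> 
>     pat = initials_to_pattern("a harzing")
>     print(bool(pat.search("anne-wil harzing")))  # True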

> 
> More generally, it needs better author disambiguation. Some of my
> articles can only be found when searching for a-w harzing, a very
> specific rendition of my name.
> 
> When exporting citations, the order seems to revert to alphabetical
> order by first author, not the order that was on the screen.
> 
> Best wishes,
> Anne-Wil
> 
> PROF. ANNE-WIL HARZING
> 
> Professor of International Management
> Middlesex University London, Business School
> 
> WEB: Harzing.com [22] - TWITTER: @awharzing [23] - GOOGLE SCHOLAR:
> Citation Profile [24]
> NEW: Latest blog post [25] - SURPRISE: Random blog post [26] - FINALLY:
> Support Publish or Perish [27]
> 
> On 24/04/2018 21:51, Bosman, J.M. (Jeroen) wrote:
> 
>> Of course there is much more to say about 1Findr. What I have seen so
>> far is that the coverage back to 1944 is very much akin to Dimensions,
>> probably because both derive the bulk of their records from Crossref.
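>> 
>> (This is checkable: Crossref's REST API supports publication-date
>> filters, so one can chart its per-year record counts against either
>> platform. A minimal sketch using Crossref's documented /works filters:)
>> 
>>     import requests
>> 
>>     def crossref_count(year):
>>         """Number of Crossref works published in a given year."""
>>         flt = f"from-pub-date:{year}-01-01,until-pub-date:{year}-12-31"
>>         r = requests.get("https://api.crossref.org/works",
>>                          params={"rows": 0, "filter": flt}, timeout=30)
>>         return r.json()["message"]["total-results"]
>> 
>>     for year in (1944, 1980, 2017):
>>         print(year, crossref_count(year))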
>> 
>> Full-text search is relatively rare among these systems. Google
>> Scholar does it. Dimensions does it on a subset. And some publisher
>> platforms support it, as do some OA aggregators.
>> 
>> Apart from these two aspects (coverage and full-text search support),
>> there are a lot of aspects and (forthcoming) 1Findr functionalities
>> that deserve scrutiny, not least the exact method of OA detection (and
>> version priority), of course.
>> 
>> Jeroen Bosman
>> 
>> Utrecht University Library
>> 
>> -------------------------
>> 
>> FROM: SIGMETRICS [sigmetrics-bounces at asist.org [13]] on behalf of
>> David Wojick [dwojick at craigellachie.us [14]]
>> SENT: Tuesday, April 24, 2018 8:59 PM
>> TO: Mark C. Wilson
>> CC: sigmetrics at mail.asis.org [15]
>> SUBJECT: Re: [Sigmetrics] 1findr: research discovery & analytics
>> platform
>> 
>> There is a joke that what is called "rapid prototyping" actually means
>> fielding the beta version. In that case, every user is a beta tester.
>> 
>> It is fast, and the filter numbers are useful in themselves. Some of
>> the hits are a bit mysterious. It may have unique metric capabilities.
>> Too bad that advanced search is not available for free.
>> 
>> David
>> 
>> At 02:34 PM 4/24/2018, Mark C. Wilson wrote:

>> 
>>> Searching for my own papers, I obtained some wrong records, and the
>>> link to arXiv was broken. It does return results very quickly, and
>>> many are useful. I am not sure whether 1science intended to use
>>> everyone in the world as beta-testers.
>>> 
>>>> On 25/04/2018, at 06:16, David Wojick <dwojick at craigellachie.us [9]> wrote:
>>>> 
>>>> It appears not to be doing full-text search, which is a significant
>>>> limitation. I did a search on "chaotic" for 2018 and got 527 hits.
>>>> Almost all had the term in the title, and almost all of the
>>>> remainder had it in the abstract. Normally with full text, those
>>>> with the term only in the body text outnumber those with it in the
>>>> title many times over, often by orders of magnitude.
>>>> 
>>>> But the scope is impressive, as is the ability to filter for OA.
>>>> 
>>>> David
>>>> 
>>>> David Wojick, Ph.D.
>>>> Formerly Senior Consultant for Innovation
>>>> DOE OSTI https://www.osti.gov/ [10]
>>>> 
>>>> At 08:00 AM 4/24/2018, you wrote:

>>>> 
>>>>> Greetings everyone,
>>>>> 
>>>>> Today, 1science announced the official launch of 1findr, its
>>>>> platform for research discovery and analytics. Indexing 90 million
>>>>> articles - of which 27 million are available in OA - it represents
>>>>> the largest curated collection worldwide of scholarly research. The
>>>>> platform aims to include all articles published in peer-reviewed
>>>>> journals, in all fields of research, in all languages and from
>>>>> every country.
>>>>> 
>>>>> Here are a few resources if you're interested in learning more:
>>>>> 
>>>>> * Access the 1findr platform: www.1findr.com [1]
>>>>> * Visit the 1findr website: www.1science.com/1findr [2]
>>>>> * Send in your questions: 1findr at 1science.com [3]
>>>>> * See the press release: www.1science.com/1findr-public-launch [4]
>>>>> 
>>>>> Sincerely,
>>>>> 
>>>>> Grégoire
>>>>> 
>>>>> Grégoire Côté
>>>>> President | Président
>>>>> Science-Metrix
>>>>> 1335, Mont-Royal E
>>>>> Montréal, QC H2J 1Y6
>>>>> Canada
>>>>> 
>>>>> T. 1.514.495.6505 x115
>>>>> T. 1.800.994.4761
>>>>> F. 1.514.495.6523
>>>>> gregoire.cote at science-metrix.com [5]
>>>>> www.science-metrix.com [6]
>>>>> 
 

Links:
------
[1] http://www.1findr.com/
[2] http://www.1science.com/1findr
[3] mailto:1findr at 1science.com
[4] http://www.1science.com/1findr-public-launch
[5] mailto:gregoire.cote at science-metrix.com
[6] http://www.science-metrix.com/
[7] mailto:SIGMETRICS at mail.asis.org
[8] http://mail.asis.org/mailman/listinfo/sigmetrics
[9] mailto:dwojick at craigellachie.us
[10] https://www.osti.gov/
[11] mailto:SIGMETRICS at mail.asis.org
[12] http://mail.asis.org/mailman/listinfo/sigmetrics
[13] mailto:sigmetrics-bounces at asist.org
[14] mailto:dwojick at craigellachie.us
[15] mailto:sigmetrics at mail.asis.org
[16] mailto:eric.archambault at science-metrix.com
[17] http://www.science-metrix.com/
[18] http://www.1science.com/
[19] mailto:sigmetrics-bounces at asist.org
[20] mailto:sigmetrics at mail.asis.org
[21] http://www.harzing.com
[22] https://harzing.com
[23] https://twitter.com/awharzing
[24] https://scholar.google.co.uk/citations?user=v0sDYGsAAAAJ
[25] https://harzing.com/blog/.latest?redirect
[26] https://harzing.com/blog/.random
[27] https://harzing.com/resources/publish-or-perish/donations