[Sigmetrics] 1findr: research discovery & analytics platform

Emilio Delgado López-Cózar edelgado at ugr.es
Fri May 4 13:43:41 EDT 2018


 

Thank you very much Éric for your very detailed response, which
undoubtedly has enabled us to form a idea of what 1findr is, what's
behind it, and what it intends to be in the future. I entirely agree
with Bianca that, after proper refining, this text would make a very
nice addition to your FAQs. I understand that the enormous amount of
work that must have gone into creating the product hasn't left you with
the necessary time to improve the information pages. 

Certainly,
transparency of what has been done and how it has been done is vital to
garner credibility, especially considering how many scientific
information discovery platforms have sprouted in the recent months, and
the platforms that may come in the future. 

I would like to comment on
some of your answers. 

 1. WHAT IS THE MAIN SOURCE OF THE METADATA TO
THOSE 89,561,005 ARTICLES? IS IT PERHAPS CROSSREF? 

 You are using
about 75 million records extracted from CrossRef. What I would like to
know is how many of de 89,561,005 articles that 1findr displays to users
come from CrossRef. I understand that that figure makes up 17% of the
450 million records that enter your pipeline. It is important to know
this, because the quality of the data that is displayed depends on the
quality of the sources used. It is not enough to know that Microsoft
Academic contains 173,763,013, publications, Scilit 109,972,957, Lens
104,653,320, and Dimensions 91,994,266. It is also necessary to know
what their sources are and how they integrate them. 

 2. HOW HAVE
PEER-REVIEWED JOURNALS BEEN IDENTIFIED? 

If we agree that a
peer-reviewed journal is a journal that sistematically subjects all the
manuscripts they receive to review by referees selected _ad hoc_ for
each article, the fact that 1findr contains "… more than 85,000
journals, and (…) has content for close to 80,000 of them" is
extraordinary. A extraordinarily high figure if we compare it to the
11,000 journals of Cabell's _curated list_
(https://www2.cabells.com/about-whitelist [40]), 20,000 in Web of
Science, 22,000 in Scopus (all of them supposedly implementing rigorous
peer-review), or the 60,000 that Ulrichs' Directory approximately
contains (although I don't trust this figure much). 

I have dedicated
part of my academic life to propose indicators of publishing quality to
journals in which peer-review played a central role
(https://goo.gl/MWzRAb [41] ); to dissect the journal selection criteria
of Web of Science (https://goo.gl/DjGMA1 [42]) and Medline
(http://goo.gl/ARyCn4 [43]), the two databases that up to now have been
more reliable about this; to evaluate thousands of Spanish journals
according to these criteria (https://doi.org/10.1108/EUM0000000007147
[44]); to build citations index (http://ec3.ugr.es/in-recs [45]) and
directories of Spanish journals that were assessed according to their
compliance with scientific publication standards
(http://epuc.cchs.csic.es/resh [46]), and I must say that it is an
arduous, complex, and problematic task methodologically speaking. To
begin with, because it is based on formal declarations, and not on the
verification of editorial practices. One thing is to declare what you
do, and another thing entirely, to do what you say you do. Therefore,
there is fundamental problem of reliability. Secondly, because editorial
practices change over time. A non-peer-reviewed journal may start to
apply peer-review at any moment. What do we do with the articles they
published before applying peer-review? Thirdly, because peer-review has
not been a widespread standard since always in all scientific
disciplines. Although it has been usually used in the natural and
experimental sciences since the 1960s, it was an uncommon practice in
the social sciences and humanities until recent years. In some
disciplines (especially arts, law sciences), the very pillars of the
system are questioned: judging the scientific merit of a work in
disciplines whose cognitive nature is based on value judgements (ideas,
concepts)? In Spain, where approximately 2-3% of the world population of
scientific journals is produced, peer-review was an exotic rarity in
1990s, except in journals in experimental sciences, where there was
peer-review, but even there, it was not implemented rigorously.


Therefore, allow me to doubt that the 89M scientific publications
offered by _1findr_ are peer-reviewed documents. 

I believe that what
is urgent today is to avoid that predatory journals (those that declare
that they review articles in 24, 48, 72 hours, or not more than 2 weeks)
enter these information systems. When Bohannon
(https://doi.org/10.1126/science.342.6154.60 [47]) uncovered these scam
operations we saw how they had proliferated, and what's worse, how some
journal directories (DOAJ) had fallen for it and contained journals of
questionable reputation. Using white and black lists that serve as a
filter and a tool to inform authors of what journals are trustworthy is
what's really indispensable. How has 1findr approached this issue? Is it
safe to say that it doesn't contain predatory journals? 

3. ARE ALL
DOCUMENT TYPES IN THESE JOURNALS COVERED? EDITORIAL MATERIAL, NEWS,
COMMENTS, BOOK REVIEWS... 

When I asked about the document types
processed by _1findr_ it wasn't my intention to ask for a policy of
exhaustivity (cover to cover), but I wanted to know how you had
approached the issue of identifying "the types of documents considered
to be "original contributions to knowledge"(…) e.g. "articles", "notes",
and "reviews". It is important to remember that defining the typology of
a document is not an easy task, and that even Web of Science or Scopus
have not been able to solve this issue completely. There are many
discrepancies in how each of these databases defines the typology of the
publications they cover. This happens frequently with review articles
(which, by the way, don't contain "original knowledge"). There are also
abundant internal inconsistencies in the databases. 

This difficulty is
caused by the fact that the sections in which documents are organized
within journals are as diverse as the disciplines and specialties of
knowledge. There aren't two journals that are alike; there isn't a
standardization in the denominations of journal sections, which makes
this task extraordinarily difficult. As in other aspects of scientific
publishing, it is easier with journals in the natural and experimental
sciences, and more difficult in the social sciences, arts, and
humanities, where in many cases, there aren't even sections. Although it
might be possible to develop heuristics based on specific words, or the
analysis of bibliographic references, the error rate would probably be
high. 

This is what has happened with _1findr_. Its filters haven't
been able to differentiate research papers, reviews, and notes, from the
rest of the items that are included in journals: book reviews, news,
letters, editorials, or other items that are not technically scientific
documents but that are today published as individual pdfs: summaries,
indices, lists of advisory and editorial boards, information for
authors, guidelines, advertisements, lists of reviewers… see the
following (not exhaustive) list as an example: 

N DOC 

OA 

book
review 

616,154 

79,922 

editorial 

316,548 

127,442 

news


196,506 

42,687 

Letters editor 

26,867 

6,339 

information for
authors 

87,709 

32,633 

instructions for authors 

31,201 

7,327


guidelines for authors 

14,726 

5,572 

editorial board 

18,251


5,543 

advisory board 

2,845 

970 

contents 

313,435 

101,843


index issue 

19,769 

7,887 

Author Index 

5,429 

2,672 

subject
index 

5,998 

2,782 

Avertissement 

567 

200 

Masthead 

1,264


1,095 

Front Matter 

8,939 

5,424 

4. HOW HAVE ARTICLES BEEN
CATEGORIZED IN SUBJECTS? 

What would have been really innovative would
have been to introduce an article-level subject classification. The easy
path and what traditional databases have been doing is to extend a
journal-level clasiffication to the articles published in those
journals… We look forward to see how you tackle this fundamental as well
as delicate issue in bibliometrics… 

 5. WHAT DOES 1FINDR OFFER THAT IS
NOT OFFERED BY OTHER PRODUCTS? 

I disagree with the claim that "What
really distinguishes 1findr from all other sources of data we know of is
that we really care about global research". 

There are already
products, some with a large user base, that already serve that function.
Or do not Google Scholar/Google, Microsoft Academic/Bing, Dimensions,
Lens…or even social networks such as ResearchGate and Academia.edu
already offer comprehensive coverage and access to free versions of the
documents? 

Academic search engines rely on comprehensive citation
graphs and full text indexing for their relevance ranking algorithms.
1findr is surprisingly lacking in this respect, considering the product
was developed by bibliometricians. Thus, academic search engines provide
users with the ability to find relevant scientific information of any
type, and in many cases, they also offer free access to the documents
(https://osf.io/preprints/socarxiv/k54uv/ [48]). I think that in order
to become attractive to most users, _1findr_ should offer a great deal
more value that it currently offers. A search engine that relies only on
basic metadata, even if it has millions of links to free full texts, is
not enough. 

I agree with your position of indexing the knowledge
produced in the south and east, which is mainly ignored by mainstream
indexed. Anglo-centrism kills scientific diversity (which is now more
necessary than ever), especially in the social sciences and humanities.
Science is carried out across all the globe and manifests in multiple
forms… 

By the way, 1findr only displays a maximum of 10 pages of
results with 10 results each: 100 documents for any given search. The
also meager 1000 results Google Scholar displays seem huge by
comparison… I think it would be an error for _1findr_ to restrict the
number of search results displayed for a query. One of the main
criticisms we always mention in regard to Google Scholar is the lack of
a public API, which other systems offer (like Mendeley), even if it is
not completely free (like the one Microsoft Academic). Do you plan to
offer an API? It would make sense in a platform where OA is such an
important aspect. 

In any case, long live competition… and I would like
to extend my congratulations for the product again. Just the fact that
you have built it is commendable in itself. 

best regards 

---
Emilio
Delgado López-Cózar
Facultad de Comunicación y Documentación
Universidad
de
Granada
http://scholar.google.com/citations?hl=es&user=kyTHOh0AAAAJ
https://www.researchgate.net/profile/Emilio_Delgado_Lopez-Cozar
http://googlescholardigest.blogspot.com.es

Dubitando
ad veritatem pervenimus (Cicerón, De officiis. A. 451...)
Contra facta
non argumenta
A fructibus eorum cognoscitis eos (San Mateo 7, 16)

El
2018-05-01 00:53, Éric Archambault escribió: 

> Thank you very much
Emilio. Please find out answers to your questions: 
> 
> * WHAT IS THE
MAIN SOURCE OF THE METADATA TO THOSE 89,561,005 ARTICLES? IS IT PERHAPS
_CROSSREF_?
> 
> Yes, we do use Crossref, which is one of the best
sources of data around. 1findr is built with the help of an
enrichment/clustering/mashup/filtering data processing pipeline. There
are currently 75 million records from Crossref out of the total 450
million records entering the pipeline (17%). This doesn't mean that 17%
of the records are straight from Crossref, it means that Crossref
currently represents 17% of the ingredients used to produce 1findr. In
the next few months, we'll add content of BASE [29]
https://www.base-search.net/ [30] and CORE [31] https://core.ac.uk/ [32]
as these organizations have accepted to share the fruits of their
labour. This will certainly help to fill gaps in 1findr and further
increase quality, help us produce complete records, and generally
increase depth and breadth. Please encourage BASE and CORE; they are
providing an extremely useful public service. We are examining the best
way to add these sources to our datastore, which will then increase to
close to 700 million bibliographic records. We think 1findr will then be
able to add 5-10 million records we may not have yet, and using these
sources and others we will likely surpass 100 million records this year,
which will help users be assured that they search closer to the full
population of articles published in peer-reviewed journals. 
> 
> * HOW
HAVE PEER-REVIEWED JOURNALS BEEN IDENTIFIED?
> 
> In a nutshell, through
a long learning curve and an expensive self-financed compilation
process. We have been building a list of peer-reviewed journals since
about 2002 with the first efforts being initiated at Science-Metrix when
we started the company. We pursued and intensified the efforts at
1science starting as soon as we spun off the company in 2014, and we now
use tens of methods to acquire candidate journals and we are always
adding new ones. We are constantly honing the list, adding journals,
withdrawing journals we find do not meet our criteria and for which we
have evidence of quality-reviewing avoidance. In short, the journals
included in 1findr need to be scholarly/scientific/research journals and
be peer-reviewed/refereed, which most of the time means having
references, and this is to the exclusion of trade journals and popular
science magazines. This working definition works really well in the
health and natural sciences and in most of the arts, humanities and
social sciences, and is somewhat more challenged in architecture, the
most un-typical field in academia. Currently, the white list that
contributes to building 1findr contains more than 85,000 journals, and
1findr already has content for close to 80,000 of them. This journal
list is itself curated, clustered, enriched, and filtered from a larger
dataset stored in a system containing more than 300,000 entries. We feel
we are converging on an exhaustive inventory of the contemporary active
journals, but we still have work to do to identify the whole
retrospective list of relevant journals as far back as 1665. 
> 
> * ARE
ALL DOCUMENT TYPES IN THESE JOURNALS COVERED? EDITORIAL MATERIAL, NEWS,
COMMENTS, BOOK REVIEWS...
> 
> The bibliographic database that powers
1findr is presently mostly used as a specialized discovery system for
documents published in peer-reviewed journals. However, this datastore
has been built from the ground up to evolve into a powerful bibliometric
database. As such, we have concentrated our efforts on the types of
documents considered to be "original contributions to knowledge". These
are the document types that are usually counted in bibliometric studies,
e.g. "articles", "notes", and "reviews". 1findr is positively biased
towards these. That said, for most of the journals, we have been
collecting material from cover-to-cover, but many items with no author
currently stay in the datastore, and have not made their way to 1findr
yet. We will change our clustering/filtering rules in the next few
months to include more material types, and 1findr will grow in size by
several million records as a consequence of adding more news, comments,
and similar types of documents. 
> 
> * HOW HAVE OA VERSIONS OF THE
DOCUMENTS BEEN IDENTIFIED?
> 
> Using focused harvesters, 1findr
scrutinizes the web in search of metadata sources which are likely to
correspond to scholarly publications. To reduce the amount of upstream
curation required, our system harvests only relevant metadata, which is
used to build the datastore with its 450 million metadata records. When
the system clusters documents and freely finds downloadable versions of
the papers, it takes note of this. At 1science, we use the definition of
"gratis" open access suggested by Peter Suber. This means that articles
are freely downloadable, readable, printable, but may or may not have
rights attached. For example, disembargoed gold open access (gold open
access means made available either directly or in a mediated manner by
publishers) made available through a moving pay wall/free access model
are frequently associated with residual rights, whereas green open
access (green OA means archived by a party other than a publisher or
other than a publisher's mediator - Scielo and PubMedCentral being
examples of such mediators) are more frequently without. We code OA
versions based on these definitions of green and gold. The OA colouring
scheme has nothing to do with usage rights, or with the fact that a
paper is a preprint, a postprint (author's final peer-reviewed
manuscript) or a version of record. Who makes the paper available, what
rights there are, and what version of the manuscript is made available
are three dimensions we are careful not to conflate. Most of the
operational definitions we use in 1findr find their root in the study
Science-Metrix conducted for the European Commission on the measurement
of the percentage of articles published in peer-reviewed journals. 
> 
>
http://science-metrix.com/sites/default/files/science-metrix/publications/d_1.8_sm_ec_dg-rtd_proportion_oa_1996-2013_v11p.pdf
[33] 
> 
> You can also find other reports on OA for this and more
recent projects on Science-Metrix' selected reports list: 
> 
>
http://science-metrix.com/en/publications/reports [34] 
> 
> * HOW HAVE
ARTICLES BEEN CATEGORIZED IN SUBJECTS?
> 
> To classify articles, we use
the CC BY classification created by Science-Metrix and used in its
bibliometric studies: 
> 
>
http://www.science-metrix.com/en/classification [35] 
> 
>
http://science-metrix.com/sites/default/files/science-metrix/sm_journal_classification_106_1.xls
[36] 
> 
> This classification is available in more than 20 languages,
and we are currently working on version 2.0. For the time being, 1findr
uses the Science-Metrix classification to perform a journal-level
classification of articles, but stay tuned for article-level
classification of articles. 
> 
> * TO WHAT EXTENT HAS _GOOGLE SCHOLAR_
DATA BEEN USED TO BUILD _1FINDR_?
> 
> We have used Google Scholar for
tactical purposes, to do cross-checks and for benchmarking. We do not
scrape Google Scholar or use Google Scholar metadata. There are
vestigial traces of Google Scholar in our system and between 1.8% and
4.4% of the hyperlinks to gratis OA papers which are used in 1findr
could come from that source. These are progressively being replaced with
refreshed links secured from other sources. 
> 
> What really
distinguishes 1findr from all other sources of data we know of is that
we really care about global research. We haven't seen anyone else doing
as much work as we've done to make accessible the extraordinarily
interesting activity that can be found in the long tail of science and
academia. Just like most researchers, we care to have access to the
material from the top tier publishers and we're really open to working
with them to make their articles more discoverable and more useful for
them and for the whole world. But we do not focus solely on the top
tiers. The focus of 1science is on the big picture in the scholarly
publishing world. Our research in the last 10 years has revealed that
thousands of journals have emerged with the global transition to open
access, and there are thousands of journals in the eastern part of the
world and the Global South that were traditionally ignored and saw their
journals unfairly being shunned by the mainstream indexes. We are not
creating a product that isolates Eastern or Southern contents from a
core package centered on the West. There is 1 science, and it should be
conveniently accessible in 1 place and this is why we created 1findr. 
>

> Cordially 
> 
> Éric 
> 
> ERIC ARCHAMBAULT, PHD 
> 
> CEO | Chef de
la direction 
> 
> C. 1.514.518.0823 
> 
>
eric.archambault at science-metrix.com [37] 
> 
> science-metrix.com [38] &
1science.com [39] 
> 
> FROM: SIGMETRICS ON BEHALF OF Emilio Delgado
López-Cózar
> SENT: April-26-18 2:52 PM
> TO: sigmetrics at mail.asis.org
>
SUBJECT: Re: [Sigmetrics] 1findr: research discovery & analytics
platform 
> 
> First of all, I would like to congratulate the team
behind _1FINDR_ for releasing this new product. New scientific
information systems with an open approach that make their resources
available to the scientific community are always welcome. A few days ago
another system was launched (_LENS_ https://www.lens.org), and not many
weeks ago _DIMENSIONS_ was launched (https://app.dimensions.ai). The
landscape of scientific information systems is becoming increasingly
more populated. Everyone is moving: new platforms with new features, new
actors with new ideas, and old actors trying to adapt rather than
die...
> 
> In order to be able to build a solid idea of what _1FINDR_
is and how it has been constructed, I woul like to formulate some
questions, since I haven't found their answer in the website:
> 
> What
is the main source of the metadata to those 89,561,005 articles? Is it
perhaps _CROSSREF_?
> 
> How have peer-reviewed journals been
identified?
> 
> Are all document types in these journals covered?
Editorial material, news, comments, book reviews...
> 
> How have OA
versions of the documents been identified?
> 
> How have articles been
categorised in subjects?
> 
> To what extent has _GOOGLE SCHOLAR_ data
been used to build _1FINDR_?
> 
> We think this information will help
assess exactly what 1findr offers that is not offered by other
platforms.
> 
> Kind regards,
> 
> --- 
> 
> Emilio Delgado
López-Cózar
> 
> Facultad de Comunicación y Documentación
> 
>
Universidad de Granada
> 
>
http://scholar.google.com/citations?hl=es&user=kyTHOh0AAAAJ
> 
>
https://www.researchgate.net/profile/Emilio_Delgado_Lopez-Cozar
> 
>
http://googlescholardigest.blogspot.com.es
> 
> Dubitando ad veritatem
pervenimus (Cicerón, De officiis. A. 451...)
> 
> Contra facta non
argumenta
> 
> A fructibus eorum cognoscitis eos (San Mateo 7, 16)
> 
>
El 2018-04-25 22:50, Kevin Boyack escribió: 
> 
>> Éric, 
>> 
>> … and
thanks to you for being so transparent about what you're doing! 
>> 
>>
Kevin 
>> 
>> FROM: SIGMETRICS ON BEHALF OF Éric Archambault
>> SENT:
Wednesday, April 25, 2018 9:05 AM
>> TO: Anne-Wil Harzing ;
sigmetrics at mail.asis.org [16]
>> SUBJECT: Re: [Sigmetrics] 1findr:
research discovery & analytics platform 
>> 
>> Anne-Wil, 
>> 
>> Thank
you so much for this review. We need that kind of feedback to prioritize
development. 
>> 
>> Thanks a lot for the positive comments. We are
happy that they reflect our design decisions. Now, onto the niggles (all
fair points in current version). 
>> 
>> An important distinction of our
system - at this stage of development - is that our emphasis is on
scholarly/scientific/research work published in peer-reviewed/quality
controlled journals (e.g. we don't index trade journals and popular
science magazines such as New Scientist - not a judgment on quality,
many of them are stunningly good, they are just not the type we focus on
for now). This stems from work conducted several years ago for the
European Commission. We got a contract at Science-Metrix to measure the
proportion of articles published in peer-reviewed journals. We
discovered (discovery being a big term considering what follows) that 1)
OA articles were hard to find and count (numerator in the percentage),
and 2) there wasn't a database that comprised all peer-reviewed journals
(denominator in the percentage). Consequently, we had to work by
sampling, but hard core bibliometricians like the ones we are at
Science-Metrix like the idea of working on population level measurement.
At Science-Metrix, our bibliometric company, we have been using licensed
bibliometric versions of the Web of Science and Scopus. Great tools,
very high quality data (obvious to anyone who has worked on big
bibliographic metadata), extensive coverage and loads of high quality,
expensive to implement smart enrichment. However, when measuring, we
noticed, as did many others, that the databases emphasized Western
production to the detriment of the Global South, emerging countries,
especially in Asia, and even the old Cold War foe in which the West lost
interest after the fall of the wall. 1findr is addressing this - it aims
to find as much OA as possible and to index everything peer-reviewed and
academic level published in journals. We aim to expand to other types of
content with a rationally designed indexing strategy, but this is what
we are obstinately focusing on for now. 
>> 
>> -We are working on
linking all the papers within 1findr with references/citations. This
will create the first rationally designed citation network: from all
peer-reviewed journals to all peer-reviewed journals, regardless of
language, country, field of research (we won't get there easily or
soon). We feel this is scientifically a sound way to measure.
Conferences and books are also important, but currently when we take
them into account in citations, we have extremely non-random lumps of
indexed material, and no one can say what the effect on measured
citations is. My educated guess is that this is extremely biased - book
coverage is extremely linguistically biased, conference proceedings
indexing is extremely field biased (proportionately way more computer
and engineering than other fields). If we want to turn scientometrics
into a proper science we need proper measurement tools. This is the
long-term direction of 1findr. It won't remain solely in the discovery
field, it will become a scientifically designed tool to measure
research, with clearly documented strengths and weaknesses. 
>> 
>> -We
still need to improve our coverage of OA. Though we find twice as many
freely downloadable papers in journals than Dimensions, Impact Story
finds about 8% OA for papers with a DOI for which we haven't found a
copy yet (one reason we have more OA as a percentage of journal articles
is that in 1findr we find much OA for articles without DOIs). We are
working on characterizing a sample of papers which are not OA on the
1findr side, but which ImpactStory finds in OA. A glimpse at the data
reveals some of these are false positives, but some of them reflect
approaches used by ImpactStory that we have not yet implemented (Heather
and Jason are smart, and we can all learn from them - thanks to their
generosity). There are also transient problems we experienced while
building 1findr. For example, at the moment, we have challenges with our
existing Wiley dataset and we need to update our harvester for Wiley's
site. Would be nice to have their collaboration, but they have been
ignoring my emails for the last two months… Shame, we're only making
their papers more discoverable and helping world users find papers for
which article processing charges were paid for. We need the cooperation
of publishers to do justice to the wealth of their content, especially
hybrid OA papers. 
>> 
>> -We know we have several papers displaying a
"404". We are improving the oaFindr link resolver built in 1findr to
reduce this. Also we need to scan more frequently for change (we have to
be careful there as we don't want to overwhelm servers; many of the
servers we harvest from are truly slow and we want to be nice guys), and
we need to continue to implement smarter mechanisms to avoid 404.
Transiency of OA is a huge challenge. We have addressed several of the
issues, but this takes time and our team has a finite size, and as you
note, several challenges, and big ambitions at the same time. 
>> 
>>
-We are rewriting our "help" center. Please be aware that using no quote
does full stemming, using single quote does stemming, but words need be
in the same order in the results. Double quotes should be used for
non-stemmed, exact matches. This is a really powerful way of searching.

>> 
>> Fuel cell = finds articles with fuel and cell(s) 
>> 
>> 'fuel
cell' = finds articles with both fuel cell and fuel cells 
>> 
>> "fuel
cell" = finds articles strictly with fuel cell (won't return fuel cells
only articles) 
>> 
>> Once again, thanks for the review, and apologies
for the lengthy reply. 
>> 
>> Éric 
>> 
>> ERIC ARCHAMBAULT, PHD 
>>

>> CEO | Chef de la direction 
>> 
>> C. 1.514.518.0823 
>> 
>>
eric.archambault at science-metrix.com [17] 
>> 
>> science-metrix.com [18]
& 1science.com [19] 
>> 
>> FROM: SIGMETRICS
<sigmetrics-bounces at asist.org [20]> ON BEHALF OF Anne-Wil Harzing
>>
SENT: April-24-18 5:11 PM
>> TO: sigmetrics at mail.asis.org [21]
>>
SUBJECT: Re: [Sigmetrics] 1findr: research discovery & analytics
platform 
>> 
>> Dear all, 
>> 
>> I was asked (with a very short
time-frame) to comment on 1Findr for an article in Nature (which I am
not sure has actually appeared). I was given temporary login details for
the Advanced interface. 
>> 
>> As "per normal" with these kind of
requests only one of my comments was actually used. So I am posting all
of them here in case they are of use to anyone (and to Eric and his team
in fine-tuning the system). 
>> 
>> ================ 
>> 
>> As I had a
very limited amount of time to provide my comments, I tried out 1Findr
by searching for my own name (I have about 150 publications including
journal articles, books, book chapters, software, web publications and
white papers) and some key terms in my own field (international
management). 
>> 
>> WHAT I LIKE
>> 
>> Simple and intuitive user
interface with fast response to search requests, much faster than with
some competitor products where the website takes can take ages to load.
The flexibility of the available search options clearly reflects the
fact that this is an offering built by people with a background in
Scientometrics. 
>> 
>> A search for my own name showed that coverage at
the author level is good, it finds more of my publications than both the
Web of Science and Scopus, but fewer than Google Scholar and Microsoft
Academic. It is approximately on par with CrossRef and Dimensions though
all three services (CR, Dimensions and Findr) have unique publications
that the other service doesn't cover. 
>> 
>> As far as I could assess,
topic searches worked well with flexible options to search in title,
keywords and abstracts. However, I have not tried these in detail. 
>>

>> Provides a very good set of subjects for filtering searches that -
for the disciplines I can evaluate - shows much better knowledge of
academic disciplines and disciplinary boundaries than is reflected in
some competitor products. I particularly like the fact that there is
more differentiation in the Applied Sciences, the Economic and Social
Sciences and Arts & Humanities than in some other databases. This was
sorely needed. 
>> 
>> There is a quick summary of Altmetrics such as
tweets, Facebook postings and Mendeley readers. Again I like the fact
that a simple presentation is used, rather than the "bells & whistle"
approach with the flashy graphics of some other providers. This keeps
the website snappy and provides an instant overview. 
>> 
>> There is
good access to OA versions and a "1-click" download of all available OA
versions [for a maximum of 40 publications at once as this is the upper
limit of the number of records on a page]. I like the fact that it finds
OA versions from my personal website (www.harzing.com [22]) as well as
OA versions in university repositories and gold OA versions. However, it
doesn't find all OA versions of my papers (see dislike below). 
>> 
>>
WHAT I DISLIKE
>> 
>> Although I like the fact that Findr doesn't try to
be anything and everything leading to a cluttered user interface, for me
the fact that it doesn't offer citation metrics limits its usefulness.
Although I understand its focus is on finding literature (which is fair
enough) many academics - rightly or wrongly - use citations scores to
assess which articles to prioritize articles for downloading and
reading. 
>> 
>> The fact that it doesn't yet find all Open Access
versions that Google Scholar and Microsoft Academic do. All my
publications are available in OA on my website, but Findr does not seem
to find all of them. Findr also doesn't seem to source OA versions from
ResearchGate. Also several OA versions resulted in a _"404. The
requested resource is not found."_ 
>> 
>> The fact that it only seems
to cover journal articles. None of my books, book chapters, software,
white papers or web publications were found. Although a focus on
peer-reviewed work is understandable I think coverage of books and book
chapters is essential and services like Google Scholar, Microsoft
Academic and CrossRef do cover books. 
>> 
>> NIGGLES
>> 
>> There are
duplicate results for quite a few of my articles, usually "poorer"
versions (i.e. without full text/abstract/altmetric scores) it would be
good if the duplicates could be removed and only the "best" version kept

>> 
>> Automatic stemming of searches is awkward if you try to search
for author names in the "general" search (as many users will do). In my
case (Harzing) it results in hundreds of articles on the Harz mountains
obscuring all of my output. 
>> 
>> Preferred search syntax should be
clearer as many users will search authors with initials only (as this is
what works best in other databases). In Findr this provides very few
results as there are "exact" matches only, whereas in other databases
initial searches are interpreted as initial + wildcard. 
>> 
>> More
generally needs better author disambiguation. Some of my articles can
only be found when searching for a-w harzing, a very specific rendition
of my name. 
>> 
>> When Exporting Citations the order seems to reverts
to alphabetical order of the first author, not the order that was on the
screen. 
>> 
>> Best wishes,
>> Anne-Wil 
>> 
>> PROF. ANNE-WIL HARZING

>> 
>> Professor of International Management
>> Middlesex University
London, Business School
>> 
>> WEB: Harzing.com [23] - TWITTER:
@awharzing [24] - GOOGLE SCHOLAR: Citation Profile [25]
>> NEW: Latest
blog post [26] - SURPRISE: Random blog post [27] - FINALLY: Support
Publish or Perish [28] 
>> 
>> On 24/04/2018 21:51, Bosman, J.M.
(Jeroen) wrote: 
>> 
>>> Of course there is much more to say about
1Findr. What I have seen so far is that the coverage back to 1944 is
very much akin to Dimensions, probably because both are deriving the
bulk of their records from Crossref. 
>>> 
>>> Full text search is
relatively rare among these systems. Google Scholar does it. Dimensions
does it on a subset. And some publisher platform support it, as do some
OA aggragators. 
>>> 
>>> Apart from these two aspects (coverage and
full text search support), there are a lot of aspects and (forthcoming)
1Findr functionalities that deserve scrutiny, not least the exact method
of OA detection (and version priority) of course. 
>>> 
>>> Jeroen
Bosman 
>>> 
>>> Utrecht University Library 
>>> 
>>>
-------------------------
>>> 
>>> FROM: SIGMETRICS
[sigmetrics-bounces at asist.org [13]] on behalf of David Wojick
[dwojick at craigellachie.us [14]]
>>> SENT: Tuesday, April 24, 2018 8:59
PM
>>> TO: Mark C. Wilson
>>> CC: sigmetrics at mail.asis.org [15]
>>>
SUBJECT: Re: [Sigmetrics] 1findr: research discovery & analytics
platform 
>>> 
>>> There is a joke that what is called "rapid
prototyping" actually means fielding the beta version. In that case
every user is a beta tester.
>>> 
>>> It is fast and the filter numbers
are useful in themselves. Some of the hits are a bit mysterious. It may
have unique metric capabilities. Too bad that advanced search is not
available for free.
>>> 
>>> David
>>> 
>>> At 02:34 PM 4/24/2018, Mark
C. Wilson wrote: 
>>> 
>>>> Searching for my own papers I obtained some
wrong records and the link to arXiv was broken. It does return results
very quickly and many are useful. I am not sure whether 1science
intended to use everyone in the world as beta-testers.
>>>> 
>>>>> On
25/04/2018, at 06:16, David Wojick <dwojick at craigellachie.us [9] >
wrote:
>>>>> 
>>>>> It appears not to be doing full text search, which
is a significant limitation. I did a search on "chaotic" for 2018 and
got 527 hits. Almost all had the term in the title and almost all of the
remainder had it in the abstract. Normally with full text, those with
the term only in the text are many times more than those with it in
title, often orders of magnitude more.
>>>>> 
>>>>> But the scope is
impressive, as is the ability to filter for OA.
>>>>> 
>>>>> David
>>>>>

>>>>> David Wojick, Ph.D.
>>>>> Formerly Senior Consultant for
Innovation
>>>>> DOE OSTI https://www.osti.gov/ [10] 
>>>>> 
>>>>> At
08:00 AM 4/24/2018, you wrote: 
>>>>> 
>>>>>> Content-Language:
en-US
>>>>>> Content-Type: multipart/related;
>>>>>>
type="multipart/alternative";
>>>>>>
boundary="----=_NextPart_001_00EE_01D3DBBD.BC977220"
>>>>>> 
>>>>>>
Greetings everyone,
>>>>>> 
>>>>>> Today, 1science announced the
official launch of 1findr, its platform for research discovery and
analytics. Indexing 90 million articles­of which 27 million are
available in OA­it represents the largest curated collection worldwide
of scholarly research. The platform aims to include all articles
published in peer-reviewed journals, in all fields of research, in all
languages and from every country.
>>>>>> 
>>>>>> Here are a few
resources if youâEUR(tm)re interested in learning more:
>>>>>> 
>>>>>> *
p; Access 1findr platform: www.1findr.com [1]
>>>>>> * p; Visit the
1findr website: www.1science.com/1findr [2]
>>>>>> * p; Send in your
questions: 1findr at 1science.com [3]
>>>>>> * p; See the press release:
www.1science.com/1findr-public-launch [4] 
>>>>>> 
>>>>>>
Sincerely,
>>>>>> 
>>>>>> GrÃ(c)goire
>>>>>> 
>>>>>> GR(c)GOIRE
CT(c)
>>>>>> President | PrÃ(c)sident 
>>>>>> SCIENCE-METRIX 
>>>>>>
1335, Mont-Royal E
>>>>>> MontrÃ(c)al, QC H2J 1Y6
>>>>>> Canada
>>>>>>

>>>>>> T. 1.514.495.6505 x115
>>>>>> T. 1.800.994.4761
>>>>>> F.
1.514.495.6523
>>>>>> gregoire.cote at science-metrix.com [5]
>>>>>>
www.science-metrix.com [6]
>>>>>> 
>>>>>> Content-Type:
image/png;
>>>>>> name="image001.png"
>>>>>> Content-Description:
image001.png
>>>>>> Content-Disposition: inline;
>>>>>>
creation-date=Tue, 24 Apr 2018 12:00:30 GMT;
>>>>>>
modification-date=Tue, 24 Apr 2018 12:00:30 GMT;
>>>>>>
filename="image001.png";
>>>>>> size=1068
>>>>>> Content-ID: 
>>>>>>

>>>>>> Content-Type: image/png;
>>>>>> name="image002.png"
>>>>>>
Content-Description: image002.png
>>>>>> Content-Disposition:
inline;
>>>>>> creation-date=Tue, 24 Apr 2018 12:00:30 GMT;
>>>>>>
modification-date=Tue, 24 Apr 2018 12:00:30 GMT;
>>>>>>
filename="image002.png";
>>>>>> size=1109
>>>>>> Content-ID: 
>>>>>>

>>>>>> _______________________________________________
>>>>>>
SIGMETRICS mailing list
>>>>>> SIGMETRICS at mail.asis.org [7]
>>>>>>
http://mail.asis.org/mailman/listinfo/sigmetrics [8]
>>>>> 
>>>>>
_______________________________________________
>>>>> SIGMETRICS mailing
list
>>>>> SIGMETRICS at mail.asis.org [11]
>>>>>
http://mail.asis.org/mailman/listinfo/sigmetrics [12]
>>> 
>>>
_______________________________________________
>>> 
>>> SIGMETRICS
mailing list
>>> 
>>> SIGMETRICS at mail.asis.org
>>> 
>>>
http://mail.asis.org/mailman/listinfo/sigmetrics
 

Links:
------
[1]
http://www.1findr.com/
[2] http://www.1science.com/1findr
[3]
mailto:1findr at 1science.com
[4]
http://www.1science.com/1findr-public-launch
[5]
mailto:gregoire.cote at science-metrix.com
[6]
http://www.science-metrix.com/
[7] mailto:SIGMETRICS at mail.asis.org
[8]
http://mail.asis.org/mailman/listinfo/sigmetrics
[9]
mailto:dwojick at craigellachie.us
[10] https://www.osti.gov/
[11]
mailto:SIGMETRICS at mail.asis.org
[12]
http://mail.asis.org/mailman/listinfo/sigmetrics
[13]
mailto:sigmetrics-bounces at asist.org
[14]
mailto:dwojick at craigellachie.us
[15]
mailto:sigmetrics at mail.asis.org
[16]
mailto:sigmetrics at mail.asis.org
[17]
mailto:eric.archambault at science-metrix.com
[18]
http://www.science-metrix.com/
[19] http://www.science-metrix.com/
[20]
mailto:sigmetrics-bounces at asist.org
[21]
mailto:sigmetrics at mail.asis.org
[22] http://www.harzing.com
[23]
https://harzing.com
[24] https://twitter.com/awharzing
[25]
https://scholar.google.co.uk/citations?user=v0sDYGsAAAAJ
[26]
https://harzing.com/blog/.latest?redirect
[27]
https://harzing.com/blog/.random
[28]
https://harzing.com/resources/publish-or-perish/donations
[29]
https://www.base-search.net/
[30] https://www.base-search.net/
[31]
https://core.ac.uk/
[32] https://core.ac.uk/
[33]
http://science-metrix.com/sites/default/files/science-metrix/publications/d_1.8_sm_ec_dg-rtd_proportion_oa_1996-2013_v11p.pdf
[34]
http://science-metrix.com/en/publications/reports
[35]
http://www.science-metrix.com/en/classification
[36]
http://science-metrix.com/sites/default/files/science-metrix/sm_journal_classification_106_1.xls
[37]
mailto:eric.archambault at science-metrix.com
[38]
http://www.science-metrix.com/
[39] http://www.science-metrix.com/
[40]
https://www2.cabells.com/about-whitelist
[41] https://goo.gl/MWzRAb
[42]
https://goo.gl/DjGMA1
[43] http://goo.gl/ARyCn4
[44]
https://doi.org/10.1108/EUM0000000007147
[45]
http://ec3.ugr.es/in-recs/
[46] http://epuc.cchs.csic.es/resh
[47]
https://doi.org/10.1126/science.342.6154.60
[48]
https://osf.io/preprints/socarxiv/k54uv/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.asis.org/pipermail/sigmetrics/attachments/20180504/3909fd0b/attachment-0001.html>


More information about the SIGMETRICS mailing list