Problems with Web of Science

Mark Newman mark at SANTAFE.EDU
Thu Aug 8 12:57:05 EDT 2013


Many thanks to all those who replied to my earlier query.

I received a very informative and useful response by email from Marie
McVeigh at Thomson Reuters (which operates Web of Science).  She attempted
to post her reply to the list herself, but it was rejected by the server,
so, with her permission, I am posting it here on her behalf.

Of particular interest to me is the news that Thomson Reuters has developed
a fix for the problem that will be included in a forthcoming release of Web
of Science.

Mark Newman


----------------------------------------------------------------------

Subject: [SIGMETRICS] Problems with Web of Science
From: <marie.mcveigh at thomsonreuters.com>
Date: 08/08/2013 11:35 AM

All –

The incomplete resolution of citation-to-source links results not from Web
of Science, per se, but from the combination of Web of Science and our
hosted content.  Here’s how: Web of Science maintains a separate article
identifier field, so that an article can have a page number and an
article number, e-locator, e-identifier, or DOI, alone or in combination
with the page.  There is a constellation of (primarily) physics journals
that use article numbers exclusively, and in hosted content these are most
often recorded with the article number in the page field.  A cited
reference that uses that format (cited page = 066112) links
preferentially to the hosted record and splits off from the Web of
Science record.
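
To make the mechanism concrete, here is a minimal sketch in Python.  It is
not Thomson Reuters' actual linking code: the record classes, field names,
and both matching functions are hypothetical, and only the journal
abbreviation and the cited page 066112 come from the message above.  It
shows why a match on the page field alone finds only the hosted record,
while a match that accepts either locator finds both.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceRecord:
    origin: str                       # "WoS" or "hosted"; labels are illustrative
    journal: str
    page: Optional[str] = None
    article_no: Optional[str] = None  # the separate article identifier field


@dataclass(frozen=True)
class CitedRef:
    journal: str
    cited_page: str


def page_only_match(ref, rec):
    """Hypothetical matcher that compares the cited page to the page field only."""
    return ref.journal == rec.journal and ref.cited_page == rec.page


def locator_match(ref, rec):
    """Hypothetical matcher that accepts the page OR the article number."""
    locators = {rec.page, rec.article_no} - {None}
    return ref.journal == rec.journal and ref.cited_page in locators


records = [
    # Web of Science keeps the article number in its own field.
    SourceRecord("WoS", "PHYS REV E", article_no="066112"),
    # Hosted content stores the same number in the page field.
    SourceRecord("hosted", "PHYS REV E", page="066112"),
]
ref = CitedRef("PHYS REV E", "066112")

print([r.origin for r in records if page_only_match(ref, r)])  # ['hosted']
print([r.origin for r in records if locator_match(ref, r)])    # ['WoS', 'hosted']
```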

We have developed an additional level of linking that will let us get
around this problem in an upcoming release of Web of Science.

I do need to specify that this is NOT due to the difference in the title.
We use a variety of metadata fingerprints on source and on cited reference
to match and link, but none requires a complete or perfect match in the
cited title string.  The more non-title elements match, the less dependent
we are on source title.  The algorithms are bolstered by standard word
abbreviations and an expert curation of title variants – Phys Rev E, alone,
has 17 recorded variants, once we impose standard abbreviation of Physical
and of Review.  All “linked” citations show the Thomson Reuters – JCR
abbreviation PHYS REV E because, once linking is established by metadata
match, we aggregate the references to our system-standard title.  That is
also applied to the 209 citations that are linked to the Medline Record for
the article used in your example – we are preferentially displaying the
Medline title of the source rather than any of the variations of the title
that appeared in the original citation.  I can pretty well guarantee you
that they did not all cite “Phys Rev E Stat Nonlin Soft Matter Phys”.
Probably most said “Phys Rev E”.
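
As an illustration of the title handling described above, the following
Python sketch applies standard word abbreviations and then looks the result
up in a curated variant table.  It is an assumption-laden toy, not the
production algorithm: the function names are invented, the variant table
holds only the two forms quoted in this message rather than the 17 recorded
ones, and it ignores the non-title metadata that the real matching relies on.

```python
import re

# Standard word abbreviations mentioned in the message (Physical, Review).
WORD_ABBREVIATIONS = {"PHYSICAL": "PHYS", "REVIEW": "REV"}

# Illustrative curated table: normalized variant -> system-standard title.
# Only variants quoted in the message are listed; the real table is larger.
CURATED_VARIANTS = {
    "PHYS REV E": "PHYS REV E",
    "PHYS REV E STAT NONLIN SOFT MATTER PHYS": "PHYS REV E",
}


def normalize_title(raw):
    """Uppercase, drop punctuation, and abbreviate known words."""
    words = re.sub(r"[^A-Za-z0-9 ]+", " ", raw).upper().split()
    return " ".join(WORD_ABBREVIATIONS.get(w, w) for w in words)


def canonical_title(raw):
    """Return the system-standard title, or None if the variant is unknown."""
    return CURATED_VARIANTS.get(normalize_title(raw))


for cited in ["Physical Review E", "Phys. Rev. E",
              "Phys Rev E Stat Nonlin Soft Matter Phys"]:
    print(cited, "->", canonical_title(cited))
```

In this sketch every one of the three cited forms resolves to PHYS REV E,
which mirrors why all linked citations display the same abbreviation.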

Because we retain in our system metadata all of the title variants
associated with a collection of references, even one that has been unified
(like the 209 refs to “May RM, Phys Rev E Stat Nonlin Soft Matter Phys” and
the 72 refs to “May RM, PHYS REV E”), both sets of references can be
retrieved from Web of Science using a cited reference search for Cited
Work = PHYS REV E or a cited reference search for Cited Work = “PHYSICAL
REVIEW E”.

Ironically, the great value of Cited Reference searching in Web of
Science is that it so easily allows knowledgeable users to see
what’s missing in linking.  Do your Cited Ref search for Cited
Author=MAY RM and Cited Work=Physical Review E, select both variants, and
complete your search for a complete list of citing articles.  The same
feature that produces the uncomfortable display of this link-splitting also
allows you to get accurate and real citation metrics about articles that
are not in Web of Science or Web of Knowledge-hosted source coverage.  No
matter how completely a database covers a subject, it simply cannot cover
everything that can be cited, and you will ultimately be limited if you do
not have the “times cited” link.  In Web of Science, you can get outside of
that and view/navigate citations to non-scholarly works and to journals
that are outside our selected corpus of sources but are cited by our core
works.
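
The "select both variants" step amounts to taking the union of the citing
articles found under each cited-reference variant.  The short Python sketch
below shows that idea only; the function name and the citing-article
identifiers are made up for illustration, and it does not call any Web of
Science interface.

```python
def combined_citing_articles(variant_hits, selected):
    """Union the citing-article IDs for every selected cited-work variant."""
    combined = set()
    for name in selected:
        combined |= variant_hits[name]
    return combined


# Hypothetical citing-article IDs per variant (placeholders, not real data).
variant_hits = {
    "PHYS REV E": {"A1", "A2", "A3"},
    "Phys Rev E Stat Nonlin Soft Matter Phys": {"A3", "A4"},
}

both = combined_citing_articles(variant_hits, list(variant_hits))
print(sorted(both))  # ['A1', 'A2', 'A3', 'A4']
```

Using a set means an article that turns up under more than one variant is
counted once in the combined list.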

I’ll wrap up with a note about JCR.  JCR represents title aggregation at
the level of the journal – and is not dependent on linking to the specific
source item.  (See:
http://community.thomsonreuters.com/t5/Citation-Impact-Center/Understanding-the-Journal-Impact-Factor/ba-p/18461)
All of these references, to any variant of Phys Rev E, would have been
compiled, year by year, into the JCR data for the journal Physical Review
E.

Hope that answers the questions!

Marie E. McVeigh
Director, Content Selection
Thomson Reuters
thomsonreuters.com
scientific.thomsonreuters.com


