accuracy of Thomson data

Stephen J Bensman notsjb at LSU.EDU
Sun Dec 23 11:14:32 EST 2007


Loet et al, 
The more I look at the problem of backfiles, the more I opt for using the numerator of the Impact Factor as the Total Cites measure.  It is the method originally used by Martyn and Gilchrist, who pioneered the formulation of the Impact Factor, and it postulates that two years is an adequate sample to measure journal importance.  However, while it is the easiest to do, it is only a measure of current importance, and you lose a lot of historical perspective, for one measure of the importance of a journal is whether the articles published 50 years ago are still being cited today.  Still, given contagion and the cumulative advantage of the Matthew Effect, the correlations of the two-year backfile with the entire backfile should be quite high, and you save yourself a lot of the misery of tracing and aggregating backfiles.  To do the latter, you really have to be an expert serials cataloger with a knowledge of AACR2 rules and the MARC coding used in the OCLC cataloging mode.  That is too much to expect of most information scientists.
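
As a rough illustration of how the expected correlation between the two-year backfile and the entire backfile could be checked, here is a minimal sketch of a Spearman rank correlation in plain Python.  The citation counts are hypothetical and invented purely for the example, as are the function names; real figures would have to come from the JCR and from aggregated backfile searches.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical counts for five journals (assumed data, not from the JCR):
# cites to the last two years only vs. cites to the whole backfile.
two_year_cites = [120, 45, 300, 80, 15]
backfile_cites = [2400, 700, 9100, 650, 90]

rho = spearman_rho(two_year_cites, backfile_cites)
print(round(rho, 3))
```

A rho close to 1 on real data would support treating the two-year numerator as an adequate sample; a markedly lower value would suggest that the older backfile carries information the short window misses.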
 
Stephen J. Bensman
LSU Libraries
Louisiana State University

 
________________________________

From: ASIS&T Special Interest Group on Metrics on behalf of Loet Leydesdorff
Sent: Sun 12/23/2007 1:45 AM
To: SIGMETRICS at listserv.utk.edu
Subject: Re: [SIGMETRICS] accuracy of Thomson data


Dear Stephen, Gene, and colleagues,
 
The data is available under the "citing journals" tab in the JCR on the Web. The data is processed from the "citing" side and is complete from this side. 
 
The ISI is correct that data like "J PHYS CHEM-US" cannot unambiguously be included in the calculation of the impact factor or total cites. Spell-checkers and twigging help in correcting individual citations, but this does not work in the case of these structural changes to the data. 
 
Technically, in my opinion, the ISI does an excellent job. The problem is ours. Helpfiles, of course, can be helpful.
 
With best wishes, 
 
 
Loet
 
________________________________

Loet Leydesdorff 
Amsterdam School of Communications Research (ASCoR), 
Kloveniersburgwal 48, 1012 CX Amsterdam. 
Tel.: +31-20- 525 6598; fax: +31-20- 525 3681 
loet at leydesdorff.net ; http://www.leydesdorff.net/

 


________________________________

	From: ASIS&T Special Interest Group on Metrics [mailto:SIGMETRICS at listserv.utk.edu] On Behalf Of Stephen J Bensman
	Sent: Saturday, December 22, 2007 9:26 PM
	To: SIGMETRICS at listserv.utk.edu
	Subject: Re: [SIGMETRICS] accuracy of Thomson data
	
	
	Administrative info for SIGMETRICS (for example unsubscribe): http://web.utk.edu/~gwhitney/sigmetrics.html 
	Loet,
	As Dr. Garfield pointed out, publishers and editors will do what they will do.  You cannot expect them to set policy as to what will best improve their ranking in the ISI JCRs.  Therefore, it behooves Thomson Scientific to make some improvements in the JCRs and the Web of Science.
	 
	From experimenting with the problem of Total Cites a bit, these are the improvements that I can suggest.  From what I can determine, the JCRs have totally disaggregated bibliographic entries.  Title changes that are not alphabetically continuous, divisions into parts, etc., are treated separately.  This is actually good, because sometimes one has to work with disaggregated data.  However, for various reasons, it is necessary to aggregate the data into uniform bibliographic entities.  For example, to obtain the complete total cites for JASIST, you have to aggregate cites to the complete backfiles of JASIST, JASIS, and American Documentation.  To help with this, Thomson Scientific should do the following things:
	 
	1) Explain in the JCR Help how the cites are aggregated for Total Cites.
	2) Expand the list of title changes, division into parts, etc., in the JCR Help from the three-year limit geared to the Impact Factor to the complete historical list of such changes.  This list has appeared annually over the years, so it is a question of compiling them.  This should give the necessary title abbreviations for backfile searches in the Web of Science.
	3) Make it possible to limit the citations to these title segments to one JCR Year in the Web of Science.  This may be possible, but I did not figure out how to do this.
	4) Explain all this in the JCR Help.
	 
	If this is done, then the researcher should have complete flexibility in defining serials as bibliographic entities in the manner necessary for her/him.  It will be a struggle, but it at least will be possible.  If I remember correctly, the old JCRs had a lot of backfile data in them, but this seems to have disappeared in the new online ones.
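
The kind of aggregation described above can be sketched as follows.  This is only an illustration under stated assumptions: the lineage table, the record layout, the function name, and all counts are hypothetical, and the title labels stand in for whatever abbreviations the Web of Science actually uses for each segment.

```python
from collections import defaultdict

# Hypothetical lineage table: each title segment maps to the bibliographic
# entity the researcher chooses to treat as one journal.
LINEAGE = {
    "JASIST": "JASIST",
    "JASIS": "JASIST",
    "American Documentation": "JASIST",
}

# Hypothetical records: (cited title segment, year the cite was made, cites).
RECORDS = [
    ("JASIST", 2006, 900),
    ("JASIS", 2006, 1500),
    ("American Documentation", 2006, 200),
    ("JASIS", 2005, 1400),               # made outside the JCR Year, skipped
    ("Some Other Journal", 2006, 1100),  # no lineage entry, kept as itself
]


def aggregate_total_cites(records, lineage, jcr_year):
    """Sum cites made in jcr_year, grouped by researcher-defined entity."""
    totals = defaultdict(int)
    for title, citing_year, cites in records:
        if citing_year != jcr_year:
            continue  # limit the count to cites made in the one JCR Year
        entity = lineage.get(title, title)  # unmapped titles stay separate
        totals[entity] += cites
    return dict(totals)


totals_2006 = aggregate_total_cites(RECORDS, LINEAGE, 2006)
print(totals_2006)
```

Because the lineage table is supplied by the researcher, the same disaggregated records can be regrouped under a different definition of the serial (for example, by consistent volume sequence) without touching the underlying data.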
	 
	Stephen J. Bensman
	LSU Libraries
	Louisiana State University  

________________________________

	From: ASIS&T Special Interest Group on Metrics on behalf of Loet Leydesdorff
	Sent: Sat 12/22/2007 2:54 AM
	To: SIGMETRICS at listserv.utk.edu
	Subject: Re: [SIGMETRICS] accuracy of Thomson data
	
	

	Dear Stephen,
	
	Paradoxically, the 50,000+ yearly citations to "J Phys Chem-US" do not
	affect the Impact Factor so much (because of its limitation to the last two
	years), but they do dramatically affect Total Cites. The journal was split
	into different parts in 1997. (Yesterday, the ACS launched J Phys Chem-C
	with a focus on nano.)
	
	The lesson from this seems to me to be policy advice for scientific publishers:
	do not change the name of the journal! The journal (and potentially the
	authors) will suffer in terms of visibility.
	
	Best wishes,
	
	
	Loet
	
	________________________________
	
	Loet Leydesdorff
	Amsterdam School of Communications Research (ASCoR),
	Kloveniersburgwal 48, 1012 CX Amsterdam.
	Tel.: +31-20- 525 6598; fax: +31-20- 525 3681
	loet at leydesdorff.net ; http://www.leydesdorff.net/
	
	
	
	> -----Original Message-----
	> From: ASIS&T Special Interest Group on Metrics
	> [mailto:SIGMETRICS at listserv.utk.edu] On Behalf Of Stephen J Bensman
	> Sent: Friday, December 21, 2007 6:10 PM
	> To: SIGMETRICS at listserv.utk.edu
	> Subject: Re: [SIGMETRICS] accuracy of Thomson data
	>
	> Loet,
	> Thanks for the tip.  I thought of that, but it requires knowing all
	> the abbreviations used for the various parts.  Then there was the
	> problem of defining in what year the cites were made.  They have to be
	> limited to the JCR Year.  At that point my mind boggled at the
	> complexity of the process, and I thought perhaps it best to restrict
	> oneself to what can be aggregated in the JCR and rely on the adequate
	> sample theory that the JCRs first used.   But I did not put a lot of
	> time into this, because I am no longer interested in the
	> Impact Factor.
	>
	> Thanks for sparing me the explanation of the "vector space models."  The
	> only vectors I know are compass ones used for triangulation purposes to
	> knock out enemy artillery and other unhealthy things.
	>
	> Stephen J. Bensman
	> LSU Libraries
	> Louisiana State University
	> Baton Rouge, LA   70803
	> USA
	> notsjb at lsu.edu
	>
	> -----Original Message-----
	> From: ASIS&T Special Interest Group on Metrics
	> [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Loet Leydesdorff
	> Sent: Friday, December 21, 2007 10:31 AM
	> To: SIGMETRICS at LISTSERV.UTK.EDU
	> Subject: Re: [SIGMETRICS] accuracy of Thomson data
	>
	> Dear Stephen,
	>
	> As you know, I was interested in the so-called "externally-cited impact
	> factor" of a journal which is not included in the set, and then I realized
	> that one can search on non-included journals. In the Web version, it is
	> easy. You can just type "J PHYS CHEM-US" in the cited reference search or,
	> if you wish, "WALL STREET J" and you get the citations.
	>
	> It seems to me virtually impossible for the ISI (or Scopus) to attribute
	> the citations to "J Phys Chem-US" to either "J Phys Chem A" or "J Phys
	> Chem B". The same holds true for the BBA volumes, etc. I don't blame them,
	> but one should be aware of the problem. This underestimates the citations,
	> but as you know we also found 21.5% overestimation in the case of
	> "Angewandte Chemie" because of people citing both editions. However, the
	> group of journals for which this latter effect may happen can be delimited.
	>
	> You don't mind if I am not going to explain the vector space model. I
	> think the reference is Salton & McGill (1983) and, if I remember correctly,
	> the explanation is rather algebraic.
	>
	> With best wishes,
	>
	>
	> Loet
	>
	> ________________________________
	>
	> Loet Leydesdorff
	> Amsterdam School of Communications Research (ASCoR),
	> Kloveniersburgwal 48, 1012 CX Amsterdam.
	> Tel.: +31-20- 525 6598; fax: +31-20- 525 3681
	> loet at leydesdorff.net ; http://www.leydesdorff.net/
	>
	> 
	>
	> > -----Original Message-----
	> > From: ASIS&T Special Interest Group on Metrics
	> > [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Stephen J Bensman
	> > Sent: Friday, December 21, 2007 4:01 PM
	> > To: SIGMETRICS at LISTSERV.UTK.EDU
	> > Subject: Re: [SIGMETRICS] accuracy of Thomson data
	> >
	> > Loet,
	> > In re your first paragraph, if you read the paper that is posted on
	> > Dr. Garfield's Web site, you will see that the probability was very
	> > stable across time both in terms of the Impact Factor and Total Cites
	> > not at the aggregate level but within one discipline--chemistry.  This
	> > stability is essentially a reflection of the stability of the underlying
	> > social stratification system of chemistry.  The only consistent change
	> > in probability was the continuing increase in dominance of the journals
	> > at the top of both measures, indicating the success-breeds-success
	> > mechanism of the Matthew Effect--technically known as "contagion" in
	> > statistics.  This is the basic reason for the stability of both the
	> > social stratification system and the journal system based upon it.
	> >
	> > In re your method, I must admit that generally it shoots right by me.
	> > When I was young, a mathematics teacher once told me that minds can be
	> > divided into two types--those that think linearly and are good in
	> > algebra, and those that think spatially and are good in geometry.  The
	> > teacher said that people very good in algebra often have a hard time in
	> > geometry.  I was extremely good in algebra but had a hard time in
	> > geometry due to difficulty in grasping spatial relationships.  It seems
	> > that computer programmers have to be able to think spatially as well,
	> > and I have a hard time understanding this.
	> >
	> > However, your most important recent discovery--in my opinion--was in
	> > your recent "Caveats" paper, where you prove that JCRs do not aggregate
	> > Total Cites across title changes, splits, etc.  In other words, much of
	> > the backfile is left out of the count.  From my historical perspective,
	> > this is a most serious flaw.  You also have to understand that I am a
	> > catalog librarian and define serial bibliographic entities in terms of
	> > volume count--if the volume sequence is consistent, it is still the same
	> > journal despite the title change.  I investigated the problem briefly
	> > and found that I could not aggregate more than three years.  I would
	> > like to know how you did that.  For me it means that for Total Cites to
	> > be good, Total Cites across 3 years have to be highly correlated with
	> > Total Cites across the entire backfile.  An initial view was that such a
	> > short backfile was a sufficient sample.  If this is so, then the Total
	> > Cites measure is good.
	> >
	> > Stephen J. Bensman
	> > LSU Libraries
	> > Louisiana State University
	> > Baton Rouge, LA   70803
	> > USA
	> > notsjb at lsu.edu
	> >
	> > -----Original Message-----
	> > From: ASIS&T Special Interest Group on Metrics
	> > [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Loet Leydesdorff
	> > Sent: Friday, December 21, 2007 1:05 AM
	> > To: SIGMETRICS at LISTSERV.UTK.EDU
	> > Subject: Re: [SIGMETRICS] accuracy of Thomson data
	> >
	> > Dear Stephen and colleagues,
	> >
	> > These (Spearman) correlations teach us, in my opinion, that the measures
	> > covary across disciplines at the aggregate level. Since impact factors
	> > are higher in the biomedical sciences, library usages and expert ratings
	> > would be expected to be higher in these sciences as well?
	> >
	> > I would expect the c/p ratio, the impact factor, and the immediacy
	> > factor to correlate highly as one group, but total publication, total
	> > citations, library usage, expert ratings, etc., as a second group (size
	> > related).
	> >
	> > As you know, my preference goes in the direction of mapping local
	> > citation impact environments.  These can be visualized using, for
	> > example, the files at http://www.leydesdorff.net/jcr06 . (See:
	> > Visualization of the Citation Impact Environments of Scientific
	> > Journals: An online mapping exercise, Journal of the American Society
	> > for Information Science and Technology 58(1), 25-38, 2007.) The
	> > normalization in terms of the vector space (cosine) takes care of the
	> > size effects in the relations (links), and size can be considered as an
	> > attribute of the nodes.
	> >
	> > I have no better operationalization at the moment, but I am open to
	> > suggestions.
	> >
	> > Best wishes,
	> >
	> >
	> > Loet
	> >
	> > ________________________________
	> >
	> > Loet Leydesdorff
	> > Amsterdam School of Communications Research (ASCoR),
	> > Kloveniersburgwal 48, 1012 CX Amsterdam.
	> > Tel.: +31-20- 525 6598; fax: +31-20- 525 3681
	> > loet at leydesdorff.net ; http://www.leydesdorff.net/
	> >
	> > 
	> >
	> > > -----Original Message-----
	> > > From: ASIS&T Special Interest Group on Metrics
	> > > [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Stephen J. Bensman
	> > > Sent: Thursday, December 20, 2007 10:37 PM
	> > > To: SIGMETRICS at LISTSERV.UTK.EDU
	> > > Subject: Re: [SIGMETRICS] accuracy of Thomson data
	> > >
	> > > I try to do some of this in the paper posted on Dr.
	> > > Garfield's Web site at:
	> > >
	> > > 
	> > >
	> > > http://garfield.library.upenn.edu/bensman/bensmanegif22007.pdf
	> > >
	> > > 
	> > >
	> > > You might want to look at the second half of the paper, where I discuss
	> > > the Impact Factor in terms of Poisson lambdas, sampling variance, random
	> > > error, etc.  The amazing thing to me, at least, is that despite all the
	> > > random error and sampling variance, there is a remarkable stability of
	> > > probability across time with Spearman rhos of 0.9 and above and highly
	> > > respectable correlations with Total Cites, library use, and expert
	> > > ratings.  Most impact factors move up and down within extremely narrow
	> > > limits across time.  I found a similar phenomenon in a paper just accepted
	> > > by JASIST called "Distributional Differences of the Impact Factor in the
	> > > Sciences vs. the Social Sciences: An Analysis of the Probabilistic
	> > > Structure of the 2005 Journal Citation Reports."   I no longer own the
	> > > copyright and so cannot post it, but I suppose that I can let you read it
	> > > on a private basis, if you're willing to suffer the pain of reading it.
	> > > There is much more to the Impact Factor than meets the eye, and it is an
	> > > extremely good measure for many purposes, if of extremely doubtful use for
	> > > ranking purposes in the vast bulk of the cases.
	> > >
	> > > 
	> > >
	> > > Stephen J. Bensman, Ph.D.
	> > > LSU Libraries
	> > > Louisiana State University
	> > > Baton Rouge, LA   70803
	> > > USA
	> > > notsjb at lsu.edu
	> > >
	> >
	>
	
