Accuracy of Thomson data - decentralising data collection and enhancing the scope of scientometrics?

Loet Leydesdorff loet at LEYDESDORFF.NET
Wed Dec 19 13:24:37 EST 2007


Dear Christina and colleagues, 

I was thinking along similar lines and therefore decided to write a routine
that allows me to fuse Scopus data with ISI data. A first (preliminary)
version is available at
http://www.leydesdorff.net/software/scop2isi/index.htm. 

The program reads a file in the CSV format exported from Scopus and produces
a file, ISI.txt, in the tagged format of the ISI database. This file can be
used as input to HistCite or to my programs. (One may have to rename the
file.) I tested it with two of my programs; they need some adaptations, but
this seems doable. 
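
As a rough illustration of this kind of conversion (not the actual scop2isi
routine), the following Python sketch maps a few columns of a Scopus-style
CSV export onto ISI-style field tags. The column names ("Authors", "Title",
"Source title", "Year"), the semicolon as author separator, and the small
subset of tags (AU, TI, SO, PY) are assumptions made for the example; a real
export contains many more fields, and its layout may differ.

```python
import csv
import io

# Assumed mapping from Scopus CSV column names to ISI field tags.
# Both the column names and the choice of tags are illustrative only.
SCOPUS_TO_ISI = {
    "Authors": "AU",
    "Title": "TI",
    "Source title": "SO",
    "Year": "PY",
}


def scopus_csv_to_isi(csv_text, author_sep=";"):
    """Convert Scopus-style CSV text into ISI-style tagged text."""
    lines = ["FN ISI Export Format", "VR 1.0"]
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.append("PT J")  # assumption: treat every record as a journal item
        for column, tag in SCOPUS_TO_ISI.items():
            value = (row.get(column) or "").strip()
            if not value:
                continue  # skip fields that are missing or empty
            if tag == "AU":
                # One author per line; continuation lines are indented.
                authors = [a.strip() for a in value.split(author_sep) if a.strip()]
                if authors:
                    lines.append("AU " + authors[0])
                    lines.extend("   " + a for a in authors[1:])
            else:
                lines.append(f"{tag} {value}")
        lines.append("ER")  # end of record
        lines.append("")
    lines.append("EF")  # end of file
    return "\n".join(lines) + "\n"


sample = (
    "Authors,Title,Source title,Year\n"
    '"Doe J.; Roe R.",A title,Some Journal,2007\n'
)
isi_text = scopus_csv_to_isi(sample)
print(isi_text)
```

The returned text could then be written to a file such as ISI.txt and, if
needed, renamed before being read by other programs.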

The Scopus files also contain mistakes. For example, addresses are often
given twice and are sometimes incomplete.
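
A minimal sketch of how duplicated addresses might be collapsed before further
processing is given below. It assumes that the addresses of one record arrive
as a single string separated by semicolons; the function name and the
normalisation rules (whitespace, case, trailing full stop) are hypothetical
choices for the example. Incomplete addresses cannot be repaired this way.

```python
def dedupe_addresses(address_field, sep=";"):
    """Return the distinct addresses in a field, keeping first-seen order."""
    seen = set()
    unique = []
    for address in address_field.split(sep):
        cleaned = " ".join(address.split())       # collapse runs of whitespace
        key = cleaned.casefold().rstrip(".")      # ignore case and a trailing dot
        if cleaned and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


field = (
    "ASCoR, Amsterdam, Netherlands; "
    "ASCoR,  Amsterdam, Netherlands.; "
    "JHU APL, Laurel, MD"
)
print(dedupe_addresses(field))
```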

With best wishes, 


Loet

 

> -----Original Message-----
> From: ASIS&T Special Interest Group on Metrics 
> [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Pikas, Christina K.
> Sent: Wednesday, December 19, 2007 4:26 PM
> To: SIGMETRICS at LISTSERV.UTK.EDU
> Subject: Re: [SIGMETRICS] Accuracy of Thomson data - decentralising
> data collection and enhancing the scope of scientometrics?
> 
> Administrative info for SIGMETRICS (for example unsubscribe):
> http://web.utk.edu/~gwhitney/sigmetrics.html
> 
> I do think that a promising line of research is information fusion from
> the multiple databases -- getting Loet et al the good quality data for
> their analyses.  Sure some platforms "de-dup" but I think there's a need
> for quantifiable and testable rules and metrics to understand how this
> can be done in a justifiable manner.
> 
> I think we've seen that multiple sources need to be searched --
> particularly in areas such as applied math and computer science.  Adding
> in institutional and disciplinary repositories is important, but would
> be extremely difficult from a data cleansing point of view (plus the
> theoretical part: what is a work? Which work? Which version? How to roll
> up citations to different versions? Is a citation to a book chapter the
> "same" as a citation to a conference paper in x domain? -- others are
> taking on some of these issues).  (Continuing a sort of stream of
> consciousness -- would some of the analysis that's gone into FRBR be
> helpful?)
> 
> Recent articles I've seen have used (hundreds if not thousands of) hours
> of graduate student time to process the retrieved records, but that
> isn't really scalable and it seems unnecessary with all of the
> developments in areas like sensor fusion and information fusion from
> multiple sensors (in science and engineering).  If this work has already
> been done, it surely hasn't been applied in the articles I've read.
> 
> 
> Christina
> 
> Christina K. Pikas, MLS 
> R.E. Gibson Library & Information Center
> The Johns Hopkins University Applied Physics Laboratory 
> Voice  240.228.4812 (Washington), 443.778.4812 (Baltimore) 
> Fax 443.778.5353 
> 
> 
> 
> 
> -----Original Message-----
> From: ASIS&T Special Interest Group on Metrics
> [mailto:SIGMETRICS at listserv.utk.edu] On Behalf Of Loet Leydesdorff
> Sent: Wednesday, December 19, 2007 3:28 AM
> To: SIGMETRICS at listserv.utk.edu
> Subject: Re: [SIGMETRICS] Accuracy of Thomson data - decentralising
> data collection and enhancing the scope of scientometrics?
> 
> Dear Chris and colleagues, 
> 
> In my opinion, we have made a lot of progress in terms of data analysis.
> The quality of the data that one inputs into the analysis is a different
> issue.
> There are pros and cons of using different data (SCI, Scopus, Google
> Scholar, etc.), as there are pros and cons of using different techniques
> for the analysis (e.g., different clustering algorithms, similarity
> criteria, etc.). 
> 
> Nevertheless, I think that we have a state of the art in terms of
> techniques. I make some of them available as computer programs and
> lessons for my students at http://www.leydesdorff.net/indicators . If
> you have suggestions for improvements, please, let me know.
> 
> With best wishes, 
> 
> 
> Loet
> 
> ________________________________
> 
> Loet Leydesdorff
> Amsterdam School of Communications Research (ASCoR), Kloveniersburgwal
> 48, 1012 CX Amsterdam. 
> Tel.: +31-20- 525 6598; fax: +31-20- 525 3681 loet at leydesdorff.net ;
> http://www.leydesdorff.net/ 
> 
>  
> 
> > -----Original Message-----
> > From: ASIS&T Special Interest Group on Metrics 
> > [mailto:SIGMETRICS at listserv.utk.edu] On Behalf Of Armbruster, Chris
> > Sent: Wednesday, December 19, 2007 8:45 AM
> > To: SIGMETRICS at listserv.utk.edu
> > Subject: Re: [SIGMETRICS] Accuracy of Thomson data - decentralising 
> > data collection and enhancing the scope of scientometrics?
> > 
> > To the list:
> > 
> > Would you trust the situation to improve if digital repositories 
> > (institutional, disciplinary and/or national) were to provide data in 
> > future?
> > One would possibly expect that a decentralised solution would provide 
> > more comprehensive (types of publication, languages
> > etc.) and more accurate coverage, but one might also worry that the 
> > corpus will be less well defined.... Hence, what would you think if 
> > repositories developed a system of author registration (unique 
> > identifier, institutional affiliation) and provided data?
> > 
> > What is the scope for delivering scientometrics to the digital 
> > workbench of scientists?
> > I have anecdotal evidence that review panels (for major grants, tenure
> > etc. - often very senior scientists) routinely use software and search
> > engines to look up the citation data and indices of applicants and 
> > candidates. If we were not to dismiss this simply as evaluation mania,
> > but to say that all scientists (senior and junior) now need tools for 
> > metric research evaluation to reduce complexity on an everyday basis 
> > (and develop strategies for research, teaching, publishing and 
> > networking) - is scientometrics developed enough to be a reliable 
> > tool?
> > 
> > Context: for the Max Planck Digital Library I am looking into the 
> > potential of digital libraries and repositories for the generation, 
> > collection and evaluation of scientometric data.
> > 
> > Chris Armbruster
> > http://ssrn.com/author=434782
> > 
> > 
> > 
> > 
> > -----Original Message-----
> > From: ASIS&T Special Interest Group on Metrics on behalf of Loet 
> > Leydesdorff
> > Sent: Tue 18/12/2007 20:50
> > To: SIGMETRICS at listserv.utk.edu
> > Subject: Re: [SIGMETRICS] FW: GENERAL: accuracy of Thomson data
> >  
> > Dear Christina and colleagues:
> > 
> > Incorrect journal abbreviations and non-ISI sources
> > 
> > Citations
> > 
> > http://users.fmg.uva.nl/lleydesdorff/list.htm
> > 
> > Table 4: Non-ISI sources and incorrect journal abbreviations with more
> > than 10,000 citations in the JCR 2005.
> > 
> > "With its 54,139 citations, the J Phys Chem-US would belong to the 
> > top-50 journals of the database if it were included. However, this 
> > journal is included in the ISI-database under the abbreviations J Phys
> > Chem A and J Phys Chem B with 32,086 and 59,826 citations, 
> > respectively. For some journals, however, the different spellings in 
> > the references may have large implications. Bornmann et al. (2007, at 
> > p. 105) found 21.5% overestimation of the impact factor of Angewandte 
> > Chemie in 2005 because of authors providing references to both the 
> > German and international editions of this journal (Marx, 2001)."
> >  
> > Source: "Caveats for the Use of Citation Indicators in Research and
> > Journal Evaluations"
> > <http://www.leydesdorff.net/cit_indicators/index.htm>,
> > Journal of the American Society for Information Science 
> > and Technology, February 2008 (forthcoming; available as Early View).
> >  
> > With best wishes,
> > 
> > Loet Leydesdorff
> > Amsterdam School of Communications Research (ASCoR), Kloveniersburgwal 
> > 48, 1012 CX Amsterdam
> > http://users.fmg.uva.nl/lleydesdorff/list.htm
> > 
> > 
> > > -----Original Message-----
> > > From: ASIS&T Special Interest Group on Metrics 
> > > [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Pikas, Christina K.
> > > Sent: Tuesday, December 18, 2007 3:37 PM
> > > To: SIGMETRICS at LISTSERV.UTK.EDU
> > > Subject: [SIGMETRICS] FW: GENERAL: accuracy of Thomson data
> > >
> > > 
> > > Interesting article -- this came across another listserv I'm on. 
> > > - Calls for an audit of WoS data. 
> > > - Suggests median measure
> > > - Points to errors caused by article type designations.
> > >
> > >
> > > Christina K. Pikas, MLS
> > > R.E. Gibson Library & Information Center
> > > The Johns Hopkins University Applied Physics Laboratory
> > > Voice  240.228.4812 (Washington), 443.778.4812 (Baltimore)
> > > Fax 443.778.5353
> > >
> > >
> > 
> 


