Accuracy of Thomson data - decentralising data collection and enhancing the scope of scientometrics?

Pikas, Christina K. Christina.Pikas at JHUAPL.EDU
Wed Dec 19 10:25:55 EST 2007


I do think that a promising line of research is information fusion from
multiple databases -- getting Loet et al. the good-quality data they need
for their analyses.  Sure, some platforms "de-dup", but I think there's a
need for quantifiable and testable rules and metrics to understand how
this can be done in a justifiable manner.
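
To make "quantifiable and testable" concrete, here is a minimal sketch in
Python, not a description of what any platform actually does.  The matching
rule (equal DOIs, otherwise equal normalized title and year), the record
fields, and the three sample records are all assumptions made up for
illustration.  The point is only that a rule can be scored against a
hand-labeled set of duplicate pairs.

```python
import re
import unicodedata
from itertools import combinations


def normalize_title(title):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def same_record(a, b):
    """Hypothetical rule: equal DOIs, else equal normalized title and year."""
    if a.get("doi") and b.get("doi"):
        return a["doi"].lower() == b["doi"].lower()
    return (normalize_title(a["title"]) == normalize_title(b["title"])
            and a["year"] == b["year"])


def evaluate(records, true_pairs):
    """Precision and recall of the rule against hand-labeled duplicate pairs."""
    predicted = {
        frozenset((a["id"], b["id"]))
        for a, b in combinations(records, 2)
        if same_record(a, b)
    }
    truth = {frozenset(pair) for pair in true_pairs}
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 1.0
    recall = hits / len(truth) if truth else 1.0
    return precision, recall


# Invented records standing in for exports from three databases.
records = [
    {"id": "wos:1", "title": "An Example Title on Citation Data",
     "year": 2007, "doi": None},
    {"id": "scopus:1", "title": "An example title on citation data.",
     "year": 2007, "doi": None},
    {"id": "gs:1", "title": "An example title on citation data",
     "year": 2006, "doi": None},  # same work, but indexed with another year
]
# Hand-labeled truth: all three ids above refer to wos:1's paper.
true_pairs = [("wos:1", "scopus:1"), ("wos:1", "gs:1")]

precision, recall = evaluate(records, true_pairs)
```

In this toy case the rule makes no false matches but misses the record
with the differing year, which is exactly the kind of trade-off one would
want reported alongside any fused data set.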

I think we've seen that multiple sources need to be searched --
particularly in areas such as applied math and computer science.  Adding
in institutional and disciplinary repositories is important, but would
be extremely difficult from a data cleansing point of view (plus the
theoretical part: what is a work? Which work? Which version? How to roll
up citations to different versions? Is a citation to a book chapter the
"same" as a citation to a conference paper in x domain? -- others are
taking on some of these issues).  (Continuing a sort of stream of
consciousness -- would some of the analysis that's gone into FRBR be
helpful?)
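
As a rough illustration of the "roll up" question, the sketch below sums
citation counts over versions that have been assigned to the same work, in
the spirit of the FRBR work level.  The version-to-work mapping is assumed
to exist already (built by hand or by some matching rule); the identifiers
and counts are invented, and deciding which versions belong to one work is
precisely the open question above.

```python
from collections import defaultdict


def roll_up(citation_counts, version_to_work):
    """Sum citation counts over all versions mapped to the same work.

    Versions without an entry in version_to_work are kept as their own work.
    """
    totals = defaultdict(int)
    for version_id, count in citation_counts.items():
        work_id = version_to_work.get(version_id, version_id)
        totals[work_id] += count
    return dict(totals)


# Invented example: three versions of one work plus an unrelated item.
citation_counts = {"preprint": 3, "conf_paper": 5, "journal": 10, "other": 2}
version_to_work = {"preprint": "W1", "conf_paper": "W1", "journal": "W1"}

work_totals = roll_up(citation_counts, version_to_work)
```

Keeping the mapping separate from the counting makes the choice explicit:
a different answer to "which version?" only changes the mapping, and the
resulting totals can be compared side by side.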

Recent articles I've seen have used (hundreds if not thousands of) hours
of graduate student time to process the retrieved records, but that
isn't really scalable and it seems unnecessary with all of the
developments in areas like sensor fusion and information fusion from
multiple sensors (in science and engineering).  If this work has already
been done, it surely hasn't been applied in the articles I've read.


Christina

Christina K. Pikas, MLS 
R.E. Gibson Library & Information Center
The Johns Hopkins University Applied Physics Laboratory 
Voice  240.228.4812 (Washington), 443.778.4812 (Baltimore) 
Fax 443.778.5353 




-----Original Message-----
From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at listserv.utk.edu] On Behalf Of Loet Leydesdorff
Sent: Wednesday, December 19, 2007 3:28 AM
To: SIGMETRICS at listserv.utk.edu
Subject: Re: [SIGMETRICS] Accuracy of Thomson data - decentralising data
collection and enhancing the scope of scientometrics?


Dear Chris and colleagues, 

In my opinion, we have made a lot of progress in terms of data analysis.
The quality of the data that one inputs into the analysis is a different
issue. There are pros and cons to using different data (SCI, Scopus,
Google Scholar, etc.), as there are pros and cons to using different
techniques for the analysis (e.g., different clustering algorithms,
similarity criteria, etc.).

Nevertheless, I think that we have a state of the art in terms of
techniques. I make some of them available as computer programs and
lessons for my students at http://www.leydesdorff.net/indicators . If
you have suggestions for improvements, please, let me know.

With best wishes, 


Loet

________________________________

Loet Leydesdorff
Amsterdam School of Communications Research (ASCoR), Kloveniersburgwal
48, 1012 CX Amsterdam. 
Tel.: +31-20- 525 6598; fax: +31-20- 525 3681 loet at leydesdorff.net ;
http://www.leydesdorff.net/ 

 

> -----Original Message-----
> From: ASIS&T Special Interest Group on Metrics 
> [mailto:SIGMETRICS at listserv.utk.edu] On Behalf Of Armbruster, Chris
> Sent: Wednesday, December 19, 2007 8:45 AM
> To: SIGMETRICS at listserv.utk.edu
> Subject: Re: [SIGMETRICS] Accuracy of Thomson data - decentralising 
> data collection and enhancing the scope of scientometrics?
> 
> Administrative info for SIGMETRICS (for example unsubscribe):
> http://web.utk.edu/~gwhitney/sigmetrics.html
> 
> To the list:
> 
> Would you trust the situation to improve if digital repositories 
> (institutional, disciplinary and/or national) were to provide data in 
> future?
> One would possibly expect that a decentralised solution would provide 
> more comprehensive (types of publication, languages
> etc.) and more accurate coverage, but one might also worry that the 
> corpus will be less well defined.... Hence, what would you think if 
> repositories developed a system of author registration (unique 
> identifier, institutional affiliation) and provided data?
> 
> What is the scope for delivering scientometrics to the digital 
> workbench of scientists?
> I have anecdotal evidence that review panels (for major grants, tenure
> etc. - often very senior scientists) routinely use software and search
> engines to look up the citation data and indices of applicants and 
> candidates. If we were not to dismiss this simply as evaluation mania,
> but to say that all scientists (senior and junior) now need tools for 
> metric research evaluation to reduce complexity on an everyday basis 
> (and develop strategies for research, teaching, publishing and 
> networking) - is scientometrics developed enough to be a reliable 
> tool?
> 
> Context: for the Max Planck Digital Library I am looking into the 
> potential of digital libraries and repositories for the generation, 
> collection and evaluation of scientometric data.
> 
> Chris Armbruster
> http://ssrn.com/author=434782
> 
> 
> 
> 
> -----Original Message-----
> From: ASIS&T Special Interest Group on Metrics on behalf of Loet 
> Leydesdorff
> Sent: Tue 18/12/2007 20:50
> To: SIGMETRICS at listserv.utk.edu
> Subject: Re: [SIGMETRICS] FW: GENERAL: accuracy of Thomson data
>  
> Dear Christina and colleagues:
> 
> Incorrect journal abbreviations and non-ISI sources
> 
> Citations
> 
> http://users.fmg.uva.nl/lleydesdorff/list.htm
> 
> Table 4: Non-ISI sources and incorrect journal abbreviations with more
> than 10,000 citations in the JCR 2005.
> 
> "With its 54,139 citations, the J Phys Chem-US would belong to the 
> top-50 journals of the database if it were included. However, this 
> journal is included in the ISI-database under the abbreviations J Phys
> Chem A and J Phys Chem B with 32,086 and 59,826 citations, 
> respectively. For some journals, however, the different spellings in 
> the references may have large implications. Bornmann et al. (2007, at 
> p. 105) found 21.5% overestimation of the impact factor of Angewandte 
> Chemie in 2005 because of authors providing references to both the 
> German and international editions of this journal (Marx, 2001)."
>  
> Source: "Caveats for the Use of Citation Indicators in Research and
> Journal Evaluations," Journal of the American Society for Information
> Science and Technology, February 2008 (forthcoming; available as Early
> View). http://www.leydesdorff.net/cit_indicators/index.htm
>  
> With best wishes,
> 
> Loet Leydesdorff
> Amsterdam School of Communications Research (ASCoR), Kloveniersburgwal
> 48, 1012 CX Amsterdam. http://users.fmg.uva.nl/lleydesdorff/list.htm
> 
> 
> > -----Original Message-----
> > From: ASIS&T Special Interest Group on Metrics 
> > [mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Pikas,
> > Christina K.
> > Sent: Tuesday, December 18, 2007 3:37 PM
> > To: SIGMETRICS at LISTSERV.UTK.EDU
> > Subject: [SIGMETRICS] FW: GENERAL: accuracy of Thomson data
> >
> > 
> > Interesting article -- this came across another listserv I'm on. 
> > - Calls for an audit of WoS data. 
> > - Suggests median measure
> > - Points to errors caused by article type designations.
> >
> >
> > Christina K. Pikas, MLS
> > R.E. Gibson Library & Information Center
> > The Johns Hopkins University Applied Physics Laboratory
> > Voice  240.228.4812 (Washington), 443.778.4812 (Baltimore)
> > Fax 443.778.5353
> >
> >
> 


