Journal classification metrics and clustering algorithms?

David Wojick dwojick at CRAIGELLACHIE.US
Tue Dec 3 12:34:40 EST 2013


To my knowledge Google Scholar does not do journal classification. All such 
classification schemes are relatively arbitrary because science is seamless 
and multi-dimensional. Nor, as far as I know, does GS measure the similarity 
of journals. Happy to learn otherwise, of course.

GS does measure the similarity between articles with its "related 
articles" feature, which is quite good. This feature appears to use term 
vector similarity measures, but I have yet to find this explained anywhere. 
I myself have built a community detection algorithm using the GS related 
articles feature. It measures the conceptual distance, from a given central 
concept, for each related article. In this case the community being 
identified is the one that uses the central concept. It probably could do 
journals as well, but I have never tried that.
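
For readers unfamiliar with the idea, here is a minimal sketch of what a
term vector similarity and a "conceptual distance" from a central concept
could look like. It is an illustration only: GS does not document how
"related articles" works, and this is not the algorithm described above.
The function names, the whitespace tokenizer and the use of 1 minus the
cosine as a distance are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def term_vector(text):
    """Hypothetical tokenizer: lowercase, split on whitespace, count terms."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_u * norm_v)


def conceptual_distance(central_concept, article_text):
    """Assumed distance: 1 - cosine similarity to the central concept."""
    return 1.0 - cosine_similarity(term_vector(central_concept),
                                   term_vector(article_text))


def rank_by_distance(central_concept, articles):
    """Sort article texts from conceptually nearest to farthest."""
    return sorted(articles,
                  key=lambda a: conceptual_distance(central_concept, a))
```

A community could then be taken as the articles whose distance falls
below some chosen cutoff; the cutoff itself is a further assumption.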

David

At 01:11 PM 12/3/2013, you wrote:
>Administrative info for SIGMETRICS (for example unsubscribe):
>http://web.utk.edu/~gwhitney/sigmetrics.html
>
>I am interested in the issue of Journal Classification used by Scopus,
>Thomson Reuters, Google Scholar and so on. I do not know if this is
>strategic information for the companies, but I would appreciate it if
>anyone could point me to references on:
>
>a) which metric the companies use to measure the similarity between journals?
>
>b) which clustering/community detection algorithm they use?
>
>Regarding the metric (a) I assume that it is bibliographic coupling,
>but there are two subquestions:
>
>a1) bibliographic coupling is, in principle, an asymmetric measure -
>that is, the similarity from A to B may not be the same as the
>similarity from B to A. If journal A cites 20 documents in common with B
>but A cites in total 200 documents while B cites 2000 documents, the
>two similarities are one order of magnitude different! Do they use any
>symmetrization procedure?
>
>a2) Is there a fixed interval for collecting the citations made by a
>journal (say citations in the last two years?) or should one use the
>whole history of the journal?
>
>
>Thank you
>
>jacques wainer
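
The arithmetic in question a1) above can be made concrete with a short
sketch. It assumes that the "similarity from A to B" means the shared
references divided by A's total references, which is one reading of the
question rather than a documented vendor definition. The cosine (Salton)
and Jaccard normalizations are shown only as two common ways to
symmetrize such counts; nothing here says that Scopus, Thomson Reuters or
GS use them. The function name and dictionary keys are made up for the
example.

```python
from math import sqrt


def coupling_similarities(shared, refs_a, refs_b):
    """Directional and symmetrized similarities from coupling counts.

    shared: number of documents cited by both journals
    refs_a: total documents cited by journal A
    refs_b: total documents cited by journal B
    """
    return {
        # Assumed directional measures: share of each journal's own
        # reference list that is held in common with the other journal.
        "a_to_b": shared / refs_a,
        "b_to_a": shared / refs_b,
        # Symmetric: shared count over the geometric mean of the totals.
        "cosine": shared / sqrt(refs_a * refs_b),
        # Symmetric: shared count over the size of the union.
        "jaccard": shared / (refs_a + refs_b - shared),
    }


# The numbers from the question: 20 shared, 200 vs 2000 cited documents.
sims = coupling_similarities(shared=20, refs_a=200, refs_b=2000)
for name, value in sims.items():
    print(f"{name}: {value:.4f}")
```

With these counts the two directional values are 0.1 and 0.01, the one
order of magnitude gap noted in the question, while the cosine and
Jaccard values are the same whichever journal is taken first.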


