Journal classification metrics and clustering algorithms?
David Wojick
dwojick at CRAIGELLACHIE.US
Tue Dec 3 12:34:40 EST 2013
To my knowledge, Google Scholar (GS) does not do journal classification.
All such classification schemes are relatively arbitrary because science is
seamless and multi-dimensional, as it were. Nor, as far as I know, does GS
measure the similarity of journals. Happy to learn otherwise, of course.
GS does measure the similarity between articles with its "related
articles" feature, which is quite good. This feature appears to use term
vector similarity measures, but I have yet to find this explained anywhere.
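
For readers unfamiliar with the idea, here is a minimal sketch of what a term
vector similarity measure can look like. It is only an illustration: how GS
actually computes "related articles" is not documented, and the bag-of-words
vectors, the cosine measure, and the function names below are all assumptions.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a bag-of-words term-frequency vector (lowercased, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_related(query_text, candidates):
    """Return (score, text) pairs sorted from most to least similar to the query."""
    q = term_vector(query_text)
    scored = [(cosine_similarity(q, term_vector(c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling rank_related() with one article's text and a list of candidate texts
gives a ranked list, which is roughly the shape of a "related articles" result.
A real system would presumably weight terms (for example by rarity) rather
than use raw counts.
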
I myself have built a community detection algorithm using the GS related
articles feature. It measures the conceptual distance, from a given central
concept, for each related article. In this case the community being
identified is the one that uses the central concept. It could probably do
journals as well, but I have never tried that.
David
At 01:11 PM 12/3/2013, you wrote:
>I am interested in the issue of Journal Classification used by Scopus,
>Thomson Reuters, Google Scholar and so on. I do not know if this is
>strategic information for the companies, but I would appreciate it if
>anyone could point me to references on:
>
>a) which metric the companies use to measure the similarity between journals?
>
>b) which clustering/community detection algorithm they use?
>
>Regarding the metric (a) I assume that it is bibliographic coupling,
>but there are two subquestions:
>
>a1) bibliographic coupling is, in principle, an asymmetric measure -
>that is, the similarity from A to B may not be the same as the
>similarity from B to A. If journal A cites 20 documents in common with B
>but A cites in total 200 documents while B cites 2000 documents, the
>two similarities are one order of magnitude different! Do they use any
>symmetrization procedure?
>
>a2) Is there a fixed interval for collecting the citations made by a
>journal (say citations in the last two years?) or should one use the
>whole history of the journal?
>
>
>Thank you
>
>jacques wainer
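
The asymmetry raised in a1) can be made concrete with a short sketch. It
assumes the directed similarity is the shared-reference count divided by each
journal's own reference total, and it shows two common symmetrizations (cosine
and Jaccard). Whether Scopus, Thomson Reuters or GS use any of these is not
known; the function name and dictionary keys are made up for illustration.

```python
import math


def coupling_measures(shared, refs_a, refs_b):
    """Directed and symmetrized bibliographic coupling for two journals.

    shared: number of cited documents the two journals have in common
    refs_a, refs_b: total number of documents cited by journal A and by journal B
    """
    return {
        # Directed: share of each journal's own references that is common.
        "a_to_b": shared / refs_a,
        "b_to_a": shared / refs_b,
        # Symmetric: shared count over the geometric mean of the two totals.
        "cosine": shared / math.sqrt(refs_a * refs_b),
        # Symmetric: shared count over the size of the union of references.
        "jaccard": shared / (refs_a + refs_b - shared),
    }


# The example from the question: 20 shared, A cites 200, B cites 2000.
measures = coupling_measures(20, 200, 2000)
```

With these numbers the directed values are 0.1 (A to B) and 0.01 (B to A),
the order-of-magnitude gap described in the question, while the cosine and
Jaccard values are each a single number that is the same in both directions.
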