ART: Kleinberg, Hubs, authorities, and communities
Gretchen Whitney
gwhitney at UTKUX.UTCC.UTK.EDU
Mon Nov 20 18:15:30 EST 2000
Jon M. Kleinberg : E-Mail : kleinber at CS.Cornell.EDU
TITLE : Hubs, authorities, and communities (Article, English)
AUTHOR : Kleinberg, JM
JOURNAL : ACM Computing Surveys, 31(4) (Suppl) December 1999
KEYWORDS : algorithms, human factors, hypertext structure, world wide
web, link analysis, graph algorithms
AUTHOR ADDRESS: JM Kleinberg, Cornell Univ. Dept Comp Sci, Ithaca, NY
The entire paper can be accessed at :
http://www.cs.brown.edu/memex/ACMCSHT/10/10.html
EXCERPT FROM THE PAPER :
In the field of citation analysis, a number of methods have been
proposed for measuring the importance of scientific journals [Egghe 1990].
Perhaps the most widely used is Garfield's impact factor, which provides a
quantitative "score" for each journal proportional to the average
number of citations per paper published in the previous two years [Garfield
1972]. This measure encodes the fundamental intuition that more
heavily-cited journals have more overall impact on a field, and it has
been applied to rank journals in the Journal Citation Reports of the
Institute for Scientific Information.
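
As a rough illustration of the two-year average described above, the
following Python sketch computes such a score for two hypothetical
journals. The function name impact_factor, the journal names, and all
counts are invented for the example; the exact counting rules used in the
Journal Citation Reports are not reproduced here.

```python
def impact_factor(citations_received, papers_published):
    """Average number of citations per paper over a two-year window.

    citations_received: citations made this year to papers the journal
        published in the previous two years (hypothetical input).
    papers_published: number of papers the journal published in those
        same two years.
    """
    if papers_published <= 0:
        raise ValueError("papers_published must be positive")
    return citations_received / papers_published


# Invented counts for two hypothetical journals: (citations, papers).
journals = {
    "Journal A": (1200, 400),
    "Journal B": (300, 150),
}

scores = {name: impact_factor(c, p) for name, (c, p) in journals.items()}

# Rank journals from highest to lowest score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Note that this score treats every citation alike, whichever journal it
comes from.
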
Beginning with this measure, we could picture enhancing our estimate
of the important journals as follows. Suppose we have concluded, by
counting citations, that the journals Science and Nature are highly
prominent. Then if we are comparing two more obscure journals which
have received roughly the same number of citations as one another, and we
discover that one of these journals has received many citations from
Science and Nature, we may wish to elevate its ranking. In other words, it
is better to receive citations from an important journal than from an
unimportant one. We can see this phenomenon on the WWW as well: counting
the number of links to a page can give us a general estimate of its
prominence on the Web, but a page with very few incoming links may also be
prominent, if two of these links come from the home pages of Yahoo! and
Netscape. Defining such a richer notion of importance, or prominence,
contains an intrinsic element of circularity: it arises from the fragile
intuition that a node is important if it receives links from other important
nodes. Several measures incorporate this basic circular notion, and each
contains a method for capturing the implicit equilibrium that this
circularity encodes.
Two early approaches to embrace this theme in the study of social
networks are the measures of Katz [Katz 1953] and Hubbell [Hubbell
1965]. (See also the discussion in Wasserman and Faust [Wasserman 1994].) In
Hubbell's formulation, each node has an internal, a priori weight that
is given at the outset. We are also given a specified connection strength
between each pair of nodes. We seek to assign a global weight, or
prominence value, to each node in such a way that a node's global
weight is equal to the sum of its internal weight and the global weights of
all nodes that link to it, scaled by their connection strengths. This can
be represented as a collection of linear equations; its solution captures a
version of the equilibrium discussed above. The solution has some of
the key features we were seeking; if a node has large weight, the nodes it
links to will tend to have large weights as well. In the field of
citation analysis, Pinski and Narin [Pinski 1976] developed a similar
notion of influence weights, using a somewhat different mathematical model.
First, they define the strength of the connection from one journal to
another to be the percentage of the citations in the first journal that
refer to the second. They then seek a set of weights that obey the
following equilibrium: the weight of each journal J should be equal to the
sum of the weights of all journals citing J, scaled by the strengths
of their connections to it. Again, we can see desirable features of this
definition; if a journal receives regular citations from other
journals of large weight, it too will acquire large weight.
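
The two equilibria described above can be sketched in Python as follows.
This is only an illustration of the excerpt's wording, not code from
Kleinberg's paper or from the cited works: the function names, the toy
matrices, and the fixed iteration count are all assumptions. Both functions
use simple repeated substitution. For the Hubbell-style system this
converges only if the connection strengths are small enough, and for the
influence weights it assumes every journal cites at least one other journal
and that the citation graph is connected enough for the iteration to settle.

```python
def hubbell_weights(strength, internal, iterations=200):
    """Iterate x_i = internal_i + sum_j strength[j][i] * x_j.

    strength[j][i] is the connection strength of the link from node j
    to node i. Convergence is assumed, not checked.
    """
    n = len(internal)
    x = list(internal)
    for _ in range(iterations):
        x = [internal[i] + sum(strength[j][i] * x[j] for j in range(n))
             for i in range(n)]
    return x


def connection_strengths(citations):
    """Fraction of journal i's outgoing citations that refer to journal j."""
    totals = [sum(row) for row in citations]
    return [[c / totals[i] for c in row] for i, row in enumerate(citations)]


def influence_weights(citations, iterations=200):
    """Iterate w_j = sum_i s[i][j] * w_i, rescaling so the weights sum to 1."""
    n = len(citations)
    s = connection_strengths(citations)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(s[i][j] * w[i] for i in range(n)) for j in range(n)]
        total = sum(w)
        w = [v / total for v in w]
    return w


# Toy network: nodes 0 and 1 link to node 2, and node 2 links back to node 0.
strength = [
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.5],
    [0.25, 0.0, 0.0],
]
internal = [1.0, 1.0, 1.0]
hubbell = hubbell_weights(strength, internal)

# Toy citation counts: citations[i][j] = citations from journal i to journal j.
citations = [
    [0, 3, 1],
    [2, 0, 2],
    [4, 1, 0],
]
influence = influence_weights(citations)
```

In the toy network, node 2 ends up with the largest global weight because
it receives links from both other nodes, and node 0 ranks above node 1
because its only incoming link comes from that heavily weighted node.
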