Shimbo, M (Shimbo, Masashi); Ito, T (Ito, Takahiko); Mochihashi, D (Mochihashi, Daichi); Matsumoto, Y (Matsumoto, Yuji) On the properties of von Neumann kernels for link analysis MACHINE LEARNING, 75 (1): 37-67 APR 2009
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Wed Mar 25 12:19:41 EDT 2009
E-mail Address: shimbo at is.naist.jp; tito at microsoft.com;
daichi at cslab.kecl.ntt.co.jp; matsu at is.naist.jp
Author(s): Shimbo, M (Shimbo, Masashi); Ito, T (Ito, Takahiko);
Mochihashi, D (Mochihashi, Daichi); Matsumoto, Y (Matsumoto, Yuji)
Title: On the properties of von Neumann kernels for link analysis
Source: MACHINE LEARNING, 75 (1): 37-67 APR 2009
Language: English
Document Type: Article
Author Keywords: Link analysis; Recommender system; von Neumann kernel;
HITS; Topic drift
KeyWords Plus: LATENT SEMANTIC ANALYSIS; DOCUMENTS; GRAPH
Abstract: We study the effectiveness of Kandola et al.'s von Neumann
kernels as a link analysis measure. We show that von Neumann kernels
subsume Kleinberg's HITS importance at the limit of their parameter range.
Because they reduce to co-citation relatedness at the other end of the
parameter, von Neumann kernels give us a spectrum of link analysis
measures between the two established measures of importance and
relatedness. Hence the relative merit of a vertex can be evaluated in
terms of varying trade-offs between the global importance and the local
relatedness within a single parametric framework. As a generalization of
HITS, von Neumann kernels inherit the problem of topic drift. When a graph
consists of multiple communities each representing a different topic, HITS
is known to rank vertices in the most dominant community higher regardless
of the query term. This problem persists in von Neumann kernels; when the
parameter is biased towards the direction of global importance, they tend
to rank vertices in the dominant community uniformly higher irrespective
of the community of the seed vertex relative to which the ranking is
computed. To alleviate topic drift, we propose the use of a PLSI-based
technique in combination with von Neumann kernels. Experimental results on
a citation network of scientific papers demonstrate the characteristics
and effectiveness of von Neumann kernels.
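
The trade-off described in the abstract can be illustrated with a small
sketch. It assumes the usual Neumann-series form of the kernel, which the
abstract itself does not spell out: K_beta = sum over n >= 1 of
beta^(n-1) A^n, with A the co-citation matrix and 0 <= beta < 1/lambda_max(A).
The citation data, the function names, and the truncation length are all
hypothetical, chosen only for illustration. This is not the authors'
implementation, and the PLSI-based remedy is not shown.

```python
# Toy sketch of a von Neumann kernel on a co-citation matrix (pure Python).
# Assumed definition: K_beta = sum_{n>=1} beta^(n-1) A^n,
# valid for 0 <= beta < 1 / lambda_max(A).

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def cocitation(M):
    """Co-citation matrix M^T M; M[r][j] = 1 when citing paper r cites paper j."""
    Mt = [list(col) for col in zip(*M)]
    return matmul(Mt, M)

def largest_eigenvalue(A, iters=200):
    """Estimate the dominant eigenvalue of a nonzero matrix A by power iteration."""
    v = [1.0] * len(A)
    lam = 0.0
    for _ in range(iters):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]
        lam = max(abs(x) for x in w)
        v = [x / lam for x in w]
    return lam

def von_neumann_kernel(A, beta, terms=500):
    """Truncated Neumann series: sum of beta^(n-1) A^n for n = 1..terms."""
    n = len(A)
    K = [[0.0] * n for _ in range(n)]
    term = [list(row) for row in A]  # current term, starting at beta^0 A^1
    for _ in range(terms):
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
        term = [[beta * x for x in row] for row in matmul(term, A)]
    return K

def rank_from_seed(K, seed):
    """Other vertices ordered by decreasing kernel value relative to the seed."""
    others = [j for j in range(len(K)) if j != seed]
    return sorted(others, key=lambda j: (-K[seed][j], j))

# Hypothetical citation data: rows are citing papers, columns are cited papers.
# Papers 0-2 form the larger community, papers 3-4 the smaller one;
# the last row is a single citing paper that bridges the two.
M = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0],
]

A = cocitation(M)
beta_max = 1.0 / largest_eigenvalue(A)

# Rank the other papers relative to seed paper 4 (smaller community)
# at three points of the parameter range.
for beta in (0.0, 0.5 * beta_max, 0.99 * beta_max):
    K = von_neumann_kernel(A, beta)
    print(round(beta, 4), rank_from_seed(K, seed=4))
```

At beta = 0 the kernel equals the co-citation matrix, so the ranking
reflects only direct co-citation with the seed. As beta approaches
1/lambda_max, the principal eigenvector of A dominates the sum, and on this
toy graph the papers of the larger community move ahead of the seed's own
neighbor, which is the topic drift the abstract describes. The truncated
series is a convenience for a dependency-free sketch; a matrix inverse
would give the closed form A (I - beta A)^(-1).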
Addresses: [Shimbo, Masashi; Ito, Takahiko; Matsumoto, Yuji] Nara Inst Sci
& Technol, Grad Sch Informat Sci, Nara 6300192, Japan; [Mochihashi,
Daichi] NTT Commun Sci Labs, Keihanna Sci City, Kyoto 6190237, Japan
Reprint Address: Shimbo, M, Nara Inst Sci & Technol, Grad Sch Informat
Sci, 8916-5 Takayama, Nara 6300192, Japan.
Cited Reference Count: 40
Times Cited: 0
Publisher: SPRINGER
Publisher Address: VAN GODEWIJCKSTRAAT 30, 3311 GZ DORDRECHT, NETHERLANDS
ISSN: 0885-6125
DOI: 10.1007/s10994-008-5090-6
29-char Source Abbrev.: MACH LEARN
ISO Source Abbrev.: Mach. Learn.
Source Item Page Count: 31
Subject Category: Computer Science, Artificial Intelligence
ISI Document Delivery No.: 411TO
ACHARYYA S
WORKSH LINK AN DET C : 2003
BALDI P
MODELING INTERNET WE : 2003
BHARAT K
P 21 ANN INT ACM SIG : 1998
BOLLACKER K
P 2 INT C AUT AG : 116 1998
BRIN S
The anatomy of a large-scale hypertextual Web search engine
COMPUTER NETWORKS AND ISDN SYSTEMS 30 : 107 1998
CHEBOTAREV PY
The matrix-forest theorem and measuring relations in small social groups
AUTOMATION AND REMOTE CONTROL 58 : 1505 1997
CHUNG FRK
SPECTRAL GRAPH THEOR : 1997
COHN D
P 17 INT C MACH LEAR : 167 2000
CRISTIANINI N
On kernel-target alignment
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 14, VOLS 1 AND 2 14 :
367 2002
DALE R
HDB NATURAL LANGUAGE : 2000
DEERWESTER S
INDEXING BY LATENT SEMANTIC ANALYSIS
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE 41 : 391 1990
DHYANI D
A survey of Web metrics
ACM COMPUTING SURVEYS 34 : 469 2002
FAGIN R
Comparing top k lists
SIAM JOURNAL ON DISCRETE MATHEMATICS 17 : 134 DOI
10.1137/S0895480102412856 2003
FOUSS F
Random-walk computation of similarities between nodes of a graph with
application to collaborative recommendation
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 19 : 355 2007
FOUSS F
P 2006 IEEE INT C DA : 863 2006
HAUSSLER D
UCSCCRL9910 : 1999
HOFMANN T
Learning the similarity of documents: An information-geometric approach to
document retrieval and categorization
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12 12 : 914 2000
HOFMANN T
Unsupervised learning by probabilistic latent semantic analysis
MACHINE LEARNING 42 : 177 2001
HOFMANN T
P 22 ANN INT ACM SIG : 50 1999
ITO T
P 10 EUR C PRINC PRA : 235 2006
ITO T
P 11 ACM SIGKDD : 586 2005
JAAKKOLA TS
Exploiting generative models in discriminative classifiers
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11 11 : 487 1999
KANDOLA J
ADV NEURAL INFORM PR 15 : 673 2003
KANDOLA J
NCTR2002121 : 2002
KESSLER MM
BIBLIOGRAPHIC COUPLING BETWEEN SCIENTIFIC PAPERS
AMERICAN DOCUMENTATION 14 : 10 1963
KLEINBERG JM
Authoritative sources in a hyperlinked environment
JOURNAL OF THE ACM 46 : 604 1999
KONDOR R
P 18 INT C MACH LEAR : 21 2001
LEMPEL R
SALSA: The stochastic approach for link-structure analysis
ACM TRANSACTIONS ON INFORMATION SYSTEMS 19 : 131 2001
LEPAIR C
HDB QUANTITATIVE STU : 537 1988
NADLER B
ADV NEURAL INFORM PR 18 : 955 2006
SAERENS M
The principal components analysis of a graph, and its relationships to
spectral clustering
MACHINE LEARNING: ECML 2004, PROCEEDINGS 3201 : 371 2004
SHAWETAYLOR J
KERNEL METHODS PATTE : 2004
SHIMBO M
MINING GRAPH DATA : CH12 2006
SHIMBO M
P ACM IEEE JOINT C D : 354 2007
SIEGEL S
NONPARAMETRIC STAT B : 1988
SMALL H
COCITATION IN SCIENTIFIC LITERATURE - NEW MEASURE OF RELATIONSHIP BETWEEN
2 DOCUMENTS
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE 24 : 265 1973
SMOLA AJ
Kernels and regularization on graphs
LEARNING THEORY AND KERNEL MACHINES 2777 : 144 2003
WHITE S
P 9 ACM SIGKDD INT C : 266 2003
ZELNIKMANOR L
ADV NEURAL INFORM PR 17 : 2005
ZHOU D
P WORKSH STAT REL LE : 2004