Guo, Z; Zhang, ZF; Zhu, SH; Chi, Y; Gong, YH. 2009. Knowledge Discovery from Citation Networks. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING: 800-805

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Mon Apr 18 15:29:29 EDT 2011


Guo, Z; Zhang, ZF; Zhu, SH; Chi, Y; Gong, YH. 2009. Knowledge Discovery 
from Citation Networks. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA 
MINING: 800-805. Edited by Wang, W; Kargupta, H; Ranka, S; Yu, PS; Wu, 
XD. Presented at the 9th IEEE International Conference on Data Mining in Miami 
Beach, FL, DEC 06-09, 2009.

Author Full Name(s): Guo, Zhen; Zhang, Zhongfei (Mark); Zhu, Shenghuo; Chi, 
Yun; Gong, Yihong
Book series title: IEEE International Conference on Data Mining
Language: English
Document Type: Proceedings Paper

Author Keywords: Unsupervised learning; latent models; text mining

Abstract: Knowledge discovery from scientific articles has received increasing 
attention recently, since huge repositories have been made available by the 
development of the Internet and digital databases. In a corpus of scientific 
articles such as a digital library, documents are connected by citations, and each 
document plays two different roles in the corpus: the document itself and a 
citation of other documents. In existing topic models, little effort is made 
to differentiate these two roles. We believe that the topic distributions of 
these two roles are different and related in a certain way. In this paper we 
propose a Bernoulli Process Topic (BPT) model which models the corpus at two 
levels: document level and citation level. In the BPT model, each document has 
two different representations in the latent topic space associated with its 
roles. Moreover, the multi-level hierarchical structure of the citation network is 
captured by a generative process involving a Bernoulli process. The distribution 
parameters of the BPT model are estimated by a variational approximation 
approach. In addition to conducting the experimental evaluations on the 
document modeling task, we also apply the BPT model to a well-known 
scientific corpus to discover the latent topics. Comparisons against 
state-of-the-art methods demonstrate very promising performance.

Addresses: [Guo, Zhen; Zhang, Zhongfei (Mark)] SUNY Binghamton, Dept Comp 
Sci, Binghamton, NY 13902 USA

Reprint Address: Guo, Z, SUNY Binghamton, Dept Comp Sci, Binghamton, NY 
13902 USA.
E-mail Address: zguo at cs.binghamton.edu; zhongfei at cs.binghamton.edu; 
zsh at sv.nec-labs.com; ychi at sv.nec-labs.com; ygong at sv.nec-labs.com
ISSN: 1550-4786
ISBN: 978-1-4244-5242-2
fulltext: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5360314&tag=1


