Tutorial on Web mining at CIKM 2008
Peiling Wang
peilingw at UTK.EDU
Tue Aug 19 14:08:08 EDT 2008
Call for participation:
A half-day tutorial on the afternoon of October 26 on
Web search log analysis and user behavior modeling
by Peiling Wang, Lei Wu, Dietmar Wolfram
at the ACM Seventeenth Conference on Information and Knowledge Management (CIKM 2008)
(October 26-30, 2008, Napa Valley Marriott Hotel & Spa, California, USA)
Abstract:
Web search logs capture valuable user-generated data as users naturally search
the Website. These log data can reveal what users were searching for and how
they searched. However, despite being rich and informative, these transactional
log records are unstructured and messy. The current IR strategies for handling
structured documents (e.g., tf-idf, vector space) are not readily applicable to
studying user query log data. The query corpora include a large number of search
formulations that are short linguistic expressions, which reflect how the
majority of users interact with the Web. With server-side logs, search
session boundaries are undefined, which makes individual search sessions
difficult to identify. Even though individual users can be identified in an
intranet environment or using client-side logs, identifying individual search
sessions remains a big challenge. Using data mining strategies and
technologies, we can process data once into a data model that is simple and
uniform to allow intensive exploration. We can explore the data in different
ways to build models of Web search behaviors. However, current data mining
tools developed for business applications do not apply to transactional query
logs. Transforming unstructured log data into a relational database for mining
requires a deep understanding of both IR and data mining. In addition,
innovative tools must be developed to support ongoing analysis because new
questions often emerge when the current hypotheses are being studied. Although
the literature on Web transaction log analysis has grown fast over the past
decade, the published research, with few exceptions, tends to focus on
presenting analytical results with insufficient coverage of technical details
to enable later researchers to replicate the study using the same data or
different data. Many tools developed by individual projects are not shared
outside of their research context. This gap must be filled so that findings can
withstand cross-examination and can be systematically compared.
This tutorial is built on research that the instructors have conducted on
Web search behaviors over the past decade. Through a series of programmed,
intensive research projects analyzing large amounts of transactional logs
from different search environments, one of which is supported by a National
Leadership on Research grant from the Institute for Museum and Library
Services (IMLS), the instructors have gained in-depth knowledge of and
insight into Web search behaviors, as well as experience and skills in
processing and analyzing large Web search transactional logs to model
these behaviors. This tutorial will teach the algorithms and technical
implementations that participants can use for their own research design and for
Web applications that incorporate user observation and effective search
support.
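As a small illustration of two problems the abstract raises (undefined session
boundaries in server-side logs, and moving raw log records into a relational
database), the sketch below may help. It is not the instructors' method: the
log format, the 30-minute inactivity cutoff, the table layout, and all names
are assumptions made for this example only.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical log lines: ISO timestamp, anonymized user ID, query text,
# separated by tabs. Real server logs will look different.
RAW_LOG = [
    "2008-08-19T14:00:00\tu1\tcikm 2008",
    "2008-08-19T14:02:30\tu1\tcikm 2008 tutorials",
    "2008-08-19T15:10:00\tu1\tnapa valley hotels",
    "2008-08-19T14:01:00\tu2\tweb log analysis",
]

# Assumed inactivity cutoff; a common heuristic, not taken from the tutorial.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(lines, timeout=SESSION_TIMEOUT):
    """Return (user, session_no, timestamp, query) rows.

    A new session starts whenever the gap between two consecutive
    queries by the same user exceeds the timeout.
    """
    records = []
    for line in lines:
        ts, user, query = line.split("\t", 2)
        records.append((user, datetime.fromisoformat(ts), query))
    # Group by user, then order each user's queries by time.
    records.sort(key=lambda r: (r[0], r[1]))

    rows = []
    last_user, last_time, session_no = None, None, 0
    for user, ts, query in records:
        if user != last_user:
            session_no = 1  # first query seen for this user
        elif ts - last_time > timeout:
            session_no += 1  # long pause: treat as a new session
        rows.append((user, session_no, ts.isoformat(), query))
        last_user, last_time = user, ts
    return rows


def load_into_db(rows):
    """Store the sessionized rows in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE queries "
        "(user_id TEXT, session_no INTEGER, ts TEXT, query TEXT)"
    )
    conn.executemany("INSERT INTO queries VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


rows = sessionize(RAW_LOG)
conn = load_into_db(rows)
sessions_per_user = dict(
    conn.execute(
        "SELECT user_id, COUNT(DISTINCT session_no) "
        "FROM queries GROUP BY user_id"
    ).fetchall()
)
print(sessions_per_user)
```

A fixed timeout is only a rough proxy for a search session: a user may pause
within one task or switch topics within minutes, which is part of why the
abstract calls session identification a big challenge.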
**Early registration deadline closes 22 August 2008**
**Conference hotel is filling up - BOOK NOW**
Since 1992, the ACM Conference on Information and Knowledge Management
(CIKM) has successfully brought together leading researchers and
developers from the database, information retrieval, and knowledge
management communities. The purpose of the conference is to identify
challenging problems facing the development of future knowledge and
information systems, and to shape future research directions through the
publication of high quality, applied and theoretical research findings.
In CIKM 2008, we will continue the tradition of promoting collaboration
among multiple areas. CIKM 2008 topics are in the broad areas of
Databases, Information Retrieval, and Knowledge Management. CIKM 2008
also includes an Industry track.
WEBSITE: http://www.cikm2008.org/
More information about the SIGMETRICS
mailing list