[Asis-l] SOASIST: August 27 / Dayton, OH / "Data Mining and Text-based Information"
Glen Horton
glen2 at gclc-lib.org
Tue Aug 6 10:13:20 EDT 2002
SOASIST & LexisNexis Technical Library Sponsor
Mark Wasson, LexisNexis on
"Data Mining and Text-based Information"
WHEN: August 27, 2002 (Tuesday). Dinner at 6 PM; speaker at 7 PM in
Auditorium 3
WHERE: LexisNexis, Dayton, OH
COST: Presentation & dinner-- $10.00 for members and non-members; $5 for
student or retired members. PREPAYMENT REQUIRED TO GUARANTEE CATERER.
COST: Presentation only-- free but must register to guarantee seating.
PAYMENT/REGISTRATION DEADLINE: 08/23/2002 by 5PM
"Metadata" is defined as data about data. For a text document, there are
at least two types of metadata. Structural or formatting metadata
describes a document's layout on the page, and can include information
on fonts, spacing, indentation and so on. Content-based metadata
captures information found in a document and organizes it for further
use. This may include controlled vocabulary index terms, extracted terms
and summaries that have been created and assigned to the document.
A list of proper names or citations extracted from a document and
highlighted in it is another example of content-based metadata.
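As a rough illustration of extracted content-based metadata, the sketch below pulls candidate proper names out of free text. It is only a naive heuristic (runs of two or more capitalized words), not the approach used by any LexisNexis technology; the function name and sample sentence are made up for this example.

```python
import re

# Naive heuristic: two or more consecutive capitalized words.
# Real name extraction needs far more than this; the pattern is an assumption
# chosen only to keep the illustration short.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")

def extract_name_candidates(text):
    """Return candidate proper names in order of first appearance, without repeats."""
    seen = []
    for match in NAME_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen

if __name__ == "__main__":
    sample = "Mark Wasson spoke in Dayton about Data Mining."
    print(extract_name_candidates(sample))
```

Single-word names such as "Dayton" are missed, and capitalized topic phrases are picked up alongside personal names, which shows why production extraction relies on richer techniques.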
Using such metadata in combination with or as an alternative to
full-text search can help people find and retrieve relevant documents
more easily, both by simplifying the search and by making what it
specifies more precise. Picklist-based document retrieval
based on controlled vocabulary index terms is one way of exploiting
content-based metadata.
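Picklist-based retrieval can be sketched as an inverted index from controlled vocabulary terms to documents. This is a minimal sketch under assumptions: the document ids, the index terms and the function names are all hypothetical and do not describe any actual product.

```python
from collections import defaultdict

# Hypothetical documents, each with its assigned controlled vocabulary terms.
DOCUMENTS = {
    "doc1": {"Mergers & Acquisitions", "Banking"},
    "doc2": {"Banking", "Fraud"},
    "doc3": {"Mergers & Acquisitions", "Technology"},
}

def build_index(documents):
    """Map each index term to the set of document ids that carry it."""
    index = defaultdict(set)
    for doc_id, terms in documents.items():
        for term in terms:
            index[term].add(doc_id)
    return index

def picklist(index):
    """Return the terms a user may choose from, in alphabetical order."""
    return sorted(index)

def retrieve(index, selected_terms):
    """Return ids of documents indexed with every selected term."""
    doc_sets = [index.get(term, set()) for term in selected_terms]
    if not doc_sets:
        return []
    return sorted(set.intersection(*doc_sets))

if __name__ == "__main__":
    index = build_index(DOCUMENTS)
    print(picklist(index))
    print(retrieve(index, ["Banking", "Fraud"]))
```

Because the user chooses from a fixed list instead of typing free text, the query cannot contain a term that was never assigned to any document.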
By examining metadata from across a collection of documents, and
combining it with data from other sources, such as stock price,
corporate financial and economic data, one can begin to discover
information that is not available in any one document. Knowledge
Discovery in Databases, also known as Data Mining, is a relatively new area in
Artificial Intelligence that has had much early success when dealing
with numerical and other structured data, in areas like consumer
behavior and purchasing analysis, fraud detection and business forecasting.
Free text, however, does not have the structure that many knowledge
discovery processes need. Metadata provides a means for representing the
information found in free text in a structured way that is appropriate
for many knowledge discovery processes. Applying data mining techniques
to text is a steadily growing focus area within the knowledge discovery
domain.
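The idea of turning free text into structured records and combining them with outside data can be sketched as follows. Everything here is assumed for illustration: the company names, the per-document metadata records and the closing prices are invented, and the join is a plain dictionary lookup, not any specific knowledge discovery technique.

```python
from collections import Counter

# Hypothetical content-based metadata: one structured record per document.
DOC_METADATA = [
    {"doc_id": "d1", "month": "2002-05", "companies": ["Acme Corp"]},
    {"doc_id": "d2", "month": "2002-05", "companies": ["Acme Corp", "Globex"]},
    {"doc_id": "d3", "month": "2002-06", "companies": ["Globex"]},
]

# Hypothetical month-end closing prices from a separate numeric source.
CLOSING_PRICE = {
    ("Acme Corp", "2002-05"): 12.5,
    ("Globex", "2002-05"): 40.0,
    ("Globex", "2002-06"): 38.0,
}

def mention_counts(metadata):
    """Count documents mentioning each company in each month."""
    counts = Counter()
    for record in metadata:
        for company in set(record["companies"]):
            counts[(company, record["month"])] += 1
    return counts

def join_with_prices(counts, prices):
    """Pair each (company, month) mention count with its price, if one is known."""
    rows = []
    for (company, month), mentions in sorted(counts.items()):
        rows.append({
            "company": company,
            "month": month,
            "mentions": mentions,
            "close": prices.get((company, month)),
        })
    return rows

if __name__ == "__main__":
    for row in join_with_prices(mention_counts(DOC_METADATA), CLOSING_PRICE):
        print(row)
```

The resulting rows are ordinary structured data, so they could be handed to whatever numerical analysis one would apply to a database table; no single document contains the month-by-month picture that the combined table shows.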
In his talk, Mark will give a general overview of knowledge discovery
and data mining, discuss how this technology can be applied to text,
review some applications and related technology, and provide links to
resources for more information.
Mark Wasson is a Senior Architect/Research Scientist who has been with
LexisNexis since 1986. He led the research projects behind Term-based
Topic Identification, the Term Mapping System, the NEXIS Company
Indexing and NetOwl Indexing technologies behind SmartIndexing,
Searchable LEAD and the Fact Extraction Tool Kit. He also conceived
Company Dossiers and Trend Analysis. He collaborated with researchers at
the University of Pennsylvania on two projects. His current research
activities and interests include applying knowledge discovery and data
mining technologies to text-based content, question answering technology
and automatic summarization. Mark also scouts new and emerging
technologies at numerous conferences, workshops and third party
technology companies (including more than 100 companies in 2001 alone).
Mark has authored or co-authored a number of papers and presentations
for technical conferences covering topics including document
categorization and indexing, summarization, information extraction,
knowledge discovery, shallow vs. deep text processing approaches and
academic-industry relations. He has also served on two panel discussions
(including one at 2001 ASIS&T) and three conference and workshop program
committees. Mark received a Bachelor of Science degree in Computer
Science and both Bachelor of Arts and Master of Arts degrees in
Linguistics, all at the University of Iowa.
DIRECTIONS to LexisNexis from Cincinnati: Take I-75 North to Dayton.
Exit 44 to S.R. 725 (Centerville/Miamisburg Rd.). East on S.R. 725.
South on S.R. 741 (Springboro Pike). LexisNexis is approx. 1 mile, on
the right side of the road. Turn right at the Spring Valley Road entrance
(the 6th light from the S.R. 725/S.R. 741 intersection); the LexisNexis sign
will say 9443-9595. Drive underneath the skyway connecting the buildings.
Turn right, go over the speed bump, and park in the lot next to the
covered entrance. Enter Building 4 (9443 Springboro Pike). Wait at the
guard station for someone to escort you.
DIRECTIONS to LexisNexis from Columbus: West on I-70. I-675 towards
Cincinnati. Exit 2 (Centerville/Miamisburg Exit) off of I-675. Left on
Yankee Rd. Right on Lyons Rd. Left on S.R. 741 (Springboro Pike).
LexisNexis is about 0.5 mile on the right. Turn right at the Spring Valley
Road entrance (the 2nd light from the Lyons Rd./S.R. 741 intersection);
the LexisNexis sign will say 9443-9595. Drive underneath the skyway connecting
the buildings. Turn right, go over the speed bump, and park in the lot
next to the covered entrance. Enter Building 4 (9443 Springboro Pike)
and wait for an escort.
AGENDA:
* 5:30 - 6:00: REGISTRATION and social half-hour. Beverages will be
provided by LexisNexis.
* 6:00 - 7:00: DINNER
* 7:00 - 9:00: MARK WASSON
Optional dinner: hot buffet of baked chicken with fine herb sauce,
braised beef tips burgundy, vegetarian lasagna, wild rice pilaf, green
beans almondine, tossed salad, pasta salad, assorted fresh dinner rolls,
dessert choices, coffee, and iced tea.
PREPAYMENT REQUIRED with check made out to "SOASIS" by 5 p.m., Friday,
08/23/2002, sent to Patricia Carter, B6F1 room 82, LexisNexis, 9595
Springboro Pike, Miamisburg, OH 45342. Please indicate (1) association
affiliation noting student/retired as appropriate, and (2) the name of
your employer. Questions may be addressed to:
patricia.carter at lexisnexis.com; (937) 865-6800 x6099.
Beverages and a portion of dinner cost are being underwritten by LexisNexis.