[Asis-l] SOASIST: August 27 / Dayton, OH / "Data Mining and Text-based Information"

Glen Horton glen2 at gclc-lib.org
Tue Aug 6 10:13:20 EDT 2002


SOASIST & LexisNexis Technical Library Sponsor
Mark Wasson, LexisNexis, on
"Data Mining and Text-based Information"

WHEN: August 27, 2002 (Tuesday). Dinner at 6 PM; speaker at 7 PM in 
Auditorium 3 (Three)

WHERE: LexisNexis, Dayton, OH
COST: Presentation & dinner: $10.00 for members and non-members; $5.00 for 
student or retired members. PREPAYMENT REQUIRED TO GUARANTEE CATERER.

COST: Presentation only: free, but you must register to guarantee seating.

PAYMENT/REGISTRATION DEADLINE: 08/23/2002 by 5PM

"Metadata" is defined as data about data. For a text document, there are 
at least two types of metadata. Structural or formatting metadata 
describes a document's layout on the page, and can include information 
on fonts, spacing, indentation and so on. Content-based metadata 
captures information found in a document and organizes it for further 
use. This may include controlled vocabulary index terms, extracted terms 
and summaries that have been created and assigned to the document. 
A list of proper names or citations extracted from a document and 
highlighted is another source of content-based metadata.

Using such metadata in combination with, or as an alternative to, 
full-text search can help people find and retrieve relevant documents 
more easily, both by simplifying the search and by making the query 
more precise. Picklist-based document retrieval based on controlled 
vocabulary index terms is one way of exploiting content-based metadata.
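
As a rough illustration of picklist-based retrieval, the sketch below builds a lookup from index terms to documents and returns the documents that carry every term a user picks. The document IDs, index terms, and function names are all hypothetical and are not taken from any LexisNexis system.

```python
from collections import defaultdict

# Hypothetical documents, each tagged with controlled-vocabulary index terms.
DOCS = {
    "doc1": {"Mergers & Acquisitions", "Banking"},
    "doc2": {"Banking", "Fraud"},
    "doc3": {"Mergers & Acquisitions", "Technology"},
}

def build_index(docs):
    """Map each index term to the set of document IDs tagged with it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index

def picklist(index):
    """Return the terms a user can choose from, in alphabetical order."""
    return sorted(index)

def retrieve(index, selected_terms):
    """Return the sorted IDs of documents tagged with every selected term."""
    sets = [index.get(term, set()) for term in selected_terms]
    if not sets:
        return []
    return sorted(set.intersection(*sets))

index = build_index(DOCS)
print(picklist(index))
print(retrieve(index, ["Banking", "Mergers & Acquisitions"]))
```

Because the user chooses from a fixed list rather than typing free text, the query can only contain terms that actually exist in the index.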

By examining metadata from across a collection of documents, and 
combining it with data from other sources, such as stock price, 
corporate financial and economic data, one can begin to discover 
information that is not available in any one document. Knowledge 
Discovery in Databases, a.k.a. Data Mining, is a newer area in 
Artificial Intelligence that has had much early success when dealing 
with numerical and other structured data, in areas like consumer 
behavior and purchasing analysis, fraud detection and business forecasting.
Free text, however, does not have the structure that many knowledge 
discovery processes need. Metadata provides a means for representing the 
information found in free text in a structured way that is appropriate 
for many knowledge discovery processes. Applying data mining techniques 
to text is a steadily growing focus area within the knowledge discovery 
domain.
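
As a minimal sketch of the idea, assume each document has already been reduced to a structured record listing the company names extracted from it. Counting which companies are mentioned together across the collection then surfaces a pattern that no single document states. The records and names below are invented for illustration, and co-occurrence counting is only one simple technique among many.

```python
from collections import Counter
from itertools import combinations

# Hypothetical content-based metadata: company names extracted per document.
RECORDS = [
    {"doc": "a", "companies": ["Acme", "Globex"]},
    {"doc": "b", "companies": ["Acme", "Globex", "Initech"]},
    {"doc": "c", "companies": ["Initech"]},
]

def cooccurrence(records):
    """Count how many documents mention each pair of companies together."""
    counts = Counter()
    for record in records:
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        names = sorted(set(record["companies"]))
        counts.update(combinations(names, 2))
    return counts

pairs = cooccurrence(RECORDS)
for pair, count in pairs.most_common():
    print(pair, count)
```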

In his talk, Mark will give a general overview of knowledge discovery 
and data mining, discuss how this technology can be applied to text, 
review some applications and related technology, and provide links to 
resources for more information.

Mark Wasson is a Senior Architect/Research Scientist who has been with 
LexisNexis since 1986. He led the research projects behind Term-based 
Topic Identification, the Term Mapping System, the NEXIS Company 
Indexing and NetOwl Indexing technologies behind SmartIndexing, 
Searchable LEAD and the Fact Extraction Tool Kit. He also conceived 
Company Dossiers and Trend Analysis. He collaborated with researchers at 
the University of Pennsylvania on two projects. His current research 
activities and interests include applying knowledge discovery and data 
mining technologies to text-based content, question answering technology 
and automatic summarization. Mark also scouts new and emerging 
technologies at numerous conferences, workshops and third party 
technology companies (including more than 100 companies in 2001 alone).

Mark has authored or co-authored a number of papers and presentations 
for technical conferences covering topics including document 
categorization and indexing, summarization, information extraction, 
knowledge discovery, shallow vs. deep text processing approaches and 
academic-industry relations. He has also served on two panel discussions 
(including one at ASIS&T 2001) and three conference and workshop program 
committees. Mark received a Bachelor of Science degree in Computer 
Science and both Bachelor of Arts and Master of Arts degrees in 
Linguistics, all at the University of Iowa.

DIRECTIONS to LexisNexis from Cincinnati: Take I-75 North to Dayton. 
Exit 44 to S.R. 725 (Centerville/Miamisburg Rd.). East on S.R. 725. 
South on S.R. 741 (Springboro Pike). LexisNexis is approx. 1 mile, on 
the right side of the road. Turn right at Spring Valley Road entrance 
(the 6th light from S.R. 725/S.R. 741 intersection); the LexisNexis sign 
will say 9443-9595. Drive underneath skyway connecting the buildings. 
Turn right, go over the speed bump, and park in the lot next to the 
covered entrance. Enter Building 4 (9443 Springboro Pike). Wait at the 
guard station for someone to escort you.

DIRECTIONS to LexisNexis from Columbus: West on I-70. I-675 towards 
Cincinnati. Exit 2 (Centerville/Miamisburg Exit) off of I-675. Left on 
Yankee Rd. Right on Lyons Rd. Left on S.R. 741 (Springboro Pike). 
LexisNexis is about 0.5 mile on the right. Turn right at Spring Valley 
Road entrance (the 2nd light from Lyons Rd./S.R. 741 intersection); the 
LexisNexis sign will say 9443-9595. Drive underneath skyway connecting 
the buildings. Turn right, go over the speed bump, and park in the lot 
next to the covered entrance. Enter Building 4 (9443 Springboro Pike) 
and wait for an escort.

AGENDA:

    * 5:30 - 6:00: REGISTRATION and social half-hour. Beverages will be 
provided by LexisNexis.
    * 6:00 - 7:00: DINNER
    * 7:00 - 9:00: MARK WASSON

Optional dinner: hot buffet of baked chicken with fine herb sauce, 
braised beef tips burgundy, vegetarian lasagna, wild rice pilaf, green 
beans almondine, tossed salad, pasta salad, assorted fresh dinner rolls, 
dessert choices, coffee, and iced tea.

PREPAYMENT REQUIRED with check made out to "SOASIS" by 5 p.m., Friday, 
08/23/2002, sent to Patricia Carter, B6F1 room 82, LexisNexis, 9595 
Springboro Pike, Miamisburg, OH 45342. Please indicate (1) association 
affiliation noting student/retired as appropriate, and (2) the name of 
your employer. Questions may be addressed to: 
patricia.carter at lexisnexis.com; (937) 865-6800 x6099.

Beverages and a portion of dinner cost are being underwritten by LexisNexis.




