[Sigia-l] using thesauri to improve search
Avi Rappoport
analyst at searchtools.com
Mon Jun 10 13:46:59 EDT 2002
At 12:55 PM -0400 6/10/02, Michael Fry wrote:
>I've been working on this thesaurus a lot lately, but suddenly I feel like
>what we're developing isn't going to bridge the gap between users and
>information as well as it ought to. Is there something about search engine
>software that I'm underestimating? Do we have to design a more complex search
>UI in order to facilitate the translation? Should we be building a vocabulary
>that's pre-coordinated rather than post-coordinated?
You need to take a step back and think about your users and their
search vocabulary. You have a great resource, the search logs, so
the first thing to do is analyze them to see how your users are
searching and what they're looking for (two different though related
questions). Once you know about entry vocabulary, you can think
about what you want to offer and how to direct them to categories.
Inktomi's Content Classification Engine might fit in there. Marcia
Bates of UCLA has written some great stuff on these topics.
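
A rough sketch of that kind of log analysis, in Python. It assumes the
search log has already been reduced to a list of raw query strings and
that the thesaurus can be exported as a flat list of terms; the function
names and the sample data are made up for illustration and do not come
from any particular search engine.

```python
from collections import Counter


def entry_vocabulary(queries):
    """Count how often each normalized query appears in the log.

    Normalization here is deliberately simple: lowercase the query
    and collapse runs of whitespace.
    """
    counts = Counter()
    for query in queries:
        normalized = " ".join(query.lower().split())
        if normalized:
            counts[normalized] += 1
    return counts


def unmatched_queries(counts, thesaurus_terms):
    """List (query, count) pairs that match no thesaurus term,
    most frequent first. These are candidate entry terms."""
    known = {term.lower() for term in thesaurus_terms}
    return [(q, n) for q, n in counts.most_common() if q not in known]


# Hypothetical sample data, for illustration only.
log_queries = [
    "Laptop", "laptop ", "notebook computer",
    "notebook  computer", "notebook computer", "printers",
]
thesaurus = ["laptop", "printers"]

counts = entry_vocabulary(log_queries)
gaps = unmatched_queries(counts, thesaurus)
print(gaps)
```

Frequent queries that show up in the gap list are the ones most worth
adding to the thesaurus as entry terms pointing at an existing category.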
If you're investing a lot in metadata, you may want to use a faceted
search rather than a standard search. I'm doing some research right
now on this topic, based on data mining, OLAP, parametric search and
Marti Hearst's faceted metadata search research
(http://flamenco.berkeley.edu). I'm looking at Endeca, but it
doesn't have the "dynamic preview" counts of how many items are in
each category, which I think is vital. Other products sort of in
this field include AltaVista, Mercado, EasyAsk, Verity and so on, but
I've had a hard time finding examples on the web.
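
To make the "dynamic preview" idea concrete, here is a minimal sketch
in Python. It is not how Endeca, Flamenco or any other product works
internally; it only shows the counting itself, assuming each item is a
small record with one value per facet. The facet names and the sample
items are hypothetical.

```python
from collections import Counter


def matches(item, selections):
    """True if the item has the chosen value for every selected facet."""
    return all(item.get(facet) == value for facet, value in selections.items())


def preview_counts(items, facets, selections):
    """For each facet the user has not narrowed yet, count how many of
    the currently matching items fall under each of its values."""
    counts = {facet: Counter() for facet in facets if facet not in selections}
    for item in items:
        if not matches(item, selections):
            continue
        for facet in counts:
            if facet in item:
                counts[facet][item[facet]] += 1
    return {facet: dict(c) for facet, c in counts.items()}


# Hypothetical sample collection, for illustration only.
items = [
    {"title": "A", "type": "report", "year": 2001},
    {"title": "B", "type": "report", "year": 2002},
    {"title": "C", "type": "memo", "year": 2002},
]
facets = ["type", "year"]

# Counts shown before the user picks anything.
print(preview_counts(items, facets, {}))
# Counts shown after the user narrows to type = report.
print(preview_counts(items, facets, {"type": "report"}))
```

The point of the counts is that the user can see, before clicking, how
many items each category would leave them with.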
Please feel free to contact me offlist to talk about this in more
detail -- I'd like to learn more about your work.
Avi
--
Search Server Industry Analysis from Search Tools Consulting
(510) 845-2551 -- <mailto: analyst at searchtools.com>
Complete Guide to Search Engines for Web Sites and Intranets
<http://www.searchtools.com>