[Sigia-l] "Best Bets" or "Accidental Thesaurus" Examples?
huerro at consumer.org
Mon Aug 26 13:31:09 EDT 2002
Rich,
I have seen excellent Web-based search interfaces, incorporating searchable
thesauri, that allow for very precise searching. The problem with these
advanced search tools is that very few people understand how to use them.
Most Web site visitors have little patience for, or interest in, figuring
out the best way to take advantage of the advanced features of a site search
engine.
I have often puzzled over ways to leverage controlled vocabulary indexing in
the search interface of a popular Web site. The "accidental thesaurus" you
describe is a great idea. It brings the benefit of precise searching, with a
controlled vocabulary, to the novice searcher in a very intuitive fashion.
Bob Huerster, Research Librarian, Indexer, Thesaurus Editor,
huerro at consumer.org
-----Original Message-----
From: Richard Wiggins [mailto:rich at richardwiggins.com]
Sent: Sunday, August 25, 2002 11:05 AM
To: sigia-l at asis.org
Subject: [Sigia-l] "Best Bets" or "Accidental Thesaurus" Examples?
Hi -- I'm looking for examples of search engines that offer "Best Bets"
search results, specifically those built using analysis of search logs. As
an example, please visit search.msu.edu and type in searches such as:
bookstore
library
map
The results labeled "MSU Keywords" were editorially chosen, mostly by
examining search logs to find what people look for and what search terms
they use. We hand-pick the best URL to match with each search word or
phrase. I call this an "accidental thesaurus" because the vast majority of
terms in the database come straight out of the list of the most popular
searches done using our (increasingly ineffective) AltaVista index; there
was no attempt to build a thesaurus for an entire university, nor was there
even an attempt to comprehensively include every office in the phone book.
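As a rough illustration of the matching step described above, here is a minimal Python sketch. It is not the actual MSU Keywords implementation: the table contents, the example.edu URLs, and the function names are all hypothetical, and the normalization rule (lowercase, collapse whitespace) is an assumption.

```python
import re

# Hypothetical editorial table: normalized search phrase -> hand-picked URL.
# The entries and URLs are placeholders, not the real MSU Keywords data.
BEST_BETS = {
    "bookstore": "https://www.example.edu/bookstore/",
    "library": "https://www.example.edu/library/",
    "map": "https://www.example.edu/maps/",
    "campus map": "https://www.example.edu/maps/",
}


def normalize(query):
    """Lowercase, trim, and collapse internal whitespace (assumed rule)."""
    return re.sub(r"\s+", " ", query.strip().lower())


def best_bet(query, table=BEST_BETS):
    """Return the hand-picked URL for a query, or None if there is no entry."""
    return table.get(normalize(query))


def search(query, full_text_search):
    """Put the Best Bet, if any, ahead of the regular index results.

    full_text_search is any callable that takes a query string and
    returns an iterable of result URLs from the ordinary search index.
    """
    results = list(full_text_search(query))
    pick = best_bet(query)
    if pick is not None:
        # Show the editorial pick first and avoid listing it twice.
        results = [pick] + [r for r in results if r != pick]
    return results
```

The lookup is an exact match on the normalized phrase, which fits a table built from the literal strings people type; stemming or synonym handling would be a separate design decision.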
I'm looking for other examples of such success. It is fairly common to see
federated searches at business sites, e.g. with product catalog results
shown first. (Cf. AT&T, BBC, ESPN, AOL, etc.) I'm looking for examples in
general Web environments, whether universities, corporate intranets, etc.
I've found one remarkably similar example at a major pharmaceutical company.
Their work, in turn, was inspired by work that Vivian Bliss of Microsoft
described in this interview:
http://www.asis.org/Bulletin/Aug-00/bliss.html
Our project has been very successful. It turns out that a very small set of
terms in the MSU Keywords database covers a huge percentage of searches. In
analyzing logs, we find that out of 200,000 searches performed over several
weeks, the top 500 unique phrases account for 40% of searches performed!
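The coverage figure above is the kind of number anyone can compute from their own logs. The following Python sketch shows one way to do it, assuming the log has already been reduced to a list of raw query strings; the function names and the sample queries are made up for illustration.

```python
from collections import Counter


def phrase_counts(queries):
    """Count searches per unique phrase, ignoring case and blank entries."""
    return Counter(q.strip().lower() for q in queries if q.strip())


def top_n_coverage(queries, n):
    """Fraction of all searches accounted for by the n most popular phrases."""
    counts = phrase_counts(queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Made-up sample log, only to show the call.
sample_log = ["library", "Library", "map", "bookstore",
              "library", "parking", "map", "tuition"]
print(top_n_coverage(sample_log, 2))  # 5 of 8 searches -> 0.625
```

The output of `phrase_counts(...).most_common(n)` is also a natural starting list of candidate terms for an editor to review.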
Some science explains why MSU Keywords is so successful. Lou Rosenfeld tells
me that our numbers are consistent with Bradford's Law of Scatter, and I
find Zipf and Pareto have similar things to say. I'm certain our
distribution curve matches others'.
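To make the Zipf comparison concrete: under an idealized Zipf distribution, the frequency of the phrase at rank r is proportional to 1/r^s, so the share of searches covered by the top k phrases is a ratio of two partial sums. The Python sketch below computes that ratio. The pure Zipf model, the exponent s, and the number of distinct phrases are all assumptions to be fitted against real logs; none of them is given in this message.

```python
def zipf_top_k_share(k, n_distinct, s=1.0):
    """Share of searches covered by the top k of n_distinct phrases,
    assuming phrase frequency at rank r is proportional to 1 / r**s."""
    if n_distinct < 1 or not 0 <= k <= n_distinct:
        raise ValueError("need n_distinct >= 1 and 0 <= k <= n_distinct")
    top = sum(1.0 / r ** s for r in range(1, k + 1))
    total = sum(1.0 / r ** s for r in range(1, n_distinct + 1))
    return top / total
```

Comparing this curve with the measured coverage for several values of k is one way to check how closely a given log follows the model.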
The question is, if so few unique search terms can yield so much paydirt,
why don't more people do this? In fact, shouldn't EVERY intranet search
engine include a Best Bets service at the top of the hit list? Or, if more
people do it, would you please send me examples?
Again, I'm not just looking for federated searches; I'm looking for examples
of "Best Bets" features whose database was built from the bottom up based on
what people search for.
Thanks much,
/rich
____________________________________________________
Richard Wiggins
Writing, Speaking, and Consulting on Internet Topics
rich at richardwiggins.com www.richardwiggins.com