[Sigia-l] "Best Bets" or "Accidental Thesaurus" Examples?

Richard Wiggins rich at richardwiggins.com
Sun Aug 25 11:05:04 EDT 2002


Hi -- I'm looking for examples of search engines that offer "Best Bets"
search results, specifically those built using analysis of search logs.  As
an example, please visit search.msu.edu and type in searches such as:

 bookstore
 library
 map

The results labeled "MSU Keywords" were editorially chosen, mostly by
examining search logs to find what people look for and what search terms
they use.  We hand-pick the best URL to match with each search word or
phrase. I call this an "accidental thesaurus" because the vast majority of
terms in the database come straight out of the list of the most popular
searches done using our (increasingly ineffective) AltaVista index; there
was no attempt to build a thesaurus for an entire university, nor was there
even an attempt to comprehensively include every office in the phone book.
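
As a rough sketch of how a lookup like this might work (this is not the
actual MSU Keywords code; the URLs, the table name, and the function names
are made up for illustration), the whole thing can be little more than a
hand-maintained table keyed on a normalized search phrase:

```python
# Hypothetical hand-picked table: normalized search phrase -> best URL.
# The phrases are the example searches above; the URLs are placeholders.
BEST_BETS = {
    "bookstore": "https://example.edu/bookstore",
    "library": "https://example.edu/library",
    "map": "https://example.edu/map",
}


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so '  Library ' matches 'library'."""
    return " ".join(query.lower().split())


def best_bet(query: str):
    """Return the editorially chosen URL for a query, or None if there is none."""
    return BEST_BETS.get(normalize(query))


print(best_bet("  Library "))  # matches the "library" entry
print(best_bet("parking"))     # no entry, so the regular index handles it
```

Any query without an entry simply falls through to the ordinary search
results, so the table only has to cover the popular terms.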

I'm looking for other examples of such success.  It is fairly common to see
federated searches at business sites, e.g. with product catalog results
shown first.  (Cf. AT&T, BBC, ESPN, AOL, etc.) I'm looking for examples in
general Web environments, whether universities, corporate intranets, etc.
I've found one remarkably similar example at a major pharmaceutical company.
Their work, in turn, was inspired by work that Vivian Bliss of Microsoft
described in this interview:

http://www.asis.org/Bulletin/Aug-00/bliss.html

Our project has been very successful. It turns out that a very small set of
terms in the MSU Keywords database covers a huge percentage of searches. In
analyzing logs, we find that out of 200,000 searches performed over several
weeks, the top 500 unique phrases account for 40% of searches performed!
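
For anyone who wants to run the same measurement on their own logs, here is
a minimal sketch. It assumes the log has already been reduced to a list of
raw query strings, one per search; the function names and the tiny sample
list are illustrative only, not our real data:

```python
from collections import Counter


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(query.lower().split())


def top_n_coverage(queries, n):
    """Fraction of all searches accounted for by the n most popular phrases."""
    counts = Counter(normalize(q) for q in queries if q.strip())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Toy sample: "map" appears 3 times out of 6 searches.
sample = ["map", "Map", "library", "bookstore", "map", "parking permit"]
print(top_n_coverage(sample, 1))  # prints 0.5
```

Calling top_n_coverage(queries, 500) on a real log gives the figure
comparable to the 40% quoted above.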

Some science explains why MSU Keywords is so successful. Lou Rosenfeld tells
me that our numbers are consistent with Bradford's Law of Scatter, and I
find Zipf and Pareto have similar things to say.  I'm certain our
distribution curve matches others'.

The question is, if so few unique search terms can yield so much paydirt,
why don't more people do this?  In fact, shouldn't EVERY intranet search
engine include a Best Bets service at the top of the hit list?  Or, if more
people do it, would you please send me examples? 

Again, I'm not just looking for federated searches; I'm looking for examples
of "Best Bets" features whose database was built from the bottom up based on
what people search for.

Thanks much,

/rich

____________________________________________________
Richard Wiggins
Writing, Speaking, and Consulting on Internet Topics
rich at richardwiggins.com       www.richardwiggins.com     