[Sigia-l] "Best Bets" or "Accidental Thesaurus" Examples?

Richard Wiggins rich at richardwiggins.com
Sun Aug 25 22:03:35 EDT 2002


With all due respect, I do understand the notion of undercutting serendipity
and comprehensiveness, but I'm afraid you may not fully understand what
we've built, what I've described, and how and why it works.  

MSU Keywords, and other "best bets" services, and the accidental thesaurus
concept, are driven entirely by what people search for -- as expressed by
actual search terms.  This isn't based on what's popular, except as
expressed by actual searches. We don't "promote" any search terms; that's
entirely up to users.  However, given highly popular searches, we often DO
know with pretty good confidence the page most users likely want.

So when I see numerous searches for "human resources", here's the thinking
that goes into the equation:

Hmmm, most folks looking for "human resources" want the department named
Human Resources.  They don't want 10,000 personal pages of people who aspire
to be human resources professionals.  Most of them don't want an academic
program related to human resources, though some of them do.  The idea is to
deliver to the large majority of users the content they obviously seek.

So go to search.msu.edu, search for "human resources", and look at the
items from MSU Keywords; you'll have a better understanding of what the
concept is all about.

If a large number of people searched for "Moby Dick", we'd give them links
to the text of Moby Dick, or a library record on Moby Dick, or distinguished
commentary on Moby Dick (if it exists in the Web corpus).  We would not give
links to a random page that says "I hated reading Moby Dick in high school."
If a small number of searches were for "Moby Dick", we'd let the search
engine do its job unassisted.

There is no pretense that the best bets are the only bets, or the best bets
are best for all constituents.  We always give search engine results for the
more obscure bettors to explore.  We only provide best bets for the top
500-1000 search phrases, out of tens of thousands of unique searches. 
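
To make the mechanics concrete, here is a minimal sketch of the idea, not the actual MSU Keywords code. The function names, the `best_bets` table, and the example.edu URL are all hypothetical; the only things taken from the description above are that best bets are hand-picked for the most popular phrases in the search log and that the engine's own results are always returned as well.

```python
from collections import Counter

def normalize(query):
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(query.lower().split())

def top_queries(search_log, n):
    """Return the set of the n most frequent normalized queries in the log."""
    counts = Counter(normalize(q) for q in search_log)
    return {q for q, _ in counts.most_common(n)}

def search(query, best_bets, popular, engine):
    """Return (best bets, engine results).

    Best bets appear only when the query is among the popular phrases
    and an editor has picked pages for it. The engine results are
    always returned, so nothing is hidden from the user.
    """
    key = normalize(query)
    bets = best_bets.get(key, []) if key in popular else []
    return bets, engine(query)

# Hypothetical data for illustration only.
log = ["human resources"] * 3 + ["Human Resources", "moby dick"]
popular = top_queries(log, 1)
best_bets = {"human resources": ["https://hr.example.edu/"]}
engine = lambda q: ["engine hit for " + q]  # stand-in for the real search engine

bets, hits = search("Human  Resources", best_bets, popular, engine)
```

With this toy log, "human resources" gets a best bet ahead of the engine's hits, while "moby dick" falls through to the engine unassisted.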

Understand that EVERY search engine makes assumptions about what to put at
the top of the hit list.  In the absence of a best bets service, you are at
the mercy of a robot. What you call a "problem" is inherent in every search
engine result list.  Statistics show that only a tiny fraction of users ever
explore beyond the initial hit list.  The idea is to give the majority what
they seek at the top of the list, with some careful human input.

The mistake many of us made about search engines was to assume people use
them only to find obscure leaf pages.  Google is where it is today
because most people do seek popular pages, mainly starting points, using
search engines.  Clue: in the minutes after planes hit the World Trade
Center, 6000 people per minute typed "cnn" into Google's search box.

Here are the most popular searches at our institution for the last week. 
While some terms are local lingo, I can assure you that it's easy to find
the best URL (or one of three) for each of these most popular searches. 
There is no "best seller" phenomenon to worry about here.  On the contrary,
there is a definite "why does AltaVista show so much crap instead of what
I'm looking for??!?" phenomenon.  

Count	Query
805	stuinfo
554	bookstore
450	parking
414	employment
407	twig
406	jobs
330	schedule
324	human resources
324	pilot
298	football
295	housing
288	computer store
254	map
249	registrar
245	transcripts
220	wharton center
181	student info
180	computer enrollment
172	msu bookstore
166	stu info
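
The list also shows several spellings of the same need ("stuinfo", "student info", "stu info"). A rough sketch of folding such variants together to see total demand per target follows. The counts are copied from the table above, but the variant-to-target mapping is my own illustrative assumption, not the actual MSU Keywords data.

```python
from collections import Counter

# Counts copied from the table above (subset).
weekly_counts = {
    "stuinfo": 805,
    "bookstore": 554,
    "student info": 181,
    "msu bookstore": 172,
    "stu info": 166,
}

# Hypothetical hand-built mapping from what people type to one target.
# Which variants belong together is an assumption made for illustration.
variant_to_target = {
    "stuinfo": "stuinfo",
    "student info": "stuinfo",
    "stu info": "stuinfo",
    "bookstore": "bookstore",
    "msu bookstore": "bookstore",
}

def demand_by_target(counts, mapping):
    """Sum query counts per target; unmapped queries stand for themselves."""
    totals = Counter()
    for query, count in counts.items():
        totals[mapping.get(query, query)] += count
    return totals

totals = demand_by_target(weekly_counts, variant_to_target)
```

Under that assumed mapping, the three student-info spellings add up to 1152 searches and the two bookstore spellings to 726.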

/rich

On Sun, 25 August 2002, Listera wrote:

> 
> "Richard Wiggins" wrote:
> 
> > The question is, if so few unique search terms can yield so much paydirt,
> > why don't more people do this?
> 
> Depends on what's "paydirt" ;-)
> 
> There is an aspect of this that's problematic. It's the best-seller
> phenomenon. Best-sellers are created and maintained as such by promoting
> them as best-sellers; it's a feedback loop. If you look at a public library
> or examine the logs of music retailers, you'll see the push for
> best-sellers, performers, best-bets, etc. If the same files, books, music,
> etc., are always promoted as "best-bets" they indeed become the most
> referenced. This may or may not be considered as "serving the user,"
> depending on your political orientation. I'm not saying this is always bad,
> just that it can be.
> 
> Best,
> 
> Ziya
>  
> 
> ------------
> When replying, please *trim your post* as much as possible.
> *Plain text, please; NO Attachments
> 
> ASIST Annual Meeting:
> http://www.asis.org/Conferences/AM02/index.html
> 
> ASIST SIG IA website: http://www.asis.org/SIG/SIGIA/index.html
> ________________________________________
> Sigia-l mailing list -- post to: Sigia-l at asis.org
> Changes to subscription: http://mail.asis.org/mailman/listinfo/sigia-l

____________________________________________________
Richard Wiggins
Writing, Speaking, and Consulting on Internet Topics
rich at richardwiggins.com       www.richardwiggins.com     


