[Sigia-l] "Best Bets" or "Accidental Thesaurus" Examples?
Richard Wiggins
rich at richardwiggins.com
Sun Aug 25 22:03:35 EDT 2002
With all due respect, I do understand the notion of undercutting serendipity
and comprehensiveness, but I'm afraid you may not fully understand what
we've built, what I've described, and how and why it works.
MSU Keywords, and other "best bets" services, and the accidental thesaurus
concept, are driven entirely by what people search for -- as expressed by
actual search terms. This isn't based on what's popular, except as
expressed by actual searches. We don't "promote" any search terms; that's
entirely up to users. However, given highly popular searches, we often DO
know with pretty good confidence the page most users likely want.
So when I see numerous searches for "human resources", here's the thinking
that goes into the equation:
Hmmm, most folks looking for "human resources" want the department named
Human Resources. They don't want 10,000 personal pages of people who aspire
to be human resources professionals. Most of them don't want an academic
program related to human resources, though some of them do. The idea is to
deliver to the large majority of users the content they obviously seek.
So go to search.msu.edu, search for "human resources", and look at the
items from MSU Keywords; you'll have a better understanding of what the
concept is all about.
If a large number of people searched for "Moby Dick", we'd give them links
to the text of Moby Dick, or a library record on Moby Dick, or distinguished
commentary on Moby Dick (if it exists in the Web corpus). We would not give
links to a random page that says "I hated reading Moby Dick in high school."
If a small number of searches were for "Moby Dick", we'd let the search
engine do its job unassisted.
There is no pretense that the best bets are the only bets, or that the best
bets are best for all constituents. We always give search engine results for the
more obscure bettors to explore. We only provide best bets for the top
500-1000 search phrases, out of tens of thousands of unique searches.
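
For anyone who wants to see the shape of this, here is a minimal sketch in
Python. It is not the MSU Keywords code; the names (BEST_BETS, normalize,
search, toy_engine) and the example.edu URLs are hypothetical placeholders.
It only illustrates the behavior described above: a hand-curated entry is
shown first when one exists, and the search engine results are always
returned as well.

```python
def normalize(query):
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(query.lower().split())

# Hypothetical, hand-curated table: one entry per popular search phrase.
# The URLs are placeholders, not real MSU addresses.
BEST_BETS = {
    "human resources": ["http://example.edu/hr/"],
    "parking": ["http://example.edu/parking/"],
}

def search(query, engine):
    """Return curated best bets (possibly none) plus the full engine results."""
    bets = BEST_BETS.get(normalize(query), [])
    # Engine results are always returned; best bets only go on top of them.
    return {"best_bets": bets, "results": engine(query)}

def toy_engine(query):
    """Stand-in for the real search engine (an assumption for this sketch)."""
    return ["result for " + query]

print(search("Human  Resources", toy_engine))
print(search("moby dick", toy_engine))
```

A query with no curated entry, like the second one, simply gets an empty
best-bets list and the engine's results unassisted.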
Understand that EVERY search engine makes assumptions about what to put at
the top of the hit list. In the absence of a best bets service, you are at
the mercy of a robot. What you call a "problem" is inherent in every search
engine result list. Statistics show that only a tiny fraction of users ever
explore beyond the initial hit list. The idea is, with some careful
human input, to give the majority what they seek at the top of the list.
The mistake many of us made about search engines was to assume people use
them only to find obscure leaf pages. Google is where they are today
because most people do seek popular pages, mainly starting points, using
search engines. Clue: in the minutes after planes hit the World Trade
Center, 6000 people per minute typed "cnn" into Google's search box.
Here are the most popular searches at our institution for the last week.
While some terms are local lingo, I can assure you that it's easy to find
the best URL (or one of three) for each of these most popular searches.
There is no "best seller" phenomenon to worry about here. On the contrary,
there is a definite "why does AltaVista show so much crap instead of what
I'm looking for??!?" phenomenon.
Count  Query
  805  stuinfo
  554  bookstore
  450  parking
  414  employment
  407  twig
  406  jobs
  330  schedule
  324  human resources
  324  pilot
  298  football
  295  housing
  288  computer store
  254  map
  249  registrar
  245  transcripts
  220  wharton center
  181  student info
  180  computer enrollment
  172  msu bookstore
  166  stu info
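
Several of these are variants of one another, which is where the
"accidental thesaurus" comes from. A rough sketch in Python of folding
variants together, using counts from the list above. The VARIANTS mapping
is my own assumption for illustration (that "stuinfo", "student info" and
"stu info" want the same page, and likewise the two bookstore phrases); in
practice a person decides which phrases belong together.

```python
from collections import defaultdict

# Counts copied from the weekly list above (a subset).
WEEKLY_COUNTS = {
    "stuinfo": 805,
    "bookstore": 554,
    "student info": 181,
    "msu bookstore": 172,
    "stu info": 166,
}

# Assumed, hand-built mapping from each variant to one canonical keyword.
VARIANTS = {
    "stuinfo": "stuinfo",
    "student info": "stuinfo",
    "stu info": "stuinfo",
    "bookstore": "bookstore",
    "msu bookstore": "bookstore",
}

def group_counts(counts, variants):
    """Sum search counts per canonical keyword; unmapped queries stand alone."""
    totals = defaultdict(int)
    for query, count in counts.items():
        totals[variants.get(query, query)] += count
    return dict(totals)

grouped = group_counts(WEEKLY_COUNTS, VARIANTS)
print(grouped)  # {'stuinfo': 1152, 'bookstore': 726}
```

Under that assumed grouping, one keyword entry with three phrasings would
cover 1152 of the week's searches.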
/rich
On Sun, 25 August 2002, Listera wrote:
>
> "Richard Wiggins" wrote:
>
> > The question is, if so few unique search terms can yield so much paydirt,
> > why don't more people do this?
>
> Depends on what's "paydirt" ;-)
>
> There is an aspect of this that's problematic. It's the best-seller
> phenomenon. Best-sellers are created and maintained as such by promoting
> them as best-sellers; it's a feedback loop. If you look at a public library
> or examine the logs of music retailers, you'll see the push for
> best-sellers, performers, best-bets, etc. If the same files, books, music,
> etc., are always promoted as "best-bets" they indeed become the most
> referenced. This may or may not be considered as "serving the user,"
> depending on your political orientation. I'm not saying this is always bad,
> just that it can be.
>
> Best,
>
> Ziya
>
>
> ------------
> When replying, please *trim your post* as much as possible.
> *Plain text, please; NO Attachments
>
> ASIST Annual Meeting:
> http://www.asis.org/Conferences/AM02/index.html
>
> ASIST SIG IA website: http://www.asis.org/SIG/SIGIA/index.html
> ________________________________________
> Sigia-l mailing list -- post to: Sigia-l at asis.org
> Changes to subscription: http://mail.asis.org/mailman/listinfo/sigia-l
____________________________________________________
Richard Wiggins
Writing, Speaking, and Consulting on Internet Topics
rich at richardwiggins.com www.richardwiggins.com