[Sigia-l] Best Bets - a Federated Search?

huerro at consumer.org huerro at consumer.org
Tue Oct 1 10:26:17 EDT 2002


Richard,

Thanks a lot for this helpful information on site search engines. I have a
question for you (and the rest of the list).  I am trying to estimate the
time it would take to monitor the search log of a very popular product
information site (over a million visitors a month). The aim would be to find
candidate terms for the controlled vocabulary (to possibly use in a "best
bets" kind of approach for the site search engine) and determine popular
searches. Any ideas how long this would take, say on a weekly or monthly
basis? Thanks in advance.

Bob Huerster, huerro at consumer.org
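
To give a sense of the counting part of the job, here is a minimal sketch. It assumes the search log can be exported as one raw query per line, which may not match what a given engine actually writes, and the sample queries and the name top_queries are made up for illustration. It only tallies popular searches; deciding which of them deserve a Best Bet or a vocabulary term is still the editorial work.

```python
from collections import Counter


def top_queries(log_lines, n=10):
    """Return the n most frequent queries after light normalization."""
    counts = Counter()
    for line in log_lines:
        # Lowercase and collapse whitespace so trivial variants count together.
        query = " ".join(line.lower().split())
        if query:  # skip empty searches
            counts[query] += 1
    return counts.most_common(n)


# Hypothetical sample; a real run would read lines from the exported log file.
sample = [
    "Human Resources",
    "human  resources",
    "dishwasher ratings",
    "",
    "car seats",
    "Car Seats ",
    "human resources",
]

for query, hits in top_queries(sample, 2):
    print(hits, query)
```

Run weekly or monthly, a report like this narrows the review to the head of the list instead of the whole log.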

-----Original Message-----
From: Richard Wiggins [mailto:rich at richardwiggins.com]
Sent: Tuesday, October 01, 2002 8:21 AM
To: eric.scheid at ironclad.net.au
Cc: sigia-l at asis.org
Subject: Re: [Sigia-l] Best Bets - a Federated Search?


Hmmm.  I would say the answer varies based on the implementation.  Your Best
Bets may draw from the same Webspace as your spider, but that doesn't mean
every URL in the Best Bets service is in the index.

All spiders lag a bit in discovering new content.  Some spiders provide
better tools than others for the system administrator to force inclusion of
a new page or site.  A big motivating factor of the Best Bets service that
my group developed was to point to news content.  Our local AltaVista may or
may not have yet discovered a newsy URL when we add it to the best bets
database.

For sure some Best Bets services are part of federated searches.  A major
pharmaceutical company whose IA folks are on this mailing list offers Best
Bets along with results from a couple of different intranet search engines. 
I believe ProQuest has products that offer Best Bets for very common
searches and then results from various databases.

Your point about the tradeoff of transparency and usability is interesting. 
I think there's a usability argument in favor of labeling the Best Bets. 
For those who "get it" the label can be positive: "Hmm, I see an editor has
chosen to highlight these hits."  Let's say a new hole in Internet Explorer
is discovered.  This one is nasty.  It sends all your private data to
everyone in your address book, it buys a Corvette on eBay, it formats your
hard drive, and it turns off the aerator in your aquarium.  If I go to the
Microsoft site and type in any of those effects in the search box on the
home page, and a Best Bet comes up at the top of the list, I have high
confidence that it'll be what I want.  I need the label to know that.

Also, I think typically with Best Bets the payload is a list of URLs just
like a search engine would deliver.  But in ours the metadata is not
identical to what the search engine spits out.  Go to search.msu.edu and
search for "human resources" for an example.  We also sometimes offer in the
hit list something we call Pathfinders; the user who clicks on one of these
sees an annotated hit list.  Go to search.msu.edu and search for "virtual
university" for an example.

Finally, I know of one case where if there is a match the user sees ONLY the
Best Bets, without the beginning of the spider results in the hit list.  Go
to www.berkeley.edu and search for "human resources" for an example.  I
guess you might call that "interposed" instead of "federated"? I bet some
folks won't like the behavior but that's how it works.
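
To make the two behaviors concrete, here is a minimal sketch. It is not how the MSU or Berkeley services are actually built; the function names, the dictionary of Best Bets, the stand-in spider, and the example.edu URLs are all assumptions for illustration. In the default mode the labeled Best Bets are placed ahead of the spider results, including URLs the spider has not yet indexed; in the "interposed" mode a match returns the Best Bets alone.

```python
def normalize(query):
    """Lowercase and collapse whitespace so lookups ignore trivial variants."""
    return " ".join(query.lower().split())


def search(query, best_bets, spider_search, interposed=False):
    """Combine editor-chosen Best Bets with spider results for one query."""
    bets = [
        {"url": url, "label": "Best Bet"}
        for url in best_bets.get(normalize(query), [])
    ]
    if bets and interposed:
        # Interposed: a match shows only the Best Bets.
        return bets
    seen = {hit["url"] for hit in bets}
    spidered = [
        {"url": url, "label": "Search result"}
        for url in spider_search(query)
        if url not in seen  # do not repeat a URL already shown as a Best Bet
    ]
    return bets + spidered


# Hypothetical data: the news URL is a Best Bet the spider has not found yet.
BEST_BETS = {
    "human resources": [
        "https://hr.example.edu/",
        "https://news.example.edu/hr-update",
    ],
}


def fake_spider(query):
    """Stand-in for a real search engine call."""
    return ["https://hr.example.edu/", "https://example.edu/hr/forms"]


for hit in search("Human Resources", BEST_BETS, fake_spider):
    print(hit["label"], hit["url"])
```

Keeping the label on each hit is what lets the results page show the user which entries an editor picked.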




