[Sigia-l] search results and thesauri

Sat May 25 01:43:12 EDT 2002

"Andrew McNaughton" wrote:

> It's not a question of audience, it's a question of scale and accuracy.

While Tal's system would generally work for the criteria he outlined, once
the result sets reach large numbers (or zero), another strategy may be of
use.

Fortunately there's a way to perform arbitrary set operations on clusters of
taxonomic terms or keywords *without* having the database do any search
until the user is satisfied with the criteria.

In other words, if the user can see up front that a search on 'word_1' would
return a million records, running it is probably not going to be very useful.
So why do the search to begin with, wasting the user's time, taxing the
server/DB and clogging the network?

At that point, the application should offer 'related' words, each showing
how many results it would produce. The user can then AND/OR these words and
immediately see the size of the resulting set, ideally narrowing it to a
reasonably small number.
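
To make that concrete, here is a rough sketch in Python of how the 'related'
words and their counts might be produced. It is only an illustration under
assumptions of mine: the INDEX dictionary, the related_terms() helper and the
rule that 'related' means "shares at least one record" are all hypothetical,
and a real system could define relatedness through the thesaurus instead.

```python
# Hypothetical pre-calculated index: each word maps to the set of record IDs
# tagged with it. Assumed to be built ahead of time, not per query.
INDEX = {
    "thesaurus": {1, 2, 3, 4, 5, 6},
    "taxonomy": {2, 3, 4, 7},
    "search": {1, 2, 3, 8, 9},
    "facets": {3, 4, 10},
}


def related_terms(term, index):
    """Return (other_word, its_own_count, count_if_ANDed_with_term) tuples."""
    base = index[term]
    rows = []
    for other, ids in index.items():
        if other == term:
            continue
        overlap = len(base & ids)
        # Assumption: a word is "related" if it shares at least one record.
        if overlap:
            rows.append((other, len(ids), overlap))
    # List the words that narrow the set the most first, ties by name.
    rows.sort(key=lambda row: (row[2], row[0]))
    return rows


suggestions = related_terms("thesaurus", INDEX)
for word, total, narrowed in suggestions:
    print(f"{word}: {total} records, {narrowed} if ANDed with 'thesaurus'")
```

Nothing here touches the records themselves; the counts come entirely from
the sizes of the stored sets and their intersections.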

The set of matching records for each word is pre-calculated, so set
operations are near-instantaneous and require no search of the actual
records. Once the count is small enough, a button would perform the actual
search across records.
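
A minimal sketch of that flow, again in Python and again with invented names
(SetPreview, fetch_records and the threshold value are my assumptions, not
part of any existing system): the AND/OR steps only manipulate the
pre-calculated sets, and the one call to fetch_records stands in for the
single real database search behind the button.

```python
class SetPreview:
    """Combine pre-calculated ID sets and defer the real search to the end."""

    def __init__(self, index, fetch_records, threshold=50):
        self.index = index                  # word -> set of record IDs
        self.fetch_records = fetch_records  # hypothetical DB lookup callback
        self.threshold = threshold          # "small enough" cut-off (assumed)
        self.current = set()

    def start(self, word):
        # Copy so later operations never mutate the stored sets.
        self.current = set(self.index.get(word, set()))
        return len(self.current)

    def and_(self, word):
        self.current &= self.index.get(word, set())
        return len(self.current)

    def or_(self, word):
        self.current |= self.index.get(word, set())
        return len(self.current)

    def can_search(self):
        # The search button would be enabled only when this is True.
        return 0 < len(self.current) <= self.threshold

    def search(self):
        # The only point at which the database is actually consulted.
        return self.fetch_records(sorted(self.current))


db_calls = []


def fake_fetch(ids):
    """Stand-in for the real database query; records each call made."""
    db_calls.append(ids)
    return [f"record-{i}" for i in ids]


index = {"a": {1, 2, 3, 4, 5}, "b": {4, 5, 6}, "c": {5, 9}}
preview = SetPreview(index, fake_fetch, threshold=2)
size_a = preview.start("a")       # count shown to the user, no DB access
size_a_and_b = preview.and_("b")  # still no DB access
results = preview.search() if preview.can_search() else []
```

However many words the user combines, db_calls ends up with a single entry,
which is the "single trip to the database" I have in mind.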

This way, there's a single trip to the database, a single search, no
irrelevant result sets being shuffled over the network to the user, no
multiple waits for searches, and the user quickly and continuously sees the
impact of choosing/combining words on the result set.

Actually, if the user can drill down and perform Boolean operations on Tal's
taxonomy list, with each operation quickly indicating the size of the result
set (without doing the actual record search until the last choice), that
would solve one aspect of the problem.

Best,

Ziya