[Sigia-l] Full-Text vs Keyword Searching

Alice Preston aliceflute at hotmail.com
Thu Feb 24 14:39:38 EST 2005


Greetings--

Having recently switched jobs, I find myself being asked to provide 
different kinds of justifications for what I do (or what users might do) 
than in the past. This week's question has to do with full-text search vs 
keyword search. I know something about the subject, but maybe not enough to 
keep out of trouble.

The application will have huge collections of items (some are documents, 
some images, some other types of materials). To help users find 
what they need, we plan to offer browsing through a couple of 
hierarchies, such as Topic, Origin of the material, etc., with subheadings 
that lead in directions we expect to be used fairly often. We also 
plan to offer both a simple and an "advanced" search. It seems that the 
development side expected simple search to be just full-text search 
(with exact match) and advanced search to require the user to select an 
attribute and provide match text for that attribute.  (I'm still figuring out 
what they thought would happen with all the keywords the content providers 
are working to supply in the metadata.)

So here's the question (OK, several questions):

Have you worked on a system with a combination of browse and search where 
you dealt with keywords in some additional manner? How did you do it? For 
example, what if the simple search detected when the user had entered a 
keyword and switched to matching on keywords at that point? Or what about an 
application that offered an Index choice as well as (or maybe instead of) 
Advanced Search?
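
To make the keyword-sensitive simple search idea concrete, here is a minimal 
Python sketch. Everything in it is hypothetical and only for illustration: 
the Item record, the simple_search function, the sample items, and the rule 
that a query counts as a keyword only when it exactly matches a term the 
content providers supplied. It is not a description of how any existing 
system works.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical catalog record: a title, its full text, and provider keywords."""
    title: str
    text: str  # for "in context" items this may include the whole page
    keywords: set = field(default_factory=set)


def simple_search(query, items):
    """Route a query to keyword matching or full-text matching.

    Assumed rule: if the query (case-insensitive) is exactly one of the
    supplied keywords, match on keywords only; otherwise fall back to a
    plain substring match against the full text.
    Returns a (mode, matches) pair.
    """
    q = query.strip().lower()
    if not q:
        return "empty", []

    # Build the vocabulary from the metadata supplied with the items.
    vocabulary = {k.lower() for item in items for k in item.keywords}

    if q in vocabulary:
        matches = [i for i in items if q in {k.lower() for k in i.keywords}]
        return "keyword", matches

    matches = [i for i in items if q in i.text.lower()]
    return "full-text", matches


# Made-up sample data: one full newspaper page and one photograph.
items = [
    Item("Daily News, page 1",
         "Mayor wins election. Also on this page: harbor dredging begins.",
         {"Elections"}),
    Item("Harbor survey photo",
         "Photograph of harbor works.",
         {"Harbors", "Photographs"}),
]

for query in ("Elections", "harbor"):
    mode, hits = simple_search(query, items)
    print(query, "->", mode, [h.title for h in hits])
```

In this sketch "Elections" is a supplied keyword, so only the item tagged 
with it comes back. "harbor" is not (the supplied term is "Harbors"), so the 
search falls back to full text and also returns the newspaper page, whose 
only connection to harbors is a neighboring story on the same page.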

Other thoughts on how people succeed or don't succeed with such searches?

My gut instinct is to step further away from full-text search, especially 
because some of the materials are "in context"--think full pages of 
newspapers instead of just clippings--so a search will match text that 
appears elsewhere on the same page as the piece in question, not only within 
the piece itself. I do not yet have access to target users; they will be 
university undergraduate and graduate students, librarians, faculty, and 
non-university researchers in several different academic areas, across the 
world. I fully realize how many "ifs" there are, but I am in the position of 
needing to either back development's initial plan to omit keywords from 
their work or to supply some kind of generic evidence that they should use 
them in some manner.

Thanks in advance for any background or opinions you can share--
Alice Preston
Ithaka Harbors
Princeton, NJ USA
