[Sigia-l] Re: String Search Question

Hal Taylor taylor at critpath.org
Thu Oct 23 07:03:23 EDT 2003


> Subject: [Sigia-l] String search question
> 
> If you have a "one box" search. Single text field w/ a search button. No
> other options. What do people feel is the default setting for behavior?
> 
> 1. Do you allow boolean symbols "&&" "||" etc.?

Why not? It adds functionality without interfering with naïve use. I would
also respect plain language terms ("and", "or").
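
As a rough illustration of accepting both the symbols and the plain-language
words, here is a minimal sketch. The function name `tokenize` and the
operator table are made up for this example and do not come from any
particular search engine.

```python
import re

# Hypothetical table: symbols and plain-language words map to one canonical operator.
_OPERATORS = {"&&": "AND", "and": "AND", "||": "OR", "or": "OR"}

def tokenize(query):
    """Split a query into ("OP", name) and ("TERM", text) tuples."""
    tokens = []
    # Match "&&", "||", or a run of characters that is not whitespace, "&" or "|".
    for raw in re.findall(r"&&|\|\||[^\s&|]+", query):
        op = _OPERATORS.get(raw.lower())
        tokens.append(("OP", op) if op else ("TERM", raw))
    return tokens
```

A user who types only `video` gets a single term back, so naïve use is
unaffected. One caveat of this sketch: a query such as `rock and roll` would
have its "and" read as an operator, which may or may not be what you want.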

> 2. What about just quotes and commas?

Again, I think that if you can add "advanced" functionality without
detracting from simple use, it's win/win.
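
One way quotes and commas could be layered on top of plain input is sketched
below. The name `split_terms` is hypothetical, and the rules (quotes keep a
phrase together, commas and spaces separate terms) are assumptions for
illustration only.

```python
import re

def split_terms(query):
    """Return quoted phrases as single terms; split the rest on spaces and commas."""
    terms = []
    # First group captures the inside of a "quoted phrase"; second captures a bare word.
    for phrase, word in re.findall(r'"([^"]*)"|([^\s,"]+)', query):
        text = phrase.strip() if phrase else word
        if text:
            terms.append(text)
    return terms
```

An unquoted query passes through as ordinary separate words, so the simple
case still behaves the way a first-time user would expect.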

> 3. If it is just an unquoted string, do you assume that spaces are separate
> words, or do you hold the integrity of the "phrase"?

Here my feeling for the "default" is that separate words are separate
terms, and phrasing is *not* respected. I don't know whether you can
configure a search engine to give you the best of both worlds by ranking
exact phrase matches above combined term matches...
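
To make the "best of both worlds" idea concrete, here is a minimal sketch of
ranking exact phrase matches ahead of documents that merely contain all the
words. The function `rank` is hypothetical, it ignores punctuation, and a real
engine would expose this through its own relevance settings.

```python
def rank(query, documents):
    """Return documents containing every query word, exact-phrase matches first."""
    words = query.lower().split()
    if not words:
        return []
    phrase = " ".join(words)
    scored = []
    for doc in documents:
        text = " ".join(doc.lower().split())
        doc_words = set(text.split())
        if not all(w in doc_words for w in words):
            continue  # a word is missing, so this document is not a match at all
        # Pad with spaces so the phrase only matches on whole-word boundaries.
        is_phrase = f" {phrase} " in f" {text} "
        scored.append((0 if is_phrase else 1, doc))
    # 0 sorts before 1; the sort is stable, so the original order is kept within each group.
    scored.sort(key=lambda pair: pair[0])
    return [doc for _, doc in scored]
```

With this approach an unquoted query still treats the words as separate
terms, but anything that happens to contain the phrase as typed floats to the
top.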

> And in either case are
> the strings or string used as "starts with" or "contains"?
> Starts with means that the string represents the starting point of the found
> entities and contains means that if the string is in any part a found entity
> it is good.
> 
> For example:
> Keyword: "vid"
> Will it show both video and david?
> Or just video?

This is probably the trickiest question you've got here, and I don't have a
sense of "default" for this.

What's interesting is that Yahoo's default search used to work as "contains"
and now defaults to exact match, plus a "did you mean to search for XXXX?"
response if it thinks your search term is funny. I assume that this is
because the "contains" search was too inclusive. You pose the question as
"starts with" vs. "contains" but don't discuss "exact term"...
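
The three behaviors can be compared side by side with a small sketch. The
function `matches` and the mode names are invented for this illustration.

```python
def matches(term, word, mode):
    """Case-insensitive comparison of a search term against one indexed word."""
    term, word = term.lower(), word.lower()
    if mode == "exact":
        return word == term
    if mode == "starts_with":
        return word.startswith(term)
    if mode == "contains":
        return term in word
    raise ValueError(f"unknown mode: {mode}")

words = ["video", "david", "vid"]
for mode in ("exact", "starts_with", "contains"):
    print(mode, [w for w in words if matches("vid", w, mode)])
# exact ['vid']
# starts_with ['video', 'vid']
# contains ['video', 'david', 'vid']
```

So for the keyword "vid", only "contains" returns "david", and "exact"
returns neither "video" nor "david".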

Obviously, "starts with" will bring you fewer results than "contains". Maybe
this decision should be made based on context: how much data is there, who
is the audience, and how likely is it that users will be overwhelmed by a
too-inclusive result set? Also, what happens if a user enters "a" or
something like that? Does it bring down the servers as they find every
instance of the letter?
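
One common guard against the single-letter case is a minimum term length.
This is a minimal sketch; the threshold of 3 and the name `searchable_terms`
are assumptions, and the right cutoff depends on the data and the audience.

```python
MIN_TERM_LENGTH = 3  # assumed threshold, tune for the data set

def searchable_terms(terms, min_length=MIN_TERM_LENGTH):
    """Separate terms long enough to search from those that are too short."""
    kept = [t for t in terms if len(t) >= min_length]
    dropped = [t for t in terms if len(t) < min_length]
    return kept, dropped
```

Returning the dropped terms as well lets the results page tell the user which
words were ignored instead of silently discarding them.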

You start with the condition that you have a single-box search field with no
options. What about the idea of adding options on the search results page?
This way, you could have a default behavior but allow users to tune their
search if the default wasn't working for them...

-Hal


> 
> Thanx!
> -- dave



