[Sigia-l] Re: Faceted Classification

David R. Austen dausten at hoosier.net
Fri Jul 12 09:53:59 EDT 2002


Hello, Christopher:

Thank you for your reply.  Alas, I have time this morning for only
this quick response.

Thoughts, all?

Best regards,

David

http://zillionbucks.com -- Web hosting for the creative industry

Friday, July 12, 2002, 8:09:35 AM, you wrote:
>> In ERIC, if memory serves me, wrong spellings in record entries were
>> quite common and apparently even "accepted." One could see all the
>> different permutations and go with the most popular spelling when
>> making an entry. I suppose this "feature" was useful in that
>> "synonyms" (misspellings) were collected that usually pointed pretty
>> clearly to the descriptor.

CFa> "World Trade Center" and "Twin Towers" were, until recently, equally
CFa> popular terms for the buildings that used to define the NYC skyline.
CFa> Neither is a misspelling of the other,

I guess we'd all agree with this statement.

CFa> and the more popular of the two
CFa> may not have been the desired term.

Does it not depend? I'd think it generally true that the desired term
(for use as a descriptor) should be the most popular (choice) as
determined by careful testing of the appropriate users and
stakeholders.

CFa> Without a controlled vocabulary,
CFa> there would be these two facets with the same content.

Well, I'd really like to refer to these (loosely) as the descriptor
(preferred, if for no other reason than that it is spelled correctly)
and its synonyms (non-preferred, if for no other reason than that the
rest are spelled incorrectly). I'd rather not refer to facets here,
because I'd say that is a non-standard use of the term "facet." I
studied the work of Ranganathan years ago in grad school, but I've
been wrong before.

CFa> A better approach would be to "correct" each user's spelling by
CFa> cross-referencing each user entry with a controlled vocabulary. AFAIK,
CFa> the traditional method of ensuring adherence to a controlled vocabulary
CFa> in data entry is through a multiple choice interface element like a
CFa> pulldown menu.

I think that's a great way to go. Please note, however, that in ERIC
(Dialog) there will be hundreds of new proper nouns each day. The
pulldown technology may or may not be in use by those who compile
ERIC database entries.


CFa> But your suggestion alludes to a better approach for
CFa> occasions when there are hundreds or even thousands of multiple choices:
CFa> if the user types "Twin Towers", prompt them with "Did you mean 'World
CFa> Trade Center'?"

CFa> The "correction" process you describe may be more useful for the
CFa> searcher than it is for the data entry person.
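
To make the "Did you mean" idea concrete, here is a minimal Python
sketch. It is only an illustration under my own assumptions: the
VOCABULARY table, the "WTC" variant, the function names, and the 0.8
similarity cutoff are all hypothetical, and the fuzzy matching uses
the standard library's difflib rather than anything ERIC or Dialog is
known to use.

```python
import difflib

# Hypothetical controlled vocabulary: preferred descriptor -> non-preferred variants.
VOCABULARY = {
    "World Trade Center": ["Twin Towers", "WTC"],
}


def build_lookup(vocabulary):
    """Map every known term (lowercased) to its preferred descriptor."""
    lookup = {}
    for descriptor, variants in vocabulary.items():
        lookup[descriptor.lower()] = descriptor
        for variant in variants:
            lookup[variant.lower()] = descriptor
    return lookup


def suggest(entry, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the descriptor to offer in a "Did you mean ...?" prompt.

    Returns None when the entry is already the preferred descriptor or
    when nothing in the vocabulary is close enough to it.
    """
    lookup = build_lookup(vocabulary)
    key = entry.strip().lower()
    if key in lookup:
        descriptor = lookup[key]
        # No prompt is needed if the user already typed the descriptor.
        return None if descriptor.lower() == key else descriptor
    # Assumed fallback for misspellings: nearest known term by similarity.
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None


if __name__ == "__main__":
    for typed in ["Twin Towers", "Twin Towrs", "World Trade Center"]:
        hit = suggest(typed)
        if hit:
            print(f"{typed!r}: Did you mean {hit!r}?")
        else:
            print(f"{typed!r}: no suggestion")
```

The same lookup could serve both the data entry person and the
searcher; only the prompt wording would need to differ.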


CFa> I think that this is what
CFa> they do at stock photo search sites: If you type in "businessperson",
CFa> the system can, for example, translate this term into controlled
CFa> vocabulary terms "businessman" and "businesswoman" and return
CFa> appropriate results. I think of this as a kind of pre-search
CFa> "substituter" to translate the wide variety of search inputs into the
CFa> small range of controlled vocabulary in the database itself.
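
A rough sketch of such a pre-search "substituter" follows, again in
Python. The SUBSTITUTIONS table, the INDEX of record numbers, and the
function names are invented for illustration; I do not know how any
particular stock photo site actually implements this.

```python
# Hypothetical controlled vocabulary used by the database itself.
CONTROLLED_TERMS = {"businessman", "businesswoman"}

# Hypothetical substitution table: free-text search input -> controlled terms.
SUBSTITUTIONS = {
    "businessperson": ["businessman", "businesswoman"],
}

# Hypothetical index: controlled term -> record numbers (made-up data).
INDEX = {
    "businessman": [101, 102],
    "businesswoman": [102, 103],
}


def translate(query):
    """Translate a search input into a list of controlled vocabulary terms."""
    key = query.strip().lower()
    if key in CONTROLLED_TERMS:
        return [key]
    # Unknown inputs translate to no terms at all.
    return SUBSTITUTIONS.get(key, [])


def search(query):
    """Run the translated terms against the index, without duplicate records."""
    results = []
    for term in translate(query):
        for record_id in INDEX.get(term, []):
            if record_id not in results:
                results.append(record_id)
    return results


if __name__ == "__main__":
    print(translate("businessperson"))
    print(search("businessperson"))
```

The translation happens before the search runs, so the database only
ever sees terms from its own controlled vocabulary.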

CFa> -Cf

CFa> [christopher eli fahey]
CFa> art: http://www.graphpaper.com
CFa> sci: http://www.askrom.com
CFa> biz: http://www.behaviordesign.com





