[Sigcr-l] Exhaustivity and specificity of indexing
Barbara Kwasnik
bkwasnik at syr.edu
Sat Aug 26 11:55:44 EDT 2006
My sense of specificity is that it is a conceptual matter: broad indexing/classification covers only the "top" concepts -- the ones that cover many subconcepts, while "deep" indexing covers down to the most specific concepts. These two notions are relative to the text and the domain of course.
My concept of exhaustivity is related to this, but it deals with how COMPREHENSIVELY you adhere to the rule for specificity. For example, one of the indexes on the front of the newspaper can be by design rather shallow -- covering only the headlines of interesting articles in other sections perhaps -- sort of a teaser. In addition it is not exhaustive either, because it only covers SOME of the headlines (not all) and is selective in its choice of indexable matter. On the other hand, a table of contents for a book may list the chapter titles (not very "deep" in terms of chapter content perhaps) but it is exhaustive if it covers every chapter, leaving none out. Thus a table of contents in a book can be construed as an index that is shallow with respect to specificity but exhaustive with respect to coverage.
I can't tell you where I got these definitions. I think it was from James Anderson back in graduate school at Rutgers, and I know I disagree somewhat in this respect with the text I use in my classes: Introduction to Indexing and Abstracting by Cleveland & Cleveland. I know there are other notions of exhaustivity that deal with "using all the terms/concepts that apply." I tell my students that what you call these notions is not as important as understanding that there is a difference in the index design decisions an indexer/classificationist makes along two dimensions -- 1. conceptual specificity of concepts and 2. exhaustivity/comprehensiveness of applying the determination of specificity in any given indexing endeavor.
Barbara Kwasnik
Professor, School of Information Studies, Syracuse University
>>> Leonard Will <L.Will at willpowerinfo.co.uk> 08/26/06 9:34 AM >>>
In message <FB64419FDA34834382771A20964B15CED27AF9 at amon.it.lth.se> on
Thu, 24 Aug 2006, Koraljka Golub <kora at it.lth.se> wrote
>
>Does anyone know of any references or have any opinion about
>exhaustivity and specificity of classification, meaning assignment of
>classes from a classification scheme?
In message <73573C2DCB0154408D790B1E7EDB0C521B9E1F at ka-exch01.db.dk> on
Sat, 26 Aug 2006, Birger Hjørland <BH at db.dk> wrote
>Dear Kora,
>I believe that you are making the wrong assumption that indexing and
>classification are different in this respect. If you take a concept from
>a controlled vocabulary (say, a thesaurus) this is in my opinion
>similar to taking a class from a classification system (which also
>represents a concept). So, the specificity of a term in a thesaurus
>depends on the number of terms given and the specificity of a class in
>a classification system depends on the number of classes given (the
>more terms/classes, the greater the specificity of applying a given
>term/class). It is worth considering, however, that although the overall
>specificity can be measured by counting the number of
>descriptors/classes, any given system will have a greater specificity
>in some areas compared to others (DDC, for example, is much more
>specific in Christianity compared to other religions).
I agree with what Birger says, but I think that Koraljka's question was
not so much about the specificity provided in the scheme itself, but the
specificity with which it is applied when classifying documents, i.e.,
for example, is it worth while to use the full specificity possible in
DDC by adding all the possible common subdivisions, "divide-like"
instructions and so on, or is it better to simplify by limiting class
numbers to 3 (or 6 or whatever) digits?
The answer to this must be that it depends on the material being
classified. The aim should be to classify specifically enough to make it
easy for the user to scan through the items in a class. I usually think
of this as meaning that a class should contain between 10 and 50 items.
If the collection is large, or concentrated in a single subject area,
more specificity will be needed than if it is a small, general
collection.
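As a rough illustration of that rule of thumb, the sketch below truncates class numbers to a given number of digits, counts the items that fall in each resulting class, and looks for the shortest length at which no class holds more than 50 items. The function names, the DDC-style notation handling, and the sample numbers in the usage note are all assumptions made for illustration; only the upper bound of 50 comes from the paragraph above, and a real decision would also weigh the other considerations listed below.

```python
from collections import Counter


def truncate(number, digits):
    """Cut a DDC-style number such as '025.47' down to `digits` digits.

    Assumes the notation is digits with an optional decimal point after
    the third digit; the point is re-inserted when more than three
    digits are kept.
    """
    kept = "".join(ch for ch in number if ch.isdigit())[:digits]
    return kept if len(kept) <= 3 else kept[:3] + "." + kept[3:]


def class_sizes(numbers, digits):
    """Count how many items land in each class at the given length."""
    return Counter(truncate(n, digits) for n in numbers)


def shortest_adequate_length(numbers, max_items=50, max_digits=9):
    """Return the smallest number of digits (3 or more) at which no class
    exceeds `max_items`, or None if no length up to `max_digits` works."""
    for digits in range(3, max_digits + 1):
        sizes = class_sizes(numbers, digits)
        if max(sizes.values(), default=0) <= max_items:
            return digits
    return None
```

For a hypothetical collection with 40 items at 025.47, 30 at 025.48 and 5 at 200, calling shortest_adequate_length on the list of class numbers returns 5: three or four digits would put 70 items into a single class. This checks only the upper bound; classes with fewer than 10 items are not treated as a problem here.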
Other considerations are:
a. Allowing for growth of the collection. You don't want to have to go
back and re-classify if more material is added in a given subject area.
b. Compatibility with what is being done elsewhere. Do you share
records, obtain them from elsewhere or merge them in a combined
catalogue?
c. Provision of access from concepts that are scattered by the
classification. These may come later in the citation order of combining
facets in a synthesised class number, and if the number is truncated
they will be lost.
d. Adequacy of the alphabetical index constructed to show where topics
have been classed. It is seldom adequate to rely on the index published
with the schedules, but far too often that is all that is provided. It
will not show many synthesised numbers, and there is little point in
creating these if you do not also create the means of finding them.
Exhaustivity is more a matter of subject analysis of the documents. Do
you identify and record topics that are only treated incidentally in a
document, or do you restrict indexing and classification to the main
topics only? There is no simple answer, so much depending on the nature
of the collection, the users, and the purpose of the catalogue.
Leonard Will
--
Willpower Information (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will at Willpowerinfo.co.uk Sheena.Will at Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------
_______________________________________________
Sigcr-l mailing list
Sigcr-l at asis.org
http://mail.asis.org/mailman/listinfo/sigcr-l