[Sigcr-l] Exhaustivity and specificity of indexing

Sat Aug 26 09:34:27 EDT 2006

In message <FB64419FDA34834382771A20964B15CED27AF9 at amon.it.lth.se> on 
Thu, 24 Aug 2006, Koraljka Golub <kora at it.lth.se> wrote
>
>Does anyone know of any references or have any opinion about
>exhaustivity and specificity of classification, meaning assignment of
>classes from a classification scheme?

In message <73573C2DCB0154408D790B1E7EDB0C521B9E1F at ka-exch01.db.dk> on 
Sat, 26 Aug 2006, Birger Hjørland <BH at db.dk> wrote
>Dear Kora,
>I believe that you are making the wrong assumption that indexing and 
>classification are different in this respect. If you take a concept from 
>a controlled vocabulary (say, a thesaurus) this is in my opinion 
>similar to taking a class from a classification system (which also 
>represents a concept). So, the specificity of a term in a thesaurus 
>depends on the number of terms given and the specificity of a class in 
>a classification system depends on the number of classes given (the 
>more terms/classes, the greater the specificity of applying a given 
>term/class). It is worth considering, however, that although the overall 
>specificity can be measured by counting the number of 
>descriptors/classes, any given system will have a greater specificity 
>in some areas compared to others (DDC, for example, is much more 
>specific in Christianity compared to other religions).

I agree with what Birger says, but I think that Koraljka's question was 
not so much about the specificity provided in the scheme itself as about 
the specificity with which it is applied when classifying documents. For 
example, is it worthwhile to use the full specificity possible in DDC by 
adding all the possible common subdivisions, "divide-like" instructions 
and so on, or is it better to simplify by limiting class numbers to 3 
(or 6 or whatever) digits?

The answer to this must be that it depends on the material being 
classified. The aim should be to classify specifically enough to make it 
easy for the user to scan through the items in a class. I usually think 
of this as meaning that a class should contain between 10 and 50 items. 
If the collection is large, or concentrated in a single subject area, 
more specificity will be needed than if it is a small, general 
collection.
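
As a rough illustration of the rule of thumb above, the following Python 
sketch counts how many items fall into each class when DDC-style numbers 
are cut to a given number of digits, and reports the classes that exceed 
the upper limit. The function names, the 50-item ceiling as a parameter, 
and the sample class numbers are all assumptions made for this example; 
it is not a description of any cataloguing system's actual behaviour.

```python
from collections import Counter


def truncate(notation: str, digits: int) -> str:
    """Keep only the first `digits` digits of a number such as '942.085'."""
    bare = notation.replace(".", "")[:digits]
    # Put the decimal point back after the third digit, as DDC notation does.
    return bare if len(bare) <= 3 else bare[:3] + "." + bare[3:]


def class_sizes(notations, digits):
    """Count the items that share each truncated class number."""
    return Counter(truncate(n, digits) for n in notations)


def oversized(notations, digits, max_items=50):
    """Return the classes holding more than `max_items` items."""
    sizes = class_sizes(notations, digits)
    return {cls: count for cls, count in sizes.items() if count > max_items}


# Invented sample collection: concentrated in one area, sparse elsewhere.
collection = ["942.085"] * 30 + ["942.081"] * 40 + ["510"] * 5

print(oversized(collection, digits=3))  # {'942': 70}
print(oversized(collection, digits=6))  # {}
```

With three-digit numbers the 70 items at 942 are too many to scan 
comfortably; at six digits they split into classes of 30 and 40. The 
five items at 510 need no extra digits, which is the sense in which the 
right level of specificity depends on the material.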

Other considerations are:

a. Allowing for growth of the collection. You don't want to have to go 
back and re-classify if more material is added in a given subject area.

b. Compatibility with what is being done elsewhere. Do you share 
records, obtain them from elsewhere or merge them in a combined 
catalogue?

c. Provision of access from concepts that are scattered by the 
classification. These may come later in the citation order of combining 
facets in a synthesised class number, and if the number is truncated 
they will be lost.

d. Adequacy of the alphabetical index constructed to show where topics 
have been classed. It is seldom adequate to rely on the index published 
with the schedules, but far too often that is all that is provided. It 
will not show many synthesised numbers, and there is little point in 
creating these if you do not also create the means of finding them.

Exhaustivity is more a matter of subject analysis of the documents. Do 
you identify and record topics that are only treated incidentally in a 
document, or do you restrict indexing and classification to the main 
topics only? There is no simple answer, because so much depends on the 
nature of the collection, the users, and the purpose of the catalogue.

Leonard Will

-- 
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will at Willpowerinfo.co.uk               Sheena.Will at Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------