[Sigia-l] precoordinate indexing

Leonard Will L.Will at willpowerinfo.co.uk
Wed Dec 1 06:18:07 EST 2004


In message <41ADEB51.70104 at poorbuthappy.com> on Wed, 1 Dec 2004, Peter 
Van Dijck <peter at poorbuthappy.com> wrote
>Question: I'm trying to figure out what "precoordinate indexing" really 
>means, in a web context.

In the draft revision of the standard for thesaurus construction BS5723 
(= ISO 2788), part 1, we have included the following definitions:

2.22 post-coordinate indexing

system of indexing in which a compound subject is analysed into its 
constituent concepts by an indexer but the descriptors so allocated are 
not combined until they are selected by a user at the search stage


2.23 pre-coordinate indexing

system of indexing in which the descriptors allocated to a particular 
document are syntactically combined in one or more sequences 
representing the only combinations available for retrieval purposes

>Let's say we have an article about Shakespeare's lovelife. In 
>precoordinate indexing, we have to put that under the categories:
>- Shakespeare
>- Shakespeare's lovelife
>- lovelife
>whereas in postcoordinate indexing, we have to put it under the categories
>- Shakespeare
>- lovelife
>and the system will also show it under Shakespeare's lovelife.

I would phrase this differently, by saying that in pre-coordinate 
indexing we would assign the subject index string "Shakespeare - 
lovelife" to a document, whereas in post-coordinate indexing we would 
assign the two descriptors "Shakespeare" and "lovelife" separately to 
the same document.
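
As a rough illustration of the difference in what gets stored, here is a 
minimal sketch in Python. The record layout, the document id "doc1" and 
the helper names are my own assumptions for the example, not anything 
prescribed by the standard.

```python
# Hypothetical helpers showing what an indexer would store in each system.

def assign_pre_coordinate(concepts):
    """Pre-coordinate: combine the concepts into one subject index string.

    The order of `concepts` is taken as given by the indexer.
    """
    return [" - ".join(concepts)]


def assign_post_coordinate(concepts):
    """Post-coordinate: keep each concept as a separate descriptor."""
    return set(concepts)


# Illustrative record for the example article (document id is made up).
pre_record = {"doc1": assign_pre_coordinate(["Shakespeare", "lovelife"])}
post_record = {"doc1": assign_post_coordinate(["Shakespeare", "lovelife"])}
```

In the first record the combination is fixed at indexing time; in the 
second the two descriptors stay independent until somebody searches.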

Pre-coordinate indexing is helpful for browsing, as it should provide a 
useful sequence in which to arrange compound topics. This is the basis 
of most classification systems, which have rules for the order in which 
concepts should be combined in an indexing string ("citation order"). 
The citation order adopted above would bring all aspects of 
Shakespeare's life together, whereas the alternative order "lovelife - 
Shakespeare" would bring together all documents on love lives, 
sub-arranged by the people concerned.
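
The effect of citation order on a browsable list can be sketched as 
follows. This is only an illustration: the facet names "person" and 
"topic", the extra documents and the function names are hypothetical, 
and a real classification scheme would have much richer rules.

```python
# Hypothetical documents, each indexed with (facet, term) pairs.
DOCS = {
    "doc1": [("person", "Shakespeare"), ("topic", "lovelife")],
    "doc2": [("person", "Marlowe"), ("topic", "lovelife")],
    "doc3": [("person", "Shakespeare"), ("topic", "education")],
}


def subject_string(concepts, citation_order):
    """Build one index string, ordering terms by the facet citation order."""
    ordered = sorted(concepts, key=lambda c: citation_order.index(c[0]))
    return " - ".join(term for _facet, term in ordered)


def browse_list(docs, citation_order):
    """Return (index string, document id) pairs in filing order."""
    return sorted(
        (subject_string(concepts, citation_order), doc_id)
        for doc_id, concepts in docs.items()
    )


# Person first: everything about Shakespeare files together.
by_person = browse_list(DOCS, ["person", "topic"])

# Topic first: all love lives file together, sub-arranged by person.
by_topic = browse_list(DOCS, ["topic", "person"])
```

With the person-first order the two Shakespeare entries are adjacent; 
with the topic-first order the two "lovelife" entries are adjacent 
instead.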

Post-coordinate indexing is helpful for specific searches, where an 
enquirer can specify combinations of concepts at the time of searching, 
allowing many more possible combinations than it would be practicable to 
provide as pre-combined strings. In the above example an enquirer should 
formulate the search query "Shakespeare AND lovelife", using AND as a 
Boolean connector (implicit in some systems but clearer if expressed 
explicitly).
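
A minimal sketch of how such a search might be evaluated, assuming a 
simple inverted index held in memory. The sample assignments and the 
function names are hypothetical; real retrieval systems differ in many 
details.

```python
from collections import defaultdict

# Hypothetical post-coordinate assignments: document id -> descriptors.
ASSIGNMENTS = {
    "doc1": {"Shakespeare", "lovelife"},
    "doc2": {"Marlowe", "lovelife"},
    "doc3": {"Shakespeare", "education"},
}


def build_inverted_index(assignments):
    """Map each descriptor to the set of documents it was assigned to."""
    index = defaultdict(set)
    for doc_id, descriptors in assignments.items():
        for descriptor in descriptors:
            index[descriptor].add(doc_id)
    return index


def search_and(index, *descriptors):
    """Boolean AND: documents that carry every one of the descriptors."""
    postings = [index.get(d, set()) for d in descriptors]
    return set.intersection(*postings) if postings else set()


INDEX = build_inverted_index(ASSIGNMENTS)
hits = search_and(INDEX, "Shakespeare", "lovelife")
```

The combination "Shakespeare AND lovelife" is computed only when the 
query is run, so any other pairing of descriptors is equally available 
without the indexer having anticipated it.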

The two approaches are complementary and ideally both should be 
provided.

To avoid misunderstanding, I should note that the phrase "representing 
the only combinations available for retrieval purposes" in the second 
definition above does not mean that the individual concepts within a 
pre-coordinated string cannot be searched for separately, either as 
controlled descriptors or as free text, but that such methods are not 
part of the pre-coordinate indexing mechanism.

Leonard Will

-- 
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will at Willpowerinfo.co.uk               Sheena.Will at Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------


