Hjorland B, Nielsen LK "Subject access points in electronic retrieval" Ann. Rev. of Information Science and Technology 35:249-298, 2001

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Tue May 14 15:38:57 EDT 2002


Birger Hjorland: birger.hjorland at hb.se

TITLE Subject access points in electronic retrieval
AUTHOR  Hjorland B, Nielsen LK
JOURNAL ANNUAL REVIEW OF INFORMATION SCIENCE AND TECHNOLOGY 35: 249-298 2001

 Document type: Review    Language: English    Cited References: 153
Times Cited: 1


KeyWords Plus:
FULL-TEXT DOCUMENTS, TOPICAL RELEVANCE RELATIONSHIPS, WORLD-WIDE-WEB,
INFORMATION-RETRIEVAL, SEARCHERS SELECTION, PROBABILISTIC MODEL, DIGITAL
LIBRARIES, CITATION, INTERNET, DATABASE

Addresses:
Royal Sch Lib & Informat Sci, DK-2300 Copenhagen S, Denmark

Publisher:
INFORMATION TODAY INC, MEDFORD

IDS Number:
511HJ

ISSN:
0066-4200


EXTENDED ABSTRACT:

The first part of this article presents theoretical issues related
to information retrieval (IR), based on the idea that IR performance is
determined first of all by the objectively given data that may or may not be
utilized in retrieval. Those objectively given data are the subject access
points (SAPs), and they include titles, abstracts, references, classification
codes, descriptors, full-text elements and structures, and more. The paper
outlines how advances in information technology (IT) have produced five
major stages in the development of SAPs, including computer-based retrieval
in the 1950s, citation indexing in the 1970s, and full-text retrieval in the
1990s. This part of the article also presents criteria for a taxonomy of SAPs
as well as theoretical problems related to concepts like subject and aboutness.
 The second part of the article presents a synthesis of findings
related to each kind of access point (titles, abstracts,
references/citations, full text, descriptors, classification codes, etc.).
On the one hand, it discusses traditional measurements of how different SAPs
may improve recall and precision in IR. On the other hand, it emphasizes the
variability in the functions of these access points across different
discourse communities and, at times, considers more qualitative issues
related to SAPs than traditional approaches in IR do.
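
As a rough illustration of the traditional measurement mentioned above, the
sketch below computes precision and recall for the same query run against two
different access points. The records, the relevance judgments, and the helper
names (search, precision_recall) are all hypothetical and are not taken from
the review; the point is only that the same term can score differently
depending on which SAP is searched.

```python
# Toy records: each document exposes two subject access points (SAPs).
# All data here is invented for illustration.
records = {
    "d1": {"title": "citation indexing in chemistry",
           "descriptors": "citation analysis"},
    "d2": {"title": "mapping a research field",
           "descriptors": "citation analysis; bibliometrics"},
    "d3": {"title": "citation styles for authors",
           "descriptors": "scholarly writing"},
}

# Assumed relevance judgments for a searcher interested in citation analysis.
relevant = {"d1", "d2"}


def search(field, term):
    """Return the ids of records whose given access point contains the term."""
    return {doc_id for doc_id, rec in records.items() if term in rec[field]}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


for field in ("title", "descriptors"):
    p, r = precision_recall(search(field, "citation"), relevant)
    print(f"{field}: precision={p:.2f} recall={r:.2f}")
```

In this toy collection the title search retrieves one irrelevant record and
misses one relevant record, while the descriptor search does neither. A
different collection, written under different norms, could reverse that
ordering.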

A main conclusion is that a given access point does not have a general
informational value for IR, but that the value of different access points
is relative to norms of writing and citing in different communities. The
relative benefits and drawbacks of terms versus references as SAPs are
given special theoretical attention. The
informational value of a given SAP (e.g. abstracts) is therefore not only a
function of the length of the record, but also a function of its content and
quality. The review is informed by modern epistemology, according to which
observations are theory-dependent. A given representation will thus always
be biased in one direction or another. There is no such thing as a neutral
representation (e.g. an abstract) of a document, even granting the
well-known distinction between descriptive and evaluative abstracts. A
representation is never neutral, nor should it be. It should
represent the users' interest or the interests of the information system, of
which it forms a part. Thus different SAPs may be more or less useful
depending on how the perspective of the searcher matches the perspective
implied by a given SAP. The paper provides a rich description of many kinds
of documents, subjects, cultures, and target groups, thus avoiding the
dominant tendency to suppose that one ideal language or algorithm can meet
all kinds of demands.


