Hjorland B, Nielsen LK "Subject access points in electronic retrieval" Ann. Rev. of Information Science and Technology 35:249-298, 2001
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Tue May 14 15:38:57 EDT 2002
Birger Hjorland: birger.hjorland at hb.se
TITLE Subject access points in electronic retrieval
AUTHOR Hjorland B, Nielsen LK
JOURNAL ANNUAL REVIEW OF INFORMATION SCIENCE AND TECHNOLOGY 35: 249-298 2001
Document type: Review Language: English Cited References: 153
Times Cited: 1
KeyWords Plus:
FULL-TEXT DOCUMENTS, TOPICAL RELEVANCE RELATIONSHIPS, WORLD-WIDE-WEB,
INFORMATION-RETRIEVAL, SEARCHERS SELECTION, PROBABILISTIC MODEL, DIGITAL
LIBRARIES, CITATION, INTERNET, DATABASE
Addresses:
Royal Sch Lib & Informat Sci, DK-2300 Copenhagen S, Denmark
Publisher:
INFORMATION TODAY INC, MEDFORD
IDS Number:
511HJ
ISSN:
0066-4200
EXTENDED ABSTRACT:
The first part of this article presents theoretical issues related
to information retrieval (IR), based on the idea that IR performance is
determined first of all by the objectively given data that may or may not be
utilized in retrieval. Those objectively given data are the subject access
points (SAPs), and they include titles, abstracts, references, classification
codes, descriptors, full-text elements and structures, and more. The paper
outlines how advances in information technology (IT) have produced five
major stages in the development of SAPs, including computer-based retrieval
in the 1950s, citation indexing in the 1970s and full-text retrieval in the
1990s. This part of the article also presents criteria for a taxonomy of SAPs
as well as theoretical problems related to concepts like subject and aboutness.
The second part of the article presents a synthesis of findings
related to each kind of access point (titles, abstracts,
references/citations, full text and descriptors, classification codes etc.).
On the one hand, traditional measurements of how different SAPs may improve
recall and precision in IR are discussed. On the other hand, the emphasis
has also been on demonstrating the variability in the functions of such
access points in different discourse communities, and at times on
considering more qualitative issues related to SAPs than traditional
approaches in IR do.
A main conclusion is that a given access point does not have a general
informational value for IR, but that the value of different access points
is relative to norms of writing and citing in different communities. The
relative benefits and drawbacks of terms versus references as SAPs are
given special theoretical attention. The
informational value of a given SAP (e.g. abstracts) is therefore not only a
function of the length of the record, but also a function of its content and
quality. The review is informed by modern epistemology, according to which
observations are theory-dependent. A given representation
will thus always be biased in some direction or another. There is no such
thing as a neutral representation (e.g. an abstract) of a document, even
granting the well-known distinction between descriptive and evaluative
abstracts. A representation is never neutral, nor should it be. It should
represent the users' interest or the interests of the information system, of
which it forms a part. Thus different SAPs may be more or less useful
depending on how the perspective of the searcher matches the perspective
implied by a given SAP. The paper provides a rich description of many kinds
of documents, subjects, cultures, and target groups, thus avoiding the
dominant tendency to suppose that one ideal language or algorithm can
manage all kinds of demands.