[Sigcr-l] SIG-CR Workshop Agenda - one more time...
Mikel Breitenstein
mikel at breitenstein.com
Tue Nov 9 22:00:55 EST 2004
If you can't read this below, and want to -- contact me: mikel.breitenstein at verizon.net
See you Saturday, November 13.
SIG-CR WORKSHOP
"MORE THAN WORDS"
SATURDAY, NOVEMBER 13, 2004
DATE: November 2, 2004
FROM: Mikel Breitenstein, SIG-CR Workshop Chair
TO: SIG-CR Members
SCHEDULE
GATHER 8:30-9:00
INTRODUCTIONS AND WELCOME 9:00-9:10
Mikel Breitenstein, SIG-CR Chair
Email: mikel.breitenstein at verizon.net
OVERVIEW OF CLASSIFICATION PAST, PRESENT, FUTURE 9:15 - 10:00
Clare Beghtol, Barbara Kwasnik, Hope Olson -
Perspectives: Where We've Been, Where We Are, Where We're Going
If you want a future, darling,
Why don't you get a past?
Cole Porter, "Let's Misbehave"
The creation of the present from the past is a complex undertaking, and
looking into the future from the present is chancy too. New relationships
among the three are constantly discovered, affirmed, redrawn or predicted.
It is thus both necessary and useful to look back in order to look ahead.
Thinkers in all fields and in all eras have seen classification as an important
conceptual tool that is cognitively available to human beings, and classification
for information retrieval has evolved from the ideas of the past into the
classification systems and techniques in the present and for the future.
Our presentations today look at classification research from the vantage
points of all three times: from the past to the present and from the present
to the future. We range from Olson's research into Bacon's thought in the
seventeenth century, through Beghtol's description of the classificatory
work of James Duff Brown in the late nineteenth and early twentieth centuries,
to Kwaśnik's analysis of PRECIS (Preserved Context Index System) in relation
to the ontologies of our own time. Through these lenses, we can clarify
the past to make the present whole and the future happen.
Hope A. Olson
University of Wisconsin Milwaukee
Email: holson at uwm.edu
TOPIC: Bacon, Warrant, and Classification
Francis Bacon still has a radical idea to suggest to modern classificationists
in relation to warrant. Pre-Baconian classifications of knowledge from Aristotle,
Hugh of St. Victor, and other classical and medieval sources, are based
on existing knowledge and reflect scientific inquiry or pedagogical goals.
Bacon's motivation and warrant in his classification of knowledge show
the integrated entirety of knowledge, even that knowledge not yet developed.
In this presentation, two aspects of Bacon's approach will be explored and
related to our postmodern context: first, Bacon's fundamental belief that
all branches of knowledge are interrelated; and second, Bacon's view of
classification not only as a reflection of knowledge, but also as a guide
to the expansion of knowledge. By identifying gaps in existing knowledge,
Bacon, in essence, created an agenda for future research.
Clare Beghtol
University of Toronto
Email: beghtol at utoronto.ca
TOPIC: James Duff Brown on Classification
James Duff Brown was an energetic and eminent librarian in late nineteenth
and early twentieth century Great Britain. He was active in the Library
Association and took part in all the major library debates of his day. In
particular, Brown believed that a good classification system highlighting
British topics would support his campaign for open access to the shelves
in public libraries. Furthermore, he believed that a British shelf classification
would provide a made-in-England alternative to the Dewey Decimal Classification.
This view inspired him to develop several library classification schemes:
the Quinn-Brown Classification, the Adjustable Classification, and the Subject
Classification. The last system deserves special study for the light it
sheds on the past, the present, and the future of classification research.
Brown's somewhat iconoclastic views on classification theory included
a number of opinions that were unusual for his time and that are still controversial.
He believed that the order of main classes in a bibliographic classification
system was unimportant, that the applications of a science should follow
that science in the classification schedules, and that any topic could be
treated from a number of intra- and inter-disciplinary perspectives. In
this last view, he pre-dated the concerns of the twentieth century when
problems dealing with interdisciplinary documents became a major issue
that is still debated. The Subject Classification was published in three
editions (i.e., 1906, 1914, 1933), and it was idiosyncratic in a number
of ways. Nevertheless, its rationale and classificatory provisions helped
shape the conceptual and technical bases of important later developments
in classification theory and research. Two of these developments are 1)
creation of notations that could be synthesized in a number of ways and
2) the promulgation of the concept of a one-place classification for each
"concrete" subject. These developments helped shape the future of faceted
notations and are reflected in the concept of the "phenomenon" class in
the second edition of the Bliss Bibliographic Classification.
This presentation describes briefly Brown's career and importance
in the library world of his time. It emphasizes the influence of Brown's
ideas on current classification theory and research. It describes the three
classification systems he devised. His rationale for interdisciplinary notations
and his thinking about the possible combinations of topics that could occur
in documents were more advanced than those of any classification of his age
except the Universal Decimal Classification. Twentieth and twenty-first century
classification theory and research have progressed well beyond Brown's ideas,
but his work warrants study for the light it sheds on the intellectual
development of bibliographic classifications.
Barbara H. Kwasnik
Syracuse University
Email: bkwasnik at syr.edu
TOPIC: Revisiting the Preserved Context Index System (PRECIS):
The Bridge between Hierarchically Structured Thesauri and Facetted Classifications
This presentation will address the difficult task of representing complex
concepts in a text in a way that reflects their contextual meaning. The
preservation of context enables the disambiguation of a term's possible
multiple senses, and also shows how the term is being used. In developing
these ideas we revisit an indexing system called PRECIS, which was developed
by Derek Austin in the early 1970s for subject indexing for the British
National Bibliography, and subsequently developed by him with the assistance
of Mary Dykstra into an adaptable method of linking both the semantics and
syntax of indexing terms.
In the traditional approach to indexing there are two primary methods of
deriving the indexing vocabulary:
1. terms are chosen from the text itself, as is typical of back-of-the-book
indexes, or
2. terms are chosen from a controlled vocabulary, such as a thesaurus or
a subject headings list.
In the first instance, besides the important "content-bearing" words, the
indexer supplies syntax in the form of subentries. For example:
librarians
   education of
   job satisfaction of
   poor pay for
This sort of contextual information is usually limited to indexes created
for one work, that is, the index that appears at the end of the work. In the
case of shared or continuing indexing, such as that which occurs in dynamic
collections of works and serially produced works, however, the indexers
typically must choose terms from a controlled vocabulary. Many traditional
controlled vocabularies comprise subject terms that are structured in the
form of hierarchies. For example:
librarians
   academic librarians
      community college librarians
      university librarians
   school library media specialists
   special librarians
      law librarians
      music librarians
The strength of such representations is that they offer semantic context.
By knowing that music librarians are a kind of special librarian we know
more about the meaning of the term than if it were isolated. The obvious
issues with such a structure are that:
* typically only one aspect of meaning is revealed; the "context" is that
of increasing or decreasing specificity along one set of discriminatory
dimensions. But what if a librarian is a music librarian at a university
(and therefore, an academic librarian as well)?
* such vocabularies include only nouns, or noun phrases of nouns and adjectival
modifiers. Even actions are transformed into nouns: "running," "management,"
"searching."
* associations among terms can only imply syntactic relationships. E.g.,
"pasteurization" and "milk."
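As a rough illustration of the hierarchy above and of the first issue in the list, here is a minimal Python sketch in which each term has exactly one broader term. The names BROADER and broader_chain are hypothetical, invented for this example only; they do not come from any particular thesaurus or software.

```python
# Each term points to its single broader term, mirroring the example hierarchy.
# BROADER and broader_chain are illustrative names, not part of any real system.
BROADER = {
    "academic librarians": "librarians",
    "community college librarians": "academic librarians",
    "university librarians": "academic librarians",
    "school library media specialists": "librarians",
    "special librarians": "librarians",
    "law librarians": "special librarians",
    "music librarians": "special librarians",
}


def broader_chain(term):
    """Return the broader terms of `term`, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


if __name__ == "__main__":
    # The semantic context of "music librarians" is its chain of broader terms.
    print(broader_chain("music librarians"))
```

Because each term has only one broader term here, a music librarian who works at a university cannot also be reached through "academic librarians"; that is the one-dimensionality the first point describes.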
Facetted analysis, developed as a notion by Ranganathan, attempted to remedy
the limitations of one-dimensionality by enabling the representation of
an object from a number of perspectives, such as time, geographical association,
processes, materials, and so on. For instance, in using the Art & Architecture
Thesaurus to represent artefacts, we can construct a descriptive string
such as "embroidered, felt, 12th-century, Celtic slippers," in which the terms
and their citation order offer a rich dimensionality, and can then be permuted
in order to enable searching on any one of the component terms.
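One simple way to picture the permutation step is a rotated index, in which each component term takes a turn as the access point. The sketch below is only an assumption about how such permutation might be done; the function name rotations is hypothetical, and this is not a description of how the Art & Architecture Thesaurus or any particular system works.

```python
def rotations(terms):
    """Return every rotation of `terms`, so each term leads exactly once."""
    return [terms[i:] + terms[:i] for i in range(len(terms))]


if __name__ == "__main__":
    # The descriptive string from the text, split into its component terms.
    string = ["embroidered", "felt", "12th-century", "Celtic", "slippers"]
    for entry in rotations(string):
        # The first term is the access point; the rest preserve the context.
        print(entry[0] + ": " + ", ".join(entry[1:]))
```

A searcher looking under "felt" or "Celtic" would then find the same object, with the remaining terms still attached as context.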
Both facetted and traditional hierarchically structured indexing vocabularies
rely on the subject terms (typically nouns and their adjectives) to convey
the meaning of the text. PRECIS, on the other hand, also captures the subject
terms, but adds one more dimension, and that is the syntax. For example,
a document about the storage of powdered milk would generate the following
entries (Dykstra, 1987):
Milk
   Powdered milk. Storage
Powdered milk
   Storage
Storage. Powdered milk.
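The way each term takes its turn as the lead, with the other terms kept around it, can be sketched very roughly in Python. This is a simplified illustration under stated assumptions, not the actual PRECIS rules: it assumes the indexer has already produced an ordered string of terms, and the function names make_entries and render are hypothetical. It does not model the extra "Milk" entry, which comes from treating "Powdered milk" as a compound term.

```python
def make_entries(terms):
    """For each term in turn, build (lead, qualifier, display).

    qualifier: the terms before the lead, nearest first (the wider context).
    display:   the terms after the lead, in their original order.
    """
    entries = []
    for i, lead in enumerate(terms):
        qualifier = list(reversed(terms[:i]))
        display = terms[i + 1:]
        entries.append((lead, qualifier, display))
    return entries


def render(entry):
    """Format one entry: heading line, then an indented display line if any."""
    lead, qualifier, display = entry
    heading = ". ".join([lead] + qualifier)
    if display:
        return heading + "\n   " + ". ".join(display)
    return heading


if __name__ == "__main__":
    # Assumed input string for the powdered-milk example.
    for entry in make_entries(["Powdered milk", "Storage"]):
        print(render(entry))
```

Run on the two-term string, this produces the "Powdered milk / Storage" and "Storage. Powdered milk" entries; every entry keeps all the terms, so the context is preserved whichever term the searcher starts from.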
An indexer using PRECIS asks the following questions of the text:
Did anything happen?
If yes, to whom or what did it happen?
Who or what did it?
Where did it happen?
By enabling a syntactic analysis of the text, PRECIS lets the indexer represent
not only the subjects, but also the story behind the subject. In doing so,
it bridges the context-enhancing strategies of subject hierarchies and facetted
strings, and returns some of the richness provided by a good back-of-the-book
index that is inevitably lost in controlled-vocabulary indexing.
The rules and guidelines for PRECIS indexing are quite complex, however,
requiring a sophisticated understanding of the logic of syntax, as well
as the linguistic mechanics that reveal meaning at this level. In addition,
this is a manual system, in the sense that a person must assign the topic,
as well as the permutations, while a computer can assist with only the housekeeping
chores of alphabetizing and maintaining consistency. Perhaps it is time
to pull PRECIS into the era of natural language processing where texts can
be parsed for syntax, and then offered to the indexer for further intellectual
manipulation.
RESEARCH REPORTS
Deborah Karpuk -- 10:15 - 11:00
San Jose State University
Email: djkarpuk at aol.com
TOPIC: Visual Approaches to Teaching Classification
This presentation will describe current strategies for and experiences in
teaching Cataloging and Classification courses in person and online.
Judith Weedman -- 11:15 - 12:00
San Jose State University
Email: jweedman at slis.sjsu.edu
TOPIC: Image vocabularies: Design as professional practice
Herb Simon, economist, computer scientist, pioneer cognitive scientist,
and Nobel Prize winner, wrote that design is the core of all professional
activity. The natural sciences are concerned with how things are; the science
of design is concerned with how things ought to be, with "devising artifacts
to attain goals" (Schon, 1990, p. 110). What professionals do is to "transform
an existing state of affairs, a problem, into a preferred state, a solution"
(Schon, p. 111).
This paper presents preliminary results from a research project on the
design of local, in-house subject vocabularies for images. Requests for
participation were posted to several electronic discussion lists, in subject
areas including science, art, history, and library and information science.
Thirty-one respondents agreed to complete questionnaires concerning the
history and structure of their vocabularies. Interviews with the thirteen
respondents who had been the designers of the vocabularies they were using
are currently being conducted; as of mid-October 2004, six have been completed.
The interviews last between ninety minutes and two hours, and explore the
nature of design work.
The first section of this paper describes design theory and practice theory
developed from the study of various professions including architecture,
software development, and engineering. My research lies within the field
of science and technology studies, and draws on constructivist theory in
that area. A central goal of this project is to understand the intellectual
work of vocabulary design through a comparison to design in other professions.
The dates of construction of the vocabularies range from 1916 and 1917
to 2004. The second section of the paper provides descriptive information
about the structure of the vocabularies (post-coordinate, pre-coordinate,
classification, and natural language systems), their size, and whether they
represent both literal and interpretive attributes of the images. It also
summarizes the extent to which existing standard vocabularies were considered
for use, and the reasons for choosing local design.
The interview data presented in the third section address the nature of
vocabulary design as professional practice, using the theoretical literature
as a foundation. This section discusses the respondents' framing of the
design problem, their understanding of the roles of emotion, intuition,
and reason in work of this nature, the influence of the work context including
technology, the creativity in design practice, and the ways in which conceptions
of users influence design.
The conclusion of the paper considers the implications of the data for
three areas: the applicability of standard vocabularies to specialized
collections, what LIS students need to learn about the design of subject
vocabularies, and the significance of design theory for vocabulary design.
Reference cited:
Schon, Donald A. (1990). The design process. In V. A. Howard (Ed.), Varieties
of thinking: Essays from Harvard's Philosophy of Education Research Center
(pp. 110-141). New York: Routledge.
LUNCH & POSTERS 12:00 - 1:30
ALL PARTICIPANTS ARE ENCOURAGED TO BRING A SMALL POSTER ON ANY TOPIC. WE'LL
HANG THESE AROUND THE ROOM OR PLACE THEM ON TABLES...IT'S A GOOD OPPORTUNITY
TO BRING NEW IDEAS FOR DISCUSSION AND CRITIQUE BY YOUR COLLEAGUES.
RESEARCH REPORTS
Joan Lussky -- 1:30 - 2:15
Drexel University. College of Information Science and Technology
Email: jpl26 at drexel.edu
TITLE: Terminology of the evolving medical knowledge of the late 19th to
the early 20th century
Discoveries in medical science over time have led to shifts in accepted
knowledge such as our understanding of what can cause disease and what defines
specific disease entities. The shifts in accepted medical knowledge that
occurred in the late nineteenth and early twentieth centuries are captured
in the literature of that time. With the newly digitized form of the Index
Catalogue of the Library of the Surgeon General's Office, United States
Army (IndexCat), an index to what was once the largest medical library in
this country, published from 1880 to 1961, we have the ability to undertake
a quantitative analysis of the variable advancement in medical knowledge
across many decades. My data looks at shifts surrounding three disease
entities (syphilis, Huntington's chorea, and beriberi) and their interactions
with three disease causation theories (germ, hereditary, and deficiency)
from 1880 to 1930. Temporal changes in the prominent subject heading words
and title words within the literature of these diseases and disease causations
corroborate qualitative accounts of this same literature which report the
complex and sometimes oblique process of knowledge accretion. Although
preliminary, my results indicate that the IndexCat is a valuable tool for
studying the development of medical knowledge.
James Turner -- 2:30 - 3:15
University of Montreal
Email: james.turner at umontreal.ca
TOPIC: Data about metadata: beating the MetaMap into shape
DESCRIPTION:
Presented as a straightforward alphabetical list, the information and reference
tool would have served its purpose: to gather in a single place information
about metadata standards, sets, and initiatives of interest to people in
information science. But no, we couldn't just let the sleeping dog lie.
There was a lot of data about all this metadata, so we couldn't resist
trying to classify it, and that's where the fun began. What would be the
best approach to organizing this data about metadata? By information
processes? By types of institutions with expertise in managing information?
By types of information? Why, we'd just have to try all three approaches,
wouldn't we? We ended up finding a way to include all three (which was like
trying to fit a square peg in a round hole, quite doable when you think
about it), and then some. This presentation will outline some of the
classification issues we faced in trying to beat the MetaMap
(http://mapageweb.umontreal.ca/turner/) into shape. We will show the
interactive online version and make the printed poster available for
discussion.
Corinne Jorgensen -- 3:30 - 4:15
Florida State University
Email: cjorgensen at lis.fsu.edu
TOPIC: The ARDA NRRC Workshop on Large Scale Concept Ontology for Multimedia
Understanding
The ARDA NRRC workshop on Large Scale Concept Ontology for Multimedia Understanding
is a series of meetings and research, experimentation, and prototyping tasks
to address the theoretical and empirical aspects of the automatic detection
of semantic concepts in the domain of broadcast video. This project pulls
together collaborating experts in ontologies, user communities, information
retrieval, knowledge representation, and video analysis to define ontologies,
and evaluate and refine them. A significant challenge is the systematic
construction and evaluation of a large-scale lexicon of semantic concepts
(on the order of 1000 concepts) and their interrelationships, which is generally
applicable to all broadcast news video.
Increasing the number of concepts that can be detected and evaluated necessitates
careful lexicon design driven by user needs. The concepts in the lexicon
should be useful from a perspective of visual information exploitation.
Simultaneously the lexicon must be feasible from the perspective of automatic
and semi-automatic detection. It is hoped that the confluence of statistical
and non-statistical media analysis with ontologies, classification schemas,
and lexicons will make the scalable multimedia semantic concept detection
problem tractable.
The current work involves exploring existing vocabularies for relevant concepts
and relationships. The project is using an Image Description Template drawn
from the work of Jörgensen as a springboard for developing the lexicon.
The presentation will discuss more fully the context of the project, with
a focus on the needs of various user communities (such as security analysts)
and use scenarios, and the progress on the project to date.
CLOSING PANEL 4:30 - 5:00
All panelists. Open questions and comments about the workshop topics.
PLEASE REPLY TO MY NEW EMAIL!
mikel.breitenstein at verizon.net