[Asis-l] TOC, JASIST, Volume 53, # 13
Richard Hill
rhill at asis.org
Fri Oct 11 09:13:51 EDT 2002
Journal of the American Society for Information Science and Technology
JASIST
VOLUME 53, NUMBER 13
[Note: URLs for viewing contents of JASIST from past issues are at the
bottom. Immediately below, the contents of Bert Boyce's "In This Issue"
have been cut into the Table of Contents.]
EDITORIAL
In This Issue
Bert R. Boyce
1083
RESEARCH
State Digital Library Usability: Contributing Organizational Factors
Hong (Iris) Xie and Dietmar Wolfram
Published online 18 September 2002
1085
In this issue Xie and Wolfram study the Wisconsin state digital library
BadgerLink to determine the organizational factors that lead to different
use requirements and the degree to which these are met, as well as the impact
on physical libraries. To this end, usage data from EBSCOhost and ProQuest
logs for BadgerLink were analyzed, and 313 Wisconsin libraries of all types
were surveyed (76% response rate); the results were analyzed along with 81
responses to a voluntary web survey of end users. Heaviest users were K-12
schools and
institutions of higher education. Heaviest use sites were the two largest
state universities and the state's largest public library. Small libraries
were infrequent users. Web survey respondents were mature working
professionals. Sixty percent searched for specific information, but 46%
reported browsing in subject areas. Libraries with dedicated Internet
access reported more frequent usage than those with dial-up connection.
Those who accessed from libraries reported more frequent use than those at
work or at home. Libraries that trained end users reported more use, but
the majority of the web survey respondents reported themselves as
self-taught. Logs confirm reported subject interests. Three surrogates were
requested for every full text document but full text availability is
reported as the reason for use by 30% of users. Availability has led to the
cancellation of
subscriptions in many libraries that are important promoters of the
service. A model will need to include interactions based upon the influence
of each involved participant on the others. It will also need to include
the extension of the activities of one participant to other participant
organizations and the communication among these organizations.
Unfounded Attribution of the ``Half-Life'' Index-Number of Literature
Obsolescence to Burton and Kebler: A Literature Science Study
Endre Szava-Kovats
Published online 21 August 2002
1098
Szava-Kovats demonstrates that the common attribution of the origin of
the concept of half-life in subject-oriented journal literatures to the
1960 Burton and Kebler article in American Documentation is not
correct. The first use appears to be in C. R. Gosnell's 1944 paper in
College and Research Libraries. It was later discussed by J. D. Bernal at
the 1958 International Conference on Scientific Information in Washington,
DC. While Burton and Kebler do solve some of the theoretical problems by
redefining half-life, they do not express confidence in the use of
half-life in this milieu, and Burton later advocates in 1961 the term
``median age'' which was introduced by Broadus in this context in 1953.
Is the Relationship Between Numbers of References and Paper Lengths the
Same for All Sciences?
Helmut A. Abt and Eugene Garfield
Published online 19 September 2002
1106
It has been shown in the physical sciences that a paper's length is
related to its number of references in a linear manner. Abt and Garfield
here look at the life and social sciences with the thought that if the
relation holds the citation counts will provide a measure of relative
importance across these disciplines. In the life sciences 200 research
papers from 1999-2000 were scanned in each of 10 journals to produce counts
of 1000-word normalized pages. In the social sciences an average of 70
research papers in nine journals were scanned for the two-year period.
Papers of average length in the various sciences have the same average
number of references within plus or minus 17%. A look at the 30 to 60
papers over the two years in 18 review journals indicates twice the
references of research papers of the same length.
Algorithmic Procedure for Finding Semantically Related Journals
Alexander I. Pudovkin and Eugene Garfield
Published online 3 September 2002
1113
Journal Citation Reports provides a classification of journals most
heavily cited by a given journal and which most heavily cite that journal,
but size variation is not taken into account. Pudovkin and Garfield suggest
a procedure for meeting this difficulty. The relatedness of journal i to
journal j is determined by the number of citations from journal i to
journal j in a given year normalized by the product of the papers published
in the j journal in that year times the number of references cited in the i
journal in that year. A multiplier of ten to the sixth is suggested to
bring the values into an easily perceptible range. While citations received
depend upon the overall cumulative number of papers published by a journal,
the current year is utilized since that data is available in JCR. Citations
to current year papers would be quite low in most fields and thus not
included. To produce the final index, the maximum of the A citing B value
and the B citing A value is chosen and used to indicate the closeness of
the journals. The procedure is illustrated for the journal Genetics.
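The normalization described above can be sketched in a few lines of Python.
This is only an illustration of the arithmetic as summarized here, not the
authors' own code; the function and parameter names are hypothetical, and the
counts in the usage lines are made-up numbers, not JCR data.

```python
def relatedness(cites_i_to_j, papers_j, refs_i, scale=1_000_000):
    """Relatedness of journal i to journal j for one year.

    cites_i_to_j: citations from journal i to journal j in that year
    papers_j:     papers published in journal j in that year
    refs_i:       references cited in journal i in that year
    scale:        the suggested multiplier of ten to the sixth
    """
    return cites_i_to_j * scale / (papers_j * refs_i)


def relatedness_index(cites_a_to_b, cites_b_to_a,
                      papers_a, papers_b, refs_a, refs_b):
    """Final index: the larger of the A citing B and B citing A values."""
    a_citing_b = relatedness(cites_a_to_b, papers_b, refs_a)
    b_citing_a = relatedness(cites_b_to_a, papers_a, refs_b)
    return max(a_citing_b, b_citing_a)


# Illustrative (invented) counts for two journals A and B.
print(relatedness(50, 200, 10_000))                          # 25.0
print(relatedness_index(50, 30, 100, 200, 10_000, 5_000))    # 60.0
```

Taking the maximum means that a small journal citing a large one heavily is
enough to mark the pair as close, even when few citations flow the other way.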
Using Graded Relevance Assessments in IR Evaluation
Jaana Kekalainen and Kalervo Jarvelin
Published online 3 September 2002
1120
Kekalainen and Jarvelin use what they term generalized, nonbinary recall
and precision measures where recall is the sum of the relevance scores of
the retrieved documents divided by the sum of relevance scores of all
documents in the data base, and precision is the sum of the relevance
scores of the retrieved documents divided by the number of documents where
the relevance scores are real numbers between zero and one. Using the
In-Query system and a text data base of 53,893 newspaper articles with 30
queries selected from those for which four relevance categories to provide
recall measures were available, search results were evaluated by four
judges. Searches were done by average key term weight, Boolean expression,
and by average term weight where the terms are grouped by a synonym
operator, and for each case with and without expansion of the original
terms. Use of higher standards of relevance appears to increase the
superiority of the best method. Some methods do a better job of getting the
highly relevant documents but do not increase retrieval of marginal ones. There is
evidence that generalized precision provides more equitable results, while
binary precision provides undeserved merit to some methods. Generally,
graded relevance measures seem to provide additional insight into IR
evaluation.
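The generalized measures described above can be sketched as follows. This is
a minimal illustration based only on the summary here, not the authors'
implementation; it assumes that "the number of documents" in the precision
definition means the number of retrieved documents, and the function names
and the toy scores are hypothetical.

```python
def generalized_recall(retrieved, scores):
    """Sum of relevance scores of retrieved documents divided by the
    sum of relevance scores of all documents in the data base.

    retrieved: list of document ids returned by a search
    scores:    dict mapping every document id to a score in [0, 1]
    """
    total = sum(scores.values())
    if total == 0:
        return 0.0
    return sum(scores.get(doc, 0.0) for doc in retrieved) / total


def generalized_precision(retrieved, scores):
    """Sum of relevance scores of retrieved documents divided by the
    number of retrieved documents (an assumed reading of the summary)."""
    if not retrieved:
        return 0.0
    return sum(scores.get(doc, 0.0) for doc in retrieved) / len(retrieved)


# Toy data base of four documents with graded relevance scores.
scores = {"d1": 1.0, "d2": 0.5, "d3": 0.0, "d4": 0.5}
retrieved = ["d1", "d3", "d4"]
print(generalized_recall(retrieved, scores))     # 0.75
print(generalized_precision(retrieved, scores))  # 0.5
```

With scores restricted to zero and one, both functions reduce to the usual
binary recall and precision.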
Automatic Thesaurus Generation for Chinese Documents
Yuen-Hsien Tseng
Published online 19 September 2002
1130
Tseng constructs a word co-occurrence based thesaurus by means of the
automatic analysis of Chinese text. Words are identified by a longest
dictionary match supplemented by a key word extraction algorithm that
merges back nearby tokens and accepts shorter strings of characters if they
occur more often than the longest string. Single-character auxiliary words
are a major source of error, but this can be greatly reduced with the use of
a 70-character, 2680-word stop list.
Extracted terms with their associated document weights are sorted by
decreasing frequency and the top of this list is associated using a Dice
coefficient modified to account for longer documents on the weights of term
pairs. Co-occurrence is not in the document as a whole but in paragraph or
sentence size sections in order to reduce computation time. A window of 29
characters or 11 words was found to be sufficient. A thesaurus was produced
from 25,230 Chinese news articles and judges were asked to review the top 50
terms associated with each of 30 single word query terms. They determined
69% to be relevant.
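For orientation, the plain Dice coefficient over co-occurrence windows can be
sketched as below. Note that this is the standard, unmodified textbook form
computed on simple window counts; the modification for longer documents and
the use of term weights mentioned above are not specified in this summary
and are not reproduced. The function name and the toy windows are
hypothetical.

```python
def dice(term_a, term_b, windows):
    """Standard Dice coefficient for two terms.

    windows: list of sets, each holding the terms found in one
             paragraph- or sentence-size section of text
    Returns 2 * (windows with both terms) /
            (windows with term_a + windows with term_b).
    """
    with_a = sum(1 for w in windows if term_a in w)
    with_b = sum(1 for w in windows if term_b in w)
    both = sum(1 for w in windows if term_a in w and term_b in w)
    if with_a + with_b == 0:
        return 0.0
    return 2 * both / (with_a + with_b)


# Four toy windows; "x" and "y" co-occur in two of them.
windows = [{"x", "y"}, {"x"}, {"y", "z"}, {"x", "y"}]
print(dice("x", "y", windows))
```

Counting co-occurrence per window rather than per document keeps the number
of candidate term pairs small, which matches the stated goal of reducing
computation time.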
On Bidirectional English-Arabic Search
M. Aljlayl, O. Frieder, and D. Grossman
Published online 19 September 2002
1139
Aljlayl, Frieder, and Grossman review machine translation of query
methodologies and apply them to English-Arabic/Arabic-English
Cross-Language Information Retrieval. In the dictionary method, replacement
of each term with all possible equivalents in the target language results
in considerable ambiguity, while taking the first term in the dictionary
list reduces the ambiguity but may fail to capture the meaning. A Two-Phase
method takes all possible equivalents and translates them back, retaining
only those that generate the original term. It results in an average query
length of six terms in TREC7 and 12 in TREC9. Arabic to English
translations consistently performed below the original English queries, and
the Two-Phase method consistently performed at the highest level and
significantly better than the Every-Match method.
Machine translation using other techniques is economical for queries but
not likely so for documents. Using ALKAFI, a commercial translation system
from Arabic to English and the Al-Mutarjim Al-Arabey system for English to
Arabic, nearly 60% of monolingual retrievals were generated going from
Arabic to English. Smaller numbers of terms in the source query improve
performance, and these systems require syntactically well-formed queries
for good performance.
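The back-translation filter behind the Two-Phase method can be sketched as
follows. This is a minimal illustration of the idea as summarized above, not
the authors' system; the dictionaries are toy placeholders (the target-side
tokens such as "ar_1" stand in for real Arabic terms), and the function names
are hypothetical.

```python
def two_phase_translate(term, forward, backward):
    """Keep only the equivalents that translate back to the original term.

    forward:  dict mapping a source term to its possible target equivalents
    backward: dict mapping a target term to its possible source equivalents
    """
    candidates = forward.get(term, [])          # phase 1: every match
    return [c for c in candidates               # phase 2: back-translate
            if term in backward.get(c, [])]


def translate_query(terms, forward, backward):
    """Translate a whole query term by term, concatenating the survivors."""
    translated = []
    for term in terms:
        translated.extend(two_phase_translate(term, forward, backward))
    return translated


# Toy bilingual dictionaries (placeholders, not real dictionary entries).
forward = {"bank": ["ar_1", "ar_2"], "river": ["ar_3"]}
backward = {"ar_1": ["bank", "treasury"], "ar_2": ["shore"], "ar_3": ["river"]}
print(translate_query(["bank", "river"], forward, backward))  # ['ar_1', 'ar_3']
```

Dropping the filter in `two_phase_translate` and returning `candidates`
directly gives the Every-Match behavior, which shows where the extra
ambiguity in that method comes from.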
The Influence of Mental Models and Goals on Search Patterns During Web
Interaction
Debra J. Slone
Published online 19 September 2002
1152
Thirty-one patrons, who were selected by Slone to provide a range of age
and experience, agreed when approached while using the catalog of the Wake
County library system to try searching via the Internet. Fifteen searched
the Wake County online catalog in this manner and 16 searched the World
Wide Web, including that catalog. They were subjected to brief
pre-structured taped interviews before and after their searches and
observed during the searching process resulting in a log of behaviors,
comments, pages accessed, and time spent. Data were analyzed across
participants and categories. Web searches were characterized as linking,
URL, search engine, within a site domain, and searching a web catalog; and
participants by the number of these techniques used. Four used only one, 13
used two, 11 used three, two used four, and one used all five.
Participant experience was characterized as never used, used search
engines, browsing experience, email experience, URL experience, catalog
experience, and finally chat room/newsgroup experience. Sixteen percent of
the participants had never used the Internet, 71% had used search engines,
65% had browsed, 58% had used email, 39% had used URLs, 39% had used online
catalogs, and 32% had used chat rooms. The catalog was normally consulted
before the web, where both were used, and experience with an online catalog
assists in web use. Scrolling was found to be unpopular and practiced
halfheartedly.
Children's Use of the Yahooligans! Web Search Engine. III. Cognitive and
Physical Behaviors on Fully Self-Generated Search Tasks
Dania Bilal
Published online 19 September 2002
1170
Bilal, in this third part of her Yahooligans! study, looks at children's
performance with self-generated search tasks, as compared to previously
assigned search tasks, looking for differences in success, cognitive
behavior, physical behavior, and task preference. Lotus ScreenCam was used
to record interactions, and post-search interviews to record impressions.
The subjects, the same 22 seventh-grade children as in the previous studies,
generated topics of interest that were mediated with the researcher into
more specific topics where necessary. Fifteen usable sessions form the
basis of the study. Eleven children were successful in finding information,
a rate of 73% compared to 69% in assigned research questions, and 50% in
assigned fact-finding questions.
Eighty-seven percent began using one or two keyword searches. Spelling
was a problem. Successful children made fewer keyword searches and the
number of search moves averaged 5.5 as compared to 2.4 on the research
oriented task and 3.49 on the factual. Backtracking and looping were
common. The self-generated task was preferred by 47% of the subjects.
Book Reviews
Usability Testing for Library Web Sites: A Hands-On Guide, by Elaina
Norlin and CM Winters
Matt Jones
Published online 7 August 2002
1184
Accessing and Browsing Information and Communication, by Ronald E. Rice,
Maureen McCreadie, and Shan-Ju L. Chang
Robert J. Sandusky
Published online 29 August 2002
1185
Strategies for Electronic Commerce and the Internet, by Henry C. Lucas, Jr.
Roisin Faherty
Published online 22 August 2002
1187
CALL FOR PAPERS
1189
----------
[Note: The ASIST home page
<http://www.asis.org/Publications/JASIS/tocs.html> contains the Table of
Contents and abstracts from Bert Boyce's "In This Issue" from January 1993
(Volume 44) to date.
The John Wiley Interscience site <http://www.interscience.wiley.com>
includes issues from 1986 (Volume 37) to date. Guests have access only to
tables of contents and abstracts. Registered users of the interscience
site have access to the full text of these issues and to preprints.]
Executive Director
American Society for Information Science and Technology
1320 Fenwick Lane, Suite 510
Silver Spring, MD 20910
FAX: (301) 495-0810
PHONE: (301) 495-0900
http://www.asis.org