Kostoff, RN; Rigsby, JT; Barth, RB "Brief communication - Adjacency and proximity searching in the Science Citation Index and Google" JOURNAL OF INFORMATION SCIENCE, 32 (6): 581-587 2006
Eugene Garfield
garfield at CODEX.CIS.UPENN.EDU
Wed Apr 30 15:42:32 EDT 2008
E-mail Address: kostofr at onr.navy.mil
Author(s): Kostoff, RN (Kostoff, Ronald N.); Rigsby, JT (Rigsby, John T.);
Barth, RB (Barth, Ryan B.)
Title: Brief communication - Adjacency and proximity searching in the
Science Citation Index and Google
Source: JOURNAL OF INFORMATION SCIENCE, 32 (6): 581-587 2006
Language: English
Document Type: Article
Author Keywords: information retrieval; adjacency searching; proximity
searching; constrained co-occurrence searching; Science Citation Index;
Google; Yahoo; Engineering Compendex; PubMed; OVID; search engine; query
Abstract: We have developed simple algorithms that allow adjacency and
proximity searching in Google and the Science Citation Index (SCI). The
SCI algorithm exploits the fact that SCI stopwords in a search phrase
function as placeholders. Such a phrase serves effectively as a fixed
adjacency condition determined by the number n of adjacent stopwords (i.e.
retrieve all records where word A and word B are separated by n words in
at least one location). The algorithm integrates over search phrases with
different numbers of adjacent stopwords to provide a flexible adjacency or
proximity capability (i.e. retrieve all records where word A and word B
are separated by n or fewer words in at least one location, where n is the
maximum separation desired between A and B). The
Google algorithm exploits the fact that asterisks (in Google) separating
words in a phrase function like word wildcards. The difference between two
such phrases (the first phrase containing one fewer asterisk than the
second phrase) serves effectively as a fixed adjacency or proximity
condition, with the number of separating words equal to the number of
asterisks in the first phrase. The algorithm integrates over these phrase
differentials to provide a flexible adjacency or proximity capability
(i.e. retrieve all records where word A and word B are separated by n or
fewer words in at least one location, where n is the maximum separation
desired between A and B).
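
The two query constructions described in the abstract can be sketched as
follows. This is only an illustration under stated assumptions, not the
authors' published procedure: the function names, the choice of "the" as
the stopword placeholder, the OR syntax used to combine the SCI phrases, the
minus sign used to subtract one Google phrase from another, and the
inclusion of the zero-separation case are all hypothetical. The abstract
does not give the exact query strings, and current search engines may treat
these operators differently.

```python
def phrase(a, b, filler, count):
    """Return a quoted phrase with `count` filler tokens between a and b."""
    return '"' + " ".join([a] + [filler] * count + [b]) + '"'


def sci_proximity_query(a, b, max_sep, stopword="the"):
    """Combine fixed-adjacency phrases for separations 0..max_sep.

    Each stopword is assumed to act as a placeholder for one arbitrary
    word; joining the phrases with OR is an assumed query syntax.
    """
    return " OR ".join(phrase(a, b, stopword, k) for k in range(max_sep + 1))


def google_proximity_queries(a, b, max_sep):
    """Return one phrase-differential query per separation k in 0..max_sep.

    Each query is the phrase with k asterisks minus the phrase with k + 1
    asterisks; the leading minus is an assumed exclusion operator.
    """
    return [
        phrase(a, b, "*", k) + " -" + phrase(a, b, "*", k + 1)
        for k in range(max_sep + 1)
    ]


if __name__ == "__main__":
    print(sci_proximity_query("laser", "ablation", 2))
    for query in google_proximity_queries("laser", "ablation", 1):
        print(query)
```

The SCI sketch produces a single combined query, while the Google sketch
produces a list of queries whose result sets would be merged by the
searcher.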
Addresses: Off Naval Res, Arlington, VA 22217 USA; USN, Ctr Surface
Warfare, Dahlgren Div, Dahlgren, VA 22448 USA; DDL OMNI Engn LLC, McLean,
VA 22102 USA
Reprint Address: Kostoff, RN, Off Naval Res, 875 N Randolph St, Arlington,
VA 22217 USA.
E-mail Address: kostofr at onr.navy.mil
Cited Reference Count: 4
Times Cited: 0
Publisher: SAGE PUBLICATIONS LTD
Publisher Address: 1 OLIVERS YARD, 55 CITY ROAD, LONDON EC1Y 1SP, ENGLAND
ISSN: 0165-5515
29-char Source Abbrev.: J INFORM SCI
ISO Source Abbrev.: J. Inf. Sci.
Source Item Page Count: 7
Subject Category: Computer Science, Information Systems; Information
Science & Library Science
ISI Document Delivery No.: 126GE
Cited References:
KEEN EM
SOME ASPECTS OF PROXIMITY SEARCHING IN TEXT RETRIEVAL-SYSTEMS
JOURNAL OF INFORMATION SCIENCE 18 : 89 1992
KOSTOFF RN
SYSTEMATIC ACCELERAT : 2005
KOSTOFF RN
Systematic acceleration of radical discovery and innovation in science and
technology
TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE 73 : 923 DOI
10.1016/j.techfore.2005.09.004 2006
MCJUNKIN MC
PRECISION AND RECALL IN TITLE KEYWORD SEARCHES
INFORMATION TECHNOLOGY AND LIBRARIES 14 : 161 1995