Kostoff, RN; Rigsby, JT; Barth, RB Brief communication - Adjacency and proximity searching in the Science Citation Index and Google JOURNAL OF INFORMATION SCIENCE, 32 (6): 581-587 2006

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Wed Apr 30 15:42:32 EDT 2008


E-mail Address: kostofr at onr.navy.mil

Author(s): Kostoff, RN (Kostoff, Ronald N.); Rigsby, JT (Rigsby, John T.); 
Barth, RB (Barth, Ryan B.) 

Title: Brief communication - Adjacency and proximity searching in the 
Science Citation Index and Google 

Source: JOURNAL OF INFORMATION SCIENCE, 32 (6): 581-587 2006 

Language: English 

Document Type: Article 

Author Keywords: information retrieval; adjacency searching; proximity 
searching; constrained co-occurrence searching; Science Citation Index; 
Google; Yahoo; Engineering Compendex; PubMed; OVID; search engine; query 

Abstract: We have developed simple algorithms that allow adjacency and 
proximity searching in Google and the Science Citation Index (SCI). The 
SCI algorithm exploits the fact that SCI stopwords in a search phrase 
function as placeholders. Such a phrase serves effectively as a fixed 
adjacency condition determined by the number n of adjacent stopwords (i.e. 
retrieve all records where word A and word B are separated by n words in 
at least one location). The algorithm integrates over search phrases with 
different numbers of adjacent stopwords to provide a flexible adjacency or 
proximity capability (i.e. retrieve all records where word A and word B 
are separated by n or fewer words in at least one location, where n is the 
maximum separation desired between A and B in at least one location). The 
Google algorithm exploits the fact that asterisks (in Google) separating 
words in a phrase function like word wildcards. The difference between two 
such phrases (the first phrase containing one fewer asterisk than the 
second phrase) serves effectively as a fixed adjacency or proximity 
condition, with the number of separating words equal to the number of 
asterisks in the first phrase. The algorithm integrates over these phrase 
differentials to provide a flexible adjacency or proximity capability 
(i.e. retrieve all records where word A and word B are separated by n or 
fewer words in at least one location, where n is the maximum separation 
desired between A and B in at least one location). 
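
Illustrative sketch (not from the paper): the abstract does not give the 
exact query syntax, so the following Python only shows how the two families 
of search phrases it describes might be generated. The function names, the 
default stopword "the", the double-quoted phrase form, and the OR joiner are 
all assumptions made for illustration. For Google, each pair holds the two 
phrases whose difference the abstract describes; how that difference is 
submitted to the search engine is left open here.

```python
def sci_fixed_phrase(a, b, n, stopword="the"):
    """Phrase with n stopword placeholders between a and b.

    Per the abstract, such a phrase acts as a fixed adjacency condition:
    a and b separated by n words. The stopword choice is an assumption.
    """
    return '"' + " ".join([a] + [stopword] * n + [b]) + '"'


def sci_proximity_query(a, b, max_sep, stopword="the"):
    """Combine the fixed phrases for 0..max_sep separating words.

    Joining with OR is an assumed way to 'integrate over' the phrases.
    """
    return " OR ".join(
        sci_fixed_phrase(a, b, k, stopword) for k in range(max_sep + 1)
    )


def google_phrase(a, b, k):
    """Phrase with k asterisks (word wildcards) between a and b."""
    return '"' + " ".join([a] + ["*"] * k + [b]) + '"'


def google_differentials(a, b, max_sep):
    """One (first, second) phrase pair per separation k = 0..max_sep.

    The first phrase has k asterisks and the second has k + 1, matching
    the 'one fewer asterisk' relation described in the abstract.
    """
    return [
        (google_phrase(a, b, k), google_phrase(a, b, k + 1))
        for k in range(max_sep + 1)
    ]


if __name__ == "__main__":
    print(sci_proximity_query("neural", "network", 2))
    for first, second in google_differentials("neural", "network", 2):
        print(first, "minus", second)
```

With a maximum separation of 2, the SCI sketch yields three phrases (zero, 
one, and two placeholders) and the Google sketch yields three phrase pairs.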

Addresses: Off Naval Res, Arlington, VA 22217 USA; USN, Ctr Surface 
Warfare, Dahlgren Div, Dahlgren, VA 22448 USA; DDL OMNI Engn LLC, Mclean, 
VA 22102 USA 

Reprint Address: Kostoff, RN, Off Naval Res, 875 N Randolph St, Arlington, 
VA 22217 USA. 

Cited Reference Count: 4 

Times Cited: 0 

Publisher: SAGE PUBLICATIONS LTD 

Publisher Address: 1 OLIVERS YARD, 55 CITY ROAD, LONDON EC1Y 1SP, ENGLAND 

ISSN: 0165-5515 

29-char Source Abbrev.: J INFORM SCI 

ISO Source Abbrev.: J. Inf. Sci. 

Source Item Page Count: 7 

Subject Category: Computer Science, Information Systems; Information 
Science & Library Science 

ISI Document Delivery No.: 126GE 

KEEN EM
SOME ASPECTS OF PROXIMITY SEARCHING IN TEXT RETRIEVAL-SYSTEMS 
JOURNAL OF INFORMATION SCIENCE 18 : 89 1992 

KOSTOFF RN
SYSTEMATIC ACCELERAT : 2005 

KOSTOFF RN
Systematic acceleration of radical discovery and innovation in science and 
technology 
TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE 73 : 923 DOI 
10.1016/j.techfore.2005.09.004 2006 

MCJUNKIN MC
PRECISION AND RECALL IN TITLE KEYWORD SEARCHES 
INFORMATION TECHNOLOGY AND LIBRARIES 14 : 161 1995 


