web search in AltaVista (fwd)

Gretchen Whitney gwhitney at UTK.EDU
Wed Feb 28 09:13:39 EST 2001


---------- Forwarded message ----------
Date: Wed, 28 Feb 2001 09:12:25 -0600
From: Wallace Koehler <koeh5762_ou at SOONERS.NET>

Perhaps the closest we get today are the two OCLC services NetFirst and CORC.
NetFirst (a FirstSearch service) covers only a very small slice of the Web. It
may be "relevant" but not comprehensive. CORC may grow into the service Judit
is calling for. But even then, neither NetFirst nor CORC will give us a tool to
study or map the Web.

Maybe the scientometric, and particularly the cybermetric, community should take
this on. A part of me says it is a quixotic dream, but, hey, Dulcinea is worth
it. OCLC has begun to measure the Web. Could they provide the corporate
structure to do what needs to be done?

wally koehler
>
>Dear colleagues,
>I found Ronald Rousseau's observation on the way AltaVista reports the
>number of search results most fascinating.  Commercial search engines serve
>the "average user", who doesn't care whether there are 10,000 or
>10,000,000 pages that mention his/her keywords; he/she only looks at the
>first few answers. Search engines have to give quick and "relevant"
>answers, and they have no intention of serving as research tools for
>studying the Internet. I believe that the scientific community should put
>the money and effort in building a more reliable search tool.
>
>-----------------------------------------------------
>Judit Bar-Ilan
>School of Library, Archive and Information Studies
>The Hebrew University of Jerusalem
>P.O. Box 1255, Jerusalem, 91904, Israel
>Tel: 972-2-6584663 Fax: 972-2-6585707
>e-mail: judit at cc.huji.ac.il
>-----------------------------------------------------
>
>
>At 08:21 26/02/2001 +0100, Ronald Rousseau wrote:
>>Dear colleagues,
>>
>>I would like to mention another caveat when doing web searches, this time in
>>AltaVista advanced search. In this mode it is possible to specify 'one result
>>per website'. Intuitively one now expects a smaller number of hits than without
>>this specification. This is often the case, but when doing a search using the
>>keyword 'peseta' (and this is just one example) one obtains almost four times
>>more hits with 'one result per website' on than without it. This huge
>>difference is certainly not caused by the fact that AltaVista's counts are not
>>precise. Consequently I asked AltaVista's Technical Support Team. This is their
>>answer:
>>
>>
>>"When you do a particular keyword search, the spider brings up only those
>>pages which are relevant to that keyword. However, when you specify only
>>one result per website, the spider will pull up all the pages in the
>>index with that keyword. This is the reason to obtain more results when
>>you restrict the search of a generic keyword to one result per site."
>>
>>
>>Besides the fact that this is something one must know when performing
>>informetric investigations, there is the word 'relevant' in the answer. So,
>>AltaVista decides (how?) which sites are relevant for you and which are not.
>>Interesting!
>>
>>Ronald Rousseau
>>KHBO - Zeedijk 101
>>B-8400  Oostende  Belgium
>>E-mail:  ronald.rousseau at kh.khbo.be
>>web page:  users.pandora.be/ronald.rousseau
>
===========
Wallace Koehler
Asst. Prof
School of Library and Information Studies ***University of Oklahoma
wkoehler at ou.edu
Web pages are heraclitian--you can never access the same one twice.



More information about the SIGMETRICS mailing list