web search in AltaVista

Judit Bar-Ilan judit at CC.HUJI.AC.IL
Wed Feb 28 06:14:56 EST 2001


Dear colleagues,
I found Ronald Rousseau's observation on the way AltaVista reports the
number of search results most fascinating.  Commercial search engines serve
the "average user", who doesn't care whether there are 10,000 or
10,000,000 pages that mention his/her keywords; he/she only looks at the
first few answers. Search engines have to give quick and "relevant"
answers, and they have no intention of serving as research tools for
studying the Internet. I believe that the scientific community should invest
the money and effort in building a more reliable search tool.

-----------------------------------------------------
Judit Bar-Ilan
School of Library, Archive and Information Studies
The Hebrew University of Jerusalem
P.O. Box 1255, Jerusalem, 91904, Israel
Tel: 972-2-6584663 Fax: 972-2-6585707
e-mail: judit at cc.huji.ac.il
-----------------------------------------------------


At 08:21 26/02/2001 +0100, Ronald Rousseau wrote:
>Dear colleagues,
>
>I would like to mention another caveat when doing web searches, this time in
>AltaVista advanced search. In this mode it is possible to specify 'one result
>per website'. Intuitively one now expects a smaller number of hits than
>without this specification. This is often the case, but when doing a search
>using the keyword 'peseta' (and this is just one example) one obtains almost
>four times more hits with 'one result per website' on than without it. This
>huge difference is certainly not caused by the fact that AltaVista's counts
>are not precise. Consequently I asked AltaVista's Technical Support Team.
>This is their answer:
>
>
>"When you do a particular keyword search, the spider brings up only those
>pages which are relevant to that keyword. However, when you specify only
>one result per website, the spider will pull up all the pages in the
>index with that keyword. This is the reason to obtain more results when
>you restrict the search of a generic keyword to one result per site."
>
>
>Besides the fact that this is something one must know when performing
>informetric investigations, there is the word 'relevant' in the answer. So,
>AltaVista decides (how?) which sites are relevant for you and which are not.
>Interesting!
>
>Ronald Rousseau
>KHBO - Zeedijk 101
>B-8400  Oostende  Belgium
>E-mail:  ronald.rousseau at kh.khbo.be
>web page:  users.pandora.be/ronald.rousseau




More information about the SIGMETRICS mailing list