[Sigia-l] Don't submit websites to search engines?

Jon Hanna jon at hackcraft.net
Mon May 17 12:32:08 EDT 2004


> > Another ridiculous premise is that the search engines exhaustively
> > crawl the web on a regular basis. Huge swathes of the web have
> > *never* been crawled, and not simply because they are behind
> > /robots.txt, dynamic pages, or firewalls. Huge swathes of eminently
> > crawlable webspace.
> > 
> > How do I know? I have google send me an email alert any time they
> > add a new page to their index that contains the keyword "IAwiki",
> > and I receive a trickle of alerts for what I know to be very old
> > pages. Specifically the SIGIA-L archives, and even other blogs.
> > 
> >     http://www.google.com/webalerts?hl=en
> 
> 
> I'm not too sure that's a valid source of data to support your claim
> that "Huge swathes of the web have *never* been crawled".

You're right that it isn't an irrefutable source of data for such a
claim. Nonetheless, huge swathes of the web have *never* been crawled.

-- 
Jon Hanna
<http://www.hackcraft.net/>
"
it has been truly said that hackers have even more words for
equipment failures than Yiddish has for obnoxious people." - jargon.txt
