[Sigia-l] Findability
Simon Wistow
simon at thegestalt.org
Tue Jan 28 10:57:59 EST 2003
On Tue, Jan 28, 2003 at 09:37:44AM -0600, Jonathan Broad said:
> The best information I've found on the subject is a rather lengthy paper
> called "Pagerank Uncovered":
> http://www.supportforums.org/PageRank.pdf
>
> In it, a number of search engine experts attempt to reverse engineer
> Google's algorithm. Can't nail it exactly, but close enough for all
> practical purposes.
Umm, a bit of a linkdump ...
There's obviously
http://www.google.com/technology/index.html
But since Google grew out of academic research at Stanford, and a
series of papers came out of that work before it went commercial, you
can have a good stab at reading those. Presumably the tech's evolved
since then but ...
The Anatomy of a Large-Scale Hypertextual Web Search Engine
Sergey Brin and Lawrence Page
{sergey, page}@cs.stanford.edu
http://www-db.stanford.edu/~backrub/google.html
http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm
Efficient Crawling Through URL Ordering
Junghoo Cho, Hector Garcia-Molina, Lawrence Page
{cho,hector,page}@cs.stanford.edu
http://www-db.stanford.edu/~cho/crawler-paper/
WSQ: Web-Supported (Database) Queries
Roy Goldman, Jennifer Widom
http://www.google.com/search?q=cache:K59eKNoud1AC:www-db.stanford.edu/wsq/++site:www-db.stanford.edu+google+stanford+paper
Finding near-replicas of documents on the web
Narayanan Shivakumar, Hector Garcia-Molina
{shiva, hector}@cs.stanford.edu
http://www-db.stanford.edu/~shiva/Pubs/web.ps
Et cetera:
http://www.google.com/search?q=+site:www-db.stanford.edu+google+stanford+paper
There's also
The PageRank Citation Ranking: Bringing Order to the Web
http://citeseer.nj.nec.com/page98pagerank.html
and an interview with Sergey Brin and Lawrence Page
http://www.guardian.co.uk/Archive/Article/0,4273,4336874,00.html
Weirdly enough, David Filo and Jerry Yang, the two guys who started
Yahoo!, were also Stanford PhD students (in 1994; I think the Google
paper was done in 1996-1998).
Interesting quote from the Google history page:
"Larry and Sergey continued to perfect Google's technology through the
first half of 1998. Following a path that would become a key part of the
Google way, they bought a terabyte of disks at bargain prices and built
their own computer housings in Larry's dorm room, which became Google's
first data center. Meanwhile, Sergey set up a business office and the
two began calling on potential partners who might want to license search
technology that worked better than any available at the time. Despite
the dotcom fever of the day, they had little interest in building their
own company around the technology they had developed.
Among those called upon was Yahoo! founder and friend David Filo. Filo
agreed the technology was solid, but encouraged Larry and Sergey to grow
the service themselves by starting a search engine company.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"When it's fully developed and scalable", he told them, "let's talk
again". Others were less interested in Google as it was now known. One
portal CEO told them, "As long as we're 80 percent as good as our
competitors, that's good enough. Our users don't really care about
search."
- http://www.google.com/corporate/history.html
Irony? (In a non-Alanis Morissette kind of way)
Simon