[Sigia-l] Google clusters

Listera listera at rcn.com
Wed Oct 13 01:11:29 EDT 2004

The Semantic Web (SW) comes up often on this list. There's a whole bunch of
tools and proposed standards for drawing relationships between entities, to
make SW more than a highfalutin concept. Drawing those relationships,
unfortunately, is largely a manual process that's hard to scale. We already
know what Google's algorithmic approach did to search in the marketplace.
Now it appears that Google is about to do to SW what it did to search:

Google Sets Sights on Clustering, Translation

In clustering, [Google director of search quality, Peter] Norvig
demonstrated a six-month-old project called "named entities abstraction,"
where Google's researchers are analyzing the company's large Web index to
extract entities, such as the name of a company, from the structure of content
and then decipher their relationship to one another.
For example, Norvig said, researchers are looking for ways to break down
sentences by looking for a phrase like "such as" and grabbing the names that
follow it. The goal is to not only pull out the name but also its clusters,
so that a name such as "Java" can be associated both with the computer
language and with language in general, Norvig said.

"We want to be able to search and find these [entities] and the
relationships between them, rather than you typing in the words
specifically," Norvig said.
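
To make the "such as" trick from the excerpt concrete, here is a rough sketch in Python. It is only an illustration of the idea as described, not Google's actual method: the function names (extract_pairs, cluster), the regular expressions, and the sample sentences are all made up for this example, and "capitalized word" is used as a crude stand-in for "name".

```python
import re
from collections import defaultdict

# Matches "<category> such as <list>", stopping at sentence punctuation.
SUCH_AS = re.compile(r"(\w+),? such as ([^.;:!?]+)")
# Leading run of capitalized words; a crude approximation of a "name".
NAME = re.compile(r"[A-Z][\w-]*(?: [A-Z][\w-]*)*")

def extract_pairs(text):
    """Return (category, name) pairs found via the "such as" pattern."""
    pairs = []
    for category, tail in SUCH_AS.findall(text):
        # Split the list that follows "such as" on commas and "and"/"or".
        for item in re.split(r",|\band\b|\bor\b", tail):
            match = NAME.match(item.strip())
            if match:
                pairs.append((category.lower(), match.group(0)))
    return pairs

def cluster(pairs):
    """Group categories by name, so one name can belong to several clusters."""
    clusters = defaultdict(set)
    for category, name in pairs:
        clusters[name].add(category)
    return dict(clusters)

# Hypothetical sample text, written for this example.
sample = (
    "Programming languages such as Java, Python and Perl are popular. "
    "Islands such as Java and Sumatra are part of Indonesia."
)

for name, categories in sorted(cluster(extract_pairs(sample)).items()):
    print(name, sorted(categories))
```

With this sample, "Java" ends up in both the "languages" and "islands" clusters, which is the kind of multiple association the article describes. A real system would need far better name detection and far more text than a single regular expression can handle.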


Google... still separating IAs from potential jobs? :-)

Nullius in Verba 
