[Sigia-l] Findability - hierarchies
Lars Marius Garshol
larsga at garshol.priv.no
Tue Jan 28 17:38:38 EST 2003
* Lars Marius Garshol
|
| Faceted metadata essentially means replacing a single hierarchy by
| multiple hierarchies, with one hierarchy for each facet. Now,
| structurally that's actually the same as having a single hierarchy.
| (Just add a single node at the top linking all the facets and you've
| got it.)
* Cunliffe D. J.
|
| This is of course true, but we can trivially have a conceptually
| equivalent effect by adding an 'everything' category to whatever
| classification scheme we choose.
Not really. In fact, many ontologies have this built into them (like
Cyc's Thing class at the top of the class hierarchy) without it
turning them into a single tree.
| It does seem to misrepresent the purpose of facets however, which is
| to reflect fundamentally different aspects of the things we are
| categorising (such as business areas and geographical locations).
That's true, and that's why I qualified this in the paragraph
following the one you quoted. Still, the fact remains that the
division into facets is a very weak form of structure.
* Lars Marius Garshol
|
| And hierarchies fail on several counts in this regard:
* Cunliffe D. J.
|
| Many of these counts seem to impose a very restrictive view of
| hierarchies (or at least thesauri) in practice (as opposed to a pure
| theoretical viewpoint)
It's true that this speaks more to simple hierarchies than it does to
thesauri, but if you consider thesauri, the only point that does not
apply is the typing of relationships. And even with thesauri you only
get a few extremely restricted relationship types. That the "Kano
field location" is located in "Kano", or that "Ibsen was born in
Skien", is something you just cannot express.
| Many thesauri have typed relationships beyond the BT/NT hierarchical
| relationships. Often this typing is only weakly typed (in the
| programming sense) but there is a lot of interest in more strongly
| typed relationships.
Ontologies give you that, and much more to boot, so if that's what you
want I don't really see the point of using a thesaurus.
* Lars Marius Garshol
|
| - you cannot type the nodes, so a machine can't tell countries
| from diseases from people from animal species,
* Cunliffe D. J.
|
| But this is one of the things you get from facets - diseases and
| countries might well be modelled and categorized as fundamentally
| different things.
Sure you can do this, but again the end result is just a very weak
ontology. Essentially an ontology that consists of nothing but a class
hierarchy. Why bother?
* Lars Marius Garshol
|
| - you cannot assign properties to the nodes beyond one or two
| fixed kinds of names and perhaps some untyped URIs to content,
| and
* Cunliffe D. J.
|
| I'm not quite sure what you are meaning here - what type of
| properties might we want to store?
Anything. Email address, home page, number of inhabitants, date of
premiere, serial number, name in Swahili, ...
* Lars Marius Garshol
|
| - your relationships must form a tree.
* Cunliffe D. J.
|
| Even the BT/NT hierarchy at the centre of the taxonomy may sometimes
| be a polyhierarchy (more than one parent) rather than a strict
| hierarchy.
True. I don't have enough experience with thesauri to know whether
this is a bug or a feature. It seems to me that in something as weakly
structured as a thesaurus, attaching a node to multiple parents is a
bit dangerous and can quickly lead to chaotic structures.
| The introduction of other relationship types typically cuts across
| the hierarchy or across facets.
It does, but again it's too weak to be of much help. If you could
choose relationship types it would help, but then why don't you just
go ahead and add the other features of ontologies while you're at it?
| Even a strict hierarchy allows some limited automated reasoning,
| such as retrieval by semantic similarity.
True, but as you say that is very limited. You can't do structured
queries or logical inferencing or anything fancy.
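To make "limited" concrete, here is a rough sketch of the kind of
reasoning a strict hierarchy does support: similarity measured as the
number of BT/NT steps between two terms. The tree below is an assumed
geography fragment built from the place names used in this thread, and
the function names are made up for illustration.

```python
# Assumed fragment of a strict hierarchy: narrower term -> broader term.
BROADER = {
    "Nigeria": "Africa",
    "Morocco": "Africa",
    "Kano": "Nigeria",
}

def ancestors(term):
    """Return the term followed by its chain of broader terms up to the root."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain

def distance(a, b):
    """Count BT/NT steps between two terms via their closest shared ancestor."""
    up_a = ancestors(a)
    up_b = ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return steps_a + up_b.index(node)
    raise ValueError("terms are not in the same tree")

print(distance("Kano", "Nigeria"))   # one step up
print(distance("Kano", "Morocco"))   # up to Africa, then down
```

The tree can say that "Kano" is closer to "Nigeria" than to "Morocco",
but it has no way of saying what kind of thing any of them is, or why
they are related.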
* Lars Marius Garshol
|
| What you'd really want to say is something like this:
|
| "Oil services" is a "business area"
| "Oil surveying" is a "department"
| "Kano field location" is an "office"
| "Oil production" is a "department"
|
| "Africa" is a "continent"
| "Nigeria" is a "country"
| "Kano" is a "place"
| "Morocco" is a "country"
|
| "Kano field location" is "located in" "Kano"
* Cunliffe D. J.
|
| Which looks to me like a faceted hierarchy with a specialised type
| of related-to relationship representing the 'located in' concept.
Could you really express this in a faceted hierarchy? Note the
existence of 6 different classes/types, and the three different
relationship types (two of which are implicit here). If you can
express all that explicitly you're pretty close to a full ontology and
might as well use one.
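As a rough illustration of what "explicitly" buys you, here is a small
sketch of the statements above as typed topics and typed associations,
with one structured query on top. This is not any particular topic map
API. The "contained in" relationship name and the geographic links are
my own assumptions for the implicit relationships, and all the
identifiers are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Topic:
    name: str
    type: str  # the class of the topic, e.g. "office" or "country"

# The typed statements quoted above.
TOPICS = {t.name: t for t in [
    Topic("Oil services", "business area"),
    Topic("Oil surveying", "department"),
    Topic("Oil production", "department"),
    Topic("Kano field location", "office"),
    Topic("Africa", "continent"),
    Topic("Nigeria", "country"),
    Topic("Morocco", "country"),
    Topic("Kano", "place"),
]}

# (relationship type, source, target). Only "located in" is stated in
# the example; the "contained in" links are assumed for illustration.
ASSOCIATIONS = [
    ("located in", "Kano field location", "Kano"),
    ("contained in", "Kano", "Nigeria"),
    ("contained in", "Nigeria", "Africa"),
    ("contained in", "Morocco", "Africa"),
]

def enclosing_places(name):
    """Follow 'located in' and 'contained in' links transitively."""
    found = []
    frontier = [name]
    while frontier:
        current = frontier.pop()
        for rel, src, dst in ASSOCIATIONS:
            if (src == current and rel in ("located in", "contained in")
                    and dst not in found):
                found.append(dst)
                frontier.append(dst)
    return found

def topics_of_type_in(topic_type, place):
    """Structured query: all topics of a given type inside a given place."""
    return sorted(t.name for t in TOPICS.values()
                  if t.type == topic_type and place in enclosing_places(t.name))

print(topics_of_type_in("office", "Nigeria"))
```

The query "which offices are in Nigeria?" depends on both the node
types and the relationship types; take either away and it can no
longer be asked.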
| Maybe we are just arguing semantics again. An interesting question
| is where we draw the line (if indeed we do) between a thesaurus and
| an ontology. Do we need to go as far as Doug Lenat's Cyc or can we
| develop more pragmatic solutions even if they only apply to
| specialised domains?
Now that *is* an interesting question. It is definitely true that
doing a full ontology requires more work, at least up front, and
people certainly do find it somewhat more scary than thesauri.
The answer is that even if you do use an ontology you can choose your
level of ambition. One good example of this is this site:
<URL: http://www.forskning.no/ >
which uses topic maps, but with a very weak ontology. They basically
have two thesauri with weak cross-links, and then some strongly typed
additions to that (person, institution, employed-by, article series,
next/previous article, and some more stuff).
I wouldn't claim that the site is revolutionary, but they did manage
to build a pretty good site very quickly by choosing a powerful
technology and then deciding how much of its power to make use of.
They can do a *lot* more with this site without changing the
underlying technology.
(Just for the record: I work for a company which makes its living from
selling topic map software. This particular site is using an open
source product, however.)
--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >