[Sigia-l] Taxonomies & Navigation

Sun Aug 26 10:34:45 EDT 2007

Ziya wrote:
> The drastic and scalable solution lies in a totally different direction:
> eliminating 'documents' altogether by giving access to live, real-time
> 'views' of the data, customized just-in-time for specific uses and users.
> Multi-dimensional metadata can allow the architect to shape the 'view' in
> amazing ways, not possible with frozen docs, thereby eliminating the need
> for users to actually 'navigate'. Taxonomy dissolves into multi-dimensional
> (not faceted) metadata and rarely needs to be exposed to the user. Of
> course, there's a rules engine running in the background, orchestrating the
> whole thing.
>
> No docs: not much to store, version, maintain, archive, purge, etc. And no
> formal navigation, either.
>
> This gets complicated, if one's not used to dealing at this level of
> abstraction and integration, but the power therein is undeniable. I intend
> to write about it in detail in a blog or something, one of these months.

Another POV on scalability, complexity and overhead costs for a
non-document content model: (Caveat: my background is with very
large-scale systems, and that colors my POV)

Complicated, indeed. When attempting a fragmented content model in
support of a page-less experience and a document-less content
strategy, expect to build in the overhead you'll need for creating and
managing a significantly more complex data model.

It's important to think in terms of user objects. What this means is,
if your users will expect to find documents, like white papers, maybe
it's a good idea to keep them intact so that your system doesn't have
to build them each and every time a user requests one (i.e., by
clicking on the corresponding link). This is assuming that your
content is separated from your navigation components in the first
place. In the case of content types that are, in the real world,
documents, a larger content object from which elements can be selected
and extracted can be more useful than a disconnected pile of smaller
content objects that need to be assembled in order to display
something coherent.

One of the reasons, and tying this back to taxonomy and metadata, is
that you'll ultimately need at least 2 "taxonomies" for "content
type".
1) The first to support your users -- to catalog the types of content
and information objects they came to your site to find and explore in
the first place, regardless of how these things are stored and managed
in the back end. This may include document types, but may also include
things like tools, search interfaces & search results, dashboards...
anything from the real or digital world, that they may expect to find
or access on your site.

2) The second to support your authors and content owners -- to enable
their user experience with the CMS. "I want to upload a press release"
or "I need to modify my product description -- features &
benefits"...

Since both of the above are about "perceived" content types (sometimes
rolling up to genre-like categories), you'll also need some
classification of the digital nature of the content and of the posture
of the content. Both of these dimensions have something to do with the
concept of "format".
3) Digital nature -- this is like MIME type, but in my experience,
I've needed to create my own list of allowed values... it's easiest to
just think of it as MIME type or data type and to figure out what the
particular needs of your project, site and/or systems are when the
time is right.
4) Posture -- the interaction paradigm that the content follows. For
example, "webcast" is a posture, not a content type, as in #1 above. A
webcast is the paradigm in which an event, interview, earnings
announcement, etc. is delivered. (Event, interview, and earnings
announcements are examples of #1)
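
To make the four dimensions above concrete, here is a minimal Python
sketch of a single content record that carries all of them. The class
names, field names, and allowed values are hypothetical, chosen only to
mirror #1 through #4; a real project would define its own lists.

```python
from dataclasses import dataclass
from enum import Enum


class UserContentType(Enum):
    """#1: what site visitors come looking for (hypothetical values)."""
    EVENT = "event"
    INTERVIEW = "interview"
    EARNINGS_ANNOUNCEMENT = "earnings announcement"
    WHITE_PAPER = "white paper"


class AuthorContentType(Enum):
    """#2: what authors and content owners pick in the CMS."""
    PRESS_RELEASE = "press release"
    PRODUCT_DESCRIPTION = "product description"


class DigitalNature(Enum):
    """#3: MIME-like, but a locally maintained list of allowed values."""
    HTML = "text/html"
    PDF = "application/pdf"
    VIDEO = "video/mp4"


class Posture(Enum):
    """#4: the interaction paradigm the content follows."""
    WEBCAST = "webcast"
    DOWNLOAD = "download"
    PAGE = "page"


@dataclass(frozen=True)
class ContentRecord:
    title: str
    user_type: UserContentType
    author_type: AuthorContentType
    digital_nature: DigitalNature
    posture: Posture


# An earnings announcement (#1) delivered as a webcast (#4).
record = ContentRecord(
    title="Q2 earnings call",
    user_type=UserContentType.EARNINGS_ANNOUNCEMENT,
    author_type=AuthorContentType.PRESS_RELEASE,
    digital_nature=DigitalNature.VIDEO,
    posture=Posture.WEBCAST,
)
print(record.user_type.value, "delivered as", record.posture.value)
```

Keeping the four dimensions as separate fields is what lets "webcast"
stay a posture rather than leak into the list of content types.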

That's complicated enough.

5) Now, if you're also going to deconstruct your #1s into
paragraph-level fragments, you're going to need standards and a data
model to enable (re-)compilation of these fragments into something
meaningful to end-users. In truth, you would need to label
paragraph-level objects even if they are part of a document-based
content model, as you're most likely storing these in XML -- you'd
need some kind of <paragraph> tag to encapsulate each content block.
The difference is that now you're going to need to further type these
blocks (i.e., what type of paragraph? ...similar to the spirit of #1),
and you're going to probably need multiple systems to understand this
expanded vocabulary. What I mean by this is that it's no longer only your
"local" CMS that needs to know about the granular intra-document
objects, but also your rendering system, search system, etc... of
course, all depending on when and where, in your system landscape, you
compile these fragments into something meaningful for the user.
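
As a rough illustration of #5, the sketch below types each
paragraph-level fragment and recompiles a document, or a filtered view
of it, from a flat pile of fragments. The Fragment fields, the "kind"
values, and the compile_view helper are all assumptions made for the
example, not part of any particular CMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    doc_id: str   # which larger object the fragment belongs to
    order: int    # position within that object
    kind: str     # hypothetical paragraph type, e.g. "summary", "benefit"
    text: str


def compile_view(fragments, doc_id, kinds=None):
    """Reassemble one document's fragments in order.

    If `kinds` is given, only fragments of those types are included,
    which is the "extract and recombine" case.
    """
    selected = [
        f for f in fragments
        if f.doc_id == doc_id and (kinds is None or f.kind in kinds)
    ]
    selected.sort(key=lambda f: f.order)
    return "\n\n".join(f.text for f in selected)


# Fragments stored out of order and mixed with another document's.
fragments = [
    Fragment("wp-1", 2, "benefit", "Benefits text."),
    Fragment("wp-1", 1, "summary", "Summary text."),
    Fragment("pr-7", 1, "summary", "Other doc."),
]

full_document = compile_view(fragments, "wp-1")
summary_only = compile_view(fragments, "wp-1", kinds={"summary"})
```

Every system that calls something like compile_view has to agree on
the same "kind" vocabulary, which is where the cross-system overhead
comes from.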

Of course, this is easier if the content at hand is born of the
digital world in the first place, like the content of micro-blogs. But
keep in mind that even micro-blog entries, as fragmented and
disconnected from each other as they tend to be, may also be parts of
a larger "conversation" or "event"...

(Sidebar: it's important to consider, but not assume, that compiling
fragments into complex objects will happen on the fly with the help of
portal software. This "on the fly" process may be resource intensive,
and your architects may ultimately opt for pre-compiling and caching
certain combinations of fragments and objects.)

So to sum it up, my word of caution is -- before you start unpacking
your content into a fragmented model, do some cost-benefit analysis
and consult with your system architects, data architects and
developers. The bigger your system and ecosystem, the more expensive
managing smaller pieces becomes. Not to say that there isn't benefit
(or beauty) to this level of complexity. This is more to say that
scaling your content model requires deep scrutiny -- what MUST be
shared, extracted, and/or compiled in real or near-real time based
on rules (for both operational efficiency and the users' benefit, of
course) vs. what CAN be... very different questions to answer.

thanks,
Ruth