[Sigia-l] Scaling content inventories

Margaret Hanley mairead at yahoo.com
Wed Oct 30 05:15:21 EST 2002


Well...

We are like any other large company: the metadata actually
held in each page or against content depends on the
content author. Some add the metadata required by the
search standards; others (most) do not.

So we know roughly who owns the content (based on the org
structure and internal networks), but we are not sure
when the content was created, because of the way we
publish content.
 
For the content inventory, I was particularly interested in
the metadata, to give me an idea of how content authors
describe their content. I was also interested in the types
of content objects, to ensure that any CMS will handle the
content.
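
As a rough illustration of the metadata part of such an inventory,
here is a minimal sketch that pulls the <meta> name/content pairs out
of one page and reports which expected fields are absent. It is not
the tooling we used; the REQUIRED field names and the helper names
(extract_meta, missing_fields) are hypothetical, and a real search
standard would define its own list.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects non-empty <meta name=... content=...> pairs from one page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        content = attrs.get("content")
        # Treat an empty content attribute the same as a missing field.
        if name and content and content.strip():
            self.meta[name.lower()] = content.strip()


def extract_meta(html):
    """Return a dict of metadata the author actually filled in."""
    parser = MetaCollector()
    parser.feed(html)
    return parser.meta


# Hypothetical required fields; substitute whatever the standard mandates.
REQUIRED = ("description", "keywords")


def missing_fields(html, required=REQUIRED):
    """List the required fields that the page does not supply."""
    found = extract_meta(html)
    return [field for field in required if field not in found]
```

Run over every page in the inventory, the per-page results show both
how authors describe their content and how many pages skip the
metadata altogether.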

The reason we did both a web crawl and disk scrape was to
understand how much content we needed to migrate to the
CMS. Of course this will also be balanced by the business
requirements and (I hope) a content management policy.
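
One way the two lists can be put side by side is sketched below. This
assumes both the crawl and the disk scrape have been reduced to
comparable path strings; the function name compare_inventories and
the result labels are hypothetical, and this is only one possible way
to size the migration.

```python
def compare_inventories(crawled_paths, disk_paths):
    """Split paths into those found by both methods and those found by one."""
    crawled = set(crawled_paths)
    on_disk = set(disk_paths)
    return {
        # Reachable by links and present on the servers.
        "linked_and_on_disk": sorted(crawled & on_disk),
        # Files nothing links to: possibly orphaned or retired content.
        "on_disk_only": sorted(on_disk - crawled),
        # Crawled URLs with no matching file, e.g. generated pages.
        "crawled_only": sorted(crawled - on_disk),
    }
```

The size of each group gives a first estimate of what has to move to
the CMS and what may simply be left behind.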

Terminology  
Content objects - lowest level of content held within the
CMS, described with metadata and perhaps in XML if textual.

Hopefully I will write or present some of our learnings
from this exercise in the next couple of months.

Mags 
 --- Peter VanDijck <pvandijck at lds.com> wrote:
> Margaret Hanley wrote:
>
> > By identifying this content, we can then choose to do a
> > really detailed content analysis of the content that is
> > different or slightly different to the norm to add to
> > our ever increasing "content object" library.
>
> Apart from the content object library (types of content
> right?), what other info do you record about the content?
> Owner? Date? ... I'm also interested in finding out the
> different ways in which the CI is useful for the BBC: how
> is it used?
> Thanks!
> Peter
