[Sigia-l] Scaling content inventories

noreen whysel nwhysel at hotmail.com
Tue Oct 29 12:26:59 EST 2002


At my previous job (Big 5, 4, 3, 2?) I did ad hoc link audits and a 
quarterly contact email audit.  The link audits were generally on a per site 
basis, so I only needed to deal with a single database at a time.  I would 
run the link checking software, convert to Excel, sort by site and forward 
reports to the appropriate content team to clean up, or occasionally do it 
myself if the QA team was swamped.  Depending on the site, I could do the 
reporting portion in just a few hours.  If I had to do corrections myself, 
it could take a few days.  We also did quick-n-dirty audits based on the 
broken link data provided by our search service, but I am not sure what 
portion of the site was covered by this tool.  That generally could be done 
in a few days and would cover several of the websites.
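
As a rough illustration only, the "sort by site and forward" step described above could be scripted rather than done by hand in Excel.  The sketch below assumes the link checker can export CSV with the columns site, page_url, broken_link and status; those column names, the sample rows and the function name are hypothetical and do not come from any particular tool.

```python
import csv
import io
from collections import defaultdict


def group_broken_links(report_csv):
    """Group rows of a link-checker CSV export by site.

    Sites are returned in alphabetical order, and rows within each site
    are sorted by page address and then by the broken link itself.
    """
    by_site = defaultdict(list)
    for row in csv.DictReader(io.StringIO(report_csv)):
        by_site[row["site"]].append(row)
    return {
        site: sorted(rows, key=lambda r: (r["page_url"], r["broken_link"]))
        for site, rows in sorted(by_site.items())
    }


# Hypothetical export; the column names are assumptions for this sketch.
SAMPLE = """site,page_url,broken_link,status
tax,/tax/index.html,/tax/old.html,404
audit,/audit/team.html,/audit/bios.html,404
tax,/tax/about.html,http://example.com/gone,410
"""

if __name__ == "__main__":
    # One block per site, ready to forward to that site's content team.
    for site, rows in group_broken_links(SAMPLE).items():
        print(f"{site}: {len(rows)} broken link(s)")
        for r in rows:
            print(f"  {r['page_url']} -> {r['broken_link']} ({r['status']})")
```

Each per-site block could then be sent to the content team that owns that site.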

Since email was my primary concern, the quarterly audits covered the entire 
global site, something like 15 databases, varying from several hundred to a 
few thousand pages apiece (non-global databases, like extranets and certain 
"rogue" country sites, used a different feedback form - a headache I can 
save for another topic).  To do the audit, I created a Notes view showing 
the site owner and contact address.  This was the field that was accessed by 
the feedback form on our global site and directed content to the "page 
author".  I sorted the docs by page address, updated any new or corrected 
contact addresses, and highlighted names that did not follow the correct 
format.  (I cheated by collapsing duplicate names; I didn't want to bother 
IT for their limited resources.  IT also wouldn't give me resources to make 
the field validate on entry.  You'd be surprised how many people just put 
their initials or "web admin", even though the database clearly stated that 
the field was used to route webmail.)  It was then the content team's 
responsibility to fix any errors.  I could complete this audit in a couple 
of days.
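
For illustration, the format check in this audit could be sketched as below.  The post does not say what the "correct format" was, so the sketch assumes the contact field should hold something shaped like an email address; the pattern, the sample pages and the function name are all hypothetical.  Duplicate bad values are collapsed into one entry, much like the collapsed view described above.

```python
import re

# Deliberately loose check: something@domain.tld with no whitespace.
# This is an assumed rule, not the format the original database required.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def audit_contacts(pages):
    """Return {bad_contact: [page addresses]} for suspicious contact values.

    `pages` is an iterable of (page_address, contact) pairs.  Pages are
    processed in page-address order, and pages sharing the same bad value
    are collapsed under a single key.
    """
    flagged = {}
    for page_address, contact in sorted(pages):
        value = contact.strip()
        if not EMAIL_RE.match(value):
            flagged.setdefault(value, []).append(page_address)
    return flagged


# Hypothetical sample data.
PAGES = [
    ("/services/tax.html", "jane.doe@example.com"),
    ("/services/audit.html", "JD"),
    ("/about/offices.html", "web admin"),
    ("/about/careers.html", "JD"),
]

if __name__ == "__main__":
    for contact, addresses in audit_contacts(PAGES).items():
        print(f"{contact!r} used on {len(addresses)} page(s): {addresses}")
```

A check like this only flags values that look wrong; it cannot tell whether a well-formed address still reaches the right person.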

I didn't have any control over when or whether the content team did 
corrections, but soon found that several sites were not being corrected.  So 
I had to develop an awareness campaign that I tied to the release of the 
monthly lead management report.  I would highlight leads received via the 
site and point out any revenue that was connected with particular web leads.  
Something like, "XYZ dept announced $XX,000 contract linked to web mail lead 
on Jan XX" or "ABC Corp seeking consulting services...."  Anyway, I'd follow 
it with "Are you getting all your leads?" and a reminder to review the 
audits and correct the errors.  Errors slowly decreased, but this remained 
an important audit to do.

Also, the feedback database had a view showing email address errors, so if 
an error occurred, I could alert the content team ad hoc.

I believe the CMS the company uses now has much better content auditing 
functionality.  The demo I saw before the layoff all but claimed that it 
could correct link errors by itself, but I never got a chance to work with it.


Noreen Y. Whysel
Information Architect
nwhysel at hotmail.com

Knowledge Management
Intranets and E-Communities
>From: Margaret Hanley
>To: Donna Maurer, sigia-l at asis.org
>Subject: Re: [Sigia-l] Scaling content inventories
>Date: Tue, 29 Oct 2002 13:29:25 +0000 (GMT)
>
>I feel very well qualified to talk about content inventories and analysis 
>at the moment.
>
>I am managing the content audit, analysis and modelling of the BBC web 
>site. It contains (we think) 1M pages, 1000 web sites and growing.
>
>We are handling the sampling issue by doing a site audit first - looking at 
>the sites, main sections for normal and unusual content. By identifying 
>this content, we can then choose to do a really detailed content analysis 
>of the content that is different or slightly different to the norm to add 
>to our ever increasing "content object" library.
>
>To cover ourselves, to identify content that's not linked or hard to find, 
>we also do a disk scrape (for unlinked content) and web crawl (linked 
>content) so we can see if we are missing vital info. We actually have 
>really large amounts of content that are not linked, sometimes up to 2/3 
>of the total content.
>
>It seems to be working.
>
>Mags
>
>--- Donna Maurer wrote:
>> I have a feeling (and only a bit of data to support it), that once
>> you get over some number, the content is likely to have more
>> consistency than my 5000 different page Intranet, and may be able to
>> be listed out as a block (eg if there are minutes for weekly meetings
>> for the past 3 years, you probably don't need to list all of them).
>>
>> There are some times when you just don't need to know every page -
>> in my case I did because it all has to either move to a new system or
>> be deleted - I can't miss anything.
>>
>> Boy this was a time consuming process, but boy it was worthwhile.
>>
>> Donna
>> (of the famous content inventory)
>>
>> On 28 Oct 2002 at 11:54, Peter VanDijck wrote:
>>
>> > Content inventories are time intensive (DonnaM says 500 pages a day:
>> > http://www.maadmob.net/donna/blog/archive/000035.html#000035)
>> > and nessecary (no spell checking on this machine). What are your
>> > strategies for scaling them up? What if you have not 5000 but 50.000
>> > pages? At 500 pages/day/person, that would take 4 people a full
>> > month. What elements of the CI can be automated? For what parts do
>> > you *need* IA's to look at it? How does the client fit in? What bits
>> > can be done by temps? How do you assure accuracy?
>> >
>> > PeterV
>> > http://poorbuthappy.com/ease
>>
>> ------------
>> When replying, please *trim your post* as much as possible.
>> *Plain text, please; NO Attachments
>>
>> ASIST Annual Meeting: http://www.asis.org/Conferences/AM02/index.html
>> ASIST SIG IA website: http://www.asis.org/SIG/SIGIA/index.html
>> Searchable list archive: http://www.info-arch.org/lists/sigia-l/
>> ________________________________________
>> Sigia-l mailing list -- post to: Sigia-l at asis.org
>> Changes to subscription: http://mail.asis.org/mailman/listinfo/sigia-l
