[Sigia-l] Vivisimo

Andrew McNaughton andrew at scoop.co.nz
Fri Sep 6 18:57:30 EDT 2002




On Fri, 6 Sep 2002, Richard Hill wrote:

> [Forwarded for Glen Green.  Dick Hill]
>
> <http://vivisimo.com/>
>
> Does anyone have information, insights or thoughts regarding the Vivisimo
> product? (My company director saw a demo of it today and now believes it
> to be the solution to all taxonomy woes.)
>
> Thanks,
> Glen Green

I took a quick look.  I may have missed important stuff, but the
clustering tools seem to be what they highlight, and I zoomed in on that.

This system, however, has failed to impress me at all.  To assess its
effectiveness, I went to the demo that clusters Microsoft patents,
looking for:

1) Microsoft's IP in the area of automated document categorization
2) a patent which I know exists concerning use of repeated word sequences
   to identify copyright infringement.

I wasn't entirely encouraged to see the top-level categories under
Microsoft beginning with:

o Resources (121)
o Pixel, Color (114)
o Words (97)
...
[a massive long list continues, visible only after half a dozen clicks]

I suppose either of the things I set out to look for might be under
'Words', or possibly under 'Resources'.  Predictably, though, the
collections under those terms were a fairly random bunch.  I gave up after
about 15 minutes with no success.  If your boss is enamored of this
system, you should invite him to conduct a similar exercise.


Automated clustering tools have their place.  It's often the case that
manual categorization is not feasible.  It's also common enough for the
categories themselves to be shifting.

*Well-implemented* clustering is a lot better than nothing, and can give
you quite rapid insight into the content of a large pile of otherwise
unorganized documents.  It's not a silver bullet.  It's unlikely to be an
improvement on human categorization where the resources exist to do the
job by hand.  It is, however, relatively cheap, and it can be a useful
tool in the human cataloguing process.
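
As a rough illustration of what automated clustering does, here is a
minimal sketch in Python.  It is not Vivisimo's algorithm, which is not
described here.  It assumes a simple greedy scheme based on word overlap
(Jaccard similarity); the function names, the stopword list, the 0.2
threshold and the sample titles are all made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to", "with"}

def tokens(text):
    """Return the set of lowercase words in a document, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold=0.2):
    """Greedy single-pass clustering: each document joins the first cluster
    whose seed document is similar enough, otherwise it seeds a new cluster."""
    clusters = []  # list of (seed token set, [document indices])
    for i, doc in enumerate(docs):
        t = tokens(doc)
        for seed, members in clusters:
            if jaccard(seed, t) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((t, [i]))
    return [members for _, members in clusters]

def label(docs, members):
    """Label a cluster with its most frequent term and its size.
    Ties are broken alphabetically, so the label can be fairly arbitrary."""
    counts = Counter(w for i in members for w in tokens(docs[i]))
    term = min(counts, key=lambda w: (-counts[w], w))
    return f"{term.capitalize()} ({len(members)})"

# Hypothetical patent-like titles, invented for this example.
docs = [
    "Method for rendering pixel color on a display",
    "Pixel color correction for display hardware",
    "Detecting repeated word sequences in documents",
    "Categorizing documents by word sequences",
]

for members in cluster(docs):
    print(label(docs, members), members)
```

Even on this toy input the labels are chosen by term frequency alone, which
is one way a category name can end up saying little about what is inside
it.  A human cataloguer could still use the groupings as a starting point.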






