[Sigia-l] Teragram Answers - Followup
dbedford at worldbank.org
Tue Jan 11 16:31:04 EST 2005
Quick responses to the follow-up questions about Teragram.
I was curious about a couple of things... So what did you use before?
Human beings. We are using Teragram to capture persistent metadata, which
feeds into metadata stores. The search system and common reference sources are
distinct components in an open architecture. We use Oracle Intermedia search
against the metadata stores. We haven't changed our architecture or the tools
we're using. We've just inserted Teragram into some of the foundation
processes.
Was there a reason to switch to Teragram's tools?
Many reasons. Among the business drivers were the need to keep pace with
increased demand for metadata for electronic objects (throughput); the need to
provide metadata for all kinds of content (not just formal documents or
publications); the need to capture metadata in multiple languages; and the need
to classify to an institutional classification scheme and begin to do 'deep
conceptual indexing', in contrast to broader subject analysis.
How about the learning curve?
Depends on which learning curve you mean. We already knew what we wanted to do
with Teragram and it was pretty easy to implement it in Teragram. We've been
planning this for about seven years -- we were just waiting for a tool to come
along that would support our architecture. In our case, the learning curve is
not in working with Teragram itself; it is in training people how
to build the parts of our institutional profile. For example, the profile we
build with Teragram can include upwards of 15 attributes. One of these might be
the topic classification. The topic classification profile has 600+ classes,
each of which is implemented as a categorization entity. The learning curve
relates to teaching our team how to build the concept rules that enable
categorization. To accomplish this, we work with the stakeholders as well.
We also leverage the categories and concepts in search - but Teragram captures
them as metadata, and search uses the metadata.
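To make the idea of a categorization entity driven by concept rules concrete, here is a minimal sketch in Python. It is not Teragram's rule language or API; the class names, the concept terms, and the `min_hits` threshold are all hypothetical and chosen only for illustration. It shows the general shape: each class carries a set of concept terms, and a document is assigned the class when enough of those terms appear.

```python
import re
from dataclasses import dataclass


@dataclass
class CategoryRule:
    """One class of a classification scheme (hypothetical structure)."""
    name: str
    concepts: tuple      # concept terms that signal this class
    min_hits: int = 2    # assumed threshold: distinct terms required


def categorize(text, rules):
    """Return the names of all classes whose rules fire on the text."""
    lowered = text.lower()
    assigned = []
    for rule in rules:
        # Count how many distinct concept terms occur as whole words/phrases.
        hits = sum(
            1
            for term in rule.concepts
            if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
        )
        if hits >= rule.min_hits:
            assigned.append(rule.name)
    return assigned


# Hypothetical classes; a real scheme would have hundreds of them.
RULES = [
    CategoryRule("Water Supply", ("water supply", "sanitation", "irrigation")),
    CategoryRule("Education", ("primary school", "teacher training", "literacy")),
]

doc = "The project funds sanitation upgrades and rural water supply."
topics = categorize(doc, RULES)
print(topics)
```

The assigned class names would then be stored as metadata, and search would run against that metadata rather than against the rules themselves.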
The learning curve also depends on what you're trying to do and what kind of
profile you're building. For example, writing rules for ISSN, ISBN, Project ID,
Trust Fund #, Loan #, etc. depends largely on understanding the patterns and
writing concept extraction rules to match. Some rules are easier than
others.
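As an illustration of what pattern-based extraction for identifiers like these involves, here is a minimal sketch in Python using regular expressions. It is not Teragram rule syntax. The function names are hypothetical, and only ISSN and hyphenated ISBN-10 are covered, because their formats and check digits are publicly standardized. Project ID, Trust Fund #, and Loan # formats are institution-specific and are not guessed at here.

```python
import re

# Hypothetical patterns for illustration; not Teragram rule syntax.
ISSN_RE = re.compile(r"\b\d{4}-\d{3}[\dXx]\b")
# Hyphenated ISBN-10: group, publisher, title, check character.
ISBN10_RE = re.compile(r"\b\d{1,5}-\d{1,7}-\d{1,7}-[\dXx]\b")


def issn_is_valid(issn):
    """Check the ISSN check digit (weights 8..2, modulus 11)."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or not chars[:7].isdigit():
        return False
    total = sum(int(d) * w for d, w in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected


def isbn10_is_valid(isbn):
    """Check the ISBN-10 check digit (weights 10..1, modulus 11)."""
    chars = isbn.replace("-", "").upper()
    if len(chars) != 10 or not chars[:9].isdigit():
        return False
    if chars[9] not in "0123456789X":
        return False
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] == "X" else int(chars[9]))
    total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
    return total % 11 == 0


def extract_identifiers(text):
    """Find candidate identifiers by pattern, keep those that validate."""
    issns = [m.group(0) for m in ISSN_RE.finditer(text)
             if issn_is_valid(m.group(0))]
    isbns = [m.group(0) for m in ISBN10_RE.finditer(text)
             if isbn10_is_valid(m.group(0))]
    return {"issn": issns, "isbn10": isbns}


sample = ("See ISSN 0378-5955 and ISBN 0-306-40615-2; "
          "ISSN 1234-5678 is a typo.")
found = extract_identifiers(sample)
print(found)
```

The pattern finds candidates and the check digit filters out look-alikes, which is one reason some rules are easier than others: an identifier with a built-in checksum is easier to extract reliably than one defined only by its shape.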
How'd you grade it?
A+. Average cataloging time has dropped from 20 minutes to under 5 minutes.
We can also now process kinds of information we didn't have the resources to
handle before. It will also make a significant difference in cross-language
access if we can capture metadata in the original language of the document.
Providing all metadata in English, regardless of the language of the document,
introduces challenges in translating English metadata into other languages for
searching. We want to implement a cross-language thesaurus to support searching
against multilingual metadata.
Hope these answers help.
Best regards,
Denise Bedford, Ph.D.
Senior Information Officer
Information Solutions Group
World Bank
Washington DC 20433