[Sigia-l] On Evangelism, and How it Affects Enterprise

Andrew McNaughton andrew at scoop.co.nz
Thu Sep 26 20:01:33 EDT 2002


On Thu, 26 Sep 2002, Chris Chandler wrote:

> Derek,
>
> Someday I hope you'll share the URL of the script you use to
> come up with this stuff.

Firstly I should say that while I didn't manage to get much out of
Derek's writing either, my intention here is not to be rude.  Rather, this
looks like a useful launching point for a bit of a ramble through some
interesting ideas, particularly relating to automated news processing.

Simple Markov chain based text generators do surprisingly well at
generating semi-coherent text from arbitrary text input.  The
generator takes a body of text as input and builds up statistics so that
given any n-word sequence (n-gram) appearing one or more times in the
input text (where n for practical purposes is usually two or three), the
probability distribution of the word that follows is known.  The generator
is then started from some n-gram from the text and proceeds to choose the
next word based on the collected probabilities.  The last n words of the
output text are then used to produce the next word, and so forth.  You can
find a Perl implementation of this at
http://perlmonks.thepen.com/55851.html
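
For anyone who would rather read Python than Perl, here is a minimal
sketch of the generator as described above.  It is not a port of the code
at that link; the function names (build_model, generate) and the seed
parameter are made up for illustration.  Followers are stored in a plain
list with repeats, so picking uniformly from the list reproduces the
observed frequencies.

```python
import random
from collections import defaultdict


def build_model(text, n=2):
    """Map each n-word sequence in the text to the words seen following it.

    Repeated followers are kept, so a uniform choice from the list
    samples the next word according to its observed frequency.
    """
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - n):
        model[tuple(words[i:i + n])].append(words[i + n])
    return dict(model)


def generate(model, length=50, seed=None):
    """Start from some n-gram in the model and extend it word by word."""
    if not model:
        return ""
    rng = random.Random(seed)      # seed is only here to make runs repeatable
    start = rng.choice(sorted(model))
    n = len(start)
    out = list(start)
    while len(out) < length:
        followers = model.get(tuple(out[-n:]))
        if not followers:          # this n-gram only occurred at the end of the input
            break
        out.append(rng.choice(followers))
    return " ".join(out)
```

Feeding it a long text (a book from Project Gutenberg, say) and calling
generate(build_model(text, n=2), length=200) should give the sort of
locally plausible, globally wandering output described below.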

I put together a screen saver for my laptop which uses Grimm's fairy
tales as input to this text generator.  At any given point in the produced
text, the context for a few words either side seems to flow normally, but
over a longer interval, the text wanders around through the source
material in a more or less random fashion.  The experience of reading it
is a lot like reading a book when you're really thinking about something
else.  The text passes through your mind, but nothing sinks in, and you
find you have no idea what it is you just read.  It is also very similar
to what a good knock on the head can do to you.  You do know a good deal
about what the text is about, but at the same time you have very little
idea of what is being said.  I use Grimm's fairy tales as source text
because familiarity with what the original said makes the output quite
humorous at times.

This interests me because it gives some intuitive sense of
what gets thrown away when text is represented in an indexed form for a
search engine.  I'm working specifically on indexing recurrent word
sequences for handling news material, so this representation is very close
to throwing away the right stuff for the approach I'm taking.  I guess to
get a sense of what's contained in a more typical inverted text index
you'd do better to simply randomise the word order of your input.
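
To make that concrete, here is a rough Python sketch comparing the two
representations.  It assumes the simplest kind of inverted index, one that
maps each word to the documents containing it and stores no word
positions; the function names and the example sentences are invented for
the illustration.

```python
from collections import defaultdict


def inverted_index(docs):
    """Map each word to the set of document ids it appears in (no positions)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return dict(index)


def sequence_index(docs, n=2):
    """Map each n-word sequence to the set of document ids it appears in."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        words = text.lower().split()
        for i in range(len(words) - n + 1):
            index[tuple(words[i:i + n])].add(doc_id)
    return dict(index)


original = {"d": "the wolf ate the grandmother"}
reordered = {"d": "the grandmother ate the wolf"}

# The plain word index cannot tell the two documents apart...
same_words = inverted_index(original) == inverted_index(reordered)
# ...while the word-sequence index can.
same_sequences = sequence_index(original) == sequence_index(reordered)
```

Here same_words comes out True and same_sequences comes out False: the
plain index has thrown away who ate whom, while the sequence index keeps
at least the local word order.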

I really don't have a clue what most of the concepts Derek strung together
were, or how they related to each other.  It's fairly apparent that he
attaches a lot of meanings to words that are different to the meanings I
associate with the same words. (Assuming of course that this isn't in fact
pseudo-random text). That Derek is aware that he is using special meanings
for words and phrases is indicated by his quoting of particular phrases
and addition of explanatory(?) bracketed supplements.  None of this really
helps my comprehension, but it does focus attention on where the meaning
might lie.

It's rather analogous to the situation with computers processing text
according to content.  Given a whole lot of documents like this, processed
by the Markov filter I described above, I could probably manually cluster
them into some sort of set of categories based on the combinations of
words and phrases found in the documents.  It's doubtful that the
arrangement I came up with would map well to the conceptual organisation
of the author, but to the extent his use of words is consistent it would
probably make sense to a degree.  This is a fundamental problem in
automated clustering of documents.  The clusters formed by most algorithms
don't map well to the conceptual maps used by humans.  The problem doesn't
lie in limited algorithms so much as in limited information being available
to the software.  The algorithms just don't know enough about the concepts
that are represented.
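
As a rough illustration of clustering on nothing but shared words and
phrases, here is a small Python sketch.  It is a stand-in rather than any
particular published algorithm: it uses Jaccard overlap between each
document's set of words and two-word phrases, and a greedy single pass
with an arbitrary threshold.  All names and the threshold value are
assumptions made for the example.

```python
def features(text, n=2):
    """Return the set of words and n-word phrases found in the text."""
    words = text.lower().split()
    phrases = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    return set(words) | phrases


def jaccard(a, b):
    """Size of the overlap between two sets divided by the size of their union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.3):
    """Greedy single pass: join the first cluster whose first member is
    similar enough, otherwise start a new cluster."""
    feats = {doc_id: features(text) for doc_id, text in docs.items()}
    clusters = []
    for doc_id in docs:
        for group in clusters:
            if jaccard(feats[doc_id], feats[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            clusters.append([doc_id])
    return clusters
```

Note that nothing in here knows what any word means.  Two documents that
use the same words for different concepts will land together, and two
that describe the same concept in different vocabulary will not, which is
exactly the limitation described above.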

There are a lot of other clues available in the source text, particularly
involving co-occurrence and hyperlinks, which can be processed in various
ways by uncomprehending algorithms, but I think it's safe to say that
we'll need humans for this task for a while yet.  And subject experts for
that matter, not just IAs.

Another interesting observation.  Google doesn't report on the world.  It
reports on trends in media content.  Google is unlikely to highlight the
fact that America conducted a nuclear test today, because the media has
not reported on it very widely, though Google has found stories about the
issue.  It seems clear to me that this is an important story in light of
the current discussion of weapons of mass destruction, but something like
what Google has built is not going to identify this as important.  This
sort of critical judgement requires human intelligence and values, and
will do for some time to come.

What interests me about automated processing of news is that it makes it
possible to go closer to the source.  For example, to collect press
releases from various organizations' sites rather than just using the
filtered and rewritten material on the news sites.  The hope is that the
aggregation process can be moved closer to the user, allowing a more
independent exploration of what's going on.  You still need an ability to
pick out the notable events of the day, but it's the ability to follow
through to related documents, and to pursue particular
conceptual/linguistic connections between documents that interests me.

Andrew McNaughton



> ----- Original Message -----
> From: "Derek R" <derek at derekrogerson.com>
> To: <sigia-l at asis.org>
> Sent: Thursday, September 26, 2002 9:34 AM
> Subject: [Sigia-l] On Evangelism, and How it Affects Enterprise
>
>
> >
> > Evangelism will allow your company to discover and acquire continuity --
> > so that it sees no obstacle and is stimulated to seek the means to lead
> > itself toward enterprise-wide goals.
> >
> > Much more powerful than a mission-statement, acting like a
> > consumer-brand, evangelism is akin to alignment (direction) --
> > antithetical to 'correction' or 'administration.'
> >
> > Evangelism is a 'measure to safeguard contributions' so that groups and
> > individuals within your enterprise, and with whom your company does
> > business, have the 'right knowledge' to orient and nurture themselves --
> > to remind them that they, and their processes, are genuine and capable.
> >
> > Evangelism is about keeping the crosshairs 'safely on-target' -- or
> > making the target 'easy-to-see' and comprehensible -- so that no effort
> > is required to hit your mark.
> >
> > To 'actively monitor' the status of culture, language, and communication
> > sharing within your enterprise is to provide 'work-flow insurance'
> > (fundamental unity) so that construction and execution of
> > enterprise-wide goals are maintained, and remain obstacle-free.
> >
> > This 'ethnographic activity,' or direct observation, minimizes
> > dependence on self-reporting and self-definition -- innumerable separate
> > things -- instead taking holistic views of the enterprise to reveal
> > emergent over-all direction (holistic-definition necessitates
> > holistic-cultivation).
> >
> > This evangelical approach provides your company with the time and
> > ability to 'react to itself' in a way which promotes
> > 'certainty-of-arrival,' (attention to purpose), without altering common
> > order.
> >
> > To identity language and communication differences within your company
> > is to remove confusion (fear) and provide congruence (confidence) so
> > that relationships (business) may flourish.
> >
> >
> >
> >
> > ------------
> > When replying, please *trim your post* as much as possible.
> > *Plain text, please; NO Attachments
> >
> > ASIST Annual Meeting:
> > http://www.asis.org/Conferences/AM02/index.html
> >
> > ASIST SIG IA website: http://www.asis.org/SIG/SIGIA/index.html
> > ________________________________________
> > Sigia-l mailing list -- post to: Sigia-l at asis.org
> > Changes to subscription: http://mail.asis.org/mailman/listinfo/sigia-l
>
