[Sigia-l] data as information?

Conal Tuohy Conal.Tuohy at vuw.ac.nz
Thu Jun 30 20:09:11 EDT 2005


Dan Saffer:
> > [a datum is] meaningless until it's part of a relationship with
> > other data.

Listera:
> This makes no sense at all to me. Since data is not going to 
> connect itself to other data autonomously, you're going to do 
> that yourself. Well, at that point why do you call it data? 
> It's already got a context, relationships, etc. So how does 
> the assertion: "[Data] is the evidence of a relationship 
> between two things" make any sense? How's that different from 
> what some here call information?

I've been reluctant to contribute to this thread because it seemed to me
that it wasn't sufficiently grounded in a common vocabulary, hence
there's way too much talking at cross-purposes. Because all these terms
are used informally in really loose and ambiguous ways, everyone is
really thinking of different concepts when they use the same words. In
these circumstances you can't actually have a useful discussion.

Earlier in the thread Jan Jursa mentioned Shannon and the definitions
which "Information Theory" gave to these terms. For those who haven't
come across it, it's a theoretical framework for thinking about
communication, in quantitative terms. Signals, symbols, data, messages,
senders and receivers, information, noise, redundancy ... it's all
there. I don't claim to be an expert in it myself, but it surprises me
that there's been so little reference to it in this discussion. It must
be another universe of discourse! Of course it is really useful for
theorising about machine-to-machine communication, but it's also 100%
relevant to this thread. 

Anyway ... for those not familiar with it, in terms of "Information
Theory", there's no dichotomy between "data" and "information". They are
distinct concepts, but they are not disjoint.

In the theory, "data" are the symbols of whatever type. They may or may
not be meaningful. In the digital world of IA they are always bits at
the end of the day. In terms of Information Theory, data can carry (or
contain, or be) information. By definition, a datum is "information" to
the extent that it can induce a change in the internal state of the
receiver of the datum. If you're looking up a departure time on a bus
timetable, and someone points out that Ouagadougou is the capital of
Burkina Faso, that will probably have no effect on you ... in that sense
it's not "information". Humans are complex creatures, though, and you may
well squirrel away that bit of data for later use in crossword puzzles.
If so, then it WAS information.
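
A toy way to picture that definition, as a minimal Python sketch: the
Receiver class, its "relevant_topics" filter and the topic labels are
all made up for illustration and are not part of Information Theory
itself. A datum counts as information only if receiving it changes the
receiver's stored state.

```python
class Receiver:
    """Hypothetical receiver: remembers only data on topics it cares about."""

    def __init__(self, relevant_topics):
        self.relevant_topics = set(relevant_topics)
        self.known = set()  # the receiver's internal state

    def receive(self, topic, datum):
        """Return True if the datum changed the internal state."""
        if topic not in self.relevant_topics or datum in self.known:
            return False  # ignored or already known: no change, no information
        self.known.add(datum)
        return True


fact = "Ouagadougou is the capital of Burkina Faso"

commuter = Receiver({"bus"})
crossword_fan = Receiver({"bus", "geography"})

print(commuter.receive("geography", fact))       # False: state unchanged
print(crossword_fan.receive("geography", fact))  # True: state changed
print(crossword_fan.receive("geography", fact))  # False: nothing new this time
```

The same datum is information to one receiver and not to the other, and
stops being information once it is already known.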

Hence the "information content" of data is crucially dependent on the
receiver. If you receive data containing a coded message, and you don't
have the key to the code, then the data will not be information (to you,
though the same data may be information to someone else). Or at least,
the data will carry SOME information to you, namely that there was SOME
message rather than NO message, but that's not a lot of information
(it's at most 1 bit, I guess, since it only settles a yes/no question).
This is where context (e.g. the key to the code) comes into it.
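
To put a rough number on the "some message rather than no message"
case, here is a small Python sketch using Shannon's standard measures.
The probabilities are assumptions standing in for what the receiver
expected beforehand, and the function names are just illustrative.

```python
import math


def surprisal_bits(p):
    """Information, in bits, gained on observing an event of probability p."""
    return -math.log2(p)


def binary_entropy_bits(p):
    """Average information, in bits, of a yes/no outcome with P(yes) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome tells the receiver nothing
    return p * surprisal_bits(p) + (1 - p) * surprisal_bits(1 - p)


# Receiver had no idea whether a message would turn up: a full bit.
print(binary_entropy_bits(0.5))   # 1.0

# Receiver was almost sure a message would turn up: far less than a bit.
print(binary_entropy_bits(0.99))  # about 0.08
```

So even that one bit depends on the receiver: the less surprising the
arrival of a message, the less information its arrival carries.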

So data exist objectively and independently, but since "data" in this
theory are only "information" to the extent that they can have an effect
on a receiver, "information" is a user-centric concept. You can go
further: if you define the "receiver" to be a task-focussed human in
some particular "mode" (rather than a complete person), then what
constitutes information can be defined in terms of the task and the
receiver's pre-existing knowledge of the task. This user-centredness is
essential IMHO.

There's a lot more to Information Theory, and not all of it is that
relevant here, so I'll leave it at that. HTH someone!

Cheers

Con


