[Sigia-l] Word HTML - money where my mouth is (was When Should a Manual be Web-based?)

Jon Hanna jon at spin.ie
Wed Mar 5 07:32:21 EST 2003


> You have completely missed the point of the HTML example. The point
> was NOT about how to turn invalid HTML into valid HTML.

My example does not turn invalid HTML into valid HTML. It defines a
hypothetical version of HTML in which your example would be valid (though,
looking at it now, it has a bug in that it has text directly in the body).
That disproves your assertion that one can tell the elements are wrong
without knowing or assuming a particular DTD. I can't tell that the
document isn't of such a version if I don't know what version it is of.

What you can and cannot do in HTML depends on the version. For example:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<title>title</title><p>test</p>

is a perfectly good document roughly equivalent to your example. However:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<title>title</title><p>test</p>

is complete gibberish.
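
To illustrate, here is a minimal Python sketch that checks XML
well-formedness only (not validity against the XHTML DTD), using the
standard library's xml.etree.ElementTree. The helper name is_well_formed
and the "wrapped" variant of the document are my own, for illustration;
the parser is not asked to fetch or apply the external DTD.

```python
import xml.etree.ElementTree as ET

XHTML_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    ' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
)


def is_well_formed(text):
    """Return True if text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The document from above: two top-level elements, so no single root.
bare = XHTML_DOCTYPE + "<title>title</title><p>test</p>"

# The same content with the tags that XML does not let you omit.
wrapped = XHTML_DOCTYPE + (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>title</title></head>"
    "<body><p>test</p></body></html>"
)

print("bare well-formed:   ", is_well_formed(bare))
print("wrapped well-formed:", is_well_formed(wrapped))
```

Passing this check says nothing about validity; it only shows that the
tag omission HTML 4.01 permits is already fatal under XML's rules.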

> The point was that HTML error can still be detected even when there is
> no reference to a published HTML version.

That is true. A document may fail to be well-formed by the rules of SGML.
Further, if we can deduce (perhaps from an <?xml?> declaration or
out-of-band information) that it is XHTML, we can check whether it is
well-formed by the stricter rules of XML. Your example is not an example
of either.

However, we weren't talking about how to detect HTML errors; we were talking
about how to detect their absence.

The one thing we can tell about your example document that makes it
erroneous HTML is that it lacks a DTD.
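
That one check is mechanical. A minimal sketch in Python, assuming the
standard library's html.parser; the class DoctypeFinder and the helper
find_doctype are names of my own, for illustration. It only reports
whether a document type declaration is present, and says nothing about
whether the rest of the document is valid against it.

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Records the first DOCTYPE declaration seen (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # html.parser passes the text between "<!" and ">" to this hook.
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def find_doctype(text):
    """Return the DOCTYPE declaration text, or None if there is none."""
    finder = DoctypeFinder()
    finder.feed(text)
    finder.close()
    return finder.doctype


HTML401 = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    ' "http://www.w3.org/TR/html4/strict.dtd">\n'
    "<title>title</title><p>test</p>"
)

print(find_doctype(HTML401) is not None)
print(find_doctype("<title>title</title><p>test</p>") is not None)
```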

> > > > > Mind you, web browsers do not require web pages with DTD
> > > > > declaration.  Unless Word acted up, Internet Explorer,
> > > > > Netscape Navigator, and Opera have no problem with the
> > > > > Word-produced HTML.
> > > >
> > > > Not strictly true.
> > >
> > > "True" versus "strictly true" is a game of semantic.
> > >
> > > The fact is that web browsers do not require web pages with DTD.
> >
> > No, the fact is that interactive UAs are required to make best
> > attempts at rendering incorrect HTML.
>
> Just because a web page does not have DTD, the HTML is incorrect?

Yes. This isn't Liber Al Vel Legis. You aren't meant to take the spec away
and meditate on it privately to arrive at your own personal interpretation.
It's a protocol designed to be used by communicating computers. Unless you
build a mind-reading library and release it under a liberal license so we
can all use it, the only way you can expect computers to know what you want
them to do is to use the language as it is defined.
The spec says it needs a DTD.
It needs a DTD.
RTFM: <http://www.w3.org/TR/html401/struct/global.html>

> You are confusing DTD with HTML.

Huh? I'm pretty sure that sentence is wrong, but I'm not sure exactly what
you are accusing me of confusing with HTML: the declaration, or the DTD
documents?

There are HTML rules that go beyond those encoded in the DTD. Compliance
with the DTD is a minimum for being valid; being valid is a minimum for
being good. If it isn't valid, it's just a bunch of bytes.



