Mark Pilgrim:

"XML on the Web has failed. Miserably, utterly, completely. Multiple specifications and circumstances conspire to force 90% of the world into publishing XML as ASCII. But further widespread bugginess counteracts this accidental conspiracy, and allows XML to fulfill its first promise ("publish in any character encoding") at the expense of the second ("draconian error handling will save us")".

Aside from Atom, where there was a long-running discussion on XML + HTTP + RFC3023, I have some experience of this (along with Sean). In Ireland there exists an eGovernment integration messaging hub called the IAMS. The envelope format is XML, but it's subsetted to be ASCII-only on the wire - anything else is rejected. In truth this was done originally because one of the first two agencies on the hub insisted upon it, but the format has not been upgraded to allow other encodings, and that is due in some part to the current state of incoherence between MIME and XML in application protocols. Yes, borked encoding is an issue, but not enough for us all to play Cassandra just yet.
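For illustration, an ASCII-only rule of that kind could be enforced along the following lines. This is a minimal sketch, not the IAMS implementation: the function name `accept_envelope` and the sample element names are hypothetical, and the only assumptions are that the hub sees the raw bytes and that non-ASCII characters may still travel as numeric character references.

```python
import xml.etree.ElementTree as ET


def accept_envelope(payload: bytes) -> bool:
    """Hypothetical gatekeeper: accept only well-formed XML whose
    on-the-wire bytes are all 7-bit ASCII."""
    try:
        # Any byte >= 0x80 fails here, whatever encoding was intended.
        payload.decode("ascii")
    except UnicodeDecodeError:
        return False
    try:
        # Well-formedness check only; no schema or envelope validation.
        ET.fromstring(payload)
    except ET.ParseError:
        return False
    return True
```

Under this sketch, `b"<env><name>Se&#225;n</name></env>"` is accepted because the accented character is escaped as a character reference, while the same name sent as raw UTF-8 bytes is rejected.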

"The entire world of syndication only works because everyone happens to ignore the rules in the same way. So much for ensuring interoperability."

I'm not as pessimistic about this, as I think this speaks to the rules, not to interop - on the web the XML sky has not yet fallen in. And the rules are being looked at. Since Mark wrote (published? issued? created?*) his article, Paul Hoffman has informed the Atom WG that Apache will do the right thing with .atom files in a future release, Tim Bray has managed to persuade Microsoft to address the issue in a future release of IIS, and there is a new I-D obsoleting the Dread Pirate RFC3023 (specifically text/xml is gone). There is also a workaround that is acceptable in my mind and consistent with Postel's law - decouple HTTP and XML processing altogether. At the end of the day the situation is restricted to encoding arcana, which is just one facet of the XML value-add, and I suspect nothing like as substantial a part as Mark claims, a claim he has to make in order to bolster the argument that XML has failed. The situation with XML on the web appears better than that of HTML, and imo that is the practical benchmark - things have never been better.
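To make the encoding arcana and the workaround concrete, here is a rough sketch. The first function reflects my reading of RFC3023 (an explicit charset parameter wins, and text/xml without one defaults to us-ascii); the second is the decoupled approach, where the HTTP header is ignored and the parser works from the bytes. Both function names are hypothetical, and the header parsing only covers simple cases.

```python
import xml.etree.ElementTree as ET
from email.message import Message


def rfc3023_charset(content_type: str):
    """Charset implied by the Content-Type header alone, or None when
    the decision is left to the XML document itself."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset is not None:
        # An explicit charset parameter is authoritative.
        return str(charset).lower()
    if msg.get_content_type() == "text/xml":
        # text/xml with no charset parameter is treated as us-ascii.
        return "us-ascii"
    # e.g. application/xml: fall back to BOM / encoding declaration.
    return None


def parse_decoupled(body: bytes) -> ET.Element:
    """Workaround: ignore the HTTP header entirely and let the XML
    parser detect the encoding from the bytes it was given."""
    return ET.fromstring(body)
```

A UTF-8 feed served as plain `text/xml` shows the conflict: `rfc3023_charset("text/xml")` gives `"us-ascii"`, so a strict consumer must choke on the first accented character, while `parse_decoupled` reads the document's own encoding declaration and carries on.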

This does strike at one issue with the way the Internet and Web specify their technologies - overall architectural consistency can get put on the long finger. Usually this is a good thing, but sometimes it leaves room for incoherence as formats and protocols are combined for new uses. And we will see another problem like this between media types and application protocols in the future. Rather than encodings, it will be to do with the incoherence between media types and extensions to RDF semantics.

* Apologies - it's something of an Atom joke.

July 25, 2004 01:58 PM

