
Why readable XML matters, even to RDF

(I'm lifting this out of comments; Danny, we both have pingback ;):

Thanks for quoting one of my better statements ;-) For a change of syntax to be justified, there would need to be pretty major improvements (and that's apart from the politics needed to get something through the W3C). Given the graph/tree mismatch I'm not sure this is even possible - personally I thought Tim Bray's RPV syntax was actually uglier than abbreviated syntax RDF/XML. If all you want is human-readable, Notation3's not bad, it's just not XML. I'd also question how important readability is for any data-oriented XML. RDF isn't XML but (early) HTML isn't XML either. Once the generators/parsers or whatever are set up, there shouldn't be much need to go poking around. In the case of RDF/XML, it may not be that simple, but I think there's enough legibility for maintenance purposes. I do think there's a bit of exaggeration going on too - it may not be optimum, but RDF/XML syntax isn't harder to work with than, say, XHTML(+CSS) or XSLT. Overall I just reckon effort will be better used at this point in time working on getting some good tools together, rather than worrying about the inelegance of RDF/XML syntax. If someone does come up with a neat XML syntax, great, if not, no big deal, adoption of RDF will just be a little slower.

Danny, I've agreed RPV isn't any better. And while N3 isn't XML, it's not RDF either so I'm happy to let it go by the by. I'd like to not be able to ignore n-triples. The rest of this post is where I'm coming from on the syntax thing.

I don't question how important readable XML is; I know it's important. Away from specland and in the trenches, the problems with XML often come down to the same old things - things like Namespaces, the DOM, too much XSLT to manage, memory usage. But truly the combination of hard to read data formats with tools designed to protect the developer from the raw XML data, or just plain hide it, amortizes all the other problems. Not being able to get at or work with the text drives me insane - I do not need to be protected from the stuff, and if I do, I shouldn't be using it.

In a past life, I never appreciated or understood HTTP until I worked with it directly (i.e. not through a CGI or a Servlet). Today it's obvious to me that one of the reasons that protocols like SMTP and HTTP have come to dominate is that they can be read outside a debugger or some vendor's tool. I can often solve HTTP issues on the job by looking at the traffic, something I can't always do through server tools and GUIs (this has actually happened twice this month). That makes HTTP valuable to me professionally in precisely the same way being able to build software outside an IDE with make or Ant is - it helps me to do my job, better and more flexibly, which ultimately is not to use HTTP, but to deliver software that meets a need. [The other characteristics in HTTP articulated by REST are important too over the life of a system, but for me, shipping with, and ongoing maintenance of, are primary characteristics of any software technology, especially in a networked environment]
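To make "working with it directly" concrete, here is a minimal sketch of an HTTP exchange handled as nothing but text. The helper names (`build_request`, `parse_response`), the host and the sample response are all made up for illustration; this is not captured traffic and not a complete HTTP implementation.

```python
# A raw HTTP/1.1 exchange is just lines of text separated by CRLF.
# Everything below (names, host, sample response) is an illustrative assumption.

def build_request(host: str, path: str = "/") -> str:
    """Compose a minimal HTTP/1.1 GET request as plain text."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the blank line that ends the header block
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# A hand-written response of the kind you might see when watching the wire.
SAMPLE_RESPONSE = (
    "HTTP/1.1 404 Not Found\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 9\r\n"
    "\r\n"
    "Not here\n"
)

if __name__ == "__main__":
    print(build_request("example.org"))
    print(parse_response(SAMPLE_RESPONSE))
```

The point is less the code than how little of it there is: the status line and headers can be read, and the problem often diagnosed, before any parser is involved.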

It's very hard to do that with some of the data-focused XML I've been seeing in the last two years. And I don't believe (like some perhaps) that's to do with anything inherent in being data-focused, just that the emphasis on API access and expectations about tool support results in a dissonant and I suspect unintended outcome - write-only XML.

The potential cost of dealing with write-only XML on the ground isn't worth it in my experience, and hand on heart I simply can't justify the risk of depending on tools to manipulate it. For these reasons, I'd shy away from recommending RDF in the heart of a system, in precisely the same way I would with XSLT or Perl, or some vendor's proprietary format. That's even where I think RDF might be a good fit. The claim that RDF/XML is not much worse than XSLT is not in its favour - I've seen projects get into trouble simply because there was a lot of XSLT and it got difficult to manage and organize. The only people I know that seem to be getting good use from RDF that would be near my current line of work are FourThought - but they wrote their technology to get there; I'm not sure I have the inclination or energy to do that just yet.

So when I'm whining about RDF/XML syntax, it's not just a matter of personal taste or being a curmudgeonly git. It's all very boring and practical really. If some technology is ungainly, it needs to be so for a reason - but RDF/XML is pure overhead - it adds no interesting expressive power, does not fall into the good enough category, and thus to me can't be rationalized by a charter, tools, or the fact that RDF is potentially valuable somewhere down the wire. The single exception might be that everyone else is using it - that's not the case.
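As a rough illustration of the overhead, the sketch below takes one statement written as RDF/XML and flattens it to a single N-Triples line using only the Python standard library. The subject URI, the Dublin Core title property and the `to_ntriples` helper are assumptions picked for the example. The function handles only the simplest shape (an `rdf:Description` with `rdf:about` and literal-valued children) and does no escaping, so it is nowhere near a real RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# One statement ("this post has a title") written as RDF/XML.
# The subject URI and the choice of dc:title are illustrative assumptions.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post">
    <dc:title>Why readable XML matters</dc:title>
  </rdf:Description>
</rdf:RDF>"""


def to_ntriples(rdf_xml: str) -> list:
    """Flatten the simplest RDF/XML shape into N-Triples lines.

    Only rdf:Description elements with rdf:about and plain literal
    children are handled; no escaping, datatypes or nesting.
    """
    root = ET.fromstring(rdf_xml)
    lines = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            ns, local = prop.tag[1:].split("}")
            lines.append(f'<{subject}> <{ns}{local}> "{prop.text}" .')
    return lines


if __name__ == "__main__":
    for line in to_ntriples(RDF_XML):
        print(line)
```

Six lines of XML carry one subject, one property and one value; the line-per-statement form carries the same three things and can be read with grep.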

May 25, 2003 06:46 PM

