
The road to Damascus: Jon Udell and RDF

Jon Udell: An RSS/RDF epiphany.

Arguably I should get a life :-), but for me this remark was an epiphany. I've long suspected that we won't really understand what it means to mix XML namespaces until we do some large-scale experimentation. What I hadn't fully appreciated, until just now, is the deep connection between RDF and namespace-mixing. Dan's original hard-line position, he now explains, was that there is no sane way to mix namespaces without some higher-order model, and that RDF is that model. That he is now modulating that position, and saying that none of us yet knows whether or not that is true, strikes me as both intellectually honest and potentially a logjam-breaker.

One of the use cases for XML Namespaces was in fact RDF, which had a strong need (perhaps, at the time, the strong need) to mix vocabularies. In particular, it needed a way to embed URIs in a syntax that would not allow them. The solution was the QName. As it turned out, XML Namespaces are less than ideal for transporting URIs, and they are certainly not a sufficient mechanism to deliver RDF - you still need to finesse the XML into the infamous striped syntax, which often seems like a pointless burden.

But, to their credit, what the RDF people have always understood, and what the XML people of all walks who look to XML Namespaces still do not appreciate to this day, is that you can't mix and match with XML Namespaces without an underlying data model to unify the vocabularies. RDF has a model - it's the triple relation.
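To make the point about the triple relation concrete, here is a minimal sketch in plain Python, with no RDF library involved. The Dublin Core and FOAF namespace URIs are the real ones; the example.org resources, the data, and the helper names `objects` and `creator_names` are invented for illustration. It only shows that once two vocabularies are both expressed as (subject, predicate, object) statements, mixing them is a set union and one lookup function works across both.

```python
# Illustrative sketch: the example.org URIs and the data are made up.
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

post = "http://example.org/posts/1"
author = "http://example.org/people/alice"

# Statements that use one vocabulary (Dublin Core)...
dc_triples = {
    (post, DC + "title", "An example post"),
    (post, DC + "creator", author),
}

# ...and statements that use another (FOAF).
foaf_triples = {
    (author, FOAF + "name", "Alice"),
}

# Both are sets of (subject, predicate, object), so merging is set union.
graph = dc_triples | foaf_triples


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def creator_names(graph, subject):
    """Follow dc:creator, then foaf:name, crossing the two vocabularies."""
    names = []
    for creator in objects(graph, subject, DC + "creator"):
        names.extend(objects(graph, creator, FOAF + "name"))
    return names


print(creator_names(graph, post))
```

Nothing here knows what a title or a name means. The shared model only guarantees that the statements can sit in one structure and be walked the same way.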

We have to understand that XML Namespaces are not a technology for integrating and extending vocabularies - they're a technology for firewalling them, keeping them apart. Yet, without the shared model, there's not much point in doing this. Indeed, it's a fool's errand - for as soon as you've defined your set of names, job one is to turn around and integrate them with someone else's set of names. You may well think you need to be able to add new names without them banging into someone else's names, but what you really want to do is compose names to create new compound vocabularies. Integration is where the value is.

In response to Danny's article, Jon says:

Shouldn't we then substitute XML for RSS 2.0 in that sentence, and say there is no consistent way to interpret material from other namespaces in any XML document, period?
Shouldn't we then say, there is no reason to create any mixed-namespace XML document that is not RDF?

Yes we should say that, but that would be saying the Emperor has no clothes. Does anyone want to hear it?

And Jon is right, this does go beyond RSS. Web services for the most part are predicated on XML namespaced vocabularies, as are any number of behind-the-firewall integration efforts. In those worlds, there's historically been zero agreement on uniform content models, which is precisely why transformation is such an effective technology for integrating systems. Get the data into XML and start pipelining. And though neither the declarative nor the API/RPC school of integration may like the idea of chaining processes with XML, in my and my employer's experience, the results speak for themselves. In truth, XML Namespaces are incidental to a transformation architecture.
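As a rough illustration of "get the data into XML and start pipelining", here is a small sketch using Python's standard `xml.etree.ElementTree`. The `purchase` and `order` vocabularies, the stage names, and the `pipeline` helper are all hypothetical; a real integration would more likely chain XSLT transforms. Note that no namespaces appear anywhere, which is the sense in which they are incidental to this style.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def to_order_vocabulary(doc):
    """Stage 1: map a (hypothetical) supplier vocabulary onto our own names."""
    out = ET.Element("order")
    for line in doc.findall("line"):
        item = ET.SubElement(out, "item")
        item.set("sku", line.get("code"))
        item.set("qty", line.get("quantity"))
    return out


def drop_empty_items(doc):
    """Stage 2: keep only items with a positive quantity."""
    out = ET.Element(doc.tag)
    for item in doc:
        if int(item.get("qty")) > 0:
            out.append(item)
    return out


def pipeline(doc, stages):
    """Feed the document through each stage in order."""
    return reduce(lambda current, stage: stage(current), stages, doc)


source = ET.fromstring(
    '<purchase>'
    '<line code="A1" quantity="2"/>'
    '<line code="B7" quantity="0"/>'
    '</purchase>'
)

result = pipeline(source, [to_order_vocabulary, drop_empty_items])
print(ET.tostring(result, encoding="unicode"))
```

Each stage takes a tree and returns a tree, so stages can be added, removed or reordered without the others knowing.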

Finally, a plea to all concerned. Let's stop punishing RSS syndication for its success by asking it to carry the whole burden of XML usage in the semantic Web.

I don't think anyone's punishing RSS. But the RSS community had a shot to rewrite the web with RSS1.0 and blew it by confusing the simple with the simplistic. The RDF community too had their shot to rewrite the web with RSS1.0 and we blew it, by digging in with an XML syntax that nobody wanted. There's still bad blood from the RSS1.0 wars, but I firmly believe neither statement is controversial anymore, nor is this one - everybody lost.

At the end of the day all proposals to extend RSS via XML namespaces reduce RSS to SOAP - a carrier format for another content model you may or may not have the codec for. The problem with this is not immediately obvious since all we've really done with RSS is render and clickthrough, but that's changing now. The missing piece is the shared content model, which many people believe should be RDF.
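Here is a small sketch of what "a carrier format for another content model you may or may not have the codec for" looks like from the consuming side, again with Python's standard `xml.etree.ElementTree`. The feed, the `http://example.org/ns/unknown#` namespace, the `CODECS` table and the function names are all made up for illustration; this is not how any particular aggregator works.

```python
import xml.etree.ElementTree as ET

FEED = """<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:ex="http://example.org/ns/unknown#">
  <channel>
    <item>
      <title>Hello</title>
      <dc:creator>Alice</dc:creator>
      <ex:rating>5</ex:rating>
    </item>
  </channel>
</rss>"""

# Hypothetical "codecs": one handler per namespace this reader understands.
CODECS = {
    "http://purl.org/dc/elements/1.1/":
        lambda local, text: ("dublin-core", local, text),
}


def split_tag(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def read_item(item):
    """Return (understood, skipped) for the children of one RSS item."""
    understood, skipped = [], []
    for child in item:
        uri, local = split_tag(child.tag)
        if uri == "":
            # Core RSS elements carry no namespace.
            understood.append(("rss", local, child.text))
        elif uri in CODECS:
            understood.append(CODECS[uri](local, child.text))
        else:
            # Carried by the feed, but we have no codec for it.
            skipped.append(child.tag)
    return understood, skipped


item = ET.fromstring(FEED).find("channel/item")
understood, skipped = read_item(item)
print(understood)
print(skipped)
```

The namespace tells the reader which handler to dispatch to and nothing more. Each handler still has its own private idea of what its payload looks like, which is the gap a shared content model is supposed to fill.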

By the way, if you don't like all the semantic web stuff that RDF is associated with, here's another way of looking at it. Think of RDF as a CVM, a Content Virtual Machine, out of which any content can be described and by which content codecs can interoperate, by sharing a uniform view of the data. That's all there really is to RDF - an instruction set for content description. This is no more naive a view than Java's WORA.

In a surreal take on Greenspun's tenth rule, Danny Ayers, among others, has said that RSS would end up reinventing RDF, but altogether ad-hoc and badly conceived. That can still be avoided.


August 8, 2003 09:05 PM

Comments

Danny
(August 9, 2003 12:46 PM #)

"... a carrier format for another content model you may or may not have the codec for..." good point - I need to think more on that ;-)

CVM - absolutely brilliant!

btw, when I read Udell's piece I felt like he'd been very picky with his quotes, lining up those that would (apparently) support his conclusion. So I posted the full text of one of his mails to me, with my response -
http://dannyayers.com/archives/001693.html

A bit later I realised why I got irritated, pretty much what you just said! -

http://dannyayers.com/archives/001694.html

Fred Grott
(August 10, 2003 08:56 PM #)

I agree, but what has stopped this effort is not the superficial, skin-deep efforts of those who do not understand or want to understand RDF, such as Dave Winer (Userland), but the absence of tools to manipulate and create RDF vocabularies and triples easily for the non-semantic-web person.

When those tools show up it becomes a very new, inventive ball game, because then the power to change, modify and create is once again in the average web user's hands.

Bo
(August 10, 2003 09:35 PM #)

"... a carrier format for another content model you may or may not have the codec for..."

How does RDF solve this problem exactly? It seems to me if you receive a FOAF file and you don't have a "codec" for FOAF then you're just as much up the river as if FOAF was ordinary XML. Sure, you might know the FOAF file consists of triples but you still have no idea what those triples mean and how to process them.

RDF doesn't solve the semantic problem. I really don't know how people came to think that if all data is RDF then it'd be possible to just glance at a set of triples and know what they mean. RDF doesn't solve this problem and if you think you can solve this problem by throwing OWL and the other ontology languages on top then I suspect you're simply in a state of denial. RDF just doesn't solve the mixed vocabularies problem.

The codec problem was solved a long, long time ago by the first HTTP web browsers. The solution is clear: if you encounter an element you don't understand, then ignore it. That's that. Or, if you're dealing with certain sensitive scenarios, then just fail fast and abort the process.

Next, this notion of a CVM is very misleading. Just because you know the format of the data doesn't mean you can do any significant processing with it. Knowing a file is a bunch of triples is really not much better than knowing it's a stream of bytes. And I can make a strong argument that there will never ever be a successful "universal data model", nor should there be; any context transitions should occur by transformation.

I really don't see the value-add of RDF as a data-interchange format. I've read the specs and used it plenty. I've built applications. But almost every time, I've regretted the choice of using RDF to exchange data simply because RDF sucks when you try to use all those neat XML tools like schema, XSLT, and XPath. (And it's kinda funny that you mention RSS re-inventing RDF. I'd say it's far more likely that RDF will re-invent XML. Witness RDF-Schema and the newfound interest in "porting" XPath to RDF).

The only value-add I've experienced from RDF is using RDF-QL. The ability to query arbitrary data based on the simple {Attribute/Value} criteria actually works out really well. But this doesn't mean that RDF should be used to encode RSS, and in fact I'd say RDF should never be used as a data interchange format. Unless somebody can make a real, technical argument for the value of RDF (rather than this handwaving "newsreaders will magically 'understand' all feeds and RSS1.0 extensions") or, better yet, provide a demonstration where the utility of RDF exceeds XML (most applications of RDF, particularly FOAF, are better done using XML+XLink), I will remain unconvinced that RDF has any place on the wire.
