
Does anyone really think RSS-Data is a good idea?

Danny, being polite:

What we're looking at is application-specific data structures being encoded in XML-RPC and RSS being used as a blind transport.

It all sounds like SOAP RPC-encoded. Why bother reinventing a failed approach?

RSS 2.0 itself is defined essentially as an application for pumping newslike information. Nothing else. You can extend it using namespaces, but there is no standard way of doing so, so every new extension module exists as a parallel pipe.

For some strange definition of extensible. This is like saying org.apache.tools.ant.* is an extension of Java, or that jelly is an extension of a peanut butter sandwich. Really, extensible here means someone can tunnel another vocabulary through RSS 2.0 to you using namespaces, and if you can dispatch on the namespace you're laughing. It does not mean the RSS 2.0 vocabulary itself was extended or that those two sets of names have anything whatsoever to do with each other by implication. Worse, without RDF, namespaces are 100% overhead. The most useful purpose of XML Namespaces is to map XML elements onto URIs, and the most useful purpose of mapping URIs to XML elements is to ship RDF around. However, this is not what most people are using XML namespaces for. When it comes to namespaces, if there's no RDF, there's no extensibility.
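To make "dispatch on the namespace" concrete, here is a minimal sketch in Python. The feed, the http://example.org/ns/ratings namespace, its rating element and the handler table are all made up for illustration; none of them come from RSS 2.0 or from any real module.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the "ex" namespace and <ex:rating> are invented for this sketch.
FEED = """<rss version="2.0" xmlns:ex="http://example.org/ns/ratings">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <ex:rating>4</ex:rating>
    </item>
  </channel>
</rss>"""


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


# Handlers keyed by namespace URI. Core RSS 2.0 elements have no namespace,
# so they never reach a handler here.
HANDLERS = {
    "http://example.org/ns/ratings": lambda local, text: ("rating-module", local, text),
}


def dispatch(feed_xml):
    """Hand each namespaced child of an <item> to the handler for its namespace."""
    results = []
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        for child in item:
            uri, local = split_tag(child.tag)
            handler = HANDLERS.get(uri)
            if handler is not None:
                results.append(handler(local, child.text))
    return results


handled = dispatch(FEED)
```

Note what the dispatcher does not know: nothing ties the rating to the item's title beyond the fact that they happen to be siblings. The two vocabularies share a pipe, not a model.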

Extension, in my book, comes through a shared model of content or processing, the former of which RDF happens to provide in a manner quite similar to the UML or the relational data model. If you provide RDF content inside RSS 1.0, all mapped onto XML+namespace, I can map that content and the RSS into an RDF graph - there's no tunneling other than tunneling of RDF. Think of RDF as being a bit more pedantic about how you express the domain model - sometimes that will make sense, sometimes straight XML is the way to go.
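A rough sketch of what mapping both into one graph buys you, using nothing but Python tuples as (subject, predicate, object) triples. The item URI and the values are hypothetical, and a real application would more likely use an RDF library than a bare set; the RDF, RSS 1.0 and Dublin Core namespace URIs are the published ones.

```python
# Published namespace URIs.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical item URI, invented for this sketch.
ITEM = "http://example.org/posts/1"

# Statements that come from the core RSS 1.0 vocabulary.
rss_triples = {
    (ITEM, RDF + "type", RSS + "item"),
    (ITEM, RSS + "title", "First post"),
}

# Statements that come from an extension vocabulary (Dublin Core here).
dc_triples = {
    (ITEM, DC + "creator", "A. Author"),
}

# Merging is just set union: both vocabularies describe the same subject
# in the same model, so there is nothing to tunnel.
graph = rss_triples | dc_triples


def properties_of(triples, subject):
    """Collect every predicate/object pair asserted about one subject."""
    return {p: o for s, p, o in triples if s == subject}


item_properties = properties_of(graph, ITEM)
```

The consumer asks one question of one graph and gets the title and the creator back together, without having to know which module each statement arrived in.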

Unfortunately, the semantic web AI hoopla has put many people off RDF. I guess if all we ever did with the relational data model was talk about theorem proving, that would put people off too. No-one (almost no-one) calls relational data a solution in search of a problem or pie in the sky AI research - with a good query language (SQL) and largely excellent tools (RDBMS), of course they don't. Quite the opposite, enterprise architectures today depend way too heavily on SQL and RDBMSes - whole platforms and industries are predicated on their presence.

In the meantime we have to wait for the W3C to standardize the Ontology Query Language and for someone enterprising to invent the RDFMS before RDF becomes an obviously good option.

But beyond very simple stuff, getting RSS-Data from RDF would almost certainly be a waste of time because it's so lossy.

What would be the point of mapping a struct or a string to RDF or vice versa? All the data types are telling you after all is the allowable range of a value for some property. And there's not much interesting we want to assert about a struct or a string itself. Although, if we're talking about objects in the domain, that's different. What's interesting is the property and what the property is slotted into.

For example, in an OO language we might want to say things about a User, which happens only as an implementation detail of the programming language to be written down as a struct of strings. We might want to associate business logic with that User, or assign certain permissions to her, but ultimately we're not really interested in mapping strings and structs onto domain objects; these are just things we use to keep our compiler and our managed runtimes happy. I'd argue, strongly, that when you break down domain objects into essentially programming language primitives and send those across the wire, you're making exactly the same mistake as the SOAP RPC-encoded approach. This approach only ever makes sense if you control both endpoints and specifically can ensure the computing platforms are interoperable - on the web this is a nonsense approach.

The right approach is for parties to agree to share some minimal assumptions about a domain structure without fussing unduly over data types, which is where XML shines (especially running over REST or an SOA), or to bite the bullet and use RDF to describe things in the domain, which is not unlike the approach anyone using ER diagrams or the UML is taking with a domain model (and will have something of the same limitations). Surely the idea is to protect your data from the vagaries of things like structs and varchars, not force agreement on them?

October 7, 2003 11:39 PM


Tom Passin
(October 8, 2003 05:13 AM #)


(October 8, 2003 08:50 AM #)

Nicely put.

Patrick Logan
(October 8, 2003 05:19 PM #)

"This approach only ever makes sense if you control both endpoints and specifically can ensure the computing platforms are interoperable..."

Of course, this is not true of the XML-RPC data model, since there are more implementations than there are of SOAP.

"...without fussing unduly over data types..."

I can't go with this argument. There are good and bad things about RSS-Data, as well as for other XML-based representations. Ultimately, data types and data models have to be described sufficiently.

I think this position is making too many assumptions that the RSS-Data types "map to an RPC approach". That seems like a red herring to me.
