
RDFX: RDF XML serialization

Mainly design notes and musings for now.

Naturally, a search project which is going to use RDF needs a way to manipulate and interchange the stuff. XML - yes please. RDF/XML - no thanks. So, in the grand software tradition, I'm hacking a (nother) XML serialization of RDF.

Design goals:


  • It's for serializing RDF in XML, not XML in RDF

  • Readable

  • Writable

  • Hackable

  • CutAndPastable

  • Optional support for contexts

  • Optional support for quotation

  • A decision in 10 minutes whether to use it

When it comes to XML, I have never bought reasoning of the form "blahML will be mainly processed by machines, so who cares what it looks like". That's not what markup's about, baby. It always ends up in front of a person who is in front of a text editor which has no schema support, and who is not even a Desperate Perl Hacker, just Desperate. I want people to be able to fire up an editor and start typing RDF triples, or to hack some code together to start producing or consuming the stuff. Just because RDF is formally stuffy doesn't mean the markup can't be simple.

My experience of RDF to date suggests that Uche Ogbuji and Graham Klyne (and others) have been right all along: RDF needs contextual support. I think this has been missed because folks (myself included, once) assumed reification would magically provide such support (and support for other things like provenance and quotation, such is the power of magic). But in 2004, RDF reification is a syntactic and semantic road crash; it isn't fit for anything. So folks can have quads if they want them (and we'll sort the semantics out in code for now - typed contexts anyone?).
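
To make the quad idea concrete, here's a minimal Python sketch of a statement with an optional fourth slot for context. The `Statement` type, the `in_context` helper and the example.org URIs are all made up for illustration; none of this is part of RDFX itself, and it says nothing about what a context means semantically.

```python
from typing import List, NamedTuple, Optional


class Statement(NamedTuple):
    """A triple with an optional context slot, i.e. a quad (hypothetical type)."""
    subject: str
    predicate: str
    obj: str
    context: Optional[str] = None  # None means a plain triple with no context


def in_context(statements: List[Statement], context: Optional[str]) -> List[Statement]:
    """Return only the statements asserted in the given context."""
    return [s for s in statements if s.context == context]


statements = [
    # A plain triple: no context supplied.
    Statement("http://example.org/post/1",
              "http://purl.org/dc/elements/1.1/creator",
              "Bill"),
    # A quad: the same kind of statement, scoped to a context URI.
    Statement("http://example.org/post/1",
              "http://purl.org/dc/elements/1.1/title",
              "RDFX",
              context="http://example.org/ctx/draft"),
]

drafts = in_context(statements, "http://example.org/ctx/draft")
```

Keeping the context optional means code that only cares about triples can ignore the fourth slot entirely.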

The design artefacts:


  • Schemas: RNG and WXS

  • Samples

  • Spec

  • Transforms: rdfxml2rdfx, rdfx2rdfxml

  • Parsers, serializers (Python, C#, Java)

So far I have the RNG and WXS schemas. There's a sample here. I'm still sorting out whitespace normalization, but I'm getting happy with the literal, typed literal and bnode support (though I need to review it against the abstract syntax). The XML is in a namespace (bah) - this may change if I decide namespaces hurt the design goals (the same sample, without namespaces).
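
For a rough feel of what a serializer covering URIs, bnodes, literals and typed literals might involve, here's a sketch in Python using the standard library's ElementTree. The element and attribute names (`graph`, `statement`, `uri`, `bnode`, `datatype`) are placeholders assumed for illustration, not the names in the actual RNG or WXS schemas, and terms are modelled as hypothetical `(kind, value, datatype)` tuples.

```python
import xml.etree.ElementTree as ET


def term_to_xml(parent, role, term):
    """Append one term under `parent` as an element named after its role.

    `term` is a (kind, value, datatype) tuple where kind is
    'uri', 'bnode' or 'literal'. All names here are hypothetical.
    """
    kind, value, datatype = term
    el = ET.SubElement(parent, role)
    if kind == "uri":
        el.set("uri", value)
    elif kind == "bnode":
        el.set("bnode", value)
    elif kind == "literal":
        if datatype is not None:
            el.set("datatype", datatype)  # typed literal
        el.text = value
    else:
        raise ValueError("unknown term kind: %r" % (kind,))
    return el


def triples_to_xml(triples):
    """Serialize (subject, predicate, object) term triples to an XML string."""
    root = ET.Element("graph")
    for subject, predicate, obj in triples:
        statement = ET.SubElement(root, "statement")
        term_to_xml(statement, "subject", subject)
        term_to_xml(statement, "predicate", predicate)
        term_to_xml(statement, "object", obj)
    return ET.tostring(root, encoding="unicode")


sample = [(
    ("bnode", "b1", None),
    ("uri", "http://example.org/vocab#count", None),
    ("literal", "42", "http://www.w3.org/2001/XMLSchema#integer"),
)]
xml_text = triples_to_xml(sample)
```

One element per statement, with no nesting or abbreviation, is what keeps a form like this cut-and-pastable: any `statement` element can be lifted out on its own.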

I would like to think it's simple enough to be used as a powerful alternative to the kind of property/value extension hooks you see sometimes in XML (aka attribute driven markup), though that's not a design goal.

Things that will not be present or supported:


  • XMLBase (and by extension no relative URIs)

  • QName abbreviations in place of URIs

  • Canonicalization of XML literals (I haven't looked at this issue in 2 years, but I would like RDFX to be "dsigabble")

  • Literals as subjects

  • Nested graphs

  • Reification (use quotation and get over it)
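
Two of those exclusions (relative URIs and literals as subjects) are easy to enforce in code. Here's a minimal sketch in Python, assuming terms are hypothetical `(kind, value)` tuples; the function names are invented for illustration. Note that a QName-style string such as `dc:title` is syntactically indistinguishable from an absolute URI, so a check like this can't catch QName abbreviations.

```python
from urllib.parse import urlparse


def is_absolute_uri(value):
    """True if the value carries a URI scheme such as 'http' or 'urn'."""
    return bool(urlparse(value).scheme)


def check_triple(subject, predicate, obj):
    """Return a list of problems; an empty list means the triple is accepted.

    Each term is a (kind, value) tuple with kind in
    'uri', 'bnode' or 'literal' (a hypothetical representation).
    """
    problems = []
    if subject[0] == "literal":
        problems.append("literal used as subject")
    roles = (("subject", subject), ("predicate", predicate), ("object", obj))
    for role, (kind, value) in roles:
        # With no XMLBase there is nothing to resolve a relative URI against.
        if kind == "uri" and not is_absolute_uri(value):
            problems.append("relative URI in %s: %s" % (role, value))
    return problems
```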

On the long finger is RDFC, or a compact form à la RNC (if only to get the W3C to mandate N-Triples - if it's good enough for testing it's good enough). But an rdfx2yaml hack might do the job anyway.
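
For comparison, here's how little code a line-per-statement format like N-Triples needs. This is a minimal Python sketch, assuming terms are hypothetical `(kind, value, datatype)` tuples; it handles only the basic string escapes, leaves non-ASCII characters as they are, and ignores language tags.

```python
def escape_literal(text):
    """Escape backslash, double quote, newline, carriage return and tab."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t"))


def term_to_nt(term):
    """Render one (kind, value, datatype) term in N-Triples syntax."""
    kind, value, datatype = term
    if kind == "uri":
        return "<%s>" % value
    if kind == "bnode":
        return "_:%s" % value
    if kind == "literal":
        quoted = '"%s"' % escape_literal(value)
        # Typed literals carry their datatype URI after '^^'.
        return "%s^^<%s>" % (quoted, datatype) if datatype else quoted
    raise ValueError("unknown term kind: %r" % (kind,))


def triple_to_nt(subject, predicate, obj):
    """Render one statement as a single N-Triples line (without the newline)."""
    return "%s %s %s ." % (term_to_nt(subject), term_to_nt(predicate), term_to_nt(obj))
```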

This is fun.

[mary poppins: let's go fly a kite]


March 9, 2004 09:24 PM

Comments

Alexander
(March 9, 2004 10:20 PM #)

Excellent! RDF/XML sucks big time, not because machines can't figure it out, but because I can't figure it out.

Cut'n'paste is the backbone of the www. Let's continue that fine tradition. I'm in, whatever 'in' means.

Jon Hanna
(March 13, 2004 04:09 AM #)

Everything should be as human-readable as possible, but no more human-readable.

Jon Hanna
(March 15, 2004 10:09 AM #)

Regarding context: do you think reification can't supply this at all, or that it can but it's too unwieldy?
Perhaps a serialisation that had a concise way of supplying context could be represented in triples in terms of reification, satisfying both camps?
