R

R is an XML serialization for RDF.

Design goals

  • It's for serializing RDF in XML, not XML in RDF
  • Readable
  • Writable
  • Hackable
  • No choices
  • Cut and paste friendly
  • Optional support for contexts
  • Optional support for quotation
  • A decision in 10 minutes whether to use it

Things R will not be supporting:

  • XMLBase (and by extension no relative URIs)
  • QName abbreviations in place of URIs
  • Canonicalization of XML literals
  • Literals as subjects
  • Nested graphs
  • Reification (use quotation and get over it)

Background

The name was originally RDFX and was announced last March. A Google search shows that RDFX is already taken by an Eclipse plugin for RDF, so a new name was needed. The design goals are the same now as then, but the R syntax has changed quite a bit since, enough to put it into a new namespace. I'm doing this because, except for the simplest graphs, I don't like RDF/XML on a number of levels (technical, aesthetic, usability), and because I need to interchange RDF data rather than do mixins between RDF and XML/HTML.

The markup

Here you go; this example covers most of the capabilities:


  <r:rdf xmlns:r="http://www.dehora.net/r/2004/07/" version="20040718">
    <r:graph>
      <r:description>I am not a description </r:description>
      <r:context uri="http://example.org/con10"/>
      <r:statement>
        <r:context uri="http://example.org/con12" />
        <r:subject uri="http://example.org/10"/>
        <r:property uri="http://example.org/11"/>
        <r:value uri="http://example.org/12"/>
      </r:statement>
      <r:statement>
        <r:subject bnode="K12"/>
        <r:property uri="http://example.org/20"/>
        <r:value type="http://www.w3.org/2001/XMLSchema#integer">22</r:value>
      </r:statement>
      <r:statement>
        <r:subject uri="http://example.org/31"/>
        <r:property uri="http://example.org/32"/>
        <r:value>33</r:value>
      </r:statement>
    </r:graph>
    <r:graph>
      <r:statement>
        <r:context uri="http://example.org/con10"/>
        <r:subject uri="http://example.org/1000"/>
        <r:property uri="http://example.org/1001"/>
        <r:value>http://example.org/1002</r:value>
      </r:statement>
      <r:statement>
        <r:subject uri="http://example.org/one"/>
        <r:property uri="http://example.org/two"/>
        <r:value type="http://www.dehora.net/r/2004/07/type/xml" xmlns="" >
          <target name="get-ant-extensions" description="">
            <get src="${ant.href.ext}/jdepend.jar" dest="${ant.lib.home}/jdepend.jar"/>
          </target>
        </r:value>
      </r:statement>
    </r:graph>
    <r:graph quoted="yes">
      <r:description>I am not asserted, but I'm not sure what that means</r:description>
      <r:statement>
        <r:subject uri="http://example.org/1"/>
        <r:property uri="http://example.org/1/2"/>
        <r:value uri="http://example.org/1/2/3"/>
      </r:statement>
    </r:graph>
  </r:rdf>

It's a relatively flat, repeating structure, much like what you'd expect from database XML dumps or record-oriented markup. An R document contains a sequence of graph elements, each of which contains a sequence of statements. A statement is the usual three-part RDF structure: subject, property and value.
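
To give a feel for how little machinery that flat structure needs, here is a minimal Python sketch (not part of R itself) that walks graphs and statements with the standard library's ElementTree. The helper names `q`, `term` and `read_triples` are made up for illustration, and literals are reduced to bare text with datatypes ignored.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the example document above.
R_NS = "http://www.dehora.net/r/2004/07/"


def q(name):
    """Build the '{namespace}name' form that ElementTree uses for tags."""
    return "{%s}%s" % (R_NS, name)


def term(elem):
    """Reduce a subject, property or value element to a (kind, text) pair."""
    if elem.get("uri") is not None:
        return ("uri", elem.get("uri"))
    if elem.get("bnode") is not None:
        return ("bnode", elem.get("bnode"))
    # Anything else is treated as literal text; datatypes are ignored here.
    return ("literal", elem.text or "")


def read_triples(xml_text):
    """Return a list of (subject, property, value) triples, graph by graph."""
    root = ET.fromstring(xml_text)
    triples = []
    for graph in root.findall(q("graph")):
        for statement in graph.findall(q("statement")):
            triples.append(tuple(
                term(statement.find(q(part)))
                for part in ("subject", "property", "value")
            ))
    return triples


SAMPLE = """<r:rdf xmlns:r="http://www.dehora.net/r/2004/07/" version="20040718">
  <r:graph>
    <r:statement>
      <r:subject bnode="K12"/>
      <r:property uri="http://example.org/20"/>
      <r:value>22</r:value>
    </r:statement>
  </r:graph>
</r:rdf>"""

for triple in read_triples(SAMPLE):
    print(triple)
```

Because subject, property and value are looked up by name rather than by position, the sketch does not care what order they appear in inside a statement.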

Subjects and properties

Subjects can be URIs or blank nodes (sometimes called bnodes). To indicate that a subject is a blank node, use the 'bnode' attribute; otherwise use 'uri'. Properties are simple; they're always just URIs.

Literals and values

Every RDF triple has a property value (usually called the 'object'). For resources, we just use a uri attribute. Anything else is an RDF Literal. It turns out that Literals are the most complicated thing to serialize. R has 3 literal forms:

  • Raw element content: this is just straight text inside the value element. The value element has no uri attribute in this case.
  • Typed literals: to say that a literal has a type, set a type attribute on value. The type must be a URI, along the lines of the things you might see in xsd; the example above uses http://www.w3.org/2001/XMLSchema#integer.
  • Embedded XML: this is a special case of a typed literal. Where the type attribute has the value 'http://www.dehora.net/r/2004/07/type/xml' you can assume the element content is XML.
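
The value forms can be told apart by attributes alone. The following Python sketch is one way to do it, under a few assumptions: the function name `classify_value` and its (kind, content, datatype) return shape are invented for illustration, the embedded-XML type URI is the one used in the example markup and the schema, and bnode values are included because the schema allows them.

```python
import xml.etree.ElementTree as ET

R_NS = "http://www.dehora.net/r/2004/07/"
# Type URI for embedded XML, as written in the example markup and the schema.
XML_TYPE = "http://www.dehora.net/r/2004/07/type/xml"


def classify_value(value):
    """Return (kind, content, datatype) for an R value element."""
    if value.get("uri") is not None:
        return ("resource", value.get("uri"), None)
    if value.get("bnode") is not None:
        return ("bnode", value.get("bnode"), None)
    datatype = value.get("type")
    if datatype is None:
        # Raw element content: plain text, no type.
        return ("plain", value.text or "", None)
    if datatype == XML_TYPE:
        # Embedded XML: serialize the first child element back to a string.
        children = list(value)
        content = ""
        if children:
            content = ET.tostring(children[0], encoding="unicode").strip()
        return ("xml", content, datatype)
    # Any other type URI is an ordinary typed literal.
    return ("typed", value.text or "", datatype)


plain = ET.fromstring('<r:value xmlns:r="%s">33</r:value>' % R_NS)
typed = ET.fromstring(
    '<r:value xmlns:r="%s" '
    'type="http://www.w3.org/2001/XMLSchema#integer">22</r:value>' % R_NS
)
embedded = ET.fromstring(
    '<r:value xmlns:r="%s" type="%s" xmlns="">'
    '<target name="get-ant-extensions"/></r:value>' % (R_NS, XML_TYPE)
)

for element in (plain, typed, embedded):
    print(classify_value(element))
```

The sketch does not check that a typed literal's text is valid for its datatype; that is left to whatever consumes the triples.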

Descriptions and Contexts

Any graph element can carry an optional description element. A graph or a statement may also have a context element. Contexts are mildly controversial since they're not core RDF but a lot of practitioners, including myself, find them useful to do things like track where statements came from or to qualify statements in some way. Some people might use reification for this - but personally I find reification is broken and best avoided.
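
As a rough illustration of tracking where statements came from, the Python sketch below pairs each statement with a context URI. Note the assumption baked into it: a context on a statement is taken to win over the context on its enclosing graph. R as described here does not spell out how the two combine, so treat that rule, and the helper names `context_uri` and `statements_with_context`, as hypothetical.

```python
import xml.etree.ElementTree as ET

R_NS = "http://www.dehora.net/r/2004/07/"


def q(name):
    """Build the '{namespace}name' form that ElementTree uses for tags."""
    return "{%s}%s" % (R_NS, name)


def context_uri(parent):
    """Return the uri of a direct context child, or None if there is none."""
    ctx = parent.find(q("context"))
    return None if ctx is None else ctx.get("uri")


def statements_with_context(xml_text):
    """Return (context, subject, property) tuples for each statement.

    ASSUMPTION: a statement-level context overrides the graph-level one.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for graph in root.findall(q("graph")):
        graph_ctx = context_uri(graph)
        for statement in graph.findall(q("statement")):
            ctx = context_uri(statement) or graph_ctx
            subject = statement.find(q("subject"))
            prop = statement.find(q("property"))
            rows.append((
                ctx,
                subject.get("uri") or subject.get("bnode"),
                prop.get("uri"),
            ))
    return rows


SAMPLE = """<r:rdf xmlns:r="http://www.dehora.net/r/2004/07/" version="20040718">
  <r:graph>
    <r:context uri="http://example.org/con10"/>
    <r:statement>
      <r:context uri="http://example.org/con12"/>
      <r:subject uri="http://example.org/10"/>
      <r:property uri="http://example.org/11"/>
      <r:value uri="http://example.org/12"/>
    </r:statement>
    <r:statement>
      <r:subject bnode="K12"/>
      <r:property uri="http://example.org/20"/>
      <r:value>22</r:value>
    </r:statement>
  </r:graph>
</r:rdf>"""

for row in statements_with_context(SAMPLE):
    print(row)
```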

Quotation

It's a design goal, but I'm really not sure about this part - let's call it "experimental". A graph element may carry a 'quoted' attribute, which would mean that the graph in question may be referred to but its triples should not be asserted. This is definitely not in RDF, but computing logic languages tend to have a concept of quotation (again this is something people used to think you could use reification for). Like I said, I'm not sure about it.
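
For a consumer, honouring quotation mostly means keeping quoted graphs out of the set of asserted triples. Here is a small Python sketch of that split. It only looks at the 'quoted' attribute on each graph element; what to do with a 'quoted' attribute on the root element is left open, and the function name `split_graphs` is made up for illustration.

```python
import xml.etree.ElementTree as ET

R_NS = "http://www.dehora.net/r/2004/07/"


def q(name):
    """Build the '{namespace}name' form that ElementTree uses for tags."""
    return "{%s}%s" % (R_NS, name)


def split_graphs(xml_text):
    """Return (asserted, quoted) lists of graph elements."""
    root = ET.fromstring(xml_text)
    asserted, quoted = [], []
    for graph in root.findall(q("graph")):
        # Only an explicit quoted="yes" on the graph itself counts here.
        if graph.get("quoted") == "yes":
            quoted.append(graph)
        else:
            asserted.append(graph)
    return asserted, quoted


SAMPLE = """<r:rdf xmlns:r="http://www.dehora.net/r/2004/07/" version="20040718">
  <r:graph>
    <r:statement>
      <r:subject uri="http://example.org/31"/>
      <r:property uri="http://example.org/32"/>
      <r:value>33</r:value>
    </r:statement>
  </r:graph>
  <r:graph quoted="yes">
    <r:statement>
      <r:subject uri="http://example.org/1"/>
      <r:property uri="http://example.org/1/2"/>
      <r:value uri="http://example.org/1/2/3"/>
    </r:statement>
  </r:graph>
</r:rdf>"""

asserted, quoted = split_graphs(SAMPLE)
print(len(asserted), "asserted,", len(quoted), "quoted")
```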

RDF Collections

No support as of yet. I'm not sure how best to write down collections right now, but I'll try to ensure their support in the syntax will be backward compatible with this version.

RDFX

Some things have changed in R since RDFX last March. All URIs have been moved into attributes (they have no structure that would require them to be in element content). In RDFX the literal constructs inside the value element were more verbose, and values did not support bnodes. R lives in a new namespace and has a version attribute added. Subject, property and value are no longer ordered in the schemata.

It's still ugly!

Yes, it is remarkably ugly. If you use a default namespace you might alleviate some of that, but in truth the ugliness is driven by the URIs - it's hard to look at a lot of URIs in XML and not object to it aesthetically. Here's an example with the URIs and prefix elided:

  <rdf 
     xmlns="http://www.dehora.net/r/2004/07/" 
     version="20040718">
    <graph>
      <statement>
        <context uri="..." />
        <subject uri="..."/>
        <property uri="..."/>
        <value uri="..."/>
      </statement>
      <statement>
        <context uri="..." />
        <subject bnode="..."/>
        <property uri="..."/>
        <value uri="..."/>
      </statement>
      <statement>
        <subject uri="..."/>
        <property uri="..."/>
        <value>33</value>
      </statement>
    </graph>
  </rdf>

The point of this however is that the document structure should be regular and clean enough to read and write - no striped syntax, no abbreviated forms, few surprises.
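
The same regularity makes R easy to write out. As a sketch, assuming triples are held as simple Python tuples (a representation invented here, not anything R prescribes), a writer needs little more than one loop. `write_r` is a hypothetical name, and it emits a single graph with no contexts or descriptions.

```python
import xml.etree.ElementTree as ET

R_NS = "http://www.dehora.net/r/2004/07/"


def q(name):
    """Build the '{namespace}name' form that ElementTree uses for tags."""
    return "{%s}%s" % (R_NS, name)


def write_r(triples, version="20040718"):
    """Serialize (subject, property_uri, value) triples as one R graph.

    subject is ("uri", text) or ("bnode", text); value is one of those or
    ("literal", text). These tuple shapes are made up for this sketch.
    """
    ET.register_namespace("r", R_NS)  # ask ElementTree to use the r: prefix
    root = ET.Element(q("rdf"), {"version": version})
    graph = ET.SubElement(root, q("graph"))
    for subject, property_uri, value in triples:
        statement = ET.SubElement(graph, q("statement"))
        ET.SubElement(statement, q("subject"), {subject[0]: subject[1]})
        ET.SubElement(statement, q("property"), {"uri": property_uri})
        value_elem = ET.SubElement(statement, q("value"))
        if value[0] == "literal":
            value_elem.text = value[1]
        else:
            value_elem.set(value[0], value[1])
    return ET.tostring(root, encoding="unicode")


TRIPLES = [
    (("uri", "http://example.org/10"), "http://example.org/11",
     ("uri", "http://example.org/12")),
    (("bnode", "K12"), "http://example.org/20", ("literal", "22")),
]

print(write_r(TRIPLES))
```

The output is not pretty-printed, and typed or XML literals would need an extra branch, but nothing in the format calls for look-ahead or abbreviation decisions while writing.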

The schemata

An RNG schema is available here, and a WXS one generated by trang is available here. Here's the schema in rnc form:

  default namespace = "http://www.dehora.net/r/2004/07/"
  start = rdf
  rdf = element rdf {
      attribute version { "20040718" },
      quoted?,
      graph*
    }
  graph = element graph { quoted?, description?, context?, statement* }
  description = element description { text }
  context = element context { uri }
  statement = element statement { context?, (subject & property & value) }
  subject = element subject { bnode | uri }
  property = element property { uri }
  value = valueURI | valueBNODE | valueTYP | valueTYPXML
  valueURI = element value { uri }
  valueBNODE = element value { bnode }
  valueTYP =  element value {
      attribute type { xsd:anyURI }?,
      text
    }
  valueTYPXML = element value {
      attribute type { "http://www.dehora.net/r/2004/07/type/xml" },
      ANY
    }
  ANY =  element * {
      (attribute * { text }
       | text
       | ANY)*
    }
  quoted = attribute quoted { "yes" | "no" }
  uri = attribute uri { xsd:anyURI }
  bnode = attribute bnode { xsd:NMTOKEN }


July 18, 2004 02:32 AM

Comments

Vincent D Murphy
(July 18, 2004 10:11 PM #)

I'm impressed. The repeating regular structure appeals to me. I instantly see the similarity with database dumps, or arrays of hashes in Perl.

Thanks for doing this work and sharing it.

bryan
(July 19, 2004 08:32 AM #)

I've long thought that it would be nice to use GraphML instead, extended with semantics. Then there is a clear indication as to what is a node, an edge, and a port on a node in the graph.

The following is of course a very stupid example:

  <graph id="G" edgedefault="directed">
    <node id="Employees"/>
    <node id="Company"><data>Company Name</data></node>
    <node id="John"/>
    <edge source="John" target="Employees"/>
    <edge source="Employees" target="Company"/>
  </graph>

but I think that it's a nice format to build graphs from, and thereby derive all benefits of having graphs.

http://graphml.graphdrawing.org/primer/graphml-primer.html

Trackback Pings

TrackBack URL for this entry:
http://www.dehora.net/mt/mt-tb.cgi/1348

Listed below are links to weblogs that reference R:

» R - Alternative RDF XML Serialization from Stefan Tilkov's Random Stuff
Bill de hÓra introduces R, a very readable alternative XML serialization of RDF graphs.... [Read More]

Tracked on July 18, 2004 11:32 AM

» Posts from del.icio.us from protocol7
R - a better XML serialization of RDF? Geografisk distribution av svenska bloggar IM Watching Enterprise Integration Patterns... [Read More]

Tracked on July 18, 2004 12:15 PM
