
Joshua Allen on the semantic web

Better Living Through Software

Joshua Allen has consistently had interesting thoughts on the semantic web. Here he's redefining (slightly) the semantic web in terms of human voices rather than truthful assertions. I guess human voice means our opinions. That's a useful reduction, and should help make RDF scale in the face of mutually inconsistent claims.

He also talks about the indelibility of information. This ignores an interesting feature of storing data digitally on the web. Web data is not indelible - in fact it's easily changed in retrospect. Consider the Church of Scientology forcing the removal of information from the Internet Archive. I suspect this is a dangerous characteristic, just as Orwell said.

Now, time for some nit-picking:

RDF is often a whipping-boy, but a red-herring in this discussion. To know why, you need to understand that RDF is simply a syntax for exchanging knowledge representations, and not even a particularly ambitious or cutting-edge syntax.

I think I know what Joshua means here, but RDF is not a syntax. You need to give it a syntax, like N-Triples or RDF/XML, before what Joshua says makes sense. True, RDF is not especially novel or expressive compared with other Knowledge Representation languages, but it has a strong semantic basis that syntactically driven technologies, like XML, do not.

From a minor infraction to more serious ones:

In fact, Dare Obasanjo has remarked to me long ago that XML is not really different from Lisp's s-expressions -- a point elaborated in the paper by Jerome Simeon -- so in a sense, Mark Pilgrim and the XHTML advocates are lobbying to have people write their web pages in Lisp instead of HTML.

Mark Pilgrim is doing nothing of the sort; to say so is to confuse matters on a number of levels. I haven't read Jerome's paper (yet), but beyond a passing similarity, XML is nothing like Lisp. XML is pure syntax; Lisp has semantics due to its status as a programming language. The evaluation rules set out for Lisp are not those set out for RDF, though you could certainly use Lisp to encode those rules in a reasoner. You can map back and forth between XML and S-expression syntax, but you can map back and forth between XML and CSV files as well. In fact CSV files have much more in common with XML than Lisp (or any programming language) has. Mapping XML to S-expression syntax is not the same as mapping XML to Lisp - throwing RDF into the pot only muddies things further.
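To make the syntax-only nature of that mapping concrete, here is a minimal sketch in Python. The function name `to_sexp`, the `(@name "value")` convention for attributes, and the sample document are all made up for illustration; this is not taken from Jerome's paper or from any standard mapping.

```python
import xml.etree.ElementTree as ET

def to_sexp(element):
    """Render an XML element as an S-expression *string* (hypothetical mapping).

    Attributes become (@name "value") pairs. Tail text is ignored to keep
    the sketch short.
    """
    parts = [element.tag]
    for name, value in sorted(element.attrib.items()):
        parts.append('(@%s "%s")' % (name, value))
    text = (element.text or "").strip()
    if text:
        parts.append('"%s"' % text)
    for child in element:
        parts.append(to_sexp(child))
    return "(" + " ".join(parts) + ")"

# An invented document whose element names merely look like an operation.
doc = ET.fromstring('<add><num v="1"/><num v="2"/></add>')
sexp = to_sexp(doc)
print(sexp)  # (add (num (@v "1")) (num (@v "2")))
```

The result has the shape of a Lisp form, but nothing here knows what `add` means. A Lisp would try to apply a function called `add` to its arguments; the XML document and its S-expression rendering just sit there as structure. That gap is the difference between mapping to S-expression syntax and mapping to Lisp.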

Later: I followed through on Jerome's paper. Seems it's about adding a semantics to XML Schema and XQuery. So I don't need to read it in full - XML Schema and XQuery are no more XML than RDF.

# "RDF is too complicated" - This also is a very potent argument. The primary serialization for RDF is XML. This really starts to hurt your brain when you realize that RDF and XML are almost the same thing. Too much meta and your mind can't bootstrap. And the two main non-XML serializations that exist are named "N3" and "N-Triples", but bear no resemblance to one another -- a prank that lends credence to the allegations of gratuitous complexity.

No. Why? Because RDF has semantics. Those semantics are the same, regardless of the way you decide to inscribe an RDF graph. XML has no semantics; it is, at the very most, a set of invariants imposed on a sequence of Unicode characters. RDF's relationship to XML is so fleeting as to be hardly worth mentioning, except for the fact that the two are continually conflated when they should be kept very distinct. Granted, however, that arguments about the complexity of RDF are often spurious.
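A small sketch of what "the same graph, inscribed two ways" looks like in practice, using only the Python standard library. Both parsers are toys written for this example: they handle one plain-literal statement shape and nothing else, and the `http://example.org/post` URI is invented. Comparing sets of triples is also a simplification, since real graph equivalence has to deal with blank nodes and typed literals.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The same single statement written in two serializations (example data).
NTRIPLES = ('<http://example.org/post> '
            '<http://purl.org/dc/elements/1.1/creator> "Joshua Allen" .\n')

RDFXML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post">
    <dc:creator>Joshua Allen</dc:creator>
  </rdf:Description>
</rdf:RDF>"""

def triples_from_ntriples(text):
    """Toy reader: only <s> <p> "plain literal" . lines are recognised."""
    pattern = re.compile(r'<([^>]*)>\s+<([^>]*)>\s+"([^"]*)"\s*\.')
    return {match.groups() for match in pattern.finditer(text)}

def triples_from_rdfxml(text):
    """Toy reader: only rdf:Description elements with literal properties."""
    triples = set()
    root = ET.fromstring(text)
    for description in root.findall("{%s}Description" % RDF_NS):
        subject = description.get("{%s}about" % RDF_NS)
        for prop in description:
            # ElementTree spells qualified names as "{namespace}local";
            # joining the two parts gives the predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            triples.add((subject, predicate, prop.text))
    return triples

same_graph = triples_from_ntriples(NTRIPLES) == triples_from_rdfxml(RDFXML)
print(same_graph)  # True
```

The two inputs share almost no characters in common, yet they yield the same set of triples. What the triples mean is defined on that set, not on either inscription of it.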

And by the way, N3 is not RDF. This seems to be such a widespread misunderstanding that I've marked it in red so you won't miss it. Allow me to quickly elaborate. N3 is much more expressive than RDF, although it lacks RDF's semantic precision. You can say more interesting things in N3 than in RDF. But to understand what some N3 means, you need to look at the source code of cwm, Tim Berners-Lee's N3 inference engine. To understand what some RDF means, you need to read the RDF Model Theory. N3 and RDF happen to look quite alike, but N3 statements are much more like those of a regular programming language.

Furthermore, the existence of multiple serializations leads people to the understandable misconception that RDF is not simply a syntax for exchanging knowledge representations.

Proof by repeated assertion is no proof at all.

I'm being somewhat pedantic (but just try running Joshua's reasoning on the rdf-logic list). I'm doing this because you can't make good sense of RDF and the semantic web without some precision of thought, however irritating and stuffy that may be. The RDF community spent years in a state of confusion, talking past each other, because of fundamental errors in understanding, such as not distinguishing between syntax and semantics, or between use and mention. If we don't make the effort to be clear about RDF, there's little chance the machinery and information we build on it will be. It's precisely because computers are so brittle in the face of nuance and context that we need to be precise when it comes to talking about a language for interchanging knowledge between machines.

People coming to RDF, especially from XML, may well come away with the wrong ideas altogether about what RDF is; that will doom them to a frustrating experience with the technology until they take the time to pick things apart.

January 6, 2003 01:03 PM

