
SemanticIntegration 101: something even an AI grad would know

AI is often said to be largely useless, but if you had done enough of it you would already know this:

In conclusion, it is clear that Semantic Web can be used to map between XML vocabularies however in non-trivial situations the extra work that must be layered on top of such approaches tends to favor using XML-centric techniques such as XSLT to map between the vocabularies instead. Dare Obasanjo

Among other things, you would also know that an important lesson folks picked up after the AI winter (whose 15-year anniversary cannot be far away) is that how you model the inert data is key. That's one reason why all the SUO and WebOnt folks are so hung up on getting the ontologies just so, and why decent tools and syntax remain a wasteland (they just don't matter as much as abstract data models in the scheme of things). So I guess AI ain't so bad after all; if nothing else it'll keep you out of the weeds.

As for mapping the complicated stuff: we've been doing that for years in Propylon. Our CTO, Sean McGrath, can wax lyrical on this. It's called pipelining, and it's the way to go for systems integration in general, not just for munging a date format (Perl will do just fine there). The main advantage of pipelines is the ability to keep recomposing them as requirements change. In short - you can keep changing the transformation as fast as the business changes its mind. Try doing that with an XSLT write-only trainwreck.
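To make the recomposition point concrete, here is a minimal sketch in Python. It is not Propylon's tooling; the stage names (`normalise_date`, `drop_elements`, `rename_root`) and the sample order document are made up for illustration. Each stage takes a parsed document and returns it, so a pipeline is nothing more than an ordered list of stages.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def normalise_date(doc):
    """Hypothetical stage: rewrite DD/MM/YYYY dates as YYYY-MM-DD."""
    for el in doc.iter("date"):
        day, month, year = el.text.split("/")
        el.text = f"{year}-{month}-{day}"
    return doc


def drop_elements(tag):
    """Hypothetical stage factory: remove every element named `tag`."""
    def stage(doc):
        # Snapshot the tree first so removal does not disturb the traversal.
        for parent in list(doc.iter()):
            for child in list(parent):
                if child.tag == tag:
                    parent.remove(child)
        return doc
    return stage


def rename_root(new_tag):
    """Hypothetical stage factory: rename the document element."""
    def stage(doc):
        doc.tag = new_tag
        return doc
    return stage


def run_pipeline(xml_text, stages):
    """Parse the input, push it through each stage in order, serialise."""
    doc = reduce(lambda d, stage: stage(d), stages, ET.fromstring(xml_text))
    return ET.tostring(doc, encoding="unicode")


source = "<order><date>22/02/2004</date><internal>x</internal></order>"

# Today's requirements.
stages = [normalise_date, drop_elements("internal")]
first = run_pipeline(source, stages)

# The business changes its mind: append a stage, touch nothing else.
second = run_pipeline(source, stages + [rename_root("purchase")])
```

When the requirement changes, the change is an edit to the list of stages; the existing stages stay as they are and can be tested in isolation.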

I see Clemens Vasters has caught the pipelining bug, and that .NET has had it going for a good while, in no small part thanks to Tim Ewald - WSE 1.0 supported a kind of in-memory pipeline for SOAP Envelopes; for Java folks it's not a million miles away from servlet filter chains.

Perhaps that's not representative, but it does seem that the .NET crowd gets the pipeline model. I'll go out on a limb here - I suspect that has something to do with MS programming culture being less inured to object orientation and object patterns. A key thing for XML pipelining is that you want to separate the data from the process acting on that data, which is heresy in some OO circles. The only process really tied to an XML document is schema validation, and even then the behaviour is so data driven, so late bound, that it's hardly worth picking out. Off the top of my head, I can think of only one OO pattern where it's ok to decouple data from behaviour, and that's the Visitor. It seems that at the system edges, where XML does matter, functional programming and lazy evaluation are the way to go.
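As a rough illustration of what keeping data separate from process can look like with lazy evaluation, here is a sketch using Python generators over the standard library's `iterparse`. The function names (`records`, `only`, `pick`) and the sample orders document are assumptions made up for this example. The records are plain dictionaries with no behaviour attached, and nothing is parsed until a consumer asks for the next item.

```python
import io
import xml.etree.ElementTree as ET


def records(source, tag):
    """Lazily yield each `tag` element as a plain dict of child tag -> text."""
    for _event, el in ET.iterparse(source, events=("end",)):
        if el.tag == tag:
            yield {child.tag: child.text for child in el}
            el.clear()  # release the subtree once the consumer has moved on


def only(rows, key, value):
    """Keep the rows whose `key` equals `value`; evaluated on demand."""
    return (row for row in rows if row.get(key) == value)


def pick(rows, *keys):
    """Project each row down to the named keys; evaluated on demand."""
    return ({k: row[k] for k in keys} for row in rows)


xml_bytes = (
    b"<orders>"
    b"<order><id>1</id><status>open</status></order>"
    b"<order><id>2</id><status>closed</status></order>"
    b"<order><id>3</id><status>open</status></order>"
    b"</orders>"
)

# Composing the stages does no work yet; parsing starts on the first request.
open_ids = pick(only(records(io.BytesIO(xml_bytes), "order"), "status", "open"), "id")
first_open = next(open_ids)
remaining = list(open_ids)
```

The functions know nothing about each other and the data carries no methods, so the stages can be reordered or swapped freely.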

The pipeline is the most important pattern/idiom in XML programming. The difference between it and the semantic web outlook is that any good XML hacker knows that transformation is also primary stuff, not something to be cast aside as a small matter of programming because the model theories can't support it.

February 22, 2004 09:57 AM

