

With the current furore over Andrew Tridgell reverse engineering the Bitkeeper wire protocol, it's interesting to note that the argument seems to be not over the wire protocol itself but over the access to the metadata that understanding the protocol enables. Tridgell has done this kind of thing before with Samba. I imagine Bitmover have every right to claim that the metadata is part of the product (presumably it's generated by the software), but it then seems to be difficult or impossible to manage the code without the metadata. Caveat emptor, then.

If so, the Linux kernel SCM argument ratifies the notion that data is the new lock-in. Who owns data and metadata and who has access to them is an important issue.

From a technical perspective, it's arguable that the higher you go up the programming language stack, the fuzzier the distinction between software and data becomes. If you had to look at a typical Java or C# system, it'd be clear enough for the most part what's data and metadata and what's code. Technologies like annotations make this fuzzier, but not impenetrably so. XSLT scripts can get fuzzy, as can systems utilizing code generation. A significant Lisp system could make for an interesting data ownership argument. Lisp advocates have been preaching code==data for decades. Consider that the configuration files for my emacs editor are in Lisp, or that using Python or Ruby source code to store configuration details (rather than XML) is a common idiom. Down the line, I can imagine a rules-driven system based on Topic Maps or RDF data being equally fuzzy.
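To make the configuration-as-source-code point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the setting names, the hosts, and the helper functions `load_config` and `split_data_and_code` come from no real system. It loads a configuration written as Python source and then tries to separate out the part that could be handed over as plain data.

```python
import json

# Hypothetical configuration stored as Python source rather than XML.
# All names and values here are made up for illustration.
CONFIG_SOURCE = '''
max_connections = 50
hosts = ["alpha.example.org", "beta.example.org"]

def retry_delay(attempt):
    # A "setting" that is really behaviour: exponential backoff, capped at 60.
    return min(2 ** attempt, 60)
'''


def load_config(source):
    """Execute the configuration source and return the names it defines."""
    namespace = {}
    exec(source, namespace)
    # Drop interpreter-supplied entries such as __builtins__.
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


def split_data_and_code(config):
    """Partition settings into plain values and callables."""
    data, code = {}, {}
    for name, value in config.items():
        if callable(value):
            code[name] = value
        else:
            data[name] = value
    return data, code


config = load_config(CONFIG_SOURCE)
data, code = split_data_and_code(config)

# The plain values export cleanly as JSON; the callable does not.
exported = json.dumps(data, sort_keys=True)
print(exported)
print(sorted(code))
```

The plain values serialize without trouble, but `retry_delay` is a setting only in name. Handing over the "data" in full would mean handing over source code, which is the ownership question in miniature.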

In short, a lot of innovation in enterprise and commercial software is about blurring the line between data and code. I would love to see those knowledgeable in Open Source, Web, Compliance and IT Governance matters pick up on this issue, and maybe focus less on software licensing. Most RFPs that pass my desk assume that what counts as the data in a system is largely obvious. Most folks have no doubt been set straight on music, but I would guess they still think that they own their IM conversations, their email, their weblogs, and their photos. It's not just that people won't own their data - it's not infeasible to imagine a situation where a software provider had to turn the code over, and give up a strategic technology advantage, to enable access to the data.

April 17, 2005 06:45 PM

