
Web services: Rosencrantz and Guildenstern are not dead

A popular view held by WS and MEST proponents is that for the purposes of message transmission (typically of SOAP messages), TCP and HTTP are equivalent. In other words, layers 4 and 7 of the OSI model are no different in Web services architecture (OSI layers 5 and 6, Session and Presentation, we'll ignore, since the Internet does). This is argued to be a good thing, since we move the processing semantics into SOAP headers, which will transcend the mere details of various transports.

I should point out that at the W3C at least, the ws-architecture group hasn't felt informed by the OSI stack or the Internet subset and has a working definition of transport that is roughly 'everything that delivers SOAP messages'. Which is convenient, though it may be somewhat confusing to those involved in networking or anyone who did distributed systems 101 at college.

Before we start, some history. A mistake was made in the past with the once widely held view in distributed systems that location transparency was a good thing - you shouldn't care where an object is on the network. No number of systems and networking engineers waving copies of Waldo and Emerson or protesting at the watercooler with the eight fallacies on placards was enough to persuade anyone who made decisions on this stuff that transparency was a bad idea. We had to build out systems and see them fail, just to be sure. We can argue that transport transparency is a similar error - highly desirable, but highly misleading.

So *is* there a useful difference? First, we need to be aware that 'application protocol' and 'transport protocol' are not well-defined sets, and what is meant can change depending on who you're talking to. On the other hand, this is computing and programming, not science and mathematics; we don't always need or expect precise definitions to make sense to each other. But let's try to be more precise and identify a distinction - Actors.

The primary difference between application and transport protocols is their *intended effect*. Application protocols like FTP, SMTP and HTTP have the rudiments of what are known as "performatives". In computing terms, a performative is a highly formalized action word, or verb. 'Highly formalized' in turn means a computer program could make a decision on what to do based on the word along with some surrounding context (such as who said it and when it was said). You use a performative solely to influence another entity to do something on your behalf. The sent message in combination with the performative is designed to influence the receiver. Compare that with the notion of a medium. A medium in this sense is that which carries the performative message.
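To make 'a program could make a decision based on the word plus context' concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `Message` fields, the `ALLOWED_VERBS` policy, the five-minute staleness window and the wording of the decisions. It is not how any real protocol stack is implemented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Message:
    performative: str   # the action word, e.g. "GET"
    sender: str         # who said it
    sent_at: datetime   # when it was said
    content: str        # what the action word applies to


# Hypothetical policy: which senders may use which action words.
ALLOWED_VERBS = {"alice": {"GET", "POST"}, "guest": {"GET"}}
# Hypothetical rule: anything older than this is ignored.
MAX_AGE = timedelta(minutes=5)


def decide(msg: Message, now: datetime) -> str:
    """Decide what the receiver does, from the verb plus its context."""
    if now - msg.sent_at > MAX_AGE:
        return "ignore: stale"
    if msg.performative not in ALLOWED_VERBS.get(msg.sender, set()):
        return "refuse"
    if msg.performative == "GET":
        return f"return a representation of {msg.content}"
    if msg.performative == "POST":
        return f"process the submission to {msg.content}"
    # Fallback for a verb that is permitted but has no handler here.
    return "not understood"


now = datetime(2003, 4, 9, tzinfo=timezone.utc)
print(decide(Message("POST", "guest", now, "/orders"), now))  # refuse
```

The same content with a different verb, sender or timestamp gets a different decision; the medium that carried the message plays no part in `decide`.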

Transport protocols aren't like this - they're not messenger boys, they're delivery boys. Transport protocols don't make the same use of action words. For example, TCP uses terms such as SYN, ACK and FIN, which are more like grunting than conversation. Claiming they're the same class of animal as application protocols is like claiming black is the same colour as white by progressing through infinite shades of grey; an interesting but crashingly trivial rhetorical trick.

So what makes application protocols distinct from transports (and much more useful and interesting) is that when you use an application protocol you are trying to get another system to do something for you by asking it. In English we do this with action words, or verbs. In application protocols we use action words too, but just a handful. HTTP has 8, but the vast majority of the web is getting things done with just three of them, 'get', 'post' and 'connect'.
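The contrast can be seen in how each protocol spells its vocabulary on the wire. The sketch below is illustrative only: the flag bit values are the ones in the TCP header (RFC 793) and the eight methods are those defined for HTTP/1.1 (RFC 2616), while the helper names `tcp_flag_names` and `parse_request_line` are my own and not part of any library.

```python
# TCP control bits as they appear in the segment header (RFC 793).
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

# The eight methods defined by HTTP/1.1 (RFC 2616).
HTTP_1_1_METHODS = frozenset(
    {"OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"}
)


def tcp_flag_names(flag_bits: int) -> list[str]:
    """Decode the flag bits of a TCP segment: state signalling, no request."""
    return [name for name, bit in TCP_FLAGS.items() if flag_bits & bit]


def parse_request_line(line: str) -> tuple[str, str, str]:
    """Split an HTTP request line into (verb, target, version)."""
    method, target, version = line.strip().split(" ")
    if method not in HTTP_1_1_METHODS:
        raise ValueError(f"unknown action word: {method}")
    return method, target, version


print(tcp_flag_names(0x12))                          # ['SYN', 'ACK']
print(parse_request_line("GET /index.html HTTP/1.1"))
```

The TCP flags describe the state of the connection itself. The HTTP request line names an action and the thing it should be applied to, which is what lets the receiver treat it as a request to do something.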

In the world of computer protocols, application protocols deal with performatives and transports are media. The protocol-neutral school of thought would have us treat Internet protocols not as protocols, but as media.

Application protocols aren't the only things that use a controlled set of well-defined action words. SQL is based around a few. Space- or tuple-based systems such as JavaSpaces and Linda are based on action words. Most of the things you do with files and directories on your computer only require a few verbs (new, move, read, write, delete, copy).

In Internet protocols you can't just make up new action words; they're usually a controlled set, and extended carefully. The main reason to add a new verb to an Internet protocol is to avoid corrupting the meaning of an existing one, but that's no guarantee of adoption by implementations. Much of the thinking around making the most out of a few verbs is that it induces useful properties in a system, such as scalability and cheap global coordination. This would be very much part of REST doctrine, known there as the 'uniform interface'. Allowing anyone to make up new verbs seems like a good idea initially and may work for local or 'gated' cases (like a software component, an API or a chunk of object middleware), but imposes costs on all the other entities in the system to learn the new verbs and cross-reference them with the objects they can be applied to. As the number of possible conversations increases, the cost of communications quickly gets out of hand and ceases to be sustainable. The counter-argument to uniformity is that no-one ever has to talk to everyone else, so the N-squared problem is theoretical. To which one could counter, no-one ever has to use all the possible 'transports', so SOAP transport transparency is in turn solving a theoretical problem (some study of how actual communications networks emerge and coalesce into hub and spoke models can be revealing here). The point of course is that the problems are not theoretical and that you do not have to reach an asymptotic limit in either case to feel a pinch.
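A back-of-the-envelope version of the N-squared argument, as a sketch. The cost model is my own simplification and an assumption throughout: every participant either shares one fixed verb set, or invents its own private verbs that every other participant it talks to has to learn. Real networks are messier, so read the numbers as shape rather than measurement.

```python
def uniform_cost(participants: int, shared_verbs: int) -> int:
    """Total verbs learned when everyone shares one fixed vocabulary."""
    return participants * shared_verbs


def custom_cost(participants: int, verbs_each: int, peers: int) -> int:
    """Total verbs learned when each participant has private verbs
    and talks to `peers` other participants (peers <= participants - 1)."""
    if peers > participants - 1:
        raise ValueError("a participant cannot have more peers than others exist")
    return participants * peers * verbs_each


# Hypothetical numbers: 100 participants, 8 shared verbs vs 5 private verbs each.
print(uniform_cost(100, 8))        # 800
print(custom_cost(100, 5, 99))     # 49500, everyone talks to everyone
print(custom_cost(100, 5, 10))     # 5000, each talks to only 10 peers
```

Even with each participant talking to only a tenth of the others, the private-verb case costs several times the uniform one under these assumed numbers, which is the 'pinch' well short of the asymptotic limit.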

The main concerns of a WS architect, if we did accept a difference between transport and application, are twofold. First, that the architectural model for Web Services has a serious hole - SOAP does not run over all things equally. Second, that the meaning of the SOAP message changes depending on the protocol it runs on, which is disturbing in a Heisenberg Uncertainty kind of way. If we have a payload with a FIPA-ACL 'inform' or getStockQuote stuffed into a SOAP envelope which is in turn stuffed into an HTTP envelope, we don't expect 'inform' or getStockQuote to mean different things depending on where it is. I don't think anyone wants this. Even hard core RESTafarians would probably agree there's value in being able to span systems with the meaning of SOAP headers intact. On the other hand, it's not clear it's avoidable any more than probability is avoidable in subatomic physics. Most of the Internet application protocols that one might want to run SOAP across involve actors in the roles of client and server who are communicating using performatives (or at least a version of performatives analogous to grunts and utterances).

Web service and MEST architectures seem intent on modelling systems solely in terms of SOAP actors. There is little chance of giving up on protocol independence - it's too desirable a property. Fair enough. An application protocol changing the meaning of a payload is not a desirable outcome. But that is not to say there is no room for dissonance between the word games played at the SOAP level and what SOAP is being transmitted over. It's engaging in bogosity akin to that of location transparency to pretend that two SOAP actors having the conversation 'Play' over UDP will be playing the same language game as the same SOAP actors having the conversation 'Play' over HTTP, because the client and server actors in HTTP are there too, having their *own* conversation and acting as middlemen.
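A sketch of what the two cases look like as bytes, with nothing actually sent. The `Play` element, its `urn:example:theatre` namespace, the host `example.org` and the path `/play` are all invented for illustration; the envelope namespace, the `text/xml` content type and the `SOAPAction` header are those of the SOAP 1.1 HTTP binding.

```python
SOAP_ENVELOPE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><Play xmlns="urn:example:theatre"/></soap:Body>'
    '</soap:Envelope>'
).encode("utf-8")


def as_udp_payload(envelope: bytes) -> bytes:
    """Over UDP the datagram payload is just the envelope; no other actors speak."""
    return envelope


def as_http_request(envelope: bytes, host: str, path: str) -> bytes:
    """Over HTTP the envelope rides inside a POST, which is itself a performative."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: text/xml; charset=utf-8\r\n"
        f"Content-Length: {len(envelope)}\r\n"
        'SOAPAction: ""\r\n'
        "\r\n"
    ).encode("ascii")
    return head + envelope


datagram = as_udp_payload(SOAP_ENVELOPE)
request = as_http_request(SOAP_ENVELOPE, "example.org", "/play")
print(request.decode("utf-8").split("\r\n")[0])  # POST /play HTTP/1.1
```

The envelope is byte-for-byte identical in both, but the HTTP version adds a verb, a target and the expectation of a status code in reply. Those belong to the HTTP client and server, not to the SOAP actors.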

Hamlet just isn't the same play without Rosencrantz and Guildenstern.

April 9, 2003 12:04 AM

