
Foundations for component and service models

[This post was inspired by the article Evolving Java-based APIs, and by how misguided its principles would be for the web or for components. It's biased toward Java, but the principles apply beyond Java.]

What's the problem

The core problem of component and service architectures is easy to express: how do I change a published interface without breaking the callers?

Java best practices won't help

There's no nice way to say this - idiomatic Java is not good for component or service architectures, which might suggest that idiomatic Java has no business either on the Web or in component architectures. Please note the use of the word idiomatic - we have a lot of learning to do in a short time. But there are a few ways to mitigate things: by looking at what other programming languages have done, and in particular by looking at what are arguably the best existence proofs we have of loosely coupled component architectures - Internet protocols.

Avoid changing or extending the interface methods

Use highly generic method calls that are qualified with metadata - this is how SMTP and HTTP work and is often cited as a reason for their phenomenal success. If you're lucky, the remoting you happen to be working with will be organized this way. You'll notice that this is the polar opposite of models such as the EJB spec, which actively encourage you to pile on the methods and go n^2.
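As a rough sketch of what a metadata-qualified interface can look like in Java - the type and method names here are invented for illustration, not taken from any spec:

    import java.util.Map;

    // One published operation; intent travels as metadata plus a payload,
    // the way an HTTP request carries a method, headers and an entity body.
    interface Message {
        Map<String, String> headers();   // metadata qualifying the call
        byte[] body();                   // the payload document
    }

    interface Handler {
        Message handle(Message request); // the only method callers bind to
    }

Clients couple to one signature; everything that changes over time lives in the headers and documents, where change is cheapest to absorb.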

The thing to realize is that every time we add a non-standard method to an object's interface, we force a cost on every possible client of that object to understand that method. The cost of integration rises as a function of dependent methods in the system, not dependent objects. This only ever makes sense if you own all the endpoints (as is often the case with J2EE projects starting out). It makes no sense if you don't, and over the life of the system it will make less and less sense. Count the number of published method names in all the third-party APIs you use in your code to get an idea of how tightly coupled you are. Then count the number of published methods in your own APIs to get an idea of how tightly someone can get coupled to yours.

In systems based around services or components, this cost can quickly get out of hand, compounded by the fact that the methods are not usually constrained by a protocol or exchange pattern - the semantics are not uniform. This is why RPC-based web services require choreography, and why protocol neutrality is an architectural defect, not a feature, of web services.

The issue with Java (or C#, or C++) is that if you're like me, you've been conditioned to think in terms of objects, not methods. So when it comes to determining how coupled the system is, you're liable to look at package- or object-level dependencies, but the real damage is happening at the method call level. For example, the JDepend tool calculates coupling based on static imports, not method calls.

Control change by using a dictionary interface

If the idea of a once-and-only-once component interface sounds impossible, there is a working compromise: design the interface as a dictionary (i.e. prefer a Map over a Bean). I've seen Python and Lisp code that does this well - both have good support for meta-class hacking, and in the Lisp world it's sometimes called data-driven programming. Java can do this by using method names as map keys and using reflection to invoke the method against its object.
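A minimal sketch of that reflective dispatch, with invented class and method names - the point is that callers depend on one invoke() signature and a vocabulary of operation names, not on compiled-in method signatures:

    import java.lang.reflect.Method;
    import java.util.Map;

    class DictionaryInterface {
        private final Object target;

        DictionaryInterface(Object target) {
            this.target = target;
        }

        // Look the operation up by name and call it reflectively; the target
        // is any object whose public methods take a Map of arguments.
        Object invoke(String operation, Map<String, Object> args) throws Exception {
            Method m = target.getClass().getMethod(operation, Map.class);
            return m.invoke(target, args);
        }
    }

So a target exposing, say, a public Object ping(Map args) method is reached as invoke("ping", params), and new operations arrive as new names rather than new signatures.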

However, Java makes this approach difficult. To do it well, functions really ought to be first-class elements of the language so we can bind them to names and pass them around as arguments. The reflection API goes some way toward addressing that, as I mentioned, but it's a complicated kludge compared to what's available in other languages. By the way, the only Java API I know of that's designed this way is JSR-187, but it's stalled at the moment. The Command pattern and plugin lifecycle APIs are a step toward the idea of a uniform object interface, and anyone who has worked with these patterns will understand their value (imagine programming against Eclipse without the plugin API) - the problem is that their uniformity is only local to an implementation. The API that has gotten the furthest with this approach is JavaSpaces, which standardized the programmatic interface to a tuple space (think of it as a poor man's Internet protocol).

Calls should return documents not objects

There isn't much point in having a coarse-grained component or service architecture if all you do is use it to gain pointer access to fine-grained objects. Again, this is a failing of the bean/DTO style best practices of J2EE, which arose as an optimization technique for calling over RMI-IIOP rather than from analysis of how to control change between tiers. Thankfully, web services architects have been back-pedalling from RPC to doc/lit over the last year.
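To make the contrast concrete, here is a hedged sketch - the service and type names are made up:

    // Object-graph style: the caller is bound to OrderDetails, its fields,
    // and whichever version of those classes is on its classpath:
    //   OrderDetails getOrder(OrderKey key);

    // Document style: the caller is bound only to an agreed data format.
    interface OrderService {
        String getOrder(String orderId);  // returns an XML document as text
    }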

Avoid binary compatibility

I can't emphasize this strongly enough. Any system that requires binary compatibility is non-scalable in terms of adoption, management and change responsiveness. I submit that J2EE's single biggest architectural failing is forcing clients into binary compatibility and thus lockstepped upgrades. To be fair, .NET is not much better.

Web services are in danger of regressing to the same situation. This week SOAP 1.2 went to recommendation status, making SOAP an application of the XML Infoset, not of XML. This is a bug, not a feature. Previously, SOAP 1.1 was an application of XML. With XML, you could at least read my data with off-the-shelf parsers - once we agreed on the data. If I start sending you a non-XML stream in the guise of Infoset, you'll need a decoder to make the stream emit Infoset events - which means we need to agree on codecs on top of agreeing on data. You also need a decoder that works in your system, not mine - since what works in my system doesn't matter a damn to you. Naturally the pair of codecs need to be tested together - whereas before, all you had to do was parse my XML. It only takes a handful of these codecs to derail interoperability.

At this point SOAP itself is no longer the interoperation hotspot; what will matter are the codecs needed to unpack an Infoset. At best this is interoperation at the level of RMI-IIOP, DCE or DCOM. At worst it's interoperation at the level of device drivers. You'll also note that it's easier to interoperate using codecs from a single source - which positions the codec provider to lock in the users of those codecs. In fact, given what we know to date about building distributed systems (not designing them, building them), that's very much the point. The W3C have made a mistake in allowing that change to go through.

Don't confuse an API with a contract

Don't compose contracts from APIs; compose them from protocols delivering data. The key point is that an API is an insufficient means of agreeing a contract, at least in normal usage. Contracts are better modelled as an ordered exchange of documents.

Version the contract

Versioning is something we're not good at as an industry. JAR hell is little or no improvement over DLL hell. This is a topic I'll go into in greater detail another time, since it is so problematic. But in my experience, at the level of components and services, the right approach is to version the protocol and the shared data format, not the API signatures (how you version those is local to you). Seeing @deprecated in a component's published interface is an indication that something has gone wrong. Again, this is a good argument for publishing only highly generic methods.
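One way to picture that, as a sketch with invented names: the version travels inside the exchanged document, and the single published call stays put.

    // Readers branch on the version declared in the document, not on a new
    // method signature. Illustrative only - a real reader would use an XML
    // parser rather than string matching.
    class DocumentReader {
        String read(String document) {
            if (document.contains("version=\"1.1\"")) {
                return "handled with the 1.1 rules";
            }
            return "handled with the 1.0 rules";
        }
    }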

Don't build an API for data transfer

This is subtle, since there's no clear boundary between transfer and invocation (transfer versus invocation is arguably the point where the REST and SOA styles depart). But if your component's or service's job is essentially to shuttle information rather than to perform computation or do work that involves significant state management, you may be better off placing the component on the web and integrating clients via a transfer protocol such as HTTP. Amazon's REST-style API is a good example.
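For instance, a client integrating over HTTP needs little more than the following - the resource URL is hypothetical:

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Integration is a GET on a URL plus an agreed representation,
    // not a generated stub bound to someone else's method signatures.
    public class FetchOrder {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://example.org/orders/42"); // hypothetical resource
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/xml");
            try (InputStream in = conn.getInputStream()) {
                System.out.write(in.readAllBytes());
                System.out.flush();
            }
        }
    }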

As another example, there's probably no reason to build a weblog API via something like XML-RPC or SOAP, given the ubiquity of HTTP and the fact that every blog on the planet is web accessible by default, unless you're angling to lock clients into your service by stealth.


June 28, 2003 12:47 PM

Trackback Pings

Listed below are links to weblogs that reference Foundations for component and service models:

» Foundations from Raw Blog
I've been thinking of how the API side of the Echo project. I agree with Shelley that it's a pretty... [Read More]

Tracked on June 29, 2003 11:38 AM

» web services: contracts and coupling from Ted Leung on the air
I'm working through a backlog of interesting posts. Sometimes that produces interesting juxtapositions, like this old post by Bill de Hora on component and service models and this new article by Rich Salz on Typeless Schemas and Services. Both a [Read More]

Tracked on September 7, 2003 08:42 AM

» Java and Web Services are not a match made in heaven from Random Stuff
The more I think about this, and read about it, e.g. here, or here, or think about it in this context, the more firmly I believe that Java, or any strongly typed programming language, is a good vehicle for building [Read More]

Tracked on September 11, 2003 08:54 PM