
Rates of decay: XMPP Push, HTTP Pull

Mike Herrick:

"Clearly Atom can't replace ALL pub/sub use cases, but for every day integration architecture where you want business events / EDA why can't we use Atom feeds? In an extreme case, you might have an event sink requesting the feed every 10 seconds - in most cases every 10 minutes would likely be fine?

Who is doing this today? Any lessons from the trenches?"

One thing. The sliding window. If the publisher is publishing entries faster than the clients are reading them, and the publisher is limiting the number of entries in the feed at any given time, then clients are prone to missing entries. Actually you don't need a fast publisher; a downed client will do.
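
To put rough numbers on it (these are made up, purely to show the shape of the problem):

    # Back-of-the-envelope sliding window arithmetic, with invented numbers:
    # the feed retains the newest 20 entries, the publisher emits 5 entries
    # a minute, and the client polls every 10 minutes.
    window_size = 20       # entries the publisher keeps in the feed at any time
    publish_rate = 5       # entries per minute (assumed)
    poll_interval = 10     # minutes between client polls (assumed)

    published_between_polls = publish_rate * poll_interval     # 50 entries
    missed = max(0, published_between_polls - window_size)     # 30 entries gone
    print(f"{missed} entries fell out of the window between polls")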

What you really don't want to do in that case is introduce temporal coupling - synchronizing the publishing and polling rates so that there are no window misses. That stops independent evolution.

The second-to-last time I had to solve this kind of integration problem (2004), events were going to be arriving at a fair clip, and the sliding window is why my colleague and I settled on XMPP as the application protocol.

Another thing to note is that with XMPP you won't need an Atom feed; pushing individual Atom entries is all you have to do. This led me to conclude that the "feed" is a construct specific to HTTP, or even to client/server polling in general. I am hence glad that Atom allows an Entry to be its own document.
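
For illustration, here's a minimal sketch of building a standalone Entry document with the Python standard library; the field values are invented, and the actual XMPP delivery (say, as a pubsub item payload) is left out:

    import xml.etree.ElementTree as ET

    ATOM_NS = "http://www.w3.org/2005/Atom"
    ET.register_namespace("", ATOM_NS)

    def standalone_entry(entry_id, title, updated, author, body):
        """Build a standalone Atom Entry document (no enclosing <feed>),
        suitable for pushing as the payload of an XMPP message or pubsub item."""
        entry = ET.Element(f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
        ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
        author_el = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
        ET.SubElement(author_el, f"{{{ATOM_NS}}}name").text = author
        content = ET.SubElement(entry, f"{{{ATOM_NS}}}content")
        content.set("type", "text")
        content.text = body
        return ET.tostring(entry, encoding="unicode")

    # Example call (all values invented):
    # standalone_entry("urn:uuid:60a76c80-d399-11d9-b93c-0003939e0af6",
    #                  "order accepted", "2007-08-20T21:45:00Z",
    #                  "billing", "order 42 accepted")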

Now that I know about feed paging, I would reconsider using HTTP - although push over XMPP is fast and has interesting integration characteristics, such as event routing to ad-hoc listeners. Feed paging allows you to walk back through a feed archive; going from the "current" feed URL to a succession of "previous" feeds.

In that case, if your client goes offline, then when it comes back up it pulls down the current feed. It can then check the "previous" URL, and if it doesn't recognize it, pull that down too and continue back-paging until it reaches a "previous" it already knows. The client then resets its "previous" pointer and, within each previously unknown feed, saves the entries it doesn't already have (each Atom entry has a unique id). Clients do have to keep pager state, and probably a decent number of previously seen entry ids.
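
Roughly, the catch-up loop might look like this (a sketch only; it assumes the archive link relation is "previous" or "prev-archive", and it skips HTTP error handling and persisting the pager state between runs):

    import urllib.request
    import xml.etree.ElementTree as ET

    ATOM = "{http://www.w3.org/2005/Atom}"

    def fetch_feed(url):
        """Fetch and parse an Atom feed document."""
        with urllib.request.urlopen(url) as resp:
            return ET.parse(resp).getroot()

    def prev_link(feed):
        """Return the href of the feed's 'previous' archive link, if any."""
        for link in feed.findall(ATOM + "link"):
            if link.get("rel") in ("previous", "prev-archive"):
                return link.get("href")
        return None

    def collect_new(feed, seen_entry_ids):
        """Pull out entries whose atom:id we haven't saved before."""
        new_entries = []
        for entry in feed.findall(ATOM + "entry"):
            entry_id = entry.findtext(ATOM + "id")
            if entry_id not in seen_entry_ids:
                seen_entry_ids.add(entry_id)
                new_entries.append(entry)
        return new_entries

    def catch_up(current_url, known_archive_urls, seen_entry_ids):
        """Walk back from the current feed through 'previous' pages until
        hitting an archive page we've already seen, saving unknown entries."""
        feed = fetch_feed(current_url)
        new_entries = collect_new(feed, seen_entry_ids)
        url = prev_link(feed)
        while url and url not in known_archive_urls:
            feed = fetch_feed(url)
            new_entries.extend(collect_new(feed, seen_entry_ids))
            known_archive_urls.add(url)
            url = prev_link(feed)
        return new_entries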

Feed paging is likely to become an Internet Standard.

Consequently, the last time I solved this problem, I chose feed paging; the publish rate was also lower. The design was liked, but not taken up for scheduling reasons. The alternative design another colleague and I gave was to reverse the data flow to a push style - POST events to a URL (it was not Atompub, but the endpoint did have to return a URL). The work estimate for the POST solution came in lower than implementing a feed pager and archiving feeds according to that model. There was another reason for the pager approach, though; it was deemed easier to secure GET to a feed URL than POST to an endpoint.
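
For comparison, a minimal sketch of the push alternative; the Content-Type and the idea that the returned URL comes back in the Location header are my assumptions, not details from the actual design:

    import urllib.request

    def post_event(endpoint_url, atom_entry_xml):
        """POST a single Atom entry document to the event endpoint and return
        the URL the server hands back (assumed here to be the Location header)."""
        req = urllib.request.Request(
            endpoint_url,
            data=atom_entry_xml.encode("utf-8"),
            headers={"Content-Type": "application/atom+xml;type=entry"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return resp.headers.get("Location")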

Either way, XMPP and HTTP and Atom are all you need for this class of problem.

If I were a bookie, I'd close bets and pay out Sam's long bet as far as XMPP goes.


August 20, 2007 09:45 PM
