
APP on the Web has failed: miserably, utterly, and completely

In his post "Why GData/APP Fails as a General Purpose Editing Protocol for the Web" Dare Obasanjo says

"I thought it would be useful to describe the limitations we saw in the Atom Publishing Protocol which made it unsuitable as the data access protocol for a large class of online services. "

and lists three issues with the Atom Protocol's data model.

  1. Mismatch with data models that aren't microcontent
  2. Lack of support for granular updates to fields of an item
  3. Poor support for hierarchy

The post is a good read, and informative, but the title and the above quotation have something of Chicken Little about them. Let's go through Dare's three problems, provide some options for dealing with them, and then state two further problems with APP that are indeed worth thinking about.

  1. Mismatch with data models that aren't microcontent

    "I guess we could keep the existing XML format used by the Facebook REST API and treat the user documents as media resources. But in that case, we aren't really using the Atom Publishing Protocol, instead we've reinvented WebDAV. Poorly."

    Actually do treat it as a media entry; it'll work fine.

    Here's some speculation about formats. First, an awful lot of needless custom markup formats are going to be replaced by Atom entries; a good example is anything that looks like an event. Yes, some fields become pointless (atom:summary being an example I keep running into), but I'd say the problem of carrying around some junk DNA fields is outweighed by not starting over, plus you get integration with the planet's syndication technology for free, for some definition of "free". Second, anything that looks like a bag of descriptive metadata (and Facebook markup about users is exactly that) should be starting at RDF and working back to custom only based on real needs. The problem here is that the markup describes more than the user; it reflects what Facebook's feature set can do. Facebook then risks having to rev the data as part of the platform*, whereas something like FOAF would ameliorate much of that and allow people to concentrate on work that's actually valuable.

    The acid test here is whether Facebook's custom format is worthy of a media type. If it is, it probably has a reason to exist.

    [Incidentally, APP + non-Atom content strikes me as nothing like WebDAV; I'd like to hear more about that.]

  2. Lack of support for granular updates to fields of an item

    "Thus each client is responsible for ensuring that it doesn't lose any XML that was in the original atom:entry element it downloaded. The second problem is more serious and should be of concern to anyone who's read Editing the Web: Detecting the Lost Update Problem Using Unreserved Checkout. The problem is that there is data loss if the entry has changed between the time the client downloaded it and when it tries to PUT its changes."

    The solution at the protocol level is PATCH. In other words, this is not just a data problem. Using PUT to send deltas mucks about with PUT semantics in too subtle a way. The correct choice in that case is to choose a new method, not overload an existing one that has "nearby" semantics. Assuming it's really needed, it might take a few years to see proper support for PATCH - no doubt we'll see some ropey ideas rolled out in the meantime such as diff annotations in formats or method override headers.

    At the data level, Atom presents challenges; there's a minimum set of elements you need for an entry to be valid, but the truth is that general purpose deltification support across formats is a hard problem - just deltifying XML infosets alone is a hard problem. If you want to do this above the byte level, with data elements rather than offsets, again I'd say to look at RDF. Every RDF statement and collection of statements is a graph, and all its operations are closed under graphs. RDF is thus ideal for granular updates, including sending incomplete data sets in the first place.
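To make the graph-closure point concrete, here is a minimal sketch that models a graph as a plain Python set of (subject, predicate, object) triples. It is not an RDF library and not a standard delta format; the function names `diff` and `apply_delta` and the example URI are made up for illustration, and it ignores blank nodes, which make real RDF diffing harder.

```python
# A graph is modelled as a set of (subject, predicate, object) triples.
# Illustration only: not an RDF library, not a standard delta format.

def diff(old, new):
    """Return (removed, added); both are sets of triples, i.e. graphs themselves."""
    return old - new, new - old

def apply_delta(graph, removed, added):
    """Apply a delta to a graph; the result is again a graph."""
    return (graph - removed) | added

user = "http://example.org/users/1"  # hypothetical URI
v1 = {
    (user, "foaf:name", "Alice"),
    (user, "foaf:mbox", "mailto:alice@example.org"),
}
v2 = {
    (user, "foaf:name", "Alice"),
    (user, "foaf:mbox", "mailto:alice@example.com"),
}

removed, added = diff(v1, v2)
# Only the changed statement needs to travel, not the whole description.
assert apply_delta(v1, removed, added) == v2
```

The delta is expressed in data elements (statements), not byte offsets, and an incomplete set of statements is still a well-formed graph.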

    That said, once the client is sending a PATCH request, the intent is explicit regardless of the format in play; that includes servers being able to say they do/don't support that instead of trashing content.
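The difference in intent can be sketched with a toy in-memory resource. No real HTTP stack is involved, the `Resource` class is hypothetical, and the patch body used here (a dict of changed fields) is an assumption for illustration only, not a proposed patch format.

```python
# Toy in-memory model of one resource; no real HTTP stack involved.
# The dict-of-changed-fields patch body is an assumption made for illustration.

class Resource:
    def __init__(self, fields, supports_patch=True):
        self.fields = dict(fields)
        self.supports_patch = supports_patch

    def handle(self, method, body):
        """Return an HTTP-style status code for the request."""
        if method == "PUT":
            # PUT means "this body is the complete new state".
            self.fields = dict(body)
            return 200
        if method == "PATCH":
            if not self.supports_patch:
                # The server refuses explicitly instead of trashing content.
                return 405
            # PATCH means "apply these changes to the current state".
            self.fields.update(body)
            return 200
        return 405

entry = Resource({"title": "Hello", "summary": "First post"})

entry.handle("PATCH", {"title": "Hello, world"})
# summary survives the PATCH; only title changed.

entry.handle("PUT", {"title": "Hello again"})
# summary is gone, because PUT replaced the whole state.
```

A server that receives a partial body via PUT cannot tell a delta from a deliberate deletion of the missing fields; with a distinct method it can, and it can also decline.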

    In fact this came up this year in the atom-syntax working group as a design issue. I feel the atom working group made the right choice not trying to standardize it yet. Frankly, part of me sees this concern as somewhat Enterprisey; the kind of requirement only a WS-* standards group could care about. But if it's a real problem, I suspect it can be dealt with without running off and defining half-baked custom protocols.

  3. Poor support for hierarchy

    "The Atom data model is that it doesn't directly support nesting or hierarchies. You can have a collection of media resources or entry resources but the entry resources cannot themselves contain entry resources."

    I wouldn't say "poor" so much as non-existent. So I agree, and have banged my head against representing hierarchical data with Atom (or any RSS) in the past.

    It turns out the solution is provided by Microformats: send an XOXO map file in the body of an Entry (or directly as a Media Entry). You can choose to inline all the data in the XOXO, provide basic metadata in description lists, or just provide links. There's not much point trying to force Atom Entries and Feeds to represent something they're not designed for.
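As a rough sketch of what that could look like, the following renders a nested structure as XOXO-style nested lists (an ordered list carrying the class "xoxo", with child lists inside list items). The `to_xoxo` helper and the sample tree are invented for illustration; it only inlines labels and leaves out links and description lists.

```python
from html import escape

def to_xoxo(nodes, top=True):
    """Render a list of (label, children) pairs as nested XOXO-style lists."""
    # Only the outermost list carries the "xoxo" class.
    attr = ' class="xoxo"' if top else ""
    items = []
    for label, children in nodes:
        nested = to_xoxo(children, top=False) if children else ""
        items.append(f"<li>{escape(label)}{nested}</li>")
    return f"<ol{attr}>{''.join(items)}</ol>"

# Hypothetical hierarchy: two top-level nodes, the first with two children.
tree = [
    ("Projects", [("Atom", []), ("RDF & FOAF", [])]),
    ("Archive", []),
]
xoxo = to_xoxo(tree)
```

The resulting markup would be what travels in the body of the Entry; the Entry itself stays flat and only carries the hierarchy as content.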

All that said, I'm very happy to see real implementors provide some pushback on the Atom Protocol for their needs. However, going on to claim GData/APP has failed is a random enough conclusion, especially for the problems mentioned, one of which is a deliberate design exclusion (for now). If these are the most serious problems encountered inside MSFT, it strikes me that APP's overall design is in good shape. Given the level of thought and discussion he indicates has gone on inside MSFT, I'm surprised Dare didn't mention these two issues, which strike me as much more substantial:

  1. Update resumption: some clients need the ability to upload data in segments. Aside from a poor user experience and general bandwidth costs, this is important for certain billing models; otherwise consumers have to pay on every failed attempt to upload a photo. APP doesn't state support for this at all; it might be doable using HTTP more generally, but to get decent client support you'd want it documented in an RFC at least.
  2. Batch and multi-part uploads: This feature was considered and let go by the atom-syntax working group. The reason was that processing around batching (aka "boxcarring") can get surprisingly complicated. That is, it's deceptively simple to just say "send a bunch of entries". Still, it would be good to look at this at some point in the future.
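For the first of these, the idea of resuming from wherever the server got to can be sketched as a toy in-memory model. Nothing below is defined by APP or taken from any HTTP specification; `UploadSession`, `upload`, and the `fail_after` parameter are hypothetical names used only to show the shape of the problem.

```python
# Toy model of resumable upload; class and function names are hypothetical
# and nothing here is defined by APP.

class UploadSession:
    def __init__(self, total_size):
        self.total_size = total_size
        self.received = bytearray()

    def offset(self):
        """How many bytes the server already holds."""
        return len(self.received)

    def append(self, start, chunk):
        """Accept a segment only if it continues exactly where we left off."""
        if start != self.offset():
            return False
        self.received.extend(chunk)
        return True

    def complete(self):
        return self.offset() == self.total_size


def upload(session, data, segment_size, fail_after=None):
    """Send data in segments, starting from the server's current offset.

    fail_after simulates a dropped connection after that many segments.
    """
    sent = 0
    while not session.complete():
        if fail_after is not None and sent == fail_after:
            return False
        start = session.offset()
        session.append(start, data[start:start + segment_size])
        sent += 1
    return True

payload = bytes(range(256)) * 4                 # 1024 bytes of sample data
session = UploadSession(len(payload))

first = upload(session, payload, segment_size=300, fail_after=2)  # drops after 600 bytes
second = upload(session, payload, segment_size=300)               # resumes at offset 600
```

The client pays for the first 600 bytes once; the retry only sends the remainder. The hard part in practice is agreeing on how client and server exchange the offset, which is exactly what would need documenting.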

* I'd like to think inventing a custom format to describe a user that is lockstepped to a platform was part of a platform play, or even technical resistance because of using databases for storing arbitrary graphs - anything really. More likely it was lack of knowledge/research and/or FUD about RDF. Oh well, at least we know what's down that road.

The title is taken from Mark Pilgrim's article "XML on the web has failed"

June 9, 2007 11:21 AM


Dare Obasanjo
(June 9, 2007 02:14 PM #)

Thanks for the response. With regards to the two extra points you raised

1.) Being able to do partial and resumable updates seemed to be outside the scope of Atom since I'd only encountered this problem uploading large media types. And HTTP has some mechanisms for supporting these scenarios.

2.) For batch uploads, I don't think we have an agreement across the board on what the semantics or requirements around batching are, so it was kinda premature to try to standardize it then critique APP for not supporting it. This will eventually be an issue as well. We're just not there yet.

Koranteng Ofosu-Amaah
(June 9, 2007 02:46 PM #)

A funny thing. You must have posted between the time I started drafting my comment to Dare and when I hit post to submit said comment. I wonder if I'd have even bothered if I'd seen your bits.

John Cowan
(June 11, 2007 12:58 AM #)

Your PATCH is just a particular case of POST. There is nothing about POST that requires it to create a new resource rather than (or in addition to) modifying the old.

Fred Blasdel
(June 18, 2007 06:49 PM #)

Hell, at least Atom isn't RSS, and hasn't been fucked by the likes of Dave Winer. That's really what it's got going for it. It's the SVN to RSS's CVS.
