
Social networks, web publishing and strategy tax

On Failure

Stefan Tilkov paraphrased my responses to Dare's post on the Atom Protocol as:

"Bill de hÓra acknowledges that the third is indeed missing from APP, considers the second problem a general issue with PUT, and disagrees about the first one; but he adds two more problems: update resumption and batch/multi-part uploads."

To recap, the issues Dare raised are:

  • Mismatch with data models that aren't microcontent
  • Lack of support for granular updates to fields of an item
  • Poor support for hierarchy

Stefan is a connector across a number of communities, so I'd like to qualify his reduction as follows:

  1. Atom, as Joe points out, is more than an envelope; it's content. As I pointed out, valuable formats - ones with media types, and not just the usual blogging suspects - are properly supported in APP. Lolcats won't be a problem.
  2. Use PATCH. More on this below.
  3. I do not think Atom is a good format for hierarchical data, but it's not clear to me that's a problem (certainly it's not a protocol level problem). You probably want to start with a placeless model as APP/Atom does and declare hierarchies and maps out of band. There are all kinds of options for this that will work within the APP constraints.

Perhaps the title of my post was misleading (that's what you get for being clever). The point wasn't to criticize some detailed observations, or suggest APP has serious problems, but rather to criticize the dual conclusions that 1) the APP has failed for some definition of "general purpose" publishing, and 2) it's necessary to roll your own publishing protocol for the reasons given. Feedback on the protocol is a good thing, but I couldn't get to those conclusions following the arguments given. It didn't take long for some people to provide workable options, and I presented some other issues to chew on (batch updates and resuming uploads).

On PATCH

I mentioned using PATCH as an option for dealing with partial updates. Matthias Ernst questioned the need for a different method:

"I don't see that need. PUT with the If-Match: header is just enough to do the work on the client side using optimistic concurrency control."

Stefan also questioned the need for PATCH:

"I’m not at all sure I like the PATCH approach, too — I’m not really keen on having to tunnel even more verbs through POST because they’re not widely supported"

Update: Stefan explained to me that his concern is adding another method rather than tunneling; a valid concern. I probably wasn't clear enough on where I was going with this. First of all, PATCH is defined in RFC2068 19.6.1.1 (sort of) and is arguably part of HTTP; it's not a POST tunnel (thanks to Julian for the reference). Second, what Matthias says is true for the case of multiple editors (and APP mentions how to deal with lost updates using If-Match and friends), but that is a different problem from sending deltas - i.e., you don't need partial updates to have lost updates.
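To make the distinction concrete, here is a minimal sketch of a lost update being caught with nothing but full-representation PUTs. The `Store` class, the `Conflict` exception and the SHA-1 entity tag are all assumptions made for illustration; they stand in for a server and its 412 response, and are not anything APP defines.

```python
import hashlib


class Conflict(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class Store:
    """Toy in-memory server holding a single resource (hypothetical)."""

    def __init__(self, body: str):
        self.body = body

    def etag(self) -> str:
        # Assumption: the entity tag is simply a hash of the current body.
        return hashlib.sha1(self.body.encode("utf-8")).hexdigest()

    def get(self):
        return self.body, self.etag()

    def put(self, body: str, if_match: str) -> str:
        # Full replacement, guarded by the entity tag the client last saw.
        if if_match != self.etag():
            raise Conflict("412 Precondition Failed")
        self.body = body
        return self.etag()


store = Store("title: draft")
_, tag_a = store.get()  # editor A reads
_, tag_b = store.get()  # editor B reads the same version

store.put("title: A's edit", if_match=tag_a)  # A's full update succeeds
try:
    store.put("title: B's edit", if_match=tag_b)  # B's tag is now stale
    b_result = "overwrote A"
except Conflict:
    b_result = "rejected"
```

Both editors send complete representations here, and the conflict is still detected. Whether the payload is a delta or a full update is a separate question from whether two writers collide.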

The design value in using a new method to deal with delta updates is twofold.

First, no matter what the format is, or the optimal algorithm/policy for merging data in that format, the PATCH method is explicit in its intent - the server is getting a change delta from the client as a function of the representation sent down to the client. With PUT you have to infer outside the method whether the server is receiving a delta or a full update. You can deal with this format by format using PUT, and APP has specifications in place to avoid the problem altogether (the atom-syntax working group felt that sending partials was overloading PUT). Joe points to the following in section 9.3:

"To avoid unintentional loss of data when editing Member Entries or Media Link Entries, Atom Protocol clients SHOULD preserve all metadata that has not been intentionally modified, including unknown foreign markup as defined in Section 6 of [RFC4287]."
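As a rough illustration of what that SHOULD asks of a client, the sketch below edits only the title of an entry and sends everything else back untouched, including an extension element the client knows nothing about. The `ex:rating` element, its namespace and the `retitle` helper are hypothetical, and the entry is pared down to a few elements.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EX = "http://example.org/ns"  # hypothetical foreign namespace

# Keep the original prefixes when serializing.
ET.register_namespace("", ATOM)
ET.register_namespace("ex", EX)

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom" xmlns:ex="http://example.org/ns">
  <id>urn:example:1</id>
  <title>Old title</title>
  <updated>2007-06-10T17:25:00Z</updated>
  <ex:rating>5</ex:rating>
</entry>"""


def retitle(entry_xml: str, new_title: str) -> str:
    """Change the title in place and leave every other element alone."""
    root = ET.fromstring(entry_xml)
    root.find(f"{{{ATOM}}}title").text = new_title
    return ET.tostring(root, encoding="unicode")


# The body a client would PUT back: the full entry, foreign markup included.
edited = retitle(ENTRY, "New title")
```

The point is that the client works on the parsed document it was given rather than rebuilding an entry from the fields it understands, so the unknown `ex:rating` survives the round trip.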

But "general purpose" diff/patch is another matter, especially if people want to work at a higher level than bytes. I see no reason to disallow it in the future; the best way to do that is not to redefine or muddy PUT now (or later on), but to allow the protocol room to use PATCH.

Second, the broader guideline I had in mind was this - whenever you have two operations that resemble each other superficially but are semantically different and have different expected outcomes, you should consider separate and explicit definitions to avoid interop issues. It's not just about finding efficient techniques for important approaches to readers and writers like optimistic concurrency - it's about providing a uniform means of expression in the protocol design.
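A small sketch of that guideline: the same payload produces different outcomes depending on the method, so the method is where the intent should live. The `handle` function and the "delta is a dict of changed fields" format are assumptions made up for this example; no such patch format is defined by APP or HTTP.

```python
def handle(method: str, stored: dict, payload: dict) -> dict:
    """Return the new resource state for a request (toy dispatcher)."""
    if method == "PUT":
        # PUT: the payload is the complete new representation.
        return dict(payload)
    if method == "PATCH":
        # PATCH: the payload is a delta against the stored representation.
        # Assumption: a delta is just the fields that changed.
        merged = dict(stored)
        merged.update(payload)
        return merged
    raise ValueError(f"405 Method Not Allowed: {method}")


stored = {"title": "Draft", "summary": "First pass"}
payload = {"title": "Final"}

after_put = handle("PUT", stored, payload)      # summary is gone
after_patch = handle("PATCH", stored, payload)  # summary is kept
```

With a single overloaded PUT, the server would have to guess from the body which of these two results the client meant.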

On Strategy Tax

Broadening things beyond direct issues with Atom Protocol for a minute, it should be clear that defining your own publishing and data access protocol means building your own tools and platform infrastructure from top to bottom. The amount of work to do this, again for some definition of "general purpose", shouldn't be underestimated. It's much more likely in a high pressure commercial environment to produce a protocol that is highly limited and works for one platform - yours. That is, you end up with less capability and yet another silo. This is analogous at the protocol level to Facebook's choosing to create a markup format for users - one that says more about Facebook's current capabilities than the actual users - instead of rolling with something like FOAF. Arguably controlling data portability is largely the point, but the overall costs of doing so shouldn't be underestimated. Going custom will up the overall design and engineering dollars spent 'below the waterline'. Companies, even big ones, are resource bound, so each engineering dollar spent on publishing infrastructure is a dollar not spent on a cool feature a user might care about. You want to be sure it's the right thing to do. For those integrating against such a provider, you probably want to keep custom formats/protocols at the edge and convert them to open models for internal use.

This reluctance to roll out on an open protocol is a good example of a strategy tax, where creating barriers to data allows companies building social network platforms to maximize a return on that data and, all importantly, monetize the graph of social relations. This balance around open data and platform franchises is a difficult problem for social network providers, who are especially subject to modish swings in interest or perceived coolness. They don't yet seem to have the stable revenue streams that Google has from AdSense or that eBay and Amazon have from providing marketplaces. It's surely tempting then to reduce the fluidity of user data while figuring out how to become an 800lb gorilla. However, web history suggests betting on a user silo will be a short-lived tactical advantage, not a strategic play a la desktop operating systems. Perhaps there are other models for lock-in - people have been pointing out for years that Google has precious little lock-in on the search page and it's trivial to use a different search engine - yet somehow they manage to get by.


June 10, 2007 05:25 PM

Comments

Julian Reschke
(June 10, 2007 07:23 PM #)

Two comments: I do agree that PATCH is sort-of defined, but not because of the expired Dusseault drafts, but by means of RFC2068, Section 19.6.1.1. All that's needed here is a separate spec that defines PATCH the way it was intended to be (which is different from the way it was done in draft-dusseault-http-patch), plus a few types of patch formats that are well understood and have a proper media type registration (for instance, we need somebody to do the work to get application/diff specified).

On Stefan's comment on tunneling: I don't see why using PATCH can make things worse for people who already have trouble with PUT and DELETE. It's the same kind of problem, so the same solutions (fixing the software, or tunneling) should work.

Julian

Bill de hOra
(June 10, 2007 09:58 PM #)

Stefan:

"so adding another one, or relying on it (even one as well-defined as PATCH) should not be done without very good reasons."

I agree! PUT and DELETE were considered problematic, even showstoppers, for APP only a few years ago (there was talk of going over SOAP to deal with that). It seems we took long enough for it not to be a problem.

"I'm not enough of an Atom guru (that's a new skill, I guess) to judge whether this is the case here or not ;-)"

I think partial updates are a candidate exactly because they're close enough to normal PUT to make things messy and inconsistent. For example, it took some time to work out what was appropriate for PUT on the atom wg, and they/we are fairly close to the problem.

Julian:

thanks for the clarifications (as always). I'll fix up the references.
