ArghPC

Steve Loughran: "Hadoop uses RPC to chat between nodes; everything has custom serialization and the heartbeats include data -tricks that work on a LAN. But they have a hard time keeping clients and clusters in Sync: what makes a workable and efficient protocol for the gigabit LAN in the datacentre is not appropriate for client-cluster comms, not when the clients aren't under control, or when they are a distance away. I'd like to work on a good REST api there -something like S3FS for storage, a pure REST model for Jobs too."

Me too. This issue (RPC) comes up for me from time to time - how to have a tight call model for what are essentially control messages, across a cluster of servers, without getting into 

  • Binary lockstep (eg fine grained RMI or shunting Python pickles around) and the risk of a cyclic dependency so large you can barely see it.   
  • The inability to examine problems with standard tools like wireshark without a decoder. This applies to any binary wire protocol, including base 64'd XML.
  • Designing some cross language type system

I'm down to 3 options:

  • Forget RPC, use HTTP calls to post state
  • Use XMPP messaging and pub/sub
  • Use RPC but with JSON as your wire format.

All of which abide by the notion that if you must ship non-documents, then ship using a handful of data structures (list, dict) and a limited number of scalar types (unicode strings, numbers, iso-dates, booleans). In other words JSON is the sweet spot of type driven interop.
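To make that constraint concrete, here's a minimal Python sketch of a control message built from nothing but dicts, lists and the scalar types above. The message shape (`id`, `method`, `params`, `sent`) and the `restart_worker` call are invented for illustration; they aren't taken from Hadoop or any particular RPC library.

```python
import json
from datetime import datetime, timezone


def encode_call(method, params, call_id, sent):
    """Serialise a control message using only dicts, lists and scalars."""
    message = {
        "id": call_id,
        "method": method,
        "params": params,
        # JSON has no date type, so the timestamp travels as an ISO 8601 string.
        "sent": sent.isoformat(),
    }
    return json.dumps(message, sort_keys=True)


def decode_call(wire_text):
    """Parse the text form back into plain Python structures."""
    message = json.loads(wire_text)
    message["sent"] = datetime.fromisoformat(message["sent"])
    return message


# Hypothetical call: the method name and parameters are made up.
wire = encode_call(
    "restart_worker",
    {"node": "n17", "force": True, "retries": 3},
    call_id=42,
    sent=datetime(2008, 6, 30, 12, 0, tzinfo=timezone.utc),
)
decoded = decode_call(wire)
```

What goes over the wire is plain text, so it can be read in a packet capture without a decoder, and neither side needs the other's class definitions to make sense of it.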

Steve Vinoski: "For years we’ve known RPC and its descendants to be fundamentally flawed, yet many still willingly use the approach. Why? I believe the reason is simply convenience. Regardless of RPC’s well-understood problems, many developers continue to go down the RPC-oriented path because it conveniently fits the abstractions of the popular general-purpose programming languages they limit themselves to using. Making a function or method call to a remote or distributed function, object, or service appear just like any other function or method call allows such developers to stay within the comfortable confines of their language. Those who choose this approach essentially decide that developer convenience and comfort is more important than dealing with hard distribution issues like latency, concurrency, reliability, scalability, and partial failure."

All important, but binary on the wire messages and lockstepped upgrades are a massive problem as well. IOW a core practical issue with RPC is sending non-text around.

It's interesting then, that Facebook Thrift has gone into the Apache Incubator. It looks sort of like JSON but has so-90s stuff like signed integer types.

June 31st

Mark Pilgrim: " It checks for June 31st. I swear to God it does. One day I was writing test cases and Sam was writing code to pass them, and when he saw that test case fail he almost reached through his cable modem and strangled me. He almost removed the test case out of spite. He gave in and coded it anyway, and checked it in, and we deployed, and three days later I got a bug report from someone who couldn’t figure out why his feed wasn’t validating. And I couldn’t figure it out either, until he mentioned that it only seemed to choke on the date for one specific entry, and I looked at it one more time and I swear to God it said '2006-06-31'"
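For what it's worth, the check itself is cheap. The sketch below is not the Feed Validator's code, just an illustration in Python of testing whether a date string names a day that exists; the function name `is_real_date` is made up.

```python
from datetime import date


def is_real_date(text):
    """Return True if text is an ISO date that exists on the calendar."""
    try:
        date.fromisoformat(text)
    except ValueError:
        # Raised for malformed strings and for impossible days alike.
        return False
    return True


june_30 = is_real_date("2006-06-30")
june_31 = is_real_date("2006-06-31")
```

Here `june_30` comes out true and `june_31` false, because June has 30 days.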

There are no exceptions to Postel's Law on the Internet. Upstream servers will break downstream clients irrespective of what we regard as the finished downstream article. This is normal and to be expected.

1: Every downstream you visit that refuses to handle the law will be dead, or dying.

2: Everyone is downstream.

In particular I want to dispose of the idea that negotiated agreements are a let. No SLA gets you out of Postel's Law. Just because we have recently started calling these contracts "APIs" makes no matter. Postel's Law holds, I think, the supreme position among Internet principles. If your client is found to be against Postel's Law I can give you no hope.

 

Time to market

Jim Meyer and Ikai Lan: "One of the best things about working for LinkedIn is the constant exposure to new technologies and challenges. While the majority of the LinkedIn infrastructure runs on Java, it’s no secret that we don’t shy away from other interesting languages and frameworks – we’re always looking for ways to make better software faster. This is the spirit that launched the LED group with the charter to see how quickly it could prototype new ideas and features using Ruby on Rails."

From the company who knows something about scaling graph traversal. The numbers are interesting:

"Here's a quick snapshot of Bumper Sticker statistics at this moment:

  • 13.5 million installations
  • 1.5 million daily active users
  • 20-27 million canvas page views a day

All of this is served by:

  • 13 web application servers running nginx and mongrel
  • 8 static asset servers serving over 3,500,000 stickers, soon to migrate to a content distribution network (CDN)
  • 4 MySQL servers in a master/slave configuration using Rick Olson’s excellent masochism plugin"

I'd love to know what percentage of that traffic is being served by nginx as opposed to dynamic Rails. The rest of the post is interesting - it seems Rails was chosen for its development speed - LinkedIn is a Java shop.

Anyway, to beat a horse to death - scaling is a design thing, not a language thing.

Conservation of attractive profits

Ars Technica: "Nokia announced plans today to transform the Symbian mobile phone operating system into an open source platform. Nokia, which already owns roughly half of all shares of Symbian, is in the process of obtaining the other half from various holders for €264 million. The company has partnered with several other major handset makers to launch the new nonprofit Symbian Foundation, which will facilitate the liberation of the platform."

Clayton M Christensen - "Companies make attractive money when they solve the hardest problems. The hardest technical problems mandate solutions that are tightly coupled integrated systems. When modularity and commoditization cause attractive profits to disappear at one stage in the value chain, conservation of integration means the opportunity to earn attractive profits with proprietary products will emerge at an adjacent stage." - Seeing What's Next

Controlled semantic vocabularies tool

Peter Krantz: "Hence, there is a need for a simple vocabulary editor that allows domain experts to create a vocabulary without knowing the innards of semantic web technologies. I had a discussion with an IT-strategist from a large government authority today and we agreed that a tool like this would greatly benefit the use of controlled vocabularies in the public sector."

(A patched) PloneOntology is used to back vocabularies on this site. Requires Plone, not MySQL. Uses a form to fill out content. PloneOntology is built on the Plone relations product, which is immensely good for this kind of work (relations lets you describe things like cardinality). I'm not sure it could be ported to a database easily; iirc it relied heavily on ZODB's object storage.

Plugin pros and cons

Stephen O'Grady has a nice post about the Gnome Do plugin model. So I thought I'd write down some thoughts on the pros and cons of plugin software architectures. When I say plugins here, I'm overloading the word as a general concept to include things like Zope products, OSGi bundles, Eclipse/Jira/IDEA plugins, XPIs, Apache modules, and even javascript in browser (yeah, that's a stretch - it's more a comment on how the modern browser fails as a platform for code on demand). So apologies in advance for the imprecision.


What's to like

Granularity. A plugin bundle has a nice modular granularity about it - I don't know how else to put it except to say that components/bundles feel "right-sized". For example take Spring Dynamic Modules (OSGi based) - bundles have a better functional abstraction that can lead to strong modular cohesion and solid organisation of codebases, in comparison to implementation wiring, which imposes no real packaging or build constraints. They also seem to provide a more coherent basis for organising code than objects when considered against acceptance tests, BDD, requirements, or even simple things like release notes.
 
Partial upgrades. Monolithic systems tend to require monolithic upgrades, or workarounds to the build process and version model to support partial upgrades. In critical operational environments that will put even minor fixes onto a high-ceremony release process. Plugins allow for surgical upgrades. They also reduce the cost of regression testing, drastically. Undeployment is straightforward as long as the plugin manages its created data properly (see "Data contracts and trust" below).
 
Concurrent engineering. This is related to the partial upgrades notion. A nice side effect of a plugin architecture is that development can support multiple parallel streams of work and along with that, multiple parallel release streams. Unless you've worked on a codebase that allows this it's hard to explain how effective it can be - it's the kind of stuff project and release managers dream about, but rarely if ever get to see. Concurrent engineering is probably the most effective process technique for managing failure risk in product design; that aside, the goal here is to be able to treat system S.1.0.0 as a version umbrella for a collection of plugins, P.1.0.0 ... Pn.1.0.0. Minor and feature upgrades then can be managed as a release configuration where one or two plugins are upgraded with the others left in place, resulting in S.1.1.0. This configuration can be done (almost) entirely through metadata. I should probably follow up with another post just about how that works.
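As a rough illustration of the version umbrella idea, here's a minimal Python sketch where a system release is nothing more than a mapping of plugin names to versions, and a minor release is the previous mapping with a couple of entries overridden. The plugin names and the `apply_release` helper are invented for the example; no particular plugin system works exactly this way.

```python
def apply_release(base, upgrades):
    """Return a new plugin -> version mapping with upgrades laid over base."""
    unknown = set(upgrades) - set(base)
    if unknown:
        # Only plugins already in the base release can be upgraded here.
        raise KeyError(f"not part of the base release: {sorted(unknown)}")
    release = dict(base)      # copy, so the base release stays untouched
    release.update(upgrades)
    return release


# Hypothetical system S.1.0.0: every plugin ships at 1.0.0.
S_1_0_0 = {"search": "1.0.0", "billing": "1.0.0", "reports": "1.0.0"}

# S.1.1.0 upgrades one plugin and leaves the others in place.
S_1_1_0 = apply_release(S_1_0_0, {"billing": "1.1.0"})
```

The point of the sketch is that the release is data: S.1.1.0 differs from S.1.0.0 by one line of metadata, not by a rebuild of the whole system.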
 
Contracts. Good plugin systems force the platform to expose strong contracts to hosted services - this tends to shore up the software architecture in general.

Democratisation. Developers using your product or service can code to it without having to know much about your code internals. This allows innovation and independent evolution. In some cases it can be used to scale development efforts. This is especially important if you have a database centric architecture (extending domains driven by table designs is notoriously difficult and messy).
 
Configuration. This isn't cited much around plugins but is very important. Configuration in a plugin architecture will tend to be split out along functional lines, avoiding systems that are functionally cohesive but exhibit no cohesion around configuration (big-ball-of-mud.conf); this sounds like a little thing but can complicate everything post green-bar - build, packaging, deployment, regression. For example, look at the way apache2 httpd.conf gets split out to individual files compared to the apache single file approach.

They're cool. No, really. Complexity, incohesion and the tendency for software systems to move toward entropy are almost impossible to understand unless you're a software developer. As a result business people really like plugins. It's how a lot of the rest of the industrial world works and makes instant sense to them - a lot more than saying seemingly trivial feature X will take 3d to implement but require a mass refactoring that will take 10d. Showing a stakeholder a web page with a list of deployed and available plugins makes them regard the software in a totally new light.

What's not to like

Abstraction and the whole "metaness" of it all. Plugins require a lot of extra abstraction - service provider and callback interfaces, manifest formats, registries, dependency chains, plugin lifecycles - all need to be defined. A lot of things that are implicit in monolithic architectures need to be made explicit and consistent, even architectures that support dependency inversion techniques.
 
Platform/Plugin bitrot. Plugin A.1 and B.1 run on P.1. You upgrade to P.2 because. But B.1 will not run on P.2. Or worse you cannot upgrade to B.2 because A.1 won't run on P.2 - in the meantime the rest of the product ecosystem is moving to P.3 and you risk being stranded  and/or unsupported on P.1. The latter tends to happen when the plugin itself becomes as important as the supporting platform. I saw this a lot with Zope2/Plone, which has a very sophisticated product plugin architecture (Plone itself is a Zope plugin) and to a (much lesser) degree I've seen it with Jira and Firefox. Arguably this is a kind of business model - waiting for people to pay you to upgrade the plugin.
 
Development and testing. Plugins have to be plugged into something, which means deployment, unless great care has been taken to abstract away the runtime.
 
 Isolation. Plugins need to not interact in uncontrolled ways and avoid shared state - this is hard to do in shared environments like runtimes and virtual machines. I think this as much as anything is why OSGi is the future of Java plugin architectures. Unless Sun decide to ship Isolates in a future JDK, OSGi's the only proven game in town for classpath isolation. Java's classloader architecture doesn't support the kind of "multihoming" plugins need beyond trivial handler classes. The browser as a platform for javascript "plugins" (more accurately code-on-demand) is a mess in this regard - global variables, xhr hijacking and the kind of weird stuff prototype does all need to go away. Even then that's only the structural/contract side - managing access to shared resources like memory or cycles or IO is much harder -witness how Google App Engine, a grandiose plugin system (really!?), restricts access to external resources.
 
Data contracts and trust. Plugins that generate data and then break the data contract on upgrade are a massive headache and imo are a bigger problem than API breakage. Arguably the market weeds these out, code that doesn't respect data can't be trusted, but for some people it can be too late.

Salesforce.com and Google integration: AtomPub

Salesforce integrate with Google Data via GData/Atom Protocol: "The metadata APIs from Google Apps means you can read the APIs directly from Google and Salesforce servers into your own developer tools, most notably Eclipse.

During his presentation at the developer conference, Benioff showed a demo in which all of the Google definitions, such as its calendar, could be read into the Eclipse IDE and then used to build Force.com apps.

The Google data APIs are built on top of the Atom publishing protocol, a more modern version of the RSS protocol used to syndicate newsfeeds.

Using them, developers will be able to access Google Data APIs directly from Salesforce.com's APEX programming language"

Apex will call Google via GData/AtomPub, Google won't serve up the Apex formats (should be a heck of a lot better than the SOAP approach Apex uses). 

So far the API services available to Apex clients are:

  • Google Documents API
  • Google Calendar API
  • Google Spreadsheets API
  • Blogger API
  • Contacts API
  • Google Data Authentication

The dump format exposed via the Apex client looks a bit weird (not quite JSON); hopefully it's not what's going over the wire or can be negotiated using a format parameter/Accept header to avoid compatibility problems.

One upside is that adopting the Atom protocol and the associated HTTP-like-it-oughta-be approach should help salesforce.com stop creating multiple incompatible APIs on top of the many (~18?) they have already (apparently these APIs account for 45% of sf.com's transactions). Or at least slow that kind of unnecessary churn down.

Also interesting that they're using multipart related to optionally upload an Atom Entry along with media content: "Documents are uploaded to the server via an HTTP POST optionally using MIME multipart encoding to combine the document contents with a Atom entry describing the document." - I wonder if that's based on the 'Picasa style' multipart uploads which are going to be standardized by the AtomPub working group.

ICompatible

Brian McCallister: "To take a concrete example, a coworker (thanks Jax!) recently re-added first class support for callable statements to jDBI. jDBI uses a Handle interface to expose operations against a database. It has gained a method:

public <ReturnType> Call<ReturnType> createCall(String callableSql, 
CallableStatementMapper<ReturnType> mapper);

If you implement this interface, the change is backwards incompatible. An implementation of Handle made against 2.2.2 will not compile against this. On the other hand, the intent of the library is not for people to implement Handle, it is to expose the libraries functionality. It is almost a header file. So, 2.3 or 3.0?"

It's 3.0. Semantics of what we mean by "API" aside, jDBI here is closer to an 'SPI', structurally it's not going to compile. That's how Java rolls. Fwiw, I think this is a good example of where not to use an Interface rather than get into distinctions about public v published. I understand not programming to interfaces is heresy, but in this case I wouldn't care much - an abstract class will avoid unnecessary versioning pain.

I guess other people can ponder on what the generics are getting you there ;)

Exterminate!

Jim O'Donnell: "I don't know if anyone else agrees, but my own rule of thumb for a title attribute is 'if you read it aloud to the person next to you, would they understand it?' For ISO 8601 dates, that test is only going to work if the person sitting next to you is a dalek." - spotted on a haccessibility comment thread.

I came across that, via this: "Removing Microformats from bbc.co.uk/programmes." (or more accurately removing uF using the abbr/date techniques). The abbr/date thing is never going to go away.

Git the inevitable?

I had been putting it off, but it looks like I'll have to dive into Git some more.

Peer pressure

I'm a big, big Mercurial fan (I think the command set is awesome and I think Git has some warts), but the ground up community drive and sense of urgency around Git is something else. I'll be using Mercurial for managing my own stuff and a few other codebases, but the social activity around Git I think means it'll be increasingly harder to function "from downstream" without knowing how to use it well. A few examples of things that have conspired to tip me over:

 - Jacob Kaplan-Moss has served out a Git repository for Django. I've been managing local Django trunks via Mercurial for a while now. Now I don't have to.
 
 - Scott Chacon's presentation (here's a PDF). Has the best explanation of how to map standard central models to DVCS I've seen. Scott also has a Git book on peepcode.
 
 - The Prags are going to ship a Git book. Prag books influence developers (their Subversion book was a godsend a few years back).

 - Ryan Tomayko likes it.
 
 - Google allegedly choosing Git for Android. Assuming it's true (and it might not be, Perforce tools for modern Java IDEs are stellar), I wager Google Code will support Git as a side-effect. Arguably DVCS is a better fit for that company's DNA than a central repository anyway.
 
 - git-svn seems to be far ahead of anything hg/svn related. This I think is the most important thing - it means Git will be the upgrade path to DVCS for Subversion repositories.

Versioning
 
Finally, a long bet - automated deployment tools will be built on Git. Why? Well I think deployment is best modelled as a publishing problem and Git internally is closer to a versioned content management system than the other DVCSes (in particular the way it doesn't use deltas to manage files). Combining that with

  • the pull/push features (publishing systems always need them)
  • multiway branch merging (which naturally supports the configuration-pulled-from-another-repository pattern)
  • it manages source code to begin with (doh!)

means (I think) someone eventually will see deployment as an extension of checkout/push instead of an entirely separate workflow, and start writing command tooling around that. Also one thing a DVCS supports is end to end versioning by having the production systems (especially configuration) versioned in place. Let's not pretend otherwise - everyone has, at some time or another, patched a live server directly. Bad practice or no, when that happens a centralised VCS gives you no direction home - but a DVCS supports bidirectional flow because the production server is just another branch.

Dates in Atom

Atom Feeds and AtomPub collections are time ordered data. I think most people intuitively know that Atom feeds are time ordered data, but perhaps not that they're ordered by update and edit times, or why time is the natural order for atom serviced content even though domain content might have other natural orders that make sense. Since it's not that commonly talked about, I figure it's worth at least one post to explain why.

Dates in Atom

There's a long (torrid) history of datestamping in the Atom standards and more generally feed syndication. When the Atom format was being designed some working group members felt you needed 3 dates - an edit date, a publish date and a creation date. Or maybe an edit, updated and published. Or... you get the idea. And as prior art to Atom, Dublin Core had already settled on 3 dates. Anyway, the Atom working group couldn't agree on 3 (really), but we could identify and agree on 2 meaningful dates - updated and published. As a result, Atom Entries must have an updated date, and can have a published date.

Why all the work to naturally order by time? Historically it's because feeds come from blogs, which are diaries, which are lists of entries ordered by date. Today it's increasingly for systems reasons, most importantly, to support cheap synchronisation by clients. What happens is that the combination of atom:id and atom:updated is enough information for clients to synchronise new or updated content - they work from the top of the feed and walk the entries and/or the feed's previous links until they hit the first atom:id/atom:updated pair that matches their local Entry cache - sync over. This lowers overall traffic and data loading costs out of persistent storage.
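The walk described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a real Atom client: each page is just a list of (atom:id, atom:updated) tuples standing in for a parsed feed document, the generator stands in for following the feed's previous links over HTTP, and the names `sync` and `fetch_pages` are made up.

```python
def sync(pages, cache):
    """Walk a newest-first feed and stop at the first entry already held.

    pages: iterable of pages, each a list of (atom_id, updated) tuples.
    cache: dict of atom_id -> updated for entries held locally (updated in place).
    Returns the ids that were new or changed, in feed order.
    """
    fresh = {}
    for page in pages:
        matched = False
        for atom_id, updated in page:
            if cache.get(atom_id) == updated:
                matched = True   # everything older is assumed to be in sync
                break
            fresh[atom_id] = updated
        if matched:
            break                # no need to follow any more previous links
    cache.update(fresh)
    return list(fresh)


fetched = []  # records which pages were actually requested


def fetch_pages():
    """Hypothetical three-page feed, newest entries first."""
    feed = [
        [("urn:e5", "2008-06-20T10:00:00Z"), ("urn:e2", "2008-06-19T09:00:00Z")],
        [("urn:e4", "2008-06-18T08:00:00Z"), ("urn:e3", "2008-06-17T08:00:00Z")],
        [("urn:e1", "2008-06-01T08:00:00Z")],
    ]
    for number, page in enumerate(feed):
        fetched.append(number)
        yield page


cache = {
    "urn:e1": "2008-06-01T08:00:00Z",
    "urn:e2": "2008-06-10T09:00:00Z",  # stale: e2 has been edited since
    "urn:e3": "2008-06-17T08:00:00Z",
    "urn:e4": "2008-06-18T08:00:00Z",
}
changed = sync(fetch_pages(), cache)
```

In this run the client picks up the new entry and the edited one, hits the matching pair for `urn:e4` on the second page, and never requests the third page. That early exit is where the traffic saving comes from, and it only works because the feed is ordered by time.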

Dates in AtomPub

AtomPub (RFC5023) added another date. The working group said that AtomPub  collections (feeds you can post content to) should be ordered by a date called app:edited.  Entries in AtomPub collections should contain one app:edited element, and must not contain more than one.

Ideally this natural ordering would have been a must-level specification, but RFC5023 couldn't mandate that app:edited be universally understood, as that would break Atom's versioning policies which say that new elements are 'foreign markup' and can be optionally processed or must be ignored. In other words no-one can introduce a new must understand datum into Atom (RFC4287) markup and retroactively break the planet's deployed Atom aware systems - not even AtomPub (RFC5023). Unless you are unlucky, app:edited works well, even where the feed itself is latently updated.

[By the way in the "real world" feeds that can act as AtomPub collections will also appear as being ordered by atom:updated, even though app:edited is what the spec says you should expose. Some systems will update on every edit; that's just how they roll.]

Domain gnarliness

The AtomPub spec doesn't say why app:edited exists, but the following example should help explain why.

Not all domain content is naturally time ordered (there's more to digital life than blogging). Address and contact books for example will tend to be sorted and presented to a user by some other key, maybe last name. This is a gnarly case, that came up on the Atom protocol list a while back.

So say my information store has a list of contacts - and a collection resource for managing those contacts. Generally I'm not interested in retrieving things by last edit/update, I want contacts alpha ordered, because my client is a useful application that happens to use Atom/AtomPub, not some kind of an entry cache. If I'm using Atom to represent an address book, using atom:updated or app:edited seems to be the wrong approach for the UI.

The problem is, not ordering collection entries by update time will result in inefficient syncing (syncing is probably use case 2 or 3 for a network address book, hence you tend to see SyncML and address books go hand in hand).

For example, if I add a new contact with a last name of "Wordsworth", that will go to the back of the feed and not the front, where it can be picked up cheaply on the next sync. The client the edit came from could of course hold onto the recent additions/edits (essentially acting as a writethrough cache) instead of paging back to "W". But my client got a bit more complicated. And my other HTTP connected devices wanting the newest stuff will need to page all the way back to "W" in the book to sync up. In fact to be sure they'll have to pull the whole book every time they sync. The approach of stopping at the first matching id/update pair won't work - algorithmically speaking, syncing will always be a worst case.

Eventually something like the following will happen to deal with the UI being slow, or concurrent client refreshes pegging the server. A new "recently added" contacts feed will be added. Or the sort will be extended to allow by-added/by-updated. Either way, it'll be a reinvention of AtomPub's app:edited default sorting. In that case we'll want to move the order by last-name feature of the domain/UI into the implementation detail, perhaps by defining some query params that provide the user optimised view of the data (ie the one that makes most sense for the user browsing the content), and keep the time ordered feed as the protocol default.

What's happening is that there are two use cases. One for viewing an address book in an application (sorted by alpha), and another for adding and syncing contacts to it, and probably the server needs to provide different views on the data for each. 

Incidentally an AtomPub client can work without app:edited sorting (it won't necessarily know the sort order, unless there's a private contract between client and server), but it will be inefficient on update. So it seems that in the general case, even for a domain like an address book, order by time is the best natural sort for an AtomPub collection.

 

Backend databases

Most people I think use databases to back web sites and sometimes you'll want to just use the database primary key to sort the entries. Ordering on the pk is great because it's FaF (Fast as ****). And if the database is using autoincrementing keys we'll naturally sort by content creation date. But there are downsides. For example, this technique won't be optimal for updates as they won't be captured in the order-by clause. At the system level it means that clients will have to start paging more data to sync up content, which means more load against the DB. Non-auto-incrementing keys and very possibly split/federated databases won't support the implicit creation order. And a database wipeout potentially loses the order of actual creation (who knows how the data will be reimported and new keys assigned).

atom dates

What this means is that RDBMS managed content being served up for feeds or managed using AtomPub (which will over time trend to being most web content) will have multiple date columns. An insert time (generally good for data management anyway) will be very common. But for content management they'll need an updated column that's indexed, to track recent changes. You might have a third published date, and maybe an edit one as well (if you need to distinguish between an update and an edit), but to let AtomPub clients use and manage the data, an updated date seems to be the minimum must have.
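As a sketch of what that looks like in a table, here's a small example using Python's bundled sqlite3 module. The table and column names are my own invention for illustration, and timestamps are stored as UTC ISO 8601 strings in one fixed format so that string order matches time order; a production schema would likely use the database's native timestamp type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (
        id        INTEGER PRIMARY KEY,  -- autoincrementing surrogate key
        atom_id   TEXT NOT NULL UNIQUE,
        title     TEXT NOT NULL,
        inserted  TEXT NOT NULL,        -- row creation time
        updated   TEXT NOT NULL,        -- feeds atom:updated
        published TEXT                  -- optional, feeds atom:published
    );
    -- The index is what keeps 'most recently changed first' cheap.
    CREATE INDEX entry_updated_idx ON entry (updated DESC);
""")

rows = [
    ("urn:e1", "First", "2008-06-01T08:00:00Z", "2008-06-20T10:00:00Z", "2008-06-01T08:00:00Z"),
    ("urn:e2", "Second", "2008-06-02T08:00:00Z", "2008-06-02T08:00:00Z", None),
    ("urn:e3", "Third", "2008-06-03T08:00:00Z", "2008-06-05T08:00:00Z", "2008-06-03T08:00:00Z"),
]
conn.executemany(
    "INSERT INTO entry (atom_id, title, inserted, updated, published) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)


def recent_entries(connection, limit=20):
    """Return (atom_id, updated) pairs, most recently updated first."""
    cursor = connection.execute(
        "SELECT atom_id, updated FROM entry ORDER BY updated DESC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


by_update = [atom_id for atom_id, _ in recent_entries(conn)]
by_key = [row[0] for row in conn.execute("SELECT atom_id FROM entry ORDER BY id")]
```

Ordering by primary key gives e1, e2, e3 (creation order), while ordering by the updated column puts the edited e1 first, which is what a syncing client needs to see at the top of the feed.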

Django Software Foundation

Django Foundation: From LjWorld:

"Django, started nearly five years ago by programmers affiliated with The World Company, now joins a lineup of pervasive computer languages and systems — including Mozilla, Apache and Linux — to be overseen by a nonprofit organization.

The Django Software Foundation, based in Lawrence, now owns the trademark and intellectual property that form the basis for the application that is used to create increasingly popular Web publishing programs, the likes of which are used by operations ranging from LJWorld.com here in town to The Washington Post and others worldwide.

With a foundation, Django will be able to continue its growth with new assistance from software operations large and small, from one-person outfits to industry leaders such as Google."

This is good news for Django's long term viability. It'll encourage an ecosystem to grow around Django and should also help avoid any issues around trademarking  (as happened with Rails a while back) and intellectual property rights.

AtomPub Multipart Media Creation draft-01

Joe has released a new draft of AtomPub Multipart Media Creation: http://www.ietf.org/internet-drafts/draft-gregorio-atompub-multipart-01.txt. And it's now on the standards track, which is great.

Essentially what this draft does is define a way to upload media and the atom entry about that media (in AtomPub jargon, those are called the Media Resource and the Media Link Entry respectively) in one HTTP POST to an AtomPub server, instead of two as per RFC5023. A number of people have implemented "something like this" on their own stacks already; I can see it becoming a default approach to media posting.

The technique uses MIME multipart, specifically multipart/related and not multipart/form-data, which I suspect people working with browsers will be more familiar with. As a result, I'd be interested in hearing anyone's experience implementing atompub-multipart via a browser/ajax client.
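For anyone who hasn't built a multipart/related body before, here's a minimal sketch using Python's standard email package. It only shows the general shape, an Atom entry part followed by a media part inside one multipart/related package; it doesn't claim to match the draft's exact header requirements, so check those against the draft itself. The function name, the entry XML and the "image" bytes are all placeholders.

```python
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart

# Placeholder Media Link Entry; a real one would carry more metadata.
ENTRY_XML = """<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Sunset</title>
  <summary>A placeholder entry describing the image.</summary>
</entry>
"""

# Not a real image, just a few bytes standing in for one.
FAKE_PNG = b"\x89PNG\r\n\x1a\n" + b"placeholder-image-data"


def build_media_post(entry_xml, media_bytes, media_subtype):
    """Combine an Atom entry and its media into one multipart/related package."""
    # multipart/related carries a 'type' parameter naming the root part's type.
    package = MIMEMultipart("related", type="application/atom+xml")
    # Root part first: the entry describing the media.
    package.attach(MIMEApplication(entry_xml.encode("utf-8"), "atom+xml"))
    # Second part: the media itself (base64 encoded by default here).
    package.attach(MIMEImage(media_bytes, _subtype=media_subtype))
    return package


package = build_media_post(ENTRY_XML, FAKE_PNG, "png")
entry_part, media_part = package.get_payload()
body = package.as_bytes()  # serialised form, boundary included
```

The contrast with multipart/form-data is that the parts here aren't named form fields: the entry and the media are related documents, with the entry as the root part.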

Turnaround

Peter Norvig: "We can't fix the process in time for the 2008 elections, but I can look at the candidates as if they were interviewing for the job of CEO of the US. What is clear in 2008, more so than in 2004, is that what we need now is a turnaround specialist. The hard part about effecting a turnaround is fear of change."

Bob Cringely: "If you are a new CEO who needs to turn around a business 10 minutes after walking through the door, there are two things you can do: 1) cut costs, and 2) focus on your top 20 percent customers."

The Lisp of the Web

Steve Yegge: "Now we all see this happening in clients. Excel, for instance, is scriptable. And the reason that Excel is so powerful, I mean the reason that you can go to the bookstore and get a book that's this thick on Excel, and scientific computing people use it, whatever, is that it has a very very powerful scripting engine.

In fact, all of Microsoft Office has it. Right? You can fire up Ruby or Python or Perl, and you can actually control, though the COM interface, you can actually tell IE to open a document and scroll to a certain point in it. Or you can open up Microsoft Word and actually... I mean, if you want to do the work, you could actually get to where you're typing into your Perl console and it's showing up over in Word.

Server-side computing has to get there. It's gonna get there.

But how many server-side apps are user scriptable today? Precious few. Google has a couple, you know, like our JotSpot acquisition, which is [scriptable] in Rhino...

So we're talking about something that's kind of new. I mean, we can all see it coming! But it's still kind of new, the idea, right? Why?"

When I read this, the first thing I thought was... Zope. Not JavaScript on the server/jvm which is what the rest of Steve's post is about. Zope.

Zope has had this kind of scripting for nearly a decade - Python, and through the web if you want it. Zope's had an OODB since forever - you can do stuff like changing cardinality without migrations. Every Zope object can respond to HTTP methods - HTTP is built into the interface. Zope supports arbitrary/semi-structured metadata - can't settle on a geotagging scheme? - support as many as you need. It has a repl that works over gnureadline/ipython - go ahead, shell into the running server, poke around, enjoy the fact you can transiently manipulate the object state on your instance of the data. Zope has components with full dependency management and lifecycle support - these are called products. This is a Zope product. This is another.

Yes, I know we don't like Zope - modern fullstack frameworks are better, OSGi based microkernels are better, plugins are better. Continuations are better. Zope's a dinosaur, it's too complicated, it's labyrinthine, it's deeply unfashionable.

We might not want to use Zope, but we do want to learn from it. Want to know the downsides and issues around using an OODB? Bidirectional RPC/Object Caching? Incremental backup policies? Component lifecycles and transitive dependencies? Hot reloading? OO based URL dispatching? Properties and Folder/Collection Acquisition? Supporting arbitrary metadata? Document Translation/i18n? Not running on an RDBMS? Open classes? Multisite publishing? Through the web scripting? Look at Zope. Zope is the Lisp of the Web.


Anyway, there's a good book on the most recent Zope, that gets into the meat of what it can do. Enjoy.