Bill de hÓra: August 2005 Archives


August 30, 2005

XPCOM to UNO

Read all about it.

Browsers are a hold

I had plenty of good push back on the idea that browsers need to become aggregators.

James Governor: "...another reason we cant replace the browser with an aggregator is because not everyone is going to offer full text feeds. Instead we have people trying to maintain control, drive up their web stats or whatever it is. If information wants to be free, Doc Searls, then how come I have to come to you for it (slight barrier to entry)." To which I say - >doh< - never even thought of that.

Brian Rowe: "His point of view seems to be that browsers will become feed aggregators. I'm not so sure of this idea. I don't know how an aggregator would handle static documents, such as a reference for a programming API."

Peter Williams: "Permalinking should be done at the resource level. The HTML and XML feed formats are just different representations of the same resource and therefore they should have the same URI." I wish I'd said that.

Exhuming O'Reilly

Here's a trick for dealing with all those articles on O'Reilly.

  1. Go to OReilly.com and search for something, say "XUL".
  2. Refine that search using the 'articles' link under the search box.
  3. Post that result URL to del.icio.us and tag it as "oreilly xul".

Ain't web architecture grand?

August 18, 2005

Should we solve one-click subscription by turning the HTML off?

In response to what Tim Bray said about one-click subscriptions, Dare Obasanjo had this to say:

"As long as people expect one click subscription to depend on websites using the right icons, the right HTML and the right MIME types for their documents it won't become widespread. On the other hand, this debate is about to become moot anyway because every major web browser is going to have a [Subscribe to this website] button on it in a year or so."

I agree with the first bit. The last bit would sound like goll for next year, except I don't use browsers much anymore and will be using them even less next year. Aggregators are so much better than browsers for following content. Really, if you have to read stuff on the web and are using a browser for that, you should try an aggregator. And then, what's the browser good for?

Clicksub as Programmer Usability

There are some suggestions that 'clicksub' (that's not a new jargon play, it's just easier to type than 'one-click subscription') should work like 'mailto:' or 'aim:goaim' links and fire up your default aggregator. Even if you could fix Dare's problem no 1 (infrastructure, do-rightness), that idea doesn't work because doing mail stuff is different than doing web stuff, whereas reading stuff in a browser isn't sufficiently different from reading stuff in an aggregator. Having clicksub links in browsers to fire up your aggregator for feeds is like having clicksub links in Notepad to fire up Excel for CSV files. Even as a migration strategy it boggles the mind. Until there's one aggregator to rule them all (or at least 87.6% of them), it doesn't make sense for the world to punt on one-click in aggregators because the browsers will save us. They should just get to it directly with an aggregator. [And before anyone tells me that aggregators are unusable, they'll have to explain in what way browsers are usable by comparison.]

To be honest, next year's browsers need to be aggregators, else I don't see the point in using them. Why would I get a new browser just so I can subscribe to web feeds?

While the browser wars continue on their merry percentage-driven dance, it all seems somehow kind of pointless and wistful, like having a really satisfying argument over the pros and cons of various 8-track tape players, while the rest of world are sucking down MP3s into their iPods.

Like I said, I just don't read much from a browser anymore. The browser is sort of incidental, and using it as a really big startup file for my aggregator feels like the long way around.

Conclusion: getting a new browser just so I can subscribe to stuff for my aggregator has Programmer Usability written all over it.

HTML as web fluff

Maybe it's time to evolve. I say let's restate the problem.

The idea of turning off the website for this place and just serving up the feed does not look unreasonable at this point. I'm betting 90% of traffic to the archived html files here is only driven because the permalinks and trackbacks point there instead of direct to feed entries. It's slavish. Honestly, permalinking to a html file is starting to look more and more like a bug. Why not point to the XML entries? (Answer: I'm not sure, but in my case it might have something to do with having a Perl Deficit).

So my answer to clicksub - don't start from there. Instead this would be great: at some point weblogs flip over and the HTML website bits become secondary fluff to the XML content, like how PDFs are secondary web fluff to HTML today. The frontpage would be the feed, the archives would be Atom entries, and instead of "subscribe to the feed" buttons, you could have "read this stuff in a browser" buttons. And reading this stuff in a browser would be retro-cool in a Harris tweed sports jacket kind of way - you could use Lynx at tech conferences to read weblogs and get some respect for keeping it real. It would be strictly for the weekends. Otherwise, no more handwringing about one-click subscriptions - if you got here, you're already subscribed.

Conclusion: problem solved.

Browsers as muscle memory

You can (and probably should) dismiss all this as an irrelevant outlier opinion from one tech user. Or you can take the idea of not using a browser and not having a html based web site as a precursor to how people will interact with content. This is no bad thing. I happen to feel that browsers do not exactly rock as user-interfaces. Browsing is a hokey metaphor, that we only made up because "surfing" was so shockingly awful, anything else would do. Nobody "browses the blogosphere", which I see as progress, although "blogosphere" clearly requires some work.

Web browsers are still good for the following however:

  1. Testing webapps
  2. Shopping
  3. Posting to delicious
  4. Search forms

1 is a self-fulfilling prophecy (or a death-spiral, I can't tell). 2, well, Better Living Through Shopping obviously, but it's only conditioning to be unlearned - how long can it be before I start buying stuff via an aggregator? 3 and 4 represent a feature deficit in today's aggregators, insofar as they don't have much by way of tool bar goodness. A Mozilla based aggregator will eventually fix that right up.

Conclusion: at this stage using a browser is muscle memory.


[update 2005-08-30: some good pushback from Brian Rowe.]

Team Development in Plone

Michael Thornhill: "Do not use ZMI at all ever!, use Filesystem Products". A must-read set of notes for working with Plone.

August 17, 2005

Automated mapping between RDF and forms, part I

Let's start with some FOAF, that shows some issues around roundtripping RDF data through a web form.

FOAF only has one property for a phone, namely foaf:phone. It has, like, plenty of properties for IM, but not for phones:

  <rdf:Description 
    rdf:about="http://example.com/person/elvis"> 
    <foaf:phone   
      rdf:resource="tel:987654321"/> 
    <foaf:phone   
      rdf:resource="tel:123456789"/>     
  </rdf:Description>

So how do you distinguish between your home, work phones and 2 mobiles? No worries, let's type annotate these little jokers:

  <rdf:Description 
    rdf:about="http://example.com/person/elvis"> 
    <foaf:phone   
      rdf:resource="tel:987654321"/> 
    <foaf:phone   
      rdf:resource="tel:123456789"/>     
  </rdf:Description>
  <rdf:Description rdf:about="tel:987654321"> 
    <rdf:type 
      rdf:resource="urn:mobile"/> 
  </rdf:Description> 
  <rdf:Description rdf:about="tel:123456789"> 
    <rdf:type 
      rdf:resource="urn:work"/> 
  </rdf:Description> 

In pseudo-excel that looks like this:

  foaf:person, foaf:phone, tel:987654321
  foaf:person, foaf:phone, tel:123456789
  tel:987654321, rdf:type, urn:mobile
  tel:123456789, rdf:type, urn:work


Sorted. But not for forms.

Suppose you want to send down those two phone values to a form. You can divvy them based on the RDF type annotation (this one's a mobile, that one's for work) without too much trouble. This is what the form fragment might look like:

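  <p>
    <label for="foaf_phone_work">work:</label>
    <input size="25" value="tel:123456789"
      type="text" id="foaf_phone_work" name="foaf_phone_work">
    <input type="checkbox" id="delete_foaf_phone_work"
      name="delete_foaf_phone_work">
    <label for="delete_foaf_phone_work">delete</label>
  </p>
  <p>
    <label for="foaf_phone_mobile">mobile:</label>
    <input size="25" value="tel:987654321"
      type="text" id="foaf_phone_mobile" name="foaf_phone_mobile">
    <input type="checkbox" id="delete_foaf_phone_mobile"
      name="delete_foaf_phone_mobile">
    <label for="delete_foaf_phone_mobile">delete</label>
  </p>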

There's some name munging ("foaf_phone_mobile", "foaf_phone_work") as you can see, but that's ok.
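The munging itself can be sketched in a few lines of Python. This is only an illustration under assumptions: the triples are held as plain tuples, and `munge_field_name` and `form_fields` are hypothetical helpers, not part of any RDF toolkit.

```python
# Triples from the pseudo-excel above, as (subject, property, value) tuples.
TRIPLES = [
    ("foaf:person", "foaf:phone", "tel:987654321"),
    ("foaf:person", "foaf:phone", "tel:123456789"),
    ("tel:987654321", "rdf:type", "urn:mobile"),
    ("tel:123456789", "rdf:type", "urn:work"),
]


def munge_field_name(prop, type_uri):
    """Turn ('foaf:phone', 'urn:mobile') into 'foaf_phone_mobile'."""
    return prop.replace(":", "_") + "_" + type_uri.split(":", 1)[1]


def form_fields(triples, subject, prop):
    """Map each munged field name to the value the form should display."""
    types = {s: o for s, p, o in triples if p == "rdf:type"}
    return {
        munge_field_name(prop, types[o]): o
        for s, p, o in triples
        if s == subject and p == prop and o in types
    }


fields = form_fields(TRIPLES, "foaf:person", "foaf:phone")
```

Because the field name is built from the property and the type alone, two values of the same type would land on the same name.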

But on the way back, if you have more than one foaf:phone value, things get tricky. That's because after reversing the name munging and mapping "foaf_phone_mobile" back, the way you'd typically update a value is to find it first by matching against the subject/property and wildcarding the value, thus:

  foaf:person, foaf:phone, ???

That would work for the typical case where you only have one possible property/value, but won't distinguish between the two phone property/values we have here. If you were using a delete/insert approach, the chances are you'd end up overwriting the wrong one, or worse, blowing one of the phone values out of the datastore.
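To make the ambiguity concrete, here is a minimal sketch in Python. The `match` helper is hypothetical and the triples are plain tuples; `None` stands in for the ??? wildcard.

```python
TRIPLES = [
    ("foaf:person", "foaf:phone", "tel:987654321"),
    ("foaf:person", "foaf:phone", "tel:123456789"),
]


def match(triples, subject=None, prop=None, value=None):
    """Return the triples matching a pattern; None acts as the ??? wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and prop in (None, p) and value in (None, o)
    ]


candidates = match(TRIPLES, subject="foaf:person", prop="foaf:phone")
# More than one candidate means the pattern cannot say which phone to update.
ambiguous = len(candidates) > 1
```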

So we need to use a richer pattern, after getting the form data back. Something like this that leverages the type annotations we declared would do it:

  foaf:person, foaf:phone, ???
  ???, rdf:type, urn:mobile

But. When you have two mobile phones you'll need even further name munging because the above pattern isn't sufficient to pick out the right phone anymore.

Generally you'll end up doing something like this:

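  <p>
    <label for="foaf_phone_work_14387975">work:</label>
    <input size="25" value="tel:123456789"
      type="text" id="foaf_phone_work_14387975"
      name="foaf_phone_work_14387975">
    <input type="checkbox" id="delete_foaf_phone_work_14387975"
      name="delete_foaf_phone_work_14387975">
    <label for="delete_foaf_phone_work_14387975">delete</label>
  </p>
  <p>
    <label for="foaf_phone_mobile_3434535">mobile:</label>
    <input size="25" value="tel:987654321"
      type="text" id="foaf_phone_mobile_3434535"
      name="foaf_phone_mobile_3434535">
    <input type="checkbox" id="delete_foaf_phone_mobile_3434535"
      name="delete_foaf_phone_mobile_3434535">
    <label for="delete_foaf_phone_mobile_3434535">delete</label>
  </p>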

If you don't, one day the form will blow up the data. Normally you'd manage the roundtrip through the web tier. That means there's a hashmap somewhere tying up the RDF with the name "foaf_phone_work_14387975". The other way to do this, if you don't mind hitting the storage in the interim, is to write the name value to the datastore first and relate it to the phone number:

  foaf:person, foaf:phone, tel:987654321
  foaf:person, foaf:phone, tel:123456789
  tel:987654321, rdf:type, urn:mobile
  tel:123456789, rdf:type, urn:work
  tel:987654321, form:bind, foaf_phone_mobile_3434535
  tel:123456789, form:bind, foaf_phone_work_14387975

Now we're looking for the pattern:

  foaf:person, foaf:phone, ???
  ???, rdf:type, urn:mobile
  ???, form:bind, foaf_phone_mobile_3434535


And that will be enough to get you back to the target data. It's verbose, but on the face of it there's good potential for automation without expanding your web toolchain too much, or the number of data structures you have to manage.
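A rough Python sketch of that lookup follows, under stated assumptions: the triples are plain tuples, `find_values` is a hypothetical helper, and the second mobile number with its field-name suffix is made up to show the two-mobile case.

```python
TRIPLES = [
    ("foaf:person", "foaf:phone", "tel:987654321"),
    ("foaf:person", "foaf:phone", "tel:555000111"),  # hypothetical second mobile
    ("foaf:person", "foaf:phone", "tel:123456789"),
    ("tel:987654321", "rdf:type", "urn:mobile"),
    ("tel:555000111", "rdf:type", "urn:mobile"),
    ("tel:123456789", "rdf:type", "urn:work"),
    ("tel:987654321", "form:bind", "foaf_phone_mobile_3434535"),
    ("tel:555000111", "form:bind", "foaf_phone_mobile_9999999"),  # made-up suffix
    ("tel:123456789", "form:bind", "foaf_phone_work_14387975"),
]


def find_values(triples, subject, prop, constraints):
    """Values of subject/prop that also satisfy every (property, value) constraint.

    Each constraint plays the role of one '???, property, value' pattern line.
    """
    facts = set(triples)
    return [
        o
        for s, p, o in triples
        if s == subject and p == prop
        and all((o, cp, cv) in facts for cp, cv in constraints)
    ]


# The type annotation alone still leaves two mobiles to choose from.
by_type = find_values(TRIPLES, "foaf:person", "foaf:phone",
                      [("rdf:type", "urn:mobile")])
# Adding the form:bind line narrows it to exactly one value.
by_bind = find_values(TRIPLES, "foaf:person", "foaf:phone",
                      [("rdf:type", "urn:mobile"),
                       ("form:bind", "foaf_phone_mobile_3434535")])
```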

Wrap-up

So, here are the takeaways:

  1. Round tripping RDF to forms and back is tricky, not as simple as RDBMS backed data. When the RDF data looks like a hashmap where keys are non-unique you will have some work to do. With an RDBMS your pseudo-excel for a phone would just be one long row (as opposed to lots of little rows) so you'd be roundtripping based on a row key.
  2. Any RDF vocab that allows multiple values for the same property without support for further qualification on the data isn't going to give you enough information to roundtrip with a form. You'll want to do some kind of extra type annotation in that case.
  3. You can only really do the RDF type annotation trick sensibly if the value is itself a URI, as with the phone numbers shown here. If you are working with literals things will be more complicated.
  4. Anywhere you end up using a hashmap to manage bindings in the web tier, you can manage those bindings as more RDF and thus keep your form engine code as generic as possible. RDF graphs being pretty much hashmaps on steroids.
[update] I see Laughingmeme taglined this post as follows: "The problem with RDF? Even something as fundamental to webdev as round tripping to a form is hard." Even with databases or objects, there and back again still requires plenty of manual mapping between form controls, form handlers and persistence mechanisms. Frameworks like RoR and Django show how this can be further automated - but it's not a done deal. The question with RDF/XML is whether it's too flexible to be automated cleanly for forms building.

August 15, 2005

Apples and Oranges

It seems that Dare Obasanjo thinks there are two ways to support podcasts in Atom and one in RSS2.0. I disagree with his analysis, which goes as follows:

  • Atom for podcasts - use an atom:link whose @rel type is 'enclosure' to link to an audio file. Or use an atom:content whose type attribute is audio/mpeg (or some such media type value) and which links to an audio file.
  • RSS2.0 for podcasts - use rss2:enclosure whose type attribute is audio/mpeg (or some such media type value) and which links to an audio file.

Oranges

But RSS2.0 also allows you to link to an audio 'podcast' with the rss2:link element. That would be similar in spirit to the atom:content way of doing it, insofar as it's not the spirit of either spec to link to audio files for such purposes, else neither would offer syntax for enclosures. As we're navel-gazing on the specs here rather than looking at what anyone actually does, the only interesting difference viz. podcasting is that atom:link and atom:content both allow media type declarations as metadata, whereas only rss2:enclosure allows the media type to be set; rss2:link, in its absence, will defer to the media type set in the HTTP response. Some of this also comes down to conflating "podcasting" with "enclosures" with "links". It makes conversational sense to avoid such pedantry, in the way "AJAX" and "Web2.0" make broad conversational sense, but technically it is woolly thinking.

The conclusions I draw from this are:

  • Atom and RSS2.0 don't support "podcasting". Technically speaking, "podcasting" is about as meaningful here as "Web2.0" or "AJAX".
  • Atom and RSS2.0 support the notion of an enclosure, which is the basis for most "podcasting" functionality. Mixing up enclosures and podcasting is a mistake - it doesn't necessarily make sense to limit enclosure functionality to podcasting.
  • Atom and RSS2.0 have the notion of a link, which could be used to support linking to audio and other non-textual media. Mixing up enclosures and linking is a mistake - it's probably easier to dispatch functionality for something like podcasting when you hang the data off an enclosure structure.
  • Irrespective of how podcasting is to be enabled, RSS aggregators need to figure what to do for audio/visual media types coming in via links, which might include doing nothing and deferring dispatch responsibility to the embedded browser/renderer control.
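To illustrate the enclosure-versus-link distinction, here is a small Python sketch using the standard library's ElementTree. The feed fragments and URLs are invented for the example, the Atom namespace is assumed to be the Atom 1.0 one, and the two helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

RSS_ITEM = """<item>
  <link>http://example.com/show/1.mp3</link>
  <enclosure url="http://example.com/show/1.mp3"
             length="123456" type="audio/mpeg"/>
</item>"""

ATOM_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="alternate" href="http://example.com/show/1"/>
  <link rel="enclosure" type="audio/mpeg"
        href="http://example.com/show/1.mp3"/>
</entry>"""


def rss_enclosures(item):
    """(url, media type) pairs from rss2:enclosure elements."""
    return [(e.get("url"), e.get("type")) for e in item.findall("enclosure")]


def atom_enclosures(entry):
    """(url, media type) pairs from atom:link elements with rel='enclosure'."""
    return [
        (link.get("href"), link.get("type"))
        for link in entry.findall(ATOM_NS + "link")
        if link.get("rel") == "enclosure"
    ]


rss = rss_enclosures(ET.fromstring(RSS_ITEM))
atom = atom_enclosures(ET.fromstring(ATOM_ENTRY))
```

Both enclosure forms hand the aggregator a media type up front; the bare rss2:link in the first fragment carries only a URL, so dispatch would have to wait for the HTTP response.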

Apples

To add some flavour, Apple has published a spec for enabling podcasting via RSS2.0. It's specifically targeted at the iTunes application, and can be described as an RSS2.0 extension for that purpose. I think it would be weasel-worded to describe this as a 2nd (or even 3rd) way for RSS2.0 to support podcasting, even though it is more closely aligned with that kind of functionality than an rss2:link. It's more of an application-specific extension for iTunes.

As an extension it could be supported by any software that supported RSS2.0. More interesting, there is nothing to stop this iTunes extension being published via Atom or any other syndication format, making it a kind of feed-agnostic vendor-specific microprotocol. I personally expect to see more of these as a function of the markets enabled by more generic innovations, such as enclosures.

August 11, 2005

More database type switching

I got some quick feedback on my question about how to treat disjoint types in an RDBMS, but it seems I left out some detail and I might have posed the wrong question altogether. To recap, there's an event structure as follows:

  class event:
    def __init__(self, what, where, when):
      self.what = what
      self.where = where
      self.when = when

whose 'what' value can be a string, a URI or an XML document. The thing is that these 3 types are disjoint and I was wondering what people thought the idiomatic way to deal with this issue was in an RDBMS.

Bill Seitz mentioned sparse tables where one of the 3 possible 'what' columns is populated for each row:

"Or, maybe I'd make whatType, whatString, whatUri, and whatBlob fields in a single (sparse) table."

Aristotle Pagaltzis described a normalised approach and its potential runtime inefficiency:

"The clean, minimally redundant approach would be to use four tables, of which one is the 'event', table which holds only when/where pairs and a primary key, and of which the other three are 'what' tables whose the primary keys are simultaenously foreign keys to the event table. This way each datum can be stored in a properly typed column, without storing boatloads of NULLs as you’d have to if you did this with a single table having one column per type of value....Unfortunately, this is stupidly costly to query – you need three left joins in every single statement.Worse, you need the primary key from the event table before you can update any of the what tables, so you have to chatter back and forth with the database instead of dumping bulk statements on it."

Adam Vandenberg asked:

"Are you going to query against "what", or just process them when they come up in a query?"

So it seems to be the case that instead of thinking about a generified RDBMS setup here for disjoint types, we need to think about what needs to be done with the data. Ok, so of the 3 possible types (text, URIs, XML) two of them are candidates for querying against interactively:

  • The 'well-known' XML format is a candidate to be queried on as it has a standard header set; that would be much more useful to capture as a table than as a blob. That way we can ask questions like: "show me all the events whose XML header has a foo element (now a column) of 'bar'".
  • The URI is a candidate to be queried on since it is a name of some class of events: "show me all the events with a URI of 'X' since this date".
  • The text data I wouldn't expect to query against, just render.

That would tend to lead to them being kept in their own tables. In terms of volumes, I'd imagine we'd be seeing around 25,000 of these events each week, where about 80% are 'well-known' XML and for the sake of argument let's say we could flush the database annually leaving a running total of about 1,300,000 records.
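As a sketch of what "their own tables" might look like, here is a small sqlite3 example in Python. The table and column names, the sample URIs and the `add_uri_event` helper are all assumptions made for illustration, not a schema from the original discussion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (
        id        INTEGER PRIMARY KEY,
        where_uri TEXT NOT NULL,
        when_ts   TEXT NOT NULL          -- ISO 8601, so string order is date order
    );
    -- One table per 'what' type; the primary key doubles as a foreign key.
    CREATE TABLE what_uri  (event_id INTEGER PRIMARY KEY REFERENCES event(id),
                            uri TEXT NOT NULL);
    CREATE TABLE what_text (event_id INTEGER PRIMARY KEY REFERENCES event(id),
                            body TEXT NOT NULL);
    CREATE TABLE what_xml  (event_id INTEGER PRIMARY KEY REFERENCES event(id),
                            foo TEXT, body TEXT NOT NULL);
""")


def add_uri_event(where, when, uri):
    # The event row must exist first so its key can be reused in what_uri.
    cur = conn.execute(
        "INSERT INTO event (where_uri, when_ts) VALUES (?, ?)", (where, when))
    conn.execute(
        "INSERT INTO what_uri (event_id, uri) VALUES (?, ?)", (cur.lastrowid, uri))


add_uri_event("http://example.com/host/a", "2005-07-01T09:00:00", "urn:example:X")
add_uri_event("http://example.com/host/a", "2005-08-10T09:00:00", "urn:example:X")
add_uri_event("http://example.com/host/b", "2005-08-11T09:00:00", "urn:example:Y")

# "Show me all the events with a URI of 'X' since this date."
recent_x = conn.execute(
    """SELECT e.when_ts, e.where_uri
       FROM event e JOIN what_uri u ON u.event_id = e.id
       WHERE u.uri = ? AND e.when_ts >= ?
       ORDER BY e.when_ts""",
    ("urn:example:X", "2005-08-01"),
).fetchall()
```

A query that only cares about one 'what' type needs a single join; it is the "give me everything about every event" query that pays for the three left joins.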

Incidentally, Jimmy Cerra mentioned that in RDF:

"I'd use a NODE type and have the object be either a URIRef or a typed literal (definitely not rdf:type though)."

and if any of the RDF community are reading, you can multiply the above figures by 10 - each one of these events results in approximately 10 RDF statements. I gather that 6-10M triples is the state of the art in RDF storage but here we'd be talking about 13M statements, which I think would argue for partitioning the data into separate graphs.
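For anyone checking the sums, the figures work out as follows. This is just the arithmetic from the numbers quoted in this post, assuming a 52-week year.

```python
EVENTS_PER_WEEK = 25_000
WEEKS_PER_YEAR = 52
STATEMENTS_PER_EVENT = 10  # approximate, per the estimate in the post

events_per_year = EVENTS_PER_WEEK * WEEKS_PER_YEAR
well_known_xml = events_per_year * 80 // 100  # the 80% that are 'well-known' XML
rdf_statements = events_per_year * STATEMENTS_PER_EVENT
```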

August 10, 2005

Database type switching

Suppose you had an event object that had the following properties:

  class event:
    def __init__(self, what, where, when):
      self.what = what
      self.where = where
      self.when = when

Now, in the above imagine that 'when' is a date and 'where' is a URI. But suppose in your domain the 'what' could be one of three types:

  • a URI: this means something well known enough that we have a pre-canned name for it.
  • a String: this means it's some kind of opaque literal, like a java stacktrace or hexed coredump, or maybe some chunk of XHTML.
  • some XML format: this means the event has a 'well-known' XML format that we know how to process, ie some SOAP thing or an Atom entry.

The interesting thing here is that these 3 types are disjoint; they're not specializations of some common form, modulo the crashingly trivial observation that they could all be represented as text. In code or markup you'd probably have some extra metadata about the 'what' that would allow you to switch-on-type. For example, in XML that could result in different element names altogether for each type, or maybe some @type attribute; in RDF, it would be an rdf:type annotation.
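In code, the switch-on-type idea might look something like the sketch below. The `WhatType` enum, the `Event` dataclass and `describe` are hypothetical names chosen for the example; the point is only that the type travels with the value as explicit metadata.

```python
from dataclasses import dataclass
from enum import Enum


class WhatType(Enum):
    URI = "uri"
    STRING = "string"
    XML = "xml"


@dataclass
class Event:
    what: str
    what_type: WhatType  # the extra metadata used to switch on type
    where: str
    when: str


def describe(event):
    """Dispatch on the declared type rather than guessing from the text."""
    if event.what_type is WhatType.URI:
        return "named event " + event.what
    if event.what_type is WhatType.XML:
        return "structured event, parse as XML"
    return "opaque text, render only"


e = Event("urn:example:disk-full", WhatType.URI,
          "http://example.com/host/a", "2005-08-10")
label = describe(e)
```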

So given that the range of value types can be disjoint I was wondering what people think the idiomatic way to represent this event structure in an RDBMS would be. Have a foreign-key for the 'what' column pointing to 1 of 3 possible tables? Make the 'what' column a blob? Change the design not to use disjoint types? Have the three 'what' tables hold foreign keys back to the event table? Or some other approach?

August 07, 2005

Jive Messenger 2.2 ships

Jive Messenger is a Java XMPP IM server (aka Jabber). Jive Messenger 2.2 is out and has some neat looking new features: Server to Server support with white/black lists, Asterisk phone integration and improved LDAP. Find out more here.

August 04, 2005

Django debugging followup

Adrian Holovaty came back quickly with a comment about my recent Django debugging session, which was centered around how Django loads your models and what it expects to find there. It's worth lifting in full for those of you interested in Django:

"Hey, thanks for checking out Django! Sweet.
We are *very much* interested in providing XML backends as an alternative to database backends. That sort of flexibility would be outstanding, and there's certainly an audience for it. Please bring this up on the django-developers list -- http://groups-beta.google.com/group/django-developers -- if you have a second. Anything would help, from broad implementation ideas to concrete patches.
Lemme address some of your issues --
* I18N -- We're very committed to getting this done. Problem is, we don't know the best way to do it; this is very new to us. People in the community have made suggestions here and there, but nobody feels comfortable taking on the *whole* problem. If you could contribute to that effort, *awesome*. Check out our developers list, whose link I've pasted above.
* LAYOUT -- Yeah, I'm not too crazy about the default module layout myself, and I'm the one who arranged it. We're very open to suggestions on this. Most of the layout doesn't really matter as far as Django is concerned, so we have a fair amount of flexibility in changing it.
* COMMUNITY -- The IRC channel is very-much self-sustaining at this point, in terms of people helping other people. Jacob (co-developer) and I might be open to giving people commit access, if that's what you're implying, but we're still getting our toes wet, getting comfortable.
* THE BUG YOU POINTED OUT -- I agree that that's a bug. Thanks for bringing it up. There are two ways to fix it: Change the default code generator to create "stub" models, and/or change the model parser to allow empty models. Both changes sound good to me, and I'll get to them ASAP."

and later:

"I've fixed the bug you brought up. Thanks again for reporting it!"


And I've duly updated against the Subversion trunk. I'm impressed, won over, and a little humbled. This is the kind of positive feedback that has marked out the RoR folks over the last year and is common (I think) in the Python world. Having signed up to the dev list, my concerns about community in particular seem like kvetching - the Django team are accepting and applying patches. And it's not like I formally filed a bug on the Django tracker.

The Django experiments are still going well - given that I'm not using any of the RDBMS heavy lifting and learning a framework, I'm getting through features at a fair clip.

Free your mind

"Still, I believe that forcing programmers to consider encoding issues whenever they have to store some text is a very useful exercise, since otherwise - this is important - foreign language users may be completely unable to use your application. What is to you simply a question-mark or box where you expected to see an "é" * is, to billions of users the world over, a page full of binary puke where they expected to see a letter they just typed.
Consider the other things that data - regular python 'str' objects - might represent. Image data, for example. If there were a culture of programmers that expected image data to always be unpacked 32-bit RGBA byte sequences, it would be very difficult to get the Internet off the ground; image formats like PNG and JPEG have to be decoded before they are useful image data, and it is very difficult to set a 'system default image format' and have them all magically decoded and encoded properly. If we did have sys.defaultimageformat, or sys.defaultaudiocodec, we'd end up with an upsetting amount of multi-color snow and shrieking noise on our computers." - Glyph Lefkowitz

The comparison of text encodings to image encodings is a good one. One of the hard things about explaining or understanding this stuff is that you have to stop believing your eyes when you see text on a computer screen - with a computer, text is an illusion.
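A few lines of Python 3 show the illusion at work, using the "é" from the quote. The variable names are just for illustration.

```python
text = "é"
utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # one byte: 0xE9

# Same bytes, wrong decoder: the reader sees two unrelated characters.
misread = utf8_bytes.decode("latin-1")

# Latin-1 bytes are not valid UTF-8, so a strict decode fails outright...
try:
    latin1_bytes.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# ...and a lenient decode substitutes U+FFFD, the replacement character.
replaced = latin1_bytes.decode("utf-8", errors="replace")
```

The bytes never change; only the assumption about how to read them does, which is why there is no safe "default" to fall back on.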


* ed note: I had to hand convert this to UTF-8, since cut and paste would have resulted in binary puke

August 03, 2005

Lists. Absolutely. Expect no better.

"For the record, I had the same reaction to OPML. I still don't get that XML either. Do we need OPML *and* MSFT Simple List Extension? Do we need either? I am unconvinced." -Patrick Logan

I am very much enjoying Patrick's blog, syndicated as it is, using a simple XML format. So simple that once people thought it was pretty pointless, even for syndication and sitemaps. Today it's on the verge of becoming the lingua franca for everything from searches to shopping to web services.

Lists are next. There is so much web data in ordered lists, it's not funny. An XML format for lists, however trivial, is going to have a massive impact. We could even end up with 2 or 3 different list formats and it would still have a massive impact. If it seems trivial beyond belief... well I guess there's no denying that, but it's exactly the absence of barriers to entry that makes stuff like Atom/RSS and List markups cool.

August 02, 2005

Goll! That's like, way too big!

How come I can't get one of these for a hundred bucks on a USB key? You know, Fortune 50 million and all that.

Down on the Client

Om Malik:

"I tag my post, Technorati benefits, and despite all that, my tags help spammers who clog my RSS readers gain more readers. Thats absolutely rotten! So essentially the spammers can write a script, generate tags, stay high on the Technorati listings and fool people into visiting their sites. By tagging I am helping this scumbags, the RSS-link blog spammers." -The Dark Side of Technorati Tags

I've said this before: tags belong down on the client. Get those search engines out of the clouds. As for spammers - that's the price of Web anonymity. Anonymity, last time I checked (yes, I did check), is not a necessary feature to make the web work.

August 01, 2005

ping pubsub.com

James Robertson outlines, in a roundabout way, one of pubsub's core offerings: "That used to work, but it doesn't anymore. What they need to do is track mentions of their products and services, and watch for emerging memes that might need higher level handling". This stuff will eventually function like security detection and control monitoring systems - think of it as a brand intrusion detection system.

Smalltalk by other means

CollectionClosureMethod is one of the best things Martin Fowler has written in a while. Go read it. I love it when he informs through code snippets.

You do get the feeling sometimes the statesmen of the agile community consider the last 10 years of statically typed container backed middleware a distraction from the real goal - the ability to implement business value faster than the business can think it up.

Django debugging, and MVCP frameworks

If you happen to be the only other person writing a Django app without using the free (as in labour) RDBMS mapping stuff (yup, it's RDF + XML), here's a gotcha that might interest you.

If you see a stacktrace that ends like this:

    ...
    File "E:\lamp\python\Python24\lib\site-packages\django\core\meta.py", 
      line 303, in get_all_related_objects
        for klass in mod._MODELS:

  AttributeError: 'module' object has no attribute '_MODELS'

You're in for a treat. What that means is that a model file in one of your applications looks like this:

   from django.core import meta

   # Create your models here.

because you haven't put anything in it yet. When I came across this I was calling the top page of an app named 'units':

  http://localhost:8000/units/

Previously that had been working fine. I had added an app called contacts recently:

  http://localhost:8000/contacts/

and that was working fine - it was really just a placeholder returning a raw string while I was testing the URL dispatching. Here's the view stub for contacts.py:

  from django.utils.httpwrappers import HttpResponse

  def index(request):
      return HttpResponse("true")

But units/ was failing now. I hadn't changed the units view (honest!):

  from django.core import template_loader
  from django.core.extensions import DjangoContext
  from django.utils.httpwrappers import HttpResponse

  from mobius.apps.units.models import units

  def index(request):
      unit_list = units.get_summary_list()
      unit_list.sort()
      t = template_loader.get_template('units/index')
      c = DjangoContext(request, {
          'unit_list': unit_list,
      })
      return HttpResponse(t.render(c))

    ...

Nothing much fancy there. Did I say I hadn't changed it?

So I run off and check the main.py configuration, the URL mappings (the main pain source to date), and that the test runner webserver was configured properly with envars and pythonpath. All fine. Take the contacts app away - units works. Put the contacts app back - contacts works, units fails. No joy: I'm left with a sort of Heisenbug - a new working app is breaking a previously working app.

Time to walk the execution stack in Django.

Here's what happens. When the /contacts/ URL is called, Django will look up its regex mappings and dispatch to the view that meets the pattern. Core will invoke the view and it will run just fine - bear in mind that the contacts model is never loaded. When the /units/ URL is called, Django will look up its regex mappings and dispatch to the view that meets the pattern as before, unless that view is also calling DjangoContext, whereupon we go down the rabbithole.

Using DjangoContext results in a new execution path that causes each of your app's models to be introspected and reflected over using meta.py. What Django is doing is looking for all the related models that might be in scope for this request; any it finds will be loaded, out of the RDBMS if needed.

Now, it's important to note that in this latter case, all the available apps are in scope to be reflected over, not just the one you called via the URL. Hence, if any of the model files has no content, this results in Django's introspector crashing out of iteration and sending the message "AttributeError: 'module' object has no attribute '_MODELS'".
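The failure mode can be reproduced outside Django with a few lines of plain Python. To be clear, this is not Django's code: only the `for klass in mod._MODELS` loop is taken from the stacktrace, while the stand-in modules and both helper functions are hypothetical.

```python
import types

# Stand-ins for two model modules: one populated, one still an empty stub.
units_models = types.ModuleType("units")
units_models._MODELS = ["Unit"]
contacts_models = types.ModuleType("contacts")  # nothing defined yet


def related_strict(modules):
    """Mimics the failing loop: assumes every module defines _MODELS."""
    found = []
    for mod in modules:
        for klass in mod._MODELS:
            found.append(klass)
    return found


def related_tolerant(modules):
    """Treats a module without _MODELS as simply having no models."""
    found = []
    for mod in modules:
        found.extend(getattr(mod, "_MODELS", []))
    return found


try:
    related_strict([units_models, contacts_models])
    crashed = False
except AttributeError:
    crashed = True

tolerant = related_tolerant([units_models, contacts_models])
```

One empty module is enough to take down the whole iteration, which is why a request for units/ fails on account of contacts.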

Here's what I did to fix the problem:

  from django.core import meta

  # Create your models here.

  class Contact(meta.Model):
      fields = (
          meta.CharField('name', maxlength=256),
          meta.CharField('email', maxlength=256),
      )
      def __repr__(self):
          return "contact"

All that Contact class does is stop Django's introspector crashing when units is invoked. The units/ view will work fine now.

So here's the deal: yes you can compose and plug apps in Django, and it's very nice, but an app with no attributes in its model can cause breakage in working apps that require a model check.

I'm still debating whether this is a real bug and is worth filing. I guess it can be dealt with by having Django's application code generator put some kind of stub attribute into the model. On the upside I got to learn a lot about Django internals.
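For comparison, the other place to deal with it is in the code doing the walking rather than in the generated model file. This is purely a sketch of that alternative, under the assumption that a missing _MODELS can safely be read as "this app has no models"; collect_models_tolerant is a made-up name and I'm not claiming Django does or should work this way.

```python
import types

def collect_models_tolerant(model_modules):
    """Gather models, treating a module without _MODELS as having none."""
    found = []
    for mod in model_modules:
        # getattr with a default skips empty model files instead of crashing.
        found.extend(getattr(mod, "_MODELS", ()))
    return found

units = types.ModuleType("units")
units._MODELS = ["Unit"]
contacts = types.ModuleType("contacts")  # empty model file: no _MODELS set
```

With this, collect_models_tolerant([units, contacts]) just returns the units models and the empty app is ignored.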

Django, nonetheless, is looking good. It has first-rate support for URL design, intuitive URL/function mapping, a decent template language with pluggable templating as an option, composable webapps (aka portlets, aka plugins), and on the whole is a productive environment. And it's not a 1.0 yet. I have some quibbles:

i18n: i18n isn't there yet (it's coming, I hear). While the Java frameworks do lack on the productivity side, they generally do the right thing by i18n.

Layout: Django breaks your app into three folders: views/models/urls. If your app is called polls you get this layout: polls/views/polls.py, polls/models/polls.py, polls/urls/polls.py. That's a lot of polls. Now, having three files called polls.py and trying to figure out which is which in the editor can get irritating. I like the idea of split directories in principle - in practice the fewer directories the better. I really doubt the Django people are going to do anything about this (they did build it after all, not me). But a whole module for a tuple of url regexes? My best guess is that it enables app composition and the name-munging needed for automation.

Community: I will be interested to see how the Django team deal with the influence of feature requests and bug reports. So far they are fixing them quickly and attributing in the commit comments, but at some point the Django team need to think about how the project could function and support a community without them.

RDBMS Models: as the issue I started with indicates, if you are not using an RDBMS to back your webapp's models you are a second class citizen in Django. Django has enough features and niceness to make living there ok, but you'll never be able to put it out of your mind. I imagine this could result in implementation by exhaustion: "oh screw it, slap it in the RDBMS, I is tired".

MVCP frameworks

This issue of the RDBMS is terribly conflicting if you believe to some extent, as I do, that RDBMSes are not necessarily your first choice place to put long-lived data, or at least not the only choice. Ruby on Rails and Trails also present this dilemma. I really recommend the Prag's Ruby on Rails book for anyone building any kind of webapp, but it's interesting to see them promote a framework that defaults to using an RDBMS. If you know your agile history, you'll know that starting with an RDBMS is considered questionable. Something's afoot - or not. It's tempting to conclude that for all the talk about web MVC, you can't easily automate to the level of the next generation web frameworks without putting your Persistence mechanism up there with the Model. It seems that to reach the levels of productivity implied by Rails, Trails and Django, we're talking about MVCP frameworks, where P stands for Persistence. It's for each to decide whether that's a good tradeoff. Open data is king, and putting data into a silo to gain a quick productivity win early on is dubious. It's not clear how good the support in any of these frameworks is for alternate persistence mechanisms.