Bill de hÓra: January 2007 Archives


January 30, 2007

Mixed Content

Sean McGrath: "If you need mixed content you really need it."

[Filing under, why didn't I think of that.]

links for 2007-01-30

Django generic views and page not found

If you're working with generic views, and are getting a "Page not found (404)" message for a date based view, it's probably because you haven't got a model item with a matching date.
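A toy model of that behaviour, in plain Python rather than Django's own code. The names `Http404`, `Entry` and `archive_day` are stand-ins made up for illustration, and the `allow_empty` flag is an assumption modelled on the option the date based views are understood to accept; check the generic views documentation for the real signatures.

```python
import datetime
from dataclasses import dataclass


class Http404(Exception):
    """Stand-in for a framework's 'page not found' exception (hypothetical)."""


@dataclass
class Entry:
    title: str
    pub_date: datetime.date


def archive_day(entries, year, month, day, allow_empty=False):
    """Return the entries published on the given day, or raise Http404."""
    wanted = datetime.date(year, month, day)
    matches = [e for e in entries if e.pub_date == wanted]
    # No item with a matching date: this is the case that shows up as a 404.
    if not matches and not allow_empty:
        raise Http404("no entries dated %s" % wanted.isoformat())
    return matches


entries = [Entry("Mixed Content", datetime.date(2007, 1, 30))]
```

Asking for 2007-01-30 returns the one entry; asking for 2007-01-29 raises `Http404` unless `allow_empty=True` is passed, in which case an empty list comes back.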

January 28, 2007

links for 2007-01-28

January 27, 2007

3 pillars

Ben: "I have been helping preed bring Mozilla into the world of distributed version control systems. It sucks."

Ben is getting his head around DVCS. Having started to look at these recently coming from the world of Subversion, and prior to that, CVS, I can sympathise with his pain, especially if you're dealing with a behemoth like the Mozilla tree.

The conclusion I draw from this and my own experience having migrated my fair share of source trees is that the version control system is a first order effect on software, along with two others - the build system and the bugtracker. Those choices impact absolutely everything else. Things like IDEs, by comparison, don't matter at all. Even choice of methodology might matter less. Although I'm betting there are plenty of software and management teams out there that see version control, build systems and bugtrackers as being incidental to the work, not mission critical tools.

links for 2007-01-27

January 26, 2007

my new publishing technique is unstoppable

How to work with blog posts:
GET Introspection URI
   scan the list of workspaces for the collection you want to post the blogpost to
GET to Collection URI
   read the nice atom feed 
POST to the blogpost's collection URI
   push a blogpost formatted as a nice atom entry
GET or HEAD to blogpost  URI
   grab the  blogpost
PUT or DELETE to blogpost URI 
   change or delete the  blogpost
Next week: working with timesheets.
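The sequence above can be sketched with nothing but the standard library. This is a minimal sketch under stated assumptions, not a conforming client: the three example.org URIs are hypothetical, the helper names `build_entry` and `build_request` are mine, and the requests are only built, never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical URIs. A real client would read the collection URI out of the
# introspection document and the entry URI out of the response to the POST.
INTROSPECTION_URI = "http://example.org/app/introspection"
COLLECTION_URI = "http://example.org/app/blog"
ENTRY_URI = "http://example.org/app/blog/1"


def build_entry(title, body):
    """Serialise a blogpost as a bare-bones Atom entry, returned as bytes.

    Only title and content are set; atom:id, atom:updated and atom:author
    are left out on the assumption that the server fills them in.
    """
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element("{%s}entry" % ATOM_NS)
    ET.SubElement(entry, "{%s}title" % ATOM_NS).text = title
    content = ET.SubElement(entry, "{%s}content" % ATOM_NS)
    content.set("type", "text")
    content.text = body
    return ET.tostring(entry, encoding="utf-8")


def build_request(method, uri, entry=None):
    """Build, but do not send, one HTTP request in the sequence."""
    headers = {"Content-Type": "application/atom+xml"} if entry else {}
    return urllib.request.Request(uri, data=entry, headers=headers, method=method)


entry = build_entry("Mixed Content", "If you need it you really need it.")
steps = [
    build_request("GET", INTROSPECTION_URI),      # find the collection
    build_request("GET", COLLECTION_URI),         # read the feed
    build_request("POST", COLLECTION_URI, entry), # create the blogpost
    build_request("GET", ENTRY_URI),              # grab the blogpost
    build_request("PUT", ENTRY_URI, entry),       # change the blogpost
    build_request("DELETE", ENTRY_URI),           # delete the blogpost
]
```

Passing any of the `steps` to `urllib.request.urlopen` would actually send it; that part is left out because the URIs here don't point at a real server.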
This post's content is based on the following template
GET Introspection URI
<small><em>   scan the list of workspaces for 
the collection you want to post the %s to </em></small>
GET to Collection URI
<small><em>   read the nice atom feed </em></small>
POST to the %s's collection URI
<small><em>   create a %s formatted as a nice atom entry</em></small>
GET or HEAD to %s  URI
<small><em>   grab the  %s</em></small>
PUT or DELETE to %s URI 
<small><em>   change or delete the  %s</em></small>
(I'd tell you to buy mnftiu gear, but it's always sold out.)
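For what it's worth, filling in the template is a one-liner in Python. A small sketch, assuming the markup is dropped and the positional `%s` slots are swapped for a single named placeholder (`thing`, my name for it) so the noun only has to be supplied once:

```python
TEMPLATE = """GET Introspection URI
   scan the list of workspaces for the collection you want to post the %(thing)s to
GET to Collection URI
   read the nice atom feed
POST to the %(thing)s's collection URI
   create a %(thing)s formatted as a nice atom entry
GET or HEAD to %(thing)s URI
   grab the %(thing)s
PUT or DELETE to %(thing)s URI
   change or delete the %(thing)s
"""


def render(thing):
    """Fill every slot in the template with the same noun."""
    return TEMPLATE % {"thing": thing}


next_week = render("timesheet")
```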

from the mouth of babes

Nick Gall: "in almost every important aspect, WS-* violates (or at best ignores) the architectural principles of the Web as described in the W3C's Architecture of the World Wide Web". That, from a Gartner VP.

links for 2007-01-26

January 25, 2007

XForms in Firefox

Elliotte Rusty Harold: "XForms makes development of Web-deployed applications faster and easier." Nice show-not-tell piece from Elliotte.

January 24, 2007

links for 2007-01-24

January 23, 2007

OSGi JSR passes public review

JSR291 sailed through the public ballot. It will be nice to have OSGi packaging widely available on servers. But the ASF's comments about the current licensing are critical.

XQuery 1.0 and XSLT 2.0 ship

W3C XQuery 1.0 and XSLT 2.0 Become Standard

15,000 tests, one formal semantics and about 6 years later. Guess I'll have to buy the next version of Brundage, now that it's a rec.

links for 2007-01-23

January 20, 2007

links for 2007-01-20

January 19, 2007

links for 2007-01-19

January 18, 2007

It lives!

Jabber Software Foundation Renamed to XMPP Standards Foundation (some context for the title).

via cote


Bjarne Stroustrup: "There are more useful systems developed in languages deemed awful than in languages praised for being beautiful--many more. The purpose of a programming language is to help build good systems, where "good" can be defined in many ways. My brief definition is, correct, maintainable, and adequately fast. Aesthetics matter, but first and foremost a language must be useful; it must allow real-world programmers to express real-world ideas succinctly and affordably."

links for 2007-01-18

January 17, 2007

FIPA Abstract Architecture

Who are FIPA?

FIPA are the Foundation for Intelligent Physical Agents; a vendor consortium to define architecture and standards for software agents. FIPA has strong links with the OMG and there is some cross-fertilization with the W3C via the latter's Web Ontology program. So as to avoid the black hole of asking what is an agent, you can replace the word "agent" with "service", and this should give you a good idea of FIPA's relevance: think of agents as being consumers and producers of services.

FIPA Architecture

The key FIPA document of interest is the Abstract Architecture (AA). This document describes basic infrastructure and plumbing required to enable agent communications. The architecture is remarkably similar to what is commonly called a Web Services stack (WS). FIPA has been working with this architecture for about 9 years.

There are numerous implementations of the AA, most of which are in Java. It's fair to say that FIPA were late to notice the similarities between Web Services (WS) and the AA. For whatever reasons, FIPA has not decided to publicize or orient its efforts with respect to web services infrastructure.

The main differences between FIPA AA and the WS "stack" are 1) the level of abstraction (the AA is more abstract), in that FIPA does not mandate specific technologies such as XML or specific standards such as WSDL and SOAP, 2) FIPA assumes the message payload to be a well structured and well understood item, based on what are called communicative acts (messages that are designed to have an effect on the receiver) and very probably grounded using shared ontologies, 3) FIPA does not have the concept of a procedure call, the only things sent between agents are messages, 4) FIPA have a full understanding of the requirements for a services oriented infrastructure thanks largely to its history with the OMG while the WS community is still working through the issues via XML based standards and extensions to both container managed platform middleware such as J2EE and Web based architectures (such as Apache Axis), often under the label of 'SOA'.

FIPA Enveloping

A key architectural component in the AA is the agent message and the methods whereby that message is enveloped for transport using arbitrary protocols and networks. I'd encourage anyone to look at how the AA enveloping model works, noting the following:

  • There is an architectural distinction between a message and a transport-message. This distinction derives from the separation of agents and transports. Messages are transformed into payloads that are suitable for insertion into a transport-message. The transport treats the payload as opaque both semantically and structurally. By way of comparison, SOAP is also semantically opaque; however, SOAP requires that message content be well formed XML element content since payloads are physically nested, i.e. SOAP payloads are not structurally opaque. AA transport-message serializations for the web have tended to use multipart-mime to achieve structural opacity (much like using SOAP+attachments in favour of vanilla SOAP).
  • There is a clear separation between the names of senders and receivers and their addresses. Agents only require names to send messages. Gateways and transports do not require names to manage message delivery and routing. The binding between names and addresses is achieved via an object called a Locator, which is designed to be accessible via directory services such as X500, LDAP or JNDI.
  • Both messages and transport-messages are based on dictionaries of well known keys with associated values, or in AA parlance "Key Value Tuples" (KVT). This makes both elements extensible; there are minimal conformance levels required for interoperable messaging (read: you have to populate a certain number of headers to interoperate).
  • Indeed, almost every architectural element of the AA is a KVT; the AA is entirely compositional and does not make use of subclassing or inheritance mechanisms. FIPA do specify keys and their meaning under the namespace "org.fipa.*".
  • The AA messaging elements don't specify semantics for routing or orchestration.
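The enveloping model in the points above can be sketched in a few lines. This is a rough sketch under stated assumptions, not FIPA's specification: the class names, the key names (`sender`, `receiver`, `to-address` and so on) and the choice of JSON as the payload encoding are all mine for illustration, and a plain dictionary stands in for a directory-backed Locator.

```python
import json
from dataclasses import dataclass


@dataclass
class Message:
    """Agent-level message: a dictionary of well known keys (a KVT)."""
    fields: dict


@dataclass
class TransportMessage:
    """Envelope plus payload; the transport never looks inside the payload."""
    envelope: dict
    payload: bytes


class Locator:
    """Binds agent names to transport addresses (stands in for a directory)."""

    def __init__(self):
        self._addresses = {}

    def bind(self, name, address):
        self._addresses[name] = address

    def resolve(self, name):
        return self._addresses[name]


def envelop(message, locator):
    """Turn a message into an opaque payload and wrap it for transport."""
    # The encoding is an assumption; the AA does not mandate one.
    payload = json.dumps(message.fields, sort_keys=True).encode("utf-8")
    envelope = {
        # Only addresses appear here; names stay inside the payload.
        "to-address": locator.resolve(message.fields["receiver"]),
        "from-address": locator.resolve(message.fields["sender"]),
        "payload-length": len(payload),
    }
    return TransportMessage(envelope, payload)


locator = Locator()
locator.bind("buyer", "http://example.org/agents/1")
locator.bind("seller", "http://example.org/agents/2")
message = Message({"sender": "buyer", "receiver": "seller", "content": "quote?"})
transport_message = envelop(message, locator)
```

The agent only ever deals in names, the transport only ever deals in addresses, and because both structures are plain dictionaries, extending either one is a matter of adding keys.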

Why didn't this catch on?

It's not spurious to claim that the AA is the most mature (though not widely used) expression of a distributed services oriented architecture available today, particularly in its notion of what the core platform services are. You could take the AA, ground it in WS standards and have a well-structured platform that will remain reasonably future proof as WS standards evolve.

FIPA AA perhaps did not catch on as a basis for either WS or SOA for a number of reasons. First, it is lumbered with the notion of intelligent agents (IA), which fell out of favor post the dotcom bubble. Second, it is a very ambitious standard set and arguably required or assumed too much by way of deployed infrastructure (a case of running before you could walk). Third, it carried over considerable baggage from distributed object technology (notably CORBA), that makes over-internet networking difficult. Fourth, it might have been too abstract, and introduced too many options especially in terms of on the wire protocols and concrete syntax (even WS pragmatically fixed on a concrete syntax with SOAP). Fifth, the programming model for dealing with application message semantics is very different from the 'business logic' most commercial middleware developers are familiar with.

It's interesting to note that IA terminology is creeping back into the ESB and SOA parlance, typically where either clients of services or services are occasionally called 'agents', or where the need for application level interop driven by shared message semantics or ontologies is considered important. It's more likely here that 'agency' happens to be a natural way for engineers to talk about SOA/ESB, rather than a sudden rediscovery of a decade's worth of IA research and development. Talking in terms of 'services' is arguably largely done for the benefit of non-technical stakeholders.

January 16, 2007

links for 2007-01-16

January 14, 2007

links for 2007-01-14

January 13, 2007

links for 2007-01-13

January 11, 2007

links for 2007-01-11

January 10, 2007

Get those 3d glasses on!

Originally uploaded by bdehora.
My new ubuntu's eclipse icon has purple fringing.

January 09, 2007

Mercurial, Part I

What is it? Mercurial is a distributed version control system (DVCS), written in Python and released under the GPLv2. This post is an initial impression and notes after playing with it for a few days - please don't construe it as a recommendation pro or con, or in any way a complete overview or assessment of what Mercurial can or can't do. [calling this a "review" would be unfortunate.]

Why look at a DVCS? For version control I've used Subversion almost exclusively for over 3 years, before that it was CVS. I'm interested in the distributed VCSes pragmatically for 4 reasons. First to allow offline or disconnected development. Second, to deal with OSS codebases and codestreams that I either depend on, have to patch, or have to upgrade and thus end up having something else to manage - this issue gets bigger for me each year. Third, as a way of distributing code without having to manage or worry about forks or long running branches - this issue for me is small, but growing. Fourth, as a potential publishing tool for content management and distributed authoring. And personally I'm plain interested in distributed computing systems.

Another theme with DVCSes is scaling for the code and committer base. The Linux kernel is famously distributed. OpenSolaris is, and the JDK will by the looks of things, run on Mercurial, primarily it seems for scaling reasons. A list of projects using hg is on the mercurial wiki (my surprise inclusion was MoinMoin). Not everyone has this need, but Mercurial is claimed to be able to scale down to work in the small as well, which is interesting.

Installation and setup. I upgraded to Ubuntu Edgy so I could run Mercurial 0.91. Ubuntu Dapper universe defaults to 0.7x *, and I wouldn't recommend using below 0.9 because of the improved revlog and HTTP push support. Installation was breezy. Command line usage is like CVS/SVN and is highly idiomatic. The executable is called 'hg'. Subversion users will recognize many of the commands. Creating a repository is easy - the repository folder is yours and hg adds a .hg subfolder that contains the metadata and version history.

To be clear - even though I used apt to install it, Mercurial is self-hosting, which is a must imo before looking at any kind of VCS, and impressive given the project is barely two years old.

Concepts. In Mercurial you don't check into a central server, you work locally. To be clear, the repository is created on your file system - you work on top of it and not in a checked out sandbox (this is somewhat like RCS as I remember it).

To coordinate changes you either push changes to another repository or publish your repository for others to pull from. This means you are passing around sets of changes between repositories that must be integrated ("merged") locally.

Each commit to a repository results in a new version of the repository; the commit itself is recorded as a set of changes, called a changeset, or cset. Changesets have globally unique ids, and can be given symbolic names, called tags; this is not the same concept as found in Subversion or CVS, but in Mercurial tags are useful as a vernacular for sharing changes across repositories.

Mercurial's branching/merging model itself is conceptually simple - a branch results in two child repositories coming from a common parent - a merge creates a new repository that is the child of two parent repositories. New repositories are created by cloning an existing one.

You can branch and merge locally or publish changes to others over a network. This is not the same as Subversion, where a branch is an internal copy of a subtree; and compared to Mercurial, Subversion doesn't have any notion of tracking merges.
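The model described above is small enough to mock up. What follows is a toy, not how Mercurial is implemented: the `Repo` class and its methods are hypothetical, ids are a SHA-1 over just the parent ids and a description (the real thing hashes considerably more), and the merge does no checking that the two heads have actually diverged.

```python
import hashlib


class Repo:
    """Toy repository: a dict of changeset id -> (parent ids, description)."""

    def __init__(self, changesets=None, tip=None):
        self.changesets = dict(changesets or {})
        self.tip = tip

    def clone(self):
        """A new repository starts life as a full copy of an existing one."""
        return Repo(self.changesets, self.tip)

    def _add(self, parents, description):
        raw = "|".join(parents) + "\n" + description
        cset_id = hashlib.sha1(raw.encode("utf-8")).hexdigest()[:12]
        self.changesets[cset_id] = (parents, description)
        self.tip = cset_id
        return cset_id

    def commit(self, description):
        """Record a changeset whose single parent is the current tip."""
        parents = (self.tip,) if self.tip else ()
        return self._add(parents, description)

    def pull_and_merge(self, other, description="merge"):
        """Copy the other repository's changesets in, then join the two tips."""
        self.changesets.update(other.changesets)
        return self._add((self.tip, other.tip), description)


parent = Repo()
root = parent.commit("initial import")
mine, yours = parent.clone(), parent.clone()   # the branch: two clones
a = mine.commit("fix bug")
b = yours.commit("add feature")
m = mine.pull_and_merge(yours)                 # the merge: two parents
```

After the merge, `mine` holds all four changesets and the merge changeset records both `a` and `b` as parents, which is the bookkeeping Subversion doesn't do for you; `yours` is untouched until it pulls in turn.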

The thing to figure out, coming from a centralised repository background, will be idiomatic use. For example: when it comes to release management, whether to use named branches or repository clones seems to be an "it depends" matter.

Bryan O'Sullivan gives a good overview of the concepts in this video.

Best bits. Cloning+update or upstream push provides implicit backup solutions. The source is in Python, so I can read it without bleating about C being hard. RSS feeds are available by default over the Web UIs. However the most appealing feature is being able to work disconnected with the entire history available. That's huge, assuming you and your development methodology can get past the non-central model. My sense is that a distributed VCS requires more individual discipline in the development process and more inter-developer communications than the 'command and control' policy implied by a central server.

Definitely, tools like Mercurial are not for those who don't see version control and patch management as a critical part of the development process, or think that a VCS is a glorified backup server. There is an implied way of working with this kind of tool that not everyone will need or want.

A distributed VCS potentially helps solve a real problem - forking. Forking is not just an OSS thing. I take a more general view of it as being cut and paste in the large that results in duplication and (often) unwitting adoption of codebases. This kind of code adoption is especially hurtful to commercial projects. Here's an all too common pattern - checkout a 3rd party codebase from one repository, check into another. Customize, extend or fix the 3rd party code. Don't send back the upgrades, perhaps because you don't have time, perhaps because the adopted code is welded into the new software, but fundamentally because you've diverged sufficiently far away from the original (and now changed code) that you've got no easy way to rationalize the code you have to manage. Service and fixed price engagements exacerbate this, by not being optimized financially for long term code maintenance, support and reuse. Each individual engagement thus costs more than it should and scale opportunities are lost (local v global risks are traded off). Anything that ameliorates this is worth looking at imo.

Worst bits. I found hg push and repository publishing clumsy to set up. Coming from Subversion, one of the first things I wanted to do was publish a repository and be able to push and pull from it. Setting up for push/publishing is a nuisance. Tunneling over SSH is, as ever with all things, a pain. I gave up after a couple of hours of fooling around with authorized_keys, ssh-agent and friends. Error messages were not helpful. Instead I set it up to run multiple repositories behind a single Apache conf (which is how I generally set up Subversion). That took an hour due to a control character in the config file that stopped it from being read (arrgh). Eventually I'll get that behind SSL+Basic; for now I have a working master server with multiple repositories. In fairness to Mercurial, getting the administration setup right is a usability thing that can be fixed, and not intrinsic to the VCS itself.

I suspect Mercurial might not be IDE friendly as each new clone will need a new project set up for it. How important this is to you will depend. Those on Linux will end up using symlinks to dupe the IDE and emulate Subversion's switch command. This isn't to do with exposing metadata or SPI hooks to IDE tools, it's fundamental to how Mercurial branches work as standalone repositories.

The key message from the Mercurial community seems to be an emphasis on speed - in other words, it's like other DVCSes but niftier. My personal thing with any VCS is stability over the data, not speed - fast is good, but safe is better. Still, there's no rush; I evaluated Subversion for the best part of a year before moving my personal work to it, and hg won't be done for a while yet.

Conclusion. I liked it, despite some pre1.0 rough edges and some conceptual hurdles of my own I'll have to clear. I love the fact that I can work offline while subscribing to others' RSS feeds to pick up other changes. The distributed patch and branch management support seems to be extremely powerful (as in, I haven't entirely 'gotten' what's possible yet, and am sure to blow at least one foot off). The ability to manage 3rd party codestreams is given first class treatment in Mercurial whereas in Subversion you work with idioms like vendor branches. I hope they get renaming sorted out. It's fun to use; I'm going to move one non-critical project to Mercurial and continue to play with it over the next 6 months.

January 08, 2007


Danny Ayers: "With the GRDDL mechanism in place, as far as the Semantic Web is concerned, microformat data is RDF."

In other words, the work of generating RDF will be placed on people who want to use RDF. I think this idea of extracting RDF from published markup instead of using RDF as the backing data to generate the published markup is a big deal. For one, it will mean less RDF tax on existing publishers, who seem to be happy to stay with HTML, RSS and microformats (uF). Second, it distributes costs fairly - RDF proponents will be forced to derive value from what they extract instead of playing schedule chicken with publishers, and pushing costs back onto them to supply the data just so. Third, from a systems design viewpoint, extraction is a much cleaner design than trying to kludge RDF support on top of existing RDBMS storage and web frameworks. It's cheaper today to publish uF via web frameworks, databases and templates than retool internally with RDF based technology - uF by being HTML is a relatively low-impact upgrade on the templating tier, not a rip and replace of the data/object tiers. I've been saying for some time that the Semweb is missing a layer, the one that infers the useful information from syntactic markup. Maybe uF and GRDDL are that layer's ingredients. For starters, it would be very interesting to augment Planet software with scanning tools extracting RDF from uF and republishing the RDF for SPARQL queries or as RSS1.0, if only to see if anyone can or wants to derive value from it.
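To make the extraction idea concrete, here is a minimal sketch of pulling triples out of hCard-style markup. It is not GRDDL itself, which works by pointing at transformations (typically XSLT) from the document; a hand-written parser is swapped in here to keep it to the standard library. The vocabulary URI, the handful of class names handled, and the `HCardExtractor` and `extract` names are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical vocabulary URI, used only to make the triples look like RDF.
VOCAB = "http://example.org/vocab#"
PROPERTIES = {"fn", "url", "org"}


class HCardExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples from hCard-style classes."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject
        self.triples = []
        self._current = None  # property whose text content is awaited

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        for prop in sorted(classes & PROPERTIES):
            if prop == "url" and attrs.get("href"):
                # The value of a url property lives in the href attribute.
                self.triples.append((self.subject, VOCAB + prop, attrs["href"]))
            else:
                self._current = prop

    def handle_data(self, data):
        if self._current and data.strip():
            self.triples.append((self.subject, VOCAB + self._current, data.strip()))
            self._current = None


def extract(markup, subject):
    """Return the triples found in a fragment of published markup."""
    parser = HCardExtractor(subject)
    parser.feed(markup)
    parser.close()
    return parser.triples


markup = (
    '<div class="vcard">'
    '<a class="url fn" href="http://example.org/bill">Bill</a> '
    '<span class="org">Example Org</span>'
    "</div>"
)
triples = extract(markup, "http://example.org/page#card")
```

The publisher only had to add class attributes to a template; everything RDF-shaped happens on the consumer's side, which is the cost distribution argued for above.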

January 07, 2007

links for 2007-01-07

January 05, 2007

That's exactly how I feel about Emacs

Simon Willison, WriteRoom: "A place to sit down and write."

Looks nice. There's a windows variant called Darkroom.

January 04, 2007

links for 2007-01-04

January 03, 2007

Intentional Programming

Charles Nutter: "JRuby will succeed, or I'll die trying."

links for 2007-01-03

Activity list for 2007

Learn Chinese. It was the Unicode book that made me do it. I ordered a copy of Yong Ho's Beginners Chinese. Back in the days of secondary school, I found learning natural languages the hardest; we'll see how it goes.

Probability. It seems I've forgotten large chunks of my secondary and undergrad math. More directly, I'm getting interested in using probability for software planning and testing purposes. And it's a fun subject.

Touch typing. I say this to myself every year. Never happens!

Atom Protocol. I'm starting to feel bad for Tim :) After having to step back for a while, the first thing is a sweep of draft12 along with an editorial bug list which Joe has started. I also want to look at how to use it to publish batch or tree-based content - both are common in my line of work, e.g. publishing mass content from a CMS into a site, but aren't directly supported by the protocol. I'd like to add support for APP for Django, but haven't figured whether to do it just for entries or embrace/extend the ORM to accept models as serialised entries.

Migrate from Movable Type. MT has been a wonderful app to use for the last four years. But without the original MTBlacklist I'm getting spammed just over once a minute on comments and trackback - and I'm not prolific at the moment. I'm thinking about Wordpress+Akismet. Or just rolling my own on Django with Akismet or Spambayes middleware (and taking a hard look at OpenID). Mostly what's holding me back is preserving the existing URLs.

Improve a Programming Language. They say you should learn a new language every year. Instead of learning a new language, I'd like to become fully fluent in one I already 'know'. Javascript ECMAScript probably, or maybe Ruby.

Microformats and JSON. I'm sold on these guys, and want to do some more exploring.

'Relational metadata'. I'm interested in this for all kinds of reasons. If you've worked with RDF or OODB systems like Zope/Plone, you'll appreciate how flexible they can be for managing content compared to an RDBMS. You'll also understand perhaps how difficult it can be to get anything that isn't an RDBMS past procurement and departmental IT. What with the rise of tagging and Atom Protocol, being able to store arbitrary extensions efficiently will become important (for example any hi-flex solutions I've seen for tagging break with relational idiom). I suspect putting metadata stores alongside or on top of existing relational systems as augments will be a better approach than expecting people to retool (or in some cases using the right tool in the first place). APIs like JCR and SDO implicitly support this decoupling in Java, as do Zope2 Archetypes via their ability to declare storage at the field level. As well as all that, I've also been looking at Maven repository metadata and how applications are coming to depend on it being accurate.
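A minimal sketch of what "alongside" might look like, using an in-memory SQLite database: the existing table is left as it is and arbitrary name/value pairs hang off it in a side table. The schema and the `add_entry` and `titles_with` helpers are hypothetical, and this is exactly the kind of layout that breaks with relational idiom, so treat it as an illustration of the trade-off rather than a recommendation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The existing relational table, untouched.
    CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- The augment: open-ended metadata, one row per name/value pair.
    CREATE TABLE entry_meta (
        entry_id INTEGER NOT NULL REFERENCES entry(id),
        name TEXT NOT NULL,
        value TEXT NOT NULL
    );
""")


def add_entry(conn, title, metadata):
    """Insert an entry plus any number of (name, value) metadata pairs."""
    cur = conn.execute("INSERT INTO entry (title) VALUES (?)", (title,))
    entry_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO entry_meta (entry_id, name, value) VALUES (?, ?, ?)",
        [(entry_id, name, value) for name, value in metadata],
    )
    return entry_id


def titles_with(conn, name, value):
    """Find entry titles carrying a given metadata pair."""
    rows = conn.execute(
        "SELECT e.title FROM entry e JOIN entry_meta m ON m.entry_id = e.id "
        "WHERE m.name = ? AND m.value = ? ORDER BY e.id",
        (name, value),
    )
    return [title for (title,) in rows]


add_entry(conn, "Mercurial, Part I", [("tag", "vcs"), ("tag", "python")])
add_entry(conn, "FIPA Abstract Architecture", [("tag", "soa")])
add_entry(conn, "More Django", [("tag", "python"), ("lang", "en")])
python_posts = titles_with(conn, "tag", "python")
```

New kinds of metadata need no schema change, only new rows; the price is that every lookup goes through a join and the database can no longer type-check the values.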

More Django. For building web sites, Django is the framework I keep coming back to. I think it will become dominant in the Python space*. It's just a joy to work with.

More Java. Java's about to get interesting again. I kind of had enough of it (and all the typing) 3 years ago, but there's lots of cool stuff going on now - the JDK going OSS, better support for concurrency, generics, Eclipse maturing, Jini relocating to ASF, JEE bloat on the wane. And there's real commercial interest in running dynamic languages on the JVM.

More tech writing. The posts that invariably impress me are technically focused, deep, and crisp. There should be more of that around here. The long essays pontificating on this and that, which seem to be more popular in terms of hits, are fun to write, but I have a feeling the tech posts and howtos are more valuable.

Development processes. One thing I'm interested in figuring out is to do with aligning modern development processes with traditional project commercials and organisations - there's all this great Agile and lightweight RUP material out there, but very little said on how to fit it inside either existing payment and engagement models (namely fixed price and T&M), or deployed services (such as operations and support).

Open Source. I have to get my act together on contributions. Really.

Staying focused. I'm geared towards systematic analysis and spotting patterns and relationships in things - things to me tend to be secondary to how they interact with other things. The downside of that is that I can go off on tangents and get distracted (Look! Shiny!). Staying focused is one of the main reasons to do this list. I guess part of this will involve allocating blocks of extracurricular time properly.

Other Lists

Danny Ayers

Niklas Gustavsson

* Some people will know I like Plone a lot, despite Zope2's innate complexity under the hood, but Plone is a CMS product, not a web framework - think of ASP.NET and Sharepoint to get a sense of the difference.

Technical reading list 2007

I want to keep this post updated over the course of the year. I'm curious to compare what I said I'd read versus what I actually did read. Update: added a list of book recommendations from people.

Programming and Languages


Java Concurrency in Practice. This was on my last currently reading list and I'm not finished yet. Clearly the standard reference for understanding Java's recent concurrency features.

JavaScript: The Definitive Guide. Now that AJAX is white hot, there's a fifth edition with plenty of new material. I've heard mixed reviews, but I'll go on faith - Flanagan is a good writer.

Concepts, Techniques, and Models of Computer Programming. Also on my most recent "currently reading" list. I'm about 2/3 through, and I suspect this is one of those texts that you read over and over, and don't really "finish". Brilliant and important book, absent of the religious baggage that surrounds programming paradigms.

Code Complete, 2nd ed. More than any other book starting out, this one set me right as far as good habits go. Before the pragmatic and effective books, there was Code Complete. The first edition has timeless advice, but is technically out of date; I'm curious to see how the second edition holds up.


The Toyota Product Development System. The lean movement has approaches that might be applied to software development. I'm interested in concurrent set based engineering - working on multiple design options in parallel and converging on solutions. It's a probabilistic technique, and like many things probabilistic, it's counter-intuitive.

Software Systems Architecture. As much out of curiosity as anything else, to see how the SEI/CMMI/ISO crowd think about articulating a design.

Framework Design Guidelines. Apparently this is derived from internal material at Microsoft. I'm curious to see how they do things in the big house.

Compilers: Principles, Techniques, and Tools. Second edition! After 20 years! Apparently it's a full update, plus two new chapters.

Project Management and Planning

Agile Estimating and Planning. It'll be interesting to compare this to McConnell's Software Estimation. The contents of Chapter 2 "Why Planning Fails", along with the emphasis on stating exit criteria and features hit enough of my hot buttons to take a further look.

Proactive Risk Management. I'd like to understand more about quantitative approaches to risk. Risk management only works if risks are managed; an unattended laundry list drawn up at the beginning of a project and left to wallow thereafter doesn't cut it.

Applied Software Project Management. Expecting good things from this one going by articles by Stellman and Greene on oreilly.com.


Waltzing with Bears, Tom DeMarco. From Jonas.

January 02, 2007

links for 2007-01-02

January 01, 2007

links for 2007-01-01