Speculation: for Data APIs in 2009 there will be two developments and one debate. All centre on an important technical principle in web design - uniform interfaces (an idea that goes back quite a bit in distributed systems).
First idea: putting links into API data. The REST community calls this 'Hypermedia as the engine of application state', or HATEOAS. Yes. Worst. Abbreviation. Ever. I tend to call it "links in content". Nonetheless the idea is simple - put links in your format data. Heavy Atom and HTML users do this already, almost subconsciously, but a lot (most) of proprietary data APIs fall down here and in doing so miss out on a number of things.

First, the network effects of being able to pass along URLs. The very essence of a "web" is linking, almost to the point where, operationally, a "good" webapp is one that uses plenty of links.

Second, decoupling of clients from your servers - if you describe in your format both where links can be found and what their "type" is via a metadata qualifier, you are free to refactor on the server side, relocate the server, introduce a CDN, whatever. For example, Atom qualifies links using the "rel" and "type" attributes in a way that will work for every web site on the planet. Clients that extract links out of the data and construct an absolute minimum of URLs are loosely coupled to server structures - URL parsing and generation being an important coupling point in API design.

Third, simplification of client code - just pull out the links, render them in the UI or call them using HTTP methods. You don't even need to design the link elements - steal Atom's link element or lace your current XML with "src" attributes that contain URLs.
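To make the decoupling point concrete, here is a minimal sketch of a links-in-content consumer: it selects Atom-style link elements by their "rel" metadata and follows whatever href the server advertised, never constructing a URL itself. The sample feed, hosts and rel values are invented for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <link rel="edit" type="application/atom+xml"
          href="https://example.org/entries/42"/>
    <link rel="enclosure" type="image/jpeg"
          href="https://cdn.example.org/photos/42.jpg"/>
  </entry>
</feed>"""

def links(doc, rel):
    # Select links by their declared rel; the client never builds or parses
    # URL paths, so the server is free to move resources around, add a CDN, etc.
    root = ET.fromstring(doc)
    return [l.get("href") for l in root.iter(ATOM + "link") if l.get("rel") == rel]

print(links(feed, "enclosure"))  # → ['https://cdn.example.org/photos/42.jpg']
```

The client's only coupling points are the Atom link element and the rel vocabulary - everything else is the server's business.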
For more detail on how links in content can work I recommend reading Mark Baker's "Hypermedia in RESTful applications" and Subbu Allamaraju's "Describing RESTful Applications", both on Infoq.
Second idea: "standardisation" of feed metadata. Recently Dare Obasanjo and Dave Winer have blogged about inconsistencies in MediaRSS that complicate data consumption. Dave Winer, on adding a photo site to FriendFeed:
"I always assumed you should just add the feed under "Blog" but then your readers will start asking why your pictures don't do all the neat things that happen automatically with Flickr, Picasa, SmugMug or Zooomr sites. I have such a site, and I don't want them to do anything special for it, I just want to tell FF that it's a photo site and have all the cool special goodies they have for Flickr kick in automatically."
"We have a similar problem when importing arbitrary RSS/Atom feeds onto a user's profile in Windows Live. For now, we treat each imported RSS feed as a blog entry and assume it has a title and a body that can be used as a summary. This breaks down if you are someone like Kevin Radcliffe who would like to import his Picasa Web albums. At this point we run smack-dab into the fact that there aren't actually consistent standards around how to represent photo albums from photo sharing sites in Atom/RSS feeds."
I went a few rounds last year with MediaRSS and have to agree I'd much rather have something as well specced as Atom is for syndication, or RFC5005 is for pagination. And it's not limited to media. The same goes for geo data (the main criterion being: does it support WGS84?), contacts, activity/events, representing arbitrary site metadata, even Exif. Making things up or having to choose between competing formats is a real pain. There are two problems: where to place the data, because all the popular formats have sufficiently arbitrary structure that something like MediaRSS can appear in multiple places (the difference between Picasa and Zooomr, as Dare outlined in his post), and how to notate it (the difference between MediaRSS and SmugMug, again as Dare outlined).
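The "where to place the data" problem shows up directly in consumer code. Here is a hypothetical sketch of what a feed consumer ends up doing: probing several placements and vocabularies just to find a photo URL. The namespaces (Media RSS, Atom) are real; the probing order and sample item are invented.

```python
import xml.etree.ElementTree as ET

MEDIA = "{http://search.yahoo.com/mrss/}"
ATOM = "{http://www.w3.org/2005/Atom}"

def photo_url(item):
    # Placement 1: media:content directly on the item.
    m = item.find(MEDIA + "content")
    if m is not None and m.get("url"):
        return m.get("url")
    # Placement 2: media:content nested inside media:group.
    m = item.find(MEDIA + "group/" + MEDIA + "content")
    if m is not None and m.get("url"):
        return m.get("url")
    # Placement 3: fall back to an Atom enclosure link.
    for l in item.iter(ATOM + "link"):
        if l.get("rel") == "enclosure":
            return l.get("href")
    return None

item = ET.fromstring(
    '<item xmlns:media="http://search.yahoo.com/mrss/">'
    '<media:group><media:content url="http://example.org/p.jpg"/></media:group>'
    '</item>')
print(photo_url(item))  # → http://example.org/p.jpg
```

Every new site with its own placement convention adds another branch - which is exactly the consumption cost Dare and Dave are complaining about.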
If you are a semwebber who has been around long enough to remember the syndication wars, you will be having a good old chortle, as this problem is arguably solved better by RDF/XML than by any syndication or markup format. It's an interesting turnaround, since one of the arguments against RDF adoption for syndication back then was that clients and servers had common internal object models for syndication data, and thus a formal model on the wire didn't matter that much - the parse/lex layers could switch. Extension metadata, it seems, is a bit different - variability has a cost. Whether you agree or not re RDF, if this impacts you, having a look at how RDF or even RSS 1.0 modules are described, and how they are supposed to be parsed into a data structure, is no harm at all.
RDF is worth learning for a different reason — the profound enlightenment experience you will have when you finally get it. That experience will make you a better format and data API designer for the rest of your days, even if you never actually use RDF itself much. (You can get some beginning experience with RDF fairly easily by writing and modifying simple files like FOAF and DOAP for social networks and software projects, or RDFa extensions for XHTML.)
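As a toy illustration of why RDF sidesteps the placement problem: everything is a (subject, predicate, object) triple, so extension metadata always has exactly one unambiguous "place" to live. This is plain Python standing in for the RDF data model, not an RDF library; the names and URIs are made up except the FOAF namespace, which is real.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# A tiny FOAF-ish graph as a set of triples.
triples = {
    ("#me", FOAF + "name", "Alice"),
    ("#me", FOAF + "depiction", "http://example.org/alice.jpg"),
    ("#me", FOAF + "knows", "#bob"),
    ("#bob", FOAF + "name", "Bob"),
}

def objects(subject, predicate):
    # Querying is uniform regardless of which vocabulary the predicate comes
    # from - there is no "where in the tree does this element go?" question.
    return sorted(o for s, p, o in triples if s == subject and p == predicate)

print(objects("#me", FOAF + "name"))  # → ['Alice']
```

Mixing in a new vocabulary is just more triples with predicates from another namespace; no schema negotiation about element nesting is needed.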
The debate: should there be that many custom formats? Via Kevin Marks and Aristotle Pagaltzis, I came across the "precious snowflake" analogy for APIs which to me describes the situation perfectly both across hundreds of websites and within content domains such as geo/contacts/media. Here's Aristotle:
" There are a lot of good existing choices once you get over the idea that your domain is a unique and precious snowflake."
There are probably hundreds of publicly available APIs today, all different, each its own "SiteML", and you have to be able to mash them all. The big but smart companies, such as GOOG and MSFT, that have application suites and not just individual web silos, have adopted common syntax, posting and extension models that allow for consistency and evolvability over time - individual API offerings might seem suboptimal and indirect, even obtuse, but the overall product portfolio makes a ton of sense - as well as lowering consumer costs it allows them to ship client APIs with less hassle. This is basic platform and product architecture - reduced variability at one layer allows for increased offerings at lower cost at higher layers.

Standalone web properties just don't do this today; each individual API is like a precious snowflake, but being in the snowball business is expensive, and so is keeping that snowflake preserved (when you designed that API, did you think about encoding, escaping, empty v not-present, namespaces, timestamps, bidi, versioning, extensions, content negotiation, cacheability, required v optional, new formats, input sanitation? Didn't think so ;). This creates a new market for web integration providers such as FriendFeed and Gnip ("making data portability suck less") or silo publishing providers such as Mashery. We call them aggregators in the web consumer space, but when you get to scores of providers it effectively requires the "EAIfication" of mashups, or if you prefer, the introduction of Value Added Networks (VANs) for consumer data. Others like eBay and sf.com seem to have become subject to X.Y.Z versioning issues, which are a maintenance nightmare* (I find these tend to be associated with SOAP-style processing models - YMMV). So how API families like DiSo and OpenSocial, or specific formats like Portable Contacts, Activity Streams and Atom Media Extensions, develop will be important this year.
That or we start taking microformats and RSS/Atom/JSON extensibility a lot more seriously than we do today, or the number of APIs will soon be in their thousands.
* X.Y.Z for software binary compatibility, sure, but X.Y.Z in data formats arguably misses the entire concept of web data APIs - when clients are out of your administrative control, lockstepped upgrades are, practically speaking, impossible.
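The alternative to lockstepped X.Y.Z upgrades is must-ignore extensibility: consumers read the fields they understand and skip anything unknown, so old clients survive new server output. A minimal sketch, with invented field names and a JSON payload standing in for any wire format:

```python
import json

def read_entry(raw):
    # Take only the fields this client version understands; silently
    # ignore any extensions or revisions the server has added since.
    data = json.loads(raw)
    return {k: data[k] for k in ("id", "title") if k in data}

v1 = '{"id": 1, "title": "hello"}'
v2 = '{"id": 1, "title": "hello", "geo": {"lat": 53.3}, "schema_rev": "2.0"}'

# The newer payload parses identically for the old client - no coordinated
# upgrade required.
assert read_entry(v1) == read_entry(v2)
print(read_entry(v2))  # → {'id': 1, 'title': 'hello'}
```

This is the same must-ignore rule Atom and HTML rely on, expressed in client code rather than in a version number.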