The API Join And Avoiding It
A year or so ago, whilst writing groupr (RIP), I came up with what I thought was a useful name for something I found myself doing a lot: the API join. I'm fairly sure this is common to a lot of Web (2.0) APIs, but it's especially common with Flickr. For example, take groupr. First, it would do a call to get the groups you're a member of. For each of these, it then fetched the photos in the group. Obviously, this has a problem: as the number of groups you're a member of goes up, so does the number of calls to the API - and each call takes about a tenth of a second. The only way to mitigate this, and the solution groupr used, was to page the groups - and even that leaves you making as many calls as you have groups on the page.
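To make the shape of the problem concrete, here is a minimal sketch of an API join in Python. It is not groupr's actual code: the `call` interface, the method names `groups_for_user` and `photos_for_group`, and the `CountingFakeApi` class are all hypothetical stand-ins, so the example runs without touching the network and simply counts how many calls are made.

```python
from typing import Callable, Dict, List

# Hypothetical client interface: call(method, **params) -> list of dicts.
ApiCall = Callable[..., List[dict]]


def photos_for_groups(call: ApiCall, user_id: str,
                      page: int = 1, per_page: int = 10) -> Dict[str, List[dict]]:
    """One call for the group list, then one call per group on the page."""
    groups = call("groups_for_user", user_id=user_id)
    start = (page - 1) * per_page
    result: Dict[str, List[dict]] = {}
    for group in groups[start:start + per_page]:
        # This is the join: a separate round trip for every group.
        result[group["id"]] = call("photos_for_group", group_id=group["id"])
    return result


class CountingFakeApi:
    """In-memory fake (an assumption for illustration) that counts calls."""

    def __init__(self, n_groups: int) -> None:
        self.n_groups = n_groups
        self.calls = 0

    def __call__(self, method: str, **params) -> List[dict]:
        self.calls += 1
        if method == "groups_for_user":
            return [{"id": f"g{i}"} for i in range(self.n_groups)]
        if method == "photos_for_group":
            return [{"id": f"{params['group_id']}-p1"}]
        raise ValueError(f"unknown method: {method}")


api = CountingFakeApi(n_groups=25)
first_page = photos_for_groups(api, "me", page=1, per_page=10)
print(len(first_page), "groups fetched in", api.calls, "calls")
```

Paging caps the cost at one call per group on the page plus one for the group list, but it never gets below that.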
The problem reared its head again when I was looking at doing a ffffound-inspired Flickr favourites app. I wanted to display the usual Flickr size, rather than square thumbnails (as Flickr's own favourites page does). Unfortunately, the standard call to get favourites didn't list the size of the photos, and I really didn't want to spend two seconds fetching them all. Other people have raised similar questions on the Flickr API group discussions; for example, here's one about getInfo and getExif, and here's another about getting photo sizes.
Imagine my surprise, then, when I looked at the documentation for flickr.photos.search and noticed a new argument to the "extras" parameter: o_dims. It turns out this returns the original height and width, and is also available in the favorites methods, so now it's possible to avoid doing those calls, and to embed derived height and width for web-scale images from a single call, even for the 36 or so images on Flickr's version of the page.
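Deriving the display size from the original dimensions is simple arithmetic. The sketch below rests on a few assumptions that are mine rather than anything Flickr documents here: that the original size arrives as `o_width` and `o_height` values on each photo, that the target size is 500 pixels on its longest side, that smaller originals are not scaled up, and that ordinary rounding is close enough to whatever Flickr does when it resizes.

```python
from typing import Tuple


def scaled_dims(o_width: int, o_height: int, longest: int = 500) -> Tuple[int, int]:
    """Scale (o_width, o_height) so the longest side is at most `longest`."""
    if o_width <= 0 or o_height <= 0:
        raise ValueError("dimensions must be positive")
    biggest = max(o_width, o_height)
    if biggest <= longest:
        # Assumption: originals smaller than the target are left alone.
        return o_width, o_height
    # Multiply before dividing to keep the aspect ratio as exact as possible.
    width = max(1, round(o_width * longest / biggest))
    height = max(1, round(o_height * longest / biggest))
    return width, height


def dims_from_photo(photo: dict, longest: int = 500) -> Tuple[int, int]:
    """Read the assumed o_width/o_height fields (often strings) from a photo record."""
    return scaled_dims(int(photo["o_width"]), int(photo["o_height"]), longest)


print(dims_from_photo({"id": "1", "o_width": "3000", "o_height": "2000"}))
```

With those values in hand, every image on the page can be given its width and height up front, and the per-photo size calls disappear.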
Of course, this is simply because the API has now moved the join deeper; instead of being at the API level it's being done inside Flickr (presumably at the database level). In fact, I suspect that last weekend's database downtime may not be unrelated (perhaps it was needed for the launch of Apple TV's Flickr slideshows?). It also doesn't help with the other methods, such as getExif (there's a reason I've moved some of my EXIF data to machine tags, which are fetchable with another extras parameter to many calls).
Facebook, interestingly, allows a SQL-like query language as part of their API access, but I wonder how they deal with queries that could bring the database to its knees. I do notice the line
"In order to make your query indexable, the WHERE should contain an = or IN clause for one of the columns marked with a *."
Is that an enforced criterion, or merely a recommendation? And do they cut off long-running queries and return no results, to keep up database performance? It's the sort of thing I'd love to see Flickr add to their API, but I can imagine the problems are far from trivial, and in the meantime, I'm very happy to see one API join bite the dust.