2 posts tagged “facebook”
I've been thinking a lot about aggregation (and lifestreaming) recently. Actually, that's not quite right; I've been thinking about it for a while, but the launch of FriendFeed and another look at Jaiku¹ brought it to the forefront again. (This post also overlaps somewhat with my Feeding the Daemons post over at my other blog, but I hope it's set out a bit more clearly.)
So, what do these services do? I'll try and make a couple of definitions that might be useful, even if only within the context of this post. Firstly, there's horizontal aggregation. This is at the level of a single service, but it spans users. Most of the entries in my Safari bookmark bar - pages I can get to with a single key combo - are like this: my friends' twitters, photos from my contacts, my del.icio.us network, and friends pages here and on LiveJournal. This means that I'm usually seeing the same sort of object (very short text, photos, links, and posts respectively). Lots of sites offer this (Tumblr's dashboard is another example).
Secondly, there's vertical aggregation. This spans services, but only takes in a single individual. My home page is an example of this, although it's not exhaustive. The best example I've seen is Les Orchard's accumulator, because it lets you control how much or how little you see, and it makes it very clear what's being collected. There are plenty of other examples; one of the first I saw was Jeremy Keith's stream.
Vertical aggregation is actually quite popular, and you can turn a bunch of services over to it; for a while that's what I used Tumblr for. In the end, though, I decided I wanted a bit more control, and did it myself. More often, people don't have the infrastructure to collect everything together nicely, which is where services like LoudTwitter (which collects your tweets for the day to your blog) and Twitter Updater (which posts to Twitter announcing blog posts) come in. Unfortunately, they're a bit hit and miss; it's possible to configure circular posting, and I've seen people use tools to post their twitters to more than one of their blogs, which seems a bit much².
So, what's new with FriendFeed and Jaiku? Well, they're a combination of horizontal and vertical aggregators: they pull in updates from lots of people and lots of sites. If you're dealing with purely horizontal or purely vertical aggregation, you'll never see the same item twice. The obvious problem is that, once you start mixing them, duplication can become a real issue: the same tweet can arrive once from Twitter and again from a blog that reposts it.
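To make the duplication problem concrete, here is a minimal sketch of merging several feeds and keeping only the first copy of each item. It assumes every item carries a stable permalink, which real services won't always provide; the Item class and the merge function are hypothetical names for illustration, not anything FriendFeed or Jaiku expose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    author: str
    service: str
    permalink: str  # assumed to be a stable identifier for the item
    title: str


def merge(feeds):
    """Merge feeds in order, keeping the first copy of each permalink."""
    seen = set()
    merged = []
    for feed in feeds:
        for item in feed:
            if item.permalink in seen:
                continue  # already pulled in from another feed
            seen.add(item.permalink)
            merged.append(item)
    return merged


# The same tweet arrives twice: once via Twitter, once via a friend's stream.
tweet = Item("alice", "twitter", "http://example.com/t/1", "Off to lunch")
photo = Item("alice", "flickr", "http://example.com/p/9", "Lunch")
post = Item("bob", "blog", "http://example.com/b/3", "On aggregation")
combined = merge([[tweet, post], [tweet, photo]])
```

This only catches exact repeats. A daily digest post that wraps up the same tweets has its own permalink, so it slips straight through, which is where filtering comes in.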
What's the solution? Well, I'm not sure, but it seems that filtering is going to be pretty important; not just the crude level of cutting out feeds, but individual items. In fact, full-blown email-type filtering ("if subject contains 'tweets for today' then hide") might well become important.
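An email-style rule of that kind might look something like the following. This is only a sketch under my own assumptions: items are plain dicts with a "subject" field, matching is a case-insensitive substring test, and make_rule and visible are made-up names.

```python
def make_rule(field, needle):
    """Build a rule that matches when `field` contains `needle` (case-insensitive)."""
    needle = needle.lower()

    def rule(item):
        return needle in item.get(field, "").lower()

    return rule


def visible(items, hide_rules):
    """Return the items that no hide rule matches, in their original order."""
    return [item for item in items if not any(rule(item) for rule in hide_rules)]


items = [
    {"author": "alice", "subject": "Tweets for today"},
    {"author": "alice", "subject": "Holiday photos"},
    {"author": "bob", "subject": "My tweets for today, 12 March"},
]
hide = [make_rule("subject", "tweets for today")]
shown = visible(items, hide)
```

Because the rules work on individual items rather than whole feeds, alice's photos survive even though her digest posts are hidden.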
In fact, I wonder if the failure to tackle this is part of the problem that the alpha geek crowd seems to have with Facebook. They're the people who will import their web activity into the site, and so will their friends, meaning they're bombarded with not just the usual zombie/film/quiz noise, but also every twitter, photo and blog post. It's no surprise that many run from the information overload. Will real people suffer from this? Perhaps in time, but for the moment I suspect most Facebook users are getting about the right balance between no activity (so why use it?) and overload.
In conclusion, then: if you run a web site with a social component, please offer the tools to allow users to see their friends' activity easily on the site, and also to include their own data in their own aggregators. (To be honest, almost everyone does this already.) If you're running a service that allows users to pull in data and share it with friends, filtering is a must.
¹ Jaiku's odd, as it combines Twitter's status updates with aggregation, and that makes it much harder to explain to people. If it had concentrated on one or the other it might have found Twitter's market first.
² Movable Type 4's activity streams show that Six Apart know it's a problem that needs addressing. Oddly, Vox has enough information to build one, but it's not used.
A year or so ago, whilst writing groupr (RIP), I came up with what I thought was a useful name for something I found myself doing a lot: the API join. I'm fairly sure this is common to a lot of Web (2.0) APIs, but it's especially common with Flickr. For example, take groupr. First, it would do a call to get the groups you're a member of. For each of these, it then fetched the photos in the group. Obviously, this has a problem: as the number of groups you're a member of goes up, so does the number of calls to the API - and each call takes about a tenth of a second. The only way to mitigate this, and the solution groupr used, was to page the groups - and even that leaves you making as many calls as you have groups on the page.
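The shape of the problem is easy to show with a stand-in client that counts its calls. Everything here is hypothetical: FakeFlickr and its two methods are placeholders I've invented for the sketch, not the real Flickr API or groupr's actual code.

```python
class FakeFlickr:
    """Stand-in API client that counts how many calls are made."""

    def __init__(self, groups):
        self._groups = groups  # {group_id: [photo_id, ...]}
        self.calls = 0

    def get_groups(self, user_id):
        # The fake ignores user_id and returns every group it knows about.
        self.calls += 1
        return list(self._groups)

    def get_group_photos(self, group_id):
        self.calls += 1
        return list(self._groups[group_id])


def api_join(api, user_id, page=1, per_page=None):
    """One call for the group list, then one more call per group shown."""
    groups = api.get_groups(user_id)
    if per_page is not None:
        start = (page - 1) * per_page
        groups = groups[start:start + per_page]
    return {group: api.get_group_photos(group) for group in groups}


data = {f"group{i}": [f"photo{i}a", f"photo{i}b"] for i in range(5)}

unpaged = FakeFlickr(data)
everything = api_join(unpaged, "me")         # 1 + 5 calls

paged = FakeFlickr(data)
first_page = api_join(paged, "me", per_page=2)  # 1 + 2 calls
```

Paging caps the cost per page view, but the number of calls still grows with the page size rather than staying constant.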
The problem reared its head again when I was looking at doing a ffffound-inspired Flickr favourites app. I wanted to display the usual Flickr size, rather than square thumbnails (as Flickr's own favourites page does). Unfortunately, the standard call to get favourites didn't list the size of the photos, and I really didn't want to spend two seconds fetching them all. Other people have raised similar questions on the Flickr API group discussions; for example, here's one about getInfo and getExif, and here's another about getting photo sizes.
Imagine my surprise, then, when I looked at the documentation for flickr.photos.search and noticed a new argument to the "extras" parameter: o_dims. It turns out this returns the original height and width, and is also available in the favorites methods, so now it's possible to avoid doing those calls, and to embed derived height and width for web-scale images from a single call, even for the 36 or so images on Flickr's version of the page.
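Deriving the display size from the original dimensions is just a proportional scale. The sketch below assumes the "usual" size means 500 pixels on the longest side and that images smaller than that are left alone; the function name is mine, and Flickr's own rounding may differ by a pixel.

```python
def scaled_dims(o_width, o_height, longest_side=500):
    """Scale original dimensions so the longest side is at most `longest_side`."""
    longest = max(o_width, o_height)
    if longest <= longest_side:
        return o_width, o_height  # never scale up a small original
    width = max(1, round(o_width * longest_side / longest))
    height = max(1, round(o_height * longest_side / longest))
    return width, height


landscape = scaled_dims(3000, 2000)
portrait = scaled_dims(2000, 3000)
small = scaled_dims(400, 300)
```

With the width and height known up front, the img tags can carry explicit dimensions and the page lays out before any image has loaded.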
Of course, this is simply because the API has now moved the join deeper; instead of being at the API level it's being done inside Flickr (presumably at the database level). In fact, I suspect that last weekend's database downtime may not be unrelated (perhaps it was needed for the launch of Apple TV's Flickr slideshows?). It also doesn't help with the other methods, such as getExif (there's a reason I've moved some of my EXIF data to machine tags, which are fetchable with another extras parameter to many calls).
Facebook, interestingly, allows a SQL-like query language as part of their API access, but I wonder how they deal with queries that could bring the database to its knees. I do notice the line
In order to make your query indexable, the WHERE should contain an = or IN clause for one of the columns marked with a *.
Is that an enforced criterion, or is it merely a recommendation, and do they cut off long-running queries without returning results to keep up database performance? It's the sort of thing I'd love to see Flickr add to their API, but I can imagine the problems are far from trivial, and in the meantime, I'm very happy to see one API join bite the dust.