4 posts tagged “perl”
Last week Kellan from Flickr published my interview on code.flickr. I'm still somewhat amazed that they chose me to ask, but then I'm also pleased at how much people are liking snaptrip, and I'm happy to see my words in print, as it were.
I actually compiled my answers a couple of weeks before it was posted, hence the reference to groupr as a "lost project". Now, of course, it's back, but I've already posted a couple of times about that. What I would like to do is - finally, and belatedly - document (and update the released version of) my EXIF machine tagger.
Why bother with such a thing? Flickr will extract EXIF metadata, but it won't allow you to do any aggregate queries on it. (Well, that's not quite true; at dConstruct 2007 Tom Coates leaked some URLs which I picked over, but they don't cover all the useful things I'd like. Plus, it's not documented.) By extracting all the data from my photos into machine tags (and a local SQLite database), it becomes possible to point people at all the photos taken at the wide end of my widest lens, or those taken with a particular make of camera (and to do more complex queries locally).
With that out of the way, how do you go about such a thing? Well, as usual, it's actually a fairly simple joining operation. Get a list of photos, and for each of them, get the EXIF data (using flickr.photos.getExif), then store the data locally, and add tags back to Flickr. There's not much munging involved - I convert spaces in the EXIF field names to underscores, and some things get put in the "file:" or "camera:" namespace, rather than "exif:" - so it's all pretty straightforward. (I do preserve spaces in the EXIF values, though, by quoting my arguments to the addTags method.)
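For illustration, the munging described above might look something like this. It's a rough Python sketch rather than the Perl of the actual script, and several details are assumptions: the `NAMESPACE_OVERRIDES` table is hypothetical (the post doesn't say which fields move out of "exif:"), lowercasing the field name is a guess, and the function names are made up for the example.

```python
# Hypothetical routing table: the post does not say which fields leave "exif:".
NAMESPACE_OVERRIDES = {
    "Make": "camera",
    "Model": "camera",
    "File Size": "file",
}


def machine_tag(label, value):
    """Build one quoted machine tag such as "exif:focal_length=10 mm"."""
    namespace = NAMESPACE_OVERRIDES.get(label, "exif")
    # Spaces in the field name become underscores; lowercasing is an assumption.
    predicate = label.strip().lower().replace(" ", "_")
    # Quoting the whole tag keeps spaces in the value intact.
    return '"{}:{}={}"'.format(namespace, predicate, value.strip())


def tags_argument(fields):
    """Join tags into a single space-separated string of quoted tags."""
    return " ".join(machine_tag(label, value) for label, value in fields)


if __name__ == "__main__":
    sample = [("Focal Length", "10 mm"), ("Make", "Canon"), ("Exposure Bias", "-1/3 EV")]
    print(tags_argument(sample))
```

The string from `tags_argument` is the sort of thing you'd pass as the tags parameter of addTags. A value that itself contains a double quote would need extra handling, which this sketch ignores.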
I also add a meta:exif field with either "none" or the epoch seconds of the time of tagging, so that it's easy to exclude previously-tagged images from being examined again. Another minor niggle is that, to add tags, a script has to be authorised. I copied the code chunk from the flickr_upload script in a Perl module, and it seems to work for me.
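The marker logic is simple enough to sketch. Again this is Python for illustration, not the script's own Perl; the function names and the idea of checking a plain list of tag strings are assumptions.

```python
import time

MARKER_PREFIX = "meta:exif="


def marker_tag(exif_fields, now=None):
    """Return meta:exif=none, or meta:exif=<epoch seconds> when EXIF was found."""
    if not exif_fields:
        return MARKER_PREFIX + "none"
    stamp = int(time.time() if now is None else now)
    return MARKER_PREFIX + str(stamp)


def already_tagged(tags):
    """True when a photo's tag list already carries a meta:exif marker."""
    return any(tag.lower().startswith(MARKER_PREFIX) for tag in tags)
```

A run would then skip any photo for which `already_tagged` is true, which is what keeps later runs cheap.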
However, the fact that users need to get an API key, secret, and then a token, is naturally going to limit the audience for such a script. A few other users have metadata in the "exif:" namespace, but it's not exactly common. It's hard to turn the script into a web app, too, since it needs about a second per image to run, and the first run has to examine your entire library, which these days is typically thousands of images. I may still do it, but I haven't bothered for months, so I wouldn't count on it.
Another drawback is that machine tags are normalised at Flickr. This means that when I query on exposure bias, both -1/3EV and +1/3EV show as just "exif:exposure_bias=13ev". I've been thinking about ways around this - by querying raw tags - but it's not straightforward. (Ways around this normalising, and ways of getting all predicates for a namespace, and values for a namespace (at least within a given user's photos), would have made my list for "things you'd like to see in Flickr" if I'd felt able to get away with being so demanding.)
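To see why the two values collide, here's an approximation of the normalising in Python. The rule (lowercase, then drop everything that isn't a letter or digit) is inferred from the "13ev" example above, not taken from any Flickr documentation, and `match_raw` with its list of (photo id, raw value) pairs is purely hypothetical.

```python
def normalise(value):
    """Approximate the tag cleaning: lowercase, keep only letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def match_raw(photos, wanted_raw):
    """Pick photo ids whose raw (un-normalised) value matches exactly.

    `photos` is a list of (photo_id, raw_value) pairs, a made-up shape
    standing in for whatever a raw-tag lookup would return.
    """
    return [photo_id for photo_id, raw in photos if raw == wanted_raw]


if __name__ == "__main__":
    print(normalise("-1/3EV"), normalise("+1/3EV"))
```

Both calls to `normalise` give the same string, so the sign is only recoverable from the raw form, which is what filtering with something like `match_raw` would rely on.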
One final observation is that the script's in Perl, and uses XML (which is, apparently, sometimes compressed at Flickr's end; at least, I had to add Compress::Zlib at one point for some reason). If I was to redo it, either in Python or Ruby, the data would all be fetched as JSON, and it'd probably get a few more users. Ah well. Installing the prereqs shouldn't be too hard.
That said, of course the script, as is, proved useful. I run it manually after an upload, while Tom, who is (as ever) a bit more sensible, has his fork running as a cron job. Either way, please download it, play, and feel free to let me know what you think.
Today's stupid software idea comes courtesy of the most recent episode of Battlestar Galactica, which featured (as some previous stories have) the Hybrid, the organic controller of a Cylon base star (aka the big pointy bad guy space ships). This week, though, saw probably more Hybrid than any other.
Hybrids, you see, continually babble, a stream of consciousness mixing what sounds like system diagnostics, physics and poetry. After the episode ended, I thought "wait, system diagnostics? Well, if I open up Console, I have those. What if there was an additional process - call it hybridd - which emitted poetry to go along with the more prosaic debugging and whatnot that my computer spits out?"
I have an idea how to do it, too. Algorithm::MarkovChain is a venerable Perl module that puts out almost, but not quite, meaningful sentences, based on an input corpus. Tie that in to the syslog function, a bit of Launch Services, and there you are. (I'm sure you could do a bogstandard Unix version too.)
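For anyone who hasn't met the technique, a first-order Markov chain over words fits in a few lines. This is a Python sketch of the general idea, not the API of Algorithm::MarkovChain, and the function names are invented for the example; hooking the output up to syslog is left out.

```python
import random
from collections import defaultdict


def build_chain(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    words = corpus.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def babble(chain, start, length, rng):
    """Walk the chain from `start`, emitting up to `length` words."""
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word was never followed by another
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    corpus = "the cycle of time the centre of the storm the end of line"
    print(babble(build_chain(corpus), "the", 8, random.Random()))
```

Because followers are stored with repeats, common pairings are picked more often, which is where the almost-but-not-quite-meaningful quality comes from.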
A further step would be to replace the Console UI with one that boils down the actual computer stuff and tries to fit it in with the hybrid's poetry, but that idea's a lot harder to do well, I'm sure.
Anyway. hybridd - an idea whose time has come. And now, thanks to Tom Insam, here's a Perl version. Requires the aforementioned module, along with File::Tail and Unix::Syslog, both available at your nearest CPAN mirror.
The idea of Hackday London 2007 was, unsurprisingly, to hack. Beforehand I'd had little idea of what to do, but candace managed to come up with a few ideas. Notably, one evening last week she noticed some photos on SpaceWeather.com of the International Space Station, as taken from the Netherlands, and thought that perhaps we'd have a chance of seeing it. We checked Heavens-Above, a venerable satellite tracking/prediction site, and we caught a flyby which included a moment of brightness as the newly-deployed solar panels caught the sun.
Wouldn't it be great, she mused, if it was possible to get messages to your phone when such things were going to happen? As well as ISS flybys, there are also Iridium flares, where the redundant communications satellites reflect sunlight down to the ground, and it'd be nice to be told about those, too. It looked like we had an idea.
Implementing the idea wasn't terribly tricky, either. There are two parts to it. Firstly, there's a scraper for Heavens-Above. We set up a special London account, and wrote a script that authenticates against the site, and downloads and parses the data tables for the ISS and Iridium flares. This goes into a plain text file, with the date and time as one field and the text message to send as the next. Since the data tables list events for seven days in advance, this script doesn't have to run frequently - at the moment it's doing so once a day.
Secondly, there's the sender script, which runs every five minutes. It reads in the data file, parses the date (slightly hackily - I'll need to fix that eventually), and, if the event is within twenty minutes, sends it to Twitter (which we use as it's a simple way of sending SMSes to multiple users). Also - and this is where the required use of a BBC or Yahoo! API comes in - the script checks the Yahoo feed's "current weather conditions" value, and if it's likely you'll be able to see the event, continues onwards to send it. Otherwise, it doesn't bother (but I do get an email from cron telling me what the weather actually was).
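The sender's decision can be sketched roughly as follows. This is Python for illustration, not the hack's real code, and it leans on assumptions: a tab between the two fields, a `YYYY-MM-DD HH:MM` timestamp, and a hypothetical `CLEAR_CONDITIONS` set standing in for whatever the weather feed actually reports.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=20)
# Hypothetical condition strings; the real feed's vocabulary may differ.
CLEAR_CONDITIONS = {"clear", "fair", "partly cloudy"}


def parse_events(text):
    """Yield (when, message) pairs from lines of 'YYYY-MM-DD HH:MM<TAB>message'."""
    for line in text.splitlines():
        if not line.strip():
            continue
        stamp, message = line.split("\t", 1)
        yield datetime.strptime(stamp, "%Y-%m-%d %H:%M"), message


def due_messages(text, now, conditions):
    """Return messages for events starting within WINDOW of `now`, if skies allow."""
    if conditions.lower() not in CLEAR_CONDITIONS:
        return []
    return [message for when, message in parse_events(text)
            if timedelta(0) <= when - now <= WINDOW]
```

Run every five minutes with a twenty-minute window, this would pick up the same event more than once, so a real sender also needs some way of remembering what has already gone out; the sketch leaves that part out.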
I was able to get all the coding done and put it on my colo before one of the flares that evening, but sadly the weather wasn't quite clear enough and we didn't have visibility in the right direction. Still, we had text messages and a Twitter page that we could point to as proof of a working hack, so we went home. (There's an aside here about the difference between the SF culture and London's more lackadaisical one, perhaps, but it'll have to wait for another day.)
I spent another 30 minutes on Sunday morning tidying up the verbosity of the script (it now only prints, and hence sends email, when it has output to send), and then came the slightly nervewracking presentation, which thankfully seemed to go down well, despite us having nothing really in the way of UI to demo. (One of the best things about Hackday - you don't have to write up yourself...)
That evening I added a feed for SF (and took the chance to comment alongside all the bits of code that needed changing). If you're interested in getting messages for passes Above London or Above SF then get a Twitter account and follow the appropriate user. (Longer-term Twitter users might want to adjust their phone notification settings if they want to get SMSes late at night.)
What's next, then? Well, I've since looked at a Perl module (Astro-satpass) that would have let us cut out Heavens-Above, and possibly opened the door to more customised notifications. In particular, we've made some arbitrary decisions (we don't send flares that don't climb above 20°, for example) and it's really hard to add a new location. It'd be nice to remove those limitations, but doing so introduces a new problem; namely, Twitter is a very easy platform for notifications, although I'm concerned about its reliability and timeliness. Customised messages mean either abandoning it or (ab)using the direct messaging syntax.
It was notable that Twitter was used in a fair number of the hacks (from the live blog post written during the presentations, at least 10%). I think it'd be a perfect fit for Yahoo, alongside Flickr and del.icio.us, as a developer-friendly site that, perhaps, needs a bit of resourcing behind it to make it truly reliable (and, perhaps, more international; the UK number isn't always cheap). How about it?
Anyway, the two applications/accounts/bots should now run without human intervention (at least until that date hack I mentioned rears its head around Christmas), and hopefully I'll remain inspired to tackle the more complex project of personalised feeds and notifications later in the summer. For now, enjoy spotting satellites.
For the last couple of months I've been sitting on a CGI script that would aggregate all my content for my personal site. Part of the reason is caching: the script doesn't have any, and it turns out that XML::Feed isn't Storable friendly, which knocks out my first approach.
So when I had a think about what Yahoo's Pipes promises, I thought it might be worth a look. I could get the service to do all the heavy lifting, hope they had a sensible caching policy (and if not, well, at least it was Someone Else's Problem), and then just format a single RSS feed locally.
Sadly, there's a major problem. The aforementioned XML::Feed Perl module does a wonderful job of hiding the mess of formats that labour under the acronym RSS and the name Atom. If you want to sort by date, you can do so easily. (In fact, you get lovely Perl DateTime objects. I can't sing DateTime's praises enough, even if it does look daunting at first.) Pipes, however, doesn't. I can sort my Vox Atom feed by its pubDate property, or my delicious and husk RSS feeds by dc:date, but neither sort has a date format in common, so I can't sort them once they're output.
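The missing step is just parsing both date styles into one comparable type before sorting. Here's a minimal Python sketch of that, assuming pubDate arrives in the RFC 822 style and dc:date in ISO 8601; it says nothing about how XML::Feed or Pipes work internally, and treating a date with no zone as UTC is an assumption of the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_feed_date(text):
    """Parse an RFC 822 (pubDate) or ISO 8601 (dc:date) string into aware UTC."""
    text = text.strip()
    try:
        # ISO 8601, as in dc:date; older Pythons reject a trailing "Z".
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        # Fall back to the RFC 822 style used by pubDate.
        parsed = parsedate_to_datetime(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: no zone means UTC
    return parsed.astimezone(timezone.utc)


def merge_newest_first(items):
    """Sort (date_string, title) pairs from mixed feeds, newest first."""
    return sorted(items, key=lambda item: parse_feed_date(item[0]), reverse=True)
```

With every item reduced to a UTC datetime, entries from the different feeds interleave correctly whatever format each one started in.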
I had a quick look to see if there was an obvious way of doing a date transformation on an element of an item, but unless I'm missing something it's far from obvious. I could write a small web service and call it, but that's a lot of work, and I might as well do things locally if I'm that bothered. So I've given up, but not before writing this, because it seems like a natural thing to handle in such a high-level environment, and I'm surprised they don't.