Tue, Apr. 22nd, 2008, 12:18 pm
Abrasive words slurred with obtuse thought

I got joe-jobbed again recently and it reminded me of this mini-essay I wrote about 3 years ago about a more ... extreme solution to spam.

Click through for the details ... )

Tue, Apr. 8th, 2008, 11:50 am
You don't really want it any more from me

People use Excel a lot. Abuse is possibly a better way of putting it. In fact they abuse the whole of the office suite - you've all received the three word reply wrapped up in a Word document or the PowerPoint document that just has a bunch of images in it.

The difference is that some of the Excel abuses are actually pretty clever. There's those quizzes that get emailed around - guess the lyric, solve the puzzle. But even beyond that you get the Finance and Ad booking departments that run off a single shared Excel spreadsheet of such breathtaking utility that you kind of have to step back and gasp.

It's so horrible! Yet so beautiful! It does everything for them - it tracks bookings and various sales stats and shows past performance and projects future trends. Sure, the programmer in you is recoiling in absolute horror - this isn't programming, this is a crime against compilers. Forget spaghetti code or lasagne code. This is macaroni baked with Velveeta. Yet like Mac'n'Cheez[tm] it works. It's delicious. It's a remarkably sophisticated app that does exactly what they want and does it fast.

And they wrote it. The sales people wrote it. These people who seem to look down on programmers and who won't update the HTML for the copy on the web page because "I was never very good at maths at school".

I'm being flippant of course. And despite some occasional minor differences, I've always got on with the biz dev teams in the companies I've worked for which is why it's sometimes baffled me that they won't even touch, say, HTML.

The problem, I think, is familiarity. We know that the amount of damage they could do with HTML is minimal. Except, well, if they leave out a single character then maybe the whole page is blank. It's not a big deal to us - we know it's not hard and that anything is easy to fix but that's beside the point. We're familiar with HTML, with the Web stack. They're not. In much the same way we're not familiar with (plucks bogus example completely out of thin air) filling in form 64-8 for authorisation of a VAT agent.

Yet they're familiar with Excel. And not just as a spreadsheet - as an environment. They know what makes it tick, what its capabilities and quirks are and how to make it dance in a way that you and I may never.

I'm kind of reminded of this quote about welders

"One of the unexpected things about watching the steel guys work is how the solidity of metal means nothing to them. Most people think of metal as something hard and inflexible, but welders don't. Which should be obvious in hindsight, I guess. But, for example, they have these saw-horses that are made of tube steel. And I can see how that came about: they needed some saw-horses; they had some steel. It took them 30 seconds to make them. And, an example with the stairs: the legs of the stairs' landing platform have big threaded bolts for feet, to fine-tune the height of the legs for levelling. And there are these steel tube sleeves that go around the legs, that drop down and cover the bolts. So when they were moving this platform in, they had to flip it over, and they didn't want the sleeves to fall off while they did this. Now to me, that job calls for duct tape. To them: they welded the sleeves in place, then de-welded them when they needed them to move again."


When you step back and look at it - Excel is a programming language. But not in the way that you and I think about it - with source code and a compiler and an execute/debug cycle. It's a cell based programming language where data and code are intermixed but never mistaken for each other. It's self modifying but easily observable. Changes are implemented trivially and immediately. And in parallel. Whilst most programmers have been struggling to get to grips with functional programming and Monads our brethren at the other end of the office have been doing it for years. And their IDE is better than any of ours. Hell it even has visual data and code dependency tracking:



Think I'm joking? Take a look at this article about building a 3D Engine in Excel. A 3D Engine with multiple rendering backends. And, if we try and shoehorn conventional programming metrics in, it does it in 30 lines (and 24 columns) of 'code'. Including comments.



To be honest I'm being a little light hearted here but actually my point is kind of serious. There's a new wave of programming paradigms - Map/Reduce, asynchronous execution, grid computing, sharded and column orientated databases and others. These aren't new ideas, especially in the academic world, but they're gaining more widespread acceptance. A cell, or to look at it slightly differently, a node based approach makes a lot of sense for a bunch of them.

At my previous job we were used to dealing with huge quantities of data a day. Our rendering farm sat on the list of the world's top supercomputers. We dealt with parallelism all the time - from Renderman to pixel and vertex shaders. We did our compositing using a program called Shake which is entirely node based.



Shake's kind of interesting (apart from Apple's slightly comical attempts to dissuade people from using it on Linux) - it's a very different way of doing image creation than what most people are used to. I watched with amusement the blogosphere cooing over the price drop for OS X and then giggled when they fired it up for the first time and didn't know what the hell to do with it. But my co-workers who really knew how to drive it used it all the time - need to put a mountain behind the crusaders. Use Shake. Need to create a facsimile of a bustling medieval London Bridge using nothing but a background plate of somewhere in Prague, some smoke elements and some video of a pigeon from outside. Use Shake. Need to create an icon for that new Shake node you just wrote ... use Shake.

It's the same familiarity as the Excel users and the welders.

We ended up writing a Node based programming language called Ripple that automatically went and deployed itself over the farm. It self balanced, passed variables and sorted out the DAG. You just strung nodes together and ran the script and tied into Alfred if you wanted a visual feedback and/or to reprioritise or delete tasks. If you wanted new functionality then you just wrote a new node type - we had nodes that did everything from skinning and compositing to generating dailies and emailing people. It was, to be frank, pretty cool.

I've been led to believe that several of the major banks have node based languages which do things like complex price matching or constantly take input from the market, ripple changes down through the DAG and give pretty much immediate access to the level of exposure that the bank has. Want more capacity? Just chuck more servers in - it should just all work.
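To make the "cells are nodes in a DAG" idea a bit more concrete, here's a toy sketch in Perl. It has nothing to do with how Ripple or any bank's system actually works - the cell names and the `value_of` and `recalculate` helpers are made up for illustration, it recomputes the whole sheet rather than pushing changes to just the affected nodes, and it doesn't check for cycles.

```perl
use strict;
use warnings;

# Hypothetical sheet: plain cells hold a value, formula cells name their
# dependencies and how to combine them.
my %cells = (
    price    => { value => 10 },
    quantity => { value => 3 },
    shipping => { value => 5 },
    subtotal => { deps => [qw(price quantity)],    formula => sub { $_[0] * $_[1] } },
    total    => { deps => [qw(subtotal shipping)], formula => sub { $_[0] + $_[1] } },
);

# Evaluate one cell, pulling in its dependencies first and memoising
# results so each node in the DAG is computed once per pass.
sub value_of {
    my ($name, $memo) = @_;
    return $memo->{$name} if exists $memo->{$name};

    my $cell = $cells{$name} or die "Unknown cell '$name'\n";
    my $value = $cell->{formula}
        ? $cell->{formula}->(map { value_of($_, $memo) } @{ $cell->{deps} })
        : $cell->{value};

    return $memo->{$name} = $value;
}

# Recompute the whole sheet and return a name => value hash ref.
sub recalculate {
    my %memo;
    value_of($_, \%memo) for sort keys %cells;
    return \%memo;
}

my $before = recalculate();
$cells{price}{value} = 12;          # edit one input cell...
my $after  = recalculate();         # ...and the change shows up downstream

print "total went from $before->{total} to $after->{total}\n";
```

Because every node only depends on its declared inputs, independent nodes could in principle be farmed out to different machines, which is the property that makes this style attractive for grid work.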

There's no real conclusion to this other than - parallel, grid based computing looks like it should be hard but it's coming, it has significant advantages once you can get your head round it and as long as the tools are good it might actually turn out to be a better way to program.

Thu, Mar. 27th, 2008, 03:36 pm
Who, when, where, whatever

I was trying to think of something I could do to play around with FireEagle and came up with something which both tickles my development fancy and also is so incredibly insular San Francisco navelgazing Wanking2.0 that I kind of feel compelled to do it.

So, and excuse the hand waving here, the way it would work is this:

  1. You'd purchase a cheapo RFID reader from somewhere - the ones from ThinkGeek, Phidget and Parallax all look good.

  2. Hook it up to a computer and run TheSoftware which, as yet, exists only in my brain. You will tell TheSoftware what the physical location of your card reader is.

  3. Swipe your brand new card which will prompt you to register yourself with a remote, centralised service.

  4. This service will prompt you to give it FireEagle access.

  5. From this point on whenever you swipe your card over the reader TheSoftware will inform the centralised service which will, in turn, tell FireEagle.


In and of itself this is not very useful but if you had a reader at work then you could swipe in there in the morning and then swipe in at home at the end of the day (or if you work somewhere suitably large then put multiple readers around the place). And then if your friends started getting RFID readers and installing them in their homes then when you went round there you'd be able to easily let FireEagle know where you were. Hell if you could persuade your favourite bars and clubs to do it then you could do it there. Hook it up to your Social Graph and then you can easily work out where all your friends are.

Then of course the data can be subpoenaed by the Government to prove that you're a terr'ist or something.

Thu, Mar. 27th, 2008, 03:16 pm
We didn't start the Fire

Yahoo! recently released FireEagle and jolly nice it was too - I've hooked it up to my Dopplr account and I have an idea of what to do with it of which more later.

However things were a bit confusing - there was a Net::FireEagle on CPAN by Aaron Straup Cope yet also a Net::FireEagle::Client linked to on the FireEagle page itself and they weren't really that much alike.

Because all of SF is a seething cabal I asked around and found that the CPAN version was an early version based on the old version of the API. And the ::Client version was somewhat lacking in things like, well, documentation. Or comments. Also it wasn't on CPAN which makes it somewhat of a second class citizen in the Perl world.

So after a bit, I ended up taking over the both of them. I renamed ::Client to just Net::FireEagle, added CPAN scaffolding, refactored the hell out of it, wrote a load of docs and some (very basic) tests and a nifty little command line script which also serves as an example of how to do the Auth Dance[tm] (which reminds me - the OAuth Auth Dance is much nicer than the Google, Flickr and especially Facebook one).

And lo the updated version now resides on CPAN. It even has a user.

Mon, Mar. 24th, 2008, 06:33 pm
My life in Non-Tweets

Since I don't have, and refuse to ever get, a Twitter account I have decided to summarise this weekend in the style of LoudTwitter because, yes goddammit I AM that geeky.

  • Friday night French Laundry
  • Saturday afternoon A Luau complete with a pit baked pig
  • Saturday night Awesome ice cream with two completely random new friends on the way to ...
  • Saturday night (again) ... a house warming party where it turns out that the two random new friends who were just giving me a lift knew someone at the party.
  • Saturday night (even later, more Sunday morning) Watch second half of Malaysian Grand Prix at Overtime on 7th and Harrison
  • Sunday morning Tool hire, DIY, ladders, drilling
  • Sunday afternoon the Big Wheel race down Vermont. Hilarity and Panda Bears ensued.
  • Sunday evening post BYOBW pizza and beers and Jesus Camp at ydnar's.
Automatically shipped by My Hands


Sixers (and Jesus) on Vermont

Living in San Francisco really doesn't suck.

Originally posted on deflatermouse.vox.com

Tue, Mar. 18th, 2008, 04:08 pm
By the way, I tried to say

WRT my last post I think I managed to spectacularly avoid saying the most important thing in my head which was this ...

I don't think there's anything fundamentally wrong with any of the PubSub systems in existence at the moment. However most of them seem to have escaped from or are inspired by the kind of messaging you need at banks and other financial type institutions. This is great and many of the design goals are the same but they're designed to be complicated and complete from the get go. And this works for them.

However I want something more like Memcached or Rails or similar - you install it out of the box and it Just Works[tm] and for 80% of people that'll probably suffice modulo some trivial tweaking.

Then there will be another 10-15% of people who can take that base and after some simple to moderate modifications make it do what they want.

There may even be a further 1% who can make it go even further but, at this point, it's diminishing returns and really if things were changed to make things easier for them it would compromise how simple things are for the 80%. And to be frank, the 1% would probably be better off with something designed from the start to do what they want.

Not everyone wants Oracle - some people are just happy with SQLite and MySQL. Hell some people are more than happy with BerkeleyDB.

And that's a good thing.

Tue, Mar. 18th, 2008, 03:23 pm
Standing in Line

For years the default website stack has been something similar to the classic LAMP stack - originally "Linux, Apache, MySQL, Perl" but now really meaning "Free Unix, Free Web server, Free Database and Free Web Language" or just "OS, Web server, Database, Language" to be even more general and friendly to the Windows people out there.

Relatively recently we've started to see that we should add another layer - a Cache. To be honest, from what I can see, Memcached has pretty much got this sewn up by virtue of being awesome although there are other technologies like APC. So I'd like to coin a new phrase - the CLAMP stack for Caching LAMP. I can't find a reference to it so I'm going to claim it as mine. MINE. MUHAHAHAHAHAAH. Maybe in the future it will make me famous. Maybe. *cough*

Even more recently I think there's been a need for some sort of new layer.
Snip addled musings ... )

Thu, Feb. 14th, 2008, 05:19 pm
What Do I Do Now?

I'm a little discombobulated at the moment for various reasons but this idea popped in to my head and I have no idea how stupid it is. What better way to test than to fling it at the internet like so much poop and see if it sticks.

Imagine if sites had a

<meta name="searchurl" value="http://example.com/search?query=%q" />[1]

tag in their headers. This would allow agents to autodiscover and utilise a site's search engine if one was available simply by substituting %q for a url encoded query. There could even be a type="..." attribute that gave the mime type of the results - Atom would be good. Although that could just as easily be done with Accept headers and the other standard mechanisms for negotiating types.
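For what it's worth, the agent side of this is only a few lines. The following is a rough sketch, assuming the template has already been pulled out of the page somehow; `url_encode` and `build_search_url` are names I've made up, and the encoder is a hand-rolled one that escapes everything outside the RFC 3986 unreserved characters.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Percent-encode everything outside the RFC 3986 "unreserved" set.
sub url_encode {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# Replace every %q in the advertised template with the encoded query.
sub build_search_url {
    my ($template, $query) = @_;
    my $encoded = url_encode($query);
    (my $url = $template) =~ s/%q/$encoded/g;
    return $url;
}

print build_search_url('http://example.com/search?query=%q', 'tea & biscuits'), "\n";
```

A real agent would also want to cope with a template that has no %q in it at all, which this sketch quietly passes through unchanged.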

Search engines could even use it to get better results from stuff like shopping and review sites.

Of course there's a possibility (nay, a probability even) that it'd be co-opted by spammers and also you have to ask yourself - why would sites provide this as a service and who would want to use it anyway so it's probably one for the "WTF were you thinking" file but hey ho.

I need more tea.

[1] Although it occurs to me that

<link rel="alternative" name="search" href="..." />

might work even better.

Wed, Feb. 13th, 2008, 09:34 am
Anytime, anyplace, anywhere

There are tonnes of JSON modules on CPAN. Why do it one way right when you can do it a hundred ways wrong? JSON::Any mitigates some of these problems by abstracting away the interface so that you can use JSON, JSON::XS, JSON::Syck, JSON::DWIM ...

Annoyingly JSON::XS completely changed its API between versions 1 and 2. JSON::Any dropped support for JSON::XS 1.x and now only supports 2.x.

Until now. This patch feels somewhat dirty but, meh, what the hell, it works.
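The general trick of hiding several backends behind one interface looks something like the sketch below. To be clear, this is not JSON::Any's code or its API - it's just the pattern, it assumes the 2.x style object interface (`new`, `encode`, `decode`) that JSON::XS 2.x and JSON::PP share, and `pick_json_backend` is a made up name.

```perl
use strict;
use warnings;

# Candidate backends, most preferred first. JSON::PP ships with Perl
# 5.14 and later, so the loop should normally find something.
my @candidates = qw(JSON::XS JSON::PP);

# Return the name of the first backend that loads.
sub pick_json_backend {
    for my $module (@candidates) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    die "No JSON backend available\n";
}

my $backend = pick_json_backend();
my $json    = $backend->new->canonical(1);   # sorted keys for stable output

my $text = $json->encode({ name => 'Shelf', tags => ['clues', 'people'] });
my $data = $json->decode($text);

print "$backend: $text\n";
```

The pain described above comes from exactly the assumption this sketch makes: as soon as one backend changes its method names between versions, the wrapper has to start special-casing.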

Wed, Feb. 13th, 2008, 08:31 am
Stuck in the Middle with You

Yesterday got me thinking - I think it was the combination of the impromptu burlesque show at the flower shop, the gig, the pint, the synchronicity and the conversation but for whatever reason it got me thinking about Shelf again.

Shelf is people orientated - it makes heavy use of the address book and finds connections between what you're doing now and people you know. Which is fine. But it could be cooler.

Instead of just having a person as an initial seed for the clues how about other things? Starting simply - how about urls?

There's already information out there about urls - for a start there's whether it's owned by someone you know. Or its stats from Alexa. Maybe its PageRank value. Then there's when you last visited it and how often you've visited it and what's changed since then. And whether you del.icio.us-ed it or Duggit or whatever. And whether it was mentioned in any of your RSS or Twitter feeds or emails. You could add notes to annotate it.

The next natural step is your friends - what have they said? Have they added notes? When did they last visit it (ignoring the glaring privacy concerns for the moment)? Where did they go next? Hell, throw it open to everyone. What has the rest of the world got to say about this? Suddenly every page has comments whether they like it or not. And notes and errata. It's a Web! It's a Wiki! It's a Dessert Wax and a Floor Topping!

And then there's places. You're looking at a museum or a gallery and it tells you what pubs and restaurants are nearby. And if any of your friends will be close by. Show you photos from the location. Throw in a map. Maybe some historical information or local trivia. Great for when you're sitting at your desk but even better when you're actually out on the street and you look down at your iBlackickreo95 and it's using Cell location or GPS to work out where you are.

Listening to music? Album covers, lyrics, other albums, recommendations. Films? Stuff from IMDB - the actors, what else they've been in, awards, trivia, more recommendations from my friends.

Nurse! Come quick! I think the restraints are coming loose.

Tue, Feb. 5th, 2008, 12:48 pm
You wrote a book about yourself

I've been thinking about latent meta data for a long time. A long time. Partly that's because it's such a large and vague topic - the amount of data is large and meta covers everything that you can infer about it.

In this case I've been thinking about how we can write tools to help us understand all the personal data we have knocking around. We have mountains of emails and contacts and web surfing history and conversations and other miscellanea and the more we get the harder it is to organise yet perversely there are more rich informational pickings to sift over.

I've written apps that listen on IRC and try and build a view of the world based on what's said. I've written stuff that indexes email corpuses and helps you rotate the data about any point. I've written secretarial bots that act as stenographers and note takers and who do calculations and lookups and go fetch things without you having to context switch, without you having to even ask in some cases, just like a really good PA should. I've written things that crawl Wikipedia and infer and answer questions.

I've talked (ranted, really) about this sort of software a lot to my friends, sometimes to the long suffering Tom Insam who seems to bear the brunt of more than his fair share of my insane ravings and half baked ideas.

One of the things I got excited about was Beagle (née Dashboard), the Gnome program that allows you to search all your information from a single interface. I liked its novel use of Clue Packets but in some ways it felt stale - unlike Dashboard you had to go type something in whereas Dashboard would infer from what you were doing. Something about that bothered me - it wasn't new enough I suppose. It was just an evolution of Windows Search and Sherlock and Spotlight. I want a PA, not a reticent knowitall in the corner I have to coax answers out of.

Because he's not a whiny bitch like me and because he seems to have more JFDI than is humanly possible, Tom completely ignored all my frothing and has since come up with his own system - Shelf - which has been getting some heavyweight coverage recently. It harkens back to the Dashboard model and uses polling to work out what you're doing in what apps and heads from there. It's, to be frank, pretty sweet.

But I'm still not totally satisfied and it boils down to this. I get distracted enough - I have IMs and IRC and emails and feeds and phones going all the time and I have to be careful because suddenly it's 3 hours later and the cursor is still blinking accusatorily on the blank editor.

What I actually want is something more Exposé like - I want to hit a key combination and everything that Shelf knows about the current context it can gracefully swoosh up in an achingly translucent overlay. The app can then either continuously scan what I'm doing or, for lower spec machines, it can just do its mojo on demand. This also solves the problem that, if you're running the Social Graph plugin, it doesn't need to send every url you're looking at to Google (which, in and of itself is part of a wider Separation of Personas theme of which more sometime in the future).

But not everyone's like me so separate the frontend and the backend. That way I can have my Exposé mechanism and info junkies can mainline clues and we'll all be happy. Apps could subsume the functionality giving richer integration through a common broker.

Hell for those who just need to know what's going on in the same way Britney needs the limelight, you could make a meta app that streams their Shelf clues, and Twitter streams and RSS feeds and access logs and email notifications and Calendar updates in some sort of context firehose that you get to drink from. Throw in some special self-referential sauce so that it understands itself and Oh look Dave Winer's accessing my site and now he's written a Twitter about it and someone else has written a blog post about it and 3 of my friends have commented defending me and ...

It's the sort of Intertwingularity that gives Ted Nelson a full on chubby.

Tue, Dec. 18th, 2007, 02:40 pm
If you don't want to then you could at least pretend

So, after writing my little rant on pseudo schemas someone sent me this rather breathless piece:

"Though the new feature is strangely undocumented by Apple, users have discovered that Mail now supports a system of URLs (yes, URLs can do more than point to porn) that allow you to link specific messages in other applications. For example, you could include links to a couple Mail messages from coworkers alongside notes, pictures, and web links in OmniOutliner or Yojimbo documents. This opens up a whole new productivity world, allowing you to bring your e-mail into other applications that aren't specifically designed to integrate with Leopard's Mail."

-New Url Features can make your email productive again



And it got me thinking about a post I wrote a while back about Facebook being how 'normal' people think email should work.

So, to revise my previous position slightly - I think it's becoming clear that we need some sort of hybrid protocol+action schema. I'm still of the opinion that these pseudo protocols are bad and detrimental in the long run and that it's better to deal with it now rather than deal with a schema-soup in the long run.

Tue, Dec. 18th, 2007, 02:33 pm
I done wrong and I want to suffer for my sins

It gets worse. These all work now:
    print four.pounds."\n";               # prints "4.00"
    print four.pounds.five."\n";          # prints "4.05"
    print four.pounds.fifty."\n";         # prints "4.50"
    print four.pounds.fifty.five."\n";    # prints "4.55"

    print fifty.pence."\n";               # prints "0.50"
    print fifty.five.pence."\n";          # prints "0.55"
    print four.pounds.fifty.pence."\n";   # prints "4.50"
    print four.pounds.and.fifty.p."\n";   # prints "4.50"

    print fifty.cents."\n";               # prints "0.50"
    print fifty.five.cents."\n";          # prints "0.55"
    print four.dollars.fifty.cents."\n";  # prints "4.50"

Must. Stop. Crack. Making. Me. Go. Blind.

Tue, Dec. 18th, 2007, 09:44 am
Young, dumb, don't see a problem

I've been occasionally reading the O'Reilly Beautiful Code blog which accompanies the book of the same name. I have to admit I haven't agreed with all of it - some of it reads similarly to occasional posts I see scattered around the Interweb by Pattern Language zealots who haven't quite grasped that what they're doing is only because their chosen language is broken. It's not wrong per se it's just that, well, can a best-practices workaround ever be considered beautiful?

Either way, I was pointed at this post this morning about The Cardinality of a Fluent Interface. Again, I wasn't entirely sure I agreed but I was sufficiently intrigued to start hacking around.

My first attempt yielded something surprisingly elegant (despite using a couple of mildly egregious hacks such as abusing an overload of the concatenation operator) which allowed you to do things like
    one.hundred
    twenty.two
A slight bit of hackery later and it could also do
    six.hundred.and.fifty
Making it do
    four.point.zero
    point.five
    three.point.one.four
    one.nine.zero.four
required changing the object from the oddly satisfying blessed scalar to a more complicated blessed hash and the internals got a lot uglier. I was initially skeptical that I could make it do both nine.point.five.five - which is arguably the correct English way to say it - and nine.point.fifty.five - which is semantically also correct (albeit clumsy and ugly) and useful for currency - but then I suddenly had a flash of inspiration and hacked in the two line change.

Currently it's labouring under the name Acme::Numbers and not on CPAN but feel free to have a look and suggest new test cases.
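If you're wondering how abusing the concatenation operator gets you numbers, here's a cut down sketch of the idea. It is not the Acme::Numbers source: `Tiny::Number` is a made up name, the number words are predeclared by hand with empty prototypes (an assumption about how to get them to parse, not a claim about what the real module does), and it only knows about adding and "hundred".

```perl
use strict;
use warnings;

package Tiny::Number;    # hypothetical name, not Acme::Numbers itself

use Scalar::Util qw(blessed);
use overload
    '.'  => \&_concat,                 # hijack string concatenation
    '""' => sub { ${ $_[0] } };        # stringify to the stored number

# The object is just a blessed scalar holding the running total.
sub new {
    my ($class, $value) = @_;
    return bless \$value, $class;
}

sub _concat {
    my ($self, $other, $swapped) = @_;

    # Plain string on the other side: behave like ordinary concatenation.
    unless (blessed($other) && $other->isa(__PACKAGE__)) {
        return $swapped ? $other . $$self : $$self . $other;
    }

    # "hundred" multiplies what came before it, everything else adds.
    return __PACKAGE__->new($$other == 100 ? $$self * 100 : $$self + $$other);
}

package main;

# Empty prototypes let the words parse as terms without parentheses.
sub one()     { Tiny::Number->new(1) }
sub two()     { Tiny::Number->new(2) }
sub six()     { Tiny::Number->new(6) }
sub twenty()  { Tiny::Number->new(20) }
sub fifty()   { Tiny::Number->new(50) }
sub hundred() { Tiny::Number->new(100) }

my @examples = (one.hundred, twenty.two, six.hundred.fifty);
print "$_\n" for @examples;
```

Each `.` between two number objects lands in `_concat`, which folds the right hand word into the running total, and the chain only turns back into a string when something finally asks for one.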

Thu, Dec. 13th, 2007, 12:20 pm
What Shall We Do Now?

You've probably noticed, because you're a bright and observant lot with more than a large streak of geekiness, that there's been a proliferation of custom url protocol schema such as webcal://. Apple is especially guilty of this in much the way that they're often especially guilty of muddying the technical waters. They tend to get away with it because even usually sensible, standards oriented people think they're so gosh darn pretty.

I'm going to stick my neck out here and say

"STOP DOING THAT! YOU'RE WRONG!"



The protocol part of the schema is for the protocol, ffs. I don't actually care whether the .ics file you're serving me comes via HTTP, FTP or Gopher but I at least need to know what protocol to use. You might argue that I should default to HTTP but it's not exactly a huge stretch to imagine that, one day, HTTP might be replaced as the great circle of life continues. Hakuna Matata. Especially if we start having more mobile orientated protocols.

If only we had some way of saying what type the file was. Oh wait! We ALREADY FUCKING DO. It's called MIME types and they work just fine, thank you very much.

There are two objections to MIME types as far as I can tell:

  1. It's hard to configure handlers for them
    I concede this may be true but a) that's a UI problem, not a protocol problem and b) it's no more difficult than having to configure a handler for a new pseudo protocol, especially since most browsers don't have UIs to configure protocol handlers and even if they did we'd now have to have two (2) different UIs (or one confusing UI) to handle MIME type and protocol handler set up.[*]

  2. Configuring MIME types is a server problem and sometimes I don't have access to my server config or don't know how to set things up. Waaah waaah waaah.
    I concede that this would be a problem IF THE HTML 4.0 SPECIFICATION WHICH HAS BEEN AROUND SINCE 1998 DIDN'T HAVE A WAY TO SPECIFY THEM IN THE HTML



That's not to say that there isn't room for improvement - for example you can't paste a link into an email and then specify the mime type (although perhaps some pseudo protocol like http+text/calendar:// ought to be defined) however that said I wish people would stick to the standards - they're there for a reason and "because the URLs look prettier this way" isn't a valid reason to ignore them.

(I am fully expecting now someone to come and point out that this is all allowed under some RFC and that I'm wrong in which case I will concede gracefully and commit seppuku)

[*]Although I'm aware that browsers have ways of configuring what mail client to use with mailto:

Sun, Dec. 9th, 2007, 10:21 pm
Everybody knows what you've been through

A repost of a review I did on the London.pm Website of the new O'Reilly Book "Programming Collective Intelligence".

Click hither if I haven't already bored you with this stuff ... )

Wed, Nov. 28th, 2007, 06:02 pm
For your consideration

For some reason I imagine this appealing most to aca and easterbunny ...


Wed, Nov. 28th, 2007, 04:04 pm
The clock ticks life away

In a burst of insomnia last night I finished off my personal calendar refactoring and uploaded it to CPAN. I'm actually pretty pleased with the results - it's pretty modular and clean and should be easy to extend to do something like read hCalendar off web pages or syndication feeds or produce hCalendar infused feeds itself or read and write to Google Calendar. Writing mod_perl 1 or 2 handlers should be a doddle too. Or integrating it with Memcached. There are a couple of things I'd like to do to it still - make installation easier for example and do a proper user management page but, for now, I figured I'd just release and add those later.

That wasn't quite enough to send me off to the land of nod though so I wrote Cal::DAV which is, err, a CalDAV client in Perl.

It's actually a pretty thin wrapper round Data::ICal and HTTP::DAV but at the very least it should stand as an entry point for anybody who searches for "CalDav + perl" and it's actually pretty much a drop in replacement for Data::ICal.

I had a slight problem in that the CalDAV server I was testing against (the Darwin Calendar Server from OSX) seemed to allow me to modify and delete files on the remote server but didn't seem to let me create new ones (either by uploading or by copying or moving existing files) and returned a 403 Forbidden.

I couldn't see anything wrong with the permissions (it happened even with the moral equivalent of doing it as root) and it happened with two different CalDAV clients (dave and cadaver) so I'm pretty sure it's not just my code. I'm pretty stumped so if someone else out there is running a Darwin Server or could give me access on another type of CalDAV server then that'd be much appreciated.

Weirdly enough that didn't send me to sleep either so I finished off a book on the Foreign Legion, read The Last Temptation and got halfway through a book on the state of British food before I conked out and woke up at 7:30 with the light still on.

Fri, Nov. 16th, 2007, 12:09 pm
You Get Me Closer To God

And ooooooooooooh, the music puns get worse and worse.

So, at work we stumbled across an interesting Perlism. Well, interesting if you're a huge nerd which, much as it pains me, I'm going to have to concede.

Anyway, so Perl has the concept of a DESTROY method which gets called on an object when it's, err, destroyed. Also, in Perl, objects can be any kind of blessed reference - usually it's a hash reference but it could be a blessed array, or a blessed subroutine reference. Which is where we join the story. Observe this:
my $foo = Foo->new;
$foo    = undef;

package Foo;

sub new {
    my $class = shift;
    return bless sub { print "Hello from Foo\n" }, $class; 
}
sub DESTROY {
    my $self = shift;
    $self->();
    print "Destroying Foo\n";
}
1;

We'd expect it to print
Hello from Foo
Destroying Foo

But it prints nothing.

How weird.

However if we do this (note the $class in the print statement):
my $foo = Foo->new;
$foo    = undef;

package Foo;

sub new {
    my $class = shift;
    return bless sub { print "Hello from $class\n" }, $class; 
}
sub DESTROY {
    my $self = shift;
    $self->();
    print "Destroying Foo\n";
}
1;

We get both statements printed.

Freaky, n'est-ce pas?

So the reason, as far as I know, is that DESTROY is only called when all references to the object are destroyed. Presumably for optimisation reasons, the anonymous subroutine in the first case is invariant and the new subroutine takes a reference to it and so the DESTROY is not called until new goes away, which it won't because the symbol table isn't wiped.

So the reason why the second one works is that by referencing the $class variable the anonymous sub is upgraded to a closure and a closure gets a new copy for every instantiation and thus gets DESTROYed.
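You can poke at the shared versus cloned distinction directly by comparing addresses. This is only a sketch of my understanding of the behaviour rather than anything the Perl docs promise; `make_plain` and `make_closure` are made up names, and the references are all held at once so an address can't simply be reused.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# No outer lexicals are captured, so perl can hand back the same code ref.
sub make_plain {
    return sub { return "Hello\n" };
}

# Captures $name, so each call has to build a fresh closure.
sub make_closure {
    my ($name) = @_;
    return sub { return "Hello from $name\n" };
}

my ($p1, $p2) = (make_plain(),        make_plain());
my ($c1, $c2) = (make_closure('Foo'), make_closure('Foo'));

printf "plain subs share an address: %s\n", refaddr($p1) == refaddr($p2) ? 'yes' : 'no';
printf "closures share an address:   %s\n", refaddr($c1) == refaddr($c2) ? 'yes' : 'no';
```

If the plain subs do turn out to share an address on your perl then blessing "one of them" blesses the lot, which is why dropping $foo doesn't bring the reference count to zero.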

It all makes sense if you squint at it askance but it's still the sort of thing that will trip you up.

Tue, Nov. 13th, 2007, 04:39 pm
All Tomorrow's Parties

After my last bit of Calendar fiddling I was left slightly disgusted by the state of the code. In my defence it was written over the course of a hungover Sunday afternoon but there was CGI and DBI code mixed up together, no DBI placeholders and all sorts of other perversions and sickness. At least it was all templated, which is something.

So I ripped out all the DBI code and put that in an abstraction layer and did the same for the handling stuff and then just generally had a cleanup which also ended up with it gaining a config file rather than hardcoded values, marked everything up as hCalendar and made it multiuser in the process.

Because the DBI stuff was abstracted out it was pretty easy to add another provider which could read from iCalendar files both locally and remotely. To be honest, I was vaguely shocked when I plugged in my Dopplr feed url and lo and behold my trips showed up.

So far so good. A neat, tidy, clean and simple to install calendar system that, whilst it only has the concept of all-day events, can read and export iCalendar (and copes with recurring events through that). Although I've also got code lying around for a more complicated calendaring system I've been using this system for the last 5 years and it works just fine.

Two small bugbears I'm wrestling with. The iCalendar exporting is fine but I'd like to make it more flexible to output different formats - it's all a bit complicated at the moment.

I'd also like proper caching. Oddly enough with this it's not the caching code that's the problem - it's the config file.

Well, sort of. In concept it's very simple. At the moment we have two types of provider: DBI and iCalendar. Actually, if we look skew-whiff at it then we have 3 types: DBI, iCalendar and the one that takes the input from multiple providers. A cache would just be another provider and either sit after the multiple provider or in between the multiple provider and one of the other providers (probably a remote iCalendar provider since local iCal and DBI are fast enough).

So far so easy but I'm trying to think of an easy way to shim this into my config scheme which is currently .ini file based and looks something like
[providers]
default   = dbi
birthdays = ics
dopplr    = ics

[default]
dsn  = dbi:SQLite:calendar
user = username
pass = password

[birthdays]
# local file
file = birthdays.ics

[dopplr]
# remote url
file = http://dopplr.com/user/uid.ics

I'm also not in love with the fact that providers' names are in there twice. I have pondered something like this
providers = default birthdays dopplr
[default]
dsn  = dbi:SQLite:calendar
user = username
pass = password
type = dbi

[birthdays]
# local file
file = birthdays.ics
type = ics

[dopplr]
# remote url
file = http://dopplr.com/user/uid.ics
type = ics

And then I could change it so that
[simplecache]
cache_dir = .cache

[dopplr]
# remote url
file  = http://dopplr.com/user/uid.ics
type  = ics
cache = simplecache

which is doable but involves more complexity and blurs the abstractions since some classes are treated differently.
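On the code side the "a cache is just another provider" idea is straightforward enough. Here's a rough sketch, assuming a provider is anything with an `events` method - the `Provider::*` class names, the `ttl` argument and the in-memory cache are all invented for illustration and aren't the calendar's real classes.

```perl
use strict;
use warnings;

# Stand-in for the DBI and iCalendar providers: anything with events().
package Provider::Static;
sub new {
    my ($class, @events) = @_;
    return bless { events => [@events], fetches => 0 }, $class;
}
sub events {
    my ($self) = @_;
    $self->{fetches}++;               # count how often we are really hit
    return @{ $self->{events} };
}

# A cache is just another provider wrapped around an inner one.
package Provider::Cache;
sub new {
    my ($class, %args) = @_;
    return bless { inner => $args{inner}, ttl => $args{ttl}, stamp => undef, events => [] }, $class;
}
sub events {
    my ($self) = @_;
    my $stale = !defined($self->{stamp}) || (time() - $self->{stamp} >= $self->{ttl});
    if ($stale) {
        $self->{events} = [ $self->{inner}->events ];
        $self->{stamp}  = time();
    }
    return @{ $self->{events} };
}

# The "multiple" provider concatenates whatever its children return.
package Provider::Multi;
sub new {
    my ($class, @providers) = @_;
    return bless { providers => [@providers] }, $class;
}
sub events {
    my ($self) = @_;
    return map { $_->events } @{ $self->{providers} };
}

package main;

my $birthdays = Provider::Static->new("Mum's birthday");
my $dopplr    = Provider::Static->new('Trip to London');   # pretend this is remote
my $calendar  = Provider::Multi->new(
    $birthdays,
    Provider::Cache->new(inner => $dopplr, ttl => 60),
);

my @events;
@events = $calendar->events for 1 .. 3;

printf "%d events, birthdays fetched %d times, dopplr fetched %d time(s)\n",
    scalar @events, $birthdays->{fetches}, $dopplr->{fetches};
```

None of which helps with the actual problem, which is how to describe that wrapping in the .ini file without the cache becoming a special case.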

It's kind of vexing that it's not the coding that's the problem, it's designing a clean architecture. It really interrupts your rhythm.
