October 16th, 2007


So don't mistake it, just try and fake it

I think it's one of the issues that confuses some users of sites - can the people who run the site read their private data? Most sites will store passwords hashed and salted, but by their very nature entries (posts, comments, photos, tweets, pokes, zombie requests, movies) must be stored in the database completely en clair.
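For the curious, "hashed and salted" looks roughly like the sketch below. It's only an illustration: the function names, the iteration count and the choice of PBKDF2 from Python's standard library are my assumptions, not a description of what any particular site does.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor, tune it for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest). Only these two values would be stored."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def check_password(attempt, salt, stored_digest):
    """Re-derive the digest from the attempt and compare in constant time."""
    _, digest = hash_password(attempt, salt)
    return hmac.compare_digest(digest, stored_digest)


# The site keeps salt and stored, never the password itself.
salt, stored = hash_password("correct horse battery staple")
```

The site can check a login attempt against the stored values but can't turn them back into the password. Posts get no such treatment, because the site has to be able to show them back to you.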

It's logical if you think about it - the text (for example) appears unencrypted on your screen if you're logged in. It must have got that way somehow - if the site can decrypt it and display it on your screen then the site's owners can read it.

In practice access to that data is restricted through the usual security techniques, so there's no need to freak out. Still, you should practice the same hygiene you would if the data were on your laptop - don't leave things lying around that could lead to identity theft or bank fraud.

However, as a gedanken experiment, what would it take to have a fully encrypted social software site? There are plenty of reasons why this would be desirable - you reduce the chance of accidental privacy leakage; if hackers ever got into your service's servers and compromised the database, your posts would still be unreadable; and a subpoena of the database would turn up nothing but ciphertext.

But if that's the case, why would a site not do it? Well, for a start it would be a resource nightmare, and secondly it would be very difficult usability-wise. For those interested in the usability of cryptography there's a fantastic paper called Why Johnny Can't Encrypt - in this case redone as a chapter of an O'Reilly book.

Anyway, so what would we need? Well, for a start it seems clear that we'd need some sort of public-private key system. As a user you would generate a key pair and then upload the public half to the service, keeping the private half to yourself. You'd write the post client side, encrypt it with your public key and sign it with your private key (maybe in a Java applet or using a Javascript library) and then post it to the service. To read it you'd retrieve the encrypted post and decrypt it with your private key.
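As a very rough sketch of that round trip, here's a toy version in Python rather than an applet. Everything in it is an assumption made for illustration: it uses textbook RSA with no padding, builds the key pair from two well-known Mersenne primes instead of generating one properly, and the names pub_orig, priv_orig and rsa are mine. It is nowhere near safe for real use.

```python
import hashlib

# Toy long-term key pair. 2**521 - 1 and 2**607 - 1 are known Mersenne
# primes, which makes them about the worst possible choice for a real key.
P, Q = 2**521 - 1, 2**607 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, needs Python 3.8+
pub_orig, priv_orig = (E, N), (D, N)


def rsa(key, number):
    """Textbook RSA: one modular exponentiation, used for both directions."""
    exponent, n = key
    return pow(number, exponent, n)


def to_int(data):
    return int.from_bytes(data, "big")


def to_bytes(number):
    return number.to_bytes((number.bit_length() + 7) // 8, "big")


post = "Only I can read this.".encode()

# Client side: encrypt with the public key, sign a hash with the private key.
ciphertext = rsa(pub_orig, to_int(post))
signature = rsa(priv_orig, to_int(hashlib.sha256(post).digest()))

# The service stores (ciphertext, signature) and can read neither.

# Client side again: decrypt with the private key, check the signature.
recovered = to_bytes(rsa(priv_orig, ciphertext))
signature_ok = rsa(pub_orig, signature) == to_int(hashlib.sha256(recovered).digest())
```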

However, as the more astute of you will have noted - this is useless. All it gives you is a way of storing your posts on the remote server with no one else reading them, whereas what you actually want is for your friends to read them.

Theoretically what you could do is give all your friends your private key but then they could read all your posts and even post as you. Which is dumb.

So, what we actually need to do is this - every time you write a new post we generate a new public-private key pair; let's call them priv_tmp and pub_tmp. We then encrypt the signed post with pub_tmp, encrypt pub_tmp with your original public key (pub_orig), and store them both somewhere.

Now we have an encrypted post in the database which only you can read and write to (by retrieving pub_tmp with priv_orig).

Then, for each of your friends (and yourself presumably) you'd encrypt a copy of priv_tmp with that person's public key and store it alongside the post.

Now, if they want to read a post they retrieve the encrypted post and the copy of priv_tmp encrypted with their public key. They use their private key to recover priv_tmp and then use priv_tmp to decrypt the post.

If you want to revoke someone's access to the post then you could just delete the priv_tmp encrypted with their public key. However, if they're unscrupulous they could have saved priv_tmp, so what you'd actually have to do is delete the post and then go through the encryption process again, with a fresh key pair, for all your friends minus them.

If you want to give someone permission to read an old post you retrieve priv_tmp and then re-encrypt it using the new person's public key.
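Pulling the last few paragraphs together, here's a toy sketch of publishing, reading, granting and revoking. Big caveats: it's textbook RSA with no padding and tiny keys, posts must fit in about 31 bytes, the signing step and the hiding of pub_tmp behind pub_orig are left out, and every name in it (publish, read, grant, revoke, the record layout) is made up for illustration. Think of it as a diagram that happens to run, not an implementation.

```python
import secrets

E = 65537  # fixed public exponent


def is_probable_prime(n, rounds=20):
    """Miller-Rabin test. Fine for a demo, not audited for real use."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % small == 0:
            return n == small
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for _ in range(rounds):
        x = pow(2 + secrets.randbelow(n - 3), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True


def random_prime(bits):
    while True:
        c = secrets.randbits(bits) | (1 << (bits - 1)) | 1  # top bit set, odd
        if (c - 1) % E and is_probable_prime(c):
            return c


def make_keypair(bits):
    """Return (public, private) as ((E, n), (d, n))."""
    p = random_prime(bits // 2)
    q = random_prime(bits // 2)
    while q == p:
        q = random_prime(bits // 2)
    return (E, p * q), (pow(E, -1, (p - 1) * (q - 1)), p * q)


def rsa(key, number):
    exponent, n = key
    assert 0 <= number < n, "value too large for this toy key"
    return pow(number, exponent, n)


def publish(text, readers):
    """readers maps a name to that person's long-term public key."""
    pub_tmp, (d_tmp, n_tmp) = make_keypair(256)  # fresh pair for every post
    return {
        "n_tmp": n_tmp,
        "body": rsa(pub_tmp, int.from_bytes(text.encode(), "big")),
        # one copy of priv_tmp per reader, each locked with that reader's key
        "wrapped": {name: rsa(pub, d_tmp) for name, pub in readers.items()},
    }


def read(record, name, priv):
    d_tmp = rsa(priv, record["wrapped"][name])  # recover priv_tmp
    number = rsa((d_tmp, record["n_tmp"]), record["body"])
    return number.to_bytes((number.bit_length() + 7) // 8, "big").decode()


def grant(record, owner, owner_priv, newcomer, newcomer_pub):
    d_tmp = rsa(owner_priv, record["wrapped"][owner])
    record["wrapped"][newcomer] = rsa(newcomer_pub, d_tmp)


def revoke(record, owner, owner_priv, readers, banned):
    """Throw the old record away and re-publish under a new priv_tmp."""
    text = read(record, owner, owner_priv)
    return publish(text, {n: p for n, p in readers.items() if n != banned})


keys = {name: make_keypair(512) for name in ("me", "alice", "bob", "carol")}
public = {name: pair[0] for name, pair in keys.items()}
private = {name: pair[1] for name, pair in keys.items()}

readers = {name: public[name] for name in ("me", "alice", "bob")}
post = publish("Pub at eight, usual place", readers)
alice_view = read(post, "alice", private["alice"])

grant(post, "me", private["me"], "carol", public["carol"])
readers["carol"] = public["carol"]
carol_view = read(post, "carol", private["carol"])

post = revoke(post, "me", private["me"], readers, "bob")
```

Note that the long-term keys are deliberately bigger than the per-post ones (512 bits against 256) so that priv_tmp fits inside a single encryption. A real system would not wrap keys this way.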

Woohoo, that was easy! Ish! Let's go implement it!


Spotted the flaw yet?

Yes, any of your friends who have access to read your post also hold priv_tmp, so they can provide access to anybody else, which is obviously undesirable.

So there will probably have to be extra steps with signing using private keys. Or alternatively I seem to remember that there was some interesting work being done in this area - I have hazy memories of a diagram of a box with multiple keys or some such but it's all slipped into an alcoholic haze and I'm way out of date with my crypto reading.

Either way I hope I've shown why what may seem simple conceptually is actually hard both theoretically and practically.

And now I have the phrase ♫'Coz there's no end To being your friend♫ stuck in my head.