I'm Sorry It's Come To This

If you want to be a great leader,
you must learn to follow the Tao.
Stop trying to control.
Let go of fixed plans and concepts,
and the world will govern itself.


I usually try not to get too political in my public persona – on blogs, twitter, IRC, mailing lists et cetera – and that's a conscious choice.

I work on open source software.  I have for the last ten years.  I am lucky enough to have founded a project of my own, but in open source, leaders are more beholden to their followers than vice versa.  I depend on people showing up to effectively work for me, for free, on a regular basis.  So, I try to avoid politics not because I don't have strong convictions (anyone who knows me personally can tell you that I certainly do) but because I don't want someone to avoid showing up and helping do some good in the world in one area, just because we might disagree in another.

This is a benefit of living in a free and democratic society: we have ways to dispute issues that we have strong feelings about, so we can cooperate on some things without having to agree on everything.  It's rarely perfect but we can usually get some good stuff done, with rough consensus and running code.

Today though, there's a political issue which I can't ignore.  The purpose of Twisted (the open source project which I founded) is to facilitate the transfer of information across the Internet.  A new law, SOPA, is threatening to radically alter the legal infrastructure of the Internet in the United States, granting sweeping new powers to copyright cartels and fundamentally restricting the legal right to transfer any information, and to build tools that transfer it.  Twisted is designed to make it easy to implement new protocols, to easily experiment with improvements to systems like the Domain Name System.  SOPA might well make those potential improvements, and with only a little paranoid fantasizing, Twisted itself, illegal.

It's my view that this law is a blatantly unconstitutional restriction on free speech.  It will kill job creation, at a time when our nation can scarce afford another blow to its economy.  It will create the infrastructure to suppress political dissent, similar to the infrastructure in China and Syria, at a time when our corrupt political system needs dissent more than ever.  It is the wrong thing at the wrong time.

This bill is being discussed in the House today.  If you're in the US, call your representative right now.

(As always, I don't speak for anyone but myself; no one else has reviewed or endorsed these remarks.)

Blocking vs. Running

I've heard tell of some confusion lately around what the term "non-blocking" means.  This isn't the first time I've tried to explain it, and it certainly won't be the last, but blogging is easier than the job Sisyphus got, so I can't complain.

A thread is blocking when it is performing an input or output operation that may take an unknown amount of time.  Crucially, a blocking thread is doing no useful work.  It is stuck, consuming resources - in particular, its thread stack, and its process table entry.  It is sucking up resources and getting nothing done.  These are resources that one can most definitely run out of, and are in fact artificially limited on most operating systems, because if one has too many of them, the system bogs down and becomes unusable.

A thread may also be "stuck" doing some computationally intensive work; performing a complex computation, and sucking up CPU cycles.  There is a very important distinction here, though.  If that thread is burning up CPU, it is getting work done.  It is computing.  This is why we have computers: to compute things.

It is of course possible for a program to have a bug where it goes into an infinite loop, or otherwise performs work on the CPU without actually getting anything useful to the user done, but if that's happening then the program is just buggy, or inefficient.  Such a program is not blocking: it might be "thrashing" or "stuck" or "broken", but "blocking" means something more specific: that the program is sitting around, doing nothing, while it is waiting for some other thing to get work done, and not doing any of its own.
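To make the distinction concrete, here is a minimal sketch that compares wall-clock time with CPU time for a thread that is blocked and one that is running.  The helper names (`measure`, `blocked`, `running`) are made up for illustration, and `time.sleep` is only a stand-in for a real I/O wait.

```python
import time

def measure(work):
    """Return (wall_seconds, cpu_seconds) consumed while running work()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)

def blocked():
    # Stand-in for waiting on I/O: the thread sits idle until it is woken.
    time.sleep(0.3)

def running():
    # Pure computation: the CPU is kept busy for the whole call.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

blocked_wall, blocked_cpu = measure(blocked)
running_wall, running_cpu = measure(running)

print(f"blocked: wall={blocked_wall:.3f}s cpu={blocked_cpu:.3f}s")
print(f"running: wall={running_wall:.3f}s cpu={running_cpu:.3f}s")
```

The blocked call should show plenty of wall-clock time and almost no CPU time, while the running call's CPU time should track its wall-clock time fairly closely.  The exact numbers will vary from machine to machine.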

A program written in an event-driven style may be busy as long as it needs to be, but that does not mean it is blocking.  Hence, event-driven and non-blocking are synonyms.

Furthermore, non-blocking doesn't necessarily mean single-process.  Twisted is non-blocking, for example, but it has a sophisticated facility for starting, controlling and stopping other processes.  Information about changes to those processes is represented as plain old events, making it reasonably easy to fold the results of computation in another process back into the main one.

If you need to perform a lengthy computation in an event-driven program, that does not mean you need to stop the world in order to do it.  It doesn't mean that you need to give up on the relatively simple execution model of an event loop for a mess of threads, either.  Just ask another process to do the work, and handle the result of that work as just another event.
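As a rough sketch of that idea, the following uses Python's standard `asyncio` module rather than Twisted's own process support, purely so that it runs without extra dependencies.  The child's workload (`CHILD_CODE`) and the `heartbeat` task are hypothetical stand-ins for a real computation and for the other events a real program would be handling.

```python
import asyncio
import sys

# Hypothetical CPU-heavy job, run in a separate Python interpreter.
CHILD_CODE = "print(sum(i * i for i in range(1_000_000)))"

async def compute_in_child():
    # Start the child process; each await hands control back to the
    # event loop instead of blocking it.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD_CODE,
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    # The child's result arrives as just another event.
    return int(out.decode().strip())

async def heartbeat(ticks):
    # Stand-in for other events that still get handled in the meantime.
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0.01)

async def main():
    ticks = []
    result, _ = await asyncio.gather(compute_in_child(), heartbeat(ticks))
    return result, ticks

result, ticks = asyncio.run(main())
print(result, ticks)
```

The event loop itself never performs the lengthy computation; it only reacts when the child process has output to deliver.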

2L2T: DjangoCon Feedback

I've been having a great time over here at DjangoCon, but now that I've had an opportunity to relax and process some feedback from my talk, I have noticed a couple of themes to that feedback.  This isn't really a full article, just a response, but it's too long to tweet.

If you're curious about the talk, you can view it here.

For the most part, the talk was exceedingly well-received and I want to thank the Django community both for the opportunity to speak and for the overwhelmingly positive response.  Thanks for making an outsider to your community feel welcome and appreciated.

There have been a couple of misconceptions though, and perhaps I didn't express myself clearly on a few points.

  1. I realize that there are times – plenty of times, even – when using some component that's in a different language from your main application is the right choice.  I wasn't trying to say "all Python all the time no matter what, no exceptions".  I just want you all to consider that there is a cost to using a component that's in a different language, and you should be aware of that cost.  It's not as simple as a tick-a-box feature comparison of the features and drawbacks of multiple products.  If I came across as sounding really extreme on this, it was just to provoke a response.
  2. You can have an architecture which is driven by Python and organized by Python without actually having all the implementation be in Python.  For example, an inordinate number of people asked me about memcache.  If you want to use something like that, sure, use memcache, there's not a lot that it being in Python would buy you.  Some might say that the whole point of memcache is that it isn't very deeply configurable and doesn't have much in the way of behavior.  Plus, it's an internal component, not an externally visible service, so even my usual flimsy "no buffer overflows" argument doesn't really hold up; it's more like a library than a server.  You can incorporate memcache into a Python-in-the-driver's-seat architecture by spawning memcache from your Python process instead of making memcache a configuration dependency.  That way, you don't need a separate configuration file and a separately managed service or a chef script that boots memcache for you before your application.  This applies equally well to any other, similar services: write their config files from your Python code, and start them automatically.
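For point 2, a minimal sketch of what "spawning memcache from your Python process" might look like, assuming a `memcached` binary is installed and on your `PATH`.  The function names and default values here are hypothetical; only the `-l` (listen address), `-p` (TCP port) and `-m` (memory in megabytes) flags are memcached's own.

```python
import subprocess

def memcached_argv(port=11211, memory_mb=64, listen="127.0.0.1"):
    """Build the memcached command line from application settings."""
    return ["memcached", "-l", listen, "-p", str(port), "-m", str(memory_mb)]

def start_memcached(**settings):
    # Assumption: a `memcached` executable can be found on PATH.
    # The settings live in Python, so no separate config file is needed.
    return subprocess.Popen(memcached_argv(**settings))

print(memcached_argv(port=11311))
```

Your application would call something like `start_memcached(port=11311)` at startup and `terminate()` the returned process at shutdown, so the cache's lifetime is tied to the application's rather than to a separately managed service.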

Finally, thanks to everyone who really thought about what I said, took the time to respond, and prompted me to write this.

ἁγιολογία for r0ml

I have the rare distinction of being a second-generation software developer.  Most recently, I mentioned this in an interview when asked who my programming heroes are.  It might sound kind of corny, but I'm serious when I say that my father is my programming hero.
Robert "r0ml" Lefkowitz
My dad had a cool hacker alias in the seventies.  He's been known as "r0ml" around the web since before there was a web. If you are in a particularly typographically hip part of the internet, it might even be "RØML".  How many of your parents have a nom de plume with a digit, or a non-ASCII character in it?  Or, for that matter, any kind of hacker pseudonym?
I had the good fortune to work with one of r0ml's colleagues, Amir Bakhtiar.  Amir paid me one of the highest compliments I've ever received: he said that the code for systems I've worked on is similar to r0ml's in its style and exposition.  My dad taught me how to program in x86 assembler, and in that process, I learned a lot about the way he thought about solving problems and building systems.  I regard thinking that well, or even comparably well, as a real achievement.
That's not to say that I would do everything exactly the way that he does.  For example, he writes a lot of networking code in Java.  He doesn't use Twisted, for the most part.  If you know me and you know my dad, you know that we disagree on plenty of stuff.
Unlike the stereotypical, often-satirized filial argument, these discussions are something I look forward to.  Disagreeing with my dad is still one of the most intellectually challenging activities I've ever engaged in.  Whenever I have a conversation with him about a topic where he has a different view, I come away enlightened – if not necessarily convinced.
Conversations among my friends occasionally turn to the topic of our respective upbringings, as they do in any close group.  One of the recurring themes of my childhood is that, while my siblings and I were sometimes told to be quiet, we were never told to be quiet because our opinions weren't valuable.  Sometimes we were told in unequivocal terms that we were wrong, of course.  However, my dad always encouraged us to present our thoughts.  Then, he wouldn't pull any punches in relentlessly refuting our arguments, using a combination of facts, estimates, calculations, and rhetorical flourishes.  I learned more about influencing people and thinking clearly around the dinner table than in my entire formal education.
r0ml always questions glib answers, challenges the official version of events, distrusts things that are "intuitively obvious" or "common sense".  The skepticism I've developed as a result of his consistent example has rarely led me astray.  Glib answers, official versions, and common sense are frequently, if not always, wrong.  He taught me to search for the non-intuitive answer, the surprising inflection point in the data.
In a roundabout way, he also taught my siblings and me how to perform some delightful rhetorical flourishes of our own, but also not to trust them.  Pretty phrases can be deployed equally effectively in the service of illustration or deception.  Although I can appreciate that parents often come to a point where they've had enough and a little deception can be a useful thing.
One cannot be a practiced rhetorician without a heaping helping of eclectic life experience; r0ml has that too.  He's a fencer.  And a juggler.  He still has the highest score on Space Harrier of anyone I've ever met.  (I can remember a crowd gathering in an arcade to see him start level 18.)  He's an avid scholar of medieval thought and custom.  For that matter, he's an avid scholar of a couple dozen other things, but listing them all would take a whole day.
He has the common occupational affliction of being a science fiction fan.  However, fandom was never an identity for him. Again, by consistent example, he taught me to focus on my own creativity, and do something cool, never to just passively consume others' ideas.  He treats entertainment as an inspiration, rather than an escape.  For instance, one of the earliest memories I have about my father talking about software is a reference to the movie "Terminator".  (Please keep in mind that this memory is ~20 years old at this point, so it might not be terribly accurate.)  I remember him saying something like "All software should be relentless.  If you remove its legs, it should use its arms.  Whatever errors it encounters, it should deal with them, and keep going if it can."
Nevertheless, seeing "Tron: Legacy" with my dad, the hacker, in IMAX 3D, 20 years after we saw the original together... I didn't need to take a life lesson from that to think it was pretty rad[1].
Unlike many quiet geniuses who labor in obscurity, dispensing wisdom only to a fortunate few, r0ml is a somewhat notorious public speaker.  You can see him this year at OSCON.  If you hunt around the web, you can find some video examples of his previous talks, like this great 30-second interview[2] about the nature of open source process, from a talk he gave in 2008 (audio of the full talk here).
([1]: Although, jeez, what was the point of that whole open-source subplot at the beginning?  It seemed like a great idea, but then it went absolutely nowhere!)
([2]: Speaking of not doing things exactly the way he does - where he uses a metaphor of "single-threading" and "multi-threading", I would have said "blocking" and "event-driven" - but more on that in a future post.)
Happy Father's Day, r0ml.

Calling all Ascetic Buddhist Rock Musicians

The Presentation

The inimitable Zooko recently made me aware of an excellent presentation about HTTPS: "It's Time to Fix HTTPS", by Chris Palmer.

The presentation is both hilarious and illuminating; I highly recommend you view it right away.  It's not saying anything that I haven't been thinking for a very long time.  Except the thing about how IE can silently add certificates to your root CA store, that was definitely new, and a little depressing.  But this is a somewhat esoteric topic and it needs to be made more popular for the everyday user.  Sexy, even.

A Brief Review

(But seriously, go read the slides, they're more entertaining.)

Internet security is based on trust.  The math behind modern cryptography doesn't ensure anything beyond that you're talking to someone that holds a particular special secret ("private key").  You can verify that the party you're talking to has the same key as the one you talked to last time, and that a particular private key corresponds to a particular public key, but that's about it.  The public key can be published for everyone to see without risking any of the secrets being sent, but you still need some way to determine whether the public key actually belongs to the person you want to talk to.  So, in order to have a secure system, you have to layer some rules on top of that which give you some way to know whether that private key corresponds to an identity that you care about and trust.

The current system goes something like this: each web browser vendor decides, more or less at random, on a group of entities we will all trust completely.  By virtue of the trust of the software, they become the authorities who can decide whose public keys are valid.  Actually, a public key isn't quite enough: you need a key plus some metadata about the person sending it: we call this a "certificate".  So these entities are termed "certificate authorities".  The browser vendors tend to decide on the same group, because there's a lot of social pressure to maintain a list that makes sense (and also, anybody who gets accepted by one browser but denied by another can't really sell certificates: the whole point of this exercise is to sell things that make the little lock icon come up, so you know your web shopping cart is "secure").

The problem with this system is that almost all of these "completely trustworthy" entities are enormous companies or, possibly, even foreign governments, which have diverse motivations and huge amounts of legitimate business to conduct, making it very hard to spot a small amount of malfeasance.  (Although there is some good news: people do notice, and they freak the hell out when they do; so at least there's some policing of the current system.)  One compromised certificate authority (and there are lots and lots to try and compromise) means a complete "game over" for everybody who uses a web browser and trusts the little lock icon.

Basically there's no such thing as "completely trustworthy".  There's only: do I trust you.

The Next Step

The solution that Mr. Palmer proposes is extremely similar to the one which I thought I originally devised in about 2004, but probably was floating around in the security zeitgeist even before that.  It's a combination of 3 general principles:

Trust On First Use

Basically, the first time I see you, on the internet, it's unlikely that you're trying to trick me.  So you can give me any old public key, and I'll accept that it's you.

Mr. Palmer gives this one a catchy pseudonym, "TOFU", which I quite like (and I guess is pretty widely known at this point).

Persistence Of Pseudonym

The important point is that then I remember that it's you, forever, so it's very hard to attack our communications after that point.

I'll come up with a name for you (let's say "Bob Smith" or "The Most Secure Bank In The World Dot Com"), and my software will make sure that it sticks to that public key.  You can potentially tell me that your key has changed, but you'd better be prepared to present your old key, otherwise I have to get re-introduced to you, and now I'm suspicious that something may have been fishy.  Especially if some other thing shows up and says "Hi, it's Bob Smith" (with the correct, old public key) - "Hey, who's this guy?"

This is referred to as "POP".  Also pretty catchy.
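Here is a minimal sketch of how TOFU and POP fit together, assuming the "name" is just a string and the public key is an opaque blob of bytes.  The `PinStore` class and the example keys are hypothetical, and the "present your old key to change keys" step is left out for brevity.

```python
import hashlib

class PinStore:
    """Remembers the first key seen for each name (hypothetical helper)."""

    def __init__(self):
        self._pins = {}  # name -> fingerprint of the key first seen

    @staticmethod
    def fingerprint(public_key: bytes) -> str:
        return hashlib.sha256(public_key).hexdigest()

    def check(self, name: str, public_key: bytes) -> str:
        seen = self.fingerprint(public_key)
        pinned = self._pins.get(name)
        if pinned is None:
            # Trust on first use: accept any old key and remember it.
            self._pins[name] = seen
            return "first-use"
        if pinned == seen:
            # Persistence of pseudonym: same name, same key as before.
            return "match"
        # Different key for a known name: time to get suspicious.
        return "mismatch"

store = PinStore()
first = store.check("bank.example", b"key-A")
again = store.check("bank.example", b"key-A")
changed = store.check("bank.example", b"key-B")
print(first, again, changed)
```

A real implementation would persist the pins to disk and handle legitimate key changes; this only shows the core bookkeeping.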

Mesh Overlay Network Keysigning

The third concept Mr. Palmer refers to as a "trustiness metric" which includes "perspectives", and says "You can't fool all of the people all of the time".  He includes some other stuff in his trustiness metric here, but I'm going to extrapolate from that sentence:

It's really, really easy to sit down in a café and intercept some of my network traffic.  It takes about 2 minutes to collect a dozen passwords this way, on today's mostly-not-encrypted internet.  So it would be very easy for someone to break this system if all you had was a little re-introduction warning; users might not understand it and just click anyway, and then it's just as broken as (if not worse than) the current model.  At least in the current model, normal users don't usually get those warnings, and they're "safe" if they're looking for the lock, but in this new model, users would get them for all new secure introductions.  So we need something better.

It's not so easy to sit down in a café and intercept network traffic from me and also intercept traffic from my friend, on a different network, doing a different thing.  You have to know where my friend is.  You have to be able to intercept our pre-arranged secure communication (I already remember all my friends' keys when I first see them, you'll recall).  If you're a casual attacker who just wants to sniff a couple of credit card numbers at the local Starbucks, you probably don't have the resources to do that, even for a single individual.

It is definitely not easy to figure out where every single one of my currently-online friends - let's say Facebook friends, because maybe they finally care about security now - is online from, and also attack their networks simultaneously, to provide exactly the same bogus first-introduction certificate to Super Secure Bank Dot Com.  This is a level of sophistication and coordination that not even most governments can muster.

So if we had a reasonably available mesh overlay network, where I can tell my friends, and my friends can tell their friends (etc forever) about first-introduction key correspondence with DNS names, and legitimate changes to keys where the site operator has had a security problem, then we could address many of these issues much more robustly than we can today.  It might not be perfect, but it would silently work often enough that it would be much better than today's default of "bah, I don't know why you're getting the browser warning; just use HTTP".
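To illustrate the "you can't fool all of the people all of the time" part, here is a rough sketch of comparing the key I saw against what my friends report having seen.  The function names, the fingerprint strings and the 0.8 threshold are all made up for illustration; how the reports actually travel over the mesh is not shown.

```python
def agreement(my_fingerprint, friend_reports):
    """Fraction of friends whose first-introduction fingerprint matches mine."""
    if not friend_reports:
        return 0.0
    matches = sum(1 for fp in friend_reports.values() if fp == my_fingerprint)
    return matches / len(friend_reports)

def looks_trustworthy(my_fingerprint, friend_reports, threshold=0.8):
    # Hypothetical policy: accept silently only if most friends saw the same key.
    return agreement(my_fingerprint, friend_reports) >= threshold

# What each friend saw for the same DNS name, from their own networks.
reports = {"alice": "aa11", "bob": "aa11", "carol": "aa11", "dave": "aa11"}

honest = looks_trustworthy("aa11", reports)    # I saw the same key they did
attacked = looks_trustworthy("ee99", reports)  # café attacker gave me a bogus key
print(honest, attacked)
```

An attacker sitting in my café can change what I see, but to pass a check like this they would also have to change what each of my friends saw.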

Badump Ching

If you've been paying attention I think you can see where I am going with this.

We (those of us in the open source hipster security noosphere) need to popularize this concept.  It's not that hard to implement, and people keep re-inventing it everywhere; it's mostly just about getting some browser vendor to think it's a good idea.

The acronym is TOFU POP MONK, so clearly we need a vegetarian monk - Buddhist seems most likely - who sings pop songs about how great tofu is.  We need it to go viral on the you tubes, and any other tubes that are appropriate.

(Graphic design nerds, and sports racers of all stripes, start your engines.  I challenge you.  Show me some awesome macroable meme images starring the Tofu-Pop Monk.  I will post any particularly compelling ones here.)