I have the rare distinction of being a second-generation
software developer. Most recently,
I mentioned this in an interview when asked who my programming
heroes are. It might sound kind of corny, but I'm serious when I say
that my father is my programming hero.
My dad had a cool hacker alias in the seventies.
He's been known as "r0ml" around the web since before there was a
web. If you are in a particularly typographically hip part of the
internet, it might even be "RØML". How many of your parents have a nom
de plume with a digit, or a non-ASCII character in it? Or, for that
matter, any kind of hacker pseudonym?
I had the good fortune to work with one of r0ml's colleagues, Amir Bakhtiar.
Amir paid me one of the highest compliments I've ever received: he
said that the code for systems I've worked on is similar to r0ml's in its
style and exposition. My dad taught me how to program in x86
assembler, and in that process, I learned a lot about the way he thought
about solving problems and building systems. I regard thinking that
well, or even comparably well, as a real achievement.
That's not to say that I would do everything exactly the way that he does.
For example, he writes a lot of networking code in Java. He
doesn't use Twisted, for the most part. If you know me and you know my
dad, you know that we disagree on plenty of stuff.
Unlike the stereotypical, often-satirized filial argument, these
discussions are something I look forward to. Disagreeing with my dad
is still one of the most intellectually challenging activities I've ever
engaged in. Whenever I have a conversation with him about a topic
where he has a different view, I come away enlightened – if not necessarily
convinced.
Conversations among my friends occasionally turn to the topic of our
respective upbringings, as they do in any close group. One of the
recurring themes of my childhood is that, while my siblings and I were
sometimes told to be quiet, we were never told to be quiet because our
opinions weren't valuable. Sometimes we were told in unequivocal terms
that we were wrong, of course. However, my dad always
encouraged us to present our thoughts. Then, he wouldn't pull any
punches in relentlessly refuting our arguments, using a combination of
facts, estimates, calculations, and rhetorical flourishes. I learned
more about influencing people and thinking clearly around the dinner table
than in my entire formal education.
r0ml always questions glib answers, challenges the official version of
events, and distrusts things that are "intuitively obvious" or "common sense".
The skepticism I've developed as a result of his consistent example
has rarely led me astray. Glib answers, official versions, and common
sense are frequently, if not always, wrong. He taught me to search for
the non-intuitive answer, the surprising inflection point in the data.
In a roundabout way, he also taught my siblings and me how to perform some
delightful rhetorical flourishes of our own, but also not to trust them.
Pretty phrases can be deployed equally effectively in the service of
illustration or deception. Although I can appreciate that parents
often come to a point where they've had enough and a
little deception can be a useful thing.
One cannot be a practiced rhetorician without a heaping helping of eclectic
life experience; r0ml has that too. He's a fencer. And
a juggler. He
still has the highest score on Space Harrier of anyone
I've ever met. (I can remember a crowd gathering in an arcade to see
him start level 18.) He's an avid scholar of medieval thought and
custom. For that matter, he's an avid scholar of a couple dozen other
things, but listing them all would take a whole day.
He has the common occupational affliction of being a science fiction fan.
However, fandom was never an identity for him. Again, by consistent
example, he taught me to focus on my own creativity, and do something cool,
never to just passively consume others' ideas. He treats entertainment
as an inspiration, rather than an escape. For instance, one of the
earliest memories I have about my father talking about software is a
reference to the movie "Terminator". (Please keep in mind that this
memory is ~20 years old at this point, so it might not be terribly
accurate.) I remember him saying something like "All software should
be relentless. If you remove its legs, it should use its arms.
Whatever errors it encounters, it should deal with them, and keep
going if it can."
Nevertheless, seeing "Tron: Legacy" with my dad, the hacker, in IMAX
3D, 20 years after we saw the original together... I didn't need to take a
life lesson from that to think it was pretty rad[1].
Unlike many quiet geniuses who labor in obscurity, dispensing wisdom only to
a fortunate few, r0ml is a somewhat notorious public speaker.
You can see him this
year at OSCON. If you hunt around the web, you can find some video
examples of his previous talks,
like this great 30-second interview[2] about the nature of open
source process, from a talk he gave in 2008 (audio of the
full talk here).
([1]: Although, jeez, what was the point of that whole open-source subplot
at the beginning? It seemed like a great idea, but then it went
absolutely nowhere!)
([2]: Speaking of not doing things exactly the way he does - where he
uses a metaphor of "single-threading" and "multi-threading", I would have
said "blocking" and "event-driven" - but more on that in a future post.)
Happy Father's Day, r0ml.
The Presentation
The inimitable Zooko recently made me aware of an excellent presentation about HTTPS: "It's Time to Fix HTTPS", by Chris Palmer.
The presentation is both hilarious and illuminating; I highly recommend you view it right away. It's not saying anything that I haven't been thinking for a very long time. Except the thing about how IE can silently add certificates to your root CA store, that was definitely new, and a little depressing. But this is a somewhat esoteric topic and it needs to be made more popular for the everyday user. Sexy, even.
A Brief Review
(But seriously, go read the slides, they're more entertaining.)
Internet security is based on trust. The math behind modern cryptography doesn't ensure anything beyond that you're talking to someone that holds a particular special secret ("private key"). You can verify that the party you're talking to has the same key as the one you talked to last time, and that a particular private key corresponds to a particular public key, but that's about it. The public key can be published for everyone to see without risking any of the secrets being sent, but you still need some way to determine whether the public key actually belongs to the person you want to talk to. So, in order to have a secure system, you have to layer some rules on top of that which give you some way to know whether that private key corresponds to an identity that you care about and trust.
The current system goes something like this: each web browser vendor decides, more or less at random, on a group of entities we will all trust completely. By virtue of the trust of the software, they become the authorities who can decide whose public keys are valid. Actually, a public key isn't quite enough: you need a key plus some metadata about the person sending it: we call this a "certificate". So these entities are termed "certificate authorities". The browser vendors tend to decide on the same group, because there's a lot of social pressure to maintain a list that makes sense (and also, anybody who gets accepted by one browser but denied by another can't really sell certificates: the whole point of this exercise is to sell things that make the little lock icon come up, so you know your web shopping cart is "secure").
The problem with this system is that almost all of these "completely trustworthy" entities are enormous companies or, possibly, even foreign governments, which have diverse motivations and huge amounts of legitimate business to conduct, making it very hard to spot a small amount of malfeasance. (Although there is some good news: people do notice, and they freak the hell out when they do; so at least there's some policing of the current system.) One compromised certificate authority (and there are lots and lots to try and compromise) means a complete "game over" for everybody who uses a web browser and trusts the little lock icon.
Basically there's no such thing as "completely trustworthy". There's only: do I trust you.
The Next Step
The solution that Mr. Palmer proposes is extremely similar to the one which I thought I originally devised in about 2004, but probably was floating around in the security zeitgeist even before that. It's a combination of 3 general principles:
Trust On First Use
Basically, the first time I see you, on the internet, it's unlikely that you're trying to trick me. So you can give me any old public key, and I'll accept that it's you.
Mr. Palmer gives this one a catchy pseudonym, "TOFU", which I quite like (and I guess is pretty widely known at this point).
Persistence Of Pseudonym
The important point is that then I remember that it's you, forever, so it's very hard to attack our communications after that point.
I'll come up with a name for you (let's say "Bob Smith" or "The Most Secure Bank In The World Dot Com"), and my software will make sure that it sticks to that public key. You can potentially tell me that your key has changed, but you'd better be prepared to present your old key, otherwise I have to get re-introduced to you, and now I'm suspicious that something may have been fishy. Especially if some other thing shows up and says "Hi, it's Bob Smith" (with the correct, old public key) - "Hey, who's this guy?"
This is referred to as "POP". Also pretty catchy.
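To make TOFU and POP a little more concrete, here is a minimal sketch in Python. Everything in it is my own illustration rather than anything from Mr. Palmer's slides: the `PinStore` class, its method names, and the use of a SHA-256 fingerprint are all assumptions. In particular, the `proof_of_old_key` flag is only a stand-in for what a real system would have to do, which is verify a signature made with the old private key.

```python
import hashlib


class PinStore:
    """Hypothetical in-memory store of name -> pinned key fingerprint."""

    def __init__(self):
        self._pins = {}

    @staticmethod
    def fingerprint(public_key):
        # Assumption: a SHA-256 digest of the raw key bytes identifies the key.
        return hashlib.sha256(public_key).hexdigest()

    def check(self, name, public_key):
        seen = self.fingerprint(public_key)
        pinned = self._pins.get(name)
        if pinned is None:
            # Trust On First Use: accept whatever key we see first.
            self._pins[name] = seen
            return "first-use"
        if pinned == seen:
            # Persistence Of Pseudonym: same name, same key as before.
            return "ok"
        # Same name, different key: time to be suspicious.
        return "suspicious"

    def rotate(self, name, new_public_key, proof_of_old_key):
        # A key change is only accepted for a known name, and only when the
        # caller has shown it still controls the old key. The boolean is a
        # placeholder for a real signature check.
        if name in self._pins and proof_of_old_key:
            self._pins[name] = self.fingerprint(new_public_key)
            return True
        return False


store = PinStore()
print(store.check("bank.example", b"key-1"))  # first sighting of this name
print(store.check("bank.example", b"key-1"))  # same key again
print(store.check("bank.example", b"key-2"))  # a different key shows up
```

A real implementation would persist the pins to disk and would need to decide what the user sees in the "suspicious" case, which is exactly the problem the third principle tries to address.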
Mesh Overlay Network Keysigning
The third concept Mr. Palmer refers to as a "trustiness metric" which includes "perspectives", and says "You can't fool all of the people all of the time". He includes some other stuff in his trustiness metric here, but I'm going to extrapolate from that sentence:
It's really, really easy to sit down in a café and intercept some of my network traffic. It takes about 2 minutes to collect a dozen passwords this way, on today's mostly-not-encrypted internet. So it would be very easy for someone to break this system if all you had was a little re-introduction warning; users might not understand it and just click anyway, and then it's just as broken as (if not worse than) the current model; at least in the current model, normal users don't usually get those warnings, and they're "safe" if they're looking for the lock, but in this new model, users would get them for all new secure introductions. So we need something better.
It's not so easy to sit down in a café and intercept network traffic from me and also intercept traffic from my friend, on a different network, doing a different thing. You have to know where my friend is. You have to be able to intercept our pre-arranged secure communication (I already remember all my friends' keys when I first see them, you'll recall). If you're a casual attacker who just wants to sniff a couple of credit card numbers at the local Starbucks, you probably don't have the resources to do that, even for a single individual.
It is definitely not easy to figure out where every single one of my currently-online friends - let's say Facebook friends, because maybe they finally care about security now - is online from, and also attack their networks simultaneously, to provide exactly the same bogus first-introduction certificate to Super Secure Bank Dot Com. This is a level of sophistication and coordination that not even most governments can muster.
So if we had a reasonably available mesh overlay network, where I can tell my friends, and my friends can tell their friends (etc forever) about first-introduction key correspondence with DNS names, and legitimate changes to keys where the site operator has had a security problem, then we could address many of these issues much more robustly than we can today. It might not be perfect, but it would silently work often enough that it would be much better than today's default of "bah, I don't know why you're getting the browser warning; just use HTTP".
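The friends-telling-friends idea boils down to comparing the key I was shown against the keys my peers say they were shown. Here is a rough sketch of that comparison, again in Python and again entirely hypothetical: the `peer_agreement` function, the list-of-fingerprints input, and the 75% threshold are assumptions of mine, not part of any existing protocol, and the hard part (actually collecting reports over a mesh network) is left out.

```python
def peer_agreement(my_fingerprint, peer_reports, threshold=0.75):
    """Return True if enough peers saw the same key fingerprint that I did.

    peer_reports is a list of fingerprints that friends (and friends of
    friends) report having seen for the same DNS name. The threshold is an
    arbitrary illustrative value.
    """
    if not peer_reports:
        # Nobody can corroborate the key, so we are back to plain TOFU.
        return False
    matching = sum(1 for reported in peer_reports if reported == my_fingerprint)
    return matching / len(peer_reports) >= threshold


reports = ["aa11", "aa11", "aa11", "bb22"]
print(peer_agreement("aa11", reports))  # most peers saw the key I saw
print(peer_agreement("zz99", reports))  # nobody else saw the key I was shown
```

An attacker in the café can change what I see, but to get past a check like this they would also have to change what most of my peers see, which is the whole point.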
Badump Ching
If you've been paying attention I think you can see where I am going with this.
We (those of us in the open source hipster security noosphere) need to popularize this concept, because it's not that hard to implement, and people keep re-inventing it everywhere; it's mostly just about getting some browser vendor to think it's a good idea.
The acronym is TOFU POP MONK, so clearly we need a vegetarian monk - Buddhist seems most likely - who sings pop songs about how great tofu is. We need it to go viral on the you tubes, and any other tubes that are appropriate.
(Graphic design nerds, and sports racers of all stripes, start your engines. I challenge you. Show me some awesome macroable meme images starring the Tofu-Pop Monk. I will post any particularly compelling ones here.)
If you're like me, occasionally you grab the latest version of a bzr branch
onto your laptop before you're going somewhere without network access. But,
as you're about to leave, you glance over at your laptop screen, and you see
the dreaded:
bzr: ERROR: These branches have diverged. Use the missing command to see how.
Use the merge command to reconcile them.
but you don't have time to do a merge, and wait for the (reliably agonizingly slow) network round trip to negotiate with the server about what the latest revision is - the train's about to leave, or you're late for your flight, or the cafe is closing and you need to shut your laptop right now. Sadness! You continue to work on a diverged branch and merge later. Which is a shame, because mechanically dealing with merge conflicts or just making sure the tests still pass after what looks like a trivial merge is exactly the sort of thing which is convenient to do when you're stuck waiting at a network-access-free bus stop.
As it turns out, Bazaar has actually already done all the hard work necessary for you to just go ahead and do that merge when you get to your potentially non-networked destination. The diverged revisions have already been pulled into your branch and are just sitting there, waiting to be merged, but you can't see them. The 'bzrtools' plugin provides the 'heads' command, which you can use to reveal the previously invisible revision. You can then just 'merge .' instead of merging from your usual pull location, as long as you specify the appropriate revision.
To demonstrate, here's a transcript of a sample session which simulates this common problem:
First, set up a branch:
you@computer:~$ mkdir tmp
you@computer:~$ cd tmp
you@computer:~/tmp$ mkdir a
you@computer:~/tmp$ cd a
you@computer:~/tmp/a$ bzr init
Created a standalone tree (format: 2a)
you@computer:~/tmp/a$ touch initial.txt
you@computer:~/tmp/a$ bzr add
adding initial.txt
you@computer:~/tmp/a$ bzr ci -m "inital revision"
Committing to: /Domicile/glyph/tmp/a/
added initial.txt
Committed revision 1.
We'll call 'a' the 'server' branch. Next, let's make a branch that represents the 'on the go' branch, your local working copy:
you@computer:~/tmp/a$ cd ..
you@computer:~/tmp$ bzr get a b
Branched 1 revision(s).
Now, it's time to diverge. Let's give each branch its own revision.
you@computer:~/tmp$ cd a
you@computer:~/tmp/a$ touch a.txt
you@computer:~/tmp/a$ bzr add
adding a.txt
you@computer:~/tmp/a$ bzr ci -m 'revision from a'
Committing to: /Domicile/glyph/tmp/a/
added a.txt
Committed revision 2.
you@computer:~/tmp/a$ cd ../b/
you@computer:~/tmp/b$ touch b.txt
you@computer:~/tmp/b$ bzr add
adding b.txt
you@computer:~/tmp/b$ bzr ci -m 'revision from b'
Committing to: /Domicile/glyph/tmp/b/
added b.txt
Committed revision 2.
Now, it's time to get on that sad, wifi-free train. Let's make sure we're up to date with 'a' first...
you@computer:~/tmp/b$ bzr pull ../a
bzr: ERROR: These branches have diverged. Use the missing command to see how.
Use the merge command to reconcile them.
[Error: 3]
Oh no! But, here comes 'bzr heads' to the rescue:
you@computer:~/tmp/b$ bzr heads --dead
HEAD: revision-id: you@computer-123456 (dead)
committer: You <you@computer>
branch nick: a
timestamp: now-ish
message:
revision from a
Now you know what the revision ID of the already-pulled-but-not-visible revision is - the tip of 'a', in other words. Now you just need to ask 'b' to merge it:
you@computer:~/tmp/b$ bzr merge . -r you@computer-123456
+N a.txt
All changes applied successfully.
you@computer:~/tmp/b$ bzr ci -m 'merge from a'
Committing to: /Domicile/glyph/tmp/b/
added a.txt
Committed revision 3.
Done! And, as you can see when you get back to your cozy 10gigE fiber connection at home, or whatever you happen to have, you see that the revision you've merged lines up neatly with 'a':
you@computer:~/tmp/b$ bzr pull ../a
No revisions to pull.
you@computer:~/tmp/b$
Et voila. I hope this saves somebody some time when dealing with failed pulls.
For those of you who may be curious about the use-case, if you don't have it: I rarely encounter this with actual codebases I work on, as I tend to have a local trunk mirror, and features are neatly segregated into branches. It comes up more frequently in my personal configuration-files repository, where I make little changes to my desktop, little changes to my laptop, and then want to get out the door quickly with the latest merged copy. I was so happy when #bzr on freenode (thanks, spiv!) solved this problem for me that I just had to share.
The open-source event-driven networking engine that I work on is called
"Twisted".
If you're uncomfortable using something that sounds like an adjective in a
place where a noun should go, the following noun phrases are
equivalent:
- the Twisted project
- the Twisted engine
- the Twisted networking engine
- the Twisted framework
I can understand that there is some confusion around this stuff, since these words often appear in close proximity, but to my knowledge there is nothing called "Python Twisted", "Twisted Python", or "Twisted Matrix". There's "python-twisted", which is the package name that some operating systems use to package Twisted. There is also "twisted.python", which is a python package within Twisted itself. Finally there is "twisted-python@twistedmatrix.com", which is the mailing list for discussing Twisted stuff in the Python programming language. (This discussion list is so named to distinguish it from the possibility of not-quite-hypothetical discussion of Twisted implemented in other languages, although no other implementations are currently actively maintained.)
I just thought you'd all like to know that. That is all. (For now, anyway.)
Jean-Paul Calderone
continues his excellent "Twisted Web In 60
Seconds" tutorial series. If you haven't checked it out yet, you
should!