The Concurrency Spectrum: from Callbacks to Coroutines to Craziness

Thursday January 19, 2012

Concurrent programming idioms are on a spectrum of complexity.

Obviously, writing code that isn't concurrent in any way is the easiest. If you never introduce any concurrent tasks, you never have to debug any problems with things running in an unexpected order. But, in today's connected world, concurrency of some sort is usually a requirement. Each additional point where concurrency can happen introduces a bit of cognitive overhead, another place you need to think about what might happen, so as a codebase adds more of them it becomes more difficult to understand them all, and it becomes more challenging to understand subtle nuances of parallel execution.

So, at the simplest end of the spectrum, you have callback-based concurrency. Every time you have to proceed to the next step of a concurrent operation, you have to create a new function and new scope, and pass it to the operation so that the appropriate function will be called when the operation completes. This is very explicit and reasonably straightforward to debug and test, but it can be tedious and overly verbose, especially in Python where you have to think up a new function name and argument list for every step. The extra lines for the function definition and return statement can be an impediment to quickly understanding the code's intentions, so what facilitates understanding of the concurrency model can inhibit understanding of the code's actual logical purpose, depending on how much concurrent stuff it has to do. Twisted's Deferreds make this a bit easier than raw callback-passing without fundamentally changing the execution dynamic, so they're at this same level.

Then you have explicit concurrency, where every possible switch-point has to be labeled somehow. This is yield-based coroutines, or inlineCallbacks, in Twisted. This is more compact than using callbacks, but also more limiting. For example, you can only resume a generator once, whereas you can run a callback multiple times. However, for a logical flow of sequential concurrent steps, it reads very naturally, and is shorter, as it collapses out the 'def' and 'return' lines, and you have to think of at least two fewer names per step.

However, that very ease can be misleading. You might gloss over a 'result = yield ...' more easily than a 'def whatever(result): return result; something(whatever)'. Nevertheless, if you have 'yield's everywhere you might swap your stack, then when you have a concurrency bug, you can look at any given arbitrary chunk of code and know that you don't need any locks in it, as long as you can't see any yield statements. Where you do see yield statements, you know that you have some code that needs to be inspected.

To continue down that spectrum, a cooperatively multithreading program with implicit context switches makes every line with any function call on it (or any line which might be a function call, like any operator which can be overridden by a special method) a possible, but not likely culprit. Now when you have a concurrency bug you have to audit absolutely every line of code you've got, although you still have a few clues which will help you narrow it down and rule out certain areas of the code. For example, you can guess that it would be pathological for 'x = []; ...; x.append(y)' to context switch. (Although, given arbitrary introspection craziness, it is still possible, depending on what "..." is.) This is way more lines than you have to consider with yield, although with some discipline it can be kept manageable. However, experience has taught me that "with some discipline" is a code phrase for "almost never, on real-life programming projects".

All the way at the end of the spectrum of course you have preemptive multithreading, where every line of code is a mind-destroying death-trap hiding every possible concurrency peril you could imagine, and anything could happen at any time. When you encounter a concurrency bug you have to give up and just try to drink your sorrows away. Or just change random stuff in your 'settings.py' until it starts working, or something. I never really did get comfortable in that style. With some discipline, you can manage this problem by never manipulating shared state, and only transferring data via safe queueing mechanisms, but... there's that phrase again.

Some programming languages, like Erlang, support efficient preemptive processes with state isolation and built-in super-cheap super-fast queues to transfer immutable values. (Some other languages call these "threads" anyway, even though I would agree with Erlang's classification as "processes".) That's a different programming model entirely though, with its own advantages and challenges, which doesn't land neatly on this spectrum; if I'm talking about left and right here, Erlang and friends are somewhere above or below. I'm just describing Python and its ilk, where threads give you a big pile of shared, mutable state, and you are constantly tempted to splash said state all over your program.

Personally I like Twisted's style best; the thing that you yield is itself an object whose state can be inspected, and you can write callback-based or yield-based code as each specific context merits. My opinion on this has shifted over time, but currently I find that it's best to have a core which is written in the super-explicit callback-based approach with no coroutines at all, and then high-level application logic which wraps that core using yield-based coroutines (@inlineCallbacks, for Twisted fans).

I hope that in a future post, I may explain why, but that would take more words than I've got in me tonight.

I'm Sorry It's Come To This

Thursday December 15, 2011

If you want to be a great leader,

you must learn to follow the Tao.

Stop trying to control.

Let go of fixed plans and concepts,

and the world will govern itself.

- Tao Te Ching ch. 59,

as translated by S. Mitchell

I usually try not to get too political in my public persona – on blogs, twitter, IRC, mailing lists et cetera – and that's a conscious choice.

I work on open source software. I have for the last ten years. I am lucky enough to have founded a project of my own, but in open source, leaders are more beholden to their followers than vice versa. I depend on people showing up to effectively work for me, for free, on a regular basis. So, I try to avoid politics not because I don't have strong convictions (anyone who knows me personally can tell you that I certainly do) but because I don't want someone to avoid showing up and helping do some good in the world in one area, just because we might disagree in another.

This is a benefit of living in a free and democratic society: we have ways to dispute issues that we have strong feelings about, so we can cooperate on some things without having to agree on everything. It's rarely perfect but we can usually get some good stuff done, with rough consensus and running code.

Today though, there's a political issue which I can't ignore. The purpose of Twisted (the open source project which I founded) is to facilitate the transfer of information across the Internet. A new law, SOPA, is threatening to radically alter the legal infrastructure of the Internet in the United States, granting sweeping new powers to copyright cartels and fundamentally restricting the legal right to transfer any information, and to build tools that transfer it. Twisted is designed to make it easy to implement new protocols, to easily experiment with improvements to systems like the Domain Name System. SOPA might well make those potential improvements, and with only a little paranoid fantasizing, Twisted itself, illegal.

It's my view that this law is a blatantly unconstitutional restriction on free speech. It will kill job creation, at a time when our nation can scarce afford another blow to its economy. It will create the infrastructure to suppress political dissent, similar to the infrastructure in China and Syria, at a time when our corrupt political system needs dissent more than ever. It is the wrong thing at the wrong time.

This bill is being discussed in the house today. If you're in the US, call your representative right now.

(As always, I don't speak for anyone but myself; no one else has reviewed or endorsed these remarks.)

Blocking vs. Running

Friday November 04, 2011

I've heard tell of some confusion lately around what the term "non-blocking" means. This isn't the first time I've tried to explain it, and it certainly won't be the last, but blogging is easier than the job Sisyphus got, so I can't complain.

A thread is blocking when it is performing an input or output operation that may take an unknown amount of time. Crucially, a blocking thread is doing no useful work. It is stuck, consuming resources - in particular, its thread stack, and its process table entry. It is sucking up resources and getting nothing done. These are resources that one can most definitely run out of, and are in fact artificially limited on most operating systems, because if one has too many of them, the system bogs down and becomes unusable.

A thread may also be "stuck" doing some computationally intensive work; performing a complex computation, and sucking up CPU cycles. There is a very important distinction here, though. If that thread is burning up CPU, it is getting work done. It is computing. This is why we have computers: to compute things.

It is of course possible for a program to have a bug where a program goes into an infinite loop, or otherwise performs work on the CPU without actually getting anything useful to the user done, but if that's happening then the program is just buggy, or inefficient. But such a program is not blocking: it might be "thrashing" or "stuck" or "broken", but "blocking" means something more specific: that the program is sitting around, doing nothing, while it is waiting for some other thing to get work done, and not doing any of its own.

A program written in an event-driven style may be busy as long as it needs to be, but that does not mean it is blocking. Hence, event-driven and non-blocking are synonyms.

Furthermore, non-blocking doesn't necessarily mean single-process. Twisted is non-blocking, for example, but it has a sophisticated facility for starting, controlling and stopping other processes. Information about changes to those processes is represented as plain old events, making it reasonably easy to fold the results of computation in another process back into the main one.

If you need to perform a lengthy computation in an event-driven program, that does not mean you need to stop the world in order to do it. It doesn't mean that you need to give up on the relatively simple execution model of an event loop for a mess of threads, either. Just ask another process to do the work, and handle the result of that work as just another event.

2L2T: DjangoCon Feedback

Friday September 09, 2011

I've been having a great time over here at DjangoCon, but now that I've had an opportunity to relax and process some feedback from my talk, I have noticed a couple of themes to that feedback. This isn't really a full article, just a response, but it's too long to tweet.

If you're curious about the talk, you can view it here.

For the most part, the talk was exceedingly well-received and I want to thank the Django community both for the opportunity to speak and for the overwhelmingly positively response. Thanks for making an outsider to your community feel welcome and appreciated.

There have been a couple misconceptions though, and perhaps I didn't express myself clearly on a few points.

I realize that there are times – plenty of times, even – when using some component that's in a different language from your main application is the right choice. I wasn't trying to say "all Python all the time no matter what, no exceptions". I just want you all to consider that there is a cost to using a component that's in a different language, and you should be aware of that cost. It's not as simple as a tick-a-box feature comparison of the features and drawbacks of multiple products. If I came out as sounding really extreme on this, it just was to provoke a response.
You can have an architecture which is driven by Python and organized by Python without actually having all the implementation be in Python. For example, an inordinate number of people asked me about memcache. If you want using something like that, sure, use memcache, there's not a lot that it being in Python would buy you. Some might say that the whole point of memcache is that it isn't very deeply configurable and doesn't have much in the way of behavior. Plus, it's an internal component, not an externally visible service, so even my usual flimsy "no buffer overflows" argument doesn't really hold up; it's more like a library than a server. You can incorporate memcache into a Python-in-the-driver's-seat architecture by spawning memcache from your Python process instead of making memcache a configuration dependency. That way, you don't need a separate configuration file and a separately managed service or a chef script that boots memcache for you before your application. This applies equally well to any other, similar services: write their config files from your Python code, and start them automatically.

Finally, thanks to everyone who really thought about what I said, took the time to respond, and prompted me to write this.

ἁγιολογία for r0ml

Sunday June 19, 2011

I have the rare distinction of being a second-generation software developer. Most recently, I mentioned this in an interview when asked who my programming heroes are. It might sound kind of corny, but I'm serious when I say that my father is my programming hero.

My dad had a cool hacker alias in the seventies. He's been known as "r0ml" around the web since before there was a web. If you are in a particularly typographically hip part of the internet, it might even be "RØML". How many of your parents have a nom de plume with a digit, or a non-ASCII character in it? Or, for that matter, any kind of hacker pseudonym?
I had the good fortune to work with one of r0ml's colleagues, Amir Bakhtiar. Amir paid me one of the highest compliments I've ever received: he said that the code for systems I've worked on is similar to r0ml's in its style and exposition. My dad taught me how to program in x86 assembler, and in that process, I learned a lot about the way he thought about solving problems and building systems. I regard thinking that well, or even comparably well, as a real achievement.
That's not to say that I would do everything exactly the way that he does. For example, he writes a lot of networking code in Java. He doesn't use Twisted, for the most part. If you know me and you know my dad, you know that we disagree on plenty of stuff.
Unlike the stereotypical, often-satirized filial argument, these discussions are something I look forward to. Disagreeing with my dad is still one of the most intellectually challenging activities I've ever engaged in. Whenever I have a conversation with him about a topic where he has a different view, I come away enlightened – if not necessarily convinced.
Conversations among my friends occasionally turn to the topic of our respective upbringings, as they do in any close group. One of the recurring themes of my childhood is that, while my siblings and I were sometimes told to be quiet, we were never told to be quiet because our opinions weren't valuable. Sometimes we were told in unequivocal terms that we were wrong, of course. However, my dad always encouraged us to present our thoughts. Then, he wouldn't pull any punches in relentlessly refuting our arguments, using a combination of facts, estimates, calculations, and rhetorical flourishes. I learned more about influencing people and thinking clearly around the dinner table than in my entire formal education.
r0ml always questions glib answers, challenges the official version of events, distrusts things that are "intuitively obvious" or "common sense". The skepticism I've developed as a result of his consistent example has rarely led me astray. Glib answers, official versions, and common sense are frequently, if not always, wrong. He taught me to search for the non-intuitive answer, the surprising inflection point in the data.
In a roundabout way, he also taught my siblings and I how to perform some delightful rhetorical flourishes of our own, but also not to trust them. Pretty phrases can be deployed equally effectively in the service of illustration or deception. Although I can appreciate that parents often come to a point where they've had enough and a little deception can be a useful thing.
One cannot be a practiced rhetorician without a heaping helping of eclectic life experience; r0ml has that too. He's a fencer. And a juggler. He still has the highest score on Space Harrier of anyone I've ever met. (I can remember a crowd gathering in an arcade to see him start level 18.) He's an avid scholar of medieval thought and custom. For that matter, he's an avid scholar of a couple dozen other things, but listing them all would take a whole day.
He has the common occupational affliction of being a science fiction fan. However, fandom was never an identity for him. Again, by consistent example, he taught me to focus on my own creativity, and do something cool, never to just passively consume others' ideas. He treats entertainment as an inspiration, rather than an escape. For instance, one of the earliest memories I have about my father talking about software is a reference to the movie "Terminator". (Please keep in mind that this memory is ~20 years old at this point, so it might not be terribly accurate.) I remember him saying something like "All software should be relentless. If you remove its legs, it should use its arms. Whatever errors it encounters, it should deal with them, and keep going if it can."
Nevertheless, seeing "Tron: Legacy" with my dad, the hacker, in IMAX 3D, 20 years after we saw the original together... I didn't need to take a life lesson from that to think it was pretty rad[1].
Unlike many quiet geniuses who labor in obscurity, dispensing wisdom only to a fortunate few, r0ml is a somewhat notorious public speaker. You can see him this year at OSCON. If you hunt around the web, you can find some video examples of his previous talks, like this great 30-second interview[2] about the nature of open source process, from a talk he gave in 2008 (audio of the full talk here).
([1]: Although, jeez, what was the point of that whole open-source subplot at the beginning? It seemed like a great idea, but then it went absolutely nowhere!)
([2]: Speaking of not doing things exactly the way he does - where he uses a metaphor to "single-threading" and "multi-threading", I would have said "blocking" and "event-driven" - but more on that in a future post.)
Happy Father's Day, r0ml.