Slaying Medusa

Wednesday February 01, 2006
That is not dead, which can eternal lie
and in strange aeons, even death may die
- HPL


Apparently some guy is trying to resurrect Medusa. I wasn't going to say anything until he mentioned Twisted directly, calling it "unprofitable", and "a complex maze of orthogonal but impractical interfaces", but IT'S ON NOW.

Medusa was great for its time. I read the whole thing before I started work on Twisted. In fact, Medusa can be credited with opening my eyes to the problems with threads. Somewhere, locked in a lead drum, encased in concrete, the very first version of Twisted Reality for Python can be found, using threads for concurrency and pickle for serialization.

However, even when I was starting with Twisted - circa 1999 - Medusa was already showing its age.

The main problem with Medusa is that it does too little. For example, Windows sockets are different from UNIX sockets in a variety of ways you probably aren't going to anticipate if you are calling self.recv and such in an asyncore.dispatcher. Those are just the differences in sockets - Twisted handles threads, pipes, and processes too, which a cursory look at the new "Allegra" reveals only rudimentary support for.

95% of the time, you won't notice these problems. As they're related to operating-system error reporting, they tend to show up only when your program is under load. In other words, you won't notice any issues with your program until a lot of people are using it and there is a huge amount of pressure on you to fix it immediately. Luckily Allegra has no automated tests, so you won't have to be troubled by finding out about bugs too early. Twisted has more than 2500 tests run on 4 different platforms with each commit.

Don't believe me that there will be issues with your super-simple cross-platform select loop? By way of an example, after years of maintaining their own networking core, BitTorrent has started using Twisted, with the unobtrusive changelog message of "TCP Stack flaking out bug fixed, using Twisted".

Even given the testing situation, Medusa (I'm sorry, "Allegra") is, indeed, quite a bit simpler than the full Twisted suite, and an adequate solution to 95% of the asynchronous-socket-programming problems out there. However, it's only adequate, whereas I think Twisted is excellent.

As per my last post, there is a lot of complexity in some projects surrounding Twisted, and some of it is unnecessary. That complexity isn't there if you are just trying to get basic asynchronous I/O going, though. If you are considering Medu^HAllegra for some reason, have a look here first: Writing Servers with Twisted. If you can't figure that out, I've just saved you some time: you should just stop trying to write networked programs. "a complex maze", "impractical"??? The simplest working server is just 4 lines of code: an import, a class statement, a method definition, and a method body. You can really be productive that fast. I don't think it's even possible to be simpler than that in a Python framework.

A similar document for Medusa, Asynchronous Socket Programming, has more than twice as many words, and the simplest example is more than 4x longer.

As to "unprofitable", well, let me just direct you to a page that describes people's profitable experiences with Twisted: including the quote "The quality of the Twisted networking core is unmatched in the open- or closed-source arenas".

I'm glad to see someone doing a project that aims to do something similar to Twisted, since that indicates it's a thing that people need doing. I'd love to push for some interoperability standards if a different framework were to emerge that had different implementation strengths than Twisted does (and goodness gracious, it could certainly be faster. Twisted can only handle a few tens of thousands of requests per second on high-end hardware, C++ servers are in the millions these days).

However, there are a few things such a framework would have to get right first:
  1. Write some automated tests. Seriously, it's 2006 already, unit tests are an accepted best practice pretty much anywhere that people care about writing programs that work even some of the time.
  2. Be clear about what you're trying to do differently, or don't say anything at all. Here's a hint: any phrase involving the word "lightweight" is not specific. Reading the descriptions of Python frameworks (although this mainly applies to tiny one-off web frameworks) I am reminded of the descriptions of WindowMaker themes at the beginning of the themes craze five years or so ago. "It's a lightweight, simple, fast, clean, dark theme, with a picture of Neo on the desktop" probably described 4 out of every 5 themes to hit the site. The first four adjectives regularly show up in descriptions of frameworks. Where were all the heavy, slow, dirty, frameworks that these are in response to? If these are valid design tradeoffs, why don't people ever advertise as the opposite? Those four adjectives are basically meaningless - when an author of some software says it's "fast" and "lightweight" they just mean "I know what it does, so it doesn't surprise me when it's slow, and I can get things done quickly with it, because I wrote it." If it were measurably fast or small they would be quoting statistics at you about how many requests per second it can perform or how small the image can be compressed.
  3. Have an application. It looks like Mr. Allegra is, in fact, driving the requirements for his new-old framework from an application, and that's good. If his approach doesn't work well, he'll find that out, rather than assuming that it is fine.
  4. Finally, specific to asynchronous I/O frameworks: separate your transport from your protocols. Don't force your users to fetch data themselves from the OS; there is just no reason for that. It is a real pain to implement a low-level enough API in Twisted for compatibility, if your users expect to diddle the sockets themselves.