What I Wish Tornado Were

Friday September 11, 2009
FriendFeed has released its web server, Tornado.  It seems like everyone's blogging about it, and it's obviously relevant to my interests, so I feel like I should say something.

Let me start with the good stuff.  First of all, I think it's great that we have yet another asynchronous contender in the Python world.  Every time something like this comes out, it means that Twisted has to fight that much less hard to get over the huge hump of event-driven programming being too hard, or too weird, or whatever.  It's good to have an endorsement of the general message "if you need a web server to handle COMET requests, it needs to be asynchronous to perform acceptably" from such a high-profile company as Facebook.

Unfortunately I think the larger picture here is a failure of communication in the open source community.  In the course of developing Tornado, there are several things that FriendFeed could have done to move the Twisted community forward, at no cost to themselves.  I don't want to rag on FriendFeed, or Bret Taylor, or Facebook here; they're not the first to re-write something without communicating.  In fact I recently had almost this exact same discussion with another project that did the same thing.  Since Tornado is such a high-profile example, though, I want to draw attention to the problem so that there's some hope that maybe the next project won't forget to communicate first.

My main point here is that if you're about to undergo a re-write of a major project because it didn't meet some requirements that you had, please tell the maintainers of the project you are rewriting what you are doing.  In the best case scenario, someone involved with that project will say, "Oh, you've misunderstood the documentation, actually it does do that".  In the worst case, you go ahead with your rewrite anyway, but there is some hope that you might be able to cooperate in the future, as the project gradually evolves to meet your requirements.  Somewhere in the middle, you might be able to contribute a few small fixes rather than re-implementing the whole thing and maintaining it yourself.

This is especially important if you are later going to make claims about that project not living up to your vaguely-described requirements, and thereby damage its reputation.  Bret Taylor claims in his blog:

We ended up writing our own web server and framework after looking at existing servers and tools like Twisted because none matched both our performance requirements and our ease-of-use requirements.

First and foremost, it would have been great to hear from Bret when he started off using Twisted about any performance problems or ease-of-use problems.  I'm guessing that Twisted itself had only ease-of-use problems, and other "tools like Twisted" were the ones with performance problems, since later, in a comment on the same post, he says:

I can't imagine there is much of a performance difference [between Twisted Web and Tornado].  The bottom is not that complex in my opinion.

It would also be great if he had explicitly said that Twisted didn't have performance problems rather than making me guess, because I'm sure that is what lots of developers will take away from this.  When you have the bully pulpit, off-the-cuff comments like this can do serious damage to smaller projects.

More to the point, what is the problem with "ease of use", exactly?  The fact that he found Deferred tedious, in particular, seems very strange to me, given that it is so un-tedious that it has become a de-facto standard even in the JavaScript community.  We had no opportunity to help him or anyone else out, because as far as I can tell from searching our archives, we never heard from him or from anyone else at FriendFeed when they were trying out Twisted at first.  Even as he's saying that Twisted is hard to use and (maybe?) performs poorly, he isn't pointing to any particular example of what about it is hard to use, or what performs poorly.  There's still nothing we can do to address this criticism.  And there's still not much we can do to make sure that future potential Twisted users won't have this problem.

Later, in yet another comment, Bret points out the root problem:

... the HTTP/web support in Twisted is very chaotic (see http://twistedmatrix.com/trac/wiki/WebDevelopme... - even they acknowledge this)...

This is true.  However, as I frequently like to note, Twisted is starved for resources.  Reconciling the chaos described on the page about web development with Twisted is an ongoing process.  For a tiny fraction of the effort invested in Tornado, FriendFeed could have worked with us to resolve many of the issues creating that chaos.

This is the main thing I want to reinforce here.  If half a dozen occasional contributors with a real focused interest in web development showed up to help us on Twisted, we'd have an awesome, polished web story within a few months.  If even one person really took responsibility for twisted.web, things would pick up.  But if everyone who wants an asynchronous webserver either uses twisted.web (because it's great!) without talking to us or decides not to use it (because it doesn't meet their unstated requirements) without talking to us, it's going to continue to improve at the same sluggish pace.

Even at the current rate, by the time we have an excellent HTTP story, I somehow doubt that Tornado will have a good SSHv2 protocol story ;-).

In his comment, Bret also takes a couple of pot-shots at Twisted that I think are unnecessary, and I'd like to address those too.

In general, it seems like Twisted is full of demo-quality stuff, but most of the protocols have tons of bugs.

We're not talking about "most" of the protocols here; Tornado is only concerned with HTTP.  And the HTTP implementation(s) in Twisted do not have "tons of bugs".  They are production quality, used on lots of different websites, and have lots of automated tests.  While much of the code in twisted.web doesn't have complete test coverage, since it's old enough to predate our testing requirements, I note that Tornado appears to have zero test coverage.

There's a kernel of truth here — some of the older, less frequently used protocols have a few problems — but in most cases the "bugs" are really just a lack of functionality.  Twisted overall has very few protocol-related bugs, and again, our test policy makes sure that new bugs are introduced very rarely.

Given all those factors, it didn't seem to provide a lot of value. Our core I/O loop is actually pretty small and simple, and I think resulted in fewer bugs than would have come up if we had used Twisted.

I must respectfully disagree.  Again, I don't want to rag on FriendFeed here, but here are several features that Tornado would have, and bugs that it wouldn't have, if it used Twisted for the event loop and none of the HTTP stuff:
  1. EINTR wouldn't cause your application to exit if run in a non-US-english locale.
  2. You wouldn't have the opportunity to forget to set a socket to be non-blocking and thereby make your entire application stop.
  3. It would be possible to run your application on Windows.
  4. Firewalled connections and running out of file descriptors wouldn't cause your server to spew errors forever (at least, it won't any more).
  5. You could write a TCP client that didn't block for an arbitrary amount of time in connect().
  6. Finally, of course, you could use all of Twisted's other protocols, client and server: IMAP, POP, SMTP, IRC, AIM, etc.  You could also use external protocol implementations like Thrift.
  7. You could spawn asynchronous subprocesses.
and this is a very short list, based on a cursory reading of the source code, without actually running Tornado and without a particularly deep audit.  Some of these bugs might not be as serious as I think, and there might be plenty of other bugs.  But I can't really be sure what works, since again: there are no automated tests.
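
To make items 2 and 5 of that list concrete, here is a minimal, standard-library-only sketch of a TCP connect that never blocks inside connect().  It is an illustration, not code from Twisted or Tornado: the function name nonblocking_connect and its timeout parameter are made up for this example, it is POSIX-oriented, and a private selector with a timeout stands in for what would be the application's event loop in a real server.

```python
import errno
import os
import selectors
import socket


# Hypothetical helper for illustration; not part of Twisted or Tornado.
def nonblocking_connect(address, timeout=5.0):
    """Connect to `address` without blocking inside connect().

    Returns the connected, non-blocking socket.  Raises OSError if the
    connection fails, or TimeoutError if it does not finish in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Item 2: forget this one call and every later operation can block.
    sock.setblocking(False)

    # connect_ex() returns an errno instead of raising.  On a
    # non-blocking socket it normally reports "in progress" right away.
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))

    # Item 5: wait (with a bound) for writability, which signals that
    # the connection attempt has finished one way or the other.
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_WRITE)
        if not sel.select(timeout):
            sock.close()
            raise TimeoutError("connect timed out")

    # SO_ERROR says whether the finished attempt actually succeeded.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock
```

Calling nonblocking_connect(("127.0.0.1", 8080)) (the port is just a placeholder) against a listening server returns a connected socket; against a closed port it raises OSError.  Even this small version has to get three things right: the setblocking call, the in-progress errno values, and the SO_ERROR check.  That is exactly the kind of detail a shared I/O layer gets right once for everyone.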

This list is a great example of why projects like Tornado really should use Twisted.  Tornado implements some innovative web-framework stuff, but absolutely nothing interesting that I can see at the level of async I/O.  Using Twisted would have allowed them to focus exclusively on cool web things and left the never-ending stream of incremental, surprising, platform-specific, only-happens-in-weird-situations bugfixes to a single, common source.

What To Do Now

I hope that someone at FriendFeed will be a little heavier on detail and a little lighter on FUD in some future conversation about Twisted.  However, I'm sure they're going to have their hands full maintaining their own code, so I don't have high expectations in this area.  I'm sure Bret wasn't intentionally slamming Twisted, either; it wasn't like he wrote a big screed about it, he just dropped a few unsubstantiated comments into a much larger post about Tornado. So I just want to be clear: I don't have sore feelings, and I don't need anybody to apologize to me or to Twisted.

If any of you out there are fans of both Tornado and of Twisted, it would be great if you could contribute a patch to Tornado which would allow it to at least optionally use Twisted as an I/O back-end.  It would be great, of course, if lots of people interested in web stuff would help us out with our web situation, but supporting the Twisted event loop would be good regardless. It would mean that when people wanted to speak multiple protocols, they wouldn't need to re-write or kludge in their existing Tornado application, so it would increase the chances that we could get some help with our SSH, FTP, IRC, or XMPP code instead.  It would also open up a much wider multi-protocol landscape to users of Tornado, even if Tornado's default mode of operation still used ioloop.py.

Even better would be to hook up something that made a Tornado IResource implementation, so that Tornado applications and twisted.web and Nevow applications could all be seamlessly integrated into one server.

The whole point of Twisted is to have a common I/O layer that lots of different libraries can use, share, and build on, so that we can solidify the common and highly complex abstraction required of a comprehensive, cross-platform, event-driven I/O layer.  In order to realize that vision, we need help not just with the code; we need more Twisted ambassadors to go out into the community and help us integrate these disparate applications, help us find out where real users are finding the documentation inadequate or the organization confusing.

Tornado could be an excellent opportunity for those ambassadors to go out and introduce others to the wonders of Twisted, because its endorsement from FriendFeed guarantees it an audience of tens of thousands of developers, at least for its first few months of life.  If you've shied away from contributing to Twisted itself because of our aggressive testing and documentation requirements, well, Tornado apparently doesn't have any, so it would be a great place for you to start :).