Twisted Do-Over

Recent fanfare over Twisted, including the totally awesome book which you should go buy right now, has gotten me thinking - is Twisted really all that great? I believe that while it is still probably the best thing out there for doing what it does, there are a few things I wish had happened differently. So here's a laundry list of things that I wish Twisted did differently, and how I would implement them if I were starting from scratch today. Maybe eventually this will be a roadmap - right now, it's wishful thinking, and too vague to be any real kind of spec.

In the innermost guts of the reactor, there is no real normalization of events. The reactor is sort of a fused engine block where all of the "work" of dispatching events happens. I'd rather that were unrolled a bit. Especially in today's world of generator-heavy Python, I'd rather that the reactor core look something like a set of wrapped iterators; a base generator that ran "select()" and yielded file descriptors ready for reading / writing; a generator that wrapped that which did OS- and FD-specific I/O, like recv() and send(); a wrapper above that generating application-level request/response pairs, and so forth. Think of this as a web server (in very, very broad strokes, this is not a precise API):


    def webServer(self, connection):
        for request in parseRequests(connection.inputStream):
            response = self.respondTo(request)
            yield response


Such a system would also make it a lot clearer what "one reactor iteration" meant. Rather than some arbitrary constellation of behaviors which happened to be scheduled "at the same time", one reactor iteration could be made to correspond exactly with one tick of a user-provided iterator.
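To make the layering a bit more concrete, here is a rough sketch using only the standard library. It is not Twisted code and not a proposed API: the names ready_sockets, received_bytes, parsed_lines and demo are all made up for illustration, and the "protocol" is just newline-delimited bytes. The point is only that each layer wraps the one below it, and that one call to next() on the outermost iterator is exactly one select() call.

```python
import select
import socket


def ready_sockets(readers, timeout=1.0):
    """Base layer: one select() call per tick, yielding the readable sockets."""
    while readers:
        readable, _, _ = select.select(list(readers), [], [], timeout)
        yield readable


def received_bytes(ticks, readers):
    """I/O layer: turn readiness into (socket, data) pairs via recv()."""
    for readable in ticks:
        received = []
        for sock in readable:
            data = sock.recv(4096)
            if data:
                received.append((sock, data))
            else:
                # An empty read means the peer closed; stop watching it.
                readers.discard(sock)
                sock.close()
        yield received


def parsed_lines(ticks):
    """Application layer: buffer bytes and yield the complete lines per tick."""
    buffers = {}
    for received in ticks:
        lines = []
        for sock, data in received:
            *complete, rest = (buffers.get(sock, b"") + data).split(b"\n")
            buffers[sock] = rest
            lines.extend(complete)
        yield lines


def demo():
    """Drive the stack by hand: each next() is one 'reactor iteration'."""
    ours, theirs = socket.socketpair()
    readers = {ours}
    reactor = parsed_lines(received_bytes(ready_sockets(readers), readers))
    theirs.sendall(b"hello\nwor")
    first = next(reactor)    # one complete line, "wor" stays buffered
    theirs.sendall(b"ld\n")
    second = next(reactor)   # the buffered fragment is completed
    theirs.close()
    third = next(reactor)    # EOF tick: nothing parsed, socket dropped
    leftover = list(reactor) # no readers left, so the iterator is exhausted
    return first, second, third, leftover
```

A real version would also need writes, timers and error handling; this only shows the shape of the wrapping.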

Further up from that (but using that facility), I wish that we had included SEDA's notion of a "stage". This would have made a few things a lot easier. For example, it would be nice to have a well-defined notion of a request/response processing webserver that could generate a "response" object, possibly from a thread, but have that "response" be processed entirely asynchronously in the main thread.
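Roughly what I have in mind, as a minimal sketch rather than anything SEDA or Twisted actually provides: the Stage class, its method names and the toy handler below are all hypothetical. Worker threads pull requests from an input queue and put finished response objects on an output queue; the main thread never blocks on them and simply drains whatever is ready.

```python
import queue
import threading


class Stage:
    """Hypothetical stage: threads consume an inbox and emit output objects."""

    def __init__(self, handler, workers=2):
        self.inbox = queue.Queue()
        self.outbox = queue.Queue()
        self._handler = handler
        self._threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(workers)
        ]
        for thread in self._threads:
            thread.start()

    def _run(self):
        # Runs in a worker thread; it only ever touches the two queues.
        while True:
            request = self.inbox.get()
            if request is None:  # sentinel: shut this worker down
                break
            self.outbox.put(self._handler(request))

    def submit(self, request):
        self.inbox.put(request)

    def drain(self):
        """Called from the main thread: collect the responses that are ready."""
        responses = []
        while True:
            try:
                responses.append(self.outbox.get_nowait())
            except queue.Empty:
                return responses

    def stop(self):
        for _ in self._threads:
            self.inbox.put(None)
        for thread in self._threads:
            thread.join()


def demo():
    stage = Stage(lambda request: ("response", request.upper()))
    for request in ("a", "b"):
        stage.submit(request)
    stage.stop()  # wait for the workers so the example is deterministic
    return sorted(stage.drain())
```

In a real event loop, drain() would be called once per iteration instead of after stop(), and a handler that raises would need to be turned into an error response rather than killing its worker.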

In particular having a notion of a "stage" would make it a lot easier to run full database transactions within threads, isolating them from communications code, by stipulating that transactions must produce notifications or network I/O in the form of output objects placed into a queue. Recently I have surveyed some open-source Twisted code in the wild, and answered a bunch of questions, which have implied to me that many Twisted developers now believe that the correct way to interface with a relational database is to turn *EVERY* SQL statement into a Deferred which is handled individually.

This is a tangent, but allow me to offer a bit of advice. The documentation is really poor, and never says this, but using Twisted, or rather ADBAPI, to convert every single SQL statement into a separate transaction and handle its results separately, has a whole slew of problems. First of all, it's slow: you have to acquire and release thread mutexes on every operation. Second, it is unsafe. Your conceptual transactions might be interrupted at any moment, leaving your database in an inconsistent state. Also in the realm of safety, notifications generated from within a transaction that gets rolled back are sent to the network anyway, so two different database-using processes talking to each other can trivially become inconsistent. Take a look at the 'runInteraction' API and give some thought to what represents a "whole" transaction in your database. Moving this transaction processing out of the main loop *IS* an appropriate use of threads, and in fact adbapi does it internally. This is doubly true if your application or your SQL layer does any caching of SQL results; to be sure that the cache is consistent with the DB, you have to keep track of whether and when transactions are rolled back.
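Here is a small illustration of the "whole transaction in a thread" idea. It uses the standard library's sqlite3 and queue modules rather than adbapi, so nothing below is Twisted's API: run_interaction, make_transfer, the accounts table and the event tuples are all assumptions made up for this sketch. The interesting part is that notifications are collected while the transaction runs and only released to the output queue if the commit succeeds.

```python
import os
import queue
import sqlite3
import tempfile
import threading


def run_interaction(db_path, interaction, outbox):
    """Run one whole transaction; release its notifications only on commit."""
    conn = sqlite3.connect(db_path)
    pending = []
    try:
        with conn:  # commits on success, rolls back if interaction raises
            interaction(conn, pending.append)
    except Exception as exc:
        # Rolled back: the pending notifications are thrown away.
        outbox.put(("failed", str(exc)))
    else:
        for note in pending:
            outbox.put(("notify", note))
    finally:
        conn.close()


def make_transfer(amount):
    """Build a toy interaction that moves money from alice to bob."""
    def interaction(conn, notify):
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
            (amount,))
        notify("alice paid %d" % amount)
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
            (amount,))
    return interaction


def demo():
    outbox = queue.Queue()
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "bank.db")
        setup = sqlite3.connect(db_path)
        with setup:
            setup.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
            setup.execute(
                "INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
        setup.close()
        for amount in (30, 500):  # the second transfer must roll back
            worker = threading.Thread(
                target=run_interaction,
                args=(db_path, make_transfer(amount), outbox))
            worker.start()
            worker.join()
        check = sqlite3.connect(db_path)
        balances = dict(check.execute("SELECT name, balance FROM accounts"))
        check.close()
    events = []
    while not outbox.empty():
        events.append(outbox.get())
    return balances, events
```

The failed transfer leaves both the balances and the outgoing notifications untouched, which is exactly what the statement-per-Deferred style cannot promise.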

Back to the main point. I also would have designed the reactor access API a bit differently. 'from twisted.internet import reactor' looks convenient, but is highly misleading. Figuring out what reactor your process is currently using is part of a more general problem of execution context. There are other objects that applications wish to find in the same way: the current database connection, for example, the current log monitor, or the current HTTP request. twisted.python.context deals with this in a general manner, but because it is not used consistently to access important objects, it has not been subjected to the testing and refining that it has needed. The worst side-effect of this has been the "context object" abomination that afflicts Nevow and Web2.
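As a sketch of what looking things up through an execution context could feel like, here is a version built on the standard library's contextvars module. It is a stand-in, not twisted.python.context, and FakeReactor, current_reactor, run_with_reactor and schedule_greeting are hypothetical names invented for the example. Application code asks the context for "the current reactor" instead of importing a global one.

```python
import contextvars

# The context slot that application code consults (hypothetical name).
current_reactor = contextvars.ContextVar("current_reactor")


class FakeReactor:
    """Toy reactor that just records what was scheduled on it."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def callLater(self, delay, func, *args):
        self.calls.append((delay, func, args))


def schedule_greeting():
    """Application code: finds its reactor via the context, not an import."""
    reactor = current_reactor.get()
    reactor.callLater(0, print, "hello")
    return reactor.name


def run_with_reactor(reactor, func, *args):
    """Run func in a copied context where `reactor` is the current reactor."""
    context = contextvars.copy_context()

    def runner():
        current_reactor.set(reactor)
        return func(*args)

    return context.run(runner)


def demo():
    reactor = FakeReactor("test-reactor")
    name = run_with_reactor(reactor, schedule_greeting)
    # The binding does not leak out of the context it was made in.
    leaked = current_reactor.get(None)
    return name, len(reactor.calls), leaked
```

The same slot-per-object pattern would apply to the current database connection, log monitor or HTTP request.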

I also would have designed Deferreds as more central to the whole thing, and optimized the hell out of them rather than worrying about their performance. For example, it would be a lot easier for many applications if deferLater were the default behavior of callLater. The same goes for deferToThread vs. callInThread. The main reason that the reactor does not use Deferreds for these, or for the client connection API, is a general feeling of nervousness that it would be hard to implement Deferreds in C, so the reactor API shouldn't require them. In retrospect this is silly (especially now that James Knight has actually gotten further implementing Deferreds in C than anyone else has on getting the reactor implemented there).
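To show the difference between the two styles without depending on Twisted, here is a small analogue using asyncio from the standard library. The defer_later function below is my own illustration, not Twisted's deferLater: it wraps a plain call_later-style timer so that the caller gets back an object carrying the eventual result or failure, which is the behaviour I wish were the default.

```python
import asyncio


def defer_later(loop, delay, func, *args):
    """Like loop.call_later, but returns a Future with func's outcome."""
    future = loop.create_future()

    def fire():
        if future.cancelled():
            return
        try:
            future.set_result(func(*args))
        except Exception as exc:
            # Failures travel through the same object as results.
            future.set_exception(exc)

    loop.call_later(delay, fire)
    return future


async def main():
    loop = asyncio.get_running_loop()
    total = await defer_later(loop, 0.01, lambda a, b: a + b, 2, 3)
    try:
        await defer_later(loop, 0.01, lambda: 1 / 0)
    except ZeroDivisionError:
        failure = "caught"
    else:
        failure = "missed"
    return total, failure
```

With the bare timer API, both the return value and the exception would simply be lost unless the callback arranged its own way to report them.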

View Them

Better photos than mine.

Windows Command Line

For various reasons, I find myself spending more and more time in win32 these days.  Blech.

Luckily lots of useful programs have been ported to Windows lately so I am not entirely without tools.  However, there is one glaring problem which I can't seem to work around.

Rumour has it that the command-line interface to Windows isn't a program at all, but some kind of awful kernel service.  cmd.exe is also some kind of horrid abomination.  Here are some things that are wrong with it:

  • Command Window
    • It doesn't draw the same style of window borders as every other window.
    • You can't resize it horizontally, interactively.
      • You can't maximize it horizontally
      • On a system with multiple desktops, you can only maximize it on one desktop
    • There is no keyboard shortcut for pasting text.
    • "QuickEdit mode" isn't the default, so the window is unresponsive to the mouse
      • and even when it is turned on, it behaves extremely bizarrely, unlike xterm or any other Windows program
    • The scrollback is unbelievably pathetic - only 999 lines?
    • You can't select monospaced fonts such as ProFont, only Lucida Console and the default VGA font
    • Regular window keyboard shortcuts don't work: you can't close it with Alt-F4.
    • It ... conflicts ... with some video card drivers.  I had to update my nvidia drivers just so I could drag command windows around!! How is that even possible!??
  • CMD.EXE
    • no tab-completion of commands
    • no tab-completion of commands
    • you can't tab-complete commands
    • history behaves weirdly: you seem to maintain a global position in it, except... you don't
    • going back to your home directory is harder than starting a new terminal, seriously:
      X:\>%HOMEDRIVE%
      X:\>cd %HOMEPATH%
    • there is no shortcut like ~ to refer to your home path in other commands, either
    • there is no shell-startup file
    • .bat language is almost perversely crippled
    • the shell doesn't do expansion, so tricks like 'echo *' don't work.
    • the whole idea of %PATHEXT% is weird; how do I make this compatible with scripts for any other platform?
    • did I mention that tab-completion is broken?

Now, I realize that you can use cygwin's bash.exe to correct some of the deficiencies of the shell, but it has its own issues.  Is there anything that can replace the command window, so that I can use cygwin's bash, Python, and other command-line tools without cringing constantly?  In particular the horizontal resize and maximize issues are the worst.  I have already tried xterm under cygwin as well as local SSH with PuTTY - neither works very well with actual Windows command-line programs (such as Python) so I'd like something designed specifically for Windows.

End of an Era

Cyan Worlds has laid off all but two employees. Wow.

This company was a huge inspiration to me when I was growing up. The first time I ever thought I'd get involved with a software company, rather than making games as a hobby on my own, was watching the "Making of Myst" video that came as a companion disc with the first Myst CD. Apparently I'm not the only person who feels strongly about this company.

I've been really busy for the last few years in general, and haven't had time to play a Myst game since Myst III (which I haven't finished) - so this feels a little like finding out that a friend that I've been "too busy" to call has gone and died. Of course the murmurs of how this was caused by the inhuman churning of the mechanism that drives the game industry seem to echo the recently-published angst of other prominent game designers, and that doesn't make it any easier to hear.

I hope that this giant's demise at least warrants a comic strip.

The problem is, on the internet, nobody can hear you.

Today I realized what Q2Q is. It is a (I swear, this just came to me, I was not even trying to make it sound like anything) Self-Certifying Remote Endpoint Authentication Mechanism, or "SCREAM".

A SCREAM in this sense is a mechanism whereby connections are authenticated by cryptographic means, and where the handshake includes information identifying the connector to an arbitrary level of precision (in Q2Q's case, via the SSL certificate that the connection is authenticated with).

It is self-certifying because the connection identifies itself, both via an in-band nonce and via TLS. All security is transport security.

It refers to a remote endpoint which is the other end of a networked communication. It identifies not only the user, but their agent, and optionally the capabilities and permissions of their agent.

It is an authentication mechanism because you use it to prove that your connection is authentic.

Also, Vertex will blow a hole in your NAT device the size of a watermelon: no kidding. Vertex is the Divmod implementation of Q2Q. We really want Q2Q to become a standard so we are making a big deal out of the separation between product and protocol.

(I really feel like there are some uses for this thing that I've missed. I really hope I have enough time to work on it in the next 6 months to see something through to fruition: other, less focused, worse P2P and identity solutions are starting to get some traction, and it bothers me.)