Thanks to jamwt for the shout-out on the announcement of Diesel.
Since the reaction to my reaction to Tornado was so good (or at least
so ... energetic), I figure I should comment on Diesel as well.
Spoiler alert: my reaction is ...
largely similar, but since jamwt has been
kind of nice to Twisted in the past,
and didn't actually say anything mean this time, I'm somewhat reluctant to
have that reaction. Nevertheless, I swore a solemn oath to tell it
like it is, keep it real, and so forth. So I must.
Once again, I'm happy that event-driven programming is getting some
love. This time, I'm pleased that nobody is saying anything
especially snarky or FUD-ish about Twisted. I do feel like it's a
little weird not to mention Twisted, or include some comparisons to
Nevow or Orbited, both of which provide different, comprehensive
approaches to COMET with Twisted. (Worth noting: Orbited also originally
started out using its own event-driven I/O layer, but switched to Twisted
later, because Twisted is "crazy delicious".)
Diesel has many more interesting ideas at the level of async I/O than
Tornado did. I think the generator-based approach for implementing
protocols is interesting and deserves some more exploration. I'm not
sold on it for every use-case, and I think the implementation might have
some flaws, but it definitely has some advantages.
I'd give jamwt a hard time for not reporting issues and communicating with
Twisted more before re-writing the core, but for three reasons:
- jamwt's been around in the Twisted community for a while. He's
written a bunch of fairly deep Twisted code and he clearly knows what
the framework is capable of.
- I've spoken with him on a number of occasions, and for all I know I
might have discussed this with him. I don't remember it, but it
would be pretty embarrassing to write a big rant about how nobody talks
to us only to have him paste some chat log where he explained why he was
writing Diesel six months ago, and I said "oh, okay" ;-).
- Nobody is calling Twisted names or making vague, unsubstantiated
accusations. You're not obligated to examine Twisted, nor
Nevow, nor Orbited; I just feel that you owe us some explanation if you
publicly say that you tried it and found it wanting. The tone on
the Diesel announcement, in its one brief mention of Twisted, is "we
tried it, but we kinda wanted to do our own thing". So, good for
them, they did their own thing, I hope they had fun.
Now, personally, I'd like to leave it at that, but there is a certain
inevitable comparison that I think is going to take place. Diesel
has a nicer web page than Twisted. They have entwittered ...
twitified ... uh ... tweetened ... the project, and we haven't; we just
have an old-fashioned "blog". Diesel is smaller than
Twisted, so it's easier to explain, and so the people approaching it will
have a better idea of its scope. This might give the immediate
impression that it is a simpler, better, more "modern" replacement for
Twisted's I/O layer, and this is not the case. So I still feel it's
important that I set the record straight.
Before I launch into my critique, I should say that I don't want to harsh
on Diesel too bad. It's a neat little hack and you should go play with
it. And I feel bad pointing out problems with it, since as I
mentioned above, nobody's dumping on Twisted. So, Diesel fans,
please take this in the spirit of a frank code-review, not a complaint
about your behavior.
The interesting generator-munging bits could be easily adapted to run on
top of Twisted's loop, which, arguably, they should have been in the first
place; and the toy "hub" that they've written might be good enough for
some simple applications where reliability under load is not a serious
concern. In fact, inlineCallbacks might provide a good deal of what is
needed to support
Diesel's programming style. Alternately, Diesel might provide some
hints as to how things like inlineCallbacks could be made more
efficient.
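To make the generator idea concrete, here is a minimal sketch of a protocol
written as a generator and a toy driver that resumes it. This is not Diesel's
API and not Twisted's inlineCallbacks; the names `Read`, `echo_upper` and
`drive` are made up for illustration, and the "I/O" is just a list of chunks.
In a real system the driver's job would be done by the event loop, which
would resume the generator whenever data actually arrives.

```python
from collections import deque


class Read:
    """Hypothetical request object: 'resume me with the next chunk of input'."""


def echo_upper(out):
    # A protocol written as a generator: it yields whenever it needs input
    # and is resumed by the driver once input is available.
    while True:
        data = yield Read()
        if data is None:  # the driver signals end of input with None
            return
        out.append(data.upper())


def drive(gen, chunks):
    """Toy stand-in for an event loop: feed one queued chunk per yield."""
    pending = deque(chunks)
    try:
        request = next(gen)  # run the protocol up to its first yield
        while True:
            assert isinstance(request, Read)
            chunk = pending.popleft() if pending else None
            request = gen.send(chunk)  # resume the protocol with the data
    except StopIteration:
        pass  # the protocol returned; nothing left to drive


out = []
drive(echo_upper(out), [b"hello", b"world"])
```

The appeal is that `echo_upper` reads top to bottom like blocking code, while
the decision about when it runs stays entirely with the driver.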
That said, Diesel's I/O loop sucks.
It's disappointing to see the same mistakes getting made over and over
again. First and foremost: no tests. Come on, Python
community! You can do better! Write your damn tests first!
The #1 benefit that a brand-new I/O loop project could have over
Twisted is that Twisted was written in the bad old days before everybody
knew that TDD was the right way to write programs, so we don't have 100%
test coverage. But, we strive to get closer every day, while every
new project decides that they don't need no stinking quality control.
Predictably, as it has no tests, Diesel's I/O layer is full of dead code,
inaccurate documentation, and unhandled errors. Consider this
gem, which I found about 30 seconds into reading the code:
KqueueEventHub is documented to be "an epoll-based event hub", and its
initializer defines an inner function which is never used. I'm not
going to belabor the point by enumerating all the typo bugs I found, but
you may find the output of 'pyflakes diesel' interesting.
Instead of Tornado's inaccurate handling of EINTR, Diesel has no
handling of EINTR, as far as I can tell. It also doesn't handle
EPERM, ENOBUFS, EMFILE, or even EAGAIN on accept(). To be fair, it
has a catch-all exception handler all the way at the top of the stack, so
none of these will cause instant crashes, but they will cause surprising
behavior in odd situations (and possibly infinite traceback-spewing
loops).
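For reference, here is a rough sketch of what handling those errors around
accept() can look like, using only the standard library. It is not Diesel's
code or Twisted's; `accept_ready`, the two error sets and the `limit`
parameter are names I've picked for illustration, and a production loop would
also want logging and some back-off policy for the resource-exhaustion cases.

```python
import errno
import socket

# "Nothing to accept right now" or "try again". Python 3.5+ already retries
# EINTR internally; it is listed here only to make the intent explicit.
RETRY_LATER = {errno.EAGAIN, errno.EWOULDBLOCK, errno.EINTR}

# Problems with one connection or a transient resource shortage; the
# listening socket itself is still usable afterwards.
TRANSIENT = {errno.ECONNABORTED, errno.EPERM, errno.ENOBUFS,
             errno.EMFILE, errno.ENFILE, errno.ENOMEM}


def accept_ready(listener, limit=64):
    """Accept up to `limit` pending connections from a non-blocking listener."""
    accepted = []
    for _ in range(limit):
        try:
            conn, addr = listener.accept()
        except OSError as e:
            if e.errno in RETRY_LATER:
                break  # queue drained; wait for the next readiness event
            if e.errno in TRANSIENT:
                break  # back off and let the loop call us again later
            raise  # anything else is a real bug or a dead listener
        conn.setblocking(False)  # accepted sockets must be non-blocking too
        accepted.append((conn, addr))
    return accepted
```

The point of the `limit` is that one busy listener can't starve every other
socket in the loop.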
More surprisingly - I had to re-read the code about five times to make
sure - it doesn't appear that sockets are ever set to be non-blocking, and
EAGAIN is not handled from accept(), recv(), or send(). And yes,
this can happen even if your multiplexor says your socket is ready
for reading and/or writing. The conditions are somewhat obscure, but
nevertheless they do happen. So, occasionally, Diesel will hiccup
and block until some slow network client manages to send or receive some
traffic. In other words: Diesel is not really async. It just
fakes it convincingly, most of the time.
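As a minimal sketch of the missing piece, assuming nothing beyond the
standard library: put the socket in non-blocking mode and treat "would block"
as an ordinary outcome rather than an error. The helper names `try_recv` and
`try_send` are mine, not Diesel's or Twisted's.

```python
import socket


def try_recv(sock, bufsize=4096):
    """Return bytes read, b"" on EOF, or None if the read would block."""
    try:
        return sock.recv(bufsize)
    except (BlockingIOError, InterruptedError):
        return None  # spurious readiness: go back to the multiplexor


def try_send(sock, data):
    """Return how many bytes were written; 0 if the write would block."""
    try:
        return sock.send(data)
    except (BlockingIOError, InterruptedError):
        return 0  # kernel buffer full: keep the data and retry later


# A connected pair of sockets stands in for a real client and server.
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)
```

With both ends non-blocking, `try_recv(b)` on an empty socket returns `None`
immediately instead of stalling the whole loop on one slow peer. Note that
`try_send` may write fewer bytes than it was given, so the caller has to
buffer the remainder.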
Once again, there's no way to asynchronously spawn a process, and no way
to asynchronously connect a TCP client. Sure, this looks like an
asynchronous connect call, but it's misleading: it blocks on resolving
the hostname, and it potentially blocks on the initial
SYN/SYN+ACK/ACK exchange. There's no asynchronous SSL support.
And no, that is not trivial. Not to mention handling all the crazy
errors that spew
out of the Windows TCP stack. And since the loop is implemented to
be incompatible with Twisted, it's not obviously trivial to compatibly
plug it in and get those features.
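To illustrate just the TCP part, here is a sketch of a connect that doesn't
block on the handshake, under some loud assumptions: the address is already
numeric (hostname resolution has to be handled separately, for example in a
thread), there is no SSL, and I've only thought about the POSIX error codes.
`start_connect` and `finish_connect` are hypothetical names.

```python
import errno
import os
import socket


def start_connect(addr):
    """Begin a TCP connect to a numeric (host, port) without blocking."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(addr)  # returns an errno instead of raising
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock  # register for writability; the handshake is still pending


def finish_connect(sock):
    """Call once the socket is reported writable; raises if connect failed."""
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        raise OSError(err, os.strerror(err))
```

The loop would call `start_connect`, wait for the socket to show up as
writable in its multiplexor, then call `finish_connect` to find out whether
the connection actually succeeded.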
Again, I don't want to dump on Diesel here; for what it is, i.e. an
experiment in how to idiomatically structure asynchronous applications,
it's all right. For that matter Twisted has its fair share of bugs
too, which would be pretty easy to lay out in a similar post; you wouldn't
even need to do the research yourself, just go look at our bug
tracker.
But both Diesel and Tornado make the mistake of attempting to replace the
years of trial-and-error, years of testing discipline, and years of
portability and feature work that Twisted has accumulated with a few
oversimplified, untested hacks.
What they could have done is contribute any extensions that they
needed to Twisted's loop, or modifications to Twisted's packaging that
would allow them to get a smaller sliver of Twisted's core to bootstrap,
if that's what they needed.
My goal in pointing out all these flaws is not to illustrate any
particular point about Diesel, but to reinforce the point I implicitly
made in my Tornado post, which is that
if you try to write a new mainloop (especially without tests) you will
screw it up. You will most likely screw it up in ways which will
only surface later, under mysterious circumstances, when your servers are
under load and you are under the gun for a deadline.
Or if I happen to get wind of it and write a blog post about it, of
course. Then you get to cheat a little.
It's not an indictment of Diesel that it screwed this up; everyone screws
it up. I would probably screw it up, if I didn't have Twisted
sitting in front of me as a direct reference. POSIX by itself is
unreasonably subtle and difficult, but POSIX, plus the subtle variations
in different platforms which implement it, plus the Windows APIs which are
almost-but-not-quite-exactly-nothing-like the POSIX APIs, presents an
inhuman challenge.
Hopefully Diesel will grow some tests. Hopefully it will fix, or
better yet shed, its somewhat unfortunate I/O hub. I am hopeful that
someone will follow Dustin's excellent lead (perhaps Dustin himself!)
and port Diesel's API and generator system over to Twisted's I/O
architecture and eliminate all these silly bugs. Of course, if someone
did that, you could use Dustin's Tornado port with Diesel.
With the silly bugs from the I/O loop out of the way, the Diesel team can
write tests for the more interesting pieces, and fix the bugs which aren't
entirely silly :-).