Hack, the Hardy Heron Sings: Glory to this Linux thing

It was time.

After running Gutsy Gibbon on five of my computers for almost exactly a year — nearly six months past its "best used by" date, and about seven months after I would normally have started upgrading — I have finally upgraded all of my computers which have displays.  Alastor (workstation), Illidan (desktop), Suijin (media center), Kazekage (backup), and Nhuvasarim (laptop) are all now happily flying the 8.04.1 banner.

You may have heard that this particular upgrade was somewhat rocky.  I have definitely seen a few of my friends lose hours of their lives that they'll never get back with 7.10 → 8.04 upgrades.  However, I'm happy to say that after paying my dues as an early adopter of inadvisably new kernels, packages, and libraries for ten years, my decision to lag this time has paid off.  On all five highly varied systems, the upgrade went off nearly without a hitch.

Nearly.

So, in the interests of saving those of you who are even later to the party than I am some trouble, I'll enumerate the few difficulties I did encounter.

The main thing that kept me from upgrading for so long was VMware Server — or, as I sometimes like to call it while I'm trying to get it to work, VMware CENSORED Server.  There are packages for it that work beautifully in 7.10, but none in 8.04.1.  Considering that I have it installed on pretty much every computer that I use, that was a pretty big problem.  To make matters worse, not only were the nice, packaged Ubuntu installations no longer available, but some library incompatibility in Hardy broke the custom-installer tarballs that VMware releases as well — and by "broken" I mean "would crash before opening a window".  But there is now hope!  The recently-released 1.0.7 build of VMware Server for Linux works like a charm on Hardy, with only two tweaks.
  1. If you're upgrading from Gutsy and uninstalled your vmware-server package in preparation for installing the 1.0.7 tarball which actually works, you'll notice that the new tarball refuses to install.  That is because your /etc/vmware directory still refers to the removed installation.  You will need to move it elsewhere, keeping it as a backup, since it contains your (ugh) "serial number" (seriously, is this 1997 or what?).  Don't do a --purge removal to get rid of /etc/vmware, since that will destroy your virtual machines.  When the installer prompts you for a serial number, have a look in /your/old/etc/vmware/license.vs.1.0-00; it's helpfully left in plain text.
  2. You will need to sudo rm -fr /usr/lib/vmware/lib/libgcc_s.so.1 in order to delete a library which conflicts with something in Ubuntu.  A few months ago, a great deal more of this type of surgery was required, but thankfully this one step should be all that's needed now.
So, once the finicky VMware was up and running, it was time to resolve some testing issues.  Divmod's tests have a dependency on the now apparently unmaintained "pyxml" package.  Hardy has a package for pyxml, and it installs the code, but it does it in such a way that you can't import it.  I'm not sure why that's useful, but whatever.  Luckily, Combinator makes it so that "python setup.py install" works without being root, so I downloaded the ancient and withered archive, which is now hidden for some reason, and installed it into my home directory.  Then I tried to forget about it.

I already had to make a tarball of an old, hacked up version of _PyLucene.so and friends to get the tests to run under Gutsy, so I didn't have to touch that to keep it working.  I guess I should make those files available somewhere, but I always feel like if I am going to put any kind of effort into making Divmod code easy to work with it will just be in ripping out PyLucene entirely, since that's tantalizingly close.

And... that's it.  Everything still works, and despite the occasional (though increasingly distant) dread that random kernel bugs will attack me, a number of things have been improved.  So, that's enough with the workarounds and hacks.  Let's get to the good stuff!
  • Firefox 3 is definitely faster.
  • The awesome bar is awesome.
  • Fullscreen flash movies are actually fullscreen, and don't reliably crash the browser.
  • The "typing" notification in Pidgin is a lot easier to spot and works a lot better in the flow of a conversation.
  • Deskbar can stick to the panel again, finally!
  • Compiz is generally less buggy.  I couldn't figure out how to file these as bugs in Launchpad because I could never reproduce them reliably, but at least one of them would happen by the time X had been running for an hour, and they're not happening any more...
    • The screen would sometimes freeze and Compiz would crash while rotating the desktop cube.
    • Sometimes changing the title of a window rapidly (such as when typing commands in a bash prompt where the title is tracking commands, or switching Emacs buffers rapidly where the window title is tracking the buffer name) would cause the entire window frame to appear unfocused until the title changed again.
    • Sometimes window stacking Z-order would be messed up after invoking the shift switcher or application switcher; combined with the focused window frame suddenly becoming unfocused, sometimes I couldn't tell which window was actually on top or actually had the keyboard focus.
    • Okay, this one wasn't exactly intermittent: the "Scale Window Title Filter" plugin would always crash after a few tries.  Now it doesn't.
  • Running games — WoW in Cedega, and OTRSPOD natively — under Compiz is much faster.  I can maximize WoW and it doesn't crash.  (Fullscreening OTRSPOD still seems to have a pretty vicious problem, but I can play it in as large a windowed resolution as I like.)
  • The hardware temperature sensors can now tell me how hot my GPU is!  I'd be a little more excited about that if it weren't really, really hot:
  • On my laptop, suspend works when I close the lid.  By that, I also mean resume works when I open the lid.  This hasn't happened for me since Edgy, I don't think.  I can rest easy knowing that they're planning to break it again in Intrepid, but maybe I won't upgrade to that right away either.
  • There are now free drivers available for my wifi card.  This isn't a huge win, since the ndiswrapper drivers still seem to have an edge in dealing with poor signal, but it's nice because the free drivers seem to get better throughput when the signal isn't terrible; and now, I have the option.
  • Although I still have a bit of work left to get it set up, two-finger scrolling on the MacBook trackpad now works.
Of course, the biggest good thing about Hardy is that it's the first major Ubuntu upgrade that hasn't really broken anything unexpected, nor taken any serious investment of effort for me.  The upgrades themselves, all together, took a lot less time than writing this post did.  My home machine upgraded while I entertained myself with "機動戦士ガンダム00" (Mobile Suit Gundam 00), my work machine upgraded while I was occupied talking to JP about some unpleasant business, and my laptop pretty much just sat in a corner churning away while I was working.

Anyway, thanks to the Ubuntu distro team for making this a good one.  I'm certainly enjoying it!

-glyph

P.S. Sorry if this post seems kind of frivolous, but I have a huge backlog of writing to do and I needed to start with something simple.

P.P.S. I don't like to talk about personal stuff too much, but last month I got married.  It was cool.

P.P.P.S. If you knew why the image above was appropriate to the theme of "hardy heron", congratulations, you are as much of a dork as I am.

Update: ScribeFire has a neat feature where dragging and dropping photos will automatically upload them to your blog's API.  I used it, and I was pretty impressed. 
ScribeFire uploads the images just fine, gets correct URLs to use from the upload, and inserts the appropriate <img> tags into the content of the post.  However, Google's image hosting (Picasa) has a bug in it.  It checks the referer to see if you are coming from a subdomain of a Google "property" — google.com, blogspot.com, blogger.com, et al. — and since my blog is hosted on a custom domain, it doesn't count as a "Google property", despite the fact that it's on their servers and they know about it.  So, sorry if that last P.P.P.S. was kind of nonsensical without any pictures actually visible in the body of the post!  I hope they're fixed now.  (How is it that I keep hitting these crazy problems?  It's not like I'm doing anything interesting; I just want to include indentation and images in my writing.)

Don't Call It Blogging

Despite my own impeccable credentials as an elite cyber-hacker, I am friends with a number of people who are bewildered by the profusion of different technologies the internet now affords us for interacting with each other.  I recently had a conversation where someone was just confused by the whole "blogging" thing.  Why do people blagoblog on the intertron?  What is the point?  I'm a prolific "blogger" myself, I guess, but I found myself sympathizing as I tried to explain.

I'm a huge fan of the activity of blogging, but I have never liked the word "blogging".  I never really understood why until I was attempting to explain what it's all about.

For thousands of years — well, okay I don't have any citations of exactly how long, due to the evolution of English as a language, but, for a really long time — we've had one word for the activity of "blogging".  We called it writing.  That's all you're doing when you're blogging.

If we were to describe the activity of a Sumerian scribe pressing symbols into soft clay, we'd say they were writing on that clay.  An ancient Egyptian putting words onto a sheet of papyrus: they are writing.  Similarly, we don't typically have separate words for "scrolling", "codexing", "booking", "newspapering", "magazining", and so on.  Each new technology for moving writing around didn't need a new verb.  So why has "blogging" gotten one?

I think there is a good reason this term exists, but that reason doesn't justify the term, it provides a warning, and a reason to try to actively resist the term and just say "writing".  The web is a more radical and democratizing shift in publishing technology than any of the ones which preceded it, so publishing on the web (especially automated publishing, as on a blog) affords a feedback cycle where the author and the audience are effectively peers.  In fact, the nature of the terms "author" and "audience" has changed; formerly a description of social classes, the people who produce and the people who consume, they have been re-framed as roles within an individual conversation.  You might be the audience when you're reading someone else's blog, but ten minutes later you can easily reverse that relationship with that author as you're writing your own.  This extremely rapid cycle has given a wholly new quality to the style of many blogs, unseen in any prior form of written media.

So, why resist the term "blogging"?  It confuses the possibilities that the medium presents with conventions that it enforces.  Writing is a powerfully diverse art.  A lot of it's good, a lot of it's bad.  "Blogging", however, is more specific, and unfortunately implies a sort of perpetual half-finished conversation.  It calls to mind a semi-private, informal, ephemeral, link-heavy style of extremely short-form writing.  This form has its masters: Tycho of Penny Arcade infamy leaps to mind immediately.  It also has a sea of mediocrity.  Statistically speaking, you can probably click the 'next blog' link at the top of this page for an immediate example.  I don't have a problem with any of this.  Even the "mediocrity" is just evidence of the degree to which this is empowering people: much of what I'd consider "mediocre" just isn't relevant to me, and isn't written for me.

But blogs can be, and are, so much more than that.  They are a disruptive technology in the world of publishing, where any style of writing can easily be published, circulated, and promoted.  One can write an entire novel, serialized chapter-by-chapter as blog posts.  Many people have, in fact, done this already.  You don't even need to emulate older forms of writing to step outside the style implied by "blogging".  The tools that the web affords — instant publishing, hyperlinks — are ideal for collaborative scientific research.  Hyperlinks take the work out of footnoting.

Prominent web writers whom I respect also seem to avoid the use of the term "blog".  Joel Spolsky refers to other people's blogs, but the term "blog" does not appear anywhere describing his own site, despite the fact that there is quite a bit of self-descriptive text that refers to "this site".  Paul Graham goes a step further, forgoing many traditional blog trappings, and has a link that says, simply, "Essays".  I wonder if it's for this reason.

So, if you need to explain to someone who doesn't quite get what all the whole "blogging" thing is about, don't talk about social dynamics and the singularity and the mass popularization of media.  That's all great stuff, but it's a confusing distraction.  It's just like writing a book — or, more likely, a magazine.  Except you don't have to talk to a publisher.  And you don't have to have an editor.  And it's free.  And the publishing part doesn't actually take any time.  And it's accessible from anywhere in the world.  And you can read it on your cell phone.  When you stack up all the advantages, the lack of some bound paper doesn't seem like a big deal.

If you find this explanation useful, feel free to point your relatives at this post.  Tell them that you saw it on my blog, but don't tell them I blogged about it.  Tell them I wrote about it.

Conference FAIL

Last night at a dinner with Ivan Krstić and Itamar Shtull-Trauring, we were all lamenting that too many (all?) software conferences focus specifically on positive results. This is what you want, of course, if you treat a conference as purely a marketing venue. However, most learning takes place based on something that someone did wrong and then needed to correct, not something that they did right.

All of the great software developers I know have at least one great story of how a project they were working on was a complete disaster.  Often these projects are shielded from the public eye, since nobody wants to talk about failure.  So, how do we make a public discussion of these ideas socially acceptable?

Thus, an idea was born: FAILcon.  The idea is simple: submitted talks and papers must be related to projects which failed in an interesting way.  The larger the better, of course — the bigger they are, the harder they fail — but anything that failed in an interesting way would be a valid subject for discussion.

I'm writing about it so that it won't be forgotten, because I think it's a great idea.  But I doubt that any of us are going to organize a conference any time soon.  So please, steal this idea.  Does anyone out there with conference-organizing skills want to get something together based around the common theme of failure?

Static On The Wire

I am, as you might have guessed, a big fan of dynamic typing.  Yet, two prominent systems I've designed, the Axiom object database and the Asynchronous Messaging Protocol (AMP) have required systems for explicit declarations of types: at a glance, static typing.  Have I gone crazy?  Am I pining for my glory days as a Java programmer?  What's wrong with me?

I believe the economics of in-memory and on-the-wire data structures are very, very different.  In-memory structures are cheap to create and cheap to fix if you got their structure wrong.  Static typing to ensure their correctness is wasted effort.  On the other hand, while on-the-wire data structures (data structures which you exchange with other programs) can be equally cheap to create, they can be exponentially more expensive to maintain.

When you have an in-memory data structure, it's remarkably flexible.  It is, almost by definition, going to be thrown away, so you can afford to change how it will be represented in subsequent runs of your program.  So, when your compiler complains at you for getting the static type declarations wrong, it's just wasting your time.  You have to write unit tests anyway, and static typing makes unit testing harder.  What if you want a test that fakes just the method foo on an interface which also requires baz, boz, and qux, so you can quickly test a caller of foo and move on?  A really good static type system will just figure that out for you, but it probably needs to analyze your whole program to do it.  Most "statically typed" languages — such as the ones that actually exist — will force you to write a huge mess of extra code which doesn't actually do anything, just so all your round pegs can pretend to fit into square holes well enough to get your job done.
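For instance, here's the sort of partial fake I mean; in Python it's trivial (all the names here are made up):

# The code under test only ever calls foo(), even though the interface
# it is documented against also requires baz, boz, and qux.
def frobnicate(provider):
    return provider.foo() * 2

# A partial fake: it only needs the one method the caller actually uses.
class FakeFoo:
    def foo(self):
        return 21

assert frobnicate(FakeFoo()) == 42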

But I don't have to convince you, dear reader.  I'm sure the audience of this blog is already deeply religious on this issue, and they've got my religion.  I'm just trying to make sure you understand I'm not insane when I get to this next part.

The most important thing that I said about in-memory data structures, above, is that you throw them away.  It's important enough that I'll repeat it a third time, for emphasis: you throw them away.  As it so happens, the inverse is the most important property of an on-the-wire data structure.  You can't throw it away.  You have to live with it.

Forever.

Oh, sure, you told your customers that they all have to upgrade to protocol version 3.5, but they're still using 3.2.  Unless you're Blizzard Entertainment, you can't tell them to download the new version every six weeks or go to hell.  Even if you can do that (and statistically speaking, you probably aren't Blizzard Entertainment) you have to keep the old versions of the updater protocol around so that when version 4.0 comes out all the laggards who haven't even run your program since 3.0 can still manage to upgrade.

Here's the best part: your unit tests aren't going to help you — at least, not in the same way they would with your in-memory data.  When you change an in-memory data structure, you aren't supposed to have to change your unit tests.  You want the behavior to stay the same, so you don't change the tests; if they start failing, you know something is wrong.  With your new protocol changes, though, you can have tests for the old protocol and tests for the new protocol, but every time you make a protocol change you need to add a new test for every version of the protocol which you still support.  Plus, you probably can't stop supporting older versions of the protocol (see above).

If you've got a message X[3], and you're introducing X[4], you have to make sure that X[4] can talk to X[3] and X[2] and X[1].  Each of those is potentially a new test.  Each one is more work.  Even worse, it's possible to introduce X[4] without realizing that you've done it!  If you have a new, optional argument, let's call it "y", to a dynamically-typed protocol, your old tests (which didn't pass y) will pass.  Your new tests (which do pass y, to the newly-modified X[4]) also pass.  But there's a case which has now arisen which your tests did not detect: y could be passed to a client which only supports X[3], and an error occurs.
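Here's a toy version of that trap, with hypothetical handlers standing in for the two deployed versions of X:

# Version 3 of the handler for message X, as deployed to old clients.
def handle_x_v3(value):
    return value + 1

# Version 4 grows an optional argument.  The old tests (which don't
# pass y) still pass, and the new tests (which do) pass too.
def handle_x_v4(value, y=0):
    return value + 1 + y

assert handle_x_v3(1) == 2
assert handle_x_v4(1, y=5) == 7

# The case no test suite sees: a new peer passing y to an old peer.
try:
    handle_x_v3(1, y=5)
except TypeError:
    pass  # this failure only exists *between* versions, never within one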

If these were in-memory structures, that case would no longer exist.  There is no version of X currently in your code which cannot accept y.  Your tests ensure that.  You would have to time-travel into the past for your unit tests to discover the code which would cause them to fail.  And you can't just do it once, either: maybe X[3] was designed to ignore all optional parameters.  You have to consider X[2] and X[1].  You have to travel back to all points in time simultaneously.

This is why I said that the cost is exponential: you carry this cost forward with each new supported version that gets released.  Of course, there are ways to reduce it.  You can design your protocol such that arguments which your implementation doesn't understand are ignored.  You can start adding version numbers to everything, or change the name of every message every time some part of its schema changes.  All of these alternatives get tedious after a while.

So what does this have to do with static typing?  Static type declarations can save you a lot of this work.  For one thing, it becomes impossible to forget you're changing the protocol.  Did you change the data's types?  If so, you need to add a compatibility layer.  These static type declarations give you key information: what do the previous versions of the protocol look like?  More importantly, they give your code key information: is an automatic transformation between these two versions of the data format possible?  (If not, is the manual transformation between these two versions correct?)

In a dynamically typed program, you can figure out what your in-memory types are doing by running the debugger, inspecting the code that's calling them, and simply reading the code.  Sometimes this can be a bit spread out — in a badly designed system, painfully spread out — but the key point is that all the information you need is right in front of you, in the source code.  If you're working on code that is shipping data elsewhere without an explicit schema, you have to have a full copy of the revision history and some very fancy revision control tools telling you what the protocol looked like in the past.  (Or, perhaps, what the protocol developed by some other piece of software used to look like in the past.)

Your disk is another kind of wire.  This one is particularly brutal, because while you might be able to tell someone to download a new client to be able to access a service, there is no way you are ever going to get away with saying "just delete all your data and start again.  there's a new version of the format."  When writing objects to disk (or to a database), you might not be talking across a network, but you're still talking to a different program: a later version of the one you're writing now.  So these constraints all apply to Axiom just as they do to AMP — more so, actually, because in the case of AMP all the translations can be very simple and ad-hoc, whereas in Axiom the translations between data types need to be specifically implemented as upgraders.
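To give a flavor of what I mean by upgraders, here is roughly the shape of one in Axiom (a schematic sketch with a made-up item; don't hold me to the exact signatures):

from axiom.item import Item
from axiom.attributes import text, integer
from axiom.upgrade import registerUpgrader

class Widget(Item):
    typeName = 'widget'
    schemaVersion = 2          # bumped from 1 when 'size' was added
    name = text()
    size = integer(default=0)

def widget1to2(old):
    # Carry the old attribute forward, and fill in the new one.
    return old.upgradeVersion('widget', 1, 2, name=old.name, size=0)

registerUpgrader(widget1to2, 'widget', 1, 2)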

With a network involved, you also have to worry about an additional issue of security.  One way to deal with this is by adding linguistic support to the notion of untrusted code running "somewhere else", but type declarations can provide some benefit as well.  Let's say that you have a function that can be invoked by some networked code:

@myprotocol.expose()
def biggger(number):
    return number * 1000


Seems simple, seems safe enough, right?  'number' is a number taken from the network, and you return a result to the network that is 1000 times bigger.  But... what if 'number' were, instead, a list of 10,000 elements?  Now you've just consumed a huge amount of memory and sent the caller 1000 times as much traffic as they've sent you.  Dynamic typing allows the client side of the network connection to pass in whatever it wants.

Now, let's look at a slightly different implementation of that function:

@myprotocol.expose(number=int)
def biggger(number):
    return number * 1000


Now, your protocol code has a critical hint that it needs to make this code secure.  You might spell it differently ("arguments = [('number', Integer())]" comes to mind), but the idea is that the protocol code now knows: if 'number' is not an integer, don't bother to call this function.  You can, of course, add checks to make sure that all the methods you want to call on your arguments are safe, but that can get ugly quickly.
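For the curious, here's approximately what that spelling looks like with AMP (the command and protocol names are made up):

from twisted.protocols.amp import AMP, Command, Integer

class Biggger(Command):
    arguments = [('number', Integer())]
    response = [('result', Integer())]

class MyMathProtocol(AMP):
    @Biggger.responder
    def biggger(self, number):
        # The protocol refuses anything that doesn't parse as an
        # integer, so by the time we get here the arithmetic is safe.
        return {'result': number * 1000}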

Let's break it down.

Static type declarations have a cost.  You (probably) have to type a bunch of additional names for your types, which makes it difficult to write code quickly.  Therefore it is preferable to avoid that cost.

All the information you need about the code at runtime is present when you're looking at your codebase.  Therefore — although you may find its form more convenient — static type declarations don't provide any additional information about the code as it's running.  However, information about the code on opposite ends of the wire may only be in your repository history, or it may not be in your code at all (it could be in a different codebase entirely).  Therefore static typing provides additional information for the wire but not in memory.

At runtime, you only have to deal with one version of an object at a time.  On the wire, you might need to deal with a few different versions simultaneously in the same process.  Static type declarations provide your application with information it may need to interact with those older versions.

At runtime (at least in today's languages) you aren't worried about security inside your process.  Enforcing type safety at compile time doesn't really add any security, especially with popular VMs like the JVM not bothering to enforce type constraints in the bytecode, only in the compiler.  However, static type declarations can help the protocol implementation understand the expectations of the application code so that it does not get invoked with confusing or potentially dangerous values.  Therefore static type declarations can add security on the wire, while they can't add security in memory.  (It turns out that if you care about security in memory, you need to do a bunch of other stuff, unrelated to type safety.  When the rest of the world catches up to the E language, I may need to revisit my ideas of how type safety helps here.)

If you have data that's being sent to another program, you probably need static type declarations for that data.  Or you need a lot of memory to store all those lists I'm about to multiply by 1000 on your server.

Constructive Criticism

I frequently say that I'm a big fan of constructive criticism of Twisted, but I rarely get it.  People either gush about how incredibly awesomely spectacularly awesome Twisted is, or they directionlessly rant about how much it sucks, but aside from a fairly small group of regulars who file issues on the Twisted tracker, I don't hear much in between.

I caught wind of (and responded to) some blog comments of the latter type (directionless ranting) from Lakin Wecker.  After I responded, he did something unusual for someone writing such comments: he apologized and promised to do much better.  He has since responded with some much more specific and potentially constructive criticism, ominously entitled "twisted part 1".

Lakin, thanks for reformulating your complaints in a more productive way.  I do think that some useful things might happen as a result of this article.  While I don't necessarily agree with it, I do care about this type of criticism.  In order to demonstrate my appreciation, I will try to make this a thorough reply.

It sounds like there are several mostly separate issues that you had here.  I'll address them one at a time.

Twisted Mail

I believe that the main issue is that the twisted.mail API is missing some convenience functionality which will allow users to quickly build SMTP responders that deal with whole messages.  This is definitely a shortcoming of twisted.mail.

However, this shortcoming is not entirely unintentional.  In general, Twisted's interfaces encourage you to write code which scales to arbitrary volumes of input.  IMessage is a thing that can receive a message, rather than a fully parsed in-memory message, because we want to encourage users to write servers that don't fall over.  If you have to handle each line as it arrives, it's less likely that you'll die if you receive a message bigger than the memory of the machine that is running the server.

That's not to say that there shouldn't be some additional, higher-level interface which does what you want.  Quotient, for example, uses twisted.mail, but provides a representation of a message which has all of its data written to disk first, and efficient APIs for accessing things like headers without fetching the whole message back into memory.  twisted.mail almost provides something like this itself; if you poke around in twisted.mail.maildir and twisted.mail.mail, you'll find FileMessage (an implementation of a message which writes its contents to disk) and MaildirDirdbmDomain (an implementation of IDomain which uses a directory of maildirs to deliver messages).  Not that these would have been useful for your use case: they just show that we're happy to have higher-level stuff implemented within Twisted.

One function which might be cool to provide is something which will parse an incoming SMTP message and convert it to an email.Message.Message, then hand it off to some user code.  Even better would be to integrate this with the command-line "twistd mail" tool, such that you could easily deploy such a class as an LMTP server or something like that.
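Here's a sketch of what I have in mind (class and callback names invented, error handling omitted):

from email import message_from_string

from zope.interface import implements
from twisted.internet import defer
from twisted.mail import smtp

class EmailMessageDelivery:
    """
    Buffer the incoming message lines, parse them into an
    email.Message.Message at end-of-message, and hand the result off
    to a user-supplied callable.
    """
    implements(smtp.IMessage)

    def __init__(self, userCallback):
        self.lines = []
        self.userCallback = userCallback

    def lineReceived(self, line):
        self.lines.append(line)

    def eomReceived(self):
        self.userCallback(message_from_string('\n'.join(self.lines)))
        return defer.succeed(None)

    def connectionLost(self):
        # The connection died mid-message; discard the partial data.
        self.lines = None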

Although we don't have all the pieces you need, there is also the ever-present issue of documentation of the pieces which we do have.  Some of the code in twisted.mail might have been useful to you if its documentation had been better.  For example, you might also notice some pretty strong similarities between twisted.mail.protocols.DomainDeliveryBase.receivedHeader and your own implementation of that method.

My main point here is that fixing this is a simple matter of programming (or, in the latter case, of documenting).  I think that the best way to deal with that shortcoming is simply to submit patches to twisted.mail which add the functionality that you want.  Lots of open source projects are like this: they were driven just far enough to satisfy their implementors' use-cases.  twisted.mail is a perfectly functional and simple API if you want to build what it is designed to build.

When we're talking about "Twisted", we're typically talking about the core, and the programming model that comes with it.  When you get into the specifics of an API like twisted.mail, twisted.names, or even twisted.web (maybe even especially twisted.web), you're going to find plenty of shortcomings and areas where it doesn't yet do what you need.  There are some areas which are downright bad, and some which are so bad that they're embarrassing.  We need volunteers to identify the areas that are lacking and add to them.

Twisted vs. Things Which Are Not Twisted

The reason that I disagree with your conclusion that Twisted as a whole is necessarily more complex, hard to explain, too dense, unreadable (etc, etc) is that the main thing to compare it to is shared-state multithreaded socket servers, or asyncore.

Here's a good example of what makes Twisted simple, at its core:
from twisted.internet.protocol import Protocol
class Echo(Protocol):
  def dataReceived(self, data):
    self.transport.write(data)

This server supports a large number of clients.  It supports TLS.  It's cross-platform.  It supports both inbound and outbound connections.  And yet, including the import, it's only 4 lines of code.  You can write a threaded version of this which appears to be just as short, but it's pretty much impossible to do without getting a half-dozen subtleties of either a socket API or a concurrency issue wrong.

Take your example, "smtp_helper.py".  You don't provide any documentation of its concurrency properties, but the implementation of 'start' is almost certainly wrong.  For one thing, starting the same TestSMTPServer twice, or even starting two completely different TestSMTPServers at the same time, will not work.  Of course, you'd never do that, but let's say your SMTP client also used asyncore and a thread.  Now you've got a client using socket_map in the main thread and a server using socket_map in another thread.  Also, there's the fact that process_message may be called from an arbitrary thread; if it ever grew to do anything more complex than appending to a list, it would need its own serialization logic.  This isn't something that could be fixed in place — the entire approach is wrong, and you would need to rewrite all of your tests to work completely differently in order to fix it.  You'd need to asynchronously start both your client and your server, then have an API for letting your tests know when both of them are done.  By the time you're doing that, you're practically implementing your own mini-Twisted, along with extensions to unittest that turn it into Trial.

Ironically, you can use Twisted to fix this problem.  If you really like the API presented by the 'smtpd' module, you could write a wrapper which would make an asyncore dispatcher look like a Twisted protocol factory (or protocol), and hook asyncore into the main loop, then use 'trial' for your testing.  How exactly one would implement such a thing is beyond the scope of this post, but it's not actually that hard; just look at the relatively few methods that asyncore.dispatcher calls on self.socket and you'll probably get the idea.

I feel that the comparison of "Twisted" versus "non-Twisted" code you've presented is a bit unfair.  The Twisted example is a demonstration of utility functionality that Twisted Mail is missing, not a core idea that Twisted implements wrong.  The code it is being compared to looks simple only because critical areas of correctness that would need to be addressed in a real system (and will probably eventually need to be addressed, if the test is maintained for a long time) are being completely ignored.  The Twisted example, if it fails, will fail relatively straightforwardly; the other example's failure mode will be an obscure traceback coming out of otherwise unrelated (but not thread-safe) code.

However, your subjective experience of some areas of Twisted being hard to understand and use is entirely valid.  Your detailed description of why it was difficult for you has already been useful, but I hope you will stick around and help us improve the situation for future users as well.

Trial and Testing

Perhaps the more significant issue that you discovered while you were working on this is the subtle mystery of getting Twisted to fully shut down a connection and a bound port inside a test.  This is really way too hard, and it is a problem which affects anyone who wants to use Trial for integration testing.

Although I'd really like to see this problem dealt with in a systematic way, and I'd like it to be easy as pie to write integration tests with trial, there is a reason that the issue hasn't been fixed.  As the Twisted team has been improving our testing skills, we've been finding more and more that you absolutely need good unit tests before you can really write integration tests.  Without unit tests, you don't know whether the individual pieces work, so they tend to break in surprising ways when you put them together.  In Twisted itself we are still in the process of rehabilitating a very large, and very old hodgepodge of unit, functional, and integration tests to be broken down into smaller, more coherent unit tests.  Until that process is finished, and trial has been tuned to be as good as possible for that sort of testing, integration testing isn't going to be a focus of any core developer.

I agree with the advice that you were given on IRC.  We could eliminate the particular surprise of doing a clean connection shut-down in trial, and provide a good way to do it, but you'd still face issues with your tests where the SMTP API might be scheduling timed calls or doing other things behind your back which would be difficult to monitor or shut down.  Talking to a mock message-sending implementation for starters would be a lot easier.

I can understand your concern about passing more parameters.  Luckily, this is Python: you don't necessarily need to change the interface of the system you're testing.  If you have a system, A, that depends on another system, B, to perform some of its work, you need to have a reference from A to B somewhere.  That can be passed as a parameter, imported as an object, or loaded as a module.  In Java, you'd need to change all your type declarations and do some kind of dependency injection magic, but in Python you can always cheat.  The worst case in Python, after all, is that A imports B as a module.  So, if you don't want to add any parameters, or even any attributes or methods, consider this:

# A.py
import B

def stuff():
  B.functionFromB().otherStuff()

# test_A.py
import unittest
import A
import B

class MyTest(unittest.TestCase):
  # The test case itself impersonates the B module: it only needs to
  # provide the one name that A actually uses.
  def functionFromB(self):
    result = B.functionFromB()
    # Modify the result for the test, if you like
    return result

  def setUp(self):
    # Swap the test case in for the real module, in A's namespace only.
    A.B = self
  def tearDown(self):
    # Always put the real module back, even if the test failed.
    A.B = B


Some might consider this a bit gross, of course.  It might be cleaner to add a specific API for plugging in a different implementation of B.  However, it's useful to use this technique in cases — such as the one you described in your post — where you are trying to add some test coverage for an API which has already been written and you don't have control over.

I hope that digression helped, but I don't want to turn this into a screed about what you could have done better; let's consider your requirements as fixed (this needs to be an integration test) and look at what Twisted could have done better.

One thing the core team has been talking a lot about lately has been the development of verified test doubles.  We don't have a lot of them, and we need more.  For example, if you could pass a fake reactor to both your SMTP sender and receiver code, then you could manually make sure it was sending traffic at the appropriate times, to the appropriate hosts, and fail your test in sensible ways if it did something unexpected, rather than just having trial bomb out on you.  This would also let you have regression tests to make sure that your code was working with the latest version of Twisted, in case the APIs in question changed.  You wouldn't need your test to have a full, complete, clean shutdown of your SMTP connections because they would simply be garbage collected, as they would not be connected to the real reactor.  You can see an example of what this might look like in twisted.internet.task.Clock.  If someone contributed a real, documented, usable, verified test double for IReactorTCP, we would all be eternally grateful, especially if they could coalesce all the uses of the numerous half-assed attempts at it in our own test suite.
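To illustrate the style, here's a trivial made-up test using Clock; nothing ever touches the real reactor, so there is nothing to laboriously shut down afterwards:

from twisted.internet.task import Clock

def test_timeout():
    clock = Clock()
    events = []
    # Code under test would normally call reactor.callLater().
    clock.callLater(5, events.append, "timed out")
    clock.advance(4)
    assert events == []                # time only moves when we say so
    clock.advance(1)
    assert events == ["timed out"]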

Something else we could do is write a supported factory wrapper which would allow the use of a real factory and connection in a trial test, but that would shut everything down cleanly at the connection level in tearDown.  I would personally like this a lot, but I can't promise that it would be popular with the rest of the Twisted team.  We all spend a lot of time trying to convince people to write unit tests before integration tests.  I know that I'm a little concerned that providing great integration testing support will just lead to more people being confused by weird interactions in the guts of whatever protocol they're talking to.  Eventually, however, integration tests can be useful, and I wrote the beginnings of the wrapper that I'm suggesting when I was writing tests for the AMP protocol.  You might be able to use that as an example even if Twisted doesn't provide any public APIs for that sort of thing.

Conclusions

Unfortunately there's not much I can do immediately to fix the problems that you've had, Lakin.  If someone with a similar level of Twisted experience attempts a similar task in the near future, it's likely that they'll hit the same issues.  I barely (read: didn't actually) have the time to write this blog post, and I definitely don't have the time to fix the problems I've outlined.

While there are definitely some problems here, I don't think the situation is really all that bad.  According to your post, learning enough about Twisted to do what you were doing and writing the Twisted version of this code took only 3 days.  That's not as steep a learning curve as some have accused Twisted of having.  Presumably it would have taken someone already familiar with twisted.mail and trial much less time.  (It didn't take me much more than 2 minutes to read and understand your code :-).)  As I mentioned above, your friend's threaded smtpd implementation has some pretty severe problems which might cause maintenance headaches later, whereas you were quite careful to do a proper shutdown (the trickiest thing to get right) in the Twisted version, so it is likely to be fairly robust going forward.