Tornado + Twisted

Many kudos to Dustin Sallings, who has already created a branch of Tornado which uses Twisted for both networking and HTTP parsing, in probably less time than it took me to write my previous post about how somebody should do that.  Awesome!

(The method it uses is currently a little weird, where you create a "Site" object, but it looks like it would be pretty simple to use a Resource instead if you were so inclined.)

What I Wish Tornado Were

FriendFeed has released its web server, Tornado.  It seems like everyone's blogging about it, and it's obviously relevant to my interests, so I feel like I should say something.

Let me start with the good stuff.  First of all, I think it's great that we have yet another asynchronous contender in the Python world.  Every time something like this comes out, it means that Twisted has to fight that much less hard to get over the huge hump of event-driven programming being too hard, or too weird, or whatever.  It's good to have an endorsement of the general message "if you need a web server to handle COMET requests, it needs to be asynchronous to perform acceptably" from such a high-profile company as Facebook.

Unfortunately I think the larger picture here is a failure of communication in the open source community.  In the course of developing Tornado, there are several things that FriendFeed could have done to move the Twisted community forward, at no cost to themselves.  I don't want to rag on FriendFeed, or Bret Taylor, or Facebook here; they're not the first to re-write something without communicating.  In fact I recently had almost this exact same discussion with another project that did the same thing.  Since Tornado is such a high-profile example, though, I want to draw attention to the problem so that there's some hope that maybe the next project won't forget to communicate first.

My main point here is that if you're about to undergo a re-write of a major project because it didn't meet some requirements that you had, please tell the project that you are rewriting what you are doing.  In the best case scenario, someone involved with that project will say, "Oh, you've misunderstood the documentation, actually it does do that".  In the worst case, you go ahead with your rewrite anyway, but there is some hope that you might be able to cooperate in the future, as the project gradually evolves to meet your requirements.  Somewhere in the middle, you might be able to contribute a few small fixes rather than re-implementing the whole thing and maintaining it yourself.

This is especially important if you are later going to make claims about that project not living up to your vaguely-described requirements, and thereby damage its reputation.  Bret Taylor claims in his blog:

We ended up writing our own web server and framework after looking at existing servers and tools like Twisted because none matched both our performance requirements and our ease-of-use requirements.

First and foremost, it would have been great to hear from Bret when he started off using Twisted about any performance problems or ease-of-use problems.  I'm guessing that Twisted itself had only ease-of-use problems, and other "tools like Twisted" were the ones with performance problems, since later, in a comment on the same post, he says:

I can't imagine there is much of a performance difference [between Twisted Web and Tornado].  The bottom is not that complex in my opinion.

It would also be great if he had explicitly said that Twisted didn't have performance problems rather than making me guess, because I'm sure that is what lots of developers will take away from this.  When you have the bully pulpit, off-the-cuff comments like this can do serious damage to smaller projects.

More to the point, what is the problem with "ease of use", exactly?  The fact that he found Deferred tedious, in particular, seems very strange to me, given that it is so un-tedious that it has become a de-facto standard even in the JavaScript community.  We had no opportunity to help him or anyone else out, because as far as I can tell from searching our archives, we never heard from him or from anyone else at FriendFeed when they were trying out Twisted at first.  Even as he's saying that Twisted is hard to use and (maybe?) performs poorly, he isn't pointing to any particular example of what about it is hard to use, or what performs poorly.  There's still nothing we can do to address this criticism.  And there's still not much we can do to make sure that future potential Twisted users won't have this problem.

Later, in yet another comment, Bret points out the root problem:

... the HTTP/web support in Twisted is very chaotic (see http://twistedmatrix.com/trac/wiki/WebDevelopme... - even they acknowledge this)...

This is true.  However, as I frequently like to note, Twisted is starved for resources.  Reconciling the chaos described on the page about web development with Twisted is an ongoing process.  For a tiny fraction of the effort invested in Tornado, FriendFeed could have worked with us to resolve many of the issues creating that chaos.

This is the main thing I want to reinforce here.  If half a dozen occasional contributors with a real focused interest in web development showed up to help us on Twisted, we'd have an awesome, polished web story within a few months.  If even one person really took responsibility for twisted.web, things would pick up.  But if everyone who wants an asynchronous webserver either uses twisted.web (because it's great!) without talking to us or decides not to use it (because it doesn't meet their unstated requirements) without talking to us, it's going to continue to improve at the same sluggish pace.

Even at the current rate, by the time we have an excellent HTTP story, I somehow doubt that Tornado will have a good SSHv2 protocol story ;-).

In his comment, Bret also takes a couple of pot-shots at Twisted that I think are unnecessary, and I'd like to address those too.

In general, it seems like Twisted is full of demo-quality stuff, but most of the protocols have tons of bugs.

We're not talking about "most" of the protocols here; Tornado is only concerned with HTTP.  And the HTTP implementation(s) in Twisted do not have "tons of bugs".  They are production quality, used on lots of different websites, and have lots of automated tests.  While much of the code in twisted.web doesn't have complete test coverage, since it's old enough to predate our testing requirements, I note that Tornado appears to have zero test coverage.

There's a kernel of truth here — some of the older, less frequently used protocols have a few problems — but in most cases the "bugs" are really just a lack of functionality.  Twisted overall has very few protocol-related bugs, and again, our test policy makes sure that new bugs are introduced very rarely.

Given all those factors, it didn't seem to provide a lot of value. Our core I/O loop is actually pretty small and simple, and I think resulted in fewer bugs than would have come up if we had used Twisted.

I must respectfully disagree.  Again, I don't want to rag on FriendFeed here, but here are several features that Tornado would have, and bugs that it wouldn't have, if it used Twisted for the event loop and none of the HTTP stuff:
  1. EINTR wouldn't cause your application to exit if run in a non-US-english locale.
  2. You don't have the opportunity to forget to set a socket to be non-blocking and thereby make your entire application stop.
  3. It would be possible to run your application on Windows.
  4. Firewalled connections and running out of file descriptors wouldn't cause your server to spew errors forever (at least, it won't any more).
  5. You could write a TCP client that didn't block for an arbitrary amount of time in connect().
  6. Finally, of course, you could use all of Twisted's other protocols, client and server: IMAP, POP, SMTP, IRC, AIM, etc.  You could also use external protocol implementations like Thrift.
  7. You could spawn asynchronous subprocesses.
and this is a very short list, based on a cursory reading of the source code rather than on actually running Tornado or on a particularly deep audit.  Some of these bugs might not be as serious as I think, and there might be plenty of other bugs.  But I can't really be sure what works, since again: there are no automated tests.
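
To make item 5 concrete, here is a minimal standard-library sketch of what a non-blocking connect() involves when you do it by hand.  This is not Twisted's code or Tornado's; the function names are hypothetical, and the localhost listener exists only so that the example is self-contained.  An event loop like Twisted's reactor takes care of this kind of bookkeeping for you.

```python
import errno
import selectors
import socket


def start_nonblocking_connect(address):
    """Begin a TCP connect and return the socket immediately."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)  # forgetting this line is item 2 on the list
    err = sock.connect_ex(address)
    # A non-blocking connect usually reports "in progress", not success.
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, "connect failed")
    return sock


def wait_until_connected(sock, timeout=5.0):
    """Wait until the socket is writable, then check whether connect worked."""
    with selectors.DefaultSelector() as selector:
        selector.register(sock, selectors.EVENT_WRITE)
        if not selector.select(timeout):
            raise TimeoutError("connect timed out")
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        raise OSError(err, "connect failed")
    return sock


def demo():
    # A throwaway localhost listener so the sketch is self-contained.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    address = listener.getsockname()
    client = start_nonblocking_connect(address)
    try:
        wait_until_connected(client)
        return client.getpeername() == address
    finally:
        client.close()
        listener.close()
```

In a real event loop, the wait would not be a dedicated call; the socket would sit in the same selector as every other connection, and the application would keep serving requests while the connect completes.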

This list is a great example of why projects like Tornado really should use Twisted.  Tornado implements some innovative web-framework stuff, but absolutely nothing interesting that I can see at the level of async I/O.  Using Twisted would have allowed them to focus exclusively on cool web things and left the never-ending stream of incremental surprising platform-specific, only-happens-in-weird-situations bugfixes to a single, common source.

What To Do Now

I hope that someone at FriendFeed will be a little heavier on detail and a little lighter on FUD in some future conversation about Twisted.  However, I'm sure they're going to have their hands full maintaining their own code, so I don't have high expectations in this area.  I'm sure Bret wasn't intentionally slamming Twisted, either; it wasn't like he wrote a big screed about it, he just dropped a few unsubstantiated comments into a much larger post about Tornado.  So I just want to be clear: I don't have sore feelings, and I don't need anybody to apologize to me or to Twisted.

If any of you out there are fans of both Tornado and of Twisted, it would be great if you could contribute a patch to Tornado which would allow it to at least optionally use Twisted as an I/O back-end.  It would be great, of course, if lots of people interested in web stuff would help us out with our web situation, but supporting the Twisted event loop would be good regardless. It would mean that when people wanted to speak multiple protocols, they wouldn't need to re-write or kludge in their existing Tornado application, so it would increase the chances that we could get some help with our SSH, FTP, IRC, or XMPP code instead.  It would also open up a much wider multi-protocol landscape to users of Tornado, even if Tornado's default mode of operation still used ioloop.py.

Even better would be to hook up something that made a Tornado IResource implementation, so that Tornado applications and twisted.web and Nevow applications could all be seamlessly integrated into one server.

The whole point of Twisted is to have a common I/O layer that lots of different libraries can use, share, and build on, so that we can solidify the common and highly complex abstraction required of a comprehensive, cross-platform, event-driven I/O layer.  In order to realize that vision, we need help not just with the code; we need more Twisted ambassadors to go out into the community and help us integrate these disparate applications, help us find out where real users are finding the documentation inadequate or the organization confusing.

Tornado could be an excellent opportunity for those ambassadors to go out and introduce others to the wonders of Twisted, because its endorsement from FriendFeed guarantees it an audience of tens of thousands of developers, at least for its first few months of life.  If you've shied away from contributing to Twisted itself because of our aggressive testing and documentation requirements, well, Tornado apparently doesn't have any, so it would be a great place for you to start :).

The Web, Untangled

In my previous post, I outlined some reasons that web development is worse than other kinds of development (specifically: traditional client-server development).  I left off there saying that I had some prescriptions for the web's ailments, though, and now I'll describe those.  Given that we're stuck with web development for the foreseeable future, how can we make it a tolerable experience?

First, let me tell you what the answer isn't.  It isn't a continuation of the traditional "web framework" strategy.  These have been important tools in dredging the conceptual mire of the web for useful patterns, and at this point in history they have a long life ahead of them.  I'm not predicting the death of Django or Rails any time soon.  Django and Rails are the stucco of the web.  An important architectural innovation, to be sure: they let you cover over the materials underneath, allowing you to build structures that are appealing without fundamentally changing the necessarily ugly underpinnings.  But you can't build a skyscraper out of stucco.

As Jacob covered in great detail in his talk, innovations in the "framework" space generally involve building more and more abstractions, creating more and more new concepts to simplify the underlying concepts.  Eventually you run out of brain-space for new concepts, though, and you have to start over.

I started here by saying that we're stuck with the web.  If we can understand why we're stuck with the web, we can make it a pleasant place to be.  Of course everybody has their own ideas about what makes the web great, but it's important to remember that none of that is what makes the web necessary.

What makes the web necessary is very simple: a web browser is a Turing-complete execution environment, and everyone has one.  It's also got a feature-complete (if highly idiosyncratic) widget set, so you can display text, images, buttons, scrollbars, and menus, and compose your own widgets (sort of).  Most importantly, it executes code without prompting the user, which means the barrier to adoption of new applications is at zero.  Not to mention that, thanks to the huge ecosystem of existing applications, the user is probably already running a web browser.

I feel it's important to emphasize this point.  When developing an application, delivery is king.  It doesn't matter how great your application is if no users ever run it, and given how incredibly cheap in terms of user effort it is to run an application in a web browser, your application has to be really, really awesome to get them to do more work than clicking on a link.  I can't find the article, but I believe Three Rings once did an interview where they explained that some huge percentage of users (if I remember correctly, something like 90%) will leave immediately if you make them click on a "download" link to play the game, but they'll stick around if you can manage to keep it in the browser without making them download a plugin.

Improvements to ECMAScript and HTML sound fun, but if, tomorrow morning, somebody figured out how to securely execute x86 machine code on web browsers, and distribute that capability to every browser on the internet, developers would start using that almost immediately.  HTML-based applications would slowly die out, as their UIs would be comparatively slow, clunky, and limited.

Tools like the Google Web Toolkit (and Pyjamas, its Python clone), recognized this fact early on.  They treat the browser as what the browser should be: a dumb run-time.  A deployment target, not a development environment.  Seen in this light, it's possible to create layers for integration and inter-op above the complexity soup of DOM and JavaScript: despite the fact that the browser itself has no "linker" to speak of, and no direct support for library code, with GWT you get Java's library mechanism.

Although it's not particularly well-maintained, PyPy also has a JavaScript back-end, which allows you to run a restricted subset of Python ("RPython") in a web browser; I hope that in the future this will be expanded to give us a more realistic, full-featured Python VM in the browser than Pyjamas' fairly simplistic translation currently does.  In opposition to the "worrying trend" that Jacob noted, with individual applications needing to write new, custom run-times, they leverage an existing language ecosystem rather than inventing something new.

Using tools like these, you can write code in the same language client-side and server-side.  This simplifies testing.  You can at least get basic test coverage in one pass, in one runtime, even if some of that code will actually run in a different runtime later.  It simplifies implementation and maintenance, too.  You can write functions and then decide to run them somewhere else based on deployment, security, or performance concerns without necessarily rewriting them from scratch.

If toolkits like these gained more traction, it would go a long way towards interop, too.  It would be a lot easier to have an FFI between Python-in-the-browser and Java-in-the-browser than to try to wrangle every possible JavaScript hack in the book.  Similarly on the server side: once a few frameworks can standardize on rich client-server communication channels, it will be easier to have a high-level abstraction over those than over the mess of XMLHttpRequest and its various work-alikes.

There's still an important component missing, though.  Web applications almost always have 3 tiers.  I've already discussed what should happen on the first tier, the browser.  And, as GWT, NaCl and Pyjamas indicate, there are folks already hard at work on that.  The middle tier is basically already okay; server-side frameworks allow you to work with "business logic" in a fairly sane manner.  What about the database tier?

The most common complaint about the database tier is security.  Since half the time your middle tier needs to be generating strings of SQL to send to the database, there are a plethora of situations where an accidental side-channel is created, allowing users to directly access the database.

This is a much more tractable problem than the front-end problem.  For one thing, a really well-written framework, one which doesn't encourage you to execute SQL directly, can comprehensively deal with the security issue.  Similarly, a good ORM will allow you complete access to the useful features of your database without forcing you to write code in two different programming languages.
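
As a minimal sketch of the difference, using Python's standard sqlite3 module: the first function below builds its query as a string, and the second hands the value to the database separately.  The table, columns, and function names are hypothetical, made up for this illustration.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (owner TEXT, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", 100), ("bob", 250)],
    )
    return conn


def balances_unsafe(conn, owner):
    # String-built SQL: the caller's text becomes part of the query itself.
    query = "SELECT owner, balance FROM accounts WHERE owner = '%s'" % owner
    return conn.execute(query).fetchall()


def balances_safe(conn, owner):
    # Parameterized SQL: the caller's text is only ever treated as a value.
    query = "SELECT owner, balance FROM accounts WHERE owner = ?"
    return conn.execute(query, (owner,)).fetchall()
```

Passing an owner such as nobody' OR '1'='1 to balances_unsafe() returns every row in the table, while balances_safe() returns nothing, because no account has that literal name.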

Still, there's a huge amount of wasted effort on the database side of things.  Pretty much every major database system has a sophisticated permission system that nobody really uses.  If you want to write stored procedures, triggers, or constraints in a language like Python, it is at worst impossible and at best completely non-standard and very confusing.  Finally, if you want to test anything... you're not entirely on your own, but it's going to be quite a bit harder than testing your middle-tier code.

One part of the solution to this problem comes, oddly enough, from Microsoft: LINQ, the Language Integrated Query component, provides a standard syntax and runtime model for queries executed in multiple different languages.  More than providing a nice wrapper over database queries, it allows you to use the same query over in-memory objects with no "database engine" to speak of.  So you can write and test your LINQ code in such a way that you don't need to talk to a database.  When you hook it up to a database, your application code doesn't even really need to know.

The other part of the solution comes from SQLite.  Right now, managing the deployment of and connection to a database is a hugely complex problem.  You have to install the software, write some config files, initialize your database, grant permissions to your application user, somehow get credentials from the database to the application, connect from the application to the database, and verify that the database's schema is the same as what the application expects.  And that's before you can even do anything!  Once you're up and running, you need to manage upgrades and schedule downtime for updating the database software (independently of upgrading the application software).  Note that the database can't be a complete solution for the application's persistence needs, either, because in order to tell the application where it needs to find the rest of its data, you need, at the very least, a hostname, username, and password for the database server.

All of this makes testing more difficult - with all those manual steps, how can you really know if your production configuration is the same as your test configuration?  It also makes development more difficult: if automatically spinning up a new database instance is hard, then you end up with a slightly-nonstandard manual database setup for each developer.  With SQLite, you can just say "database, please!" from your application code, specifying all the interesting configuration right there.

Finally, SQLite allows you to very easily write stored procedures and triggers in your "native" language.  You also don't need to do so quite as much, because your application can much more easily completely control access to its database, but if you want to work in the relational model it's fairly simple.  The stored procedures are just in memory, and are called like regular functions, not in an obscure embedded database environment.
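
Here is a minimal sketch of both ideas, assuming Python's standard sqlite3 module: opening a database straight from application code, and registering an ordinary Python function so that SQL can call it.  The schema and the apply_discount function are hypothetical, invented for the example.

```python
import sqlite3


def apply_discount(price_cents, percent):
    """An ordinary Python function; integer cents avoid float rounding."""
    return price_cents - (price_cents * percent) // 100


def open_database():
    # "Database, please!": no server, no credentials, no config files.
    conn = sqlite3.connect(":memory:")
    # Register the Python function so that SQL can call it by name.
    conn.create_function("apply_discount", 2, apply_discount)
    conn.execute("CREATE TABLE items (name TEXT, price_cents INTEGER)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [("widget", 1000), ("gadget", 2599)],
    )
    return conn


def discounted_prices(conn, percent):
    """Run the Python function inside a SQL query, once per row."""
    rows = conn.execute(
        "SELECT name, apply_discount(price_cents, ?) FROM items ORDER BY name",
        (percent,),
    )
    return rows.fetchall()
```

Because apply_discount() is plain Python, it can be unit-tested without any database at all, then reused unchanged inside queries.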

In other words, for modern web applications, a database engine is really just a library.  The easier it is to treat it like one, the easier it is to deploy and manage your application.

In the framework of the future, I believe you'll be able to write UI code in Python, model code in Python, and data-manipulation code in Python.  When you need to round a number to two digits, you'll just round it off, and it'll come out right.

Oh <what> a.tangled {web, we} WEAVE FROM

The always entertaining Jacob Kaplan-Moss recently posted a missive, "Snakes on the Web", which, if you haven't already read it, is a highly edifying trip through a variety of Python web technologies and history.  He begins with a simple statement — "Web development sucks." — and goes on to ask a number of interesting questions about that.

What sucks about web development?  How will we fix it?  How has Python fixed it, and how will Python fix it in the future?  While I can't say I agree with every answer, I found myself nodding quite a bit, and he has something useful to say on just about every point.

I noticed one very important question he leaves out of the mix, though, which seems more fundamental than the others: why does web development suck?  In particular, why do so many people who are familiar with multiple styles of development feel like developing for the web is particularly painful by comparison, while so much of software development moves to the web?  And, why does web development in Python suck, despite the fact that otherwise, Python mostly rocks?

Programming for the web lacks an important component, one that Fred Brooks identified as crucial for all software as early as 1975: conceptual integrity.  Put more simply, it is difficult to make sense of "web" programs.  They're difficult to read, difficult to write and difficult to modify, because none of the pieces fits together in a way which can be understood using a simple conceptual model.

Rather than approach this head on, from the perspective of a working web programmer, let's start earlier than that.  Let's say someone approached you with a simple programming task: write an accounting system that includes point-of-sale software to run a small business.  Now, considering some imagined requirements for such a system, how many languages would you recommend that it be written in?

Most working programmers would usually say "one" without a second thought.  A too-clever-by-half language nerd might instead answer "two, a general-purpose programming language for most things and a domain specific language to describe accounting rules and promotions for the business".  Why this number?  Simply put, there's no reason to use more, and introducing additional languages means mastering additional skills and becoming familiar with additional quirks, all of which add to initial development time and maintenance overhead.  Modern programming languages are powerful enough to perform lots of different types of tasks, and are portable across both different computer architectures and different operating systems, so other concerns rarely intrude.

But, in the practical, working programmer's world, what's the web's answer to this question?  Six.  You have to learn six languages to work on the web:
  1. HTML.  This isn't really a programming language, but in web development you do end up reading and writing quite a lot of it.
  2. CSS.  In order to apply visual styles to your HTML so that it actually looks nice in a browser, you need to understand a different language (with a different conceptual model for how documents are laid out than the HTML itself).
  3. JavaScript.  In today's competitive AJAX-y world, you need to be able to react instantly in the browser, writing a real client application.
  4. SQL, so that you can store your data in a database.
  5. Your "middle-tier" language: in my case and Jacob's, that would be Python.  This is where people tend to spend the bulk of their programming time, but not all of it.
  6. A templating language; in Jacob's case, the Django template language.
If you're unlucky, you might need to learn XML, more than one back-end language, a deployment language (UNIX shell scripting or Windows's "batch" language), and ActionScript.  You'll probably need to learn a smattering of some awful web-server configuration language though, like the not-quite-XML-not-quite-HTML used to configure Apache.

Of course, Jacob lists a pile of related technologies too, and rightly points out that it's a lot to keep in your head.  But he is talking about a problem of needing extensive technical knowledge, something which all programmers working in a particular technology ecosystem learn sooner or later.  I'm talking about a different, more fundamental problem: in addition to the surface problem of being complex and often broken, these technologies are fundamentally conceptually incompatible, which leads to a whole host of other problems.  Furthermore, the only component which is really complete is the "middle-tier" language, although bespoke web-only languages like PHP and Arc manage to screw that up too.

Here are a few simple example problems that are made depressingly complex by the impedance mismatch between two of these components, but which are incredibly easy using a different paradigm.

How do you place two boxes with text in them side-by-side?  Using a GUI toolkit, like my favorite PyGTK, it often goes something like this:
from gtk import HBox, Label  # PyGTK widget classes

left = Label("some text")
right = Label("some other text")
box = HBox()  # a horizontal container
box.add(left)
box.add(right)
The conceptual model here is simple: the HBox() is a container, the "left" and "right" things are widgets, which are in that container.  You can add them, remove them, swap them, or handle events on them easily.  You can discover how these things are done by reading the API references for the appropriate classes of object.  However, there's no right answer to this question on the web.  You can use a <table> tag, and then some <tr>s and <td>s to make a single-row table with two cells, but that has a variety of limitations; plus, it's considered somehow gauche by most web designers to use tables for layout these days.  Or, you could cook up a collection of CSS classes.  So there's the first impedance mismatch: do you do layout in HTML, or CSS?  Of course most design gurus would like to tell you that "always and only CSS" is the right answer here, but more practically-minded web developers who actually write code will often prefer HTML, partially because it's simpler but partially because CSS's feature set is incomplete and there are some things you can still only do with HTML, or only do portably with HTML.

Plus, how do you discover how these layouts work?  There are a variety of reference materials, but no canonical guide that says "this is exactly what a <table> tag should do, and how it should look".  There are different forms of documentation for both.

If you have a variable number of elements, you quickly run into another problem.  Should this be the responsibility of the HTML, the CSS, or some code (in the templating layer) that emits some HTML or some CSS?  Should the code in the templating layer be written as an invocation of your middle-tier language, or should the template language itself have some code in it?  Reasonable people of good conscience disagree with each other in every possible way over every one of these details.

This is all part of a very complex problem though.  For all of these crazy hoops you have to jump through, HTML and CSS do provide a layout model that allows you to do some very pretty and very flexible things with layout, especially if you have large amounts of text.  Perhaps not as good as even the most basic pre-press layout engine, but still better than the built-in stuff that most GUI toolkits allow you.  So there is an argument that this complexity is a trade-off, where you get functionality in exchange for the confusion.  So let's look at a much simpler problem.

Let's say that, in our hypothetical accounting application, you have a list of items in a retail transaction, and you want to process the list and produce a sum.  Where is the right place to do that?  It turns out you have to write the code to do that three times.

First, you have to write it in JavaScript.  After all, the numbers are all already in the client / browser, and you want to update the page instantaneously, not wait for some potentially heavily-loaded server to get back to you each time the user presses a keystroke.  And why not?  You've got plenty of processing power available on the client.

Then you have to write it in Python.  That's where the real brain of the application lives, after all, and if you're going to do something like send a job to a receipt printer or email a customer or sales representative some information in response to a sale, the number has to be located in the middle tier.

Finally you have to do it in SQL.  Since this is a traditional web application, your Python code is going to be spread out among multiple servers, and the database is the ultimate arbiter of recorded truth.  So you need to have transactions around the appropriate points and execute any interesting aggregate functions (such as SUM()) in the database tier.

So, you've got three times as much work to do in your fancy new web application as you would in a simple record-based application with a GUI.  A worthy price to pay to run in the brave new world of tomorrow rather than on some crusty old client/server system, right?

Well, as it turns out, the problem is somewhat deeper than that.  It turns out that JavaScript, Python, and SQL actually have slightly different numerical models (in fact Python implements at least 4 itself: fixed-point decimal, floating-point decimal, IEEE 754 floating-point binary, and integer math; you should really only use decimal for money, but this isn't available in JavaScript and its availability in SQL is spotty).  After applying some discounts, your register might read $19.74 but your receipt will read $19.75; and the reports sent to the accounting department will read $19.74898989898989.
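
A minimal sketch of the mismatch, using only Python's own numeric types: the same prices summed as binary floating-point (the kind of number JavaScript uses for arithmetic like this) and as decimals.  The function names are hypothetical, and rounding half-up to the cent is an assumption made for the illustration, not a statement about any particular accounting rule.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def total_binary_float(prices):
    """Sum prices as IEEE 754 binary floating-point numbers."""
    return sum(float(p) for p in prices)


def total_decimal(prices):
    """Sum the same prices as exact decimals, rounded half-up to the cent."""
    total = sum((Decimal(p) for p in prices), Decimal("0"))
    return total.quantize(CENT, rounding=ROUND_HALF_UP)


prices = ["0.10", "0.20"]
print(total_binary_float(prices) == 0.3)  # False
print(total_decimal(prices))              # 0.30
```

Rounding diverges in the same way: round(2.675, 2) gives 2.67 with binary floats, while total_decimal(["2.675"]) gives 2.68.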

Even if you know a lot about math on computers and the limitations of each of these runtimes, and you happen to get all of that just right, you still have another problem to contend with: what happens when somebody else needs to change the logic in question?  How do you test that the Python, the JavaScript, and the SQL are all still in sync?  It's possible, but you have to go above and beyond the usual discipline of test-driven development, because you need to have integration tests that verify that different, almost unrelated code, in different languages, in different environments is all executing properly in lock-step.  Just getting the code from SQL and JavaScript to run in your Python test suite at all is a major challenge; in a language like PHP it's borderline impossible.
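
As a rough sketch of what such a lock-step check looks like for just two of the three tiers, here is the same sum written once in Python and once in SQL, with a function that compares them.  It assumes SQLite (via the standard sqlite3 module) as a stand-in for the database tier, and the schema and function names are hypothetical.  The JavaScript copy is left out, since running it would need a separate runtime, which is exactly the difficulty described above.

```python
import sqlite3


def total_in_python(line_items):
    """Middle-tier version of the sum, in integer cents."""
    return sum(quantity * unit_cents for quantity, unit_cents in line_items)


def total_in_sql(line_items):
    """Database-tier version of the same sum, using SUM()."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE line_items (quantity INTEGER, unit_cents INTEGER)"
    )
    conn.executemany("INSERT INTO line_items VALUES (?, ?)", line_items)
    # SUM() over zero rows is NULL, so COALESCE maps that case to 0.
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(quantity * unit_cents), 0) FROM line_items"
    ).fetchone()
    conn.close()
    return total


def check_tiers_agree(line_items):
    """The cross-language check: both implementations must give one answer."""
    return total_in_python(line_items) == total_in_sql(line_items)
```

Even in this tiny case the two versions already differ on an edge case: without the COALESCE, the SQL version returns NULL for an empty transaction while the Python version returns 0.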

This is all even worse when it comes to security, because every part of the application exposes an attack surface, and because you can't use the same language or the same libraries to do any of the work, they all expose a different attack surface.

In his talk, Jacob notes that "frameworks suck at inter-op", but the problem is much deeper than that.  As I've shown here, a single page from a single application written using a single framework, which has only one task to do, can't even inter-operate with itself cleanly, at least not at the level that Jacob wants — or that I want.  He says, "gateways aren't APIs", and he's right: the correct way to inter-operate is through well-defined APIs.  APIs can be discovered through a single, consistent process.  Their implementations can be debugged using a single set of development tools.

CSS isn't an API.  HTML isn't an API.  Strings containing a hodgepodge of SQL and data aren't an API either.

It's not all doom and gloom, but my ideas for a future solution to this problem will have to wait for another post.

Threat 2: Attacks via E-Mail

Continuing my series on simple threat models for internet users, I'll now address the second threat I mentioned: threats via e-mail.

There are two kinds of e-mail attacks: direct attacks, and trojan horses.  First let's talk about direct attacks.

The basic idea behind a direct e-mail attack is that the program you use to read your e-mail might have flaws in it, which a specially-crafted message will exploit.  That message will have a program in it, and a mistake by the programmers who wrote your e-mail client will cause that program to be executed.

Unlike attacks from the outside, which you can very simply protect against by denying outside attackers access to your computer entirely, there is no fool-proof method to protect against this kind of threat.  E-mail formats are highly complex, and messages can contain multiple parts, including images, etc.  The code that decodes images is notoriously prone to security problems.  Even e-mail programs which don't process images are occasionally prone to security problems dealing with the structure of certain messages.

Chances are that you are going to want to read e-mail somewhere, and you probably want to be able to see images and download attachments; shutting off e-mail completely isn't really an option.  The more general advice I gave against the first threat still applies, though: keep all your software up to date, including your e-mail client.  People who make e-mail software take these kinds of threats very seriously and release updates very quickly when problems are discovered.

One way you can mitigate this risk, and reduce the amount of work required to keep up to date (and therefore the opportunity for you to forget to do so), is to use a web-based e-mail client like GMail.  If you use GMail, the potentially vulnerable program running on your computer is just your web browser, and you already need to keep your browser up to date for other reasons.  The code which deals with the structure of messages is all run on the server, and constantly kept up to date by the fine folks at Google.  Similarly, they take steps to protect your browser, stripping out harmful attachments and filtering spam for you so that potentially dangerous messages never reach you.

The much more common form of e-mail attack is easier to defend against, but it attacks something potentially more vulnerable than your e-mail software: it's attacking you.  A trojan horse is a program which doesn't do anything tricky to get itself run automatically, but instead elicits your cooperation in making it run.  Whether you run a web-based e-mail client or the oldest, buggiest version of Microsoft Outlook, you are equally vulnerable to these kinds of attacks.

The key to defending yourself against a trojan horse is to understand what you are double-clicking on.  Look inside that trojan horse before you open it; there may be a bunch of armed Greeks inside.  Before you open any document or run any program that was attached to an e-mail, very carefully read the message that it came from.  Ask yourself a few questions:
  1. Were you expecting this message?
    If you weren't expecting the message, you should double-check that it's genuine.  In the best case, use some mechanism other than e-mail to check.  Give the sender a phone call.  Ask if they actually sent you the message in question.
  2. Is the message really from who it says it's from?
    It's very easy to fake e-mail addresses, so if you are used to receiving messages from Bob Dobbs and you see "From: Bob Dobbs <bobdobbs@example.com>", you shouldn't necessarily believe it.  Does the text of the message read like Bob wrote it?  Does Bob usually send you these kinds of attachments?  Is the "To" line correct?  Does he use your real name?  A lot of spam which includes viruses is very generic, but it is increasingly cleverly disguised as coming from people in your address book.
  3. Is an attachment trying to disguise itself?
    Sometimes, even messages you are expecting, from people that you know, will contain evil attachments.  If Bob's computer is infected with a virus, he may well have legitimately written you the message, but a trojan horse packed itself along for the ride.  In this case, you need to see if the attachment is trying to look like something other than what it is.  Does the file's name have multiple extensions?  For example, "business-plan.doc" is a Word document, but "business-plan.doc.exe" is an executable program, with its name changed to pretend to be a Word document to fool you.
  4. Is anything trying to warn you?
    Most browsers and operating systems these days will double-check with you before opening executables which you've downloaded.  If a box pops up saying "Are you sure you want to do that?", don't just click past it immediately; read it completely and try to understand what it's telling you.  Even if you don't understand a word, pausing for a moment to reflect on whether the warning is serious or not will often help you realize that something might be amiss.
If you're careful and look for details which seem out of place, you don't need to be an expert to spot e-mails that look wrong.  The most basic task here is to recognize genuine human communication, and not to scan for any particular technical trick.  That's not all, of course; as I mentioned, there are ways that programs can hijack legitimate communications, but these are more sophisticated, and much rarer, than the common type of message, which simply says "hey buddy, click this" and expects you to click on it without thinking.  If you can recognize those you will be safe 99% of the time.
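
The multiple-extension check from question 3 is mechanical enough to sketch in a few lines of Python.  The list of executable extensions here is my own assumption and is nowhere near exhaustive, and looks_disguised is a made-up name; a real mail filter does a great deal more than this.

```python
import os

# Assumed, incomplete list of extensions that run as programs on Windows.
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".bat", ".com", ".pif"}


def looks_disguised(filename: str) -> bool:
    """Return True if an executable is hiding behind a second extension."""
    stem, last_ext = os.path.splitext(filename.lower())
    _, inner_ext = os.path.splitext(stem)
    return last_ext in EXECUTABLE_EXTENSIONS and inner_ext != ""


print(looks_disguised("business-plan.doc"))      # False
print(looks_disguised("business-plan.doc.exe"))  # True
```

A check like this only catches one trick, which is rather the point: the questions about whether the message reads like a real person wrote it can't be automated so easily.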

In using the Internet, this is a generally useful skill, and particularly important when it comes to security.  It will come in especially handy when I discuss threat #4, phishing attacks.