The Emacs Test

I use Emacs.  However, unlike some Emacs users, I don't treat it as a religion.  In fact, I'd rather be using a more "modern" IDE; one that understands my code on a deeper level and provides things like refactoring tools, integrated debugging, and "view method implementation" that work reliably and don't require weeks of configuration effort to use.  One that uses modern UI conventions instead of arcana from the 70s so that my friends who are not emacs-heads can quickly wrap their heads around what's going on on my screen, and perhaps dare to touch my keyboard.

However, even if one is keen to do it, switching away from Emacs is a big deal.  I see lots of editors that advertise "emacs keybindings".  While I appreciate the effort, these features always look like someone who has no idea how to use Emacs worked through some kind of quick cheat-sheet of features like "keybinding for Save", "keybinding for Save As", "keybinding for Close Window" and just added them one after another.  Sometimes, with no regard for whether these keys conflict with other shortcuts!  (I'm looking at you, gnome "key-theme".)

Do you think you can write an editor which can replace Emacs for me?  Here are a few features, taken both from my years of customizing Emacs to meet my needs and some basic features in Emacs itself that non-natives never seem to understand.

I'm leaving out the extremely basic stuff like "syntax highlighting" and "automatic indentation" since most editors do OK on those fronts already.  These are the things that I find have been subtle, in that they're broken almost everywhere outside of emacs.

Can you do what I mean when I press the "go" button?

When I edit code, I repeat these steps endlessly:
  1. Edit a test file.
  2. Run the tests.
  3. Edit an implementation file.
  4. Run the tests.
In order to do this in my current emacs setup, I open my implementation file, then open my test file, then press F9.  Then, I edit a little bit, and press F9 again.  Then, I switch to the implementation file, type a little bit, and press F9 again - Emacs knows which tests to run, because the implementation file has an annotation in it which describes the test-cases that are associated with it.

When I'm done, I push F5 and it immediately jumps my cursor to the error that I'm working on.

This works for me in Java, in C, and in Python.  I've got a little bit of custom emacs-lisp code that I wrote to do each of them.  I'm willing to write a little more.  But, is it very easy in your editor to grab a global keybinding?  To write a plugin in 4 or 5 lines of code that just formats a command-line string to run?  To parse the output of a subprocess?  To visit a file and line number without requiring further user interaction?
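To make that wish-list concrete, here is a rough sketch in Python of the three pieces: format a command line, capture the subprocess output, and pull out file and line numbers to visit.  This is not the emacs-lisp mentioned above.  The names run_tests and error_locations are made up for this example, and the regular expression only understands Python-style traceback lines.

```python
import re
import subprocess
import sys

# Matches the 'File "path", line N' lines that Python tracebacks print.
TRACEBACK_LINE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+)')


def run_tests(command):
    """Run a test command and return its combined stdout and stderr."""
    result = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return result.stdout


def error_locations(output):
    """Extract (path, line) pairs from traceback-style output."""
    return [
        (match.group("path"), int(match.group("line")))
        for match in TRACEBACK_LINE.finditer(output)
    ]


if __name__ == "__main__":
    # Hypothetical command line; an editor plugin would build this from the
    # annotation in the implementation file.
    output = run_tests([sys.executable, "-c", "raise ValueError('boom')"])
    print(error_locations(output))
```

An editor that makes this easy would let me bind the first function to F9 and feed the first result of the second to "visit this file at this line" on F5.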

Can I reach the thing I need to work on fast?

Let's say I'm working on a file called "foo.c".  I want to open "bar-baz-boz-qux.c" in the same directory.  In Emacs, this is probably just "control-x control-f b <tab> <enter>" - maybe a few more letters if there are other "bar-" files.  Do I need to hit more buttons than that in your editor?  Do I need to reach for the mouse?  Do I need to navigate the inefficient-even-from-the-keyboard GTK file dialog?

Now, let's say I've got the file "foo.c" open in several different branches of the same project, and I want to quickly alternate between them, display them side-by-side, etc.  In Emacs, "C-x C-b" will bring up a list of every open file, and as I type its name, the list is reduced to only those that match what I've typed so far.  So if, for example, I type "foo.c", I'll get all my "foo.c"s.  I can cycle between them with C-s until I get the one that I want.

I don't want to hunt around and click on a tab-bar or hit "next next next next".  I just want to type in a little bit of the file name and have the editor figure the rest out for itself.

Can I use it on Mac, Linux, FreeBSD, Solaris, and Windows?

Emacs is extremely portable.  Every operating system I want to work on (and many, many that I hope I never do) can run it.  I don't want to invest the energy to learn a new editor if I can't use it everywhere.

Can I use it remotely over the internet?  Collaboratively?

You can cheat on this one: if you can run in a terminal (and under Screen), then you get both of these for free.  Emacs cheats that way.  But if you can't run in a terminal - no, I'm sorry, VNC is not an acceptable substitute.  It doesn't perform adequately over the internet and it probably never will - as the internet gets faster, the average display gets bigger and has more colors on it.

Your editor probably can't run in a terminal.  So you'd better give me a way to pair-program with people over the internet and a way to access the editor on my desktop machine when I'm away from home.

Can I see whitespace?

I don't like to leave invisible droppings in files when I edit them.  I'd like to be able to see trailing whitespace, highlight it, and eliminate it.  I'd like to be able to tell if I have any tabs in my files (python does not like that very much).
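The checks themselves are tiny.  As a minimal sketch in Python (the function names are invented for this example, and a real editor would highlight the offending characters rather than return a list):

```python
def whitespace_problems(text):
    """Yield (line_number, problem) for trailing whitespace and tabs."""
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            yield number, "trailing whitespace"
        if "\t" in line:
            yield number, "tab"


def strip_trailing_whitespace(text):
    """Return text with trailing whitespace removed from every line."""
    return "\n".join(line.rstrip() for line in text.split("\n"))
```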

Do I need to carefully juggle my clipboard - or do you have a "kill ring"?

Normally, if you cut some text, then cut some other text, you lose the first text — unless you use "undo" or something like that and screw up your editing state.

In Emacs, when I cut five or six different pieces of text, and I go to paste them, I can paste any of them.  I don't have to carefully remember what's on my clipboard, because the last 60 or so things that I cut or copied are around in case I need any of them.
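For anyone who hasn't used one, the data structure is not complicated.  Here is a minimal sketch in Python, assuming a fixed-size ring; the class and method names borrow Emacs vocabulary but this is an illustration, not how Emacs implements it.

```python
from collections import deque


class KillRing:
    """Keep the most recent cuts; the newest entry is at index 0."""

    def __init__(self, size=60):
        # 60 echoes the "last 60 or so things" above; the number is arbitrary.
        self._entries = deque(maxlen=size)
        self._position = 0

    def kill(self, text):
        """Record newly cut or copied text and make it the current entry."""
        self._entries.appendleft(text)
        self._position = 0

    def yank(self):
        """Return the current entry (what a plain paste would insert)."""
        return self._entries[self._position]

    def yank_pop(self):
        """Step back to the next older entry, wrapping around at the end."""
        self._position = (self._position + 1) % len(self._entries)
        return self._entries[self._position]
```

A plain paste calls yank; pressing the "no, the one before that" key calls yank_pop until the right text shows up.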

Do I need to carefully remember where I was and scroll around to get back there - or do you have a "mark ring"?

Emacs has what amounts to a "back button" for your text editor.  If I edit something interesting, go to another window, go to another project, and then want to jump back a few steps, I can easily do that.

This is particularly helpful when, for example, inserting some import statements.  I'm in the middle of a function in the middle of a big file.  I want to use the Foo class, but it's not available yet.  So, I jump to the beginning of the buffer, type "from baz.bar.foo import Foo" and then hit "C-u C-space" to jump back to the middle of that function I was working on.  "C-x C-space" does something similar, but can even take me to different files.

Can you do smart word-wrapping?

/**
 * When I have a long documentation comment in ActionScript, JavaScript or
 * Java, Emacs will helpfully wrap it like this.  If I make changes in the
 * middle and then re-wrap it, it stays wrapped and helpfully adjusts the
 * placement of the asterisks along the left-hand side.  Can your editor do
 * this?
 */

# But when I have a comment in Python, it's formatted like this.  I don't have
# to tell Emacs anything about the different comment styles.

-- For that matter, it can understand and properly format SQL comments too.
-- And C/C++, and Ruby, and PHP, and more.

"""
If I format code inside a docstring, it flows properly too.  Granted, there are
a lot of bugs in this particular case in the stock Emacs, but since it works
everywhere else I have written some workarounds.  (You could always work around
it by inserting some extra blank lines before wrapping, but that always
bothered me.)  Can I customize how your flowing works if I don't like it?
"""

Is there version-control integration?

If I'm editing a project that uses bzr, darcs, git, svn, hg, perforce, or cvs, I can get a nice "status" page as a jumping-off point in Emacs to show me which files are in version control and what files I've edited.  I can update, commit, pull, push, and diff out without leaving the editor.  Can your editor do that?  And I don't just mean, can your editor do that for SVN.  Does it support all of the systems I just named and a few others for good measure?

Can I tell what I'm working on?

I don't like having to scroll around to figure out what function I'm in the middle of when I forget.  I work on a lot of code, and I browse a lot of code, and sometimes if I'm in the middle of a 200-line-long function I can't see the class or the function name.  Emacs has a feature called "which-func-mode" which allows me to glance at the bottom of the screen and instantly know what function I'm working on.  Fancy, glowy sidebars with tree-views of my whole source file and inheritance hierarchy are great, but can I always see the name of the class and the method that I'm working on now?  Even if there are so many other methods on that class that the fancy method list on the left has to scroll?
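For Python source, at least, the information is cheap to compute.  Here is a minimal sketch using the standard ast module (Python 3.8 or later, for end_lineno); which_func is a name borrowed from the Emacs mode for illustration, not a port of it.

```python
import ast

SCOPES = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def which_func(source, line):
    """Return the dotted name of the class/function containing `line`."""
    tree = ast.parse(source)
    containing = [
        node
        for node in ast.walk(tree)
        if isinstance(node, SCOPES) and node.lineno <= line <= node.end_lineno
    ]
    # Outer scopes start on earlier lines than the scopes nested inside them.
    containing.sort(key=lambda node: node.lineno)
    return ".".join(node.name for node in containing)
```

Given the cursor's line number, the result ("A.m", say) is what I want to see in the status line, and an empty string means the cursor is at module level.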

Last but not least...

Can I code for it?

I'm a programmer and I need a programmer's editor.  But I'm not an IDE developer.  I don't want to write giant, heavyweight plugins; I want to be able to quickly toss off a snippet of code which modifies the editor, and to organically grow my own modifications when I find myself doing some task frequently.

For example, I have my own "snippets"-type module, "quick-hack mode", which does a ton of clever-clever things like inserting
def (self):
    """
    """
when I type "def" inside of a "class" block.  (of course the "def" is omitted otherwise).

I have a hotkey to turn this off in case other people find it annoying - and it's difficult to be ambivalent about this mode, you either love it or hate it: it's a very personal thing, and I don't expect your editor to support it directly.  This mode was developed after years of observing my own peculiar tics while editing and crafting conveniences to support that and free me from distractions.

Since developing this kind of support code isn't my main interest, coding needs to be more than possible, it needs to be easy.  I need interactive help and the ability to load a brief snippet of code into the editor without restarting it; a reasonable debugger so when it blows up in my face I can at least sort of tell why.

Do you feel lucky?  Well, do you?

Emacs has a lot of features.  You don't have to replicate them all.  But if you want me and the millions like me — okay well maybe not millions, but it's not just me either — to switch to your shiny new editor, you'd better be doing all of these things and doing at least one or two other totally awesome things that emacs can't do.

Databases and Twisted: When Threads Are OK (For Some Purposes)

Last month, a thread on the twisted-web mailing list got me thinking about a frequently implemented, but seldom understood usage of Twisted: writing applications backed by a traditional database server.  I tried to write a timely reply on the mailing list, but found what I had to say on the topic was overflowing the polite bounds of an email message.  I've tried to write about this before, buried in the middle of a post about something else.  I don't think I really got my message across, though, because I believe this was quoted as saying that I find asynchronous data-access APIs "extremely painful".  Asynchronousness is not the point that I find difficult, as much as I do transactionality (and integration with existing database bindings).

So this time, please bear with me as I explain enough context to properly frame my opinion.

I think that concurrency is a difficult problem that affects every aspect of your code, and so it is important to have a comprehensive, consistent, and easy to understand plan to deal with it in any given system.

Twisted's "cooperatively multitasking / callbacks-scheduled-by-I/O-and-timers" idiom is one concurrency model.  Deferreds are a super important convenience mechanism in that model, but they're not completely necessary; you can do this with just dataReceived, connectionLost, callLater etc.

In general this model - let's call it something memorable, like "CM/CSBIOAT" - is a pretty easy concurrency model to work with once you know how it works.  In particular, it's pretty easy to avoid making a common variety of serious concurrency mistakes, since you don't need to remember to declare any locks, and the behavior of the system under load is unsurprising, if not necessarily ideal for performance.

Threading is another concurrency model with which we are all familiar.  Shared-state multithreading is a pretty bad concurrency model for general use.  In particular, it's very easy to make mistakes that are impossible to diagnose or reproduce.  Despite its unsuitability for applications, threading can be a useful building-block as a low-level tool to construct higher-level concurrency models.  In many practical cases I am aware of, threading is the only available building-block at this level for building efficient implementations of other concurrency models, because operating systems and compilers don't provide anything better.

There is an antipattern that arises from a somewhat naive understanding of these two models.  The Twisted novitiate discovers that Twisted Is Good, and Threads Are Bad.  Experimentally they discover that this is indeed true, and that despite its eccentricities, writing and debugging "Twisted" code (whose benefits really come from the CM/CSBIOAT pattern) is a lot easier than writing and debugging threaded code.

So, our unfortunate Twisted novice now needs to write a database application: what to do?  Well, one way they can write it is to "just use threads" for data-access logic and communicate with Twisted some other how - for example, to put all their database logic in a function they pass to runInteraction.  The only other apparent option is to "use Deferreds" and invoke adbapi.ConnectionPool.runQuery or runOperation.  Deferreds are Twisted - Good!  Threads are ... threads!  Bad!  The answer seems obvious.

However, in choosing to use this facility, you've done far more than choosing between "twisted" and "threads".  If you use runInteraction, you can easily keep all of your work in a single transaction; since database APIs are blocking, you can only safely do a read followed by a write in the same transaction if you can block between those calls.  If you do a runQuery, take the result of that and pass it as input to a runOperation, you're sharing data between two different transactions and potentially two different cursors.  Whether Deferreds are good or not, this breaks the assumptions that the underlying database uses to keep its data consistent.  Consider incrementing a counter.  In the "threaded" case, you'd do something like this:

  def interaction(txn):
      # txn behaves like a DB-API cursor; everything here is one transaction.
      txn.execute("select thingy from foo where bar = baz;")
      x = txn.fetchone()[0]
      txn.execute("update foo set thingy = ? where bar = baz;", (x + 1,))
  cp.runInteraction(interaction)

This always results in foo.thingy being set to foo.thingy + 1.  If your database is set up properly (and most are by default) there's no opportunity for other code to execute between those two statements.

But in the "twisted" case you do something like this:

  @inlineCallbacks
  def stuff(cp):
      rows = yield cp.runQuery("select thingy from foo where bar = baz;")
      yield cp.runOperation("update foo set thingy = ? where bar = baz;", (rows[0][0] + 1,))

As syntactically pleasant as that appears to be, and as convenient as it might seem to be able to call Twisted APIs as much as you want in the middle of this work, any amount of code can run between the first line and the second, thanks to those pesky 'yield's.  That means if you run 'stuff' twice, there's a good chance that your callbacks will stomp on each other and one of the increments will be lost.
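The lost increment is easy to reproduce without Twisted or a database at all.  What follows is a toy simulation, not real adbapi code: plain generators stand in for the inlineCallbacks function, a dict stands in for the table, and increment and run_interleaved are names made up for this sketch.

```python
def increment(counter):
    """Read, pause, then write: the same shape as the two yields above."""
    value = counter["thingy"]      # stands in for the runQuery
    yield                          # any other code may run here
    counter["thingy"] = value + 1  # stands in for the runOperation


def run_interleaved(tasks):
    """Advance each generator one step at a time until all are finished."""
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)


counter = {"thingy": 0}
run_interleaved([increment(counter), increment(counter)])
print(counter["thingy"])  # prints 1, not 2: one increment was lost
```

Both tasks read 0 before either writes, so both write 1.  A real reactor will not always interleave this neatly, which is exactly what makes the bug intermittent.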

Transactional relational database access is a really different concurrency model, all its own.  In many cases it appears to be the same as plain old shared-state multithreading, not least because it is implemented using threads and the threads are completely exposed to application code, and made part of the database interface's API.  However, using a transactional database to store your interesting state is much, much safer than just using threads to access any old datastructure.  An ACID database is specifically implemented to provide a consistent view of your data to any executing client at any time, and in the cases where that would be impossible, to schedule execution of various clients to provide an ordering where data is consistent.  (You'll notice that I have avoided saying "thread", but in practice an executing SQL client is a thread in your application.)  Caching middleware confuses this issue, making it more like regular multithreading; but in a good SQL database, using threads rather than just separate processes is just an optimization; one which should be completely transparent to your application code.

Axiom doesn't really have a concurrency model (it ought to, but that's a discussion for another day).  The idea there is that, like the rest of Twisted, you try hard never to block for too long.  It is possible — too easy, really — to screw this up and block for a long time waiting for the disk in an Axiom program, but to some extent that's true of any Python code.  Since Axiom is typically accessed by one, or at most two processes at a time, you won't end up blocking on your database for a long time because some other code is using it; the main thing Twisted's concurrency model is designed to prevent is your code blocking and getting stuck or being idle, not your code blocking at all.  So, I'm going to give you advice for using Storm or ADBAPI: the only advice for Axiom is "write fast queries".

Assuming that you're writing a traditional database application, here's my advice for you.

Let's assume that Storm (or ADBAPI) does not have any thread-safety bugs itself.  This assumption is unlikely to be completely true, but you probably have to make it if you're going to use either of these things at all, regardless of my advice :).  With that assumption, you can use Storm (or ADBAPI) with Twisted from a thread-pool and pretend, in your application, that the threads do not exist.  You should avoid accessing global state and pretend that your code might be run in a subprocess or a thread or even on a different computer.  If you're lucky, one day it will be, and your application will "magically" scale!  If you follow this simple discipline, you can cleanly interface between the Twisted concurrency model (where you do all of your non-database I/O) and the RDBMS concurrency model (where you interact with all of your "data" objects).

Don't touch any database objects in your Twisted mainloop.  Don't touch any Twisted objects in your database transactions.  This has the added benefit of not needing to worry that you're sending out information about partially-completed database operations to a network connection, or injecting potentially transient network state into a persistent database operation that may need to be re-tried.
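Here is a sketch of that discipline using only the Python standard library, so it runs without Twisted or Storm installed: sqlite3 stands in for the database and concurrent.futures for the thread pool.  The table and column names are borrowed from the counter example; increment and demo are invented for this illustration.  The point is the shape: the transaction function takes plain values, opens its own connection, does all of its work in one transaction, and hands back a plain value.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def increment(db_path, bar):
    """Run one whole transaction in a worker thread; plain values in and out."""
    # isolation_level=None means autocommit mode, so the explicit
    # BEGIN/COMMIT below delimit the transaction.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
        (thingy,) = conn.execute(
            "select thingy from foo where bar = ?", (bar,)
        ).fetchone()
        conn.execute(
            "update foo set thingy = ? where bar = ?", (thingy + 1, bar)
        )
        conn.execute("COMMIT")
        return thingy + 1
    finally:
        conn.close()


def demo():
    """Run 20 increments on a 4-thread pool and return the sorted results."""
    with tempfile.TemporaryDirectory() as directory:
        db_path = os.path.join(directory, "demo.sqlite")
        setup = sqlite3.connect(db_path)
        setup.execute("create table foo (bar text primary key, thingy integer)")
        setup.execute("insert into foo values ('baz', 0)")
        setup.commit()
        setup.close()
        with ThreadPoolExecutor(max_workers=4) as pool:
            results = pool.map(lambda _: increment(db_path, "baz"), range(20))
            return sorted(results)
```

Because each call is a complete transaction and shares nothing with the caller, no increment is lost even though four threads are hammering on the same row.  With adbapi the equivalent is to put the body of increment inside the function you pass to runInteraction.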

In theory, there's nothing stopping an asynchronous data-access API from doing all of the same stuff that I just described threads doing.  All you'd need is good non-blocking database infrastructure, non-blocking transactions, and a bunch of code to associate a running transaction object with a particular database transaction and cursor.  It is possible, if you go down to the database-protocol level, to write a database wrapper which actually integrated with the Twisted concurrency model and treated your database as just another source of input and output.  In terms of preventing errors and helping to make code testable and deterministic, I think this would be an improvement over the threaded version of this solution.

However, implementing such an improvement would likely take quite a bit of time.  Time that most small database-backed projects don't have, so it's unlikely someone will need to scratch this particular itch any time soon.  Even if someone did do all that work for one database, it's likely that a lot of it would need to be done over again for each subsequent set of database bindings; so, using a DB-API module in a thread would remain the only way to retain database portability.

For the moment, threads and threadpools are the tools that existing database bindings give us to manage transactions, and it's likely that they're adequate for a huge majority of applications.  The only real problem is that you can't completely hide threads from the application and make sure they're not being used for evil.

Search History: L

I'm a bit late to the party, but I just found an old screenshot of my search history beginning with "L", from my laptop.



An Underserved Market

I play video games.  Also, I'm married.  Ying also plays video games.  More than I do, even.  Where are the games — besides WoW — that we can play together?

I know a couple of other guys who like to play games with their significant others.  I really feel like the gaming generation has grown up at this point, but where are the grown-up games?

My favorite video games are immersive, story-driven games with open worlds and a lot of flexibility.  I am really digging The Witcher: Fewer Bugs Edition right now, but despite its "mature" and "philosophical" themes, it feels like a game written for a "mature" and "philosophical" adolescent male misfit rather than the usual vanilla adolescent male misfit.  That's not really a black mark against this specific title - it in particular seems to pull off the stereotypical fantasy tropes very well.

While the independent gaming scene is a lot better in terms of raw originality, I haven't seen anything I can recall on TIGSource where I thought "That would be great to play with Ying."

This is mostly a rhetorical question (get to making those games, game-makers who are reading this!) but I would also be very happy to be proven wrong.  If you leave a comment with a game we end up enjoying I'll definitely blog about this again.

Installing Software on Linux Doesn't Need to be Terrible: A Photo Essay

Installing third-party software, especially end-user GUI software, on Linux is frequently a disaster.  This is often taken (especially by pundits) as some kind of inherent limitation of the platform, an indictment of its design and core principles, but it isn't.  In fact, the platform has had solutions to this problem for a long time, but it seems like the people who need the solutions the most, i.e. the people packaging commercial software, don't know about them.

I love to complain about this problem.  I'll take this opportunity to do so, as a matter of fact.  Installing the vmware-server management console is a typical commercial-software-on-linux experience: you download an archive, start up a terminal, run a shar executable (as root!) to install their crappy package which doesn't fucking work, you play with obscure tools that a real "end user" is never going to figure out, like ldd and ltrace, and eventually the damn thing starts up.  (Their solution to this problem?  A web-based management console which is uglier, less usable, and still doesn't fucking work.)

I love vmware - well, vmware-server 1.0, at least.  I really like the fact that the management console uses GTK, fits right into my desktop, and seems to behave like an actual application on linux.  (If the gross web UI is some kind of trick to get me to buy Workstation instead of continuing to use Server... well, it might work.)  The packaging of the software, though, is a masterful example of snatching defeat from the jaws of victory.  It's bad enough that every time I have to set up a new vmware installation I spend a few hours surveying the virtualization options on linux to see if anything might ease the pain.

The point of this little write-up isn't to stare unblinking into the gut-wrenching horror of a vmware install on Ubuntu, though.  The various ubuntu forums do that well enough.  Thank goodness for that, too, or I probably wouldn't have figured out how to use it.  No, my purpose here is to show what happens when you do it right.  Installing software on Ubuntu can be as nice as on a Mac.

I've been telling people for a long time to make Debian packages of their software, when they release builds for Linux, but I don't know if I've really communicated why having a package is better than anything else.  Thankfully, Ubuntu has now set up everything so that there is a really obvious reason why you should build a package: it's about a million times more user-friendly than doing the opposite.

Inform 7 has managed to provide an excellent example of how an installation on Linux should go, for an end user.  It just so happens that Inform 7 is not open source.  I installed the whole thing using only the mouse, except for typing my administrator password.  The story begins on their Download page.

http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/screenshot2.png
Notice how this page is clear about which thing I should click on, even if I can't read.  I just need to identify the little penguin, and the little three-hugging-people logo.  It would be pretty weird for an illiterate person to install Inform, but nevertheless, it saves us from reading lots of extra stuff.

So I click on the download link, and Firefox prompts me to decide what to do with this.
http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/screenshot3.png
In fact I would like the package installer!  I click OK (after waiting for it to activate, since firefox has helpfully prevented me from clicking that by accident...).  Now I wait for it to download...
http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/screenshot5.png
Done!  I don't even need to do anything before the Package Installer window helpfully pops up to help me install this package:
http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/screenshot6.png
Now, I click "Install Package", and my keyboard gets involved in this process for the first time:
http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/Screenshot.png
I'm pretty sure I do in fact want to install Inform 7, so I type it in, and the package installer does its magic...
http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/screenshot7.png
http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/screenshot8.png
Yay!  It's done!  I just need to look in my Applications menu to find it.
http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/Screenshot-2.png

And there you have it:

http://www.twistedmatrix.com/users/glyph/images/content/blogowebs/i7essay/screenshot9.png

This is what I want installing third-party and proprietary GUI software to be like on Linux.

Please do not use a shar archive.  I don't want to have to tell nautilus that yes I actually want to run this file which may be an executable text file, and I definitely don't want its EULA prompt on the console to be lost.

Please do not use a zip file.  I don't want to have to drag files out of archive manager or open a terminal.

I just want to click a button and have the whole thing work.  The work required to make this happen is seriously not that substantial.  You can even pay me to do it, but chances are if you've already built an application for linux, you have enough experienced people on staff to do this yourself.

Notice that with very little additional work you could also provide a similar experience on Fedora; I haven't done it myself, but I have to assume that clicking on the little blue icon on a fedora machine would be a better experience than a bunch of shell commands.