Full-Duplex Metablog

I've been doing some thinking about what I use this medium for.  I've come to the conclusion that I'm not really sure.  And yet, I know there are a lot of things I don't use it for.

I try not to write about blogging (although this post is clearly evidence that I do sometimes) because a medium that does nothing but navel-gaze is dull.  I got really tired of reading a few otherwise good authors (names withheld to protect the guilty) who seemed unable to stop writing about how amazing it was that they were writing all this stuff!

I try not to write about politics, because I think it'll do more harm than good.  Here, I will name names, because one person in particular stood out: ESR did a lot to damage his credibility in my eyes with his "Anti-Idiotarian" manifesto and subsequent political blathering.  (Link omitted on purpose. I'd rather remember his useful writing, not go read that junk again.)  My distaste was not really because of the views he espoused, but because they demonstrated the poorly-socialized adolescent perspective on politics that the popular media is so quick to ascribe to all nerds: "If we insult our opponents enough eventually they'll realize our superior reasoning is obviously right".  Now, I can't help but read all of his writing in that light.  I live in mild fear that I might be a poorly-socialized man-child myself.  I don't want to walk around wearing that fact on my sleeve.

I try not to write about personal events, because in many cases they're not that interesting to a wide audience.  For every blog post about how cool blogging is, there are two about how angry somebody is that somebody else is wasting all their time with articles about their cats.  MC Frontalot told me so himself.  (Although I wonder: does anyone complain about how much time is wasted by people forcing them to read blog posts about blog posts about people talking about their cats?  If so, is the irony intense enough to be a carbon-neutral power source?)

I don't bother to write on a schedule, because it seems somewhat arbitrary.  Nobody is desperately waiting to hear my experience of audio problems in Linux.  Nobody's week is going to be ruined if they don't hear me complain about shared-state multithreading yet again.  And of course, keeping to a schedule is hard.

There are lots of things I don't do.  So why am I publishing this stuff at all?

The one thing I can say I already do is write about technical topics.  I'm pretty confident in what I'm saying.  I have complex ideas that I want to refer to later, so writing things up in detail is useful to me personally.  And it seems like other people sometimes enjoy my perspective.

More generally, I think writing is an important skill, and practicing it helps to develop it.  The process of writing itself is enjoyable.  It's like programming, but easier.  I don't have to be right all the time; if I misspell a few words or use the passive voice, the article doesn't crash.  I find writing for an audience a useful way to explore my own ideas, and having an audience is a good way to draw attention to my work.  This is particularly useful in the dog-eat-dog world of open source, where attention is the fuel that projects run on.  It's also nice to keep others up to date with what I'm doing.  Public writing serves as a sort of social lubricant.  It's always nice to start a conversation with, "so I heard on your blog...".  It short-circuits awkward "I-don't-know-what-you're-about" smalltalk, and saves me from repeating myself a lot when asked "what's going on with you".

My motivation, as stated, partially contradicts the above list of self-imposed prohibitions.  If I want to draw an audience, writing about emerging communication technologies, such as blogs (perhaps especially blogs), seems to attract a default audience that is enthusiastic.  New Media geeks do, after all, read a lot of New Media.  Not that I'd have to force it, either: I have a lot of ideas about the world wide net that I could share.  Similarly, the popular reaction I mentioned to stories-about-cats blogs indicates that lots of people read them (albeit with mixed levels of disdain).  Writing about personal events would be useful to some of my audience: members of my family, for example, who don't really care about, or even understand, my technical stuff.  Even writing about politics might clarify my thinking.  Since I'm very concerned about politeness to those I disagree with, maybe I could write about my ideas without dropping an "anti-idiotarian" firebomb.

Writing on a schedule might also improve my writing skills somewhat, in that it would force me to come up with things to write about even when the words weren't bursting forth.  In fact, it was Jeff Atwood talking about how "success" as a blogger has to do with keeping a schedule that got me thinking about this in the first place, both about writing on a schedule and about defining "success" for myself.

If I were on a schedule, I'd have to learn tricks like coming up with concise ways to phrase things quickly, rather than re-editing and deleting and re-editing and polishing in odd moments for weeks.  For reference, I wrote this post in one sitting, so it was easy to time how long it took.  Granted, it's non-technical, so it's a less comfortable area for me, and I wasn't giving it my full attention, but still: it took 9 hours, 4 of which were almost exclusively proofreading and deleting.  If this were a typing test, I'd clock in at just under 3 words per minute.  I could stand to get a bit faster.

Ultimately, these thoughts end up being circular.  I don't know how I'm going to evolve my writing habits because I don't know who my audience really is.  The whole point of writing for an audience is that one has to leave aside one's particular whims and try to communicate what the audience is interested in.  FeedBurner hasn't given me too much information about you thus far.  I know there are about 300 of you.  I know that most of you use Google Reader to read my blog.  Beyond that, the general trend seems to be that later posts have gotten more interest, but that may just be a function of the fact that I'm picking up readers as I go along.  (Plus, FeedBurner never seemed to work quite right for my LiveJournal.  I'm sure I'm missing a lot of data.)

That's the magic of blogs as a medium though, right?  The instant feedback from the audience, transforming the creation process from the simple act of creation to a feedback loop?  From an isolated experience in the mind of the author to a continuous process, a genuine collaboration between creator and consumer? By the way, if you ask me for more blogging about blogging, don't complain if you get more pretentious nonsense like this.  I've got buckets of it.

Please, tell me what you'd like to read.  I'd also like to expand my audience to other people who might not have exactly your interests, so I'd like to hear what you wouldn't mind.  Would it bother you to see the occasional funny picture of my friends' cats?  To read all about my plan to revitalize the economy by investing a trillion dollars of federal money in buying me a lot of really nice computers, cars, and houses?  How about my Culture / Star Trek crossover fan-fiction?

If I do tackle a wider diversity of topics, do you have any preferences for how I should segregate them?  Different blogs?  Tags?  Does it matter if I segregate them, or would knowing that I'm a card-carrying member of the American Reptoid Control Party destroy your confidence in my technical acumen forever, even if you had to click through a bunch of links to find out?

I'd also be interested in when you'd like to read it.  Personally, I've cared about posting schedules for comics and serialized fiction.  I've never really been waiting around for the next episode of Joel on Software or Coding Horror, though.  Have you?  Would you like to see a posting schedule here?

So, there you have it.  Your move, Internet!

The Dark Art of Sound on Linux

Introduction and Goal

Experiments need to be slotted into some larger context of research, and their results need to be communicated to other practitioners. That's what makes them true "experiments" instead of private fetishes.
 — Bruce Sterling, The Last Viridian Note

I've been trying to get a USB headset to work gracefully with a variety of applications on Linux for quite some time.  Recently I had a bit more time to investigate why this is so difficult, and to learn a few things about ALSA.  Inspired by Mr. Sterling, I feel compelled to share the results of this ... experimentation.

I realize that many of the things I am going to describe here are bugs.  Some are feature requests.  If I had only discovered one or two, I'd just file them myself, but in this case I feel that producing a comprehensive report, detailing the consequences of the relationships between bugs in different packages, would be more useful.  However, such a report is only useful as a resource for others to come along, extract the individual bug reports, and do something about them.  I strongly encourage those of you with the appropriate know-how to extract bug reports from this article, report them, and link here for reference.  I will update the article, and subscribe to / vote on any bug reports that get made.  When appropriate, I encourage you all to use the Launchpad bug-tracking service to report these issues.

My purpose here is to provide a snapshot of the state of audio on Linux, and to ask the maintainers of various packages to reflect upon their complicity in this sprawling disaster.  To prompt them, perhaps, to fix the relatively basic parts of the audio stack before enhancing the crazy extra features that it sports.  There is plenty of finger-pointing and pointless whinging about the state of Linux's audio setup, but I haven't found much in the way of a detailed critique.

Let me be clear: every single layer in the Linux audio stack is broken: the ALSA kernel drivers, the ALSA library, the sound servers (Pulse, of course, but also ones that I don't cover here, like ESD, aRts, and Jack), and the applications which use all of these things.  I don't want to excuse their brokenness.  But the developers of these things have each given us some interesting code, for free, and we shouldn't blame them for that unless they are actively opposed to making it work right, which I don't believe they are.  We should give them as much clarity as possible into the nature of the problems, because every layer can work around the deficiencies of the others.

In the meanwhile, if elements of my own jury-rigged setup are helpful to anyone else, I hope that reading about my investigations will be easier than performing your own.  Before you dive into this whole mess, though, I'd like to be quite clear: it is not really possible to have multiple audio devices attached to a Linux machine which arbitrary applications can select from and use.

Methodology

Right now I'm doing everything with Ubuntu Hardy.  When I eventually upgrade to Intrepid, I will post an update that describes any differences in my results.  However, in the course of reading various forums looking for answers to my questions, I have come to believe that the problems persist, not just in Hardy, but in recent versions of Red Hat and SuSE desktop distributions as well.  At least, my main problem with PulseAudio remains.

I'm writing about the problems in the most direct way that I perceive them.  That means that I oscillate between APIs, implementation code, configuration, and end-user experience.  Although I'd like to motivate everyone to make a sound system that Aunt Tillie would find so pleasant and easy to use that it's almost invisible, I think that a sound system which is usable by a deeply knowledgeable, motivated, skilled programmer is a necessary prerequisite to that.  So right now, I'm only looking for something that I can use.

Finally, I'm not really editing this.  It's already been a depressing amount of effort to compile all this information, so I'm trying to write it up quickly and avoid my usual weeks of re-editing and polishing.

The Problem(s)

I have a headset with a USB sound card, and a normal, onboard sound card that is hooked up to speakers.  I would like to use each of these sound cards at different times for different activities.

For one thing, I like to listen to music.  My desk is right next to my wife's; sometimes we feel like enjoying the same music, sometimes not.  So I'd like to be able to switch my music player between my headphones and my speakers easily.

I also like to make and receive calls via Voice-over-IP.  I would prefer to use the headset to actually make the calls, but when a call comes in, as with a regular phone, I'd like to be able to hear it on my speakers until I make it to the desktop.  Importantly, I'd like to be able to do this even if I'm listening to music already.  Without this requirement, I'd probably be fine with just one sound device.  I'm beginning to wonder if it's worth the trouble.

I also play video games.  In Linux, my selection is somewhat limited, but World of Warcraft (when tweaked appropriately) runs very nicely under Cedega.  I'd like to be able to do VoIP and gaming on the same headset.

Doing all of this audio stuff means I need to be able to tell the following applications which audio device to use, and to be able to use the same device concurrently from more than one of them at a time:
  • For VoIP
    • Ekiga
    • Skype
    • Twinkle
    • Ventrilo (via WINE)
    • Adobe Flash 10
  • For Gaming
    • World of Warcraft (via Cedega)
    • Neverwinter Nights
    • Quake4
    • On the Rain-Slick Precipice of Darkness (in other words, Torque)
  • For Media Playback
    • Quod Libet
    • Totem-GStreamer
    • VLC
On MacOS or Windows, telling the equivalent applications what device to use is trivial.  On Linux, I think it's impossible in the general case, and certainly challenging otherwise.

The Solution (Not Really)

I'm sure that, even given this disclaimer, somebody is going to tell me that if I used Pulse, all of my problems would be solved and life would be great.

I've previously written about not using PulseAudio.  In that article I briefly mentioned random lockups and crashes.  That's a pretty serious issue, and it makes Pulse unsuitable for daily use.  I've had others tell me that more recent versions of Pulse are more reliable, and so perhaps this is not such a concern any more.  However, it's not the only problem.

Many applications — including all of my VoIP applications, Ekiga, Skype, Ventrilo (i.e. WINE) and Twinkle — do not yet have support for Pulse.  That should be fine, because there's a "pulse" ALSA plugin which connects ALSA clients to PulseAudio servers.
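
For reference, pointing ALSA clients at Pulse amounts to a stanza like this in ~/.asoundrc (a minimal sketch, using the pulse plugin's standard stanza; it makes Pulse the default for every ALSA client):

# Route all ALSA clients to the PulseAudio server by default.
pcm.!default {
    type pulse
}
ctl.!default {
    type pulse
}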

The only problem there is that the compatibility layer doesn't work.  Until recently, it didn't work for Flash; this has been fixed in the Flash 10 update, but it still doesn't work for Skype.  By the way, if you have any interest in audio and Linux, please click on that issue and vote for it.  Even if you don't care about Skype in particular, other audio programmers are going to look at Skype as an example, so it would be good if they fixed their most serious problems.

The pulse plugin, in my experience (although this is less recent than the rest of the article), also has weird, intermittent issues: audio artifacts creeping into streams from certain applications, timing issues, latency, and programs sometimes locking up for a few seconds when opening the audio device.

Another issue with Pulse is that non-pulse-native applications can't tell that there are multiple devices that Pulse is managing.  The whole point of my attempted setup here is to have different applications use different devices for different purposes.  Applications like Ekiga and Skype don't speak Pulse's protocol, so they see only the single virtual device that the compatibility layer exposes; there is no way for them to choose between my headset and my speakers.

Death of 1000 Bugs

The first and most obvious problem that I face is that although the default device is properly configured so that multiple programs can open it at once, other devices (such as my USB headset) do not inherit this configuration.  If you want software mixing, you have to set it up yourself.  Luckily, I sort of knew how to do that already.  Unfortunately, the online documentation describes a setup that defines lots of extra PCM devices.  These cause programs like Skype (among others) to open lots of half-valid PCMs, emitting lots of scary-looking errors on stderr, pausing while they wait for the device, and generally looking broken.

Application Bugs

First, the bugs I already knew about when I started this.

Most applications choke and don't display a useful error message (or, at best, pop up a modal dialog) when they can't open their device.  They won't tell me what's wrong, so if I don't want to waste time trying to map a PID to a PCM device to an alsa configuration and figure out what the heck went wrong, I have to make sure everybody uses dmix all the time.

Quod Libet has the bad habit of leaving the device open when the music is paused.  Worse, there's no "stop" button, so the only way to free the device is to quit the application.  If anything needs exclusive access to the sound device it's using, too bad.

Flash and most native Linux games (quake4, NWN, OTRSPOD) don't let you configure what audio device they're going to use.  Quod Libet requires you to edit a text file and memorize gstreamer pipeline syntax, which I can never remember.

ALSA Configuration

I set out to define the entire thing in one simple ALSA configuration stanza, understanding each line of it rather than just copying and pasting.  This is the area where I'd like to raise my first complaint: ALSA configuration is really, really poorly documented.  Reading the various wikis, one gets the impression that nobody really understands how it works.

In fact, reading pcm.c, I'm not sure that the ALSA people themselves really understand how it works.  There's no intermediary structure to represent actual devices and such, just the tree of config data itself.  Therefore there's no way to verify that a config file is valid without actually calling a bunch of snd_pcm_* functions.

However, with the help of two pieces of documentation, "PCM (digital audio) plugins" and "Configuration files", as well as some reading of the aforementioned .c files, I worked out what was expected in ~/.asoundrc.

I've annotated the final results of this adventure in the resultant configuration file, which is almost an article in its own right.

Broken By Default

Another complaint here is that ALSA's policy seems to be as broken as possible by default.  When you start defining a device, you get a device that can't do software mixing.  Why isn't dmix just part of a regular PCM?  What possible value does making device access exclusive have?  If there is some value, why isn't it explained anywhere?  Okay, okay, so I'll set up dmix.  Wait, now I can mix input, but can't multiplex output?  Why do I have to learn about dsnoop?  All right, now I've set up dsnoop.  Now I have to learn about "asym" to put them together.  Okay.  Wait, now 'arecord' can't record from the device?  What?  The sample rate is wrong?  Oh, I have to use the "plughw" plugin to allow mixing at any sample rate.  Wait what?  dmix doesn't work with plughw?  Oh, I have to wrap a "plug" around the outer device?  Why?  Wait, now that I've set this all up, I have to turn on access to other users who already have been explicitly granted permission to write to this audio device?  And of course, if I unplug and plug in my USB devices in a different order, or restart my computer, all my configuration is now wrong, because all the examples use card indexes rather than more stable identifiers.  So I have to go find the (mostly undocumented) stable identifier and start using that.
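
To make the destination of that whole journey concrete, here is a sketch of the sort of stanza it converges on.  "Headset" is a placeholder for whatever stable identifier your driver actually reports, and the ipc_key values are arbitrary (but must be unique on the system):

# "plug" wraps everything for sample-rate conversion; "asym" pairs a
# dmix playback device with a dsnoop capture device on the same card.
pcm.headset {
    type plug
    slave.pcm {
        type asym
        playback.pcm {
            type dmix
            ipc_key 2048
            slave.pcm "hw:Headset"
        }
        capture.pcm {
            type dsnoop
            ipc_key 2049
            slave.pcm "hw:Headset"
        }
    }
}
ctl.headset {
    type hw
    card Headset
}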

Which device?

However, now that I've defined my custom device, there's still another problem.  From my list of software above, Skype can see the new device, but Ekiga and Twinkle can't.  Neither "aplay -l" nor "arecord -l" shows my new device.  Meanwhile, Ventrilo, interrogating the Windows sound API through WINE, gets a list of devices which always contains "default", but randomly contains "dsnoop:0" and "dmix:0", or sometimes two or three devices with blank names.

A workaround is possible with aplay, arecord, and Twinkle, all of which allow me to explicitly supply a device.  Leaving WINE aside for the moment, I decided to investigate why it was that Ekiga (purportedly as desktop- and sound-savvy a program as one is likely to find) could not see my custom device.

A Detour: The Mystery of ALSA Device Enumeration

Ekiga uses a library, Portable Windows Library, or PWLib, to address audio devices.  When the ALSA PWLib plugin lists devices, it follows the example of the ALSA utility "aplay", which is to say it uses the API to list only hardware devices.  Ekiga has an explicit provision for the "default" device, realizing that someone may have reconfigured that to do something dynamic, but it has no provision to allow you to select a different one (even by explicitly typing it in).
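
For illustration, the hardware-only enumeration that aplay and PWLib perform boils down to something like this sketch, built on the documented snd_card_* calls (compile with "gcc -o listcards listcards.c -lasound"):

#include <stdio.h>
#include <stdlib.h>
#include <alsa/asoundlib.h>

int main(void)
{
    int card = -1;
    /* Walk the kernel's list of sound cards.  PCMs that exist only in
       the configuration tree (like my "headset" device) are invisible
       to this loop, which is exactly Ekiga's problem. */
    while (snd_card_next(&card) >= 0 && card >= 0) {
        char *name;
        if (snd_card_get_name(card, &name) >= 0) {
            printf("hw:%d - %s\n", card, name);
            free(name);
        }
    }
    return 0;
}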

The Bluetooth-Alsa project has also noticed this problem, and has cooked up a patch which looks a little silly to me, hard-coding the device name "headset".  Oddly this would have fixed my problem, even though my "headset" device is not (currently) bluetooth.

After discovering all this, I resolved to find the "right" way to enumerate ALSA devices.  Skype was the only "good" example, but since I couldn't get at its source code, I spent a while downloading the source of various other programs.  Eventually I discovered an example in PortAudio which yielded similar results to Skype's introspection.  I implemented a brief C program of my own to verify it.

The lesson here is for programmers: if you are writing an application or library which uses libasound directly, you need to enumerate the configuration hierarchy under "pcm" and make some enlightened guesses about which devices are interesting.  It's not completely correct, but it will at least get you a list that contains the stuff the user wrote in their ~/.asoundrc.
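
Here is a minimal sketch of that technique, modeled on what I found in PortAudio.  It just prints every name defined under "pcm" in the configuration tree, user-defined devices included, and leaves the enlightened guessing as an exercise:

#include <stdio.h>
#include <alsa/asoundlib.h>

int main(void)
{
    snd_config_t *top;
    snd_config_iterator_t i, next;

    /* Load the global configuration: the system files plus ~/.asoundrc. */
    if (snd_config_update() < 0) {
        fprintf(stderr, "could not load ALSA configuration\n");
        return 1;
    }
    /* "pcm" is a compound node with one child per PCM definition. */
    if (snd_config_search(snd_config, "pcm", &top) < 0) {
        fprintf(stderr, "no pcm section in configuration\n");
        return 1;
    }
    snd_config_for_each(i, next, top) {
        snd_config_t *entry = snd_config_iterator_entry(i);
        const char *id;
        if (snd_config_get_id(entry, &id) >= 0)
            printf("%s\n", id);
    }
    return 0;
}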

No really, which device?

So, in the absence of any trick to convince Ekiga and friends to actually look at my newly built configuration — and remember, WINE was even worse — I needed a way to trick the ALSA library into paying attention to environment variables.  As it happens, ALSA does pay attention to environment variables!  Unfortunately, it pays just enough attention to hurt.

In its stock configuration, ALSA respects several environment variables: ALSA_PCM_CARD and ALSA_CARD (which mean "what hardware card to use by default"), and ALSA_PCM_DEVICE (which means "what hardware device to use by default").  None of these options allow you to specify an additional config file.  None of these variables allow you to select a non-hardware PCM by default.
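
For the curious, the stock alsa.conf consumes those variables with the configuration system's getenv function, roughly like this (paraphrased from the shipped alsa.conf):

# The default card number is taken from the environment, falling back
# to card 0.
defaults.pcm.card {
    @func getenv
    vars [ ALSA_PCM_CARD ALSA_CARD ]
    default 0
}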

And it's quite tricky to tell it how to do that.  There's an example on the ALSA wiki which describes how to do this if you're using Pulse (which, sadly, I am not).  ALSA provides some very useful configuration on the default device, including (in my case, at least) automatic dmix, and I'm not really sure how it does it, so I don't want to override it for most applications.  In the case of the unusual, uncooperative ones, I wanted to be able to set an environment variable.

The configuration language for ALSA really sucks, which is weird, because it contains a LISP implementation that would have been perfect for doing this sort of thing.  However, asound.conf does not have support for conditionals (so you can't say "if this environment variable isn't set, omit this stanza").  Most importantly, the user configuration is evaluated before the system configuration, so even using the extremely primitive facilities available to you, you can't copy the default pcm.default and ctl.default declarations before you decide to override them with your own.

I discovered a fun fact at this point: while "pcm.!default = pcm.default" will just exit with an error, more creative forms of looping (e.g. with "getenv") will segfault applications that use ALSA.  You can probably guess why.  So this ad-hoc mini-language is just complex enough to be dangerous, but not enough to be useful.  This is a good example of the repeated theme that once you need to invoke functions, it's better to just use a real programming language.

With enough head-banging, I eventually realized that while I couldn't conditionally load a configuration file, I could decide to load a default (i.e. empty) file or a file of the user's choosing based on an environment variable.  While this definitely isn't as graceful as "sh -c 'ALSA_PCM=mypcm ekiga'", it works, and I can put it into shell aliases and desktop icons and just forget about it.  You can see the result in my config file.
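
In sketch form, the trick combines the standard "load" hook with the getenv and concat config functions; something like this in ~/.asoundrc:

# Load an extra config file named by $ALSA_BONUS_CONFIG; when the
# variable is unset, fall back to loading the empty file.
@hooks [
    {
        func load
        files [
            {
                @func concat
                strings [
                    { @func getenv vars [ HOME ] default "~" }
                    "/.asoundrc."
                    { @func getenv vars [ ALSA_BONUS_CONFIG ] default "empty" }
                ]
            }
        ]
        errors false
    }
]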

Finally I created both .asoundrc.empty and .asoundrc.defaultheadset, which contained simply:
pcm.!default "headset"
ctl.!default "headset"

Okay, this device!

Now, I was ready to test out this setup and see if I could convince e.g. Ventrilo to use my headset, while most programs would start up with my speakers.  Experimenting with ALSA_BONUS_CONFIG using quodlibet, aplay and friends seemed to work fine.  Great!  Only 20 hours, two or three dozen C files, and a painstakingly custom-configured system later, I could use my headset to listen to music!

Surprise!  It Doesn't Quite Work

While most programs work OK in this configuration, it turns out that WINE crumbled under more than superficial testing.  Ventrilo starts up and plays some sounds, but if it's configured to use the dmix playback device, after the PTT button is pushed once, it chokes.  It doesn't matter if I use the OSS drivers with ALSA emulation, a different asym with the hardware capture device, or whatever.  dmix plus Ventrilo equals broken.  Except that it does work on my actual default device, if I plug my mic into the regular microphone port rather than into the USB sound card.  Supposedly my default sound card is using dmix as well, so where's the problem?

Although the circumstances are different, this is very much like the last time I was messing with Linux audio, trying to get Skype and Flash to work with Pulse.  They didn't work with the Pulse plugin, but there was no indication why.  Audio skipped, it stopped, the sound server locked up, but in no case did I get a useful error message, or even a distinctive enough error message to google for.  The failure mode of pretty much every layer of the audio stack (and as I have just demonstrated, they all fail a lot) is to emit silence and record silence, to freeze, and to have applications crash before they start up, leaving the user wondering what happened.  It would be a lot less depressing trying to find and diagnose bugs if there were more error messages that made sense.

Still, this hasn't been a complete waste.  I have an environment variable which can select an appropriate default sound device for a given process.  Even if pulse does start working for me, that should be useful for the huge piles of applications that don't directly support it yet.

Probably Enough for Now

My fight with ALSA isn't quite over, and I haven't reached any terribly useful conclusions beyond "this sucks".  I do have a few suggestions, however.

To Audio Infrastructure Programmers

I'd like to say "make everything work", but I realize that isn't realistic.  For now, please just give those of us who are trying to slog through all of this stuff some better tools to understand what is going on, and to debug it.  See how your spiffy sound driver/plugin/library/server works with existing applications.  Even — or perhaps especially — with proprietary applications that users have little recourse to change.

The most frustrating thing about spending so much time trying to make such a simple use-case work is that I've learned so little of value.  I want to provide really useful, detailed bug reports, but at best I understand something somewhere is setting the wrong sample rate, and at worst I am completely baffled.

To Audio Application Programmers

The audio landscape in Linux is obviously a mess.  Sorry.  That doesn't mean you can get away with supporting just one of the panoply of sound mechanisms available, though.  Use PortAudio or OpenAL or something if you possibly can, so that you can leave the work of choosing whether to try Pulse or ALSA or OSS or Jack or whatever to someone else.

Please test your applications with more than one sound plugin.  Try it with dmix and dsnoop, try it with pulse, and try it with at least a USB sound device and an internal sound card.  It would probably be good to try it with more than one kind of internal sound card, too, since the drivers apparently vary a lot.

It would also be good if some of you could start talking to the infrastructure guys and maybe agree on some kind of standard for telling applications where their audio should go.  We have $DISPLAY for a reason; there are plenty of times when the "default" isn't really the default.

To You Poor Users

If you possibly can, stick to just one sound card in Linux.  I did that for a while and it worked OK.  It's obviously possible to go beyond that, but depending on what kind of card you're getting, you might have problems.

One trick I've learned is that if you want to exclusively use something other than your onboard sound card, you can disable the onboard card in the BIOS.  This is a lot more reliable than telling Linux to do it, and you still get the benefit of Linux thinking that your other sound card is the "default", which somehow gets various bits of ALSA mixing magic (which I still haven't entirely figured out) applied to it.

Hopefully by 2010 all of these concerns will be moot, though.  The Linux world has been moving shockingly fast in every other area; maybe it's time for audio to catch up.

The X Window System

Sometimes people ask me what my desktop system looks like.  I use the X Window System.



The X Window System supports bit-mapped displays with multiple color depths, from black-and-white to the millions of colors shown here.  It supports overlapping windows, multiple fonts, keyboards, and pointing devices such as "mice" and "trackballs".

With a sufficiently powerful display adapter, you can also run popular games such as "World of Warcraft" by Blizzard Entertainment.



Okay, that's enough of that.  Please forgive the self-indulgent nostalgia and inside humor — I hope some of you enjoyed it.

To those of you who have no idea what I'm on about: when I was younger, I was really interested in different operating systems and desktop environments, and read a great deal about them.  At the time I was doing this (circa 2000) I was already using a Linux machine, and therefore X windows.  It was a pretty reasonable (if somewhat rough and DIY) desktop environment at the time, but every time I ran across some online publication talking about it, the included picture was some hilariously improbable shot showing TWM, XBiff, XClock, and XLogo.  Inevitably there would inexplicably be an XEyes window as well.  Who would actually run a program that did nothing but display a logo?  Wouldn't a pair of googly eyes following your mouse around be distracting?  Why would you run TWM when you could run WindowMaker?  To go with this improbable screenshot, there was typically a retro blurb explaining that it supported "bit-mapped graphics" and "multiple colors".

These descriptions were written at a time when such things were taken for granted (and not written up in contemporary descriptions of, for example, the BeOS desktop environment).  I suppose the authors were somewhat lazily copying from ancient marketing copy, unable to make sense of the bizarre constellation of window managers and desktop environments surrounding X itself.

Unfortunately most of these pages have since faded from the web, but as an ironic example, Wikipedia still has an article that mentions "raster graphics" and "pointing devices" as well as the traditional screenshot.

Unlike other systems, which throw their history down the memory hole, every major Linux distribution still ships with a recent, up-to-date copy of these archaic tools.  So, every once in a while, I choose "TWM" from the "sessions" menu in GDM and have a chuckle.  Once Cedega allowed me to replace the traditional "xdémineur" with WoW, I realized this screenshot was just begging to be taken.  If you're curious what my desktop actually looks like, here's me switching between windows while writing this article:



Maybe one day Steve Holden will ask me to participate in "On Your Desktop" and I'll go into a bit more detail.

Zork: Now In Full HD

I am a fan of interactive fiction.  While it would be an understatement to say that the medium has experienced a bit of a lull since its heyday in the era of Infocom, I have been fairly impressed by the IFComp competitions of recent years; really enjoying these games as new, unique experiences rather than nostalgia or kitsch.

One thing that has been consistently irritating, though, is getting all of my fancy, modern hardware to play these games in a satisfying way.  The one place I definitely don't want any nostalgia in this experience is remembering the 14-inch flickery CRTs that I played these games on in my youth.  I could point fingers at various unsatisfactory pieces of software, but the fact is that with the increasing variety of interpreters (Glulxe, Hugo, TADS, and Z-machines of various versions), it is sometimes frustrating even to get these games to run on Linux, let alone look good.

Enter Gargoyle, a classy looking, multi-platform, multi-interpreter interactive fiction system.

My original experience with Gargoyle, several months ago, was actually pretty bad.  It looked nice, but it took forever to figure out how to compile it.  I had to patch it, and I couldn't figure out how to do it gracefully enough to compile on anybody else's system.  When I did build it, it opened a small, apparently fixed-size window with a fairly small font.  Not that good for an immersive experience on a desktop, and terrible for playing on the couch on my HDTV across the room.

Today, I discovered that, first of all, it builds now.  Not only that, there are some packages for Ubuntu if you are adventurous enough to try them.  Finally, and perhaps most importantly, I discovered that you can configure it pretty easily.  In a few minutes, thanks to the "Toggle Fullscreen" key in Compiz's "Extra WM Actions" plugin, I had a config file which looks really nice either on a "full HD" (1920x1080) TV or WUXGA desktop monitor.

Thanks to Gargoyle, here is what Zork looks like today:



Get your copy of "Zork HD" (by which I mean, "my ~/.garglkrc file") here.  Toggle it full-screen and enjoy.

The Emacs Test

I use Emacs.  However, unlike some Emacs users, I don't treat it as a religion.  In fact, I'd rather be using a more "modern" IDE; one that understands my code on a deeper level and provides things like refactoring tools, integrated debugging, and "view method implementation" that work reliably and don't require weeks of configuration effort to use.  One that uses modern UI conventions instead of arcana from the 70s so that my friends who are not emacs-heads can quickly wrap their heads around what's going on on my screen, and perhaps dare to touch my keyboard.

However, even if one is keen to do it, switching away from Emacs is a big deal.  I see lots of editors that advertise "emacs keybindings".  While I appreciate the effort, these features always look like someone who has no idea how to use Emacs worked through some kind of quick cheat-sheet of features like "keybinding for Save", "keybinding for Save As", "keybinding for Close Window" and just added them one after another.  Sometimes, with no regard for whether these keys conflict with other shortcuts!  (I'm looking at you, gnome "key-theme".)

Do you think you can write an editor which can replace Emacs for me?  Here are a few features, taken both from my years of customizing Emacs to meet my needs and some basic features in Emacs itself that non-natives never seem to understand.

I'm leaving out the extremely basic stuff like "syntax highlighting" and "automatic indentation" since most editors do OK on those fronts already.  These are the subtler things, the ones I find are broken almost everywhere outside of Emacs.

Can you do what I mean when I press the "go" button?

When I edit code, I repeat these steps endlessly:
  1. Edit a test file.
  2. Run the tests.
  3. Edit an implementation file.
  4. Run the tests.
In order to do this in my current emacs setup, I open my implementation file, then open my test file, then press F9.  Then, I edit a little bit, and press F9 again.  Then, I switch to the implementation file, type a little bit, and press F9 again - Emacs knows which tests to run, because the implementation file has an annotation in it which describes the test-cases that are associated with it.

When I'm done, I push F5 and it immediately jumps my cursor to the error that I'm working on.

This works for me in Java, in C, and in Python.  I've got a little bit of custom emacs-lisp code that I wrote to do each of them.  I'm willing to write a little more.  But, is it very easy in your editor to grab a global keybinding?  To write a plugin in 4 or 5 lines of code that just formats a command-line string to run?  To parse the output of a subprocess?  To visit a file and line number without requiring further user interaction?

Can I reach the thing I need to work on fast?

Let's say I'm working on a file called "foo.c".  I want to open "bar-baz-boz-qux.c" in the same directory.  In Emacs, this is probably just "control-x control-f b <tab> <enter>" - maybe a few more letters if there are other "bar-" files.  Do I need to hit more buttons than that in your editor?  Do I need to reach for the mouse?  Do I need to navigate the inefficient-even-from-the-keyboard GTK file dialog?

Now, let's say I've got the file "foo.c" open in several different branches of the same project, and I want to quickly alternate between them, display them side-by-side, etc.  In Emacs, "C-x C-b" will bring up a list of every open file, and as I type its name, the list is reduced to only those that match what I've typed so far.  So if, for example, I type "foo.c", I'll get all my "foo.c"s.  I can cycle between them with C-s until I get the one that I want.

I don't want to hunt around and click on a tab-bar or hit "next next next next".  I just want to type in a little bit of the file name and have the editor figure the rest out for itself.

Can I use it on Mac, Linux, FreeBSD, Solaris, and Windows?

Emacs is extremely portable.  Every operating system I want to work on (and many, many that I hope I never do) can run it.  I don't want to invest the energy to learn a new editor if I can't use it everywhere.

Can I use it remotely over the internet?  Collaboratively?

You can cheat on this one: if you can run in a terminal (and under Screen), then you get both of these for free.  Emacs cheats that way.  But if you can't run in a terminal - no, I'm sorry, VNC is not an acceptable substitute.  It doesn't perform adequately over the internet and it probably never will - as the internet gets faster, the average display gets bigger and has more colors on it.

Your editor probably can't run in a terminal.  So you'd better give me a way to pair-program with people over the internet and a way to access the editor on my desktop machine when I'm away from home.

Can I see whitespace?

I don't like to leave invisible droppings in files when I edit them.  I'd like to be able to see trailing whitespace, highlight it, and eliminate it.  I'd like to be able to tell if I have any tabs in my files (Python does not like that very much).

Do I need to carefully juggle my clipboard - or do you have a "kill ring"?

Normally, if you cut some text, then cut some other text, you lose the first text — unless you use "undo" or something like that and screw up your editing state.

In Emacs, when I cut five or six different pieces of text, and I go to paste them, I can paste any of them.  I don't have to carefully remember what's on my clipboard, because the last 60 or so things that I cut or copied are around in case I need any of them.

Do I need to carefully remember where I was and scroll around to get back there - or do you have a "mark ring"?

Emacs has what amounts to a "back button" for your text editor.  If I edit something interesting, go to another window, go to another project, and then want to jump back a few steps, I can easily do that.

This is particularly helpful when, for example, inserting some import statements.  I'm in the middle of a function in the middle of a big file.  I want to use the Foo class, but it's not available yet.  So, I jump to the beginning of the buffer, type "from baz.bar.foo import Foo" and then hit "C-u C-space" to jump back to the middle of that function I was working on.  "C-x C-space" does something similar, but can even take me to different files.

Can you do smart word-wrapping?

/**
 * When I have a long documentation comment in ActionScript, JavaScript or
 * Java, Emacs will helpfully wrap it like this.  If I make changes in the
 * middle and then re-wrap it, it stays wrapped and helpfully adjusts the
 * placement of the asterisks along the left-hand side.  Can your editor do
 * this?
 */

# But when I have a comment in Python, it's formatted like this.  I don't have
# to tell Emacs anything about the different comment styles.

-- For that matter, it can understand and properly format SQL comments too.
-- And C/C++, and Ruby, and PHP, and more.

"""
If I format code inside a docstring, it flows properly too.  Granted, there are
a lot of bugs in this particular case in the stock Emacs, but since it works
everywhere else I have written some workarounds.  (You could always work around
it by inserting some extra blank lines before wrapping, but that always
bothered me.)  Can I customize how your flowing works if I don't like it?
"""

Is there version-control integration?

If I'm editing a project that uses bzr, darcs, git, svn, hg, perforce, or cvs, I can get a nice "status" page as a jumping-off point in Emacs to show me which files are in version control and which files I've edited.  I can update, commit, pull, push, and diff, all without leaving the editor.  Can your editor do that?  And I don't just mean, can your editor do that for SVN.  Does it support all of the systems I just named, and a few others for good measure?

Can I tell what I'm working on?

I don't like having to scroll around to figure out what function I'm in the middle of when I forget.  I work on a lot of code, and I browse a lot of code, and sometimes if I'm in the middle of a 200-line-long function I can't see the class or the function name.  Emacs has a feature called "which-func-mode" which allows me to glance at the bottom of the screen and instantly know what function I'm working on.  Fancy, glowy sidebars with tree-views of my whole source file and inheritance hierarchy are great, but can I always see the name of the class and the method that I'm working on now?  Even if there are so many other methods on that class that the fancy method list on the left has to scroll?

Last but not least...

Can I code for it?

I'm a programmer and I need a programmer's editor.  But I'm not an IDE developer: I don't want to write giant, heavyweight plugins.  I want to be able to quickly toss off a snippet of code which modifies the editor, and to organically grow my own modifications when I find myself doing some task frequently.

For example, I have my own "snippets"-type module, "quick-hack mode", which does a ton of clever-clever things like inserting
def (self):
    """
    """
when I type "def" inside of a "class" block.  (of course the "def" is omitted otherwise).

I have a hotkey to turn this off in case other people find it annoying.  It's difficult to be ambivalent about this mode; you either love it or hate it.  It's a very personal thing, and I don't expect your editor to support it directly.  This mode was developed after years of observing my own peculiar tics while editing, and crafting conveniences to support them and free me from distractions.

Since developing this kind of support code isn't my main interest, coding for the editor needs to be more than possible, it needs to be easy.  I need interactive help, the ability to load a brief snippet of code into the editor without restarting it, and a reasonable debugger so that when it blows up in my face I can at least sort of tell why.

Do you feel lucky?  Well, do you?

Emacs has a lot of features.  You don't have to replicate them all.  But if you want me and the millions like me — okay, well, maybe not millions, but it's not just me either — to switch to your shiny new editor, you'd better be doing all of these things and doing at least one or two other totally awesome things that Emacs can't do.