Why Axiom Doesn't Expose SQL

I didn't put the full title of this article into the "subject" because it was super long.

It Is Pretty Much a Bad Idea to Expose Raw SQL Through Your Database Access Layer

or

Fun Things I Found Out About Your Company With Administrator Access to your Database

I was originally inspired to write something about this when I read Jonathan Ellis remarking that ORMs should include direct SQL access because Django recently added a different, but still dodgy, 'OR'-operator support to a syntactically disappointing ORM query syntax. However, it really came to a boil for me when I found that Divmod had been running some third-party software with an SQL injection vulnerability. (Yes, we have since patched it, no harm done.)

Security experts have long known that code-injection attacks are pretty easy on many popular programming platforms, and you should take steps to prevent them. It's easy to find commentary on this. Stephen Thorne has had many amusing and insightful things to say about PHP's vulnerability to SQL injection attacks, as well as the occasional dig about just injecting PHP code itself. If you're looking for something more serious, Steve Friedl has written a fairly comprehensive guide to understanding, executing, and preventing against SQL injection.

An ORM's job is to provide an alternative interface to a database. Interfaces should be complete things, not broken fragments of utility which require manual crank-turning to function. If you have to use a different mechanism to access the database within your application, the ORM is incomplete and should be fixed. Sure, many programmers who use ORMs also know SQL, and that is a useful skill, because today these are in closely related problem domains, but they should not have to use SQL within the same context that is using the ORM.

The python "os" module provides (among other things) an alternative interface to a large portion of the POSIX C API. As I said, interfaces should be complete things. If you use a different mechanism to access the POSIX C API, the "os" module is incomplete and should be fixed. Again, sure: many programmers who use Python's "os" module also know the POSIX C API, and that is a useful skill, because these are related problem domains, but they should not have to use the POSIX C API within the same context that is using the "os" module.

In both cases, you can generate the underlying code yourself, and in both cases, people sometimes really need to, so the fact that you can is important. However, Few C programmers ever want to drop back down to C when they're using Python, and will rightly avoid it (as a complexity cost) when they can; yet many Python programmers who use ORMs frequently and loudly declare that they want to use SQL all the time.

Not to pick on Mr. Ellis. The syntax he's reacting to really is abhorrent (although that's no comment on Django as a whole), and the tremendous Ruby on Rails movement seems to largely agree with his point. There are good reasons that people want SQL access from within ORMs. It's simple: most ORMs are really, really awful. They are heavy on the "object" and not so much on the "database". There are a lot of features that SQL provides which they don't expose.

If you use an open-source ORM and find yourself wanting to use SQL to get at one of those features, consider not clamoring for (or not using, if one exists) an SQL execution back-door included in the library. Instead, consider ways to integrate that SQL feature with the existing structure of the ORM, or an extension which wraps that SQL feature on top of the ORM and only generates the SQL you need in one small place. Obviously this goes double if you are a user of Axiom - I do my best to accept any patches that expose new database features that were previously obscured.

Don't just accept the status quo and generate SQL strings from within your application. Originally this post was going to be longer and talk about API structure and communication between programmers and preconditions and postconditions and all kinds of fancy computer-science garbage, but I think it would be better to leave you with just this one thought - the security implications alone are more than enough reason to be extremely sparing, and careful, with the places that your code generates SQL. Isolate it, test it, audit it, and don't make it a habit.

Update: This article is confusingly titled. In fact, Axiom does provide an API for getting at SQL. Store.executeSQL. The point I am stressing here is that axiom does not "expose" it in that it is not a supported, public API, and if you have to call it, Axiom is broken and you should let Divmod know what you needed it for. To attempt to totally deny access to that layer would be unwise; as I said earlier in the article, "you can generate the underlying code yourself, and in both cases, people sometimes really need to, so the fact that you can is important".

d00d ur pr0n warez sploits r pwnd

Microsoft has a guide to "leetspeak", or as they call it, "kidtalk", clearly written by someone who doesn't understand either leetspeakers or languages. It's unintentionally hilarious in many ways. Some highlights:
  • The first word Microsoft thinks it's important to introduce to parents is not "pr0n", or "cyber", or even the actual meaning of "leet"1, but "warez". The worst thing your child might be doing online is apparently copying Microsoft's hard-earned "intellectual property". As they put it "The first series is of particular concern, as their use could be an indicator that your teenager is involved in the theft of intellectual property, particularly licensed software."

  • This page purports to help you understand language, but who helps you understand Microsoft's incomprehensible doublespeak? At the bottom of the article, when asked "Was this information useful?" I clicked "Yes" to be greated by this semantic gem: "Object reference not set to an instance of an object. We are experiencing technical problems. Sorry for the inconvenience. We are still interested in hearing your comments if you have time to provide your feedback. You can do one of two things. You can close this window, refresh your browser, and submit your comments. Or, you can try later."

1. If you don't already know, "leet" means "elite", which in the world of online teenage hooliganism means something like "popular with the in crowd". Someone is is "leet" if they are worthy enough or have proven themselves in a variety of totally arbitrary challenges. The movie "hackers" captures this eloquently in a scene where the protagonist is quizzed on the contents of various technical specifications by giving both the slang name and the technical name of various colors of book, to demonstrate his knowledge. Us, uh, legitimate programmer types occasionally use leetspeak ironically, and online gamers slightly moreso, but if your teenager is using so much leetspeak that you need a glossary to deicpher their online communications, it's likely that the people they're talking to are not the good guys. More importantly than worrying about whether your teenager has illegally acquired a copy of Microsoft Office, you should be concerned if they are spending a lot of time and energy trying to prove themselves capable to a group of people who are, like as not, professional criminals.

Hmm, this footnote is now longer than the rest of the entry. Perhaps I will save the rest of this thought for another post...

Ubuntu is the only game in town

Due to a variety of problems, some caused by me, I ended up re-installing Linux and Windows on my main desktop machine this weekend. Given that I was re-partitioning and re-formatting and such anyway, I thought I'd take the opportunity to sample the various free operating systems that I normally wouldn't try out. It was an interesting experience. Here are some of my reactions:

  • Windows: Excruciating, but at least reliable. It took well over 20 reboots, and hundreds of clicks to get Windows to a barely usable state. No hardware was auto-detected. Sound didn't work out of the box. The one saving grace is that I never had to edit a text file, but I did have to know the vendors and model numbers of at least my motherboard and my sound card to get it to work. However, it does have the nasty habit of totally killing your boot sector immediately when you put the Windows CD in, so it did have the benefit of being installed first. I suspect the BSDs would have fared better with this advantage. It never ceases to amaze me that basic functions like copying files do not work on Windows; even a simple cloned install of World of Warcraft doesn't work, presumably because some registry keys are missing. On Linux, my entire home directory was cloned (between different architectures, even! i386 to x64) with no difficulty. This includes a cloned installation of WoW with Cedega, which ironically I can now play on Linux, but will take 5 more hours of installation to be ready on Windows.

  • Ubuntu Breezy: EVMS crashes hard and often on newer Intel motherboards. is still trying to figure out why, but I've given up, especially considering that the problem seems to be fixed in newer kernels. The installation experience, however, was smooth - with the exception of setting up the proprietary drivers for X, which was a simple matter of copying over my old xorg.conf. One reboot.

  • Ubuntu Dapper (Flight CD 2): Identical to Breezy installation experience, except everything was faster and more stable. The bootloader disabled my USB keyboard though - why? It turned back on immediately when I got to the installer. EVMS problems appear completely gone.

  • Fedora Core 4: Very slick looking installer, but harder to use than the Ubuntu one. The partitioning setup was especially grating: I started with Fedora and later had to back out to an Ubuntu install to get the disks set up as I wanted (I didn't want to put any of my OSes on primary partitions since I was installing so many). It didn't recognize my network card - I have a bog-standard onboard card, supported by the linux e100 driver. I even entered it in their network configuration window. I typed in its IRQ. It is almost 2006, and I typed in an IRQ. Still, it did not work. (An IRQ though? Why don't you make me twiddle some jumpers during the install, or ask me to enter a modeline by hand? Seriously, redhat, welcome to the nineties, we do not type IRQs into text fields, or for that matter anywhere, any more. Put this on your big Web Two Point Oh checklist, pundits: do not make your users enter a damn IRQ.)

  • Gentoo: booted a liveCD. Recognized my hardware almost as effectively as Ubuntu. (I didn't figure out X configuration; something was wrong with the integration with proprietary Nvidia drivers.) After three hours of compiling things, though, I just gave up. Who has the time for that? (Keep in mind I am saying this as a guy who had the time to try to install like ten operating systems.)

  • FreeBSD: Can't install on logical partitions. Didn't recognize my network card. Didn't recognize my sound card. Didn't recognize my video card. Couldn't mount a USB thumb drive. I got what looked like a kernel panic when I accidentally pushed a key that was not listed on the menu in "fdisk". Red Hat at least gets points for trying. I keep trying to think of a reason why anyone would install a BSD but every experience I have like this reinforces my prejudice that it is just a slow, buggy linux.

  • DragonFlyBSD: Same as FreeBSD, but with a shell prompt and no installer.

  • FreeSBIE: Booted, started an X server, but that was it. Still couldn't recognize my ethernet card. I could install onto a secondary partition from here, but the directions for formatting partitions appeared to be incorrect (or the code for dealing with partition tables is buggy, take your pick).

  • ReactOS: technically booted, but couldn't recognize the CDROM, so couldn't bootstrap into anything useful.

  • Syllable: wouldn't boot.

  • NetBSD: wouldn't boot.



I don't have any plans to offer any deeper insights into this process, or help anyone install any operating systems as a result of this little rant. I have also made no attempt to be fair, and I don't want to do this again any time soon, so if your favorite operating system got trashed here, don't bother to tell me that I'm not being fair. Think of it as a pop quiz!

Thank you, Ubuntu, for providing me with an operating system that can perform basic functions such as networking and playing sounds without installing a million extra drivers. It seems like you are the only one that can do these things. Now could you please make it so that crazy Enterprise Systems do not cause my computer to crash horribly at random.

Unreasonable Defaults

This is really just some useless whining, but my name is glyph, for goodness sake. Sometimes I just need to complain about fonts.

Firefox for Linux always seems to ship with the crappiest possible font configuration. Why? This is what the fonts look like by default on a fresh install of Firefox 1.5, and also in Ubuntu's unreleased "Dapper" distribution's current 1.5 package.


badtext


Now, here is the same text rendered using fonts that are packaged with the distribution, and in fact, almost all Linux distributions these days:


goodtext


"FreeMono", "FreeSans", and "monospace". Was that really so hard?

Python Namespace Problem

This problem periodically comes up, and generally an unsatisfactory solution is proposed, the cure is worse than the disease, it's rejected, and we ignore the problem for another year.

The 1:1 correspondence of modules to files in Python is handy, but sometimes inconvenient. I would like to be able to write large, complex packages in Python but constrain the interface that they present, so that users only have to import one module in order to use them.

What is a good way to do this?