The Olestra of the Web

I suppose if you do not know what Python is or what the Web is, maybe you don't know this, but Guido has been asking about web frameworks.

I wish I had something more substantial to throw into the discussion here, but if you're interested in my work on the subject, you know where to find it.

Python's web situation is a mess; there's no denying it. Even if there were some totally awesome framework that was head and shoulders above everything else and that everyone should be using, it would take a while at this point to figure that out and reach consensus, just because of the sheer amount of work required to get everyone's attention and, once the right people had found it, to focus the larger community on it. Unfortunately I think it's clear that no such silver bullet exists, and that there is unlikely to be one any time soon. Even if there were, larger projects like Mantissa and Zope would still exist, because it's not likely that most people's criteria for "web framework silver bullet" include "integrated digital telephony".

The general frustration with (and bewilderment at) this situation has been breeding a poisonous feeling among Python developers though, I fear. I'm calling it "disenframeworkentarianism", and it is best typified by this post (unrelated, but linked to by Guido's discussion). There is a strong desire for all the benefits of frameworks without any of the overhead. "Frameworks are too fattening for my code!", we exclaim, "Let's come up with something that has all the delicious flavor but none of the calories." This is the same line of reasoning which led to Olestra, and various other food substitutes.

While superficially the idea of a food that tastes great and has no impact on your weight is a great idea, it turns out that there is a word for things that you ingest which don't do anything for your body: poison. For example, I discovered a wonderful article recently: Why No One Should Eat Products That Contain Olestra.

Frameworks with zero "fat", that is, overhead imposed on your code, are probably not actually doing anything for you. It looks like they're "lighter-weight" than other frameworks, because they don't provide any of the "fat", that is, the cognitive overhead required to learn the framework, or as Guido puts it, "lavish quantities of kool-aid". By not providing a consistent view of the world from the framework's perspective, you end up having to do more of that work yourself. You have saved time learning someone else's vision of the world, but instead of actually saving time overall, you've had to construct your own island of functionality, totally isolated from other developers and integration opportunities. It may well be that you needed to do that work, because the requirements of your project are sufficiently bizarre and integration-heavy that you are really going to have to tie things together yourself anyway - that doesn't mean it's not more work.

I'm not saying this as a defense of super-heavy-weight frameworks either. The opposite of Olestra is not a bucket of fried lard, it's a balanced diet, with plenty of low-calorie things like fresh leafy vegetables. J2EE is the Big Mac here; plenty of calories, but most of them empty. I think this metaphor could be extended well into absurdity, with the requirements of your project being its "metabolism", unit tests being "exercise"... but let me get back to the actual point here.

No project has come along yet that really tries as hard as it can to be completely modular; i.e. let you plug in your own solution for any layer of any stack with no overhead. There's good reason - whether it's a good idea or not, it's just too hard. When frameworks do try to let you choose, it turns out that somebody has to come along and pick up the pieces, and do the work of getting the disparate pieces that you want to work together, to actually work together. The trendy thing these days is to call this a "megaframework", but I think it's a symptom of this frameworks-proliferation disease, not the cure. It's largely a matter of perspective. In some cases aggregation works wonderfully - Ubuntu is the mega-est megaframework of them all, and it's fantastic. At some level it just falls apart though - nobody tries to cobble together a word processor from "best of breed" word-wrap algorithms and icons; you use AbiWord or you use OpenOffice, and you don't complain that you can't use AbiWord's tables feature in OpenOffice documents - they're just different tools, and you have to decide on one or the other.

I think that the community would mostly be better served right now by some straightforward competition: try to create the best full-stack experience. Maybe eventually it will become clear that project X really does have the best ORM and project Y has the best templating system, and the user communities can exert some pressure on the development teams to merge. Until then, we need good, full, complete experiences for developers trying to build sites, whether they're megaframeworks or kiloframeworks or gigaframeworks, not half-finished chunks of functionality you are expected to finish yourself. It's much easier to extract just the part that you need from a polished, working solution than it is to create a polished, working solution from a handful of parts that by themselves do nothing useful, even if they're well-built parts.

Since it probably needs to be said... While I'm a "framework" guy, and I obviously have my own axe to grind here, note that I'm not commenting on Guido's blog (or responding to it here) and saying "Use Nevow!!!!" (Although, it seems like there are enough people doing that for me ;-)) There are serious problems with all current web frameworks, Nevow included. I think Nevow could be really awesome, but there are parts of it which really need to be removed and streamlined, cleaned up, and hidden from view. Let's face it, Nevow guys - the sequence renderer is a totem of Nevow's failure, at least from the perspective of comprehensibility to new users. Also, as Tim Peters famously said, the "it's surely too much for a lesser being" argument is pretty weak - I trip over sequence so regularly that I've stopped bothering to use it.

So, I've got to say, if you're in the market for one more tool in your build-your-own-web-framework toolbox, Nevow is probably not for you. If you're looking at a larger system, with object publishing, multi-protocol support, sessions, authentication, modular AJAX, and a variety of other features, it is a good tool to use, though. Especially the AJAX thing - most of Nevow's unnecessary complexity is worth wading through just to get to Athena, the AJAX-plus-javascript-modules-plus-pluggable-page-fragments Web 2.0 bonanza feature. There aren't enough examples, but I really haven't seen anything like it before. You can extend that even further with Mantissa, to include an integrated database, sign-up, account maintenance, administrative tools, per-user plug-ins, navigation, and so on - but Mantissa and Nevow are separated for good reason; Mantissa is still much more heavily in development, and is a much larger and more ambitious project. I am really looking forward to it stabilizing so that it is a more robust example to show other framework developers some of my ideas.

Statistical anomaly

According to the sitemeter OS stats for twistedmatrix.com, 43% of Twisted users are using some form of Microsoft Windows.

I don't have any such graphs for Twisted developers, but off the top of my head I would guess that 2 out of 40 developers actually use Windows for anything except games and grudgingly booting into it to fix a couple of Twisted issues per year.  I am not some kind of math genius, but even I can tell that is not 43%.

So - where are you Windows/Twisted people?  If you're really out there, we could use a few more dedicated maintainers that don't totally hate the platform.

Mary has an interesting post about IRC. I've h...

Mary has an interesting post about IRC.

I've had the same problem, although IRC is so integral to my work and my life that it's not feasible for me to eliminate it. Most of the time it's infeasible for me to even log out for longer than a few hours during a normal day. I don't really have a reasoned essay about this, but a few thoughts are floating around.


  1. IRC, as both software and a communication standard, is utter garbage. I could write a post twenty pages long just cataloguing its obvious flaws. That makes it very difficult to implement any of these suggestions. It might be reasonable to do this with Jabber, but somebody would have to write a reasonable Jabber server in Twisted first.
  2. There should be a way to broadcast the temperature of an IRC discussion. Twisted is largely just random noise - I want to be able to minimize my IRC client without fear that I'm missing anything, until someone raises the "temperature" to alert everyone "an interesting discussion is happening, tune in now".
  3. In that vein, more discussions should be scheduled, rather than happening spontaneously. An automatic moderator bot could probably help with that.
  4. It should be a lot easier to fork off a political conversation into a new room. I can identify with the rage reaction to various IRC conversations.
  5. There should be some sort of "comforting social gesture" built in to the protocol so that you can greet people, thank them, and generally make polite noises without having to dress up as a cat, weaponize trout, chew off someone's arm, or wrestle them to the ground and lick their face. Those are all "normal" greetings I've seen otherwise normal and professional IRC users use to greet each other. In my experience, women who engage in this sort of thing tend to be inappropriately affectionate, but men tend to concoct extremely violent, bizarre and intricate rituals, sometimes with a defined back-and-forth that might be five or six steps.

    World of Warcraft's emotes system achieves this by being really limiting, and thus far, from what I can see, that's a great thing. WoW players don't spend hours developing byzantine in-character rituals to greet each other - you have your option of /wave, /clap, or /bow.

Why Axiom Doesn't Expose SQL

I didn't put the full title of this article into the "subject" because it was super long.

It Is Pretty Much a Bad Idea to Expose Raw SQL Through Your Database Access Layer

or

Fun Things I Found Out About Your Company With Administrator Access to your Database

I was originally inspired to write something about this when I read Jonathan Ellis remarking that ORMs should include direct SQL access because Django recently added a different, but still dodgy, 'OR'-operator support to a syntactically disappointing ORM query syntax. However, it really came to a boil for me when I found that Divmod had been running some third-party software with an SQL injection vulnerability. (Yes, we have since patched it, no harm done.)

Security experts have long known that code-injection attacks are pretty easy on many popular programming platforms, and you should take steps to prevent them. It's easy to find commentary on this. Stephen Thorne has had many amusing and insightful things to say about PHP's vulnerability to SQL injection attacks, as well as the occasional dig about just injecting PHP code itself. If you're looking for something more serious, Steve Friedl has written a fairly comprehensive guide to understanding, executing, and preventing SQL injection.
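For anyone who hasn't seen one of these attacks up close, here is a minimal sketch of the general shape. It assumes Python's standard sqlite3 module and a made-up "users" table; the function names are hypothetical, and this is not the code from any of the software mentioned here.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory table, purely for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # The input is pasted straight into the SQL text, so a quote character
    # in `name` can change the structure of the query itself.
    query = "SELECT secret FROM users WHERE name = '%s'" % (name,)
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # The input travels separately from the SQL text as a bound parameter,
    # so it is only ever treated as a value.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]
```

Passing a name like "nobody' OR '1'='1" to lookup_unsafe turns the WHERE clause into one that matches every row, so every secret comes back; lookup_safe treats the same string as an odd-looking name that matches nobody.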

An ORM's job is to provide an alternative interface to a database. Interfaces should be complete things, not broken fragments of utility which require manual crank-turning to function. If you have to use a different mechanism to access the database within your application, the ORM is incomplete and should be fixed. Sure, many programmers who use ORMs also know SQL, and that is a useful skill, because today these are in closely related problem domains, but they should not have to use SQL within the same context that is using the ORM.

The Python "os" module provides (among other things) an alternative interface to a large portion of the POSIX C API. As I said, interfaces should be complete things. If you use a different mechanism to access the POSIX C API, the "os" module is incomplete and should be fixed. Again, sure: many programmers who use Python's "os" module also know the POSIX C API, and that is a useful skill, because these are related problem domains, but they should not have to use the POSIX C API within the same context that is using the "os" module.

In both cases, you can generate the underlying code yourself, and in both cases, people sometimes really need to, so the fact that you can is important. However, few C programmers ever want to drop back down to C when they're using Python, and will rightly avoid it (as a complexity cost) when they can; yet many Python programmers who use ORMs frequently and loudly declare that they want to use SQL all the time.
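To make the "os" side of the comparison concrete, here is a rough sketch of the same POSIX call reached both ways. It assumes a POSIX system where ctypes.CDLL(None) exposes the C library's symbols (this will not work on Windows), and the two function names are made up for the example.

```python
import ctypes
import os


def pid_via_os():
    # The complete, supported interface: one call, nothing to declare.
    return os.getpid()


def pid_via_libc():
    # The back door. Assumption: on this platform, loading None gives a
    # handle to the already-linked C library.
    libc = ctypes.CDLL(None)
    # You now have to describe the C signature by hand, and get it right.
    libc.getpid.argtypes = []
    libc.getpid.restype = ctypes.c_int
    return libc.getpid()
```

Both return the same process ID, but only one of them makes you restate the C function's signature yourself, which is roughly the complexity cost described above.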

Not to pick on Mr. Ellis. The syntax he's reacting to really is abhorrent (although that's no comment on Django as a whole), and the tremendous Ruby on Rails movement seems to largely agree with his point. There are good reasons that people want SQL access from within ORMs. It's simple: most ORMs are really, really awful. They are heavy on the "object" and not so much on the "database". There are a lot of features that SQL provides which they don't expose.

If you use an open-source ORM and find yourself wanting to use SQL to get at one of those features, consider not clamoring for (or not using, if one exists) an SQL execution back-door included in the library. Instead, consider ways to integrate that SQL feature with the existing structure of the ORM, or an extension which wraps that SQL feature on top of the ORM and only generates the SQL you need in one small place. Obviously this goes double if you are a user of Axiom - I do my best to accept any patches that expose new database features that were previously obscured.
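As a sketch of what "only generates the SQL you need in one small place" might look like, here is a toy query layer with an OR combinator. Everything in it (Equals, Or, select_names, the "people" table, the column whitelist) is hypothetical and written against Python's standard sqlite3 module; it is not Axiom's API or Django's.

```python
import sqlite3

# Column names end up in the SQL text, so they come from a fixed set and
# never from user input. (Hypothetical schema.)
ALLOWED_COLUMNS = {"name", "city"}


class Equals:
    def __init__(self, column, value):
        if column not in ALLOWED_COLUMNS:
            raise ValueError("unknown column: %r" % (column,))
        self.column = column
        self.value = value

    def to_sql(self):
        # The value is returned as a bound parameter, not formatted in.
        return "%s = ?" % (self.column,), [self.value]


class Or:
    def __init__(self, *conditions):
        if not conditions:
            raise ValueError("Or() needs at least one condition")
        self.conditions = conditions

    def to_sql(self):
        fragments, args = [], []
        for condition in self.conditions:
            sql, condition_args = condition.to_sql()
            fragments.append("(%s)" % (sql,))
            args.extend(condition_args)
        return " OR ".join(fragments), args


def select_names(conn, condition):
    # The only place in this sketch where SQL text is assembled and run.
    where, args = condition.to_sql()
    query = "SELECT name FROM people WHERE " + where + " ORDER BY name"
    return [row[0] for row in conn.execute(query, args)]


def make_people_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [("alice", "Boston"), ("bob", "Lima"), ("carol", "Paris")],
    )
    return conn
```

Application code would then write something like select_names(conn, Or(Equals("city", "Boston"), Equals("city", "Paris"))) and never touch a query string, which leaves exactly one function to test and audit.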

Don't just accept the status quo and generate SQL strings from within your application. Originally this post was going to be longer and talk about API structure and communication between programmers and preconditions and postconditions and all kinds of fancy computer-science garbage, but I think it would be better to leave you with just this one thought - the security implications alone are more than enough reason to be extremely sparing, and careful, with the places that your code generates SQL. Isolate it, test it, audit it, and don't make it a habit.

Update: This article is confusingly titled. In fact, Axiom does provide an API for getting at SQL: Store.executeSQL. The point I am stressing here is that Axiom does not "expose" it in that it is not a supported, public API, and if you have to call it, Axiom is broken and you should let Divmod know what you needed it for. To attempt to totally deny access to that layer would be unwise; as I said earlier in the article, "you can generate the underlying code yourself, and in both cases, people sometimes really need to, so the fact that you can is important".

d00d ur pr0n warez sploits r pwnd

Microsoft has a guide to "leetspeak", or as they call it, "kidtalk", clearly written by someone who doesn't understand either leetspeakers or languages. It's unintentionally hilarious in many ways. Some highlights:
  • The first word Microsoft thinks it's important to introduce to parents is not "pr0n", or "cyber", or even the actual meaning of "leet"1, but "warez". The worst thing your child might be doing online is apparently copying Microsoft's hard-earned "intellectual property". As they put it: "The first series is of particular concern, as their use could be an indicator that your teenager is involved in the theft of intellectual property, particularly licensed software."

  • This page purports to help you understand language, but who helps you understand Microsoft's incomprehensible doublespeak? At the bottom of the article, when asked "Was this information useful?" I clicked "Yes" to be greeted by this semantic gem: "Object reference not set to an instance of an object. We are experiencing technical problems. Sorry for the inconvenience. We are still interested in hearing your comments if you have time to provide your feedback. You can do one of two things. You can close this window, refresh your browser, and submit your comments. Or, you can try later."

1. If you don't already know, "leet" means "elite", which in the world of online teenage hooliganism means something like "popular with the in crowd". Someone is "leet" if they are worthy enough or have proven themselves in a variety of totally arbitrary challenges. The movie "Hackers" captures this eloquently in a scene where the protagonist is quizzed on the contents of various technical specifications by giving both the slang name and the technical name of various colors of book, to demonstrate his knowledge. Us, uh, legitimate programmer types occasionally use leetspeak ironically, and online gamers slightly more so, but if your teenager is using so much leetspeak that you need a glossary to decipher their online communications, it's likely that the people they're talking to are not the good guys. More important than worrying about whether your teenager has illegally acquired a copy of Microsoft Office, you should be concerned if they are spending a lot of time and energy trying to prove themselves capable to a group of people who are, like as not, professional criminals.

Hmm, this footnote is now longer than the rest of the entry. Perhaps I will save the rest of this thought for another post...