Ethics for Programmers: Primum non Nocere

This post isn't about Divmod, exactly.

I've been mulling over these ideas for quite a while, and I think I may still have more thinking to do, but recent events have me thinking again about the increasingly urgent need for a professional code of conduct for computer programmers. Mark Russinovich reported on Sony BMG's criminal contempt for the integrity of their customers' computers, and some days later CNET reported on Sony BMG's halfhearted, temporary retraction of their crime. A day later, CNET's front page has news of Apple trying to institutionalize, as well as patent, a similar technique. While the debate over DRM continues to rage, there are larger issues and principles at stake here that no one seems to be talking about: when you run a program on your computer, who is really in charge?

I posit that, in no uncertain terms, it is a strong ethical obligation on the part of the programmer to make sure that programs do, always, and only, what the user asks them to. "The user" may in some cases be an ambiguous term, such as on a web-based system where customers interact with a system owned by someone else, and in these cases the programmer should strive to balance those concerns as exactly as possible: the administrator of the system should have no unnecessary access to the user's personal information, and the user should have no unnecessary control over the system's operation. All interactions with the system should faithfully represent both the intent and authority of the operator.

Participants in the DRM debate implicitly hold the view that the ownership of your operating system, your personal information, and your media is a complex, joint relationship between you, your operating system vendor, the authors of the applications you run, and the owners of any media that pass through that application. Prevailing wisdom is that the way any given software behaves should be jointly determined by all these parties, factoring in all their interests, and that the argument is simply a matter of degree: who should be given how much control, and by what mechanism.

I don't like to think of myself as an extremist, but on this issue, I can find no other position to take. When I hear lawmakers, commercial software developers, and even other open source programmers, asking questions like, "how much control should we afford to content producers in media playback programs?", I cannot help but think of Charles Babbage.
On two occasions I have been asked [by members of Parliament!], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
The "you don't own your computer" paradigm is not merely wrong. It is violently, disastrously wrong, and the consequences of this error are likely to be felt for generations to come, unless steps are taken to prevent it.

Computer programmers need a socially and legally recognized code of professional ethics, one to which we can be held accountable. There have been some efforts in this direction, the most widely known being the Software Engineering Code of Ethics and Professional Practice. As long as I'm being extreme: this code of conduct is completely inadequate. It's sophomoric. It's confused about its own purpose. It sounds like it was written by a committee more interested in promoting "software engineering" techniques, as defined by the ACM, than in ethics. I'll write a bit about exactly what's wrong with it after I describe some similarities among existing professional codes of conduct which themselves have legal ramifications.

Although there are many different codes of ethics for medical doctors, a principle that echoes through them all was formulated in antiquity, originally by Hippocrates and distilled into a catch-phrase by Galen: "First, do no harm."

The idea is that, if you are going to be someone's doctor, you have to help them, or at least, you shouldn't ever harm them. Doctors generally regard this as a sacred responsibility. This basic tenet of the doctor-patient relationship typically overrides all other considerations: the doctor's payment, the good or harm that the patient has done or may do, and the advancement of medical science all take a back seat to the welfare of the patient.

In this modern day and age, when doctors routinely place their patients under general anesthesia for surgery, this understanding is critical to the credibility of the medical profession. Who would submit to a doctor, knowing that the doctor might give them a secondary, curable disease, just to ensure the bill got paid?

Lawyers have a similar, although slightly more nuanced, principle. Anybody who has watched a few episodes of Law and Order knows about it. A slightly more authoritative source than NBC, though, is the American Bar Association, which, in its Model Code of Professional Responsibility (the basis for the professional responsibility codes of most states' bar associations in the United States), declares:
The professional judgment of a lawyer should be exercised, within the bounds of the law, *solely for the benefit of his client and free of compromising influences and loyalties*. Neither his personal interests, nor the interests of other clients, nor the desires of third persons should be permitted to dilute his loyalty to his client.
(emphasis mine)
For criminal defense lawyers, these "compromising influences and loyalties" may include a basic commitment to the public good. A lawyer representing a serial murderer who privately admits to heinous crimes must, to the best of their ability, represent the sociopath's interests and try to get them exonerated or, failing that, the lightest sentence possible. As low as we as a society might hold a lawyer who defends rapists and murderers, we would think even more poorly of one who gave intentionally bad advice to clients they personally disliked, or sold out a client's interests to the highest bidder.

A doctor's responsibility is somewhat the same. If a doctor is treating a deeply evil person, they are still obligated by the aforementioned sacred patient/doctor pact to honestly treat that person, not use their position as a doctor to proclaim a death sentence, or cripple them. They are obligated to treat that person equitably, even if that person's evil extends to not paying their medical bills.

This pattern isn't confined to professional trades. Catholic priests have the concept of the "seal of confession". If you confess your sins to a Catholic priest, they are not to reveal those sins under any circumstances, regardless of the possible harm to others. A priest certainly shouldn't threaten their flock with knowledge of their confessed sins to swell the collection plate, even if one of them has confessed a murder.

In each case, society calls upon a specialist to navigate a system too complex for laymen to understand: the body, the law, and the soul. In each case, both society at large and individuals privately put their trust completely in someone allegedly capable of navigating that system. Finally, in each case, the trust of that relationship is considered paramount: above the practitioner's idea of the public good, above the practitioner's (and others') financial considerations.

There is a good reason for these restrictions. Society has systems in place to make these judgments. Criminal defense lawyers are not allowed to judge their clients because that's the judge's job. Doctors aren't allowed to pass sentence on their patients because that's the legal system's job. Catholic priests don't judge their penitents because that's God's job. More importantly, each of these functions may only be performed with the trust of the "client" - and it is important for the client to know that their trust will not be abused, even for an otherwise laudable goal, such as social welfare, because notions of social welfare differ.

I believe that computer programmers fill a fourth such role.

Global telecommunications and digital recording are new enough that I think this is likely to be considered a radical idea. However, think of the importance of computer systems in our society today. Critical functions such as banking, mass transit, law enforcement, and commerce could not take place on the scale they do today without the help of computer systems. Apropos of my prior descriptions, every lawyer's and doctor's office has a computer, and they rely on the information provided by their computer systems to do their jobs.

More importantly, computers increasingly play a central role in our individual lives. Many of us pay our bills on a computer, do our taxes on a computer, do our school work or our jobs on computers. Sometimes all of these things even happen on one computer. Today, in 2005, most of those tasks can be accomplished without a computer (with the exception, for those of us with technical professions, of our jobs), but as the public systems we need to interact with are increasingly computerized, it may not be reasonable to expect that it will be possible to lead an average modern life in 100 years without the aid of a personal computing device of some kind.

If that sounds like an extreme time frame, consider the relative importance of the automobile, or the telephone, in today's society versus 1905. It's not simply a matter of convenience. Today it is considered a basic right for accused criminals to make a phone call. Where was that right when there were no telephones?

Another way to think of this relationship with technology is not that we do a lot of things with computers, but that our computers do a lot of things on our behalf. They buy things. They play movies. They make legal claims about our incomes to the federal government. Most protocol specifications refer to a program which acts on your behalf (such as a web browser) as a user agent to reflect this responsibility. You are not buying a book on Amazon with your computer; you click on some links, you enter some information, and you trust that your computer has taken this information and performed a purchase on your behalf. Your computer could do this without your help, if someone has installed a malicious program on it. It could also pretend to have made a purchase, but actually do nothing at all.

Here is where we approach the intersection between programming and ethical obligation. Every time a user sits down to perform a task with a computer, they are, indirectly, trusting the programmers who wrote the code they will be using to accomplish that task. Users delegate not only the responsibility for performing a specific task; they also trust those programs (and thereby their programmers) with intensely personal information: usernames, passwords, social security numbers, credit card numbers - the list goes on and on.

There may be a technological solution to this problem: a way to limit the amount of information that each program needs, and to provide users with more control over what different programs can say to each other on their own computer. Some very smart people are working on this, and you can read about some of that work on Ka-Ping Yee's "Usable Security" blog. Still, one of the experts there contemplates that, given the abysmal state of software today, perhaps the general public shouldn't even use the internet.

DRM is definitely a problem, but the real problem is that it's the top of a very long, very slippery slope. Its advocates point at the top of that slope and say "See, it's not so bad!" - but where will it end? While I am annoyed, I'm not really that concerned with the use of this kind of technology to prevent copyright violations. It's when we start using it to prevent other sorts of crimes that the real fear sets in.

Today, it's considered almost (but not quite) acceptable that Sony installs the digital equivalent of a car-bomb on my computer to prevent me from copying music. As I said at the beginning of this article - they don't think that the practice is inherently wrong, simply that there are some flaws in its implementation. Where will this stop? Assuming they can perfect the technology, and given that my computer has all the information necessary to do it, will future versions of Sony's music player simply install themselves and lie in wait, monitoring every download and automatically billing me for anything that looks unauthorized, not telling me about it until I get my credit card statement?

Whether unauthorized copying should be a crime or not, preventing it by these means is blatantly wrong. Let me be blunt here. It is simply using a technique to wring more money out of users because the technique is there. Much like the doctor who cuts off your nose and won't reattach it until he gets paid for his other (completely legitimate) services, this is an abuse of trust of the worst order. It doesn't matter how much money you actually owe the doctor, or Sony: either way, they don't have the right to do violence to you or to your computer because of it.

What of "terrorism"? Will mandatory anti-terrorism software, provided to Microsoft by the federal government, monitor and report my computerized activities to the Department of Homeland Security for review? From here, I'll let you fill in the rest of the paranoid ravings. I don't see this particular outcome happening soon, but the concern is real. There is no system in place to prevent such an occurrence, no legal or ethical restriction incumbent upon software developers which would prevent it.

This social dilemma is the reason I termed the IEEE/ACM ethics code "sophomoric". With the directionless enthusiasm of a college freshman majoring in philosophy, it commands "software engineers" to "Moderate the interests of [themselves], the employer, the client and the users with the public good", to "Disclose to appropriate persons or authorities any actual or potential danger to the user, the public, or the environment", and to "Obey all laws governing their work, unless, in exceptional circumstances, such compliance is inconsistent with the public interest". These are all things that a good person should do, surely, but they are almost vague enough to be completely meaningless. These tenets also have effectively nothing to do with software specifically, let alone software engineering. They are in fact opposed to certain things that software should do, if it's written properly. If the government needs to get information about me, they need a warrant, and that's for good reason. I don't want them taking it off my computer without even asking a judge first, simply because a helpful software engineer thought it might be a "potential danger to the public".

Software developers should start considering that accurately reflecting the user's desires is not just a good design principle; it is a sacred duty. Just as it is not the criminal defense lawyer's place to judge their client, regardless of how guilty they are; just as it is not the doctor's place to force experimental treatment upon a patient, regardless of how badly the research is needed; and just as it is not the priest's place to pass worldly judgment on their flock - it is not the programmer's place to try to decide whether the user is using the software in a "good" way or not.

I fear that we will proceed down this slippery slope for many years yet. I imagine that a highly public event will happen at some point, a hundred times worse than this minor scandal with Sony BMG, and users the world over will angrily demand change. Even then, there will need to be a movement from within the industry to provide some direction for that change, and some sense of responsibility for the future of software.

I hope that some of these ideas can provide direction for those people, when the world is ready, but personally I already write my code this way.

I wrote about this a couple of years ago, and I think there's more to the issue, but I feel like this key point of accurately relaying the user's intent is the first step to anything more interesting. I don't really know if a large group of people even agrees on that yet.

So, like I said, this post isn't about Divmod - exactly - but when we say "your data is your data"... we mean it.

6:00 on a Saturday morning...

... and for some reason, I'm awake again.

Not Exactly a Release

One last thing for tonight.

I have made a few enhancements to HATE since I announced it. I thought that since my previous post featured a screenshot of it, I should make those changes available. It is actually usable now, since I bound C-S-n to 'new window' as it is in gnome-terminal.

You can download the new version here.

By the way - sorry for the RSS spam, but I am trying to convey a certain feeling here, to evoke a sense of the pace of development at Divmod and my general enthusiasm about the new code we, and I, have been producing over the last few weeks.

Is an adjective forming in your head? Good. No? Okay.

The adjective is furious.

Update: Relax, folks. The implied noun phrase here is pace of development, see above. It's not "at you".

Mantissa 0.3.0 - Twisted on Tracks

Did you catch my reference to David Heinemeier Hansson in the previous article? It wasn't simply the unmasked, raw, hideous envy of his brilliant success - it was foreshadowing. That's because I am like a word ninja, assaulting you with various literary conventions.

You better watch out. I could, quite possibly, be on a rampage, what with all these releases, and writing about them.

I have been studying the new crop of web frameworks to see what the fuss is all about - watching screencasts, reading weblogs. At first I found the whole thing sort of boring. I could not understand what people got so excited about. I was over-focusing on the (rather boring) problem domain that all these frameworks attack - blogs, wikis, to-do lists and so on - and just not understanding why it was all so compelling.

This week I took a step back and realized it was not the product but the process that made it so interesting. It's not that you're making a wiki - it's that making a wiki with TurboGears is so incredibly easy that, if you're the sort of person who finds wikis interesting and wants to explore the far-reaching problem domains associated with them, the ease with which you can sail past the parts everyone already agrees upon must be absolutely incredible. There are some really great lessons for framework developers that can be learned just by watching how these things are taught, rather than by paying attention to what they actually do.

This release of Mantissa took the first small but determined steps to make using Twisted, and the whole Divmod suite, more immediately approachable. There isn't any stub or automatic code generation happening yet, but we did improve the plug-in for the 'axiomatic' command-line tool quite a bit. Here is a screenshot - a pre-screencast, if you will:

[screenshot: an 'axiomatic' command-line session]
That simple set of commands will create a database with a webserver, login system, an account for "admin@localhost" with a password of your choosing, a through-the-web interactive AJAX Python command line, and enough scaffolding, at least on the database side, for you to start deploying Mantissa plug-ins. Here's a peek:

[screenshot: the through-the-web Python command line]
Things will only improve from here on out.

I don't have any illusions that Mantissa will supplant TurboGears or Django or Rails or whatever as the tool du jour. We're not focused on wikis or blogs. We'll probably implement one, for integration between those features and the other stuff we're implementing, but it might still be harder to write a blog using Axiom and Mantissa than using one of the aforementioned tools.

Still, I guarantee it will be easier to make an Internet telephone service using Mantissa. Or, for that matter, a chat server that does IRC and Jabber. Or a clan server that controls instances of hl2 and quake4 multiplayer daemons.

I do certainly hope that we will get at least some press for our efforts, but to be honest the prospect of relative obscurity doesn't bother me. It might even be a sign of a good thing. It's similar to the bike shed problem: everybody knows how to build a bike shed, or a blog, so there's a huge demand for really good, really cheap tools to build them.

Still, dear reader, if you will permit me a moment of hubris as I labor down in the spice mines: somebody has to provide the power that those tools run on, so ultimately there's a demand for that nuclear power plant, too.

You see how I turned that last bit into a comparison, without using 'like' or 'as'? Blam, it's a metaphor. Caught you by surprise there, didn't I! My writing technique is unstoppable.

Axiom 0.3.0 - The ORM I Won't Shut Up About, Already

We released Axiom again today.

It's making progress on an even keel these days.

A few minor issues: we fixed some bugs in upgraders and some deployment-order dependency issues with Epsilon. We added some more data integrity checks, and came up with a way to dodge a few more inheritance quirks with Item.

The major upgrade in this release is that queries are objects, rather than simply generators. This shift in API is a subtle step further towards exposing as many efficient SQL operations as possible without exposing any SQL.
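
To make that concrete, here is a minimal sketch of the difference. The item class is a made-up example, and the API details here are from memory, not gospel:

from axiom.store import Store
from axiom.item import Item
from axiom.attributes import text, integer

class Employee(Item):
    # Made-up example item; not part of Axiom itself.
    typeName = 'example_employee'
    schemaVersion = 1

    name = text()
    salary = integer()

store = Store()  # no path given: an in-memory store
Employee(store=store, name=u'alice', salary=65000)
Employee(store=store, name=u'bob', salary=45000)

# The comparison is built from attribute objects, not SQL strings, and
# the result is a query *object*, not a one-shot generator:
wellPaid = store.query(Employee, Employee.salary >= 50000)

names = [e.name for e in wellPaid]  # still iterable, like before...
howMany = wellPaid.count()          # ...but also inspectable and reusable

Because the query knows what it is querying, an operation like count() can be pushed down into efficient SQL instead of pulling every item into memory.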

Since the release announcement is already over, you might have guessed - I tricked you. This is hardly a release announcement at all. It's an explanation of one of Axiom's goals: complete encapsulation of SQL in an object-oriented model.

Some people, most notably David Heinemeier Hansson, would say that's a pretty bad goal to have. Oddly enough, against the people he is usually arguing with, I'd take his side. The relational model should get some respect. It's where your data actually lives. You shouldn't swaddle it to the point where it's suffocating.

The agreement ends pretty quickly, though. I think of SQL in an application as fire on a cold night. It keeps me warm, sure, and I shouldn't douse it with water just because I'm afraid of getting burned - I might freeze. That doesn't mean I stick my hand into the open flame.

You might ask, why such an intense metaphor? Isn't SQL just another tool in my programming toolbelt?

Metaprogramming is hard, and dress it up however you like, that's what using SQL is. Your code is generating other code, and evaluating its results.

Because metaprogramming is so hard, it is almost exclusively the province of frameworks, environments and operating systems. For good reason, too. Whenever code generates other code, there are potentially very serious mistakes that can be made. SQL is a hobbled language in most databases, so the damage is restricted - but only restricted in the sense that it won't cause your server to segfault. Your database is probably where 99% of your application's (and possibly your company's) value lives anyway, not your server's heap memory. Mistakes in generating SQL that talks to that database can be very costly.

Getting away from SQL injection attacks is a minor feature of Axiom, so it's easy to forget that injection is a really serious problem; people make this mistake all the time, with disastrous consequences. Unless you read Bugtraq, that is - in which case you are probably reminded of this somewhere between 2 and 20 times per week.
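
In case you've managed to avoid Bugtraq, here is the canonical version of the mistake, sketched against a bare SQLite connection (sqlite3 is the stdlib module; the table and the hostile "input" are made up for illustration):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'xyzzy')")

evil = "' OR '1'='1"  # hostile "user input"

# Hand-generated SQL: the attacker's data becomes part of the generated
# program, and the WHERE clause now matches every row in the table.
leaked = conn.execute(
    "SELECT * FROM users WHERE name = '%s'" % evil).fetchall()

# A parameterized query - or, one level up, an object-model query -
# keeps the attacker's data as data; this matches nothing.
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (evil,)).fetchall()

With query objects, quoting is the library's one-time problem, not something every call site has to remember.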

Having an object model for your SQL also helps you generalize things that might otherwise be overspecific, and test unrelated database features more independently. For example: there is a utility function in Axiom (go read the docstring, it's fun) which finds overlapping ranges of values. The query looks like this:

OR(
    AND(startAttribute >= startValue,
        startAttribute <= endValue),
    AND(endAttribute >= startValue,
        endAttribute <= endValue),
    AND(startAttribute <= startValue,
        endAttribute >= endValue))

If that had been accomplished with hand-generated SQL rather than objects, I'd have to quote every one of those attributes. I'd have to figure out how to make sure that the table's name was fully qualified whenever the user passed data into this function - if it needed to be. It would be hard to write a test for. Finally, it'd be hard to know how to properly involve it in a join. All the arguments would be strings, so I would have no idea if they were properly formatted or not until the database spit back a syntax error - which might not contain any useful information.

If I had been generating SQL by hand, though, I doubt it would have occurred to me to write such a function in the first place. I just would have written the SQL code necessary to do this for the calendar_event table and moved on.
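
For completeness, here is roughly what calling that utility looks like. The Event class below is my own hypothetical stand-in for a calendar item, and I'm assuming the function is importable from axiom.queryutil:

from axiom.item import Item
from axiom.attributes import text, timestamp
from axiom.queryutil import overlapping

class Event(Item):
    # Hypothetical calendar item, for illustration only.
    typeName = 'example_calendar_event'
    schemaVersion = 1

    title = text()
    start = timestamp()  # values are epsilon.extime.Time instances
    end = timestamp()

def eventsTouching(store, rangeStart, rangeEnd):
    # Any Event whose [start, end] range overlaps [rangeStart, rangeEnd]:
    # the same three-clause OR shown above, but reusable for any pair of
    # range attributes, and composable into larger queries.
    return store.query(Event, overlapping(Event.start, Event.end,
                                          rangeStart, rangeEnd))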

If your database wrapper is going to do caching, queries have to be introspective, so that the cache manager knows when a given chunk of SQL might invalidate your cache. Managing caching plus concurrency is hard enough even with help from every layer of your system.
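
None of this machinery exists in Axiom yet, but to see why introspection matters, imagine a cache layer that can ask each cached query which item types it reads. Everything below is hypothetical - a shape-of-the-idea sketch, not a real API:

class QueryCache(object):
    # Hypothetical sketch; not Axiom code.
    def __init__(self, store):
        self.store = store
        self.cached = {}  # query object -> cached result list

    def query(self, q):
        # Query objects can be compared and hashed, so identical queries
        # can share a single cache slot.
        if q not in self.cached:
            self.cached[q] = list(q.run(self.store))  # hypothetical method
        return self.cached[q]

    def itemChanged(self, item):
        # Because the query is an object, we can ask which item types
        # (tables) it reads, and invalidate only the entries a change
        # could actually affect.  An opaque SQL string can't answer that
        # question, so we'd have to flush everything.
        for q in list(self.cached):
            if type(item) in q.typesInvolved():  # hypothetical method
                del self.cached[q]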

There are also some features which are directly supported by a complete object model for queries.

The feature I'm going to be working on tomorrow, a generic browser for tabular data, is another example of where having as much data access as possible happen through objects is helpful. The tabular data browser, or "TDB" as we call it, takes a query-like object as an argument, so you can page through complex queries. Because that object encapsulates both information about the objects being queried and the database itself, the TDB can both display the data appropriately and easily and quickly generate appropriate SQL. Without such an intermediary layer, the TDB would itself be a mess of string concatenations and quasi-generic SQL generation of its own.
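
To sketch the core trick - the class name and internals below are mine, not Mantissa's - paging becomes nothing more than handing the same query description a different offset each time:

class TabularDataBrowser(object):
    # Hypothetical sketch of the TDB's core trick; not Mantissa code.
    def __init__(self, store, itemType, comparison=None,
                 sortColumn=None, pageSize=10):
        self.store = store
        self.itemType = itemType
        self.comparison = comparison
        self.sortColumn = sortColumn
        self.pageSize = pageSize

    def page(self, pageNumber):
        # Because the query is described by objects, LIMIT/OFFSET and
        # sorting can be bolted onto any query the TDB is handed, with
        # no string surgery on the SQL.
        return list(self.store.query(
            self.itemType, self.comparison,
            limit=self.pageSize,
            offset=pageNumber * self.pageSize,
            sort=self.sortColumn))

So a page of upcoming events might be TabularDataBrowser(store, Event, sortColumn=Event.start.ascending).page(0), and the next page is just page(1).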

Eventually we hope for Axiom to monitor queries, inserts, deletes and updates as they happen, to provide a "live" query interface: one which takes the same objects that SQL generation would, but provides an active view onto those objects as they appear, change, and disappear. If you're issuing UPDATEs and INSERTs to the database without going through an intermediary layer, there is nowhere to catch this but triggers - and again I find a point of agreement with DHH: the database is not where the smarts belong.