The Perfect Iced Latte

The only thing I take more seriously than my keyboards is my coffee.  I've been tinkering with my recipe for a while now, but I'm really very happy with the current iteration.  Here's exactly how I make my morning coffee.
  1. Grind up roughly ¾ cup of 49th Parallel Epic Espresso using a Krups Fast Touch Coffee Grinder (Black).
  2. Filter some water using a Brita Riviera water filter.
  3. Place the grounds into an IKEA KAFFE french press, and fill it with filtered water.  Depress the plunger to make sure the grounds are soaked with water, then remove the plunger and cover with tinfoil.
  4. Put the french press into the refrigerator for between 12 and 24 hours.  (More time produces stronger coffee, but after 24 hours it starts to get unpleasantly bitter, and after 72 it's undrinkable.)
  5. Put the plunger back in, and filter the espresso concentrate.  Decant it into another container.  (This concentrate will last for a week or two.)
  6. To serve, mix 1 part concentrate with 3 parts Stonyfield Farm Organic Milk and 1 teaspoon of Butternut Mountain Farm Pure Vermont Maple Syrup.
Disclosure: nobody actually paid me to mention all those products.  I just like hyperlinks.


It's Always Sunny in Python

Jonathan Lange suggests that you have three options when you have some code that needs testing.
  1. Give up
  2. Work hard to write the damn test
  3. Make your code testable.
In fact he starts with just the first two options and then reveals the third, but I suggest that there is an option number four, available only in Python (and other, similar dynamic languages):
  4. Cheat.
I agree with Jonathan that you should generally make your code more flexible, and thereby testable.  Any code which is friendly to being invoked as a library or as an independent unit of functionality can be tested, so you should endeavor to never write code that is unfriendly to being executed as just a normal method call.  This is especially true if you are writing new code in a properly test-driven manner.  Keeping in mind, of course, that one should not modify the SUT purely for the purpose of testing.

However, if one is (as I often find myself) adding test coverage to a grotty old system, written at a time or in a place where test-driven development was not the norm, one typically wants to establish test coverage before making any changes to the code or its design.  In such a situation, one may often find oneself in the undesirable position of needing to carefully modify some implementation code so that it can be tested, hoping that none of its untested interactions with other areas of the system will be broken as a result.  For example, you might encounter some paranoid and misguided Java code like this:

    // Startup.java
    private static final void emitLogMessage(final String message) {
        System.out.println(message);
    }

    public static final void startUp() {
        // ...
        emitLogMessage("Starting up!");
        // ...
    }

In this case, it's very difficult to get in the way of any part of this system.  Nothing is parameterized, everything is global, and the compiler won't even let you call one of these methods.  You really only have Jonathan's three options here, none of which are desirable.
  1. You can give up on testing this part of the system until you've covered other parts of the system.  In many cases this is the right thing to do, but it is often the lowest-level and most critical parts of a system which have calcified into this sort of untestable rubble.
  2. You can work hard to write the damn test.  There are a number of extremely subtle nuances of the Java runtime which you can take advantage of to make a lie of the "private" and "final" keywords.  You can load the code using a custom classloader, manipulate its bytecode, or invoke private methods using reflection.  This is ultimately the "right thing" to do, but it requires the development of a daunting skill-set which you would not otherwise need.
  3. You can make the code testable, changing it before you've properly tested the code which is already in use.  Ultimately this is what you want to get to anyway, but if the code is doing something subtle that you didn't test (and none of the rest of the system is tested yet) you might be (rightly) concerned that this could break something else.
In Python, the hard work is not so hard.  To start with, Python doesn't have the misfeatures of private and final.  It also doesn't have any baroque "reflection" constructs.  All you need to understand is attribute access.  So, if you have a similar Python file:
# startup.py
import sys

def emitLogMessage(message):
    sys.stdout.write("%s\n" % (message,))

def startUp():
    # ...
    emitLogMessage("Starting up!")
    # ...
Idiomatically, the situation looks just as hopeless.  Everything is global, and nothing is parameterized.  It's hard-coded.  However, if you look at it from the right angle, you will realize that you can't really code that "hard" in Python.

What emitLogMessage is doing in this case is not making a fixed reference to the global sys module: it is simply accessing the sys attribute of the startup module.  So in fact, we can easily test it:
# test_startup.py
import sys
import startup
import unittest

class FakeSys(object):
    def __init__(self, test):
        self.test = test
    @property
    def stdout(self):
        return self
    def write(self, message):
        self.test.messages.append(message)

class StartupTest(unittest.TestCase):
    def setUp(self):
        self.messages = []
        startup.sys = FakeSys(self)

    def tearDown(self):
        startup.sys = sys

    def test_startupLogMessage(self):
        startup.startUp()
        self.assertEqual(self.messages, ["Starting up!\n"])
So testing startUp is a simple matter of replacing the sys object that it's talking to: which, in Python, is rarely more than a setattr() away.
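To see why the swap works, here is a minimal, self-contained sketch.  It builds a throwaway module in memory (the name startup_demo and the RecordingStdout class are assumptions made for illustration, not part of the code above) and shows that the function looks up sys in its module's namespace each time it is called:

```python
import sys
import types

# Build a stand-in for startup.py in memory so the sketch runs on its own.
SOURCE = '''
import sys

def emitLogMessage(message):
    sys.stdout.write("%s\\n" % (message,))
'''
startup_demo = types.ModuleType("startup_demo")
exec(SOURCE, startup_demo.__dict__)


class RecordingStdout(object):
    """Collects writes instead of printing them."""
    def __init__(self):
        self.written = []

    def write(self, message):
        self.written.append(message)


recorder = RecordingStdout()
# emitLogMessage finds "sys" in its module's globals at call time, so
# rebinding the module attribute changes what the function talks to.
startup_demo.sys = types.SimpleNamespace(stdout=recorder)
try:
    startup_demo.emitLogMessage("Starting up!")
finally:
    startup_demo.sys = sys  # put the real module back

captured = recorder.written
```

Nothing about emitLogMessage itself was changed; only the name it resolves at call time was rebound, and then restored.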

I've taken care here to use only standard Python features.  This is, after all, theoretically possible in Java, it's just a heck of a lot harder, both to use and to understand — the learning curve is a big part of the problem.  However, if you're using Twisted, and willing to spend just a brief moment to learn about one of its testing features, you can save a few lines of code and opportunities for error:
import startup
from twisted.trial import unittest

class FakeSys(object):
    def __init__(self, test):
        self.test = test
    @property
    def stdout(self):
        return self
    def write(self, message):
        self.test.messages.append(message)

class StartupTest(unittest.TestCase):
    def setUp(self):
        self.messages = []
        self.patch(startup, "sys", FakeSys(self))

    def test_startupLogMessage(self):
        startup.startUp()
        self.assertEqual(self.messages, ["Starting up!\n"])
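If you aren't using Twisted, the standard library's unittest.mock offers a comparable save-and-restore helper.  The following is a sketch under assumptions rather than a drop-in replacement: startup_demo is a hypothetical in-memory stand-in for startup.py, and io.StringIO plays the role of the fake stdout.

```python
import io
import sys
import types
import unittest
from unittest import mock

# Hypothetical stand-in for startup.py so the sketch is self-contained.
startup_demo = types.ModuleType("startup_demo")
exec(
    "import sys\n"
    "def startUp():\n"
    "    sys.stdout.write('Starting up!\\n')\n",
    startup_demo.__dict__,
)


class StartupTest(unittest.TestCase):
    def test_startupLogMessage(self):
        fake_stdout = io.StringIO()
        fake_sys = types.SimpleNamespace(stdout=fake_stdout)
        # patch.object swaps the attribute in and restores the original
        # when the block exits, even if the body raises.
        with mock.patch.object(startup_demo, "sys", fake_sys):
            startup_demo.startUp()
        self.assertEqual(fake_stdout.getvalue(), "Starting up!\n")
        self.assertIs(startup_demo.sys, sys)


# Run the test quietly and keep the result object around.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StartupTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```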
Please keep in mind that this is still not the best way to do things.  Use the front door first.  It's much better to use a stable, documented, supported API in your tests than to depend on an accident of implementation which should be able to change.  However, it is even worse to associate the feeling of testing with the feeling of being stuck, being unable to figure out how to dig yourself out of some hole that old, bad design has dug you into.

I'm writing this mostly for people who are new to test-driven development in Python and think that unit tests need to be a huge amount of extra work.  They don't.  If you ever find yourself struggling, unable to figure out how you could possibly write a test which would exercise some tangle of poorly-designed code, just remember: it's all just objects and methods, attributes and values.  You can replace anything with anything else with a pretty trivial amount of effort.  Of course you should try to figure out how to improve your design, but you should never think that you need to stop writing tests just because you used a global variable and you can't figure out what to replace it with.

Explaining Why Interfaces Are Great

Updated July 7, 2018:
  • modernized examples to be correct for python 3
  • added <code> annotations in many places where they were typographically necessary
  • made use of attrs in the example, since not only should you always use attr.s, but it also has direct support for Zope Interface
If you enjoy this post, and thereby, Zope Interface's capabilities, please consider contributing Zope Interface support to the Mypy type checker, which will amplify its usefulness considerably.

Why?

Why use interfaces? Especially with Python's new ABCs, is there really a use for them?

Some of us Zope Interface fans (names withheld to protect the guilty, although you may feel free to unmask yourselves in the comments if you like) have expressed frustration that ABCs beat out interfaces for inclusion in the standard library.  However, I recently explored various mailing lists, Interfaces literature, and blogs, and haven't found a coherent description of why one would prefer interfaces over ABCs.  It's no surprise that Zope's interface package is poorly understood, given that nobody has even explained it!  In fact, PEP 3119 specifically says:
For now, I'll leave it to proponents of Interfaces to explain why Interfaces are better.

It seems that nobody has taken up the challenge.

I remember Jim Fulton trying to explain this to me many years ago, at a PyCon in Washington DC.  I definitely didn't understand it then.  I was reluctant to replace the crappy little interfaces system in Twisted at the time with something big and complicated-looking.   Luckily the other Twisted committers prevailed upon me, and Zope Interface has saved us from maintaining that misdesigned and semi-functional mess.

During that explanation, I remember that Jim kept saying that interfaces provided a model to "reason about intent".  At the time I didn't understand why you'd want to reason about intent in code.  Wouldn't the docstrings and the implementation specify the intent clearly enough?  Now, I can see exactly what he's talking about and I use the features he was referring to all the time.  I don't know how I'd write large Python programs without them.

Caveat

This isn't a rant against ABCs.  I think ABCs are mostly pretty good, certainly an improvement over what was (or rather, wasn't) there before.  ABCs provide things that Interfaces don't, like the new @abstractmethod and @abstractproperty decorators. Plus, one of the irritating things about using zope.interface is that the metadata about standard objects in zope.interface.common is not hooked up to anything: IMapping.providedBy({}) returns False.  ABCs will provide that metadata in the standard library, making zope.interface that much more useful once it has been upgraded to understand the declarations that the collections and numbers modules provide.

So, on to the main event: what do Zope Interfaces provide which makes them so great?

Clarity

Let's say we have an idea of something called a "vehicle".  We can represent it as one of three things: a real base class (Vehicle), an ABC (AVehicle) or an Interface (IVehicle).

There is a set of operations that interfaces and base-classes share.  We can ask, "is this thing I have a vehicle?"  In the base-class case we spell that if isinstance(something, Vehicle).  In the interfaces case, we say if IVehicle.providedBy(something).  We can ask, "will instances of this type be a vehicle?"  For an interface, we say if IVehicle.implementedBy(Something), and for a base class we say issubclass(Something, Vehicle).  With the new hooks provided by the ABCs in 2.6 and 3.0, these are almost equivalent.  With zope.interface, you can subclass InterfaceClass and write your own providedBy method.  With the ABC system, you subclass type and implement __instancecheck__.
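For the ABC half of that comparison, here is a small standard-library sketch.  Car, Skateboard, DrivableMeta and Drivable are hypothetical names invented for illustration; the point is only to show the two hooks mentioned above, register and __instancecheck__, changing what isinstance() and issubclass() report.

```python
from abc import ABC, abstractmethod


class AVehicle(ABC):
    @abstractmethod
    def drive(self, distance):
        """Move the vehicle."""


class Car(AVehicle):  # a real subclass
    def drive(self, distance):
        return "drove %d km" % distance


class Skateboard(object):  # unrelated to AVehicle by inheritance
    def drive(self, distance):
        return "rolled %d km" % distance


# Registering makes Skateboard a "virtual subclass" of AVehicle.
AVehicle.register(Skateboard)


class DrivableMeta(type):
    # Overriding __instancecheck__ on a metaclass changes isinstance().
    def __instancecheck__(cls, instance):
        return callable(getattr(instance, "drive", None))


class Drivable(metaclass=DrivableMeta):
    pass


checks = {
    "car_is_vehicle": isinstance(Car(), AVehicle),
    "board_is_vehicle": isinstance(Skateboard(), AVehicle),
    "board_type_is_vehicle": issubclass(Skateboard, AVehicle),
    "vehicle_in_board_mro": AVehicle in Skateboard.__mro__,
    "board_is_drivable": isinstance(Skateboard(), Drivable),
    "object_is_drivable": isinstance(object(), Drivable),
}
```

Note that registration answers the isinstance() question without touching Skateboard's actual inheritance chain.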

However, there are some questions you can't quite cleanly ask of the ABC system.  For one thing, what does it really mean to be a Vehicle?  If you are looking at AVehicle, you can't tell the difference between implementation details and the specification of the interface.  You can use dir() and ignore a few of the usual suspects — __doc__, __module__, __name__, _abc_negative_cache_version — but what about the quirkier bits?  Metaclasses, inherited attributes, and so on?  There's probably some way to do it, but I certainly can't figure it out quickly enough to include in this article.  In other words, types have two jobs: they might be ABCs, or they might be types, or they might be both, and it's impossible to separate those responsibilities.

With an Interface, this question is a lot easier to ask.  For a quick look, list(IVehicle) will give a complete list of all the attributes expected of a vehicle, as strings.  If you want more detail, IVehicle.namesAndDescriptions() and Method.getSignatureInfo() will oblige.

Since the interface encapsulates only what an object is supposed to be, and no functionality of its own, it's possible for frameworks to inspect them and provide much nicer error messages when objects don't match their expectations.  zope.interface.verify.verifyClass and zope.interface.verify.verifyObject can tell you, both for error-reporting and unit-testing purposes, whether an object looks like a proper vehicle or not, without actually trying to drive it around.
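As a rough illustration of that kind of inspection and verification, here is a sketch that assumes zope.interface is installed.  The IVehicle definition, Car and BrokenCar are made up for this example; they are not taken from any real codebase.

```python
from zope.interface import Attribute, Interface, implementer
from zope.interface.exceptions import Invalid
from zope.interface.verify import verifyObject


class IVehicle(Interface):
    wheels = Attribute("How many wheels the vehicle has.")

    def drive(distance):
        """Move the vehicle the given distance."""


@implementer(IVehicle)
class Car(object):
    wheels = 4

    def drive(self, distance):
        return "drove %d km" % distance


@implementer(IVehicle)
class BrokenCar(object):
    """Claims to be a vehicle but has no drive() method."""
    wheels = 4


# Iterating an interface yields the names it specifies, as strings.
names = sorted(IVehicle)

# verifyObject returns True when the object matches the interface...
good = verifyObject(IVehicle, Car())

# ...and raises a subclass of Invalid describing the mismatch otherwise.
try:
    verifyObject(IVehicle, BrokenCar())
except Invalid as e:
    problem = e
else:
    problem = None
```

Neither object's drive() was ever called; the check works purely from the declared specification.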

Flexibility

At the most basic level, interfaces are more flexible because they are objects.  ABCs aren't objects, at least in the message-passing Smalltalk sense; they are a collection of top-level functions and some rules about how those functions apply to types.  If you want to change the answer to isinstance(), you need to register a type by using ABCMeta.register or overriding __instancecheck__ on a real subclass of type.  If you want to change the answer to providedBy, for example for a unit test, all you need is an object with a providedBy method.

Of course, you can do it "for real" with an InterfaceClass, but you don't need to.  In other words, its semantics are those of a normal method call.
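A tiny standard-library sketch of that idea follows.  AlwaysProvided and describe are hypothetical names; the only assumption is that the code under test receives its interface as a parameter and calls providedBy on it.

```python
class AlwaysProvided(object):
    """A stand-in 'interface': any object with a providedBy method will do."""
    def providedBy(self, obj):
        return True


class NeverProvided(object):
    def providedBy(self, obj):
        return False


def describe(thing, vehicle_interface):
    # The code under test only sends the providedBy message; it never
    # checks what kind of object the interface itself is.
    if vehicle_interface.providedBy(thing):
        return "vehicle"
    return "not a vehicle"


yes_label = describe(object(), AlwaysProvided())
no_label = describe(object(), NeverProvided())
```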

Interfaces aren't completely self-contained, of course: there are top-level functions that operate on interfaces, like verifyObject.  However, there's an interface to describe what is expected:

>>> from zope.interface.interfaces import Interface, IInterface
>>> IInterface.providedBy(Interface)
True

There's also the issue of who implements what.  For example, you might have a plug-in system which requires modules to implement some functionality.  Generally speaking, modules are instances of ModuleType, so specifying that all modules implement some type with an ABC is somewhat awkward.  With an interface, however, there is a specific facility for this: you put a moduleProvides(IVehicle) declaration at the top of your module.

In zope.interface, there is a very clear conceptual break between implements and provides.  A module may provide an interface — i.e. be an object which satisfies that interface at runtime — without there being any object that implements that interface — i.e. is a type whose instances automatically provide it.  This distinction comes in handy when avoiding certain things.  This distinction exists with ABCs; either you "are a subclass of" a type or you "are an instance of" a type, but the language around it is more awkward and vague, especially since you can be a "virtual instance" or "virtual subclass" now as well.

There's also the issue of dynamic proxies.  If you have a wrapper which provides security around another object, or transparent remote access to another object, or records method calls (and so on) the wrapper really wants to say that it provides the interfaces provided by the object it is wrapping, but the wrapper type does not implement those interfaces.  In other words, different instances of your wrapper class will actually provide different interfaces.  With zope.interface you can declare this via the directlyProvides declaration.  With ABCs, this is not generally possible because ABCMeta.register will only work on a type.
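Here is a sketch of the wrapper case, assuming zope.interface is installed.  IVehicle, IBoat, Car, Canoe and Wrapper are hypothetical names for illustration; the wrapper copies whatever interfaces its wrapped object provides onto the individual wrapper instance.

```python
from zope.interface import Interface, directlyProvides, implementer, providedBy


class IVehicle(Interface):
    """Something you can drive."""


class IBoat(Interface):
    """Something you can paddle."""


@implementer(IVehicle)
class Car(object):
    pass


@implementer(IBoat)
class Canoe(object):
    pass


class Wrapper(object):
    """Wraps another object and mirrors the interfaces it provides."""
    def __init__(self, wrapped):
        self.wrapped = wrapped
        # The declaration is made on this one instance, not on Wrapper.
        directlyProvides(self, *providedBy(wrapped))


car_wrapper = Wrapper(Car())
canoe_wrapper = Wrapper(Canoe())

results = (
    IVehicle.providedBy(car_wrapper),
    IBoat.providedBy(car_wrapper),
    IVehicle.providedBy(canoe_wrapper),
    IBoat.providedBy(canoe_wrapper),
    IVehicle.implementedBy(Wrapper),
)
```

Two instances of the same class end up providing different interfaces, while the Wrapper type itself implements neither.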

Adaptation

Let's say I have an object that provides IVehicle.  I want to display it somehow — and in today's web-centric world, that probably means "I want to generate some HTML".  How do I get from here to there?  ABCs don't provide an answer to that question.  Interfaces don't do that directly either, but they do provide a mechanism which allows you to provide an answer: you can adapt from one interface to another.

I'm not going to get into the intricacies of exactly how adaptation works in zope.interface, since it isn't important to understand most of the time.  Suffice it to say you can adapt based on specific hooks that are registered, based on the type an object is, or based on what interfaces it provides.

The gist of it is that you have some thing that you don't know what it is, and you want an object that provides IHTMLRenderer.  The way you express that intent is:

renderer = IHTMLRenderer(someObject)

If there are no rules for adapting an object like the one you have passed to an IHTMLRenderer, then you will get an exception - which is all that will happen, normally.  However, this point of separation between the contract that your code expects and the concrete type that your code ends up actually talking to can be very useful.
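To make the shape of the idea concrete without getting into the real machinery, here is a deliberately tiny toy registry written with only the standard library.  It is not how zope.interface implements adaptation; ToyAdapterRegistry, Vehicle, Truck and the string key "IHTMLRenderer" are all hypothetical, and this toy only handles the "based on the type an object is" case.

```python
class ToyAdapterRegistry(object):
    """Hypothetical miniature registry; the real one is far richer."""
    def __init__(self):
        self._factories = {}

    def register(self, factory, from_type, to_name):
        self._factories[(from_type, to_name)] = factory

    def adapt(self, obj, to_name):
        # Walk the MRO so an adapter registered for a base class also
        # applies to instances of its subclasses.
        for klass in type(obj).__mro__:
            factory = self._factories.get((klass, to_name))
            if factory is not None:
                return factory(obj)
        raise TypeError("could not adapt %r to %s" % (obj, to_name))


class Vehicle(object):
    def __init__(self, make, model):
        self.make = make
        self.model = model


class Truck(Vehicle):
    pass


class VehicleRenderer(object):
    def __init__(self, vehicle):
        self.vehicle = vehicle

    def renderHTML(self):
        return "<h1>%s %s</h1>" % (self.vehicle.make, self.vehicle.model)


registry = ToyAdapterRegistry()
registry.register(VehicleRenderer, Vehicle, "IHTMLRenderer")

html = registry.adapt(Truck("Acme", "Hauler"), "IHTMLRenderer").renderHTML()

try:
    registry.adapt(42, "IHTMLRenderer")
except TypeError:
    unadaptable = True
else:
    unadaptable = False
```

The calling code asks only for "something that renders HTML" and never names VehicleRenderer, which is the separation the paragraph above describes.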

The larger Zope application server has a rich and complex set of tools for defining which adapter is appropriate in which context, but Twisted has a very simple interface to adaptation.  You simply register an adapter, which is a 1-argument callable that takes an object that conforms to some interface or is an instance of some class, and returns an object that provides another interface.  Here's how you do it:

import attr
from zope.interface import implementer
from twisted.python.components import registerAdapter

@implementer(IHTMLRenderer)
@attr.s
class VehicleRenderer(object):
    "Render a vehicle as HTML"
    vehicle = attr.ib(validator=attr.validators.provides(IVehicle))
    def renderHTML(self):
        return "<h1>A Great Vehicle %s (%s)</h1>" % (
                   self.vehicle.make.name,
                   self.vehicle.model.name)

registerAdapter(VehicleRenderer, IVehicle, IHTMLRenderer)


Now, whenever you do IHTMLRenderer(someVehicle), you'll get a VehicleRenderer(someVehicle).

Your code for rendering now doesn't need any special-case knowledge about particular types.  It is written to an interface, and it's very easy to figure out which one; it says "IHTMLRenderer" right there.  It's also easy to find implementors of that interface; just grep for "implementer.*IHTMLRenderer" or similar.  Or, use pydoctor and look at the "known implementations" section for the interface in question.

Conclusion

In a super-dynamic language like Python, you don't need a system for explicit abstract interfaces.  No compiler is going to shoot you for calling a 'foo' method.  But, formal interface definitions serve many purposes.  They can function as documentation, as a touchstone for code which wants to clearly report programming errors ("warning:  MyWidget claims to implement IWidget, but doesn't implement a 'doWidgetStuff' method"), and as a mechanism for indirection when you know what contract your code wants but you don't know what implementation will necessarily satisfy it (adaptation).

Even with a standard library mechanism for doing some of these things, Zope Interface remains a highly useful library, and if you are working on a large Python system you should consider augmenting its organization and documentation with this excellent tool.

Help Us Help You

The Twisted project is getting ready for another round of fund-raising.

Like last year, we'll be centering this effort around PyCon.  This year, we have a year's track-record for our potential sponsors to evaluate us on.

During this year of sponsored development, we closed a record number of tickets.  Partially this is due to the work that JP has done himself; partially it is due to the increased rate at which users' contributions have been reviewed.

Aside from raw numbers, funding has allowed us to dedicate the sustained effort required to deal with very old, very unpleasant, very difficult issues like properly handling child process termination and the development of a new, better HTTP client.

But we can't keep this up without help.  Unless you've been living in a very deep hole, you know that the world's economy has exploded and a lot of companies are feeling the pain.  It will be harder in this tough climate to convince companies that this is a good time to invest in software which "doesn't cost anything".

At the same time, we believe that we could get an even better outcome for the Twisted project if we can allocate more funds this year.  We could upgrade from part-time to full-time maintenance, do more new development, and possibly even fund a Twisted conference.

This is where you come in.  The people responsible for raising funds for Twisted are mostly the same people who write code for it.  The more help we can get from you — the developers who use Twisted — the more of our spare time we can spend writing code.

If you are interested in helping with this, especially if you have experience doing fund-raising, please let us know on the mailing list.  This is a great opportunity for those of you who would really like to give something back to Twisted but haven't had the opportunity to contribute code.

(Of course, if you don't have the time to help with fundraising either, you can always make a small personal contribution using the form on the front page of twistedmatrix.com.  Every little bit helps, and donations are tax-deductible.)

I would be remiss if I did not mention that the Software Freedom Conservancy has been extremely helpful in helping us collect donations, manage our accounts, and deal with the legal paperwork of establishing a non-profit.  Without their help we would likely not have had the collective attention span to establish a foundation and lay the groundwork that now allows us to collect tax-exempt donations.  If you are contributing to Twisted, please consider contributing something to the SFC as well.

Using SSH Keys on a USB Drive on MacOS X

I keep my SSH private key on a USB thumb drive.

The idea is that I don't want my private key to be on the hard disk of any of the computers that I use.  I use several, and I can't watch them all constantly, so I don't want to leave my key lying around for automated attackers to pick up.

I load the key directly from the USB drive into my SSH agent, which then mlock()s it so it doesn't get put into swap.

This works on Windows (with PuTTY) and Linux just fine.  Unfortunately MacOS X has a nasty habit of mounting FAT volumes with free-for-all permissions, so when I try to load the key:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0777 for '/Volumes/GRAVITON/....id_rsa' are too open.
It is recommended that your private key files are NOT accessible by others.
This private key will be ignored.
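The complaint is about the group and other permission bits.  As a rough Python sketch of that kind of check (an approximation for illustration, not ssh's actual code; is_too_open and the id_rsa_demo file name are hypothetical, and a POSIX filesystem is assumed):

```python
import os
import stat
import tempfile


def is_too_open(path):
    """Return True if group or others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) != 0


with tempfile.TemporaryDirectory() as directory:
    key_path = os.path.join(directory, "id_rsa_demo")  # hypothetical name
    with open(key_path, "w") as f:
        f.write("not a real key\n")

    # Mimic the free-for-all permissions of the mounted FAT volume.
    os.chmod(key_path, 0o777)
    open_result = is_too_open(key_path)

    # Owner-only access, which is what the key file should have.
    os.chmod(key_path, 0o600)
    private_result = is_too_open(key_path)
```

On a FAT volume the chmod step is exactly what you can't do, since the permissions come from the mount options rather than from the file.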

I thought that this was an intractable problem.  The only solution I'd found previously was to make a copy of the key, make a sparse disk image, and manually mount the sparse disk image.  However, this workaround has two problems:
  1. It's inconvenient.  I have to manually locate the disk image every time, double click it, etc.
  2. It's insecure.  If I ever allow other users to log in to any of my OS X machines, they can read the version of the key I'm not using on the FAT filesystem, even if only I can read the one on the HFS+ disk image.
Today, almost by accident, I discovered the real answer.

The daemon that mounts disks on OS X is called "diskarbitrationd".  I discovered this by running across some OpenDarwin documentation which explains that you can configure this daemon by putting a line into fstab.

First you need a way to identify the device in question.  None of the suggested mechanisms for determining the device UUID worked for me, so I used the device label instead.  This is probably desirable anyway, since at least you can tell when the label changes; if you move your key to a similar device, the UUID is different but you can't tell.

You can set the device label by mounting your USB drive, doing "get info" on it, editing the name in the "name and extension" section, and then hitting enter.  You should use an all-caps name, since when you re-mount the drive it will be all-caps again anyway.

You also need to know your user-ID.  The command 'id -u' will return it.

Then, you need to add a single line to /etc/fstab.  My drive's label is "GRAVITON", and my user-ID is 501, so it looks like this:

LABEL=GRAVITON none msdos -u=501,-m=700
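If you'd rather generate that line than type it, here is a small sketch.  The fstab_entry helper is hypothetical, and it simply reproduces the format shown above with your own label and user-ID filled in:

```python
import os


def fstab_entry(label, uid, mode=0o700):
    """Format an fstab line like the one above for a FAT volume label."""
    # %o renders the mode in octal, so 0o700 becomes "700".
    return "LABEL=%s none msdos -u=%d,-m=%o" % (label, uid, mode)


entry = fstab_entry("GRAVITON", 501)
```

Calling fstab_entry("GRAVITON", os.getuid()) would use the current user's ID, which is the same number that 'id -u' prints.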

Now, all you have to do is eject your drive and plug it in again.  Voila!

$ ssh-add /Volumes/GRAVITON/....keychain.id_rsa
Identity added: /Volumes/GRAVITON/....keychain.id_rsa (/Volumes/GRAVITON/...keychain.id_rsa)

Now you can securely carry your SSH key with you to Macs, without breaking ssh-agent's intended protection.