A Tired Raccoon’s Containerization Manifesto

Just Do The Containering

[Image: a group of raccoons looking at the viewer, with the text “death is coming, eat trash, be free”]

Some of you out there are still stuck on old deployment workflows that drop software directly onto shared hosts. Maybe it’s a personal thing that you just don’t have the energy to maintain particularly well. Maybe it’s a service at work stuck without any dedicated owner or maintenance resources that keeps limping along.

This post is a call to action for doing the minimum possible work to get it into a container, and to do that transition badly and quickly. I’ve done it for a bunch of minor things I maintain and it’s improved my life greatly; I just re-build the images with the latest security updates every week or so and let them run on autopilot, never worrying about what previous changes have been made to the host. If you can do it[1], it’s worth it.

Death is Coming

Your existing mutable infrastructure is slowly decaying. You already know that one day you’re going to log in and update the wrong package and it’s gonna blow up half of the software running on your box. Some .so is going to go missing, or an inscrutable configuration conflict will make some network port stop listening.

Either that or you’re not going to update religiously, and eventually it’ll get commandeered by cryptocurrency miners. Either way, your application goes down and you do a lot of annoying grunt work to get it back.

These boxes won’t survive forever. You’ve gotta do something.

Eat Trash

You don’t need to follow the daily churn of containerization “best practices” in order to get 95% of the benefit of containers. The huge benefit is just having a fully repeatable build process that can’t compromise your ability to boot or remotely administer your entire server. Your build doesn’t have to be good, or scalable. I will take 25 garbage shell scripts guaranteed to run isolated within a container over a beautifully maintained deployment system written in $YOUR_FAVORITE_LANGUAGE that installs arbitrary application packages as root onto a host any day of the week. The scope of potential harm from an error is orders of magnitude reduced.

Don’t think hard about it. Just pretend you’re deploying to a new host and manually doing whatever faffing around you’d have to do anyway if your existing server had some unrecoverable hardware failure. The only difference is that instead of typing the commands to do it after an administrative root@host# prompt on some freshly re-provisioned machine, you type it after a RUN statement in a Dockerfile.
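
For instance, a deliberately unsophisticated Dockerfile in that spirit might look like the sketch below; the base image, package names, paths, and the “myapp” module name are all placeholders for whatever your service actually needs:

FROM debian:bullseye
# Exactly the faffing-around you'd type at root@host#, just after RUN:
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 python3-pip
COPY . /app
RUN pip3 install /app
# "myapp" is a hypothetical module name; substitute your own entry point.
CMD ["python3", "-m", "myapp"]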

Be Free

Now that you’ve built some images, rebuild them, including pulling new base images, every so often. Deploy them with docker run --restart=always ... and forget about them until you have time for another round of security updates. If the service breaks? Roll back to the previous image and worry about it later. Updating this way means you get to decide how much debugging effort it’s worth if something breaks in the rebuild, instead of inherently being down because of a bad update.
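
As a sketch of that routine, assuming an image and container both named “myapp” (placeholders), the whole periodic ritual can be a few shell commands:

# Rebuild, pulling fresh base images to pick up security updates:
docker build --pull -t myapp:new .
# Replace the running container with the new image:
docker stop myapp; docker rm myapp
docker run -d --name myapp --restart=always myapp:new
# If it breaks, re-run the previous tag and worry about it later.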

There. You’re done. Now you can go live your life instead of updating a million operating system packages.


  1. Sadly, this advice is not universal. I certainly understand what it’s like to have a rat king of complexity containing services with interdependencies too complex to be trivially stuffed into a single container. 

Detweeting

I'm taking a break from Twitter until at least July 1, 2021.

Twitter is horrible. Everyone already knows this.[1][2][3]

But, Twitter can also be good, sometimes, after a fashion.

Throughout the pandemic, I have personally found Twitter to be a helpful tool for self-regulation. The little hits of dopamine on demand throughout the day have allowed me to suppress and modulate some truly unpleasant intrusive thoughts, during times when I have had neither the executive function nor sufficient continuous uninterrupted time allocated to focus on other, more useful things. Twitter has allowed me to anesthetize the internal doom-sayer during the absolutely most mind-shatteringly stressful period of my — and, presumably, most living humans’ — entire life.

Like any anesthetic, however, there comes a point where administering additional doses is more harmful than beneficial, even if the pain it’s suppressing is still there. It’s time for me to take a break, and it seems like it would be wise to take one long enough for new habits to form.

To that end, I’ll be taking the entirety of June off from Twitter; depending on how that goes, I might see you back there on 2021-07-01, or, should I find the fortitude in the meanwhile, never.

The “I’m taking a break from social media” genre of post is certainly a bit self-indulgent[4], so it behooves me to say why I’m bothering to post about this rather than just, you know, doing it.

There are three reasons:

  1. Changing times: I’m naturally distractible, so I tend to keep an eye on my social media usage. I periodically look at how much time I’m spending, the benefits I’m getting, and the problems it’s causing. For most of the pandemic I could point to at least one or two useful actions per week that I’d taken because of something I’d learned on Twitter. Sometimes I’d learn about risk modeling or health precautions, emerging understanding of the impacts of isolation on mental health, and ways to participate in addressing the exhausting, non-stop political upheaval of 2020/2021. But now I’m mostly just agonizing over the lack of any useful guidance for parents with young children who cannot yet get vaccinated for COVID-19 at this late stage of the crisis, and getting directionlessly angrier about the state of the world. The benefits have slowly evaporated over the last few weeks, but the costs remain.[5]

  2. Accountability: simply deleting the app, logging out of the website, etc., is clearly not enough to stay away, so an audience who can notice me posting and say “stop posting” should hopefully be enough to keep me honest. Please do note that I will still be allowing certain automated systems to post on my behalf, though. This post, for example, and any other posts I put on my blog, will show up in my Twitter feed automatically; I don’t post those manually.

  3. A gentle prompt for others: maybe you’re having similar issues. Maybe you’d like to join me. During the pandemic I’ve found that many of the unpleasant mental states I’ve described are more relatable than usual. Some so much so that whole articles have been written about the jargon to describe them, like “disenfranchised stress”[6] and “vicarious trauma”[7]. Feel free to ignore this: I’m not saying you should join me. Just that if you’ve already been thinking you should, you can take this as a challenge to do the same.

In the meanwhile, I’ll try to do some longer-form writing, particularly writing that isn’t about social media.

If you’d like to get in touch, I won’t be replying to DMs, so feel free to send me an email directly. If you want to interact in real time, I am still on IRC, as glyph on irc.libera.chat. Feel free to drop by #glyph and say hi.

Interfaces and Protocols

Comparing zope.interface and typing.Protocol.

Some of you read my previous post on typing.Protocol and probably wondered: “what about zope.interface?” I’ve advocated strongly for it in the past — but now that we have Mypy and Protocols, is it simply a relic of an earlier time? Can we entirely replace it with Protocol?

Let’s have a look.

Typing in 2 dimensions

In the previous post I discussed structural versus nominal typing. In Mypy’s type system, most classes are checked nominally whereas Protocol is checked structurally. However, there’s another way that Protocol is distinct from a normal class: normal classes are concrete types, and Protocols are abstract.

Abstract types:

  1. cannot be instantiated: every instance of an abstract type is an instance of some concrete sub-type, and
  2. do not include (complete) implementation logic.

Concrete types:

  1. can be instantiated: they are complete descriptions of a type, and
  2. must include all their own implementation logic.

Protocols and Interfaces are both abstract, but Interfaces are nominal. The highest level distinction between the two is that when you have a problem that requires an abstract type, but nominal checking is preferable to structural, Interfaces are a better solution.

Python’s built-in Abstract Base Classes are technically abstract-and-nominal as well, but they’re in a strange halfway space: they’re formally “abstract” because they can’t be instantiated, but they’re partially concrete in that they can contain any amount of implementation logic themselves, so making an object which is a subtype of multiple ABCs drags in all the usual problems of conflicting namespaces within multiple inheritance.

Theoretically, there’s a way to treat ABCs as purely abstract — which is to use ABCMeta.register — but as of this writing (March 2021) it doesn’t work with Mypy, so within the context of “static typing in Python” we presently have to ignore it.
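
For reference, here’s a minimal sketch of that purely-abstract usage; it works at runtime, but as noted above, Mypy won’t follow it:

from abc import ABC

class Vehicle(ABC):
    ...

class Bicycle:
    ...

# Register Bicycle as a "virtual subclass" without inheriting from
# Vehicle; isinstance() passes at runtime, but Mypy doesn't understand it.
Vehicle.register(Bicycle)

assert isinstance(Bicycle(), Vehicle)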

Practicalities

The first major advantage that Protocol has is that since it is now built in to Python itself, there’s no reason not to use it. When Protocol didn’t even exist, regardless of all the advantages of adding explicit abstract types to your project with zope.interface, it did still have the small down-side of requiring a new dependency, with all the minor headaches that might imply.

Beyond the theoretical distinctions, there’s a question of how well tooling supports zope.interface. There are some clear gaps: there is not a ton of great built-in IDE support for zope.interface, and less-sophisticated linters will sometimes still complain that Interface methods don’t take self as their first argument. Indeed, Mypy itself does this by default — although more on that in a moment. Less mainstream, performance-focused type-checkers like Pyre and Pyright don’t support zope.interface either, although that is just one part of a broader problem of their lack of extensibility; they also can’t support SQLAlchemy or the Django ORM without special-casing in the tools themselves.

But what about Mypy itself? If we have to discount ABCMeta.register due to practical tooling deficiencies, even though it provides a built-in way to declare a nominal-but-abstract type in principle, then for a fair comparison with Protocol we need to be able to use zope.interface within Mypy as well. Can we?

Luckily, yes! Thanks to Shoobx, there’s a fairly actively maintained Mypy plugin that supports zope.interface which you can use to statically check your Interfaces.
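
Enabling it is a small configuration change; something like this in your mypy.ini (check the plugin’s own documentation for the exact incantation):

[mypy]
plugins = mypy_zope:plugin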

However, this plugin does have a few key limitations as of this writing (again, March 2021), which make its safety guarantees a bit lower-quality than Protocol’s.

The net result of this is that Protocols have the “home-field advantage” in most cases: out of the box, they’ll work more smoothly with your existing editor and linter setup, and as long as your project supports Python 3.6+, at worst (if you can’t use Python 3.8, where Protocol is built in to typing) you have to take a type-check-time dependency on the typing_extensions package, whereas with zope.interface you’ll need both the run-time dependency on zope.interface itself and the Mypy plugin at type-checking time.

So in a situation where both are roughly equivalent, Protocol tends to win by default. There are undeniably big areas where Interfaces and Protocols overlap, and in plenty of them, using Protocol is a fine idea. But there are still some clear places that zope.interface shines.

First, let’s look at a case which Interfaces handle more gracefully than Protocols: opting out of matching a simple shape, where the shape doesn’t fully describe its own meaning.

Where Interfaces work best: hidden and complex meanings

The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.

Alan Perlis, “Epigrams on Programming”, Epigram 34.

The place where structural typing has the biggest advantage is when the type system is expressive enough to fully encode the meaning of the desired behavior within the structure of the type itself. Consider a Protocol which describes an object that can add some integers together:

from typing import Protocol

class Math(Protocol):
    def add_integers(self, addend1: int, addend2: int) -> int:
        ...

It’s fairly unambiguous what adherents to this Protocol should do, and anyone implementing such a thing should be able to clearly tell that the method is supposed to add a couple of integers together; there’s nothing hidden about the structure of the integers, no constraints the type system won’t let us specify. It would be quite surprising if anything that didn’t have the intended behavior would match this Protocol.

At the other end of the spectrum, we might have a plugin Interface that has a lot of hidden structure. For this example, we have an Interface called IPlugin containing a method with an easy-to-conflict-with name (“name”) overloaded with very specific constraints on its return type: the string must contain the dotted-path name of a Python object in an import-able module (like, for example, "os.path.join").

from zope.interface import Interface

class IPlugin(Interface):
    def name() -> str:
        "Return the fully-qualified Python identifier of the thing to load."

With Protocols, you can work around these limitations by manually making the type harder to match: adding elements to the structure that embed names relevant to its semantics, thereby making the type behave more as if it were nominally typed.

You could make the method’s name long and ugly instead (plugin_name_to_load, let’s say) or add unused additional attributes (yep_i_am_a_plugin = Literal[True]) in order to reduce the risk of accidental matches, but these workarounds look hacky, and they have to be manually namespaced; if you want to mark it as having semantics associated with your specific plugin system, you have to embed the name of that system in your attributes themselves; here we’re just saying “plugin” but if we want to be truly careful, we have to embed the whole name of our project in there.
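
Spelled out, that hacky manual namespacing looks something like this; every name here is illustrative:

from typing import Literal, Protocol

class MyProjectPluginProtocol(Protocol):
    # An unused marker attribute, manually namespaced with the project
    # name, purely to discourage accidental structural matches:
    yep_i_am_a_myproject_plugin: Literal[True]

    def myproject_plugin_name_to_load(self) -> str:
        ...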

With Interfaces, the maintainer of each implementation must explicitly opt in, by choosing whether to specify that they are an @implementer(IPlugin). Since they had to import IPlugin from somewhere, this annotation carries with it a specific, namespaced declaration of semantic intent: “I know what the Interface IPlugin means, and I promise that I can provide it”.
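
In code, that opt-in is one explicitly imported decorator; a minimal sketch, using the IPlugin example from above:

from zope.interface import implementer

@implementer(IPlugin)
class JoinPlugin:
    "I know what IPlugin means, and I promise that I can provide it."

    def name(self) -> str:
        return "os.path.join"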

This is the most salient distinction between Protocols and Interfaces: if you have strong reasons to want adherents to the abstract type to opt in, you want an Interface; if you want them to match automatically, you want a Protocol.

Runtime support

Interfaces also provide a more nuanced set of runtime checks.

You can say that an object directlyProvides an interface, allowing for some level of (at least runtime) type safety, and ask if IPlugin is .providedBy some object.
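
A minimal sketch of those runtime checks, again using IPlugin from above; note that directlyProvides marks one specific instance, not its class:

from zope.interface import directlyProvides

class Unremarkable:
    def name(self) -> str:
        return "os.path.join"

one = Unremarkable()
directlyProvides(one, IPlugin)

assert IPlugin.providedBy(one)
assert not IPlugin.providedBy(Unremarkable())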

You can do most of this with Protocol, but it’s awkward. The @runtime_checkable decorator allows your Protocol to make isinstance(x, MyProtocol) work like IMyInterface.providedBy(x), but:

  1. you’re still missing directlyProvides; the runtime checking is all by type, not by the individual properties of the instance;
  2. it’s not the default, so if you’re not the one defining the Protocol, there’s no guarantee you’ll be able to use it.
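
Here’s what the @runtime_checkable approach looks like; note that the isinstance check only verifies that the members exist, not that their signatures match:

from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsName(Protocol):
    def name(self) -> str:
        ...

class Anything:
    def name(self) -> str:
        return "os.path.join"

# Passes: the runtime structural check only sees that .name exists.
assert isinstance(Anything(), SupportsName)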

With Interfaces, there’s also no mandatory relationship between the implementer (i.e. the type whose instances fit the specified shape) and the provider (the specific object which can fit the specified shape). This means you get features like classProvides and moduleProvides “for free”.

Interfaces work particularly well for communication between frameworks and application code. For example, let’s say you’re evolving the meaning of an Interface implemented by applications over time — EventHandler, EventHandler2, EventHandler3 — which have similarly named and typed methods, but subtly different expectations on their lifecycle or on when precisely the methods will be called. A framework facing this problem can use a series of Interfaces, check at runtime to see which of these the application implements, and be secure in the knowledge that the application has deliberately adopted the new interface, and doesn’t just happen to have a matching method name left over from an older version.
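
A sketch of how a framework might perform that check, with hypothetical Interfaces mirroring the names above:

from zope.interface import Interface

class IEventHandler(Interface):
    def handle(event):
        "Handle an event, with the original lifecycle expectations."

class IEventHandler2(Interface):
    def handle(event):
        "The same shape, but with a subtly different lifecycle."

def dispatch(handler, event):
    # Prefer the newer contract, but only if the application explicitly
    # declared (via @implementer) that it adopted it:
    if IEventHandler2.providedBy(handler):
        handler.handle(event)  # invoke with the new conventions
    elif IEventHandler.providedBy(handler):
        handler.handle(event)  # invoke with the old conventions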

Finally, zope.interface gives you adaptation and adapter registries, which can be a useful mechanism for doing things like templating; think of adaptation as a much more powerful version of singledispatch from the standard library.

Adapter registries are nuanced, complex tools and unfortunately an example that captures the full utility of their power would itself be commensurately complex. However, the core of adaptation is the idea that if you have an arbitrary object x, and you want a provider of the interface IY, you can do the following:

y = IY(x, None)

This performs a multi-stage check:

  1. If x already provides IY (either via implementer, provider, directlyProvides, classProvides, or moduleProvides), it’s simply returned; so you don’t need to special-case the case where you’ve already got what you want.
  2. If x has a __conform__(interface) method, it’ll be called with IY as the interface, and if __conform__ returns anything non-None that result will be returned from the call to IY.
  3. If IY has a specially-defined __adapt__ method, it can implement its own logic for this hook directly.
  4. Each globally-registered function in zope.interface’s adapter_hooks will be invoked to find a function that can transform x into an IY provider. Twisted has its own global registry in this list, which is what registerAdapter manipulates.

But from the perspective of the caller, you can just say “I want an IY”.
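
As an example of step 4, here’s a minimal sketch of teaching Twisted’s global registry to produce an IY from an arbitrary X; the classes are hypothetical:

from twisted.python.components import registerAdapter
from zope.interface import Interface, implementer

class IY(Interface):
    def why() -> str:
        "Explain yourself."

class X:
    pass

@implementer(IY)
class XAsY:
    def __init__(self, original):
        self.original = original

    def why(self):
        return "adapted from an X"

# Teach the global registry how to turn an X into an IY provider:
registerAdapter(XAsY, X, IY)

y = IY(X(), None)  # an XAsY wrapping the X instance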

With Protocols, you can emulate this with functools.singledispatch by making a function which returns your Protocol type and registering various types to do conversion. The advantage of adapter registries is their central nature and consistent idiom for converting to the target type: you can use adaptation for any Interface in the same way, and any type can participate in adaptation in the ways listed above, via flexible mechanisms depending on where it makes sense to put your implementation. By contrast, any singledispatch function to convert to a Protocol needs to be bespoke per-Protocol.
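
A sketch of that singledispatch emulation with a toy Protocol; every name here is made up for illustration:

from functools import singledispatch
from typing import Protocol

class Named(Protocol):
    def name(self) -> str:
        ...

class NameWrapper:
    def __init__(self, text: str) -> None:
        self._text = text

    def name(self) -> str:
        return self._text

@singledispatch
def as_named(obj: object) -> Named:
    # Fallback when no conversion has been registered for this type.
    raise TypeError(f"cannot convert {obj!r} to Named")

@as_named.register
def _(obj: str) -> Named:
    return NameWrapper(obj)

assert as_named("os.path.join").name() == "os.path.join"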

Describing and restricting existing shapes

There are still several scenarios where Protocol’s semantics apply more cleanly.

Unlike Interfaces, Protocols can describe the types of things that already exist. To see when that’s an advantage, consider a sprawling application that uses tons of libraries and manipulates 3D spatial data points.

There’s a convention among these disparate libraries where they all represent a “point” as an object with .x, .y, and .z attributes which are all floats. This is a natural enough shape, given the domain, that lots of your libraries just fit it by accident. You want to write functions that can work with data output by any of these libraries as long as it plausibly looks like your own concept of a Point:

from typing import Protocol

class Point(Protocol):
    x: float
    y: float
    z: float

In this case, the thing defining the Protocol is your application; the thing implementing the Protocol is your collection of libraries. Since the libraries don’t and can’t know about the application — the dependency arrow points the other way — they can’t reference the Protocol to note that they implement it.
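
The application just declares the Protocol and consumes it; a sketch, with a dataclass standing in for one of those third-party point types:

from dataclasses import dataclass

@dataclass
class LibraryVec:
    "A stand-in for a third-party type that knows nothing of Point."
    x: float
    y: float
    z: float

def magnitude(p: Point) -> float:
    # Anything with float x, y, and z attributes matches structurally.
    return (p.x ** 2 + p.y ** 2 + p.z ** 2) ** 0.5

print(magnitude(LibraryVec(3.0, 4.0, 12.0)))  # 13.0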

Using Protocol, you can also restrict an existing type to preserve future flexibility.

For example, let’s say we’re implementing a “mailbox” type pattern, where some systems deliver messages and other systems retrieve them later. To avoid mix-ups, the system that sends the messages shouldn’t retrieve them, and vice versa: receivers only receive, and senders only send. With Protocols, we can describe this without having any new custom concrete types, like so:

from typing import Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)
T_con = TypeVar("T_con", contravariant=True)

class Sender(Protocol[T_con]):
    def add(self, item: T_con) -> None:
        "Put an item in the slot."

class Receiver(Protocol[T_co]):
    def pop(self) -> T_co:
        "Retrieve an item from the PO box."

All of that code is just telling Mypy our intentions; there’s no behavior here yet.

The actual implementation is even shorter:

from typing import Set

mailbox: Set[int] = set()

Literally no code of our own: set already does the job we described. And how do we use this?

def send(sender: Sender[int]) -> None:
    sender.add(3)

def receive(receiver: Receiver[int]) -> None:
    receiver.pop()
    # Mypy stops us from making this mistake:
    # receiver.add(3)
    # error: "Receiver[int]" has no attribute "add"

send(mailbox)
receive(mailbox)

For its initial implementation, this system requires nothing beyond types available in the standard library: just a set. However, by treating their parameters as a Sender and a Receiver respectively, rather than as a Set, send and receive prevent themselves from using any functionality from the set passed in aside from the one method that their respective roles are supposed to “see”. As a result, Mypy will now tell us if any code which receives the sender object tries to remove objects.

This allows us to use existing data structures in libraries without the usual attendant problem of advertising to all clients that every tiny implementation detail of those existing structures is an intended part of the public interface. Python has always tried to make these sorts of distinctions by leaving certain things undocumented or saying narratively which things you should rely on, but it’s always hit-or-miss (usually miss) whether library consumers will see those admonitions or not; by making it a feature of the programming environment, Mypy makes it harder to ignore.

Conclusions

In modern Python code, when you have an abstract collection of behavior, you should probably consider using a Protocol to describe it by default. However, Interface is also staying up to date with modern Python tooling via its Mypy support, and it can be worthwhile for more sophisticated consumers that want support for nominal typing, or that want to draw on its rich adaptation and component-registration feature-set.

Faster

The solution to bad questions is to ask better questions, not to ask no questions.

I’ve often heard Henry Ford quoted as saying:

“If I had asked people what they wanted, they would have said faster horses.”

Despite the fact that he probably didn’t actually say that, it does neatly encapsulate a certain approach to product development. And it’s one that the modern technology industry loves to lionize.

There’s a genre of mythologized product development whereby wholly unique and novel products spring, fully-formed, Athena-like, from the foreheads of Zeusian industrialists like Jobs, or Musk, or Bezos. This act of creation requires no input from customers. Indeed, the myths constructed about the iconic products associated with these industrialists often gloss over or outright ignore the work of their hundreds of thousands of employees, not to mention long years of iteration along with legions of early-adopter customers.

Ford’s other major area of contribution to public discourse was, of course, being a big ol’ Nazi, just writing so much Nazi stuff that he was one of Hitler’s heroes.[1]

This could be a coincidence, of course; lots of prominent thinkers in the past were absolutely hideous racists, anti-semites, slave owners and worse; these terrible ideas were often products of the time, and the people who held them sometimes nevertheless had other ideas worth examining.

But I think that this sentiment reflects a wider underlying infatuation with authoritarian ideology. At its core, the idea is that the uniquely gifted engineer is just better than their users, fundamentally smarter, more able to discern their true needs, more aware of the capabilities of the technology that we alone are familiar with. Why ask the little people, they can’t possibly know what they really need.

While we may blithely quote this sort of thing, when you look at the nuts and bolts of the technology industry, its actual practice has matured past it. Focus groups and user research are now cornerstones of interaction design. We know that it’s pure hubris to think we can predict the ways users will react; you can’t just wing it.

But, I hadn’t heard a similarly pithy encapsulation of an empathetic approach that keeps the user in the loop and doesn’t condescend to them, until today. The quote came up, and my good friend Tristan Seligmann responded with this:

If you ask your users questions that they don’t have the skills to answer — like “how can we improve your horse?” — they will give you bad answers; but the solution to this is to ask better questions, not to ask no questions.

Tristan Seligmann

That, fundamentally, is the work-product of a really good engineer. Not faster horses or faster cars:

Better questions.


  1. Pro tip: don’t base your design ethos on Nazi ideas. 

Nice Animations with Twisted and PyGame

Flicker-free, time-accurate animation and movement using LoopingCall.

[Image: SNEKS]

One of my favorite features within Twisted — but also one of the least known — is LoopingCall.withCount, which can be used in applications where you have some real-time thing happening, which needs to keep happening at a smooth rate regardless of any concurrent activity or pauses in the main loop. Originally designed for playing audio samples from a softphone without introducing a desync delay over time, it can also be used to play animations while keeping track of their appropriate frame.

LoopingCall is all around a fun tool to build little game features with. I’ve built a quick little demo to showcase some discoveries I’ve made over a few years of small hobby projects (none of which are ready for an open-source release) over here: DrawSnek.

This little demo responds to 3 key-presses:

  1. q quits. Always a useful thing for full-screen apps which don’t always play nice with C-c :).
  2. s spawns an additional snek. Have fun, make many sneks.
  3. h introduces a random “hiccup” of up to 1 full second so you can see what happens visually when the loop is overburdened or stuck.

Unfortunately a fully-functioning demo is a bit lengthy to go over line by line in a blog post, so I’ll just focus on a couple of important features for stutter- and tearing-resistant animation & drawing with PyGame & Twisted.

For starters, you’ll want to use a very recent prerelease of PyGame 2, which added support for vertical sync even without OpenGL mode; then, pass the vsync=1 argument to set_mode:

import pygame
import pygame.locals

screen = pygame.display.set_mode(
    (640 * 2, 480 * 2),
    pygame.locals.SCALED | pygame.locals.FULLSCREEN,
    vsync=1,
)

To allow for as much wall-clock time as possible to handle non-drawing work, such as AI and input handling, I also use this trick:

from twisted.internet.task import LoopingCall
from twisted.internet.threads import deferToThread

def drawScene():
    # "screen" is the display surface from set_mode above; "drawables"
    # is assumed to be a list of objects with a .draw(screen) method.
    screen.fill((0, 0, 0))
    for drawable in drawables:
        drawable.draw(screen)
    # Flip in a thread, so the reactor keeps running while we block
    # waiting for the vertical blank.
    return deferToThread(pygame.display.flip)

LoopingCall(drawScene).start(1 / 62.0)

By deferring pygame.display.flip to a thread[1], the main loop can continue processing AI timers, animation, network input, and user input while the thread blocks, waiting for the vertical blank. Since the time-to-vblank can easily be up to 1/120th of a second, this is a significant amount of time! We know that the draw won’t overlap with flip, because LoopingCall respects Deferreds returned from its callable and won’t re-invoke you until the Deferred fires.

Drawing doesn’t use withCount, because it just needs to repeat about once every refresh interval (on most displays, about 1/60th of a second); the vblank timing is what makes sure it lines up.

However, animation looks like this:

def animate(self, frameCount):
    self.index += frameCount
    self.index %= len(self.images)

We move the index forward by however many frames have elapsed since the last call, then make sure it wraps around by taking it modulo the number of frames.
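
Wiring that up is a one-liner; assuming a sprite object exposing the animate method above, it looks something like this:

from twisted.internet.task import LoopingCall

# Aim for 24 animation frames per second; if the main loop stalls, the
# next call receives frameCount > 1, so the animation catches up.
LoopingCall.withCount(sprite.animate).start(1 / 24.0)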

Similarly, the core[2] of movement looks like this:

def move(self, frameCount):
    self.sprite.x += frameCount * self.dx
    self.sprite.y += frameCount * self.dy

Rather than moving based on the number of times we’ve been called, which can result in slowed-down movement when the framerate isn’t keeping up, we jump forward by however many frames we should have been called at this point in time.

One of these days, maybe I’ll make an actual game, but in the meanwhile I hope you all enjoy playing with these fun little basic techniques for using Twisted in your game engine.


  1. I’m mostly sure that this is safe, but it’s definitely the dodgiest thing here. If you’re going to do this, make sure that you never do any drawing outside of the draw() method.

  2. Hand-waving over a ton of tedious logic to change direction before we go out of bounds...