Zen Guardian

Let’s rewrite a fun toy Python program - in Python!

There should be one — and preferably only one — obvious way to do it.

Tim Peters, “The Zen of Python”


Moshe wrote a blog post a couple of days ago which neatly constructs a wonderful little coding example from a scene in a movie. And, as we know from the Zen of Python quote, there should only be one obvious way to do something in Python. So my initial reaction to his post was of course to do it differently — to replace an __init__ method with the new @dataclasses.dataclass decorator.

But as I thought about the code example more, I realized there are a number of things beyond just dataclasses that make the difference between “toy”, example-quality Python, and what you’d do in a modern, professional, production codebase today.

So let’s do everything the second, not-obvious way!


There’s more than one way to do it

Larry Wall, “The Other Zen of Python”


Getting started: the __future__ is now

We will want to use type annotations. But, the Guard and his friend are very self-referential, and will have lots of annotations that reference things that come later in the file. So we’ll want to take advantage of a future feature of Python, which is to say, Postponed Evaluation of Annotations. In addition to the benefit of slightly improving our import time, it’ll let us use the nice type annotation syntax without any ugly quoting, even when we need to make forward references.

So, to begin:

1
from __future__ import annotations

Doors: safe sets of constants

Next, let’s tackle the concept of “doors”. We don’t need to gold-plate this with a full blown Door class with instances and methods - doors don’t have any behavior or state in this example, and we don’t need to add it. But, we still wouldn’t want anyone using using this library to mix up a door or accidentally plunge to their doom by accidentally passing "certian death" when they meant certain. So a Door clearly needs a type of its own, which is to say, an Enum:

1
2
3
4
5
from enum import Enum

class Door(Enum):
    certain_death = "certain death"
    castle = "castle"

Questions: describing type interfaces

Next up, what is a “question”? Guards expect a very specific sort of value as their question argument and we if we’re using type annotations, we should specify what it is. We want a Question type that defines arguments for each part of the universe of knowledge that these guards understand. This includes who they are themselves, who the set of both guards are, and what the doors are.

We can specify it like so:

1
2
3
4
5
6
7
from typing import Protocol, Sequence

class Question(Protocol):
    def __call__(
        self, guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        ...

The most flexible way to define a type of thing you can call using mypy and typing is to define a Protocol with a __call__ method and nothing else1. We could also describe this type as Question = Callable[[Guard, Sequence[Guard], Door], bool] instead, but as you may be able to infer, that doesn’t let you easily specify names of arguments, or keyword-only or positional-only arguments, or required default values. So Protocol-with-__call__ it is.

At this point, we also get to consider; does the questioner need the ability to change the collection of doors they’re passed? Probably not; they’re just asking questions, not giving commands. So they should receive an immutable version, which means we need to import Sequence from the typing module and not List, and use that for both guards and doors argument types.

Guards and questions: annotating existing logic with types

Next up, what does Guard look like now? Aside from adding some type annotations — and using our shiny new Door and Question types — it looks substantially similar to Moshe’s version:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
from dataclasses import dataclass

@dataclass
class Guard:
    _truth_teller: bool
    _guards: Sequence[Guard]
    _doors: Sequence[Door]

    def ask(self, question: Question) -> bool:
        answer = question(self, self._guards, self._doors)
        if not self._truth_teller:
            answer = not answer
        return answer

Similarly, the question that we want to ask looks quite similar, with the addition of:

  1. type annotations for both the “outer” and the “inner” question, and
  2. using Door.castle for our comparison rather than the string "castle"
  3. replacing List with Sequence, as discussed above, since the guards in this puzzle also have no power to change their environment, only to answer questions.
  4. using the [var] = value syntax for destructuring bind, rather than the more subtle var, = value form
1
2
3
4
5
6
7
8
9
def question(guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]) -> bool:
    [other_guard] = (candidate for candidate in guards if candidate != guard)

    def other_question(
        guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        return doors[0] == Door.castle

    return other_guard.ask(other_question)

Eliminating global state: building the guard post

Next up, how shall we initialize this collection of guards? Setting a couple of global variables is never good style, so let’s encapsulate this within a function:

1
2
3
4
5
6
7
from typing import List

def make_guard_post() -> Sequence[Guard]:
    doors = list(Door)
    guards: List[Guard] = []
    guards[:] = [Guard(True, guards, doors), Guard(False, guards, doors)]
    return guards

Defining the main point

And finally, how shall we actually have this execute? First, let’s put this in a function, so that it can be called by things other than running the script directly; for example, if we want to use entry_points to expose this as a script. Then, let's put it in a "__main__" block, and not just execute it at module scope.

Secondly, rather than inspecting the output of each one at a time, let’s use the all function to express that the interesting thing is that all of the guards will answer the question in the affirmative:

1
2
3
4
5
6
def main() -> None:
    print(all(each.ask(question) for each in make_guard_post()))


if __name__ == "__main__":
    main()

Appendix: the full code

To sum up, here’s the full version:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
from __future__ import annotations
from dataclasses import dataclass
from typing import List, Protocol, Sequence
from enum import Enum


class Door(Enum):
    certain_death = "certain death"
    castle = "castle"


class Question(Protocol):
    def __call__(
        self, guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        ...


@dataclass
class Guard:
    _truth_teller: bool
    _guards: Sequence[Guard]
    _doors: Sequence[Door]

    def ask(self, question: Question) -> bool:
        answer = question(self, self._guards, self._doors)
        if not self._truth_teller:
            answer = not answer
        return answer


def question(guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]) -> bool:
    [other_guard] = (candidate for candidate in guards if candidate != guard)

    def other_question(
        guard: Guard, guards: Sequence[Guard], doors: Sequence[Door]
    ) -> bool:
        return doors[0] == Door.castle

    return other_guard.ask(other_question)


def make_guard_post() -> Sequence[Guard]:
    doors = list(Door)
    guards: List[Guard] = []
    guards[:] = [Guard(True, guards, doors), Guard(False, guards, doors)]
    return guards


def main() -> None:
    print(all(each.ask(question) for each in make_guard_post()))


if __name__ == "__main__":
    main()

Acknowledgments

I’d like to thank Moshe Zadka for the post that inspired this, as well as Nelson Elhage, Jonathan Lange, Ben Bangert and Alex Gaynor for giving feedback on drafts of this post.


  1. I will hopefully have more to say about typing.Protocol in another post soon; it’s the real hero of the Mypy saga, but more on that later... 

Modularity for Maintenance

Never send a human to do a machine’s job.

Never send a human to do a machine’s job.

One of the best things about maintaining open source in the modern era is that there are so many wonderful, free tools to let machines take care of the busy-work associated with collaboration, code-hosting, continuous integration, code quality maintenance, and so on.

There are lots of great resources that explain how to automate various things that make maintenance easier.

Here are some things you can configure your Python project to do:

  1. Continuous integration, using any one of a number of providers:
    1. GitHub Actions
    2. CircleCI
    3. Azure Pipelines
    4. Appveyor
    5. GitLab CI&CD
    6. Travis CI
  2. Separate multiple test jobs with tox
  3. Lint your code with flake8
  4. Type-Check your code with Mypy
  5. Auto-update your dependencies, with one of:
    1. pyup.io
    2. requires.io, or
    3. Dependabot
  6. automatically find common security issues with Bandit
  7. check the status of your code coverage, with:
    1. Coveralls, or
    2. Codecov
  8. Auto-format your code with:
    1. Black for style
    2. autopep8 to fix common errors
    3. isort to keep your imports tidy
  9. Help your developers remember to do all of those steps with pre-commit
  10. Automatically release your code to PyPI via your CI provider
    1. including automatically building any C code for multiple platforms as a wheel so your users won’t have to
    2. and checking those build artifacts:
      1. to make sure they include all the files they should, with check-manifest
      2. and also that the binary artifacts have the correct dependencies for Linux
      3. and also for macOS
  11. Organize your release notes and versioning with towncrier

All of these tools are wonderful.

But... let’s say you1 maintain a few dozen Python projects. Being a good maintainer, you’ve started splitting up your big monolithic packages into smaller ones, so your utility modules can be commonly shared as widely as possible rather than re-implemented once for each big frameworks. This is great!

However, every one of those numbered list items above is now a task per project that you have to repeat from scratch. So imagine a matrix with all of those down one side and dozens of projects across the top - the full Cartesian product of these little administrative tasks is a tedious and exhausting pile of work.

If you’re lucky enough to start every project close to perfect already, you can skip some of this work, but that partially just front-loads the tedium; plus, projects tend to start quite simple, then gradually escalate in complexity, so it’s helpful to be able to apply these incremental improvements one at a time, as your project gets bigger.

I really wish there were a tool that could take each of these steps and turn them into a quick command-line operation; like, I type pyautomate pypi-upload and the tool notices which CI provider I use, whether I use tox or not, and adds the appropriate configuration entries to both my CI and tox configuration to allow me to do that, possibly prompting me for a secret. Same for pyautomate code-coverage or what have you. All of these automations are fairly straightforward; almost all of the files you need to edit are easily parse-able either as yaml, toml, or ConfigParser2 files.

A few years ago, I asked for this to be added to CookieCutter, but I think the task is just too big and complicated to reasonably expect the existing maintainers to ever get around to it.

If you have a bunch of spare time, and really wanted to turbo-charge the Python open source community, eliminating tons of drag on already-over-committed maintainers, such a tool would be amazing.


  1. and by you, obviously, I mean “I” 

  2. “INI-like files”, I guess? what is this format even called? 

Mac Python Distribution Post Updated for Catalina and Notarization

Notarize your Python apps for macOS Catalina.

I previously wrote a post about shipping a PyGame app to users on macOS. It’s now substantially updated for the new Notarization requirements in Catalina. I hope it’s useful to somebody!

The Numbers, They Lie

when 2 + 2 = 4.00000000000000000001

It’s October, and we’re all getting ready for Halloween, so allow me to me tell you a horror story, in Python:

1
2
>>> 0.1 + 0.2 - 0.3
5.551115123125783e-17

some scary branches

Some of you might already be familiar with this chilling tale, but for those who might not have experienced it directly, let me briefly recap.

In Python, the default representation of a number with a decimal point in it is something called an “IEEE 754 double precision binary floating-point number”. This standard achieves a generally useful trade-off between performance, correctness, and is widely implemented in hardware, making it a popular choice for numbers in many programming language.

However, as our spooky story above indicates, it’s not perfect. 0.1 + 0.2 is very slightly less than 0.3 in this representation, because it is a floating-point representation in base 2.

If you’ve worked professionally with software that manipulates money1, you typically learn this lesson early; it’s quite easy to smash head-first into the problem with binary floating-point the first time you have an item that costs 30 cents and for some reason three dimes doesn’t suffice to cover it.

There are a few different approaches to the problem; one is using integers for everything, and denominating your transactions in cents rather than dollars. A strategy which requires less weird unit-conversion2, is to use the built-in decimal module, which provides a floating-point base 10 representation, rather than the standard base-2, which doesn’t have any of these weird glitches surrounding numbers like 0.1.

This is often where a working programmer’s numerical education ends; don’t use floats, they’re bad, use decimals, they’re good. Indeed, this advice will work well up to a pretty high degree of application complexity. But the story doesn’t end there. Once division gets involved, things can still get weird really fast:

1
2
3
>>> from decimal import Decimal
>>> (Decimal("1") / 7) * 14
Decimal('2.000000000000000000000000001')

The problem is the same: before, we were working with 1/10, a value that doesn’t have a finite (non-repeating) representation in base 2; now we’re working with 1/7, which has the same problem in base 10.

Any time you have a representation of a number which uses digits and a decimal point, no matter the base, you’re going to run in to some rational values which do not have an exact representation with a finite number of digits; thus, you’ll drop some digits off the (necessarily finite) end, and end up with a slightly inaccurate representation.

But Python does have a way to maintain symbolic accuracy for arbitrary rational numbers -- the fractions module!

1
2
3
4
5
>>> from fractions import Fraction
>>> Fraction(1)/3 + Fraction(2)/3 == 1
True
>>> (Fraction(1)/7) * 14 == 2
True

You can multiply and divide and add and subtract to your heart’s content, and still compare against zero and it’ll always work exactly, giving you the right answers.

So if Python has a “correct” representation, which doesn’t screw up our results under a basic arithmetic operation such as division, why isn’t it the default? We don’t care all that much about performance, right? Python certainly trades off correctness and safety in plenty of other areas.

First of all, while Python’s willing to trade off some storage or CPU efficiency for correctness, precise fractions rapidly consume huge amounts of storage even under very basic algorithms, like consuming gigabytes while just trying to maintain a simple running average over a stream of incoming numbers.

But even more importantly, you’ll notice that I said we could maintain symbolic accuracy for arbitrary rational numbers; but, as it turns out, a whole lot of interesting math you might want to do with a computer involves numbers which are irrational: like π. If you want to use a computer to do it, pretty much all trigonometry3 involves a slightly inaccurate approximation unless you have a literally infinite amount of storage.

As Morpheus put it, “welcome to the desert of the ”.


  1. or any proxy for it, like video-game virtual currency 

  2. and less time saying weird words like “nanodollars” to your co-workers 

  3. or, for that matter, geometry, or anything involving a square root 

A Few Bad Apples

incessantly advertise the bunch.

I’m a little annoyed at my Apple devices right now.

Time to complain.

“Trust us!” says Apple.

“We’re not like the big, bad Google! We don’t just want to advertise to you all the time! We’re not like Amazon, just trying to sell you stuff! We care about your experience. Magical. Revolutionary. Courageous!”

But I can’t hear them over the sound of my freshly-updated Apple TV — the appliance which exists solely to play Daniel Tiger for our toddler — playing the John Wick 3 trailer at full volume automatically as soon as it turns on.

For the aforementioned toddler.

I should mention that it is playing this trailer while specifically logged in to a profile that knows their birth date1 and also their play history2.


I’m aware of the preferences which control autoplay on the home screen; it’s disabled now. I’m aware that I can put an app other than “TV” in the default spot, so that I can see ads for other stuff, instead of the stuff “TV” shows me ads for.

But the whole point of all this video-on-demand junk was supposed to be that I can watch what I want, when I want — and buying stuff on the iTunes store included the implicit promise of no advertisements.

At least Google lets me search the web without any full-screen magazine-style ads popping up.

Launch the app store to check for new versions?

apple arcade ad

I can’t install my software updates without accidentally seeing HUGE ads for new apps.

Launch iTunes to play my own music?

apple music ad

I can’t play my own, purchased music without accidentally seeing ads for other music — and also Apple’s increasingly thirsty, desperate plea for me to remember that they have a streaming service now. I don’t want it! I know where Spotify is if I wanted such a thing, the whole reason I’m launching iTunes is that I want to buy and own the music!

On my iPhone, I can’t even launch the Settings app to turn off my WiFi without seeing an ad for AppleCare+, right there at the top of the UI, above everything but my iCloud account. I already have AppleCare+; I bought it with the phone! Worse, at some point the ad glitched itself out, and now it’s blank, and when I tap the blank spot where the ad used to be, it just shows me this:

undefined is not an insurance plan

I just want to use my device, I don’t need ad detritus littering every blank pixel of screen real estate.

Knock it off, Apple.


  1. less than 3 years ago 

  2. Daniel Tiger, Doctor McStuffins, Word World; none of which have super significant audience overlap with the John Wick franchise