Telemetry Is Not Your Enemy

Not all data collection is the same, and not all of it is bad.

Part 1: A Tale of Two Metaphors

In software development, “telemetry” is data collected from the users of the software, almost always delivered to the authors of the software via the Internet.
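
For a concrete sense of what that delivery can look like, here is a minimal sketch of a telemetry ping; the endpoint and field names are entirely invented for illustration:

import json
from urllib.request import Request, urlopen

# A hypothetical telemetry ping: a small JSON payload delivered to the
# software's authors over HTTPS. Endpoint and fields are invented.
payload = json.dumps({
    "app_version": "2.1.0",
    "os": "linux",
    "feature_used": "export_to_pdf",
}).encode("utf-8")

req = Request(
    "https://telemetry.example.com/v1/events",
    data=payload,
    headers={"Content-Type": "application/json"},
)
urlopen(req)  # fire-and-forget delivery to the vendor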

In recent years, there has been a great deal of angry public discourse about telemetry. In particular, there is a lot of concern that every software vendor and network service operator collecting any data at all is spying on its users, surveilling every aspect of our lives. The media narrative has been that any tech company collecting data for any purpose is acting creepy as hell.

I am quite sympathetic to this view. In general, some concern about privacy is warranted whenever some new data-collection scheme is proposed. However, it seems to me that the default response is no longer “concern and skepticism”, but rather “panic and fury”. All telemetry is seen as snooping, and all snooping is seen as evil.

There’s a sense in which software telemetry is like surveillance. However, it is only like surveillance. Surveillance is a metaphor, not a description. It is far from a perfect metaphor.

In the discourse around user privacy, I feel like we have lost a lot of nuance about the specific details of telemetry when some people dismiss all telemetry as snooping, spying, or surveillance.

Here are some ways in which software telemetry is not like “snooping”:

  1. The data may be aggregated. The people consuming the results of telemetry are rarely looking at individual records, and individual records may not even exist in some cases. There are tools, like Prio, to do this aggregation in as privacy-sensitive a way as possible (see the sketch just after this list).
  2. The data is rarely looked at by human beings. In the cases (such as ad-targeting) where the data is highly individuated, both the input (your activity) and the output (your recommendations) are mainly consumed by you, in your experience of a product, by way of algorithms acting upon the data, not by an employee of the company you’re interacting with.1
  3. The data is highly specific. “Here’s a record with your account ID and the number of times you clicked the Add To Cart button without checking out” is not remotely the same class of information as “Here’s several hours of video and audio, attached to your full name, recorded without your knowledge or consent”. Emotional appeals calling any data “surveillance” tend to suggest that all collected data is the latter, where in reality most of it is much closer to the former.
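
To make point 1 concrete, here is a minimal sketch, with invented record shapes, of the difference between individuated records and the aggregated counts that a telemetry pipeline may actually ship:

from collections import Counter

# Individuated records: what the "surveillance" metaphor implies.
raw_events = [
    {"account_id": "u-123", "event": "add_to_cart_without_checkout"},
    {"account_id": "u-456", "event": "add_to_cart_without_checkout"},
    {"account_id": "u-123", "event": "checkout_completed"},
]

# Aggregated telemetry: only event counts survive; no per-user record
# exists in the output at all.
aggregate = Counter(event["event"] for event in raw_events)
print(aggregate)
# Counter({'add_to_cart_without_checkout': 2, 'checkout_completed': 1})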

There are other metaphors which can be used to understand software telemetry. For example, there is also a sense in which it is like voting.

I emphasize that voting is also a metaphor here, not a description. I will also freely admit that it is in many ways a worse metaphor for telemetry than “surveillance”. But it can illuminate other aspects of telemetry, the ones that the surveillance metaphor leaves out.

Data-collection is like voting because the data can represent your interests to a party that has some power over you. Your software vendor has the power to change your software, and you probably don’t: either you don’t have access to the source code at all, or, even if it’s open source, you almost certainly don’t have the resources to take over its maintenance.

For example, let’s consider this paragraph from some Microsoft documentation about telemetry:

We also use the insights to drive improvements and intelligence into some of our management and monitoring solutions. This improvement helps customers diagnose quality issues and save money by making fewer support calls to Microsoft.

“Examples of how Microsoft uses the telemetry data” from the Azure SDK documentation

What Microsoft is saying here is that they’re collecting the data for your own benefit. They’re not attempting to justify it the way that defenders of law-enforcement wiretap schemes might. Those who want literal mass surveillance tend to justify it by conceding that it might hurt individuals a little bit to be spied upon, but if we spy on everyone surely we can find the bad people and stop them from doing bad things. That’s best for society.

But Microsoft isn’t saying that.2 What Microsoft is saying here is that if you’re experiencing a problem, they want to know about it so they can fix it and make the experience better for you.

I think that is at least partially true.

Part 2: I Qualify My Claims Extensively So You Jackals Don’t Lose Your Damn Minds On The Orange Website

I was inspired to write this post by the recent discussions in the Go community about how to collect telemetry, which provoked a lot of vitriol from people viscerally reacting to any telemetry as invasive surveillance. I will therefore heavily qualify what I’ve said above, to try to address some of that emotional reaction in advance.

I am not suggesting that we must take Microsoft (or indeed, the Golang team) fully at their word here. Trillion-dollar corporations will always deserve skepticism. I will concede in advance that it’s possible the data is put to other uses as well, possibly to maximize profits at the expense of users. But it seems reasonable to assume that the stated purpose is at least partially true; it’s not like Microsoft wants Azure to be bad.

I can speak from personal experience. I’ve been in many professional conversations around telemetry, and in those conversations, my and my teams’ motivations were overwhelmingly focused on straightforwardly making the user experience good. We wanted it to be good so that users would like our products and buy more of them.

It’s hard enough to do that without nefarious ulterior motives. Most of the people who develop your software just don’t have the resources it takes to be evil about this.

Part 3: They Can’t Help You If They Can’t See You

With those qualifications out of the way, I will proceed with these axioms:

  1. The developers of software will make changes to it.
  2. These changes will benefit some users.
  3. Which changes the developers select will be derived, at least in part, from the information that they have.
  4. At least part of the information that the developers have is derived from the telemetry they collect.

If we can agree that those axioms are reasonable, then let us imagine two user populations:

  • Population A is privacy-sensitive and therefore sees telemetry as bad, and opts out of everything they possibly can.
  • Population B doesn’t care about privacy, and therefore ignores any telemetry and blithely clicks through any opt-in.

When the developer goes to make changes, they will have more information about Population B. Even if they’re vaguely aware that some users are opting out (or refusing to opt in), the developer will know far less about Population A. This means that any changes the developer makes will not serve the needs of their privacy-conscious users, which means fewer features that respect privacy as time goes on.

Part 4: Free as in Fact-Free Guesses

In the world of open source software, this problem is even worse. We often have fewer resources with which to collect and analyze telemetry in the first place, and when we do attempt to collect it, a vocal minority among those users are openly hostile, with feedback that borders on harassment. So we often have no telemetry at all, and are making changes based on guesses.

Meanwhile, in proprietary software, the user population is far larger and less engaged. Developers are not exposed directly to users and therefore cannot be harassed or intimidated into dropping their telemetry. This gives proprietary developers a huge advantage: they can know what most of their users want, make changes to accommodate it, and thereby build a product better than one based on the uninformed guesses of their open source competition.

Proprietary software generally starts out with a panoply of advantages already — most of which boil down to “money” — but our collective knee-jerk reaction to any attempt to collect telemetry is a massive and continuing own-goal on the part of the FLOSS community. There’s no inherent reason why free software’s design cannot be based on good data, but our community’s history and self-selection biases make us less willing to consider it.

That does not mean we need to accept invasive data collection that is more like surveillance. We do not need to allow for stockpiled personally-identifiable information about individual users that lives forever. The abuses of indiscriminate tech data collection are real, and I am not suggesting that we forget about them.

The process for collecting telemetry must be open and transparent, and the data collected must be continuously vetted for safety. Clear data-retention policies should always be in place to avoid future unanticipated misuses of data that is thought to be safe today but may be de-anonymized or otherwise abused in the future.

I want the collaborative feedback process of open source development to result in this kind of telemetry: thoughtful, respectful of user privacy, and designed with the principle of least privilege in mind. If we have this kind of process, then we could hold it up as an example for proprietary developers to follow, and possibly improve the industry at large.

But in order to be able to produce that example, we must produce criticism of telemetry efforts that is specific, grounded in actual risks and harms to users, rather than a series of emotional appeals to slippery-slope arguments that do not correspond to the actual data being collected. We must arrive at a consensus that there are benefits to users in allowing software engineers to have enough information to do their jobs, and telemetry is not uniformly bad. We cannot allow a few users who are complaining to stop these efforts for everyone.

After all, when those proprietary developers look at the hard data that they have about what their users want and need, it’s clear that those who are complaining don’t even exist.


  1. Please note that I’m not saying that this automatically makes such collection ethical. Attempting to modify user behavior or conduct un-reviewed psychological experiments on your customers is also wrong. But it’s wrong in a way that is somewhat different than simply spying on them. 

  2. I am not suggesting that data collected for the purposes of improving the users’ experience could not be used against their interest, whether by law enforcement or by cybercriminals or by Microsoft itself. Only that that’s not what the goal is here. 

What Would You Say You Do Here?

A brief description of the various projects that I am hoping to do independently, with your support. In other words, this is an ad, for me.

What have I been up to?

Late last year, I launched a Patreon. Although not quite a “soft” launch — I did toot about it, after all — I didn’t promote it very much.

I started this way because I realized that if I didn’t just put something up I’d be dithering forever. I’d previously been writing a sprawling monster of an announcement post that went into way too much detail, and kept expanding to encompass more and more ideas until I came to understand that salvaging it was going to be an editing process just as brutal and interminable as the writing itself.

However, that post also included a section where I just wrote about what I was actually doing.

So, for lots of reasons1, there is a diverse array of loosely related (or unrelated) projects below which may not get finished any time soon. Or, indeed, may go unfinished entirely. Some are “done enough” now, and just won’t receive much in the way of future polish.

That is an intentional choice.

The rationale, as briefly as I can manage, is: I want to lean into my strength2 of creative, divergent thinking, and see how these ideas pan out without committing to them particularly intensely. My habitual impulse, for many years, has been to lean extremely hard on strategies that compensate for my weaknesses in organization, planning, and continued focus, and to attempt to commit to finishing every project to prove that I’ll never flake on anything.

While the reward tiers for the Patreon remain deliberately ambiguous3, I think it would be fair to say that patrons will have some level of influence in directing my focus by providing feedback on these projects, and requesting that I work more on some and less on others.

So, with no further ado: what have I been working on, and what work would you be supporting if you signed up? For each project, I’ll be answering 3 questions:

  1. What is it?
  2. What have I been doing with it recently?
  3. What are my plans for it?

This. i.e. blog.glyph.im

What is it?

For starters, I write stuff here. I guess you’re reading this post for some reason, so you might like the stuff I write? I feel like this doesn’t require much explanation.

What have I done with it recently?

You might appreciate the explicitly patron-requested Potato Programming post, a screed about dataclass, or a deep dive on the difficulties of codesigning and notarization on macOS along with an announcement of a tool to remediate them.

What are my plans for it?

You can probably expect more of the same; just all the latest thoughts & ideas from Glyph.

Twisted

What is it?

If you know of me you probably know of me as “the Twisted guy” and yeah, I am still that. If, somehow, you’ve ended up here and you don’t know what it is, wow, that’s cool, thanks for coming, super interested to know what you do know me for.

Twisted is an event-driven networking engine written in Python, the precursor and inspiration for the asyncio module, and a suite of event-driven programming abstractions, network protocol implementations, and general utility code.

What have I done with it recently?

I’ve gotten a few things merged, including type annotations for getPrimes and making the bundled CLI OpenSSH server replacement work at all with public key authentication again, as well as some test cleanups that reduce the overall surface area of old-style Deferred-returning tests that can be flaky and slow.

I’ve also landed a posix_spawnp-based spawnProcess implementation which speeds up process spawning significantly; this can be as much as 3x faster if you do a lot of spawning of short-running processes.

I have a bunch of PRs in flight, too, including better annotations for FilePath, Deferred, and IReactorProcess, as well as a fix for the aforementioned posix_spawnp implementation.

What are my plans for it?

A lot of the projects below use Twisted in some way, and I continue to maintain it for my own uses. My particular focus is on quality-of-life improvements; issues that someone starting out with a Twisted project will bump into and find confusing or difficult. I want it to be really easy to write applications with Twisted, and I want my own experiences with it to guide those improvements.

I also do code reviews of other folks’ contributions; we do still have over 100 open PRs right now.

DateType

What is it?

DateType is a workaround for a very specific bug in the way that the datetime standard library module deals with type composition: to wit, that datetime is a subclass of date but is not Liskov-substitutable for it. There are even #type:ignore comments in the standard library type stubs to work around this problem, because if you did this in your own code, it simply wouldn’t type-check.
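
To illustrate the bug, here is a minimal sketch (the is_past function is hypothetical) of code that type-checks cleanly but fails at runtime:

from datetime import date, datetime

def is_past(deadline: date) -> bool:
    # datetime is a subclass of date, so any type checker will accept a
    # datetime argument here; but comparing a datetime to a date raises
    # "TypeError: can't compare datetime.datetime to datetime.date".
    return deadline < date.today()

print(is_past(date(2020, 1, 1)))  # fine: True
print(is_past(datetime.now()))    # type-checks, then raises TypeError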

What have I done with it recently?

I updated it a few months ago to expose DateTime and Time directly (as opposed to AwareDateTime and NaiveDateTime), so that users could specialize their own functions that took either naive or aware times without ugly and slightly-incorrect unions.

What are my plans for it?

This library is mostly done for the time being, but if I had to polish it a bit I’d probably do two things:

  1. a readthedocs page for nice documentation
  2. write a PEP to get this integrated into the standard library

Although the compatibility problems are obviously very tricky and a PEP would probably be controversial, this is ultimately a bug in the stdlib, and should be fixed upstream there.

Automat

What is it?

It’s a library to make deterministic finite-state automata easier to create and work with.

What have I done with it recently?

Back in the middle of last year, I opened a PR to create a new, completely different front-end API for state machine definition. Instead of something like this:

from automat import MethodicalMachine

class MachineExample:
    machine = MethodicalMachine()

    @machine.state()
    def a_state(self): ...

    @machine.state()
    def other_state(self): ...

    @machine.input()
    def flip(self): ...

    @machine.output()
    def _do_flip(self): return ...

    a_state.upon(flip, enter=other_state, outputs=[_do_flip], collector=list)
    other_state.upon(flip, enter=a_state, outputs=[_do_flip], collector=list)

this branch lets you instead do something like this:

from typing import Protocol

from automat import TypicalBuilder  # the new API from the in-progress branch

class MachineProtocol(Protocol):
    def flip(self) -> None: ...

class MachineCore: ...

def buildCore() -> MachineCore: ...
machine = TypicalBuilder(MachineProtocol, buildCore)

@machine.state()
class _OffState:
    @machine.handle(MachineProtocol.flip, enter=lambda: _OnState)
    def flip(self) -> None: ...

@machine.state()
class _OnState:
    @machine.handle(MachineProtocol.flip, enter=lambda: _OffState)
    def flip(self) -> None: ...

MachineImplementation = machine.buildClass()

In other words, it creates a class for every state, and type safety that much more cleanly expresses which methods can be called and by whom; no need to make everything private with tons of underscore-prefixed methods and attributes, since all the caller can see is “an implementation of MachineProtocol”; your state classes can otherwise just be normal classes, which do not require special logic to be instantiated if you want to use them directly.

Also, by making a class for every state, it’s a lot cleaner to express that certain methods require certain attributes, by simply making them available as attributes on that state and then requiring an argument of that state type; you don’t need to plot your way through the outputs generated in your state graph.
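
A hypothetical usage sketch, assuming the branch behaves as described and buildClass produces a class whose instances present only the MachineProtocol interface:

machine_instance = MachineImplementation()
machine_instance.flip()  # dispatched to _OffState.flip; transitions to _OnState
machine_instance.flip()  # dispatched to _OnState.flip; back to _OffState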

What are my plans for it?

I want to finish up dealing with some issues with that branch - particularly the ugly patterns for communicating portions of the state core to the caller and also the documentation; there are a lot of magic signatures which make sense in heavy usage but are a bit mysterious to understand while you’re getting started.

I’d also like the visualizer to work on it, which it doesn’t yet, because the visualizer cribs a bunch of state from MethodicalMachine when it should be working purely on core objects.

Secretly

What is it?

This is an attempt at a holistic, end-to-end secret management wrapper around Keyring. Whereas Keyring handles password storage, this handles the whole lifecycle of looking up the secret to see if it’s there, displaying UI to prompt the user (leveraging a pinentry program from GPG if available), and saving the secret once the user has supplied it.
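
For a sense of the lifecycle being wrapped, here is a rough sketch of that flow written directly against the Keyring API (the service name and prompt are invented, and a real Secretly flow would use GUI or pinentry prompting rather than getpass):

import getpass
import keyring

def obtain_secret(service: str, username: str) -> str:
    # Look up the secret to see if it's already stored...
    secret = keyring.get_password(service, username)
    if secret is None:
        # ...prompt the user if it isn't...
        secret = getpass.getpass(f"Secret for {service} ({username}): ")
        # ...and save it so the next lookup succeeds.
        keyring.set_password(service, username, secret)
    return secret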

What have I done with it recently?

It’s been a long time since I touched it.

What are my plans for it?

  • Documentation. It’s totally undocumented.
  • It could be written to be a bit more abstract. It dates from a time before asyncio, so its current Twisted requirement for Deferred could be made into a generic Awaitable one.
  • Better platform support for Linux & Windows when GPG’s pinentry is not available.
  • Support for multiple accounts so that when the user is prompted for the relevant credential, they can store it.
  • Integration with 1Password via some of their many potentially relevant APIs.

Fritter

What is it?

Fritter is a frame-rate independent timer tree.

In the course of developing Twisted, I learned a lot about time and timers. LoopingCall encodes some of this knowledge, but it’s very tightly coupled to the somewhat limited IReactorTime API.

Also, LoopingCall was originally designed with the needs of media playback (particularly network streaming audio playback) in mind, but I have used it more for background maintenance tasks and for animations. Both of these things have requirements that LoopingCall makes awkward but FRITTer is designed to meet:

  1. At higher loads, surprising interactions can occur with the underlying priority queue implementation, and different algorithms may make a significant difference to performance. Fritter has a pluggable implementation of a priority queue and is carefully minimally coupled to it.

  2. Driver selection is a first-class part of the API, with an included, public “Memory” driver for testing, rather than LoopingCall’s “testing is at least possible” .reactor attribute. This means that out of the box it supports both Twisted and asyncio, and can easily have other things added.

  3. The API is actually generic on what constitutes time itself, which means that you can use it for both short-term (i.e.: monotonic clock values as float-seconds) and long-term (civil times as timezone-aware datetime objects) recurring tasks. Recurrence rules can also be arbitrary functions.

  4. There is a recursive driver (this is the “tree” part) which both allows for:

    a. groups of timers which can be suspended and resumed together, and

    b. scaling of time, so that you can e.g. speed up or slow down the ticks for AIs, groups of animations, and so on, also in groups.

  5. The API is also generic on what constitutes work. This means that, for example, in a certain timer you can say “all work units scheduled on this scheduler, in addition to being callable, must also have an asJSON method”. And in fact that’s exactly what the longterm module in Fritter does.

I can neither confirm nor deny that this project was factored out of a game engine for a secret game project which does not appear on this list.

What have I done with it recently?

Besides realizing, in the course of writing this blog post, that its CI was failing its code quality static checks (oops), the last big change was the preliminary support for recursive timers and serialization.

What are my plans for it?

  • These haven’t been tested in anger yet and I want to actually use them in a larger project to make sure that they don’t have any necessary missing pieces.

  • Documentation.

Encrust

What is it?

I have written about Encrust quite recently so if you want to know about it, you should probably read that post. In brief, it is a code-shipping tool for py2app. It takes care of architecture-independence, code-signing, and notarization.

What have I done with it recently?

Wrote it. It’s brand new as of this month.

What are my plans for it?

I really want this project to go away as a tool with an independent existence. Either I want its lessons to be fully absorbed into Briefcase or perhaps py2app itself, or for it to become a library that those tools call into to do that work.

Various Small Mac Utilities

What is it?

  • QuickMacApp is a very small library for creating status-item “menu bar apps” in Python which don’t have much of a UI but want to run some Python code in the background and occasionally pop up a notification or ask the user a question or something. The idea is that if you have a utility that needs a minimal UI to just ask the user one or two things, you should be able to give it a GUI immediately, without thinking about it too much.
  • QuickMacHotkey is a very minimal API to register hotkeys on macOS. This example is what comes up if you search the web for such a thing, but it hasn’t worked on a current Python for about 11 years. This isn’t the “right” way to do such a thing, since it provides no UI to set the shortcut; you’d have to hard-code it. But MASShortcut is now archived and I haven’t had the opportunity to investigate HotKey, so for the time being, it’s a handy thing, and totally adequate for the sort of quick-and-dirty applications you might make with QuickMacApp.
  • VEnvDotApp is a way of giving a virtualenv its own Info.plist and bundle ID, so that command-line python tools that just need to pop up a little mac GUI, like an alert or a notification, can do so with cross-platform tools without looking like it’s an app called “Python”, or in some cases breaking entirely.
  • MOPUp is a command-line updater for upstream Python.org macOS Python. For distributing third-party apps, Python.org’s version is really the one you want to use (it’s universal2, and it’s generally built with compiler options that make it a distributable thing itself) but updating it by downloading a .pkg file from a web browser is kind of annoying.

What have I done with it recently?

I’ve been releasing all these tools as they emerge and are factored out of other work, and they’re all fairly recent.

What are my plans for it?

I will continue to factor out any general-purpose tools from my platform-specific Python explorations — hopefully more Linux and Windows too, once I’ve got writing code for my own computer down, but most of the tools above are kind of “done” on their own, at the moment.

The two things that come to mind though are that QuickMacApp should have a way of owning the menubar sometimes (if you don’t have something like Bartender, menu-bar-status-item-only apps can look like they don’t do anything when you launch them), and that MOPUp should probably be upstreamed to python.org.

Pomodouroboros

What is it?

Pomodouroboros is a pomodoro timer with a highly opinionated take. It’s based on my own experience of ADHD time blindness, and is more like a therapeutic intervention for that specific condition than a typical “productivity” timer app.

In short, it has two important features that I have found lacking in other tools:

  1. A gigantic, absolutely impossible to ignore visual timer that presents a HUD overlay over your entire desktop. It remains low-opacity and static most of the time but pulses every 30 seconds to remind you that time is passing.
  2. Rather than requiring you to remember to set a timer before anything happens, it has an idea of “work hours” when you want to be time-sensitive and presents constant prompting to get started.

What have I done with it recently?

I’ve been working on it fairly consistently lately. The big things I’ve been doing have been:

  1. factoring things out of the Pomodouroboros-specific code and into QuickMacApp and Encrust.
  2. porting the UI to the redesigned core of the application, which has been implemented and tested in platform-agnostic Python but does not have any UI yet.
  3. fully productionizing the build process and ensuring that Encrust is producing binary app bundles that people can use.

What are my plans for it?

In brief, “finish the app”. I want this to have its own website and find a life beyond the Python community, with people who just want a timer app and don’t care how it’s written. The top priority is to replace the current data model, which is to say the parts of the UI that set and evaluate timers and edit the list of upcoming timers (the timer countdown HUD UI itself is fine).

I also want to port it to other platforms, particularly desktop Linux, where I know there are many users interested in such a thing. I also want to do a CLI version for folks who live on the command line.

Finally: Pomodouroboros serves as a test-bed for a larger goal, which is that I want to make it easier for Python programmers, particularly beginners who are just getting into coding at all, to write code that not only interacts with their own computer, but that they can share with other users in a real way. As you can see with Encrust and other projects above, as much as I can I want my bumpy ride to production code to serve as trailblazing so that future travelers of this path find it as easy as possible.

And Here Is Where The CTA Goes

If this stuff sounds compelling, you can obviously sign up, that would be great. But also, if you’re just curious, go ahead and give some of these projects some stars on GitHub or just share this post. I’d also love to hear from you about any of this!

If a lot of people find this compelling, then pursuing these ideas will become a full-time job, but I’m pretty far from that threshold right now. In the meanwhile, I will also be doing a bit of consulting work.

I believe much of my upcoming month will be spoken for with contracting, although quite a bit of that work will also be open source maintenance, for which I am very grateful to my generous clients. Please do get in touch if you have something more specific you’d like me to work on, and you’d like to become one of those clients as well.


  1. Reasons which will have to remain mysterious until I can edit about 10,000 words of abstract, discursive philosophical rambling into something vaguely readable. 

  2. A strength which is common to many, indeed possibly most, people with ADHD. 

  3. While I want to give myself some leeway to try out ideas without necessarily finishing them, I do not want to start making commitments that I can’t keep. Particularly commitments that are tied to money! 

Data Classification

Does Python still have a need for class without @dataclass?

Is there a place for non-@dataclass classes in Python any more?

I have previously — and somewhat famously — written favorably about @dataclass’s venerable progenitor, attrs, and how you should use it for pretty much everything.

At the time, attrs was an additional dependency, a piece of technology that you could bolt on to your Python stack to make your particular code better. While I advocated for it strongly, there were all the usual implicit reasons against using a new thing. It was an additional dependency, it might not interoperate with other convenience mechanisms for type declarations that you were already using (e.g. NamedTuple), it might look weird to other Python programmers familiar with existing tools, and so on. I don’t think that any of these were good counterpoints, but there was nevertheless a robust discussion to be had in addressing them all.

But for many years now, dataclasses have been — and currently are — built in to the language. They are increasingly integrated into the toolchain at a deep level that is difficult for application code — or even other specialized tools — to replicate. Everybody knows what they are. Few or none of those reasons apply any longer.

For example, classes defined with @dataclass are now optimized as a C structure might be when you compile them with mypyc, a trick that is extremely useful in some circumstances, which even attrs itself now has trouble keeping up with.

This all raises the question for me: beyond backwards compatibility, is there any point to having non-@dataclass classes any more? Is there any remaining justification for writing them in new code?

Consider my original example, translated from attrs to dataclasses. First, the non-dataclass version:

class Point3D:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

And now the dataclass one:

from dataclasses import dataclass

@dataclass
class Point3D:
    x: int
    y: int
    z: int

Many of my original points still stand. It’s still less repetitive. In fewer characters, we’ve expressed considerably more information, and we get more functionality (repr, sorting, hashing, etc). There doesn’t seem to be much of a downside besides the strictness of the types, and if typing.Any were a builtin, x: any would be fine for those who don’t want to unduly constrain their code.
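
As a quick sketch of that extra functionality: the default @dataclass gives you a useful repr and field-by-field equality for free, and opting in with order=True and frozen=True adds sorting and hashing:

from dataclasses import dataclass

@dataclass(order=True, frozen=True)
class Point3D:
    x: int
    y: int
    z: int

p = Point3D(1, 2, 3)
print(p)                      # Point3D(x=1, y=2, z=3)
print(p == Point3D(1, 2, 3))  # True: field-by-field equality
print(p < Point3D(1, 2, 4))   # True: lexicographic field comparison
print(hash(p))                # hashable, because frozen=True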

The one real downside of the latter over the former right now is the need for an import. Which, at this point, just seems… confusing? Wouldn’t it be nicer to be able to just write this:

class Point3D:
    x: int
    y: int
    z: int

and not need to faff around with decorator semantics and fudging the difference between Mypy (or Pyright or Pyre) type-check-time and Mypyc or Cython compile time? Or even better, to not need to explain the complexity of all these weird little distinctions to new learners of Python, and to not have to cover import before class?

These tools all already treat the @dataclass decorator as a totally special language construct, not really like a decorator at all, so to really explain it you have to explain a special case and then a special case of a special case. The extension hook for this special case of the special case notwithstanding.

If we didn’t want any new syntax, we would need a from __future__ import dataclassification or some such for a while, but this doesn’t seem like an impossible bar to clear.


There are still some folks who don’t like type annotations at all, and there’s still the possibility of awkward implicit changes in meaning when transplanting code from a place with dataclassification enabled to one without, so perhaps an entirely new unambiguous syntax could be provided. One that more closely mirrors the meaning of parentheses in def, moving inheritance (a feature which, whether you like it or not, is clearly far less central to class definitions than ‘what fields do I have’) off to its own part of the syntax:

data Point3D(x: int, y: int, z: int) from Vector:
    def method(self):
        ...

which, for the “I don’t like types” contingent, could reduce to this in the minimal case:

data Point3D(x, y, z):
    pass

Just thinking pedagogically, I find it super compelling to imagine moving from teaching def foo(x, y, z):... to data Foo(x, y, z):... as opposed to @dataclass class Foo: x: int....

I don’t have any desire for semantic changes to accompany this, just to make it possible for newcomers to ignore the circuitous historical route of the @dataclass syntax and get straight into defining their own types with legible reprs from the very beginning of their Python journey.

(And make it possible for me to skip a couple of lines of boilerplate in short examples, as a bonus.)


I’m curious to know what y’all think, though. Shoot me an email or a toot and let me know.

In particular:

  1. Do you think there’s some reason I’m missing why Python’s current method for defining classes via a bunch of dunder methods is still better than dataclasses, or should stick around into the future for reasons beyond “compatibility”?
  2. Do you think “compatibility” is sufficient reason to keep the syntax the way it is forever, and I’m underestimating the cost of adding a keyword like this?
  3. If you do think that a change should be made, would you prefer:
    1. changing the meaning of class itself via a __future__ import,
    2. a new data keyword like the one I’ve proposed,
    3. a new keyword that functions exactly like the one I have proposed, but you really want to bikeshed the word data a bunch,
    4. something more incremental like just putting dataclass and field in builtins,
    5. or an option I haven’t even contemplated here?

If I find I’m not alone in this perhaps I will wander over to the Python discussion boards to have a more substantive conversation...


Thank you to my patrons who are helping me while I try to turn… whatever this is… along with open source maintenance and application development, into a real job. Do you want to see me pursue ideas like this one further? If so, you can support my work as a sponsor!

A Very Silly Program

This program will not work on your computer.

One of the persistently lesser-known symptoms of ADHD is hyperfocus. It is sometimes quasi-accurately described as a “superpower”1 2, which it can be. In the right conditions, hyperfocus is the ability to effortlessly maintain a singular locus of attention for far longer than a neurotypical person would be able to.

However, as a general rule, it would be more accurate to characterize hyperfocus not as an “ability to focus on X” but rather as “an inability to focus on anything other than X”. Sometimes hyperfocus comes on and it just digs its claws into you and won’t let go until you can achieve some kind of closure.

Recently, the X I could absolutely not stop focusing on — for days at a time — was this extremely annoying picture:

chroma subsampling carnage

Which led to me writing the silliest computer program I have written in quite some time.


You see, for some reason, macOS seems to prefer YUV422 chroma subsampling3 on external displays, even when the bitrate of the connection and selected refresh rate support RGB.4 Lots of people have been trying to address this for a literal decade5 6 7 8 9 10 11, and the problem has gotten worse with Apple Silicon, where the operating system no longer even supports the EDID-override functionality available on every other PC operating system that supports plugging in a monitor.

In brief, this means that every time I unplug my MacBook from its dock and plug it back in more than 5 minutes later, its color accuracy is destroyed and red or blue text on certain backgrounds looks like that mangled mess in the picture above. Worse, while the color distinction is definitely noticeable, it’s so subtle that it’s like my display is constantly gaslighting me. I can almost hear it taunting me:

Magenta? Yeah, magenta always looked like this. Maybe it’s the ambient lighting in this room. You don’t even have a monitor hood. Remember how you had to use one of those for print design validation? Why would you expect it to always look the same without one?

Still, I’m one of the luckier people with this problem, because I can seem to force RGB / 444 color format on my display just by leaving the display at 120Hz rather than 144, then toggling HDR on and then off again. At least I don’t need to plug in the display via multiple HDMI and DisplayPort cables and go into the OSD every time. However, there is no API to adjust, or even discover, the chroma format of your connected display’s link, and even the accessibility features that supposedly let you drive GUIs are broken in the system settings “Displays” panel12, so you have to do it by sending synthetic keystrokes and hoping you can tab-focus your way to the right place.

Anyway, this is a program which will be useless to anyone else as-is, but if someone else is struggling with the absolute inability to stop fiddling with the OS to try and get colors to look correct on a particular external display, by default, all the time, maybe you could do something to hack on this:

import os
from Quartz import CGDisplayRegisterReconfigurationCallback, kCGDisplaySetMainFlag, kCGDisplayBeginConfigurationFlag
from ColorSync import CGDisplayCreateUUIDFromDisplayID
from CoreFoundation import CFUUIDCreateString
from AppKit import NSApplicationMain, NSApplicationActivationPolicyAccessory, NSApplication

NSApplication.sharedApplication().setActivationPolicy_(NSApplicationActivationPolicyAccessory)

CGDirectDisplayID = int
CGDisplayChangeSummaryFlags = int

MY_EXTERNAL_ULTRAWIDE = '48CEABD9-3824-4674-9269-60D1696F0916'
MY_INTERNAL_DISPLAY = '37D8832A-2D66-02CA-B9F7-8F30A301B230'

def cb(display: CGDirectDisplayID, flags: CGDisplayChangeSummaryFlags, userInfo: object) -> None:
    if flags & kCGDisplayBeginConfigurationFlag:
        return
    if flags & kCGDisplaySetMainFlag:
        displayUuid = CGDisplayCreateUUIDFromDisplayID(display)
        uuidString = CFUUIDCreateString(None, displayUuid)
        print(uuidString, "became the main display")
        if uuidString == MY_EXTERNAL_ULTRAWIDE:
            print("toggling HDR to attempt to clean up subsampling")
            os.system("/Users/glyph/.local/bin/desubsample")
            print("HDR toggled.")

print("registered", CGDisplayRegisterReconfigurationCallback(cb, None))

NSApplicationMain([])

and the linked desubsample is this atrocity, which I substantially cribbed from this helpful example:

#!/usr/bin/osascript

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "AppKit"
use scripting additions

tell application "System Settings"
    quit
    delay 1
    activate
    current application's NSWorkspace's sharedWorkspace()'s openURL:(current application's NSURL's URLWithString:"x-apple.systempreferences:com.apple.Displays-Settings.extension")
    delay 0.5

    tell application "System Events"
    tell process "System Settings"
        key code 48
        key code 48
        key code 48
            delay 0.5
        key code 49
        delay 0.5
        -- activate hdr on left monitor

        set hdr to checkbox 1 of group 3 of scroll area 2 of ¬
                group 1 of group 2 of splitter group 1 of group 1 of ¬
                window "Displays"
        tell hdr
                click it
                delay 1.0
                if value is 1 then
                    click it
                end if
        end tell

    end tell
    end tell
    quit
end tell

This ridiculous little pair of programs does it automatically, so whenever I reconnect my MacBook to my desktop dock at home, it faffs around with clicking the HDR button for me every time. I am leaving it running in a background tmux session so — hopefully — I can finally stop thinking about this.

Potato Programming

One potato, two potato, three potato, four…

One potato, two potato, three potato, four
Five potato, six potato, seven potato, more.

Traditional Children’s Counting Rhyme

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

Knuth, Donald
“Structured Programming with go to statements”
Computing Surveys, Vol. 6, No. 4, December 1974
(p. 268)
(Emphasis mine)

Knuth’s admonition about premature optimization is such a cliché among software developers at this point that even the correction to include the full context of the quote is itself a cliché.

Still, it’s a cliché for a reason: the speed at which software can be written is in tension — if not necessarily in conflict — with the speed at which it executes. As Nelson Elhage has explained, software can be qualitatively worse when it is slow, but spending time optimizing an algorithm before getting any feedback from users or profiling the system as a whole can lead one down many blind alleys of wasted effort.

In that same essay, Nelson further elaborates that performant foundations simplify architecture1. He then follows up with several bits of architectural advice that are highly specific to parsing—compilers and type-checkers specifically—which, while good, are hard to generalize beyond “optimizing performance early can also be good”.

So, here I will endeavor to generalize that advice. How does one provide a performant architectural foundation without necessarily wasting a lot of time on early micro-optimization?

Enter The Potato

Many years before Nelson wrote his excellent aforementioned essay, my father coined a related term: “Potato Programming”.

In modern vernacular, a potato is very slow hardware, and “potato programming” is the software equivalent of the same.

The term comes from the rhyme that opened this essay, and is meant to evoke a slow, childlike counting of individual elements as an algorithm operates upon them. It is an unfortunately quite common software-architectural idiom whereby interfaces are provided in terms of scalar values. In other words, APIs that require you to use for loops or other forms of explicit, individual, non-parallelized iteration. But this is all very abstract; an example might help.

For a generic business-logic example, let’s consider the problem of monthly recurring billing. Every month, we pull in the list of all subscriptions to our service, and we bill them.

Since our hypothetical company has an account-management team that owns the UI which updates subscriptions and a billing backend team that writes code to interface with 3rd-party payment providers, we’ll create 2 backends, here represented by some Protocols.

Finally, we’ll have an orchestration layer that puts them together to actually run the billing. I will use async to indicate which things require a network round trip:

from typing import AsyncIterable, Protocol

# "money" is assumed to be defined elsewhere as whatever type your
# codebase uses for currency amounts.

class SubscriptionService(Protocol):
    async def all_subscriptions(self) -> AsyncIterable[Subscription]:
        ...

class Subscription(Protocol):
    account_id: str
    to_charge_per_month: money

class BillingService(Protocol):
    async def bill_amount(self, account_id: str, amount: money) -> None:
        ...

To many readers, this may look like an entirely reasonable interface specification; indeed, it looks like a lot of real, public-facing “REST” APIs. An equally apparently-reasonable implementation of our orchestration between them might look like this:

async def billing(s: SubscriptionService, b: BillingService) -> None:
    async for sub in s.all_subscriptions():
        await b.bill_amount(sub.account_id, sub.to_charge_per_month)

This is, however, just about the slowest implementation of this functionality that it’s possible to implement. So, this is the bad version. Let’s talk about the good version: no-tato programming, if you will. But first, some backstory.

Some Backstory

My father began his career as an APL programmer, and one of the key insights he took away from APL’s architecture is that, as he puts it:

Computers like to do things over and over again. They like to do things on arrays. They don’t want to do things on scalars. So, in fact, it’s not possible to write a program that only does things on a scalar. [...] You can’t have an ‘integer’ in APL, you can only have an ‘array of integers’. There’s no ‘loop’s, there’s no ‘map’s.

APL, like Python2, is typically executed via an interpreter. Which means, like Python, execution of basic operations like calling functions can be quite slow. However, unlike Python, its pervasive reliance upon arrays meant that almost all of its operations could be safely parallelized, and would only get more and more efficient as more and more parallel hardware was developed.

I said ‘unlike Python’ there, but in fact, my father first related this concept to me regarding a part of the Python ecosystem which follows APL’s design idiom: NumPy. NumPy takes a similar approach: it cannot itself do anything to speed up Python’s fundamental interpreted execution speed3, but it can move the intensive numerical operations that it implements into operations on arrays, rather than operations on individual objects, whether numbers or not.

The performance difference involved in these two styles is not small. Consider this case study which shows a 5828% improvement4 when taking an algorithm from idiomatic pure Python to NumPy.
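
A toy illustration of the difference in idiom (not the linked case study itself):

import numpy as np

# One-potato, two-potato: the interpreter executes one multiplication
# (and one loop iteration) per element.
squares_slow = [x * x for x in range(1_000_000)]

# Array-at-a-time: a single vectorized operation over the whole array,
# with the loop running in compiled code.
xs = np.arange(1_000_000)
squares_fast = xs * xs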

This idiom is also more or less how GPU programming works. GPUs cannot operate on individual values. You submit a program5 to the GPU, as well as a large array of data6, and the GPU executes the program on that data in parallel across hundreds of tiny cores. Submitting individual values for the GPU to work on would actually be much slower than just doing the work on the CPU directly, due to the bus latency involved to transfer the data back and forth.

Back from the Backstory

This is all interesting for a class of numerical software — and indeed it works very well there — but it may seem a bit abstract to web backend developers just trying to glue together some internal microservice APIs, or indeed to most app developers who aren’t working in those specialized fields. It’s not like Stripe is going to let you run their payment service on your GPU.

However, the lesson generalizes quite well: anywhere you see an API defined in terms of one-potato, two-potato iteration, ask yourself: “how can this be turned into an array”? Let’s go back to our example.

The simplest change that we can make, as a consumer of these potato-shaped APIs, is to submit them in parallel. So if we have to do the optimization in the orchestration layer, we might get something more like this:

from asyncio import Semaphore, AbstractEventLoop

async def one_bill(
    loop: AbstractEventLoop,
    sem: Semaphore,
    sub: Subscription,
    b: BillingService,
) -> None:
    await sem.acquire()
    async def work() -> None:
        try:
            await b.bill_amount(sub.account_id, sub.to_charge_per_month)
        finally:
            sem.release()
    loop.create_task(work())

async def billing(
    loop: AbstractEventLoop,
    s: SubscriptionService,
    b: BillingService,
    batch_size: int,
) -> None:
    sem = Semaphore(batch_size)
    async for sub in s.all_subscriptions():
        await one_bill(loop, sem, sub, b)

This is an improvement, but it’s a bit of a brute-force solution; a multipotato, if you will. We’ve moved the work to the billing service faster, but it still has to do just as much work. Maybe even more work, because now it’s potentially got a lot more lock-contention on its end. And we’re still waiting for the Subscription objects to dribble out of the SubscriptionService potentially one request/response at a time.

In other words, we have used network concurrency as a hack to simulate a performant design. But the back end that we have been given here is not actually optimizable; we do not have a performant foundation. As you can see, we have even had to change our local architecture a little bit here, to include a loop parameter and a batch_size which we had not previously contemplated.

A better-designed interface in the first place would look like this:

from dataclasses import dataclass
from typing import AsyncIterable, Protocol, Sequence

class SubscriptionService(Protocol):
    async def all_subscriptions(
        self, batch_size: int,
    ) -> AsyncIterable[Sequence[Subscription]]:
        ...

class Subscription(Protocol):
    account_id: str
    to_charge_per_month: money

@dataclass
class BillingRequest:
    account_id: str
    amount: money

class BillingService(Protocol):
    async def submit_bills(
        self,
        bills: Sequence[BillingRequest],
    ) -> None:
        ...

Superficially, the implementation here looks slightly more awkward than our naive first attempt:

async def billing(s: SubscriptionService, b: BillingService, batch_size: int) -> None:
    async for sub_batch in s.all_subscriptions(batch_size):
        await b.submit_bills(
            [
                BillingRequest(sub.account_id, sub.to_charge_per_month)
                for sub in sub_batch
            ]
        )

However, while the implementation with batching in the backend is approximately as performant as our parallel orchestration implementation, backend batching has a number of advantages over parallel orchestration.

First, backend batching has less internal complexity; no need to have a Semaphore in the orchestration layer, or to create tasks on an event loop. There’s less surface area here for bugs.

Second, and more importantly: backend batching permits for future optimizations within the backend services, which are much closer to the relevant data and can achieve more substantial gains than we can as a client without knowledge of their implementation.

There are many ways this might manifest, but consider that each of these services has its own database, and has to submit queries and execute transactions on that database.

In the subscription service, it’s faster to run a single SELECT statement that returns a bunch of results than to select a single result at a time. On the billing service’s end, it’s much faster to issue a single INSERT or UPDATE and then COMMIT for N records at once than to concurrently issue a ton of potentially related modifications in separate transactions.
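
As a sketch of what that might look like inside the billing service, assuming a standard-library sqlite3 database and a hypothetical bills table:

import sqlite3

def submit_bills_batch(conn: sqlite3.Connection, bills: list[tuple[str, int]]) -> None:
    # One INSERT for the whole batch, and one COMMIT, instead of a
    # separate transaction per bill.
    conn.executemany(
        "INSERT INTO bills (account_id, amount) VALUES (?, ?)",
        bills,
    )
    conn.commit()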

Potato No Mo

The initial implementation within each of these backends can be as naive and slow as necessary to achieve an MVP. You can do a SELECT … LIMIT 1 internally, if that’s easier, and performance is not important at first. There can be a mountain of potatoes hidden behind the veil of that batched list. In this way, you can avoid the potential trap of premature optimization. Maybe this is a terrible factoring of services for your application in the first place; best to have that prototype in place and functioning quickly so that you can throw it out faster!

However, by initially designing an interface based on lists of things rather than individual things, it’s much easier to hide irrelevant implementation details from the client, and to achieve meaningful improvements when optimizing.

Acknowledgements

This is the first post supported by my patrons, with a topic suggested by a member of my Patreon!


  1. It’s a really good essay, you should read it. 

  2. Yes, I know it’s actually bytecode compiled and then run on a custom interpreting VM, but for the purposes of comparing these performance characteristics “interpreted” is a more accurate approximation. Don’t @ me. 

  3. Although, thankfully, a lot of folks are now working very hard on that problem. 

  4. No, not a typo, that’s a 4-digit improvement. 

  5. Typically called a “shader” due to its origins in graphically shading polygons. 

  6. The data may represent vertices in a 3-D mesh, pixels in texture data, or, in the case of general-purpose GPU programming, “just a bunch of floating-point numbers”.