One of the perennial talking points in the Python packaging discourse is that
it’s unnecessarily difficult to create a simple, single-file binary that you
can hand to users.
This complaint is understandable. In most other programming languages, the
first thing you sit down to do is to invoke the compiler, get an executable,
and run it. Other, more recently created programming languages — particularly
Go and Rust — have really excellent toolchains for doing this which eliminate a
lot of classes of error during that build-and-run process. A single,
dependency-free, binary executable file that you can run is an eminently
comprehensible artifact, an obvious endpoint for “packaging” as an endeavor.
While Rust and Go are producing these artifacts as effectively their only
output, Python has such a dizzying array of tooling complexity that even
attempting to describe “the problem” takes many thousands of words, and even
then one may never get around to fully describing the complexity of the issues
involved. All the more understandable,
then, that we should urgently add this functionality to Python.
A programmer might see Python produce wheels and virtualenvs which can break
when anything in their environment changes, and see the complexity of that
situation. Then, they see Rust produce a statically-linked executable which
“just works”, and they see its toolchain simplicity. I agree that this
shouldn’t be so hard, and some of the architectural decisions that make this
difficult in Python are indeed unfortunate.
But then, I think, our hypothetical programmer gets confused. They think that
Rust is simple because it produces an executable, and they think Python’s
complexity comes from all the packaging standards and tools. But this is not
entirely true.
Python’s packaging complexity, and indeed some of those packaging standards,
arises from the fact that it is often used as a glue language. Packaging
pure Python is really not all that hard. And although the tools aren’t
included with the language and the DX isn’t as smooth, even packaging pure
Python into a single-file executable is pretty
trivial.
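As a rough sketch of what that looks like, here’s a minimal example using the
standard library’s zipapp module; the third-party tools alluded to above, like
shiv and pex, layer dependency handling and a smoother DX on top of the same
executable-zip idea. The src/ layout, the myapp package, and its cli.main entry
point are all made-up names for illustration:

```
# Minimal sketch: bundle a pure-Python package into one executable .pyz file.
# "src" is assumed to contain a myapp/ package with a cli.main() callable;
# substitute your own directory and entry point.
import zipapp

zipapp.create_archive(
    source="src",                        # directory containing the myapp/ package
    target="myapp.pyz",                  # the single executable output file
    interpreter="/usr/bin/env python3",  # shebang line written into the archive
    main="myapp.cli:main",               # callable to run when the file is executed
    compressed=True,
)
```

The result is one executable file you can hand to someone, with the obvious
caveats that it still needs a suitable python3 on their PATH and that it only
covers pure-Python code.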
But almost nobody wants to package a single pure-Python script. The whole
reason you’re writing in Python is because you want to rely on an enormous
ecosystem of libraries, and some low but critical percentage of those libraries
include things like their own statically-linked copies of OpenSSL or a few
megabytes of FORTRAN code with its own extremely finicky build system you don’t
want to have to interact with.
When you look over at other ecosystems, while Python still definitely has some
unique challenges, shipping Rust with a ton of FFI, or Go with a bunch of Cgo,
is substantially more complex than the default out-of-the-box single-file “it
just works” executable you get at the start.
Still, all things being equal, I think single-file executable builds would be
nice for Python to have as a building block. It’s certainly easier to produce
a package for a platform if your starting point is that you have a
known-good, functioning single-file executable and all you need to do is
embed it in some kind of environment-specific envelope. Certainly if what you
want is to copy a simple microservice executable into a container image, you
might really want to have this rather than setting up what is functionally a
full Python development environment in your Dockerfile. After team-wide
philosophical debates over what virtual environment manager to use, those
Golang Dockerfiles that seem to be nothing but the following 4 lines are
really appealing:
```
FROM golang
COPY *.go ./
RUN go build -o /app
ENTRYPOINT ["/app"]
```
All things are rarely equal, however.
The issue that we, as a community, ought to be trying to address with build
tools is to get the software into users’ hands, not to produce a specific
file format. In my opinion, single-file binary builds are not a great tool for
this. They’re fundamentally not how people, even quite tech-savvy programming
people, find and manage their tools.
A brief summary of the problems with single-file distributions:
- They’re not discoverable. A single file linked on your website will not be
  found via something like brew search, apt search, or choco search, or by
  searching in a platform’s GUI app store’s search bar.
- They’re not updatable. People expect their system package manager to update
stuff for them. Standalone binaries might add their own updaters, but now
you’re shipping a whole software-update system inside your binary. More
likely, it’ll go stale forever while better-packaged software will be
updated and managed properly.
- They have trouble packaging resources. Once you’ve got your code stuffed
into a binary, how do you distribute images, configuration files, or other
data resources along with it? This isn’t impossible to solve, but in other
programming languages which do have a great single-file binary story, this
problem is typically solved by third-party tooling which, while it might work
fine, generally exists in multiple alternative forms, adding its own
complexity.
So while it might be a useful building-block that simplifies those annoying
container builds a lot, it hardly solves the problem comprehensively.
If we were to build a big new tool, the tool we need is something that
standardizes the input format to produce a variety of different complex,
multi-file output formats, including things like:
- deb packages (including uploading to PPA archives so people can add an apt
  line; a manual dpkg -i has many of the same issues as a single file)
- container images (including the upload to a registry so that people can
  "$(shuf -n 1 -e nerdctl docker podman)" pull or FROM it)
- Flatpak apps
- Snaps
- macOS apps
- Microsoft store apps
- MSI installers
- Chocolatey / NuGet packages
- Homebrew formulae
In other words, ask yourself, as a user of an application, how do you want
to consume it? It depends what kind of tool it is, and there is no
one-size-fits-all answer.
In any software ecosystem, if a feature is a building block which doesn’t fully
solve the problem, that is an issue with the feature, but in many cases,
that’s fine. We need lots of building blocks to get to full solutions. This
is the story of open source.
However, if I had to take a crack at summing up the infinite-headed hydra of
the Problem With Python Packaging, I’d put it like this:
Python has a wide array of tools which can be used to package your Python
code for almost any platform, in almost any form, if you are sufficiently
determined. The problem is that the end-to-end experience of shipping an
application to end users who are not Python programmers for any
particular platform has a terrible user experience. What we need are more
holistic solutions, not more building blocks.
This makes me want to push back against this tendency whenever I see it, and to
try to point towards more efficient ways of achieving a good user experience,
with the relatively scarce community resources at our collective disposal.
Efficiency isn’t exclusively about ideal outcomes, though; it’s about
optimizing a cost/benefit ratio. In terms of benefits, single-file builds offer
relatively little,
as I hope I’ve shown above.
Building a tool that makes arbitrary Python code into a fully self-contained
executable is also very high-cost, in terms of community effort, for a bunch of
reasons. For starters, in any moderately-large collection of popular
dependencies from PyPI, at least a few of them are going to want to find their
own resources via __file__, and you need to hack in a way to find those, which
is itself error-prone. Python also expects dynamic linking in a lot of
places, and messing around with C linkers to change that around is a complex
process with its own huge pile of failure modes. You need to do this on
pre-existing binaries built with options you can’t necessarily control,
because making everyone rebuild all the binary wheels they find on PyPI is a
huge step backwards in terms of exposing app developers to confusing
infrastructure complexity.
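To make the __file__ problem concrete, here’s a minimal sketch of the pattern
in question and the kind of rewrite such a tool would require everywhere;
mypackage, its data/ directory, and template.txt are made-up names, and
importlib.resources is simply the nearest standard-library candidate for a
resource-locating API, not a settled answer:

```
import os
from importlib.resources import files

# The pattern found all over PyPI: locate a data file relative to this
# module's location on disk. It assumes the package is an ordinary directory
# of .py files, which stops being true once everything is squashed into a
# single-file executable.
TEMPLATE_PATH = os.path.join(os.path.dirname(__file__), "data", "template.txt")

def load_template_via_file() -> str:
    with open(TEMPLATE_PATH, encoding="utf-8") as f:
        return f.read()

# The kind of replacement an audit would swap in: ask the import system for
# the resource instead of assuming a particular filesystem layout.
def load_template_via_resources() -> str:
    return (files("mypackage") / "data" / "template.txt").read_text(encoding="utf-8")
```

The rewrite itself is easy; the cost is that the first pattern is scattered
throughout other people’s packages on PyPI, and every instance has to be found
and changed.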
Now, none of this is impossible. There are even existing
tools to do some of the
scarier low-level parts of these
problems. But one of the reasons that all the existing tools for doing similar
things have folk-wisdom reputations and even official
documentation
expecting constant pain is that part of the project here is conducting a full
audit of every usage of __file__
on PyPI and replacing it with some
resource-locating API which we haven’t even got a mature version of yet.
Whereas copying all the files into the right spots in an archive file that can
be directly deployed to an existing platform is tedious, but only moderately
challenging. It usually doesn’t involve fundamentally changing how the code
being packaged works, only where it is placed.
To the extent that we have a choice between “make it easy to produce a
single-file binary without needing to understand the complexities of binaries”
or “make it easy to produce a Homebrew formula / Flatpak build / etc without
the user needing to understand Homebrew / Flatpak / etc”, we should always
choose the latter.
If this is giving you déjà vu, that’s because I’ve gestured at this general
concept more vaguely in a few places before, including a tweet in 2019 saying
much the same thing.
Everything I’ve written here so far is debatable.
You can find that debate both in replies to that original tweet and in various
other comments and posts elsewhere in which I’ve grumbled about this. I still
don’t agree with that criticism, but there are very clever people working on
complex tools which are slowly gaining
popularity and might be making the
overall packaging situation better.
So while I think we should in general direct efforts more towards integrating
with full-featured packaging standards, I don’t want to yuck anybody’s yum when
it comes to producing clean single-file executables in general. If you want to
build that tool and it turns out to be a more useful building block than I’m
giving it credit for, knock yourself out.
However, in addition to having a comprehensive write-up of my previously-stated
opinions here, I want to impart a more concrete, less debatable issue. To wit:
single-file executables as a distribution mechanism, specifically on macOS,
are not only sub-optimal, but a complete waste of time.
Late last year, Hynek wrote a great
post about his desire for, and
experience of, packaging a single-file binary for multiple platforms. This
should serve as an illustrative example of my point here. I don’t want to pick
on Hynek. Prominent Python programmers wish for this all the time. In fact,
Hynek also did the thing I said is a good idea here: he created a Homebrew
tap, and that’s the one the README recommends.
So since he kindly supplied a perfect case-study of the contrasting options,
let’s contrast them!
The first thing I notice is that the Homebrew version is Apple Silicon native,
whereas the single-file binary is still x86_64, as the brew
build and test
infrastructure apparently deals with architectural differences (probably pretty
easy given it can use Homebrew’s existing Python build) but the more
hand-rolled PyOxidizer
setup
builds only for the host platform, which in this case is still an Intel Mac,
thanks to GitHub dragging their feet.
The second is that the Homebrew version runs as I expect it to. I run
doc2dash
in my terminal and I see Usage: doc2dash [OPTIONS] SOURCE, as I should.
So, A+ on the Homebrew tap. No notes. I did not have to know anything about
Python being in the loop at all here, it “just works” like every Ruby, Clojure,
Rust, or Go tool I’ve installed with the same toolchain.
Over to the single-file brew-less version.
Beyond the architecture being emulated and having to download Rosetta 2, I
have to note that this “single file” binary already comes in a zip file, since
it needs to include the license in a separate file. Now that it’s
unarchived, I have some choices to make about where to put it on my $PATH.
But let’s ignore that for now and focus on the experience of running it. I
fire up a terminal, and run cd Downloads/doc2dash.x86_64-apple-darwin/ and
then ./doc2dash.
Now we hit the more intractable problem:
The executable does not launch because it is neither code-signed nor
notarized. I’m not going to go through the actual demonstration here, because
you already know how annoying this is,
and also, you can’t actually do it.
Code-signing is more or less fine. The codesign
tool will do its thing, and
that will change the wording in the angry dialog box from something about an
“unidentified developer” to being “unable to check for malware”, which is not
much of a help. You still need to notarize it, and notarization can’t work.
macOS really wants your executable code to be in a bundle (i.e., an App) so
that it can know various things about its provenance and structure. CLI tools
are expected to be in the operating system, or managed by a tool like brew
that acts like a sort of bootleg secondary operating-system-ish thing and knows
how to manage binaries.
If it isn’t in a bundle, then it needs to be in a platform-specific .pkg
file, which is installed with the built-in Installer app. This is because
Apple cannot notarize a stand-alone binary executable file.
Part of the notarization process involves stapling an external “notarization
ticket” to your code, and if you’ve only got a single file, it has nowhere to
put that ticket. You can’t even submit a stand-alone binary; you have to
package it in a format that is legible to Apple’s notarization service, which,
for a pure-CLI tool, means a .pkg.
What about corporate distributions of proprietary command-line tools, like the
1Password CLI? Oh look, their official instructions also tell you to use their
Homebrew formula. Homebrew really is the standard developer-CLI platform for
macOS at this point. When
1Password distributes stuff outside of Homebrew, as with their beta
builds, it’s stuff
that lives in a .pkg
as well.
It is possible to work around all of this.
I could open the unzipped file, right-click on the CLI tool, go to “Open”, get
a subtly differently-worded error dialog, watch it open Terminal for me and
then exit, then wait multiple seconds for
it to run each time I want to re-run it at the command line. Did I mention
that? The single-file option takes 2-3 seconds doing who-knows-what (maybe
some kind of security check, maybe PyOxidizer overhead, I don’t know) but the
Homebrew version starts imperceptibly fast.
Also, I then need to manually re-do this process in the GUI every time I want
to update it.
If you know about the magic of how this all actually works, you can also do
xattr -d com.apple.quarantine doc2dash
by hand, but I feel like xattr -d
is
a step lower down in the user-friendliness hierarchy than python3 -m pip
install, and not only because making a habit of clearing quarantine
attributes manually is a little like cutting the brake lines on Gatekeeper’s
ability to keep out malware.
But the point of distributing a single-file binary is to make it “easy” for
end users, and is explaining Gatekeeper’s integrity verification accomplishing
that goal?
Apple’s effectively mandatory code-signing verification on macOS is far out
ahead of other desktop platforms right now, both in terms of its security and
in terms of its obnoxiousness. But every mobile platform is like this, and I
think that as everyone gets more and more
panicked
about malicious interference with software delivery, we’re going to see more
and more official requirements that software must come packaged in one of
these containers.
Microsoft will probably fix their absolute trash-fire of a codesigning
system
one day too. I predict that something vaguely like this will eventually even
come to most Linux distributions. Not necessarily a prohibition on individual
binaries like this, or like a GUI launch-prevention tool, but some sort of
requirement imposed by the OS that every binary file be traceable to some sort
of package, maybe enforced with some sort of annoying
AppArmor profile if you don’t do it.
The practical, immediate message today is: “don’t bother producing a
single-file binary for macOS users, we don’t want it and we will have a hard
time using it”. But the longer term message is that focusing on creating
single-file binaries is, in general, skating to where the puck used to be.
If we want Python to have a good platform-specific distribution mechanism for
every platform, so it’s easy for developers to get their tools to users without
having to teach all their users a bunch of nonsense about setuptools and
virtualenvs first, we need to build that, and not get hung up on making a
single-file executable packer a core part of the developer experience.
Thanks very much to my patrons for their support of
writing like this, and software
like
these.
Oh, right. This is where I put the marketing “call to action”. Still getting
the hang of these.
Did you enjoy this post and want me to write more like it, and/or did you
hate it and want the psychological leverage and moral authority to tell me to
stop and do something else? You can sign up
here!