Software You Can Use

The Python community needs a tool for distributing software to end users.

python programming deployment Saturday September 12, 2015

Python has a big problem. While it’s easy and fun to produce software in Python, it’s hard to produce software that people - especially laypeople who are not professional software developers - can use.

In the modern software ecosystem, there are a few places that you might want to use a program:

  1. On a server, by loading a web page.
  2. On a web page, by running it in your browser.
  3. As a command-line tool, on Mac,
  4. ... Windows,
  5. ... or Linux.
  6. As a desktop application, on Mac,
  7. ... Windows,
  8. ... or Linux.
  9. As a mobile application, on iOS,
  10. ... or Android,
  11. ... or Windows Phone. (Just kidding.)

Out of these 10 scenarios, Python currently has half of a good deployment story for one of them: running an application on a server, as a back-end. This is a serious problem for the future of Python and one we need to figure out how to face as a community.

Even the “good” deployment story is somewhat convoluted, as you need to know about at least one Linux distribution’s package manager, and native dependencies, and Pip, and virtualenv, and wheels, and probably Docker too.

If you want to run a Python application in your browser, your best bet right now is probably Brython. However, Brython is still in its infancy, and basic facilities are missing, like a way to prepare your code for production to achieve acceptable start-up performance, or an explanation of how to use libraries. With big chunks like that missing, it’s hard to advocate for Brython’s use in production.

Moving on to scenario 6, a desktop application on the Mac, this may be one of the best-supported configurations; py2app actually works surprisingly well. But it’s still incredibly confusing for new users. Which native objects (dylibs, frameworks, and data files) to bundle must be specified as options to py2app itself; libraries cannot declare them so that they are handled automatically. So if, for example, PyGame depends on SDL.framework or libSDL.dylib, you as an application developer need to understand how to figure that out and specify that list.

On Windows, the situation gets worse. To work as a Windows executable, you need to bundle the Python interpreter, but unlike in an OS X application, you can’t just copy in a whole directory. So you end up needing a tool like PyInstaller or cx_Freeze. PyInstaller hasn’t seen a release in the last 2 years; it doesn’t support Python 3. It also doesn’t work: if I try to package the most basic Twisted program possible, with PyInstaller 2.1 I get “no module named zope.interface”, and if I try to package it with PyInstaller trunk, I get “no module named itertools”. cx_Freeze similarly can’t figure out how to include zope.interface no matter what I tell it to do. This problem isn’t specific to libraries that I use; most Python projects will run into it.

py2exe, on the other hand, only supports Python 3.3+, and so is unusable with a lot of important Python libraries.

For a GUI application for Linux, you might have some small hope of building a distro-specific package that users could install, but that would involve using a distro-specific toolchain that had nothing to do with Python, and you need to repeat that work for Debian, Ubuntu, Fedora, and whatever other distros you want to support.

All of these same tools are what I would use to build a stand-alone command-line executable for Windows, Mac, or Linux, and they all break down in similar ways.

In the mobile space, there is absolutely zero tooling included with the language to even get started. It might be possible to use Kivy to get a build onto iOS or Android; I haven’t had an opportunity to test those tools. But they still require you to install Homebrew, and a C compiler, and a whole bunch of fairly specific platform tooling to get started, and there are lots of different ways that can go wrong.

So how do other languages stack up?

  1. In JavaScript, if you want an application in the browser, it’s as simple as ... writing some JavaScript.
  2. In JavaScript, if you want a desktop application, you can just grab Electron and be up and running in a few minutes.
  3. In JavaScript, if you want a command-line UNIX tool, you can grab nar and build something self-contained almost immediately.
  4. And of course, in Go, there’s no way to get anything but a fully functional self-contained executable at the end of the build process. Everything is fully redistributable by default.

As a community, Python needs a clear, well-documented, well-supported, modern way to produce build artifacts that are easy to create and easy to share. We need to have this for all popular platforms and the browser. This is a tricky problem: it requires knowledge of lots of fiddly build details.

This wheel has been re-invented, poorly, a dozen or so times. My list above was just a subset. In addition to py2app, py2exe, pyinstaller, cx_Freeze, and the Kivy bundling tools, we’ve also got terrarium, bbfreeze (which is unmaintained), pipsi, pex, and probably some others I don’t know about.

In order to compete with JavaScript and Go for developers’ attention, Python must be able to become an implementation detail and disappear when the user is running the program. This means that some of these tools (terrarium, pipsi, pex) are not suitable for this purpose because they are envelopes for deployment into an environment with an installed Python interpreter.

All of the tools I’m aware of that are trying to provide fully self-contained execution, though (pyinstaller, cx_Freeze, bbfreeze, py2app), are poorly designed because they value optimized distribution size over actually working by default. Rather than reading setuptools metadata and discovering the full set of dependencies which have been declared to be required, all of these tools use weird AST-parsing heuristics and buggy path-traversal hacks to try to construct a guess as to the minimal set of files that might be required, then require the poor application developer to fill in the gaps. This means none of them work with namespace packages, and none of them work properly with plugin systems or runtime configuration systems; generally, they don’t work correctly with late binding, which is one of Python’s greatest strengths. Of course, a full Python interpreter with the whole standard library is quite large. If we had a tool that worked well but produced very large executables, we could of course start adding an “optimized mode” to try to crunch things down for production.
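To make the “read the declared metadata” approach concrete, here is a minimal sketch of walking a project’s declared requirements transitively. It is not how any of the tools above work. It assumes the modern standard-library `importlib.metadata` module (which did not exist when this was written; `pkg_resources` played that role), and the function names and the crude “skip anything behind an extra” rule are illustrative assumptions rather than a complete requirement parser.

```python
import re
from importlib import metadata


def _requirement_name(requirement):
    """Pull the bare project name off the front of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1) if match else None


def _normalize(name):
    """Normalize a project name so 'zope.interface' and 'zope-interface' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def declared_dependencies(project, lookup=metadata.requires):
    """Return the normalized names of `project` and everything it declares, transitively.

    `lookup` is a hypothetical injection point (defaulting to
    importlib.metadata.requires) so the walk can be tried without
    installing anything.
    """
    seen = set()
    pending = [project]
    while pending:
        name = _normalize(pending.pop())
        if name in seen:
            continue
        seen.add(name)
        try:
            requirements = lookup(name) or []
        except metadata.PackageNotFoundError:
            # Declared but not installed: a real tool should report this clearly.
            continue
        for requirement in requirements:
            _, _, marker = requirement.partition(";")
            if "extra" in marker:
                # Assumption: optional extras are not part of the default bundle.
                continue
            dependency = _requirement_name(requirement)
            if dependency:
                pending.append(dependency)
    return seen
```

Calling `declared_dependencies("twisted")` in an environment where Twisted is installed would include `zope-interface` in the result simply because Twisted declares it, with no guessing from import statements.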

And all this is to say nothing of the insanely intricate and detailed knowledge that every Python programmer eventually acquires about the C runtime semantics of their chosen platform. When a C compiler is required but missing, most tools still just emit tracebacks. When a shared library goes missing due to an OS upgrade or package removal, you just see whatever the dynamic linker thinks to report, with no explanation of how to fix it or what to do next.

The Python packaging ecosystem has made great strides in the last few years; Pip, in particular, has gone from a buggy and insecure mess to a mostly workable software delivery mechanism for developers. There are still bugs, but they are getting dealt with at a reasonable clip. However, Pip only delivers software to developers, and still requires you to have a Python runtime, a build environment, and tricky command-line tools to get things in place for development. The Python community has effectively no tools to deliver software to users.

To sum up, we need a tool which:

  1. works by default, including with “tricky” projects that use namespace packages, data files, and native dependencies
  2. produces useful, actionable error messages when something is missing and the build can’t be completed (like “you don’t have a C compiler installed” or “you need to install Homebrew and then brew install openssl”)
  3. can produce both command-line and GUI executables for Mac, Windows, and Linux (and, for bonus points, a web browser)
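As an illustration of the second requirement, here is a rough sketch of a pre-build check that fails with an actionable message instead of a traceback. The exception class, function names, candidate compiler list, and the wording of the hints are all assumptions made for this example; only `shutil.which` and `sys.platform` come from the standard library.

```python
import shutil
import sys


class MissingBuildTool(Exception):
    """Raised with a message the user can act on, not a bare traceback."""


def install_hint(platform=sys.platform):
    """Suggest a next step for getting a C compiler on this platform."""
    if platform == "darwin":
        return "run `xcode-select --install` to get the Xcode command-line tools"
    if platform.startswith("linux"):
        return ("install your distribution's compiler package "
                "(for example, `build-essential` on Debian or Ubuntu)")
    return "install a C compiler for your platform and make sure it is on your PATH"


def require_c_compiler(candidates=("cc", "gcc", "clang", "cl"),
                       which=shutil.which, platform=sys.platform):
    """Return the path of the first compiler found, or explain what to do."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    raise MissingBuildTool(
        "You don't have a C compiler installed; "
        + install_hint(platform)
        + ", then retry the build."
    )
```

The point of the design is that the check runs before the build starts, so the user sees one sentence describing what to install rather than the tail end of a failed compile.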

The bad news is that I don’t have the time to start this project myself, and I’m not sure who does. The worse news is that every day we don’t have this, more and more people are re-writing their user-facing tools and applications in JavaScript or Go or Swift or Java, to suit their target platform, because it is honestly easier to learn an entirely new programming language and toolchain, and rewrite an entire application than to figure out how to build a self-contained executable in Python right now.

The good news, though, is that all the core technologies for doing the really hard things that need to be done (pip, zipimport, and macholib, for example) already exist. What remains is just a simple matter of programming: wiring together the metadata from setuptools, determining native dependencies with something like otool or ldd (or whatever the equivalent is on Windows; I still haven’t figured that out myself), pulling them all into a bundle, and tacking the Python interpreter on.
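For the “determining native dependencies” step, a minimal sketch on Linux might shell out to ldd and parse what it prints. The function names here are hypothetical, the parser assumes the usual `name => path (address)` line layout of glibc’s ldd, and a real tool would need equivalent handling for otool on the Mac and something else again on Windows.

```python
import re
import subprocess


def parse_ldd_output(text):
    """Map each shared library name to its resolved path, or None if not found.

    Lines without '=>' (the vDSO and the dynamic loader itself) are skipped.
    """
    libraries = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue
        name, _, target = line.partition("=>")
        # Drop the trailing load address, e.g. " (0x00007f12)".
        target = re.sub(r"\s*\(0x[0-9a-fA-F]+\)\s*$", "", target.strip())
        if not target:
            continue
        libraries[name.strip()] = None if target.startswith("not found") else target
    return libraries


def native_dependencies(path):
    """Run ldd on a binary or extension module and parse the result."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return parse_ldd_output(result.stdout)
```

Any library that maps to `None` is exactly the case where a bundling tool should stop and tell the user which package is missing, rather than leaving it to the dynamic linker at run time.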