Python macOS Framework Builds

Building Python with --enable-framework changes some stuff around; should you care?

When you build Python, you can pass various options to ./configure that change aspects of how it is built. There is documentation for all of these options, and they are things like --prefix to tell the build where to install itself, --without-pymalloc if you have some esoteric need for everything to go through a custom memory allocator, or --with-pydebug.

One of these options only matters on macOS, and its effects are generally poorly understood. The official documentation just says “Create a Python.framework rather than a traditional Unix install.” But… do you need a Python.framework? If you’re used to running Python on Linux, then a “traditional Unix install” might sound pretty good; more consistent with what you are used to.

If you use a non-Framework build, most stuff seems to work, so why should anyone care? I have mentioned it as a detail in my previous post about Python on macOS, but even I didn’t really explain why you’d want it, just that it was generally desirable.

The traditional answer to this question is that you need a Framework build “if you want to use a GUI”, but this is demonstrably not true. At first it might not seem so, since the go-to Python GUI test is “run IDLE”; many non-Framework builds also omit Tkinter because they don’t ship a Tk dependency, so IDLE won’t start. But other GUI libraries work fine. For example, uv tool install runsnakerun / runsnake will happily pop open a GUI window, Framework build or not. So it bears some explaining

Wait, what is a “Framework” anyway?

Let’s back up and review an important detail of the mac platform.

On macOS, GUI applications are not just an executable file, they are organized into a bundle, which is a directory with a particular layout, that includes metadata, that launches an executable. A thing that, on Linux, might live in a combination of /bin/foo for its executable and /share/foo/ for its associated data files, is instead on macOS bundled together into Foo.app, and those components live in specified locations within that directory.

A framework is also a bundle, but one that contains a library. Since they are directories, Applications can contain their own Frameworks and Frameworks can contain helper Applications. If /Applications is roughly equivalent to the Unix /bin, then /Library/Frameworks is roughly equivalent to the Unix /lib.

App bundles are contained in a directory with a .app suffix, and frameworks are a directory with a .framework suffix.

So what do you need a Framework for in Python?

The truth about Framework builds is that there is not really one specific thing that you can point to that works or doesn’t work, where you “need” or “don’t need” a Framework build. I was not able to quickly construct an example that trivially fails in a non-framework context for this post, but I didn’t try that many different things, and there are a lot of different things that might fail.

The biggest issue is not actually the Python.framework itself. The metadata on the framework is not used for much outside of a build or linker context. However, Python’s Framework builds also ship with a stub application bundle, which places your Python process into a normal application(-ish) execution context all the time, which allows for various platform APIs like [NSBundle mainBundle] to behave in the normal, predictable ways that all of the numerous, various frameworks included on Apple platforms expect.

Various Apple platform features might want to ask a process questions like “what is your unique bundle identifier?” or “what entitlements are you authorized to access” and even beginning to answer those questions requires information stored in the application’s bundle.

Python does not ship with a wrapper around the core macOS “cocoa” API itself, but we can use pyobjc to interrogate this. After installing pyobjc-framework-cocoa, I can do this

>>> import AppKit
>>> AppKit.NSBundle.mainBundle()

On a non-Framework build, it might look like this:

1	`NSBundle </Users/glyph/example/.venv/bin> (loaded)`

But on a Framework build (even in a venv in a similar location), it might look like this:

NSBundle </Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app> (loaded)

This is why, at various points in the past, GUI access required a framework build, since connections to the window server would just be rejected for Unix-style executables. But that was an annoying restriction, so it was removed at some point, or at least, the behavior was changed. As far as I can tell, this change was not documented. But other things like user notifications or geolocation might need to identity an application for preferences or permissions purposes, respectively. Even something as basic as “what is your app icon” for what to show in alert dialogs is information contained in the bundle. So if you use a library that wants to make use of any of these features, it might work, or it might behave oddly, or it might silently fail in an undocumented way.

This might seem like undocumented, unnecessary cruft, but it is that way because it’s just basic stuff the platform expects to be there for a lot of different features of the platform.

`/etc/` builds

Still, this might seem like a strangely vague description of this feature, so it might be helpful to examine it by a metaphor to something you are more familiar with. If you’re familiar with more Unix style application development, consider a junior developer — let’s call him Jim — asking you if they should use an “/etc build” or not as a basis for their Docker containers.

What is an “/etc build”? Well, base images like ubuntu come with a bunch of files in /etc, and Jim just doesn’t see the point of any of them, so he likes to delete everything in /etc just to make things simpler. It seems to work so far. More experienced Unix engineers that he has asked react negatively and make a face when he tells them this, and seem to think that things will break. But their app seems to work fine, and none of these engineers can demonstrate some simple function breaking, so what’s the problem?

Off the top of your head, can you list all the features that all the files that /etc is needed for? Why not? Jim thinks it’s weird that all this stuff is undocumented, and it must just be unnecessary cruft.

If Jim were to come back to you later with a problem like “it seems like hostname resolution doesn’t work sometimes” or “ls says all my files are owned by 1001 rather than the user name I specified in my Dockerfile” you’d probably say “please, put /etc back, I don’t know exactly what file you need but lots of things just expect it to be there”.

This is what a framework vs. a non-Framework build is like. A Framework build just includes all the pieces of the build that the macOS platform expects to be there. What pieces do what features need? It depends. It changes over time. And the stub that Python’s Framework builds include may not be sufficient for some more esoteric stuff anyway. For example, if you want to use a feature that needs a bundle that has been signed with custom entitlements to access something specific, like the virtualization API, you might need to build your own app bundle. To extend our analogy with Jim, the fact that /etc exists and has the default files in it won’t always be sufficient; sometimes you have to add more files to /etc, with quite specific contents, for some features to work properly. But “don’t get rid of /etc (or your application bundle)” is pretty good advice.

Do you ever want a non-Framework build?

macOS does have a Unix subsystem, and many Unix-y things work, for Unix-y tasks. If you are developing a web application that mostly runs on Linux anyway and never care about using any features that touch the macOS-specific parts of your mac, then you probably don’t have to care all that much about Framework builds. You’re not going to be surprised one day by non-framework builds suddenly being unable to use some basic Unix facility like sockets or files. As long as you are aware of these limitations, it’s fine to install non-Framework builds. I have a dozen or so Pythons on my computer at any given time, and many of them are not Framework builds.

Framework builds do have some small drawbacks. They tend to be larger, they can be a bit more annoying to relocate, they typically want to live in a location like /Library or ~/Library. You can move Python.framework into an application bundle according to certain rules, as any bundling tool for macOS will have to do, but it might not work in random filesystem locations. This may make managing really large number of Python versions more annoying.

Most of all, the main reason to use a non-Framework build is if you are building a tool that manages a fleet of Python installations to perform some automation that needs to know about Python installs, and you want to write one simple tool that does stuff on Linux and on macOS. If you know you don’t need any platform-specific features, don’t want to spend the (not insignificant!) effort to cover those edge cases, and you get a lot of value from that level of consistency (for example, a teaching environment or interdisciplinary development team with a lot of platform diversity) then a non-framework build might be a better option.

Why do I care?

Personally, I think it’s important for Framework builds to be the default for most users, because I think that as much stuff should work out of the box as possible. Any user who sees a neat library that lets them get control of some chunk of data stored on their mac - map data, health data, game center high scores, whatever it is - should be empowered to call into those APIs and deal with that data for themselves.

Apple already makes it hard enough with their thicket of code-signing and notarization requirements for distributing software, aggressive privacy restrictions which prevents API access to some of this data in the first place, all these weird Unix-but-not-Unix filesystem layout idioms, sandboxing that restricts access to various features, and the use of esoteric abstractions like mach ports for communications behind the scenes. We don't need to make it even harder by making the way that you install your Python be a surprise gotcha variable that determines whether or not you can use an API like “show me a user notification when my data analysis is done” or “don’t do a power-hungry data analysis when I’m on battery power”, especially if it kinda-sorta works most of the time, but only fails on certain patch-releases of certain versions of the operating system, becuase an implementation detail of a proprietary framework changed in the meanwhile to require an application bundle where it didn’t before, or vice versa.

More generally, I think that we should care about empowering users with local computation and platform access on all platforms, Linux and Windows included. This just happens to be one particular quirk of how native platform integration works on macOS specifically.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. For this one, thanks especially to long-time patron Hynek who requested it specifically. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like “how can we set up our Mac developers’ laptops with Python”.

On The Defense Of Heroes

How should we defend those people who have done great work that has inspired us, when they stand accused?

politics Friday August 16, 2024

If a high-status member of a community that you participate in is accused of misbehavior, you may want to defend them. You may even write a long essay in their defense.

In that essay, it may seem only natural to begin with a lengthy enumeration of the accused’s positive personal qualities. To extol the quality of their career and their contributions to your community. To talk about how nice they are. To be a character witness in the court of public opinion.

If you do this, you are not defending them. You are proving the point. This is exactly how missing stairs come to exist. People don’t get away with bad behavior if they don’t have high status and a good reputation already.

Sometimes, someone with antisocial inclinations seeks out status, in order to facilitate their bad behavior. Sometimes, a good, but, flawed person does a lot of really good work and thereby accidentally ends up with more status than they were expecting to have, and they don’t know how to handle it. In either case, bad behavior may ensue.

If you truly believe that your fave is being accused or punished unjustly, focus on the facts. What, specifically, has been alleged? How are these allegations substantiated? What verifiable evidence exists to the contrary? If you feel that someone is falsely accusing them to ruin their reputation, is there evidence to support your claim that the accusation is false? Ask yourself the question: what information do you have, that is leading to your correct analysis of the situation, that the people making the accusations do not have, which might be leading them into error?

But, also, maybe just… don’t?

The urge to defend someone like this is much more likely to come from a sense of personal grievance than justice. Consider: does it feel like you are being attacked, when your fave has been attacked? Is there a tightness in your chest, heat rising on your cheeks? Do you feel suddenly defensive?

Do you think that defensiveness is likely to lead to you making good, rational decisions about what steps to take next?

Let your heroes face accountability. If they are really worth your admiration, they might accept responsibility and make amends. Or they might fight the accusations with their own real evidence — evidence that you, someone peripheral to their situation, are unlikely to have — and prove the accusations wrong.

They might not want your defense. Even if they feel like they do want it in the moment — they are human too, after all, and facing accountability does not feel good to us humans — is the intensified feeling that they can’t let down their supporters who believe in them likely to make them feel less defensive and panicked?

In either case, your character defense is unlikely to serve them. At best it helps them stay on an ego trip, at worst it muddies the waters and might confuse the collection of facts that would, if considered dispassionately, properly exonerate them.

Do you think that I am pretending to speak in generalities but really talking about one specific recent event?

Wrong!

Just in this last week, I have read 2 different blog posts about 2 completely different people in completely unrelated communities and both of their authors need to read this. But each of those were already of a type, one that I’ve read dozens of instances of in the past.

It is a very human impulse to perceive a threat to someone we think well of, and to try to defend against that threat. But the consequences of someone’s own actions are not a threat you can defend them from.

Against Innovation Tokens

The “innovation token” model for selecting technologies is bad, and here’s why.

programming deployment ops architecture Wednesday July 03, 2024

Updated 2024-07-04: After some discussion, added an epilogue going into more detail about the value of the distinction between the two types of tokens.

In 2015, Dan McKinley laid out a model for software teams selecting technologies. He proposed that each team have a limited supply of “innovation tokens”, and, when selecting a technology, they can choose boring ones for free but “innovative” ones cost a token. This implies that we all know which technologies are innovative, and we assume that they are inherently costly, so we want to restrict their supply.

That model has become popular to the point that it is now part of the vernacular. In many discussions, it is accepted as received wisdom, or even common sense.

In this post I aim to show you that despite being superficially helpful, this model is wrong, and in fact, may be counterproductive. I believe it is an attractive nuisance in computer programming discourse.

In fairness to Mr. McKinley, the model he described in this post is:

nearly a decade old at this point, and
much more nuanced in its description of the problem with “innovation” than the subsequent memetic mutation of the concept.

While I will be referencing McKinley’s post, and I do take some issue with it, I am reacting more strongly to the life of its own that this idea has taken on once it escaped its original context. There are a zillion worse posts rehashing this concept, on blogs and LinkedIn, but I won’t be linking to them because the goal is not to call anybody out.

To some extent I am re-raising McKinley’s own caveats and reinforcing them. So I may be arguing with a strawman, but it’s a strawman I have seen deployed with some regularity over the years.

To reduce it to its core, this strawman is “don’t use new or interesting technology, and if you have to, only use a little bit”.

Within the broader culture of programmers, an “innovation token” has become a shorthand to smear any technology perceived — almost always based on vibes, not data — as risky, and the adoption of novel approaches as pretentious and unserious. Speaking of programmer culture though, I do have to acknowledge there is also a pervasive tendency for us to get distracted by novelty and waste time on puzzles rather than problem-solving, so I understand where the reactionary attitude represented by the concept of an innovation token comes from.

But it is reactionary.

At its worst, it borders on anti-intellectualism. I have heard it used on more than one occasion as a thought-terminating cliche to discard a potentially promising new tool. But before I get into that, let me try to give a sympathetic summary of the idea, because the model is not entirely bad.

It has been popular for a long time because it does work okay as an heuristic.

The real problem that McKinley is describing is operational overhead. When programmers make a technology selection, we are often considering how difficult it will make the programming. Innovative technology selections are, by definition, less mature.

That lack of maturity — particularly in the open source world — often means that the project is in a part of its lifecycle where it is concerned with development affordances more than operational ones. Therefore, the stereotypical innovative project, even one which might legitimately be a big improvement to development velocity, will create more operational overhead. That operational overhead creates a hidden cost for the operations team later on.

This is a point I emphatically agree with. When selecting a technology, you should consider its ease of operation more than its ease of development. If your team is successful, they will be operating and maintaining it far longer than they are initially integrating and deploying it.

Furthermore, some operational overhead is inevitable. You will need to hire people to mitigate it. More popular, more mature projects will have a bigger talent pool to hire from, so your training costs will be lower, and those training costs are part of your operational cost too.

Rationing innovation tokens therefore can work as a reasonable heuristic, or proxy metric, for avoiding a mess of complex operational problems associated with dependencies that are expensive to operate and hard to hire for.

There are some minor issues I want to point out before getting to the overarching one.

“has a lot of operational overhead” is a stereotype of a new technology, not an inherent property. If you want to reject a technology on the basis of being too high-overhead, at least look into its actual overhead a little bit. Sometimes, especially in 2024 as opposed to 2015, the point of a new, shiny piece of tech is to address operational issues that the more boring, older one had.
“hard to learn” is also a stereotype; if “newer” meant “harder” then we would all be using troff rather than Google Docs. Actually ask if the innovativeness is making things harder or easier; don’t assume.
You are going to have to train people on your stack no matter what. If a technology is adding a lot of value, it’s absolutely worth hiring for general ability and making a plan to teach people about it. You are going to have to do this with the core technology of your product anyway.

As I said, though, these are minor issues. The big problem with modeling operational overhead as an “innovation token” is that an even bigger concern than selecting an innovative tool is selecting too many tools.

The impulse to select more tools and make your operational environment more complex can be made worse by trying to avoid innovative tools. The important thing is not “less innovation”, but more consistency. To illustrate this, let’s do a simple thought experiment.

Let’s say you’re going to make a web app. There’s a tool in Haskell that you really like for a critical part of your app’s problem domain. You don’t want to spend more than one innovation token though, and everything in Haskell is inherently innovative, so you write a little service that just does that one part and you write the rest of your app in Ruby, calling into that service whenever you need to use that thing. This will appropriately restrict your “innovation token” expenditure.

Does doing this actually reduce your operational overhead, though?

First, you will have to find a team that likes both Ruby and Haskell and sees no problem using both. If you are not familiar with the cultural proclivities of these languages, suffice it to say that this is unlikely. Hiring for Haskell programmers is hard because there are fewer of them than Ruby programmers, but hiring for polyglot Haskell/Ruby programmers who are happy to do either is going to be really hard.

Since you will need to find different people to write in the different languages, even in the best case scenario, you will have two teams: the Haskell team and the Ruby team. Even if you are incredibly disciplined about inter-service responsibilities, there will be some areas where duplication of code is necessary across those services. Disagreements will arise and every one of these disagreements will be a source of social friction and software defects.

Then, you need to set up separate CI pipelines for each language, separate deployment systems, and of course, separate databases. Right away you are effectively doubling your workload.

In the worse, and unfortunately more likely scenario, there will be enormous infighting between these two teams. Operational incidents will be more difficult to manage because rather than learning the Haskell tools for operational visibility and disseminating that institutional knowledge amongst your team, you will be half-learning the lessons from two separate ecosystems and attempting to integrate them. Every on-call engineer will be frantically trying to learn a language ecosystem they don’t use regularly, or you will double the size of your on-call rotation. The Ruby team may start to resent the Haskell team for getting to exclusively work on the fun parts of the problem while they are doing things that look more like rote grunt work.

A better way to think about the problem of managing operational overhead is, rather than “innovation tokens”, consider “boundary tokens”.

That is to say, rather than evaluating the general sense of weird vibes from your architecture, consider the consistency of that architecture. If you’re using Haskell, use Haskell. You should be all-in on Haskell web frameworks, Haskell ORMs, Haskell OAuth integrations, and so on.¹ To cross the boundary out of Haskell, you need to spend a boundary token, and you shouldn’t have many of those.

I submit that the increased operational overhead that you might experience with an all-Haskell tool selection will be dwarfed by the savings that you get by having a team that is aligned with each other, that can communicate easily, and that can share programs with each other without needing to first strategize about a channel for the two pieces of work to establish bidirectional communication. The ability to simply call a function when you need to call it is very powerful, and extremely underrated.

Consistency ought to apply at each layer of the stack; it is perhaps most obvious with programming languages, but it is true of web frameworks, test frameworks, cryptographic libraries, you name it. Make a choice and stick with it, because every deviation from that choice carries a significant cost. Moreover this cost is a hidden cost, in the same way that the operational downsides of an “innovative” tool that hasn’t seen much production use might be hidden.

Discarding a more standard tool in favor of a tool more consistent with your architecture extends even to fairly uncontroversial, ubiquitous tools. For example, one of my favorite architectural patterns is to forego the use of the venerable — and very boring – Cron, the UNIX task-scheduler. Instead of Cron, it can make a lot of sense to have hand-written bespoke code for scheduling tasks within the application. Within the “innovation tokens” model, this is a very silly waste of a token!

Just use Cron! Everybody knows how to use Cron!

Except… does everybody know how to use Cron? Here are some questions to consider, if you’re about to roll out a big dependency on Cron:

How do you write a unit test for a scheduling rule with Cron?
Can you even remember how to write a cron rule that runs at the times you want?
How do you inject secrets and configuration variables into the distinct and somewhat idiosyncratic runtime execution environment of Cron?
How do you know that you did that variable-injection properly until the job actually runs, possibly in the middle of the night?
How do you deploy your monitoring and error-logging frameworks to observe your scripts run under Cron?

Granted, this architectural choice is less controversial than it once was. Cron used to be ambiently available on whatever servers you happened to be running. As container-based deployments have increased in popularity, this sense that Cron is just kinda around has gone away, and if you need to run a container that just runs Cron, much of the jankiness of its deployment is a lot more immediately visible.

There is friction at the boundary between things. That friction is a cost, but sometimes it’s a cost worth paying.

If there’s a really good library in Haskell and a really good library in Ruby and you really do want to use them both, maybe it makes sense to actually have multiple services. As your team gets larger and more mature, the need to bring in more tools, and the ability to handle the associated overhead, will only increase over time. But the place that the cost comes in the most is at the boundary between tools, not in the operational deficiencies of any one particular tool.

Even in a bog-standard web application with the most boring, least innovative tech stack imaginable (PHP, MySQL, HTML, CSS, JavaScript), many of the annoying points of friction are where different, inconsistent technologies make contact. If you are a programmer working on the web yourself, consider your own impression of the level of controversy of these technologies:

CSS frameworks that attempt to bridge the gap between CSS and JavaScript, like Bootstrap or Tailwind.
ORMs that attempt to bridge the gap between SQL and your backend language of choice
RPC systems that attempt to connect disparate services together using simple abstractions.

Consider that there are far more complex technical tools in terms of required skills to implement them, like computer vision or physics simulation, tools which are also pretty widely used, which consistently generate lower levels of controversy. People do have strong feelings about these things as well, of course, and it’s hard to find things to link to that show “this isn’t controversial”, but, like, search your feelings, you know it to be true.

You can see the benefits of the boundary token approach in programming language design. Many of the most influential and best-loved programming languages had an impact not by bundling together lots of tools, but by making everything into one thing:

LISP: everything is a list
Smalltalk: everything is an object
ML: everything is an algebraic data type
Forth: everything is a stack

There is a tremendous power in thinking about everything as a single kind of thing, because then you don’t have to juggle lots of different ideas about different kinds of things; you can just think about your problem.

When people complain about programming languages, they’re often complaining about how many different kinds of thing they have to remember in order to use it.

If you keep your boundary-token budget small, and allow your developers to accomplish as much as possible while staying within a solution space delineated by a single, clean cognitive boundary, I promise you can innovate as much as you want and your operational costs will remain manageable.

Epilogue

In subsequent Mastodon discussion of this post on with Matt Campbell and Meejah, I realized that I may not have made it entirely clear why I feel the distinction between “boundary” and “innovation” tokens is important. I do say above that the “innovation token” model can be a useful heuristic, so why bother with a new, but slightly different heuristic? Especially since most experienced engineers - indeed, McKinley himself - would budget “innovation” quite similarly to “boundaries”, and might even consider the use of more “innovative” Haskell tools in my hypothetical scenario to not even be an expenditure of innovation tokens at all.

To answer that, I need to highlight the purpose of having heuristics like this in the first place. These are vague, nebulous guidelines, not hard and fast rules. I cannot give you a token calculator to plug your technical decisions into. The purpose of either token heuristic is to facilitate discussions among a team.

With a team of skilled and experienced engineers, the distinction is meaningless. Senior and staff engineers (at least, the ones who deserve their level) will intuit the goals behind “innovation tokens” and inherently consider things like operational overhead anyway. In practice, a high-performing, well-aligned team discussing innovation tokens and one discussing boundary tokens will look functionally indistinguishable.

The distinction starts to be important when you have management pressures, nervous executives, inexperienced engineers, a fresh team without existing consensus about core technology choices, and so on. That is to say, most teams that exist in the messy, perpetually in medias res world of the software industry.

If you are just getting started on a project and you have a bunch of competent but disagreeable engineers, the words “innovation” and “boundaries” function very differently.

If you ask, “is this an innovation” about a particular technical tool, you are asking your interlocutor to pull in a bunch of their skills and experience to subjectively evaluate the relative industry-wide, or maybe company-wide, or maybe team-wide² newness of the thing being discussed. The discussion of whether it counts as boring or innovative is immediately fraught with a ton of subjective, difficult-to-quantify information about costs of hiring, difficulty of learning, and your impression of the feelings of hundreds or thousands of people outside of your team. And, yes, ultimately you do need to have an estimate of all that stuff, but starting your “is it OK to use this” conversation by simultaneously arguing about all those subjective judgments is setting yourself up for failure.

Instead, if you ask “does this introduce a boundary between two different technologies with different conceptual models”, while that is not a perfectly objective question, it is much easier for your team to answer, with much crisper intermediary factual questions. What are the two technologies? What are the models? How much do they differ? You can just hash out the answers to each one within the team directly, rather than needing to sift through the last few years of Stack Overflow developer surveys to determine relative adoption or popularity of technologies in the world at large.

Restricting your supply of either boundary or innovation tokens is a good idea, but achieving unanimity within your team about what your boundaries are is always going to be easier than deciding what your innovations are.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like “how can we make our architecture more consistent”.

I gave a talk about this once, a very long time ago, where Haskell was Python. ↩
It’s not clear, that’s a big part of the problem. ↩

A Grand Unified Theory of the AI Hype Cycle

I’m sorry, but as an AI language model, I cannot repeat history exactly. However, I can rhyme with it.

ai llm hype history semasiology Wednesday May 22, 2024

The Cycle

The history of AI goes in cycles, each of which looks at least a little bit like this:

Scientists do some basic research and develop a promising novel mechanism, N. One important detail is that N has a specific name; it may or may not be carried out under the general umbrella of “AI research” but it is not itself “AI”. N always has a few properties, but the most common and salient one is that it initially tends to require about 3x the specifications of the average computer available to the market at the time; i.e., it requires three times as much RAM, CPU, and secondary storage as is shipped in the average computer.
Research and development efforts begin to get funded on the hypothetical potential of N. Because N is so resource intensive, this funding is used to purchase more computing capacity (RAM, CPU, storage) for the researchers, which leads to immediate results, as the technology was previously resource constrained.
Initial successes in the refinement of N hint at truly revolutionary possibilities for its deployment. These revolutionary possibilities include a dimension of cognition that has not previously been machine-automated.
Leaders in the field of this new development — specifically leaders, like lab administrators, corporate executives, and so on, as opposed to practitioners like engineers and scientists — recognize the sales potential of referring to this newly-“thinking” machine as “Artificial Intelligence”, often speculating about science-fictional levels of societal upheaval (specifically in a period of 5-20 years), now that the “hard problem” of machine cognition has been solved by N.
Other technology leaders, in related fields, also recognize the sales potential and begin adopting elements of the novel mechanism to combine with their own areas of interest, also referring to their projects as “AI” in order to access the pool of cash that has become available to that label. In the course of doing so, they incorporate N in increasingly unreasonable ways.
The scope of “AI” balloons to include pretty much all of computing technology. Some things that do not even include N start getting labeled this way.
There’s a massive economic boom within the field of “AI”, where “the field of AI” means any software development that is plausibly adjacent to N in any pitch deck or grant proposal.
Roughly 3 years pass, while those who control the flow of money gradually become skeptical of the overblown claims that recede into the indeterminate future, where N precipitates a robot apocalypse somewhere between 5 and 20 years away. Crucially, because of the aforementioned resource-intensiveness, the gold owners skepticism grows slowly over this period, because their own personal computers or the ones they have access to do not have the requisite resources to actually run the technology in question and it is challenging for them to observe its performance directly. Public critics begin to appear.
Competent practitioners — not leaders — who have been successfully using N in research or industry quietly stop calling their tools “AI”, or at least stop emphasizing the “artificial intelligence” aspect of them, and start getting funding under other auspices. Whatever N does that isn’t “thinking” starts getting applied more seriously as its limitations are better understood. Users begin using more specific terms to describe the things they want, rather than calling everything “AI”.
Thanks to the relentless march of Moore’s law, the specs of the average computer improve. The CPU, RAM, and disk resources required to actually run the software locally come down in price, and everyone upgrades to a new computer that can actually run the new stuff.
The investors and grant funders update their personal computers, and they start personally running the software they’ve been investing in. Products with long development cycles are finally released to customers as well, but they are disappointing. The investors quietly get mad. They’re not going to publicly trash their own investments, but they stop loudly boosting them and they stop writing checks. They pivot to biotech for a while.
The field of “AI” becomes increasingly desperate, as it becomes the label applied to uses of N which are not productive, since the productive uses are marketed under their application rather than their mechanism. Funders lose their patience, the polarity of the “AI” money magnet rapidly reverses. Here, the AI winter is finally upon us.
The remaining AI researchers who still have funding via mechanisms less vulnerable to hype, who are genuinely thinking about automating aspects of cognition rather than simply N, quietly move on to the next impediment to a truly thinking machine, and in the course of doing so, they discover a new novel mechanism, M. Go to step 1, with M as the new N, and our current N as a thing that is now “not AI”, called by its own, more precise name.

The History

A non-exhaustive list of previous values of N have been:

Neural networks and symbolic reasoning in the 1950s.
Theorem provers in the 1960s.
Expert systems in the 1980s.
Fuzzy logic and hidden Markov models in the 1990s.
Deep learning in the 2010s.

Each of these cycles has been larger and lasted longer than the last, and I want to be clear: each cycle has produced genuinely useful technology. It’s just that each follows the progress of a sigmoid curve that everyone mistakes for an exponential one. There is an initial burst of rapid improvement, followed by gradual improvement, followed by a plateau. Initial promises imply or even state outright “if we pour more {compute, RAM, training data, money} into this, we’ll get improvements forever!” The reality is always that these strategies inevitably have a limit, usually one that does not take too long to find.

Where Are We Now?

So where are we in the current hype cycle?

Here’s a Computerphile video which explains some recent research into LLM performance. I’d highly encourage you to have a look at the paper itself, particularly Figure 2, “Log-linear relationships between concept frequency and CLIP zero-shot performance”.
Here’s a series of posts by Simon Willison explaining the trajectory of the practicality of actually-useful LLMs on personal devices. He hasn’t written much about it recently because it is now fairly pedestrian for an AI-using software developer to have a bunch of local models, and although we haven’t quite broken through the price floor of the gear-acquisition-syndrome prosumer market in terms of the requirements of doing so, we are getting close.
The Rabbit R1 and Humane AI Pin were both released; were they disappointments to their customers and investors? I think we all know how that went at this point.
I hear Karius just raised a series C, and they’re an “emerging unicorn”.
It does appear that we are all still resolutely calling these things “AI” for now, though, much as I wish, as a semasiology enthusiast, that we would be more precise.

Some Qualifications

History does not repeat itself, but it does rhyme. This hype cycle is unlike any that have come before in various ways. There is more money involved now. It’s much more commercial; I had to phrase things above in very general ways because many previous hype waves have been based on research funding, some really being exclusively a phenomenon at one department in DARPA, and not, like, the entire economy.

I cannot tell you when the current mania will end and this bubble will burst. If I could, you’d be reading this in my $100,000 per month subscribers-only trading strategy newsletter and not a public blog. What I can tell you is that computers cannot think, and that the problems of the current instantation of the nebulously defined field of “AI” will not all be solved within “5 to 20 years”.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. Special thanks also to Ben Chatterton for a brief pre-publication review; any errors remain my own. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like “what are we doing that history will condemn us for”. Or, you know, Python programming.

How To PyCon

Since I am headed to PyCon tomorrow, let’s talk about conference tips.

pycon Wednesday May 15, 2024

These tips are not the “right” way to do PyCon, but they are suggestions based on how I try to do PyCon. Consider them reminders to myself, an experienced long-time attendee, which you are welcome to overhear.

See Some Talks

The hallway track is awesome. But the best version of the hallway track is not just bumping into people and chatting; it’s the version where you’ve all recently seen the same thing, and thereby have a shared context of something to react to. If you aren’t going to talks, you aren’t going to get a good hallway track.. Therefore: choose talks that interest you, attend them and pay close attention, then find people to talk to about them.

Given that you will want to see some of the talks, make sure that you have the schedule downloaded and available offline on your mobile device, or printed out on a piece of paper.

Make a list of the talks you think you want to see, but have that schedule with you in case you want to call an audible in the middle of the conference, switching to a different talk you didn’t notice based on some of those “hallway track” conversations.

Participate In Open Spaces

The name “hallway track” itself is antiquated, in a way which is relevant and important to modern conferences. It used to be that conferences were exclusively oriented around their scheduled talks; it was called the “hallway” track because the way to access it was to linger in the hallways, outside the official structure of the conference, and just talk to people.

But however, at PyCon and many other conferences, this unofficial track is now much more of an integrated, official part of the program. In particular, open spaces are not only a more official hallway track, they are considerably better than the historical “hallway” experience, because these ad-hoc gatherings can be convened with a prepared topic and potentially a loose structure to facilitate productive discussion.

With open spaces, sessions can have an agenda and so conversations are easier to start. Rooms are provided, which is more useful than you might think; literally hanging out in a hallway is actually surprisingly disruptive to speakers and attendees at talks; us nerds tend to get pretty loud and can be quite audible even through a slightly-cracked door, so avail yourself of these rooms and don’t be a disruptive jerk outside somebody’s talk.

Consult the open space board, and put up your own proposed sessions. Post them as early as you can, to maximize the chance that they will get noticed. Post them on social media, using the conference's official hashtag, and ask any interested folks you bump into help boost it.¹

Remember that open spaces are not talks. If you want to give a mini-lecture on a topic and you can find interested folks you could do that, but the format lends itself to more peer-to-peer, roundtable-style interactions. Among other things, this means that, unlike proposing a talk, where you should be an expert on the topic that you are proposing, you can suggest open spaces where you are curious — but ignorant — about something, in the hopes that some experts will show up and you can listen to their discussion.

Be prepared for this to fail; there’s a lot going on and it’s always possible that nobody will notice your session. Again, maximize your chances by posting it as early as you can and promoting it, but be prepared to just have a free 30 minutes to check your email. Sometimes that’s just how it goes. The corollary here is not to always balance attending others’ spaces with proposing your own. After all if someone else proposed it you know at least one other person is gonna be there.

Take Care of Your Body

Conferences can be surprisingly high-intensity physical activities. It’s not a marathon, but you will be walking quickly from one end of a large convention center to another, potentially somewhat anxiously.

Hydrate, hydrate, hydrate. Bring a water bottle, and have it with you at all times. It might be helpful to set repeating timers on your phone to drink water, since it can be easy to forget in the middle of engaging conversations. If you take advantage of the hallway track as much as you should, you will talk more than you expect; talking expels water from your body. All that aforementioned walking might make you sweat a bit more than you realize.

Hydrate.

More generally, pay attention to what you are eating and drinking. Conference food isn’t always the best, and in a new city you might be tempted to load up on big meals and junk food. You should enjoy yourself and experience the local cuisine, but do it intentionally. While you enjoy the local fare, do so in whatever moderation works best for you. Similarly for boozy night-time socializing. Nothing stings quite as much as missing a morning of talks because you’ve got a hangover or a migraine.

This is worth emphasizing because in the enthusiasm of an exciting conference experience, it’s easy to lose track and overdo it.

Meet Both New And Old Friends: Plan Your Socializing

A lot of the advice above is mostly for first-time or new-ish conferencegoers, but this one might be more useful for the old heads. As we build up a long-time clique of conference friends, it’s easy to get a bit insular and lose out on one of the bits of magic of such an event: meeting new folks and hearing new perspectives.

While open spaces can address this a little bit, there's one additional thing I've started doing in the last few years: dinners are for old friends, but lunches are for new ones. At least half of the days I'm there, I try to go to a new table with new folks that I haven't seen before, and strike up a conversation. I even have a little canned icebreaker prompt, which I would suggest to others as well, because it’s worked pretty nicely in past years: “what is one fun thing you have done with Python recently?”².

Given that I have a pretty big crowd of old friends at these things, I actually tend to avoid old friends at lunch, since it’s so easy to get into multi-hour conversations, and meeting new folks in a big group can be intimidating. Lunches are the time I carve out to try and meet new folks.

I’ll See You There

I hope some of these tips were helpful, and I am looking forward to seeing some of you at PyCon US 2024!

In PyCon2024's case, #PyConUS on Mastodon is probably the way to go. Note, also, that it is #PyConUS and not #pyconus, which is much less legible for users of screen-readers. ↩
Obviously that is specific to this conference. At the O’Reilly Software Architecture conference, my prompt was “What is software architecture?” which had some really fascinating answers. ↩

Python macOS Framework Builds

Wait, what is a “Framework” anyway?

So what do you need a Framework for in Python?

/etc/ builds

Do you ever want a non-Framework build?

Why do I care?

Acknowledgments

On The Defense Of Heroes

Against Innovation Tokens

Epilogue

Acknowledgments

A Grand Unified Theory of the AI Hype Cycle

The Cycle

The History

Where Are We Now?

Some Qualifications

Acknowledgments

How To PyCon

See Some Talks

Participate In Open Spaces

Take Care of Your Body

Meet Both New And Old Friends: Plan Your Socializing

I’ll See You There

`/etc/` builds