The Next Thing Will Not Be Big

Disruption, too, will be disrupted.

startups ai open-source programming politics Thursday January 01, 2026

The dawning of a new year is an opportune moment to contemplate what has transpired in the old year, and consider what is likely to happen in the new one.

Today, I’d like to contemplate that contemplation itself.

The 20th century was an era characterized by rapidly accelerating change in technology and industry, creating shorter and shorter cultural cycles of changes in lifestyles. Thus far, the 21st century seems to be following that trend, at least in its recently concluded first quarter.

The early half of the twentieth century saw the massive disruption caused by electrification, radio, motion pictures, and then television.

In 1971, Intel poured gasoline on that fire by releasing the 4004, a microchip generally recognized as the first general-purpose microprocessor. Popular innovations rapidly followed: the computerized cash register, the personal computer, credit cards, cellular phones, text messaging, the Internet, the web, online games, mass surveillance, app stores, social media.

These innovations have arrived faster than previous generations, but also, they have crossed a crucial threshold: that of the human lifespan.

While the entire second millennium A.D. has been characterized by a gradually accelerating rate of technological and social change — the printing press and the industrial revolution were no slouches, in terms of changing society, and those predate the 20th century — most of those changes had the benefit of unfolding throughout the course of a generation or so.

Which means that any individual person in any given century up to the 20th might remember one major world-altering social shift within their lifetime, not five to ten of them. The diversity of human experience is vast, but most people would not expect that the defining technology of their lifetime was merely the latest in a progression of predictable civilization-shattering marvels.

Along with each of these successive generations of technology, we minted a new generation of industry titans. Westinghouse, Carnegie, Sarnoff, Edison, Ford, Hughes, Gates, Jobs, Zuckerberg, Musk. Not just individual rich people, but entire new classes of rich people that did not exist before. “Radio DJ”, “Movie Star”, “Rock Star”, “Dot Com Founder”, were all new paths to wealth opened (and closed) by specific technologies. While most of these people did come from at least some level of generational wealth, they no longer came from a literal hereditary aristocracy.

To describe this new feeling of constant acceleration, a new phrase was coined: “The Next Big Thing”. In addition to denoting that some Thing was coming and that it would be Big (i.e.: that it would change a lot about our lives), this phrase also carries the strong implication that such a Thing would be a product. Not a development in social relationships or a shift in cultural values, but some new and amazing form of conveying salted meat paste or what-have-you, that would make whatever lucky tinkerer who stumbled into it into a billionaire — along with any friends and family lucky enough to believe in their vision and get in on the ground floor with an investment.

In the latter part of the 20th century, our entire model of capital allocation shifted to account for this widespread belief. No longer were mega-businesses built by bank loans, stock issuances, and reinvestment of profit, the new model was “Venture Capital”. Venture capital is a model of capital allocation explicitly predicated on the idea that carefully considering each bet on a likely-to-succeed business and reducing one’s risk was a waste of time, because the return on the equity from the Next Big Thing would be so disproportionately huge — 10x, 100x, 1000x – that one could afford to make at least 10 bad bets for each good one, and still come out ahead.

The biggest risk was in missing the deal, not in giving a bunch of money to a scam. Thus, value investing and focus on fundamentals have been broadly disregarded in favor of the pursuit of the Next Big Thing.

If Americans of the twentieth century were temporarily embarrassed millionaires, those of the twenty-first are all temporarily embarrassed FAANG CEOs.

The predicament that this tendency leaves us in today is that the world is increasingly run by generations — GenX and Millennials — with the shared experience that the computer industry, either hardware or software, would produce some radical innovation every few years. We assume that to be true.

But all things change, even change itself, and that industry is beginning to slow down. Physically, transistor density is starting to brush up against physical limits. Economically, most people are drowning in more compute power than they know what to do with anyway. Users already have most of what they need from the Internet.

The big new feature in every operating system is a bunch of useless junk nobody really wants and is seeing remarkably little uptake. Social media and smartphones changed the world, true, but… those are both innovations from 2008. They’re just not new any more.

So we are all — collectively, culturally — looking for the Next Big Thing, and we keep not finding it.

It wasn’t 3D printing. It wasn’t crowdfunding. It wasn’t smart watches. It wasn’t VR. It wasn’t the Metaverse, it wasn’t Bitcoin, it wasn’t NFTs¹.

It’s also not AI, but this is why so many people assume that it will be AI. Because it’s got to be something, right? If it’s got to be something then AI is as good a guess as anything else right now.

The fact is, our lifetimes have been an extreme anomaly. Things like the Internet used to come along every thousand years or so, and while we might expect that the pace will stay a bit higher than that, it is not reasonable to expect that something new like “personal computers” or “the Internet”³ will arrive again.

We are not going to get rich by getting in on the ground floor of the next Apple or the next Google because the next Apple and the next Google are Apple and Google. The industry is maturing. Software technology, computer technology, and internet technology are all maturing.

There Will Be Next Things

Research and development is happening in all fields all the time. Amazing new developments quietly and regularly occur in pharmaceuticals and in materials science. But these are not predictable. They do not inhabit the public consciousness until they’ve already happened, and they are rarely so profound and transformative that they change everybody’s life.

There will even be new things in the computer industry, both software and hardware. Foldable phones do address a real problem (I wish the screen were even bigger but I don’t want to carry around such a big device), and would probably be more popular if they got the costs under control. One day somebody’s going to crack the problem of volumetric displays, probably. Some VR product will probably, eventually, hit a more realistic price/performance ratio where the niche will expand at least a little more.

Maybe there will even be something genuinely useful, which is recognizably adjacent to the current “AI” fad, but if it is, it will be some new development that we haven’t seen yet. If current AI technology were sufficient to drive some interesting product, it would already be doing it, not using marketing disguised as science to conceal diminishing returns on current investments.

But They Will Not Be Big

The impulse to find the One Big Thing that will dominate the next five years is a fool’s errand. Incremental gains are diminishing across the board. The markets for time and attention² are largely saturated. There’s no need for another streaming service if 100% of your leisure time is already committed to TikTok, YouTube and Netflix; famously, Netflix has already considered sleep its primary competitor for close to a decade - years before the pandemic.

Those rare tech markets which aren’t saturated are suffering from pedestrian economic problems like wealth inequality, not technological bottlenecks.

For example, the thing preventing the development of a robot that can do your laundry and your dishes without your input is not necessarily that we couldn’t build something like that, but that most households just can’t afford it without wage growth catching up to productivity growth. It doesn’t make sense for anyone to commit to the substantial R&D investment that such a thing would take, if the market doesn’t exist because the average worker isn’t paid enough to afford it on top of all the other tech which is already required to exist in society.

The projected income from the tiny, wealthy sliver of the population who could pay for the hardware, cannot justify an investment in the software past a fake version remotely operated by workers in the global south, only made possible by Internet wage arbitrage, i.e. a more palatable, modern version of indentured servitude.

Even if we were to accept the premise of an actually-“AI” version of this, that is still just a wish that ChatGPT could somehow improve enough behind the scenes to replace that worker, not any substantive investment in a novel, proprietary-to-the-chores-robot software system which could reliably perform specific functions.

What, Then?

The expectation for, and lack of, a “big thing” is a big problem. There are others who could describe its economic, political, and financial dimensions better than I can. So then let me speak to my expertise and my audience: open source software developers.

When I began my own involvement with open source, a big part of the draw for me was participating in a low-cost (to the corporate developer) but high-value (to society at large) positive externality. None of my employers would ever have cared about many of the applications for which Twisted forms a core bit of infrastructure; nor would I have been able to predict those applications’ existence. Yet, it is nice to have contributed to their development, even a little bit.

However, it’s not actually a positive externality if the public at large can’t directly benefit from it.

When real world-changing, disruptive developments are occurring, the bean-counters are not watching positive externalities too closely. As we discovered with many of the other benefits that temporarily accrued to labor in the tech economy, Open Source that is usable by individuals and small companies may have been a ZIRP. If you know you’re gonna make a billion dollars you’re not going to worry about giving away a few hundred thousand here and there.

When gains are smaller and harder to realize, and margins are starting to get squeezed, it’s harder to justify the investment in vaguely good vibes.

But this, itself, is not a call to action. I doubt very much that anyone reading this can do anything about the macroeconomic reality of higher interest rates. The technological reality of “development is happening slower” is inherently something that you can’t change on purpose.

However, what we can do is to be aware of this trend in our own work.

Fight Scale Creep

It seems to me that more and more open source infrastructure projects are tools for hyper-scale application development, only relevant to massive cloud companies. This is just a subjective assessment on my part — I’m not sure what tools even exist today to measure this empirically — but I remember a big part of the open source community when I was younger being things like Inkscape, Themes.Org and Slashdot, not React, Docker Hub and Hacker News.

This is not to say that the hobbyist world no longer exists. There is of course a ton of stuff going on with Raspberry Pi, Home Assistant, OwnCloud, and so on. If anything there’s a bit of a resurgence of self-hosting. But the interests of self-hosters and corporate developers are growing apart; there seems to be far less of a beneficial overflow from corporate infrastructure projects into these enthusiast or prosumer communities.

This is the concrete call to action: if you are employed in any capacity as an open source maintainer, dedicate more energy to medium- or small-scale open source projects.

If your assumption is that you will eventually reach a hyper-scale inflection point, then mimicking Facebook and Netflix is likely to be a good idea. However, if we can all admit to ourselves that we’re not going to achieve a trillion-dollar valuation and a hundred thousand engineer headcount, we can begin to consider ways to make our Next Thing a bit smaller, and to accommodate the world as it is rather than as we wish it would be.

Be Prepared to Scale Down

Here are some design guidelines you might consider, for just about any open source project, particularly infrastructure ones:

Don’t assume that your software can sustain an arbitrarily large fixed overhead because “you just pay that cost once” and you’re going to be running a billion instances so it will always amortize; maybe you’re only going to be running ten.
Remember that such fixed overhead includes not just CPU, RAM, and filesystem storage, but also the learning curve for developers. Front-loading a massive amount of conceptual complexity to accommodate the problems of hyper-scalers is a common mistake. Try to smooth out these complexities and introduce them only when necessary.
Test your code on edge devices. This means supporting Windows and macOS, and even Android and iOS. If you want your tool to help empower individual users, you will need to meet them where they are, which is not on an EC2 instance.
This includes considering Desktop Linux as a platform, as opposed to Server Linux as a platform, which (while they certainly have plenty in common) they are also distinct in some details. Consider the highly specific example of secret storage: if you are writing something that intends to live in a cloud environment, and you need to configure it with a secret, you will probably want to provide it via a text file or an environment variable. By contrast, if you want this same code to run on a desktop system, your users will expect you to support the Secret Service. This will likely only require a few lines of code to accommodate, but it is a massive difference to the user experience.
Don’t rely on LLMs remaining cheap or free. If you have LLM-related features⁴, make sure that they are sufficiently severable from the rest of your offering that if ChatGPT starts costing $1000 a month, your tool doesn’t break completely. Similarly, do not require that your users have easy access to half a terabyte of VRAM and a rack full of 5090s in order to run a local model.

Even if you were going to scale up to infinity, the ability to scale down and consider smaller deployments means that you can run more comfortably on, for example, a developer’s laptop. So even if you can’t convince your employer that this is where the economy and the future of technology in our lifetimes is going, it can be easy enough to justify this sort of design shift, particularly as individual choices. Make your onboarding cheaper, your development feedback loops tighter, and your systems generally more resilient to economic headwinds.

So, please design your open source libraries, applications, and services to run on smaller devices, with less complexity. It will be worth your time as well as your users’.

But if you can fix the whole wealth inequality thing, do that first.

Acknowledgments

These sorts of lists are pretty funny reads, in retrospect. ↩
Which is to say, “distraction”. ↩
... or even their lesser-but-still-profound aftershocks like “Social Media”, “Smartphones”, or “On-Demand Streaming Video” ... secondary manifestations of the underlying innovation of a packet-switched global digital network ... ↩
My preference would of course be that you just didn’t have such features at all, but perhaps even if you agree with me, you are part of an organization with some mandate to implement LLM stuff. Just try not to wrap the chain of this anchor all the way around your code’s neck. ↩

The “Dependency Cutout” Workflow Pattern, Part I

It’s important to be able to fix bugs in your open source dependencies, and not just work around them.

programming python deployment open-source Monday November 10, 2025

Tell me if you’ve heard this one before.

You’re working on an application. Let’s call it “FooApp”. FooApp has a dependency on an open source library, let’s call it “LibBar”. You find a bug in LibBar that affects FooApp.

To envisage the best possible version of this scenario, let’s say you actively like LibBar, both technically and socially. You’ve contributed to it in the past. But this bug is causing production issues in FooApp today, and LibBar’s release schedule is quarterly. FooApp is your job; LibBar is (at best) your hobby. Blocking on the full upstream contribution cycle and waiting for a release is an absolute non-starter.

What do you do?

There are a few common reactions to this type of scenario, all of which are bad options.

I will enumerate them specifically here, because I suspect that some of them may resonate with many readers:

Find an alternative to LibBar, and switch to it.

This is a bad idea because a transition to a core infrastructure component could be extremely expensive.
Vendor LibBar into your codebase and fix your vendored version.

This is a bad idea because carrying this one fix now requires you to maintain all the tooling associated with a monorepo¹: you have to be able to start pulling in new versions from LibBar regularly, reconcile your changes even though you now have a separate version history on your imported version, and so on.
Monkey-patch LibBar to include your fix.

This is a bad idea because you are now extremely tightly coupled to a specific version of LibBar. By modifying LibBar internally like this, you’re inherently violating its compatibility contract, in a way which is going to be extremely difficult to test. You can test this change, of course, but as LibBar changes, you will need to replicate any relevant portions of its test suite (which may be its entire test suite) in FooApp. Lots of potential duplication of effort there.
Implement a workaround in your own code, rather than fixing it.

This is a bad idea because you are distorting the responsibility for correct behavior. LibBar is supposed to do LibBar’s job, and unless you have a full wrapper for it in your own codebase, other engineers (including “yourself, personally”) might later forget to go through the alternate, workaround codepath, and invoke the buggy LibBar behavior again in some new place.
Implement the fix upstream in LibBar anyway, because that’s the Right Thing To Do, and burn credibility with management while you anxiously wait for a release with the bug in production.

This is a bad idea because you are betraying your users — by allowing the buggy behavior to persist — for the workflow convenience of your dependency providers. Your users are probably giving you money, and trusting you with their data. This means you have both ethical and economic obligations to consider their interests.

As much as it’s nice to participate in the open source community and take on an appropriate level of burden to maintain the commons, this cannot sustainably be at the explicit expense of the population you serve directly.

Even if we only care about the open source maintainers here, there’s still a problem: as you are likely to come under immediate pressure to ship your changes, you will inevitably relay at least a bit of that stress to the maintainers. Even if you try to be exceedingly polite, the maintainers will know that you are coming under fire for not having shipped the fix yet, and are likely to feel an even greater burden of obligation to ship your code fast.

Much as it’s good to contribute the fix, it’s not great to put this on the maintainers.

The respective incentive structures of software development — specifically, of corporate application development and open source infrastructure development — make options 1-4 very common.

On the corporate / application side, these issues are:

it’s difficult for corporate developers to get clearance to spend even small amounts of their work hours on upstream open source projects, but clearance to spend time on the project they actually work on is implicit. If it takes 3 hours of wrangling with Legal² and 3 hours of implementation work to fix the issue in LibBar, but 0 hours of wrangling with Legal and 40 hours of implementation work in FooApp, a FooApp developer will often perceive it as “easier” to fix the issue downstream.
it’s difficult for corporate developers to get clearance from management to spend even small amounts of money sponsoring upstream reviewers, so even if they can find the time to contribute the fix, chances are high that it will remain stuck in review unless they are personally well-integrated members of the LibBar development team already.
even assuming there’s zero pressure whatsoever to avoid open sourcing the upstream changes, there’s still the fact inherent to any development team that FooApp’s developers will be more familiar with FooApp’s codebase and development processes than they are with LibBar’s. It’s just easier to work there, even if all other things are equal.
systems for tracking risk from open source dependencies often lack visibility into vendoring, particularly if you’re doing a hybrid approach and only vendoring a few things to address work in progress, rather than a comprehensive and disciplined approach to a monorepo. If you fully absorb a vendored dependency and then modify it, Dependabot isn’t going to tell you that a new version is available any more, because it won’t be present in your dependency list. Organizationally this is bad of course but from the perspective of an individual developer this manifests mostly as fewer annoying emails.

But there are problems on the open source side as well. Those problems are all derived from one big issue: because we’re often working with relatively small sums of money, it’s hard for upstream open source developers to consume either money or patches from application developers. It’s nice to say that you should contribute money to your dependencies, and you absolutely should, but the cost-benefit function is discontinuous. Before a project reaches the fiscal threshold where it can be at least one person’s full-time job to worry about this stuff, there’s often no-one responsible in the first place. Developers will therefore gravitate to the issues that are either fun, or relevant to their own job.

These mutually-reinforcing incentive structures are a big reason that users of open source infrastructure, even teams who work at corporate users with zillions of dollars, don’t reliably contribute back.

The Answer We Want

All those options are bad. If we had a good option, what would it look like?

It is both practically necessary³ and morally required⁴ for you to have a way to temporarily rely on a modified version of an open source dependency, without permanently diverging.

Below, I will describe a desirable abstract workflow for achieving this goal.

Step 0: Report the Problem

Before you get started with any of these other steps, write up a clear description of the problem and report it to the project as an issue; specifically, in contrast to writing it up as a pull request. Describe the problem before submitting a solution.

You may not be able to wait for a volunteer-run open source project to respond to your request, but you should at least tell the project what you’re planning on doing.

If you don’t hear back from them at all, you will have at least made sure to comprehensively describe your issue and strategy beforehand, which will provide some clarity and focus to your changes.

If you do hear back from them, in the worst case scenario, you may discover that a hard fork will be necessary because they don’t consider your issue valid, but even that information will save you time, if you know it before you get started. In the best case, you may get a reply from the project telling you that you’ve misunderstood its functionality and that there is already a configuration parameter or usage pattern that will resolve your problems with no new code. But in all cases, you will benefit from early coordination on what needs fixing before you get to how to fix it.

Step 1: Source Code and CI Setup

Fork the source code for your upstream dependency to a writable location where it can live at least for the duration of this one bug-fix, and possibly for the duration of your application’s use of the dependency. After all, you might want to fix more than one bug in LibBar.

You want to have a place where you can put your edits, that will be version controlled and code reviewed according to your normal development process. This probably means you’ll need to have your own main branch that diverges from your upstream’s main branch.

Remember: you’re going to need to deploy this to your production, so testing gates that your upstream only applies to final releases of LibBar will need to be applied to every commit here.

Depending on your LibBar’s own development process, this may result in slightly unusual configurations where, for example, your fixes are written against the last LibBar release tag, rather than its current⁵ main; if the project has a branch-freshness requirement, you might need two branches, one for your upstream PR (based on main) and one for your own use (based on the release branch with your changes).

Ideally for projects with really good CI and a strong “keep main release-ready at all times” policy, you can deploy straight from a development branch, but it’s good to take a moment to consider this before you get started. It’s usually easier to rebase changes from an older HEAD onto a newer one than it is to go backwards.

Speaking of CI, you will want to have your own CI system. The fact that GitHub Actions has become a de-facto lingua franca of continuous integration means that this step may be quite simple, and your forked repo can just run its own instance.

Optional Bonus Step 1a: Artifact Management

If you have an in-house artifact repository, you should set that up for your dependency too, and upload your own build artifacts to it. You can often treat your modified dependency as an extension of your own source tree and install from a GitHub URL, but if you’ve already gone to the trouble of having an in-house package repository, you can pretend you’ve taken over maintenance of the upstream package temporarily (which you kind of have) and leverage those workflows for caching and build-time savings as you would with any other internal repo.

Step 2: Do The Fix

Now that you’ve got somewhere to edit LibBar’s code, you will want to actually fix the bug.

Step 2a: Local Filesystem Setup

Before you have a production version on your own deployed branch, you’ll want to test locally, which means having both repositories in a single integrated development environment.

At this point, you will want to have a local filesystem reference to your LibBar dependency, so that you can make real-time edits, without going through a slow cycle of pushing to a branch in your LibBar fork, pushing to a FooApp branch, and waiting for all of CI to run on both.

This is useful in both directions: as you prepare the FooApp branch that makes any necessary updates on that end, you’ll want to make sure that FooApp can exercise the LibBar fix in any integration tests. As you work on the LibBar fix itself, you’ll also want to be able to use FooApp to exercise the code and see if you’ve missed anything - and this, you wouldn’t get in CI, since LibBar can’t depend on FooApp itself.

In short, you want to be able to treat both projects as an integrated development environment, with support from your usual testing and debugging tools, just as much as you want your deployment output to be an integrated artifact.

Step 2b: Branch Setup for PR

However, for continuous integration to work, you will also need to have a remote resource reference of some kind from FooApp’s branch to LibBar. You will need 2 pull requests: the first to land your LibBar changes to your internal LibBar fork and make sure it’s passing its own tests, and then a second PR to switch your LibBar dependency from the public repository to your internal fork.

At this step it is very important to ensure that there is an issue filed on your own internal backlog to drop your LibBar fork. You do not want to lose track of this work; it is technical debt that must be addressed.

Until it’s addressed, automated tools like Dependabot will not be able to apply security updates to LibBar for you; you’re going to need to manually integrate every upstream change. This type of work is itself very easy to drop or lose track of, so you might just end up stuck on a vulnerable version.

Step 3: Deploy Internally

Now that you’re confident that the fix will work, and that your temporarily-internally-maintained version of LibBar isn’t going to break anything on your site, it’s time to deploy.

Some deployment heritage should help to provide some evidence that your fix is ready to land in LibBar, but at the next step, please remember that your production environment isn’t necessarily emblematic of that of all LibBar users.

Step 4: Propose Externally

You’ve got the fix, you’ve tested the fix, you’ve got the fix in your own production, you’ve told upstream you want to send them some changes. Now, it’s time to make the pull request.

You’re likely going to get some feedback on the PR, even if you think it’s already ready to go; as I said, despite having been proven in your production environment, you may get feedback about additional concerns from other users that you’ll need to address before LibBar’s maintainers can land it.

As you process the feedback, make sure that each new iteration of your branch gets re-deployed to your own production. It would be a huge bummer to go through all this trouble, and then end up unable to deploy the next publicly released version of LibBar within FooApp because you forgot to test that your responses to feedback still worked on your own environment.

Step 4a: Hurry Up And Wait

If you’re lucky, upstream will land your changes to LibBar. But, there’s still no release version available. Here, you’ll have to stay in a holding pattern until upstream can finalize the release on their end.

Depending on some particulars, it might make sense at this point to archive your internal LibBar repository and move your pinned release version to a git hash of the LibBar version where your fix landed, in their repository.

Before you do this, check in with the LibBar core team and make sure that they understand that’s what you’re doing and they don’t have any wacky workflows which may involve rebasing or eliding that commit as part of their release process.

Step 5: Unwind Everything

Finally, you eventually want to stop carrying any patches and move back to an official released version that integrates your fix.

You want to do this because this is what the upstream will expect when you are reporting bugs. Part of the benefit of using open source is benefiting from the collective work to do bug-fixes and such, so you don’t want to be stuck off on a pinned git hash that the developers do not support for anyone else.

As I said in step 2b⁶, make sure to maintain a tracking task for doing this work, because leaving this sort of relatively easy-to-clean-up technical debt lying around is something that can potentially create a lot of aggravation for no particular benefit. Make sure to put your internal LibBar repository into an appropriate state at this point as well.

Up Next

This is part 1 of a 2-part series. In part 2, I will explore in depth how to execute this workflow specifically for Python packages, using some popular tools. I’ll discuss my own workflow, standards like PEP 517 and pyproject.toml, and of course, by the popular demand that I just know will come, uv.

Acknowledgments

if you already have all the tooling associated with a monorepo, including the ability to manage divergence and reintegrate patches with upstream, you already have the higher-overhead version of the workflow I am going to propose, so, never mind. but chances are you don’t have that, very few companies do. ↩
In any business where one must wrangle with Legal, 3 hours is a wildly optimistic estimate. ↩
c.f. @mcc@mastodon.social ↩
c.f. @geofft@mastodon.social ↩
In an ideal world every project would keep its main branch ready to release at all times, no matter what but we do not live in an ideal world. ↩
In this case, there is no question. It’s 2b only, no not-2b. ↩

Software Needs To Be More Expensive

Software, like coffee, is too artificially cheap, and we need to make it more expensive. I have one suggestion for how to do that.

software supported open-source Saturday March 30, 2024

The Cost of Coffee

One of the ideas that James Hoffmann — probably the most influential… influencer in the coffee industry — works hard to popularize is that “coffee needs to be more expensive”.

The coffee industry is famously exploitative. Despite relatively thin margins for independent café owners¹, there are no shortage of horrific stories about labor exploitation and even slavery in the coffee supply chain.

To summarize a point that Mr. Hoffman has made over a quite long series of videos and interviews², some of this can be fixed by regulatory efforts. Enforcement of supply chain policies both by manufacturers and governments can help spot and avoid this type of exploitation. Some of it can be fixed by discernment on the part of consumers. You can try to buy fair-trade coffee, avoid brands that you know have problematic supply-chain histories.

Ultimately, though, even if there is perfect, universal, zero-cost enforcement of supply chain integrity… consumers still have to be willing to, you know, pay more for the coffee. It costs more to pay wages than to have slaves.

The Price of Software

The problem with the coffee supply chain deserves your attention in its own right. I don’t mean to claim that the problems of open source maintainers are as severe as those of literal child slaves. But the principle is the same.

Every tech company uses huge amounts of open source software, which they get for free.

I do not want to argue that this is straightforwardly exploitation. There is a complex bargain here for the open source maintainers: if you create open source software, you can get a job more easily. If you create open source infrastructure, you can make choices about the architecture of your projects which are more long-term sustainable from a technology perspective, but would be harder to justify on a shorter-term commercial development schedule. You can collaborate with a wider group across the industry. You can build your personal brand.

But, in light of the recent xz Utils / SSH backdoor scandal, it is clear that while the bargain may not be entirely one-sided, it is not symmetrical, and significant bad consequences may result, both for the maintainers themselves and for society.

To fix this problem, open source software needs to get more expensive.

A big part of the appeal of open source is its implicit permission structure, which derives both from its zero up-front cost and its zero marginal cost.

The zero up-front cost means that you can just get it to try it out. In many companies, individual software developers do not have the authority to write a purchase order, or even a corporate credit card for small expenses.

If you are a software engineer and you need a new development tool or a new library that you want to purchase for work, it can be a maze of bureaucratic confusion in order to get that approved. It might be possible, but you are likely to get strange looks, and someone, probably your manager, is quite likely to say “isn’t there a free option for this?” At worst, it might just be impossible.

This makes sense. Dealing with purchase orders and reimbursement requests is annoying, and it only feels worth the overhead if you’re dealing with a large enough block of functionality that it is worth it for an entire team, or better yet an org, to adopt. This means that most of the purchasing is done by management types or “architects”, who are empowered to make decisions for larger groups.

When individual engineers need to solve a problem, they look at open source libraries and tools specifically because it’s quick and easy to incorporate them in a pull request, where a proprietary solution might be tedious and expensive.

That’s assuming that a proprietary solution to your problem even exists. In the infrastructure sector of the software economy, free options from your operating system provider (Apple, Microsoft, maybe Amazon if you’re in the cloud) and open source developers, small commercial options have been marginalized or outright destroyed by zero-cost options, for this reason.

If the zero up-front cost is a paperwork-reduction benefit, then the zero marginal cost is almost a requirement. One of the perennial complaints of open source maintainers is that companies take our stuff, build it into a product, and then make a zillion dollars and give us nothing. It seems fair that they’d give us some kind of royalty, right? Some tiny fraction of that windfall? But once you realize that individual developers don’t have the authority to put $50 on a corporate card to buy a tool, they super don’t have the authority to make a technical decision that encumbers the intellectual property of their entire core product to give some fraction of the company’s revenue away to a third party. Structurally, there’s no way that this will ever happen.

Despite these impediments, keeping those dependencies maintained does cost money.

Some Solutions Already Exist

There are various official channels developing to help support the maintenance of critical infrastructure. If you work at a big company, you should probably have a corporate Tidelift subscription. Maybe ask your employer about that.

But, as they will readily admit there are a LOT of projects that even Tidelift cannot cover, with no official commercial support, and no practical way to offer it in the short term. Individual maintainers, like yours truly, trying to figure out how to maintain their projects, either by making a living from them or incorporating them into our jobs somehow. People with a Ko-Fi or a Patreon, or maybe just an Amazon wish-list to let you say “thanks” for occasional maintenance work.

Most importantly, there’s no path for them to transition to actually making a living from their maintenance work. For most maintainers, Tidelift pays a sub-hobbyist amount of money, and even setting it up (and GitHub Sponsors, etc) is a huge hassle. So even making the transition from “no income” to “a little bit of side-hustle income” may be prohibitively bureaucratic.

Let’s take myself as an example. If you’re a developer who is nominally profiting from my infrastructure work in your own career, there is a very strong chance that you are also a contributor to the open source commons, and perhaps you’ve even contributed more to that commons than I have, contributed more to my own career success than I have to yours. I can ask you to pay me³, but really you shouldn’t be paying me, your employer should.

What To Do Now: Make It Easy To Just Pay Money

So if we just need to give open source maintainers more money, and it’s really the employers who ought to be giving it, then what can we do?

Let’s not make it complicated. Employers should just give maintainers money. Let’s call it the “JGMM” benefit.

Specifically, every employer of software engineers should immediately institute the following benefits program: each software engineer should have a monthly discretionary budget of $50 to distribute to whatever open source dependency developers they want, in whatever way they see fit. Venmo, Patreon, PayPal, Kickstarter, GitHub Sponsors, whatever, it doesn’t matter. Put it on a corp card, put the project name on the line item, and forget about it. It’s only for open source maintenance, but it’s a small enough amount that you don’t need intense levels of approval-gating process. You can do it on the honor system.

This preserves zero up-front cost. To start using a dependency, you still just use it⁴. It also preserves zero marginal cost: your developers choose which dependencies to support based on perceived need and popularity. It’s a fixed overhead which doesn’t scale with revenue or with profit, just headcount.

Because the whole point here is to match the easy, implicit, no-process, no-controls way in which dependencies can be added in most companies. It should be easier to pay these small tips than it is to use the software in the first place.

This sub-1% overhead to your staffing costs will massively de-risk the open source projects you use. By leaving the discretion up to your engineers, you will end up supporting those projects which are really struggling and which your executives won’t even hear about until they end up on the news. Some of it will go to projects that you don’t use, things that your engineers find fascinating and want to use one day but don’t yet depend upon, but that’s fine too. Consider it an extremely cheap, efficient R&D expense.

A lot of the options for developers to support open source infrastructure are already tax-deductible, so if they contribute to something like one of the PSF’s fiscal sponsorees, it’s probably even more tax-advantaged than a regular business expense.

I also strongly suspect that if you’re one of the first employers to do this, you can get a round of really positive PR out of the tech press, and attract engineers, so, the race is on. I don’t really count as the “tech press” but nevertheless drop me a line to let me know if your company ends up doing this so I can shout you out.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics such as “How do I figure out which open source projects to give money to?”.

I don’t have time to get into the margins for Starbucks and friends, their relationship with labor, economies of scale, etc. ↩
While this is a theme that pervades much of his work, the only place I can easily find where he says it in so many words is on a podcast that sometimes also promotes right-wing weirdos and pseudo-scientific quacks spreading misinformation about autism and ADHD. So, I obviously don’t want to link to them; you’ll have to take my word for it. ↩
and I will, since as I just recently wrote about, I need to make sure that people are at least aware of the option ↩
Pending whatever legal approval program you have in place to vet the license. You do have a nice streamlined legal approvals process, right? You’re not just putting WTFPL software into production, are you? ↩

The Hat

If you want me to keep doing… whatever this is… now’s the time to support it!

patreon open-source blogging supported Friday March 29, 2024

This year I will be going to two conferences: PyCon 2024, and North Bay Python 2024.

At PyCon, I will be promoting my open source work and my writing on this blog. As I’m not giving a talk this year, I am looking forward to organizing and participating in some open spaces about topics like software architecture, open source sustainability, framework interoperability, and event-driven programming.

At North Bay Python, though, I will either be:

doing a lot more of that, or
looking for new full-time employment, pausing the patreon, and winding down this experiment.

If you’d like to see me doing the former, now would be a great time to sign up to my Patreon to support the continuation of my independent open source work and writing.

The Bad News

It has been nearly a year since I first mentioned that I have a Patreon on this blog. That year has been a busy one, with consulting work and personal stuff consuming more than enough time that I have never been full time on independent coding & blogging. As such, I’ve focused more on small infrastructure projects and less on user-facing apps than I’d like, but I have spent the plurality of my time on it.

For that to continue, let alone increase, this work needs to—at the very least—pay for health insurance. At my current consulting rate, a conservative estimate based on some time tracking is that I am currently doing this work at something like a 98.5% discount. I do love doing it! I would be happy to continue doing it at a significant discount! But “significant” and “catastrophic” are different thresholds.

As I have said previously, this is an appeal to support my independent work; not to support me. I will be fine; what you will be voting for with your wallet is not my well-being but a choice about where I spend my time.

Hiding The Hat

When people ask me what I do these days, I sometimes struggle to explain. It is confusing. I might say I have a Patreon, I do a combination of independent work and consulting, or if I’m feeling particularly sarcastic I might say I’m an ✨influencer✨. But recently I saw the very inspiring Matt Ricardo describing the way he thinks about his own Patreon, and I realized what I am actually trying to do, which is software development busking.

Previously, when I’ve been thinking about making this “okay, it’s been a year of Patreon, let’s get serious now” post, I’ve been thinking about adding more reward products to my Patreon, trying to give people better value for their money before asking for anything more, trying to finish more projects to make a better sales pitch, maybe making merch available for sale, and so on. So aside from irregular weekly posts on Friday and acknowledgments sections at the bottom of blog posts, I’ve avoided mentioning this while I think about adding more private rewards.

But busking is a public performance, and if you want to support my work then it is that public aspect that you presumably want to support. And thus, an important part of the busking process is to actually pass the hat at the end. The people who don’t chip in still get to see the performance, but everyone else need to know that they can contribute if they liked it.¹

I’m going to try to stop hiding the metaphorical hat. I still don’t want to overdo it, but I will trust that you’ll tell me if these reminders get annoying. For my part today, in addition to this post, I’m opening up a new $10 tier on Patreon for people who want to provide a higher level of support, and officially acknowledging the rewards that I already provide.

What’s The Deal?

So, what would you be supporting?

What You Give (The Public Part)

I have tended to focus on my software, and there has been a lot of it! You’d be supporting me writing libraries and applications and build infrastructure to help others do the same with Python, as well as maintaining existing libraries (like the Twisted ecosystem libraries) sometimes. If I can get enough support together to more than bare support for myself, I’d really like to be able to do things like pay people to others to help with aspects of applications that I would struggle to complete myself, like accessibility or security audits.
I also do quite a bit of writing though, about software and about other things. To make it easier to see what sort of writing I’m talking about, I’ve collected just the stuff that I’ve written during the period where I have had some patrons, under the supported tag.
Per my earlier sarcastic comment about being an “influencer”, I also do quite a bit of posting on Mastodon about software and the tech industry.

What You Get (Just For Patrons)

As I said above, I will be keeping member benefits somewhat minimal.

I will add you to SponCom so that your name will be embedded in commit messages like this one on a cadence appropriate to your support level.
You will get access to my private Patreon posts where I discuss what I’ve been working on. As one of my existing patrons put it:

I figure I’m getting pretty good return on it, getting not only targeted project tracking, but also a peek inside your head when it comes to Sores Business Development. Maybe some of that stuff will rub off on me :)
This is a somewhat vague and noncommittal benefit, but it might be the best one: if you are interested in the various different projects that I am doing, you can help me prioritize! I have a lot of things going on. What would you prefer that I focus on? You can make suggestions in the comments of Patreon posts, which I pay a lot more attention to than other feedback channels.
In the near future² I am also planning to start doing some “office hours” type live-streaming, where I will take audience questions and discuss software design topics, or maybe do some live development to showcase my process and the tools I use. When I figure out the mechanics of this, I plan to add some rewards to the existing tiers to select topics or problems for me to work on there.

If that sounds like a good deal to you, please sign up now. If you’re already supporting me, sharing this and giving a brief testimonial of something good I’ve done would be really helpful. Github is not an algorithmic platform like YouTube, despite my occasional jokey “remember to like and subscribe”, nobody is getting recommended DBXS, or Fritter, or Twisted, or Automat, or this blog unless someone goes out and shares it.

A year into this, after what feels like endlessly repeating this sales pitch to the point of obnoxiousness, I still routinely interact with people who do not realize that I have a Patreon at all. ↩
Not quite comfortable putting this on the official patreon itemized inventory of rewards yet, but I do plan to add it once I’ve managed to stream for a couple of weeks in a row. ↩

Telemetry Is Not Your Enemy

Not all data collection is the same, and not all of it is bad.

politics privacy open-source supported Sunday March 26, 2023

Part 1: A Tale of Two Metaphors

In software development “telemetry” is data collected from users of the software, almost always delivered to the authors of the software via the Internet.

In recent years, there has been a great deal of angry public discourse about telemetry. In particular, there is a lot of concern that every software vendor and network service operator collecting any data at all is spying on its users, surveilling every aspect of our lives. The media narrative has been that any tech company collecting data for any purpose is acting creepy as hell.

I am quite sympathetic to this view. In general, some concern about privacy is warranted whenever some new data-collection scheme is proposed. However it seems to me that the default response is no longer “concern and skepticism”; but rather “panic and fury”. All telemetry is seen as snooping and all snooping is seen as evil.

There’s a sense in which software telemetry is like surveillance. However, it is only like surveillance. Surveillance is a metaphor, not a description. It is far from a perfect metaphor.

In the discourse around user privacy, I feel like we have lost a lot of nuance about the specific details of telemetry when some people dismiss all telemetry as snooping, spying, or surveillance.

Here are some ways in which software telemetry is not like “snooping”:

The data may be aggregated. The people consuming the results of telemetry are rarely looking at individual records, and individual records may not even exist in some cases. There are tools, like Prio, to do this aggregation to be as privacy-sensitive as possible.
The data is rarely looked at by human beings. In the cases (such as ad-targeting) where the data is highly individuated, both the input (your activity) and the output (your recommendations) are both mainly consumed by you, in your experience of a product, by way of algorithms acting upon the data, not by an employee of the company you’re interacting with.¹
The data is highly specific. “Here’s a record with your account ID and the number of times you clicked the Add To Cart button without checking out” is not remotely the same class of information as “Here’s several hours of video and audio, attached to your full name, recorded without your knowledge or consent”. Emotional appeals calling any data “surveillance” tend to suggest that all collected data is the latter, where in reality most of it is much closer to the former.

There are other metaphors which can be used to understand software telemetry. For example, there is also a sense in which it is like voting.

I emphasize that voting is also a metaphor here, not a description. I will also freely admit that it is in many ways a worse metaphor for telemetry than “surveillance”. But it can illuminate other aspects of telemetry, the ones that the surveillance metaphor leaves out.

Data-collection is like voting because the data can represent your interests to a party that has some power over you. Your software vendor has the power to change your software, and you probably don’t, either because you don’t have access to the source code. Even if it’s open source, you almost certainly don’t have the resources to take over its maintenance.

For example, let’s consider this paragraph from some Microsoft documentation about telemetry:

We also use the insights to drive improvements and intelligence into some of our management and monitoring solutions. This improvement helps customers diagnose quality issues and save money by making fewer support calls to Microsoft.

“Examples of how Microsoft uses the telemetry data” from the Azure SDK documentation

What Microsoft is saying here is that they’re collecting the data for your own benefit. They’re not attempting to justify it on the basis that defenders of law-enforcement wiretap schemes might. Those who want literal mass surveillance tend to justify it by conceding that it might hurt individuals a little bit to be spied upon, but if we spy on everyone surely we can find the bad people and stop them from doing bad things. That’s best for society.

But Microsoft isn’t saying that.² What Microsoft is saying here is that if you’re experiencing a problem, they want to know about it so they can fix it and make the experience better for you.

I think that is at least partially true.

Part 2: I Qualify My Claims Extensively So You Jackals Don’t Lose Your Damn Minds On The Orange Website

I was inspired to write this post due to the recent discussions in the Go community about how to collect telemetry which provoked a lot of vitriol from people viscerally reacting to any telemetry as invasive surveillance. I will therefore heavily qualify what I’ve said above to try to address some of that emotional reaction in advance.

I am not suggesting that we must take Microsoft (or indeed, the Golang team) fully at their word here. Trillion dollar corporations will always deserve skepticism. I will concede in advance that it’s possible the data is put to other uses as well, possibly to maximize profits at the expense of users. But it seems reasonable to assume that this is at least partially true; it’s not like Microsoft wants Azure to be bad.

I can speak from personal experience. I’ve been in professional conversations around telemetry. When I have, my and my teams’ motivations were overwhelmingly focused on straightforwardly making the user experience good. We wanted it to be good so that they would like our products and buy more of them.

It’s hard enough to do that without nefarious ulterior motives. Most of the people who develop your software just don’t have the resources it takes to be evil about this.

Part 3: They Can’t Help You If They Can’t See You

With those qualifications out of the way, I will proceed with these axioms:

The developers of software will make changes to it.
These changes will benefit some users.
Which changes the developers select will be derived, at least in part, from the information that they have.
At least part of the information that the developers have is derived from the telemetry they collect.

If we can agree that those axioms are reasonable, then let us imagine two user populations:

Population A is privacy-sensitive and therefore sees telemetry as bad, and opts out of everything they possibly can.
Population B doesn’t care about privacy, and therefore ignores any telemetry and blithely clicks through any opt-in.

When the developer goes to make changes, they will have more information about Population B. Even if they’re vaguely aware that some users are opting out (or refusing to opt in), the developer will know far less about Population A. This means that any changes the developer makes will not serve the needs of their privacy-conscious users, which means fewer features that respect privacy as time goes on.

Part 4: Free as in Fact-Free Guesses

In the world of open source software, this problem is even worse. We often have fewer resources with which to collect and analyze telemetry in the first place, and when we do attempt to collect it, a vocal minority among those users are openly hostile, with feedback that borders on harassment. So we often have no telemetry at all, and are making changes based on guesses.

Meanwhile, in proprietary software, the user population is far larger and less engaged. Developers are not exposed directly to users and therefore cannot be harassed or intimidated into dropping their telemetry. Which means that proprietary software gains a huge advantage: they can know what most of their users want, make changes to accommodate it, and can therefore make a product better than the one based on uninformed guesses from the open source competition.

Proprietary software generally starts out with a panoply of advantages already — most of which boil down to “money” — but our collective knee-jerk reaction to any attempt to collect telemetry is a massive and continuing own-goal on the part of the FLOSS community. There’s no inherent reason why free software’s design cannot be based on good data, but our community’s history and self-selection biases make us less willing to consider it.

That does not mean we need to accept invasive data collection that is more like surveillance. We do not need to allow for stockpiled personally-identifiable information about individual users that lives forever. The abuses of indiscriminate tech data collection are real, and I am not suggesting that we forget about them.

The process for collecting telemetry must be open and transparent, the data collected needs to be continuously vetted for safety. Clear data-retention policies should always be in place to avoid future unanticipated misuses of data that is thought to be safe today but may be de-anonymized or otherwise abused in the future.

I want the collaborative feedback process of open source development to result in this kind of telemetry: thoughtful, respectful of user privacy, and designed with the principle of least privilege in mind. If we have this kind of process, then we could hold it up as an example for proprietary developers to follow, and possibly improve the industry at large.

But in order to be able to produce that example, we must produce criticism of telemetry efforts that is specific, grounded in actual risks and harms to users, rather than a series of emotional appeals to slippery-slope arguments that do not correspond to the actual data being collected. We must arrive at a consensus that there are benefits to users in allowing software engineers to have enough information to do their jobs, and telemetry is not uniformly bad. We cannot allow a few users who are complaining to stop these efforts for everyone.

After all, when those proprietary developers look at the hard data that they have about what their users want and need, it’s clear that those who are complaining don’t even exist.

Please note that I’m not saying that this automatically makes such collection ethical. Attempting to modify user behavior or conduct un-reviewed psychological experiments on your customers is also wrong. But it’s wrong in a way that is somewhat different than simply spying on them. ↩
I am not suggesting that data collected for the purposes of improving the users’ experience could not be used against their interest, whether by law enforcement or by cybercriminals or by Microsoft itself. Only that that’s not what the goal is here. ↩