Adversarial Communication

“AI” turns every conversation into a fight, because fighting is what they are good at.

ai llm programming Tuesday June 23, 2026

As I have discussed in previous posts, “AIs” can make mistakes. In fact, they do make mistakes, and their mistake-making patterns are such that where and how they will make mistakes is both uncertain and constantly changing.

Thus, in any scenario where you want to attempt to make “productive” use of “AI”, you must have a system in place for checking every result. Not checking some results; checking every result. If each result might have a consequence for you (and if it didn’t have a consequence, why bother automating it?) and you cannot predict in advance which kinds of results will need verification, then verification is always required.

The verification often ends up being just as expensive as doing the work in the first place, which means that if you want your usage of “AI” to be personally profitable, you have to find someone else to externalize the cost of verification onto. This person becomes your adversary, and, if you are successful, your “AI’s” victim.

The Ladder-Climber And Their Reverse-Centaur Rungs

One way that this constellation of facts can straightforwardly assemble themselves into a dystopian nightmare is the phenomenon, described by Cory Doctorow, of the reverse centaur. This is when your employer non-consensually turns you into the verification system. The “AI” does the fun part of initially performing the work, and then you do the boring part where you check if the robot is right and clean up its messes, even if everyone already knows that it would, in aggregate, be cheaper for you to do the work in the first place.

Reverse centaurs can be made from any automation, not only “AI” automation. I think that there is a reason that this term happens to have emerged in the “age of AI”, though, and not with earlier automation technologies (even those which were considerably more viscerally horrific). That reason is: the wrongness of “AI” output is not merely a technical feature that must be compensated for, it is a generalized externality.

As I mentioned above, if you are responsible for the entirety of the work, both extruding the “AI” output and checking it, it’s usually cheaper to have humans do the entirety of the work to begin with. When humans do the writing directly, we can check as we go, and thus verification doesn’t need to be as comprehensive.

When “AI” coding advocates say “code review is the bottleneck”, what they are observing is that the LLM is still rolling the dice for each PR, and a human is still necessary to verify that each of those rolls is a winner. But calling this process “code review” is a bit of a misnomer; it’s not really “code review” in the traditional sense, it’s human understanding.

Before the advent of “AI”, the human understanding was implicit in the process of writing the code in the first place¹, and the code review was a way of diffusing and extending that understanding. Now that the code can be authored with no initial understanding taking place, that cost has not gone away, it has moved.

Human understanding was always the bottleneck.

However, this is taking a collaborative view of a software project, where satisfying the needs and solving the problems of your customers are the goals. We can see that “AI” is a bad tool to satisfy those goals, because all it’s doing is converting the first half of the work, that of understanding the code as you write it, to understanding the agent’s output as you read it.

What if, instead, we were to take the view that every software company is a Hobbesian nightmare, red in tooth and claw? In this view, the only goal of a software project is for the individual developers to make their promo cycles and get their bonuses. Given that there is only a certain amount of money to go around, this is a zero-sum game where each programmer wants to look more productive than their colleagues.

Pretty much every organization finds it easy to reward “productivity” as expressed by lines of code emitted, but the benefits of doing thorough and thoughtful design, analysis, and code review very difficult to reward. In this world, an LLM is an invaluable tool for the sociopathic ladder-climber, particularly if your legacy organization is still structuring their workflows as if the person prompting the bot is “writing” the code, and then they get to foist off the act of “reviewing” the code onto someone else.

Here, the prompter effectively externalizes the cost of the LLM’s failures but internalizes any benefits. The prompter will vibe-code a big feature, so large that the assigned reviewer can’t possibly comprehend it all effectively. When this happens, the reviewer will, eventually, be pressured to approve it, even if they can try to spot a few problems along the way. The reviewer has their own work to get back to, after all, the obligation to review the prompter’s (read: the bot’s) code is a drain on their time that they are not going to get rewarded for.

If this feature is a big success, the prompter gets a promotion. If it causes a big issue, well, the reviewer must not have been careful enough.

This is why LLMs are “good for coding”, and also why their biggest promoters keep having outages.

The Generative Gish Galloper

Coding is the biggest “success story” of this type of adversarial communication, but it is by far not the only instance of such a thing. LLMs create a new form of leverage that can turn Brandolini’s law from a linear advantage into an exponential one. If you are engaged in a political debate where you want to overwhelm the other side in nonsense, an LLM can generate bullshit faster than it is physically possible for a human being to type, let alone respond thoughtfully. There is an asymmetry to the utility of this weapon as well: only one side of the political spectrum wants to flood the zone and destroy trust in institutions and the concept of truth. There’s a good reason that the fascists love it.

Straightforward Spam and Fraud

This is kind of obvious, but LLMs can generate lightly-customized, plausible-looking text much more quickly than any human being. This facilitates their use in fraud, spam, and scams. In a spamming or fraudulent interaction, once again, the costs are externalized onto the victim: the recipient of a spam message has to do all the work of “checking” the LLM’s output. Spammers already expect very low hit rates from boilerplate, and if the LLM can increase those percentages from 1% to 5% the technology will pay for itself; they don’t need anything like reliable accuracy.

Customer “Support”

If you have any kind of commercial relationship with a company, I probably don’t even need to mention this: customer “support” bots are a misery. Everybody knows it at this point. But customer support is usually conceptualized by businesses as an adversarial interaction, because it is a cost center. They maintain internal metrics on time-to-resolution and try to optimize them. Implicitly, this creates a dynamic where the goal of the customer service agent’s job is not to solve your problem, but to emit noise that will cause you to think your problem is resolved, or to give up, as fast as possible. Unsurprisingly, LLMs can emit this noise faster than humans can, getting those customers off the phone. But those customers will remember those interactions, and the story outside the TTR metrics is horrible.

Similarly to the situation in software development, LLMs can look very good on paper for customer support, but mostly what they are doing is illuminating the problems with the industry’s existing metrics, by turning “winning the metrics battle against the customer” into a more obvious and immediate defeat for the company’s long term reputation.

“Education”

In 2026 it is sadly a fact of life that students cheat all the time using “AI”, and that this cheating is very successful, in that the teachers find it very hard to detect.

LLMs are great for cheating on schoolwork because the student is externalizing the work of the checking onto the teachers, who are often starting at a disadvantage to begin with, at least in the US.

My view is that this is happening because of a divergence in the way that students vs. teachers (or, more accurately, “the broader educational system”) view grading.

When a student is asked to write an essay, the teachers see the effort as both intrinsically worthwhile for the student, as well as useful as a pedagogical tool to evaluate and react to the student’s progress. The student, by contrast, sees a stumbling block designed to knock them off the path to success and into a permanent underclass. It is no wonder that the student sees “AI” as useful to their own goals and has no compunction about deploying it.

There is a bitter irony that the ability to understand the inherent value of actually writing the essay on their own is the sort of thing that students can really only learn by writing a bunch of essays. There’s no way that I can think of which makes the benefit legible as long as a shortcut is available.

The net effect here is a downward spiral, where the already-wobbling educational system is sustaining an attack that it doesn’t have the resources to recover from. The individual students’ attacks against their teachers and their schools’ grading systems might appear to momentarily succeed, but they will win the battle and lose the war.

Spamming “For Good”?

Usually when we talk about someone unilaterally choosing to enter into an adversarial relationship, that’s an “attack” and for good reasons we have a negative impression of the attacker. However, I would be remiss if I did not point out that there are some cases where the relationship was already adversarial; just because you’re the attacker doesn’t mean that you are evil.

For example we might imagine use-cases like automatically filing appeals for prior authorizations against health insurance. It’s relatively well-known at this point that the main way for-profit insurers maintain their margins is by denying claims right up to the line of the policies themselves being fraud, so using a spamming tool to fight them might be entirely justifiable² in that case.

Similarly, using an LLM could be justified in a fight against a company refusing to honor a warranty. One could imagine using an LLM to immediately generate replies and escalations.

However, even in imagined cases like these, the underlying problem is that the insurers and the vendors already have a tremendous amount of structural power, so it is more likely that they will have the advantage in deploying a communications weapon like an LLM, as well as enacting policies to simply ignore any LLM-based communication that you might submit. Worse, if these strategies were to become widespread, they might provide an excuse to reject any communications by feeding them into an unreliable “LLM detector” and issuing an automated “computer says no” even to hand-written correspondence.

It is also worth stressing that these cases are imagined, as compared to the very real coworker-abuse, spam, scam, fraud, and disinformation campaigns being waged in real life today.

Therefore, while legitimate uses might exist, it’s hard to imagine that there’s anywhere they would be genuinely valuable and sustainable. In the best case “AI” will provide a temporary advantage for underdogs that will provoke an arms race which the resource-advantaged adversaries will win in the long run, in the worst case the arms race itself will cement permanent structural change that will make things worse.

“Search” By Stealing

Most of the adversarial utility of “AI” is on the “write” side, since write-amplification is more obviously aggressive than reading. But the “read” side of LLMs — summarization and question-answering — can be a form of attack as well.

To begin with, the act of reading itself is currently enormously destructive, but that’s arguably not a fundamental aspect of this technology. They could set reasonable rate-limits and respect things like robots.txt, as search engines have for decades now. They could also refrain from committing criminal levels of copyright infringement. But, today, using “AI” tools does suborn this sort of out-of-control crawling.

More insidiously, consider the scenario described in this YouTube video. The LTT Bros decided to try Linux again, and in the course of so doing, they had problems. When trying to solve these problems, they were faced with a choice: they could consult Reddit, or they could ask an LLM. Asking an LLM would “gaslight the heck out of” them, but they still found it preferable, because they would at least get an answer without getting yelled at.

Initially this sounds great. But it also means that you want to extract knowledge from a community, while mechanically eliding any values or norms that the community may want to impart as part of offering that knowledge. As someone who spent many years in a community tech support role, this is worrying. Many requests for support are people asking how to do things that will momentarily solve a superficial problem but create a long-term reliability problem or even an immediate security risk, that the question-asker doesn’t want to hear about. Consider the question “I’m tired of entering my password so much, how do I make it so my laptop unlocks automatically”. An obsequious chatbot will helpfully tell you how to do this without pushback.

But, this is also a sort of ethically murky area. The Linux community is somewhat famously, for many years now, a toxic cesspool of general hostility, misogyny, etc. It is certainly a good thing that people can get access to this knowledge without subjecting themselves to abuse. But it also means that the people with the power and the privilege to change the community for the better can just quietly withdraw, rather than fixing the problems. It also means that the positive elements of culture cannot be transmitted, and people will have no opportunity to learn about unknown unknowns.

In this case, the “adversarial” communication is with society. The thing that using an LLM for search lets you do is withdraw from society and avoid forming any personal connections. There are some personal connections which are painful and annoying, and so that can feel like a momentary balm. But the need to make connections in general is, like, the concept of society itself.

Who Am I Hurting?

LLMs are good at adversarial communication. They are so good at it, relative to their other benefits, that they will tend to make communications adversarial if you are not remaining vigilant about the possibility that it might do so. My request to you, dear reader, if you are going to use such tools, is to always ask yourself, “who might I be hurting, if I use an LLM for this?”

If you’re using an “AI”, who is its adversary? If you haven’t given it one yet, who might the “AI” turn into an adversary? Who might you overwhelm with an asymmetric amount of output, or, if you’re receiving information and not sending it, who are you taking that information from without consulting?

Figure out the answers to these questions and conduct yourself accordingly; the answer might be “yourself”.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!

One of the reasons that software developers tend to prefer greenfield development is that when you are given a blank page, you can project your own specific understanding onto it. You can structure the codebase in a way that works for your brain, down to the variable naming conventions and the module layouts. LLM-assisted development makes everything into instant brownfield work, which makes developers instantly miserable; even those who are excited about the technology will frequently complain about how it feels like their agency has been stolen and their joy in the work has been diminished. But I digress. ↩
Modulo the massive amount of other externalities involved in using LLMs, of course, but I don’t have the time or energy to get into those here. ↩

Opaque Types in Python

A proposed technique for exposing an opaque data structure with idiomatic modern Python.

python programming Thursday May 21, 2026

Let’s say you’re writing a Python library.

In this library, you have some collection of state that represents “options” or “configuration” for a bunch of operations. Such a set of options is a bundle of potentially ever-increasing complexity. Thus, you will want it to have an extremely minimal compatibility surface, with a very carefully chosen public interface, that is either small, or perhaps nothing at all. Such an object conveys state and might have some private behavior, but all you want consumers to be able to do is build it in very constrained, specific ways, and then pass it along as a parameter to your own APIs.

By way of example, imagine that you’re wrapping a library that handles shipping physical packages.

There are a zillion ways to do it ship a package. There are different carriers who can ship it for you. There’s air freight, and ground freight, and sea freight. There’s overnight shipping. There’s the option to require a signature. There’s package tracking and certified mail. Suffice it to say, lots of stuff.

If you are starting out to implement such a library, you might need an object called something like ShippingOptions that encapsulates some of this. At the core of your library you might have a function like this:

async def shipPackage(
        how: ShippingOptions,
        where: Address,
    ) -> ShippingStatus:
    ...

If you are starting out implementing such a library, you know that you’re going to get the initial implementation of ShippingOptions wrong; or, at the very least, if not “wrong”, then “incomplete”. You should not want to commit to an expansive public API with a ton of different attributes until you really understand the problem domain pretty well.

Yet, ShippingOptions is absolutely vital to the rest of your library. You’ll need to construct it and pass it to various methods like estimateShippingCost and shipPackage. So you’re not going to want a ton of complexity and churn as you evolve it to be more complex.

Worse yet, this object has to hold a ton of state. It’s got attributes, maybe even quite complex internal attributes that relate to different shipping services.

Right now, today, you need to add something so you can have “no rush”, “standard” and “expedited” options. You can’t just put off implementing that indefinitely until you can come up with the perfect shape. What to do?

The tool you want here is the opaque data type design pattern. C is lousy with such things (FILE, pthread_*_t, fd_set, etc). A typedef in a header file can easily achieve this.

But in Python, if you expose a dataclass — or any class, really — even if you keep all your fields private, the constructor is still, inherently, public. You can make it raise an exception or something, but your type checker still won’t help your users; it’ll still look like it’s a normal class.

Luckily, Python typing provides a tool for this: typing.NewType.

Let’s review our requirements:

We need a type that our client code can use in its type annotations; it needs to be public.
They need to be able to consruct it somehow, even if they shouldn’t be able to see its attributes or its internal constructor arguments.
To express high-level things (like “ship fast”) that should stay supported as we add more nuanced and complex configurations in the future (like “ship with the fastest possible option provided by the lowest-cost carrier that supports signature verification”).

In order to solve these problems respectively, we will use:

a public NewType, which gives us our public name...
which wraps a private class with entirely private attributes, to give us an actual data structure, while not exposing the constructor,
a set of public constructor functions, which returns our NewType.

When we put that all together, it looks like this:

from dataclasses import dataclass
from typing import Literal, NewType

@dataclass
class _RealShipOpts:
    _speed: Literal["fast", "normal", "slow"]

ShippingOptions = NewType("ShippingOptions", _RealShipOpts)

def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("fast"))

def shipNormal() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("normal"))

def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("slow"))

As a snapshot in time, this is not all that interesting; we could have just exposed _RealShipOpts as a public class and saved ourselves some time. The fact that this exposes a constructor that takes a string is not a big deal for the present moment. For an initial quick and dirty implementation, we can just do checks like if options._speed == "fast" in our shipping and estimation code.

However, the main thing we are doing here is preserving our flexibility to evolve the related APIs into the future, so let’s see how we might do that. For example, let’s allow the shipping options to contain a concrete and specific carrier and freight method:

from dataclasses import dataclass
from enum import Enum, auto
from typing import NewType

class Carrier(Enum):
    FedEx = auto()
    USPS = auto()
    DHL = auto()
    UPS = auto()

class Conveyance(Enum):
    air = auto()
    truck = auto()
    train = auto()

@dataclass
class _RealShipOpts:
    _carrier: Carrier
    _freight: Conveyance

ShippingOptions = NewType("ShippingOptions", _RealShipOpts)

def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.FedEx, Conveyance.air))

def shipNormal() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.UPS, Conveyance.truck))

def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.USPS, Conveyance.train))

def shippingDetailed(
    carrier: Carrier, conveyance: Conveyance
) -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(carrier, conveyance))

As a NewType, our public ShippingOptions type doesn’t have a constructor. Since _RealShipOpts is private, and all its attributes are private, we can completely remove the old versions.

Anything within our shipping library can still access the private variables on ShippingOptions; as a NewType, it’s the same type as its base at runtime, so it presents minimal¹ overhead.

Clients outside our shipping library can still call all of our public constructors: shipFast, shipNormal, and shipSlow all still work with the same (as far as calling code knows) signature and behavior.

If you need to build and convey some state within your public API, while avoiding breakages associated with compatibility churn, hopefully this technique can help you do that!

Acknowledgments

Thanks for reading, and thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor.

The overhead is minimal, but it is not completely zero. The suggested idiom for converting to a NewType is to call it like a function, as I’ve done in these examples, but if you are wanting to use this pattern inside of a hot loop, you can use # type: ignore[return-value] comments to avoid that small cost. ↩

What Is Code Review For?

Code review is not for catching bugs.

programming process ai llm Tuesday March 03, 2026

Humans Are Bad At Perceiving

Humans are not particularly good at catching bugs. For one thing, we get tired easily. There is some science on this, indicating that humans can’t even maintain enough concentration to review more than about 400 lines of code at a time..

We have existing terms of art, in various fields, for the ways in which the human perceptual system fails to register stimuli. Perception fails when humans are distracted, tired, overloaded, or merely improperly engaged.

Each of these has implications for the fundamental limitations of code review as an engineering practice:

Inattentional Blindness: you won’t be able to reliably find bugs that you’re not looking for.
Repetition Blindness: you won’t be able to reliably find bugs that you are looking for, if they keep occurring.
Vigilance Fatigue: you won’t be able to reliably find either kind of bugs, if you have to keep being alert to the presence of bugs all the time.
and, of course, the distinct but related Alert Fatigue: you won’t even be able to reliably evaluate reports of possible bugs, if there are too many false positives.

Never Send A Human To Do A Machine’s Job

When you need to catch a category of error in your code reliably, you will need a deterministic tool to evaluate — and, thanks to our old friend “alert fatigue” above — ideally, to also remedy that type of error. These tools will relieve the need for a human to make the same repetitive checks over and over. None of them are perfect, but:

to catch logical errors, use automated tests.
to catch formatting errors, use autoformatters.
to catch common mistakes, use linters.
to catch common security problems, use a security scanner.

Don’t blame reviewers for missing these things.

Code review should not be how you catch bugs.

What Is Code Review For, Then?

Code review is for three things.

First, code review is for catching process failures. If a reviewer has noticed a few bugs of the same type in code review, that’s a sign that that type of bug is probably getting through review more often than it’s getting caught. Which means it’s time to figure out a way to deploy a tool or a test into CI that will reliably prevent that class of error, without requiring reviewers to be vigilant to it any more.

Second — and this is actually its more important purpose — code review is a tool for acculturation. Even if you already have good tools, good processes, and good documentation, new members of the team won’t necessarily know about those things. Code review is an opportunity for older members of the team to introduce newer ones to existing tools, patterns, or areas of responsibility. If you’re building an observer pattern, you might not realize that the codebase you’re working in already has an existing idiom for doing that, so you wouldn’t even think to search for it, but someone else who has worked more with the code might know about it and help you avoid repetition.

You will notice that I carefully avoided saying “junior” or “senior” in that paragraph. Sometimes the newer team member is actually more senior. But also, the acculturation goes both ways. This is the third thing that code review is for: disrupting your team’s culture and avoiding stagnation. If you have new talent, a fresh perspective can also be an extremely valuable tool for building a healthy culture. If you’re new to a team and trying to build something with an observer pattern, and this codebase has no tools for that, but your last job did, and it used one from an open source library, that is a good thing to point out in a review as well. It’s an opportunity to spot areas for improvement to culture, as much as it is to spot areas for improvement to process.

Thus, code review should be as hierarchically flat as possible. If the goal of code review were to spot bugs, it would make sense to reserve the ability to review code to only the most senior, detail-oriented, rigorous engineers in the organization. But most teams already know that that’s a recipe for brittleness, stagnation and bottlenecks. Thus, even though we know that not everyone on the team will be equally good at spotting bugs, it is very common in most teams to allow anyone past some fairly low minimum seniority bar to do reviews, often as low as “everyone on the team who has finished onboarding”.

Oops, Surprise, This Post Is Actually About LLMs Again

Sigh. I’m as disappointed as you are, but there are no two ways about it: LLM code generators are everywhere now, and we need to talk about how to deal with them. Thus, an important corollary of this understanding that code review is a social activity, is that LLMs are not social actors, thus you cannot rely on code review to inspect their output.

My own personal preference would be to eschew their use entirely, but in the spirit of harm reduction, if you’re going to use LLMs to generate code, you need to remember the ways in which LLMs are not like human beings.

When you relate to a human colleague, you will expect that:

you can make decisions about what to focus on based on their level of experience and areas of expertise to know what problems to focus on; from a late-career colleague you might be looking for bad habits held over from legacy programming languages; from an earlier-career colleague you might be focused more on logical test-coverage gaps,
and, they will learn from repeated interactions so that you can gradually focus less on a specific type of problem once you have seen that they’ve learned how to address it,

With an LLM, by contrast, while errors can certainly be biased a bit by the prompt from the engineer and pre-prompts that might exist in the repository, the types of errors that the LLM will make are somewhat more uniformly distributed across the experience range.

You will still find supposedly extremely sophisticated LLMs making extremely common mistakes, specifically because they are common, and thus appear frequently in the training data.

The LLM also can’t really learn. An intuitive response to this problem is to simply continue adding more and more instructions to its pre-prompt, treating that text file as its “memory”, but that just doesn’t work, and probably never will. The problem — “context rot” is somewhat fundamental to the nature of the technology.

Thus, code-generators must be treated more adversarially than you would a human code review partner. When you notice it making errors, you always have to add tests to a mechanical, deterministic harness that will evaluates the code, because the LLM cannot meaningfully learn from its mistakes outside a very small context window in the way that a human would, so giving it feedback is unhelpful. Asking it to just generate the code again still requires you to review it all again, and as we have previously learned, you, a human, cannot review more than 400 lines at once.

To Sum Up

Code review is a social process, and you should treat it as such. When you’re reviewing code from humans, share knowledge and encouragement as much as you share bugs or unmet technical requirements.

If you must reviewing code from an LLM, strengthen your automated code-quality verification tooling and make sure that its agentic loop will fail on its own when those quality checks fail immediately next time. Do not fall into the trap of appealing to its feelings, knowledge, or experience, because it doesn’t have any of those things.

But for both humans and LLMs, do not fall into the trap of thinking that your code review process is catching your bugs. That’s not its job.

Acknowledgments

How To Argue With Me About AI, If You Must

If you insist we have a conversation, please come prepared.

ai meta Sunday January 04, 2026

As you already know if you’ve read any of this blog in the last few years, I am a somewhat reluctant — but nevertheless quite staunch — critic of LLMs. This means that I have enthusiasts of varying degrees sometimes taking issue with my stance.

It seems that I am not going to get away from discussions, and, let’s be honest, pretty intense arguments about “AI” any time soon. These arguments are starting to make me quite upset. So it might be time to set some rules of engagement.

I’ve written about all of these before at greater length, but this is a short post because it’s not about the technology or making a broader point, it’s about me. These are rules for engaging with me, personally, on this topic. Others are welcome to adopt these rules if they so wish but I am not encouraging anyone to do so.

Thus, I’ve made this post as short as I can so everyone interested in engaging can read the whole thing. If you can’t make it through to the end, then please just follow Rule Zero.

Rule Zero: Maybe Don’t

You are welcome to ignore me. You can think my take is stupid and I can think yours is. We don’t have to get into an Internet Fight about it; we can even remain friends. You do not need to instigate an argument with me at all, if you think that my analysis is so bad that it doesn’t require rebutting.

Rule One: No ‘Just’

As I explained in a post with perhaps the least-predictive title I’ve ever written, “I Think I’m Done Thinking About genAI For Now”, I’ve already heard a bunch of bad arguments. Don’t tell me to ‘just’ use a better model, use an agentic tool, use a more recent version, or use some prompting trick that you personally believe works better. If you skim my work and think that I must not have deeply researched anything or read about it because you don’t like my conclusion, that is wrong.

Rule Two: No ‘Look At This Cool Thing’

Purely as a productivity tool, I have had a terrible experience with genAI. Perhaps you have had a great one. Neat. That’s great for you. As I explained at great length in “The Futzing Fraction”, my concern with generative AI is that I believe it is probably a net negative impact on productivity, based on both my experience and plenty of citations. Go check out the copious footnotes if you’re interested in more detail.

Therefore, I have already acknowledged that you can get an LLM to do various impressive, cool things, sometimes. If I tell you that you will, on average, lose money betting on a slot machine, a picture of a slot machine hitting a jackpot is not evidence against my position.

Rule Two And A Half: Engage In Metacognition

I specifically didn’t title the previous rule “no anecdotes” because data beyond anecdotes may be extremely expensive to produce. I don’t want to say you can never talk to me unless you’re doing a randomized controlled trial. However, if you are going to tell me an anecdote about the way that you’re using an LLM, I am interested in hearing how you are compensating for the well-documented biases that LLM use tends to induce. Try to measure what you can.

Rule Three: Do Not Cite The Deep Magic To Me

As I explained in “A Grand Unified Theory of the AI Hype Cycle”, I already know quite a bit of history of the “AI” label. If you are tempted to tell me something about how “AI” is really such a broad field, and it doesn’t just mean LLMs, especially if you are trying to launder the reputation of LLMs under the banner of jumbling them together with other things that have been called “AI”, I assure you that this will not be convincing to me.

Rule Four: Ethics Are Not Optional

I have made several arguments in my previous writing: there are ethical arguments, efficacy arguments, structuralist arguments, efficiency arguments and aesthetic arguments.

I am happy to, for the purposes of a good-faith discussion, focus on a specific set of concerns or an individual point that you want to make where you think I got something wrong. If you convince me that I am entirely incorrect about the effectiveness or predictability of LLMs in general or as specific LLM product, you don’t need to make a comprehensive argument about whether one should use the technology overall. I will even assume that you have your own ethical arguments.

However, if you scoff at the idea that one should have any ethical boundaries at all, and think that there’s no reason to care about the overall utilitarian impact of this technology, that it’s worth using no matter what else it does as long as it makes you 5% better at your job, that’s sociopath behavior.

This includes extreme whataboutism regarding things like the water use of datacenters, other elements of the surveillance technology stack, and so on.

Consequences

These are rules, once again, just for engaging with me. I have no particular power to enact broader sanctions upon you, nor would I be inclined to do so if I could. However, if you can’t stay within these basic parameters and you insist upon continuing to direct messages to me about this topic, I will summarily block you with no warning, on mastodon, email, GitHub, IRC, or wherever else you’re choosing to do that. This is for your benefit as well: such a discussion will not be a productive use of either of our time.

The Next Thing Will Not Be Big

Disruption, too, will be disrupted.

startups ai open-source programming politics Thursday January 01, 2026

The dawning of a new year is an opportune moment to contemplate what has transpired in the old year, and consider what is likely to happen in the new one.

Today, I’d like to contemplate that contemplation itself.

The 20th century was an era characterized by rapidly accelerating change in technology and industry, creating shorter and shorter cultural cycles of changes in lifestyles. Thus far, the 21st century seems to be following that trend, at least in its recently concluded first quarter.

The early half of the twentieth century saw the massive disruption caused by electrification, radio, motion pictures, and then television.

In 1971, Intel poured gasoline on that fire by releasing the 4004, a microchip generally recognized as the first general-purpose microprocessor. Popular innovations rapidly followed: the computerized cash register, the personal computer, credit cards, cellular phones, text messaging, the Internet, the web, online games, mass surveillance, app stores, social media.

These innovations have arrived faster than previous generations, but also, they have crossed a crucial threshold: that of the human lifespan.

While the entire second millennium A.D. has been characterized by a gradually accelerating rate of technological and social change — the printing press and the industrial revolution were no slouches, in terms of changing society, and those predate the 20th century — most of those changes had the benefit of unfolding throughout the course of a generation or so.

Which means that any individual person in any given century up to the 20th might remember one major world-altering social shift within their lifetime, not five to ten of them. The diversity of human experience is vast, but most people would not expect that the defining technology of their lifetime was merely the latest in a progression of predictable civilization-shattering marvels.

Along with each of these successive generations of technology, we minted a new generation of industry titans. Westinghouse, Carnegie, Sarnoff, Edison, Ford, Hughes, Gates, Jobs, Zuckerberg, Musk. Not just individual rich people, but entire new classes of rich people that did not exist before. “Radio DJ”, “Movie Star”, “Rock Star”, “Dot Com Founder”, were all new paths to wealth opened (and closed) by specific technologies. While most of these people did come from at least some level of generational wealth, they no longer came from a literal hereditary aristocracy.

To describe this new feeling of constant acceleration, a new phrase was coined: “The Next Big Thing”. In addition to denoting that some Thing was coming and that it would be Big (i.e.: that it would change a lot about our lives), this phrase also carries the strong implication that such a Thing would be a product. Not a development in social relationships or a shift in cultural values, but some new and amazing form of conveying salted meat paste or what-have-you, that would make whatever lucky tinkerer who stumbled into it into a billionaire — along with any friends and family lucky enough to believe in their vision and get in on the ground floor with an investment.

In the latter part of the 20th century, our entire model of capital allocation shifted to account for this widespread belief. No longer were mega-businesses built by bank loans, stock issuances, and reinvestment of profit, the new model was “Venture Capital”. Venture capital is a model of capital allocation explicitly predicated on the idea that carefully considering each bet on a likely-to-succeed business and reducing one’s risk was a waste of time, because the return on the equity from the Next Big Thing would be so disproportionately huge — 10x, 100x, 1000x – that one could afford to make at least 10 bad bets for each good one, and still come out ahead.

The biggest risk was in missing the deal, not in giving a bunch of money to a scam. Thus, value investing and focus on fundamentals have been broadly disregarded in favor of the pursuit of the Next Big Thing.

If Americans of the twentieth century were temporarily embarrassed millionaires, those of the twenty-first are all temporarily embarrassed FAANG CEOs.

The predicament that this tendency leaves us in today is that the world is increasingly run by generations — GenX and Millennials — with the shared experience that the computer industry, either hardware or software, would produce some radical innovation every few years. We assume that to be true.

But all things change, even change itself, and that industry is beginning to slow down. Physically, transistor density is starting to brush up against physical limits. Economically, most people are drowning in more compute power than they know what to do with anyway. Users already have most of what they need from the Internet.

The big new feature in every operating system is a bunch of useless junk nobody really wants and is seeing remarkably little uptake. Social media and smartphones changed the world, true, but… those are both innovations from 2008. They’re just not new any more.

So we are all — collectively, culturally — looking for the Next Big Thing, and we keep not finding it.

It wasn’t 3D printing. It wasn’t crowdfunding. It wasn’t smart watches. It wasn’t VR. It wasn’t the Metaverse, it wasn’t Bitcoin, it wasn’t NFTs¹.

It’s also not AI, but this is why so many people assume that it will be AI. Because it’s got to be something, right? If it’s got to be something then AI is as good a guess as anything else right now.

The fact is, our lifetimes have been an extreme anomaly. Things like the Internet used to come along every thousand years or so, and while we might expect that the pace will stay a bit higher than that, it is not reasonable to expect that something new like “personal computers” or “the Internet”³ will arrive again.

We are not going to get rich by getting in on the ground floor of the next Apple or the next Google because the next Apple and the next Google are Apple and Google. The industry is maturing. Software technology, computer technology, and internet technology are all maturing.

There Will Be Next Things

Research and development is happening in all fields all the time. Amazing new developments quietly and regularly occur in pharmaceuticals and in materials science. But these are not predictable. They do not inhabit the public consciousness until they’ve already happened, and they are rarely so profound and transformative that they change everybody’s life.

There will even be new things in the computer industry, both software and hardware. Foldable phones do address a real problem (I wish the screen were even bigger but I don’t want to carry around such a big device), and would probably be more popular if they got the costs under control. One day somebody’s going to crack the problem of volumetric displays, probably. Some VR product will probably, eventually, hit a more realistic price/performance ratio where the niche will expand at least a little more.

Maybe there will even be something genuinely useful, which is recognizably adjacent to the current “AI” fad, but if it is, it will be some new development that we haven’t seen yet. If current AI technology were sufficient to drive some interesting product, it would already be doing it, not using marketing disguised as science to conceal diminishing returns on current investments.

But They Will Not Be Big

The impulse to find the One Big Thing that will dominate the next five years is a fool’s errand. Incremental gains are diminishing across the board. The markets for time and attention² are largely saturated. There’s no need for another streaming service if 100% of your leisure time is already committed to TikTok, YouTube and Netflix; famously, Netflix has already considered sleep its primary competitor for close to a decade - years before the pandemic.

Those rare tech markets which aren’t saturated are suffering from pedestrian economic problems like wealth inequality, not technological bottlenecks.

For example, the thing preventing the development of a robot that can do your laundry and your dishes without your input is not necessarily that we couldn’t build something like that, but that most households just can’t afford it without wage growth catching up to productivity growth. It doesn’t make sense for anyone to commit to the substantial R&D investment that such a thing would take, if the market doesn’t exist because the average worker isn’t paid enough to afford it on top of all the other tech which is already required to exist in society.

The projected income from the tiny, wealthy sliver of the population who could pay for the hardware, cannot justify an investment in the software past a fake version remotely operated by workers in the global south, only made possible by Internet wage arbitrage, i.e. a more palatable, modern version of indentured servitude.

Even if we were to accept the premise of an actually-“AI” version of this, that is still just a wish that ChatGPT could somehow improve enough behind the scenes to replace that worker, not any substantive investment in a novel, proprietary-to-the-chores-robot software system which could reliably perform specific functions.

What, Then?

The expectation for, and lack of, a “big thing” is a big problem. There are others who could describe its economic, political, and financial dimensions better than I can. So then let me speak to my expertise and my audience: open source software developers.

When I began my own involvement with open source, a big part of the draw for me was participating in a low-cost (to the corporate developer) but high-value (to society at large) positive externality. None of my employers would ever have cared about many of the applications for which Twisted forms a core bit of infrastructure; nor would I have been able to predict those applications’ existence. Yet, it is nice to have contributed to their development, even a little bit.

However, it’s not actually a positive externality if the public at large can’t directly benefit from it.

When real world-changing, disruptive developments are occurring, the bean-counters are not watching positive externalities too closely. As we discovered with many of the other benefits that temporarily accrued to labor in the tech economy, Open Source that is usable by individuals and small companies may have been a ZIRP. If you know you’re gonna make a billion dollars you’re not going to worry about giving away a few hundred thousand here and there.

When gains are smaller and harder to realize, and margins are starting to get squeezed, it’s harder to justify the investment in vaguely good vibes.

But this, itself, is not a call to action. I doubt very much that anyone reading this can do anything about the macroeconomic reality of higher interest rates. The technological reality of “development is happening slower” is inherently something that you can’t change on purpose.

However, what we can do is to be aware of this trend in our own work.

Fight Scale Creep

It seems to me that more and more open source infrastructure projects are tools for hyper-scale application development, only relevant to massive cloud companies. This is just a subjective assessment on my part — I’m not sure what tools even exist today to measure this empirically — but I remember a big part of the open source community when I was younger being things like Inkscape, Themes.Org and Slashdot, not React, Docker Hub and Hacker News.

This is not to say that the hobbyist world no longer exists. There is of course a ton of stuff going on with Raspberry Pi, Home Assistant, OwnCloud, and so on. If anything there’s a bit of a resurgence of self-hosting. But the interests of self-hosters and corporate developers are growing apart; there seems to be far less of a beneficial overflow from corporate infrastructure projects into these enthusiast or prosumer communities.

This is the concrete call to action: if you are employed in any capacity as an open source maintainer, dedicate more energy to medium- or small-scale open source projects.

If your assumption is that you will eventually reach a hyper-scale inflection point, then mimicking Facebook and Netflix is likely to be a good idea. However, if we can all admit to ourselves that we’re not going to achieve a trillion-dollar valuation and a hundred thousand engineer headcount, we can begin to consider ways to make our Next Thing a bit smaller, and to accommodate the world as it is rather than as we wish it would be.

Be Prepared to Scale Down

Here are some design guidelines you might consider, for just about any open source project, particularly infrastructure ones:

Don’t assume that your software can sustain an arbitrarily large fixed overhead because “you just pay that cost once” and you’re going to be running a billion instances so it will always amortize; maybe you’re only going to be running ten.
Remember that such fixed overhead includes not just CPU, RAM, and filesystem storage, but also the learning curve for developers. Front-loading a massive amount of conceptual complexity to accommodate the problems of hyper-scalers is a common mistake. Try to smooth out these complexities and introduce them only when necessary.
Test your code on edge devices. This means supporting Windows and macOS, and even Android and iOS. If you want your tool to help empower individual users, you will need to meet them where they are, which is not on an EC2 instance.
This includes considering Desktop Linux as a platform, as opposed to Server Linux as a platform, which (while they certainly have plenty in common) they are also distinct in some details. Consider the highly specific example of secret storage: if you are writing something that intends to live in a cloud environment, and you need to configure it with a secret, you will probably want to provide it via a text file or an environment variable. By contrast, if you want this same code to run on a desktop system, your users will expect you to support the Secret Service. This will likely only require a few lines of code to accommodate, but it is a massive difference to the user experience.
Don’t rely on LLMs remaining cheap or free. If you have LLM-related features⁴, make sure that they are sufficiently severable from the rest of your offering that if ChatGPT starts costing $1000 a month, your tool doesn’t break completely. Similarly, do not require that your users have easy access to half a terabyte of VRAM and a rack full of 5090s in order to run a local model.

Even if you were going to scale up to infinity, the ability to scale down and consider smaller deployments means that you can run more comfortably on, for example, a developer’s laptop. So even if you can’t convince your employer that this is where the economy and the future of technology in our lifetimes is going, it can be easy enough to justify this sort of design shift, particularly as individual choices. Make your onboarding cheaper, your development feedback loops tighter, and your systems generally more resilient to economic headwinds.

So, please design your open source libraries, applications, and services to run on smaller devices, with less complexity. It will be worth your time as well as your users’.

But if you can fix the whole wealth inequality thing, do that first.

Acknowledgments

These sorts of lists are pretty funny reads, in retrospect. ↩
Which is to say, “distraction”. ↩
... or even their lesser-but-still-profound aftershocks like “Social Media”, “Smartphones”, or “On-Demand Streaming Video” ... secondary manifestations of the underlying innovation of a packet-switched global digital network ... ↩
My preference would of course be that you just didn’t have such features at all, but perhaps even if you agree with me, you are part of an organization with some mandate to implement LLM stuff. Just try not to wrap the chain of this anchor all the way around your code’s neck. ↩