Potato Programming

One potato, two potato, three potato, four…

One potato, two potato, three potato, four
Five potato, six potato, seven potato, more.

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

Knuth, Donald
“Structured Programming with go to statements”
Computing Surveys, Vol. 6, No. 4, December 1974
(p. 268)
(Emphasis mine)

Knuth’s admonition about premature optimization is such a cliché among software developers at this point that even the correction to include the full context of the quote is itself a cliché.

Still, it’s a cliché for a reason: the speed at which software can be written is in tension — if not necessarily in conflict — with the speed at which it executes. As Nelson Elhage has explained, software can be qualitatively worse when it is slow, but spending time optimizing an algorithm before getting any feedback from users or profiling the system as a whole can lead one down many blind alleys of wasted effort.

In that same essay, Nelson further elaborates that performant foundations simplify architecture¹. He then follows up with several bits of architectural advice that are highly specific to parsing (compilers and type-checkers specifically), which, while good, are hard to generalize beyond “optimizing performance early can also be good”.

So, here I will endeavor to generalize that advice. How does one provide a performant architectural foundation without necessarily wasting a lot of time on early micro-optimization?

Enter The Potato

Many years before Nelson wrote his excellent aforementioned essay, my father coined a related term: “Potato Programming”.

In modern vernacular, a potato is very slow hardware, and “potato programming” is the software equivalent of the same.

The term comes from the rhyme that opened this essay, and is meant to evoke a slow, childlike counting of individual elements as an algorithm operates upon them. It is an unfortunately quite common software-architectural idiom whereby interfaces are provided in terms of scalar values. In other words, APIs that require you to use for loops or other forms of explicit, individual, non-parallelized iteration. But this is all very abstract; an example might help.

For a generic business-logic example, let’s consider the problem of monthly recurring billing. Every month, we pull in the list of all subscriptions to our service, and we bill them.

Since our hypothetical company has an account-management team that owns the UI which updates subscriptions and a billing backend team that writes code to interface with 3rd-party payment providers, we’ll create 2 backends, here represented by some Protocols.

Finally, we’ll have an orchestration layer that puts them together to actually run the billing. I will use async to indicate which things require a network round trip:

from typing import AsyncIterable, Protocol

class SubscriptionService(Protocol):
    async def all_subscriptions(self) -> AsyncIterable[Subscription]:
        ...

class Subscription(Protocol):
    account_id: str
    to_charge_per_month: money

class BillingService(Protocol):
    async def bill_amount(self, account_id: str, amount: money) -> None:
        ...

To many readers, this may look like an entirely reasonable interface specification; indeed, it looks like a lot of real, public-facing “REST” APIs. An equally apparently-reasonable implementation of our orchestration between them might look like this:

async def billing(s: SubscriptionService, b: BillingService) -> None:
    async for sub in s.all_subscriptions():
        await b.bill_amount(sub.account_id, sub.to_charge_per_month)

This is, however, just about the slowest implementation of this functionality that it’s possible to implement. So, this is the bad version. Let’s talk about the good version: no-tato programming, if you will. But first, some backstory.

Some Backstory

My father began his career as an APL programmer, and one of the key insights he took away from APL’s architecture is that, as he puts it:

Computers like to do things over and over again. They like to do things on arrays. They don’t want to do things on scalars. So, in fact, it’s not possible to write a program that only does things on a scalar. [...] You can’t have an ‘integer’ in APL, you can only have an ‘array of integers’. There’s no ‘loop’s, there’s no ‘map’s.

APL, like Python², is typically executed via an interpreter. Which means, like Python, execution of basic operations like calling functions can be quite slow. However, unlike Python, its pervasive reliance upon arrays meant that almost all of its operations could be safely parallelized, and would only get more and more efficient as more and more parallel hardware was developed.

I said ‘unlike Python’ there, but in fact, my father first related this concept to me regarding a part of the Python ecosystem which follows APL’s design idiom: NumPy. NumPy takes a similar approach: it cannot itself do anything to speed up Python’s fundamental interpreted execution speed³, but it can move the intensive numerical operations that it implements into operations on arrays, rather than operations on individual objects, whether numbers or not.

The performance difference involved in these two styles is not small. Consider this case study which shows a 5828% improvement⁴ when taking an algorithm from idiomatic pure Python to NumPy.
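As a rough illustration of the two styles, here is the same computation written one potato at a time and then as a single array operation. This is a minimal sketch and is not taken from the case study linked above; the function names and the tax arithmetic are made up for the example, and it assumes NumPy is installed.

```python
import numpy as np

def total_with_tax_potato(prices: list[float], rate: float) -> float:
    # One potato, two potato: the interpreter executes every multiply and add.
    total = 0.0
    for price in prices:
        total += price * (1.0 + rate)
    return total

def total_with_tax_array(prices: np.ndarray, rate: float) -> float:
    # One multiply over the whole array and one sum, both in compiled code.
    return float((prices * (1.0 + rate)).sum())

prices = [float(i % 100) for i in range(100_000)]
slow = total_with_tax_potato(prices, 0.08)
fast = total_with_tax_array(np.array(prices), 0.08)
```

Both functions compute the same total (up to floating-point rounding); the difference is in how many operations the Python interpreter itself has to dispatch.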

This idiom is also more or less how GPU programming works. GPUs cannot operate on individual values. You submit a program⁵ to the GPU, as well as a large array of data⁶, and the GPU executes the program on that data in parallel across hundreds of tiny cores. Submitting individual values for the GPU to work on would actually be much slower than just doing the work on the CPU directly, due to the bus latency involved to transfer the data back and forth.

Back from the Backstory

This is all interesting for a class of numerical software (and indeed it works very well there), but it may seem a bit abstract to web backend developers just trying to glue together some internal microservice APIs, or indeed most app developers who aren’t working in those specialized fields. It’s not like Stripe is going to let you run their payment service on your GPU.

However, the lesson generalizes quite well: anywhere you see an API defined in terms of one-potato, two-potato iteration, ask yourself: “how can this be turned into an array”? Let’s go back to our example.

The simplest change that we can make, as a consumer of these potato-shaped APIs, is to submit them in parallel. So if we have to do the optimization in the orchestration layer, we might get something more like this:

from asyncio import Semaphore, AbstractEventLoop

async def one_bill(
    loop: AbstractEventLoop,
    sem: Semaphore,
    sub: Subscription,
    b: BillingService,
) -> None:
    await sem.acquire()
    async def work() -> None:
        try:
            await b.bill_amount(sub.account_id, sub.to_charge_per_month)
        finally:
            sem.release()
    loop.create_task(work())

async def billing(
    loop: AbstractEventLoop,
    s: SubscriptionService,
    b: BillingService,
    batch_size: int,
) -> None:
    sem = Semaphore(batch_size)
    async for sub in s.all_subscriptions():
        await one_bill(loop, sem, sub, b)

This is an improvement, but it’s a bit of a brute-force solution; a multipotato, if you will. We’ve moved the work to the billing service faster, but it still has to do just as much work. Maybe even more work, because now it’s potentially got a lot more lock-contention on its end. And we’re still waiting for the Subscription objects to dribble out of the SubscriptionService potentially one request/response at a time.

In other words, we have used network concurrency as a hack to simulate a performant design. But the back end that we have been given here is not actually optimizable; we do not have a performant foundation. As you can see, we have even had to change our local architecture a little bit here, to include a loop parameter and a batch_size which we had not previously contemplated.
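To see the “multipotato” behaviour concretely, here is a self-contained sketch that can be run without any real services. Everything in it is an assumption made for illustration: FakeBillingService is a hypothetical in-memory stand-in, amounts are plain integers, and the subscriptions are already in a list rather than arriving from an async iterable. Unlike the version above, it uses asyncio.gather so that the caller waits for every request to finish.

```python
import asyncio

class FakeBillingService:
    """Hypothetical in-memory stand-in for a remote billing API."""

    def __init__(self) -> None:
        self.billed: list[tuple[str, int]] = []
        self.in_flight = 0
        self.max_in_flight = 0

    async def bill_amount(self, account_id: str, amount: int) -> None:
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        await asyncio.sleep(0.001)  # pretend network round trip
        self.billed.append((account_id, amount))
        self.in_flight -= 1

async def bill_all(
    subs: list[tuple[str, int]], b: FakeBillingService, batch_size: int
) -> None:
    sem = asyncio.Semaphore(batch_size)

    async def one_bill(account_id: str, amount: int) -> None:
        # At most batch_size coroutines get past this point at once.
        async with sem:
            await b.bill_amount(account_id, amount)

    # gather() waits for every request and propagates the first failure.
    await asyncio.gather(*(one_bill(a, amt) for a, amt in subs))

service = FakeBillingService()
subscriptions = [(f"acct-{i}", 10 + i) for i in range(50)]
asyncio.run(bill_all(subscriptions, service, batch_size=8))
```

The service still receives 50 separate requests; the semaphore only controls how many of them are outstanding at the same time.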

A better-designed interface in the first place would look like this:

from dataclasses import dataclass
from typing import AsyncIterable, Protocol, Sequence

class SubscriptionService(Protocol):
    async def all_subscriptions(
        self, batch_size: int,
    ) -> AsyncIterable[Sequence[Subscription]]:
        ...

class Subscription(Protocol):
    account_id: str
    to_charge_per_month: money

@dataclass
class BillingRequest:
    account_id: str
    amount: money

class BillingService(Protocol):
    async def submit_bills(
        self,
        bills: Sequence[BillingRequest],
    ) -> None:
        ...

Superficially, the implementation here looks slightly more awkward than our naive first attempt:

async def billing(s: SubscriptionService, b: BillingService, batch_size: int) -> None:
    async for sub_batch in s.all_subscriptions(batch_size):
        await b.submit_bills(
            [
                BillingRequest(sub.account_id, sub.to_charge_per_month)
                for sub in sub_batch
            ]
        )

However, while the implementation with batching in the backend is approximately as performant as our parallel orchestration implementation, backend batching has a number of advantages over parallel orchestration.

First, backend batching has less internal complexity; no need to have a Semaphore in the orchestration layer, or to create tasks on an event loop. There’s less surface area here for bugs.

Second, and more importantly: backend batching permits future optimizations within the backend services, which are much closer to the relevant data and can achieve more substantial gains than we can as a client without knowledge of their implementation.

There are many ways this might manifest, but consider that each of these services has its own database, and has to submit queries and execute transactions on that database.

In the subscription service, it’s faster to run a single SELECT statement that returns a bunch of results than to select a single result at a time. On the billing service’s end, it’s much faster to issue a single INSERT or UPDATE and then COMMIT for N records at once than to concurrently issue a ton of potentially related modifications in separate transactions.
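The billing-side difference can be sketched with the standard library’s sqlite3 module. This is only an illustration under assumptions: the charges table and its columns are hypothetical, and an in-memory SQLite database stands in for whatever database the billing service really uses, so it shows the shape of the two approaches rather than their real-world cost.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE charges (account_id TEXT, amount_cents INTEGER)")
    return conn

def count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM charges").fetchone()[0]

bills = [(f"acct-{i}", 999) for i in range(1000)]

# Potato style: one INSERT and one COMMIT per record.
potato = make_db()
for account_id, amount_cents in bills:
    potato.execute(
        "INSERT INTO charges VALUES (?, ?)", (account_id, amount_cents)
    )
    potato.commit()

# Batched style: one executemany() call and a single COMMIT for all records.
batched = make_db()
with batched:  # commits once on success, rolls back on error
    batched.executemany("INSERT INTO charges VALUES (?, ?)", bills)
```

Both databases end up with the same rows. On a database backed by durable storage, each COMMIT generally has a cost of its own, which is why paying it once per batch rather than once per record tends to matter.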

Potato No Mo

The initial implementation within each of these backends can be as naive and slow as necessary to achieve an MVP. You can do a SELECT … LIMIT 1 internally, if that’s easier, and performance is not important at first. There can be a mountain of potatoes hidden behind the veil of that batched list. In this way, you can avoid the potential trap of premature optimization. Maybe this is a terrible factoring of services for your application in the first place; best to have that prototype in place and functioning quickly so that you can throw it out faster!

However, by initially designing an interface based on lists of things rather than individual things, it’s much easier to hide irrelevant implementation details from the client, and to achieve meaningful improvements when optimizing.
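A first-draft backend along these lines might look like the following sketch. The class PotatoBackedBillingService and its _bill_one helper are hypothetical names, amounts are plain integers standing in for the unspecified money type, and a dictionary stands in for the real payment provider.

```python
import asyncio
from dataclasses import dataclass
from typing import Sequence

@dataclass
class BillingRequest:
    account_id: str
    amount: int  # cents; stands in for the essay's unspecified `money` type

class PotatoBackedBillingService:
    """Hypothetical first draft: batched interface, scalar insides."""

    def __init__(self) -> None:
        self.charged: dict[str, int] = {}

    async def _bill_one(self, account_id: str, amount: int) -> None:
        await asyncio.sleep(0)  # placeholder for a per-record query
        self.charged[account_id] = self.charged.get(account_id, 0) + amount

    async def submit_bills(self, bills: Sequence[BillingRequest]) -> None:
        # A mountain of potatoes behind the batched interface. This loop can
        # later become one bulk query without any change to callers.
        for bill in bills:
            await self._bill_one(bill.account_id, bill.amount)

service = PotatoBackedBillingService()
asyncio.run(
    service.submit_bills(
        [BillingRequest("acct-1", 500), BillingRequest("acct-2", 1200)]
    )
)
```

The caller only ever sees submit_bills, so replacing the loop with a real bulk operation later is a change local to the backend.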

Acknowledgements

This is the first post supported by my patrons, with a topic suggested by a member of my Patreon!

It’s a really good essay, you should read it. ↩
Yes, I know it’s actually bytecode compiled and then run on a custom interpreting VM, but for the purposes of comparing these performance characteristics “interpreted” is a more accurate approximation. Don’t @ me. ↩
Although, thankfully, a lot of folks are now working very hard on that problem. ↩
No, not a typo, that’s a 4-digit improvement. ↩
Typically called a “shader” due to its origins in graphically shading polygons. ↩
The data may represent vertices in a 3-D mesh, pixels in texture data, or, in the case of general-purpose GPU programming, “just a bunch of floating-point numbers”. ↩

Super Swing Districts

Donate now to save democracy. Please. I like democracy.

Wednesday October 26, 2022

In my corner of the social graph, when we talk about politics today, we tend to use a lot of moralizing language. A lot of emotive language. And that makes sense; overt fascists are repeating the strategy of using the right of trans people to, like, be alive, as a wedge issue to escalate to full-blown eugenics and antisemitism. There’s a lot of moral stuff and a lot of emotional stuff happening there.

But when we get down to it, politics is a highly technical discipline that requires a lot of work. You don’t need to just have the right opinion, you have to actually do a lot of math to figure out efficient ways to deploy resources, effective strategies to convince the undecided and to command the attention of the disengaged. It’s also adversarial: the bad guys are trying to do the same thing, so if you do find some efficient way to campaign, they will soon find out and try to dismantle it.

So while we might talk abstractly about “doing the work”, a lot of the work is tedious and difficult analysis of a lot of very confusing numbers. Not to mention the fact that it requires maintaining the tenacious mindset of a happy Sisyphus due to its adversarial nature. To be frank, I’m not great at either of those things.

Luckily, my uncle is. He is a professor of political science who (beyond the obvious familial bias I might have) I tend to think is a really smart guy with a lot of good ideas. More importantly, however, he does do “the work” I’m talking about here.

So here is some of that work: SuperSwingDistricts.org. This is a slate of Democratic downballot candidates for office across the USA who need your support right now. Specifically it is a carefully curated slate to maximize spend efficiency via the reverse-coattails effect, multiplied by finding the areas where there are the most overlapping high-leverage elections. You can read more about the specifics on the website, and the specifics of the vetting of the candidates’ bona fides, but you can also just take my word for it and Donate Now via ActBlue. Just like... gobs of money.

Political fundraising is not really my wheelhouse, and I am not that comfortable doing it. I hope that we can stop this “democracy” machine from constantly falling apart all the time so I can work on fixing the other broken systems in my life like Python-language native application packaging for various platforms. But this one is really, really important. Many of these candidates are in pivotal positions that will help prevent authoritarians from seizing the actual physical mechanisms of elections themselves, and attempting a more successful coup in 2024.

So: donate now.

Dates And Times And Types

Get a TypeError when using a datetime when you wanted a date.

python programming datetime Monday June 06, 2022

Python’s standard datetime module is very powerful. However, it has a couple of annoying flaws.

Firstly, datetimes are considered a kind of date¹, which causes problems. Although datetime is a literal subclass of date so Mypy and isinstance believe a datetime “is” a date, you cannot substitute a datetime for a date in a program without provoking errors at runtime.

To put it more precisely, here are two programs which define a function with type annotations that Mypy finds no issues with. The first even takes care to type-check its arguments at runtime. But both raise TypeErrors at runtime:

Comparing `datetime` to `date`:

from datetime import date, datetime

def is_after(before: date, after: date) -> bool | None:
    if not isinstance(before, date):
        raise TypeError(f"{before} isn't a date")
    if not isinstance(after, date):
        raise TypeError(f"{after} isn't a date")
    if before == after:
        return None
    if before > after:
        return False
    return True

is_after(date.today(), datetime.now())

Traceback (most recent call last):
  File ".../date_datetime_compare.py", line 14, in <module>
    is_after(date.today(), datetime.now())
  File ".../date_datetime_compare.py", line 10, in is_after
    if before > after:
TypeError: can't compare datetime.datetime to datetime.date

Comparing “naive” and “aware” `datetime`:

from datetime import datetime, timezone, timedelta

def compare(a: datetime, b: datetime) -> timedelta:
    return a - b

compare(datetime.now(), datetime.now(timezone.utc))

Traceback (most recent call last):
  File ".../naive_aware_compare.py", line 6, in <module>
    compare(datetime.now(), datetime.now(timezone.utc))
  File ".../naive_aware_compare.py", line 4, in compare
    return a - b
TypeError: can't subtract offset-naive and offset-aware datetimes

In some sense, the whole point of using Mypy - or, indeed, of runtime isinstance checks - is to avoid TypeError getting raised. You specify all the types, the type-checker yells at you, you fix it, and then you can know your code is not going to blow up in unexpected ways.

Of course, it’s still possible to avoid these TypeErrors with runtime checks, but it’s tedious and annoying to need to put a check for .tzinfo is not None or not isinstance(..., datetime) before every use of - or >.
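For a sense of what that tedium looks like, here is a sketch of the guards using only the standard library. The helper names is_plain_date and is_aware are made up for this example; the awareness test follows the definition in the datetime documentation.

```python
from datetime import date, datetime, timedelta, timezone

def is_plain_date(d: date) -> bool:
    # datetime subclasses date, so isinstance(d, date) alone is not enough.
    return not isinstance(d, datetime)

def is_aware(d: datetime) -> bool:
    # Aware means tzinfo is set and utcoffset() does not return None.
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None

today = date(2022, 6, 6)
naive = datetime(2022, 6, 6, 12, 0)
aware = datetime(2022, 6, 6, 12, 0, tzinfo=timezone.utc)

# Every use of > needs its own guard to be safe at runtime.
if is_plain_date(today) and is_plain_date(naive):
    later = naive > today  # not reached: `naive` is a datetime
else:
    later = None

# And so does every use of -.
if is_aware(naive) == is_aware(aware):
    delta = naive - aware  # not reached: one is naive, one is aware
else:
    delta = None

# Two values of the same kind subtract without trouble.
same_kind_delta = aware - datetime(2022, 6, 6, 11, 0, tzinfo=timezone.utc)
```

Nothing here is checked by Mypy; forget one guard and the TypeError comes back.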

The problem here is that datetime is trying to represent too many things with too few types. datetime should not be inheriting from date, because it isn’t a date, which is why > raises an exception when you compare the two.

Naive datetimes are an abstract representation of a hypothetical civil time, which is not necessarily tethered to a specific moment in physical time. You can’t know exactly what time “today at 2:30 AM” is, unless you know where on earth you are and what the rules are for daylight saving time in that place. However, you can still talk about “2:30 AM” without reference to a time zone, and you can even say that “3:30 AM” is “60 minutes after” that time, even if, given potential changes to wall clock time, that may not be strictly true in one specific place during a DST transition. Indeed, one of those times may refer to multiple points in civil time at a particular location, when attached to different sides of a DST boundary.

By contrast, Aware datetimes represent actual moments in time, as they combine civil time with a timezone that has a defined UTC offset to interpret them in.

These are very similar types of objects, but they are not in fact the same, given that all of their operators have slightly different (albeit closely related) semantics.

Using `datetype`

I created a small library, datetype, which is (almost) entirely type-time behavior. At runtime, despite appearances, there are no instances of new types, not even wrappers. Concretely, everything is a date, time, or datetime from the standard library. However, when type-checking with Mypy, you will now get errors reported from the above scenarios if you use the types from datetype.

Consider this example, quite similar to our first problematic example:

Comparing `AwareDateTime` or `NaiveDateTime` to `date`:

from datetype import Date, NaiveDateTime

def is_after(before: Date, after: Date) -> bool | None:
    if before == after:
        return None
    if before > after:
        return False
    return True

is_after(Date.today(), NaiveDateTime.now())

Now, instead of type-checking cleanly, it produces this error, letting you know that this call to is_after will give you a TypeError.

date_datetime_datetype.py:10: error: Argument 2 to "is_after" has incompatible type "NaiveDateTime"; expected "Date"
Found 1 error in 1 file (checked 1 source file)

Similarly, attempting to compare naive and aware objects results in errors now. We can even use the included AnyDateTime type variable to include a bound similar to AnyStr from the standard library to make functions that can take either aware or naive datetimes, as long as you don’t mix them up:

Comparing `AwareDateTime` to `NaiveDateTime`:

from datetime import datetime, timezone, timedelta
from datetype import AwareDateTime, NaiveDateTime, AnyDateTime


def compare_same(a: AnyDateTime, b: AnyDateTime) -> timedelta:
    return a - b


def compare_either(
    a: AwareDateTime | NaiveDateTime,
    b: AwareDateTime | NaiveDateTime,
) -> timedelta:
    return a - b


compare_same(NaiveDateTime.now(), AwareDateTime.now(timezone.utc))

compare_same(AwareDateTime.now(timezone.utc), AwareDateTime.now(timezone.utc))
compare_same(NaiveDateTime.now(), NaiveDateTime.now())

naive_aware_datetype.py:13: error: No overload variant of "__sub__" of "_GenericDateTime" matches argument type "NaiveDateTime"
...
naive_aware_datetype.py:13: error: No overload variant of "__sub__" of "_GenericDateTime" matches argument type "AwareDateTime"
...
naive_aware_datetype.py:16: error: Value of type variable "AnyDateTime" of "compare_same" cannot be "_GenericDateTime[Optional[tzinfo]]"
Found 3 errors in 1 file (checked 1 source file)

Telling the Difference

Although the types in datetype are Protocols, there’s a bit of included magic so that you can use them as type guards with isinstance like regular types. For example:

from datetype import NaiveDateTime, AwareDateTime
from datetime import datetime, timezone

nnow = NaiveDateTime.now()
anow = AwareDateTime.now(timezone.utc)


def check(d: AwareDateTime | NaiveDateTime) -> None:
    if isinstance(d, NaiveDateTime):
        print("Naive!", d - nnow)
    elif isinstance(d, AwareDateTime):
        print("Aware!", d - anow)


check(NaiveDateTime.now())
check(AwareDateTime.now(timezone.utc))

Try it out, carefully

This library is very much alpha-quality; in the process of writing this blog post, I made half a dozen backwards-incompatible changes, and there are still probably a few more left as I get feedback. But if this is a problem you’ve had within your own codebases - ensuring that dates and datetimes don’t get mixed up, or requiring that all datetimes crossing some API boundary are definitely aware and not naive - give it a try with pip install datetype and let me know if it catches any bugs!

But, in typical fashion, not a kind of time... ↩

Leave The Frog For Last

Neurotypical advice for ADHD is not always great.

adhd productivity Thursday May 12, 2022

This was originally a thread on Twitter; you can read the original here, but this one has been lightly edited for grammar and clarity, plus I added a pretty rad picture of a frog to it.

Update 2022-05-16: Thanks to some reader feedback I have updated the conclusion to note an example where this advice can productively apply to some ADHDers.

I’m in the midst of trying to unlearn a few things about neurotypical productivity advice but this is one I’ve been thinking about a lot:

Most productivity advice is toxic for ADHDers because it was written by a neurotypical brain for a neurotypical reader.

One of the best things you can do is unlearn axioms like "eat the frog first" and find what actually works for you.@adhddesigner https://t.co/lcd74rIDwb
— Jesse J. Anderson • ADHD Creative (@jessejanderson) February 3, 2022

“Eat the frog first” is particularly toxic advice for ADHDers.

A frog on a flower, nervously looking at you as you contemplate whether to eat it.

Photo by Stephanie LeBlanc on Unsplash

First, for anyone who happens not to know already: “eat the frog first” is a technique which involves finding the task you’re most likely to ignore or put off, and doing it first in your day to ensure that you don’t avoid it.

For a neurotypical person, eating the frog first makes sense, which is of course why this advice exists in the first place. If you’ve been avoiding a task, put it first in your day when you’re going to have the most energy, and use the allure of the more fun tasks later to push through it.

This makes intuitive sense.

The premise of this advice is that you rely on the promise of delayed gratification—and the anticipated inherent satisfaction of having completed the boring and unpleasant thing—in order to motivate you to do it.

Here’s the problem for ADHDers: ADHD is literally the condition of not generating enough dopamine, which means delayed gratification is inherently more difficult for us. The anticipated inherent satisfaction is less motivating because it’s less intense, on a physical level.

An ADHD brain powering through tasks needs momentum. You need to be in a sufficiently excited state to begin doing things. A bored, dopamine-starved ADHD brain is going to be clawing at the walls looking for ANY dopamine-generating distraction to avoid thinking about the frog.

Of course where dopamine won’t do, there’s always adrenaline. Panic can trigger sufficient states of activity as well, although the stress is unhealthy and it’s less reliable in the absence of real, immediate threats that you can’t ignore.

So what frog-first ADHD days often look like (particularly for adult ADHDers) is a slow slog of not really doing anything useful, while stewing in increasingly negative self-talk, attempting to generate the necessary anger and self-loathing required to truly panic about the frog.

Unfortunately this type of attempt at internal motivation is more likely to result in depression than motivation, which creates a spiral that makes the problem worse.

The neurotypical’s metaphorical frog is just sitting there, waiting to be eaten. Maybe they’ve been avoiding it because it’s a little gross, but fine, they can exert a little willpower and just do it, and move on to more pleasant activities. But the ADHD frog is running away.

Trying to use the same technique just results in the ADHDer sitting in the swamp where the frog used to be, chugging ever-increasing volumes of toxic mud in the hopes that we’ll find a frog in there. Sometimes we even find one! But that’s not success.

At the end of the day, the metaphorical frog does need eating; that’s what makes it a frog. What is the conscientious ADHDer to do?

Unfortunately, there is no singular, snappy answer; difficulty with this type of task is the impenetrable core of the “disorder” part of ADHD. It’ll always be difficult. But there are definitely strategies which can make it relatively easier.

None of these are guaranteed to work, but I am at least reasonably sure that they won’t build a spiral into guilt and depression:

start with a fun task, and build momentum until the frog seems like no big deal
use hype music; yell; get excited to an embarrassing degree.¹
exercise; i.e. “go for a walk”

It might literally be better to start the day with something actively unproductive, but fun, like a video game, although this can obviously be risky. For this to work, you need to have very good systems in place.

Start the frog at the end of the day and deliberately interrupt yourself when you stop work. Leave it lingering so some aspect of it annoys you and it distracts you at night. Start the next day pissed off at and obsessing over murdering that piece of shit frog as soon as you can get your hands on it.

This technique is also good because at the end of the day you only need to push yourself just hard enough to load the task into your brain, not all the way through it.

Remember that while “stimulated” doesn’t have to mean “panicked”, it also doesn’t need to mean “happy”. Sometimes, annoyance or irritation is the best way to ensure that you go do something. Consider, for example, the compelling motivation of reading a comment on the Internet that you disagree with.

Overall the distinguishing characteristic of toxic productivity advice is that it makes you spend more time feeling bad than doing stuff. It substitutes panic for healthy motivation, and low self-esteem for a feeling of accomplishment.

The most important point I am trying to make is this: when you take productivity advice (even, or perhaps especially, from me), try to measure its impact on your work and your mental health.

To that point, one piece of feedback I received on an earlier iteration of this article was that, for some ADHDers on stimulant medication², eating the frog first can work: if you take your medication early in the morning and experience a big, but temporary, increase to executive-function 30 minutes later, being prepared to do your frog-eating at that specific moment can have results similar to those for someone more neurotypical. This very much depends on how you specifically react to your medication, however.

So, if eating the frog first is working for you, by all means keep doing it, but you have to ask yourself: are you actually getting more done?

One of the advantages of working from home is that you can really lean into this without provoking an intervention from your coworkers. ↩
I personally take a slightly unusual kind of ADHD medication, which does help but not in the typical fashion. ↩

Inbox Zero, Cost: Zero

Updated guidance for getting out of an email overwhelm trap, with practical, concrete, free examples.

email productivity updates Monday May 02, 2022

One consistent bit of feedback that I’ve received on my earlier writing about email workflow is that I didn’t include a concrete enough set of instructions for getting started with task-management workflow, particularly with low-friction options that are available for people who don’t necessarily have $100 per year to drop on the Cadillac of task-management applications.

Given that the piece seems to be enjoying a small resurgence of attention, I’ve significantly expanded the “Make A Place For Tasks” section of that article, with:

more no-cost, low-friction options for getting started (if you’re stuck on this step “if you use Gmail, just start using Google Tasks” is the main takeaway)
a guide for how to evaluate a task-management application for yourself, if you are trying to pick something that fits your work style better
several links to the specific “create a task from an email” tools and workflows for each app

It was nice to be doing this update now, because in the years since that piece was published, almost every major email application has added task-management features, or upgraded them into practical usability; gone are the times when properly filing your emails into clearly-described tasks was an esoteric feature that you needed expensive custom software for.