The “Active Enum” Pattern

Have you ever written some Python code that looks like this?

from enum import Enum, auto

class SomeNumber(Enum):
    one = auto()
    two = auto()
    three = auto()

def behavior(number: SomeNumber) -> int:
    match number:
        case SomeNumber.one:
            print("one!")
            return 1
        case SomeNumber.two:
            print("two!")
            return 2
        case SomeNumber.three:
            print("three!")
            return 3

That is to say, have you written code that:

defined an enum with several members
associated custom behavior, or custom values, with each member of that enum,
needed one or more match / case statements (or, if you’ve been programming in Python for more than a few weeks, probably a big if/elif/elif/else tree) to do that association?

In this post, I’d like to submit that this is an antipattern; let’s call it the “passive enum” antipattern.

For those of you having a generally positive experience organizing your discrete values with enums, it may seem odd to call this an “antipattern”, so let me first make something clear: the path to a passive enum is going in the correct direction.

Typically - particularly in legacy code that predates Python 3.4 - one begins with a value that is a bare int constant, or maybe a str with some associated values sitting beside in a few global dicts.

Starting from there, collecting all of your values into an enum at all is a great first step. Having an explicit listing of all valid values and verifying against them is great.

But, it is a mistake to stop there. There are problems with passive enums, too:

The behavior can be defined somewhere far away from the data, making it difficult to:
1. maintain an inventory of everywhere it’s used,
2. update all the consumers of the data when the list of enum values changes, and
3. learn about the different usages as a consumer of the API
Logic may be defined procedurally (via if/elif or match) or declaratively (via e.g. a dict whose keys are your enum and whose values are the required associated value).
1. If it’s defined procedurally, it can be difficult to build tools to interrogate it, because you need to parse the AST of your Python program. So it can be difficult to build interactive tools that look at the associated data without just calling the relevant functions.
2. If it’s defined declaratively, it can be difficult for existing tools that do know how to interrogate ASTs (mypy, flake8, Pyright, ruff, et. al.) to make meaningful assertions about it. Does your linter know how to check that a dict whose keys should be every value of your enum is complete?

To refactor this, I would propose a further step towards organizing one’s enum-oriented code: the active enum.

An active enum is one which contains all the logic associated with the first-party provider of the enum itself.

You may recognize this as a more generalized restatement of the object-oriented lens on the principle of “separation of concerns”. The responsibilities of a class ought to be implemented as methods on that class, so that you can send messages to that class via method calls, and it’s up to the class internally to implement things. Enums are no different.

More specifically, you might notice it as a riff on the Active Nothing pattern described in this excellent talk by Sandi Metz, and, yeah, it’s the same thing.

The first refactoring that we can make is, thus, to mechanically move the method from an external function living anywhere, to a method on SomeNumber . At least like this, we present an API to consumers externally that shows that SomeNumber has a behavior method that can be invoked.

from enum import Enum, auto

class SomeNumber(Enum):
    one = auto()
    two = auto()
    three = auto()

    def behavior(self) -> int:
        match self:
            case SomeNumber.one:
                print("one!")
                return 1
            case SomeNumber.two:
                print("two!")
                return 2
            case SomeNumber.three:
                print("three!")
                return 3

However, this still leaves us with a match statement that repeats all the values that we just defined, with no particular guarantee of completeness. To continue the refactoring, what we can do is change the value of the enum itself into a simple dataclass to structurally, by definition, contain all the fields we need:

from dataclasses import dataclass
from enum import Enum
from typing import Callable

@dataclass(frozen=True)
class NumberValue:
    result: int
    effect: Callable[[], None]

class SomeNumber(Enum):
    one = NumberValue(1, lambda: print("one!"))
    two = NumberValue(2, lambda: print("two!"))
    three = NumberValue(3, lambda: print("three!"))

    def behavior(self) -> int:
        self.value.effect()
        return self.value.result

Here, we give SomeNumber members a value of NumberValue, a dataclass that requires a result: int and an effect: Callable to be constructed. Mypy will properly notice that if x is a SomeNumber, that x will have the type NumberValue and we will get proper type checking on its result (a static value) and effect (some associated behaviors)¹.

Note that the implementation of behavior method - still conveniently discoverable for callers, and with its signature unchanged - is now vastly simpler.

But what about...

Lookups?

You may be noticing that I have hand-waved over something important to many enum users, which is to say, by-value lookup. enum.auto will have generated int values for one, two, and three already, and by transforming those into NumberValue instances, I can no longer do SomeNumber(1).

For the simple, string-enum case, one where you might do class MyEnum: value = “value” so that you can do name lookups via MyEnum("value"), there’s a simple solution: use square brackets instead of round ones. In this case, with no matching strings in sight, SomeNumber["one"] still works.

But, if we want to do integer lookups with our dataclass version here, there’s a simple one-liner that will get them back for you; and, moreover, will let you do lookups on whatever attribute you want:

by_result = {each.value.result: each for each in SomeNumber}

`enum.Flag`?

You can do this with Flag more or less unchanged, but in the same way that you can’t expect all your list[T] behaviors to be defined on T, the lack of a 1-to-1 correspondence between Flag instances and their values makes it more complex and out of scope for this pattern specifically.

3rd-party usage?

Sometimes an enum is defined in library L and used in application A, where L provides the data and A provides the behavior. If this is the case, then some amount of version shear is unavoidable; this is a situation where the data and behavior have different vendors, and this means that other means of abstraction are required to keep them in sync. Object-oriented modeling methods are for consolidating the responsibility for maintenance within a single vendor’s scope of responsibility. Once you’re not responsible for the entire model, you can’t do the modeling over all of it, and that is perfectly normal and to be expected.

The goal of the Active Enum pattern is to avoid creating the additional complexity of that shear when it does not serve a purpose, not to ban it entirely.

A Case Study

I was inspired to make this post by a recent refactoring I did from a more obscure and magical² version of this pattern into the version that I am presenting here, but if I am going to call passive enums an “antipattern” I feel like it behooves me to point at an example outside of my own solo work.

So, for a more realistic example, let’s consider a package that all Python developers will recognize from their day-to-day work, python-hearthstone, the Python library for parsing the data files associated with Blizzard’s popular computerized collectible card game Hearthstone.

As I’m sure you already know, there are a lot of enums in this library, but for one small case study, let’s look a few of the methods in hearthstone.enums.GameType.

GameType has already taken the “step 1” in the direction of an active enum, as I described above: as_bnet is an instancemethod on GameType itself, making it at least easy to see by looking at the class definition what operations it supports. However, in the implementation of that method (among many others) we can see the worst of both worlds:

class GameType(IntEnum):
    def as_bnet(self, format: FormatType = FormatType.FT_STANDARD):
        if self == GameType.GT_RANKED:
            if format == FormatType.FT_WILD:
                return BnetGameType.BGT_RANKED_WILD
            elif format == FormatType.FT_STANDARD:
                return BnetGameType.BGT_RANKED_STANDARD
            # ...
            else:
                raise ValueError()
        # ...
        return {
            GameType.GT_UNKNOWN: BnetGameType.BGT_UNKNOWN,
            # ...
            GameType.GT_BATTLEGROUNDS_DUO_FRIENDLY: BnetGameType.BGT_BATTLEGROUNDS_DUO_FRIENDLY,
        }[self]

We have procedural code mixed with a data lookup table; raise ValueError mixed together with value returns. Overall, it looks like this might be hard to maintain this going forward, or to see what’s going on without a comprehensive understanding of the game being modeled. Of course for most python programmers that understanding can be assumed, but, still.

If GameType were refactored in the manner above³, you’d be able to look at the member definition for GT_RANKED and see a mapping of FormatType to BnetGameType, or GT_BATTLEGROUNDS_DUO_FRIENDLY to see an unconditional value of BGT_BATTLEGROUNDS_DUO_FRIENDLY. Given that this enum has 40 elements, with several renamed or removed, it seems reasonable to expect that more will be added and removed as the game is developed.

Conclusion

If you have large enums that change over time, consider placing the responsibility for the behavior of the values alongside the values directly, and any logic for processing the values as methods of the enum. This will allow you to quickly validate that you have full coverage of any data that is required among all the different members of the enum, and it will allow API clients a convenient surface to discover the capabilities associated with that enum.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!

You can get even fancier than this, defining a typing.Protocol as your enum’s value, but it’s best to keep things simple and use a very simple dataclass container if you can. ↩
derogatory ↩
I did not submit such a refactoring as a PR before writing this post because I don’t have full context for this library and I do not want to harass the maintainers or burden them with extra changes just to make a rhetorical point. If you do want to try that yourself, please file a bug first and clearly explain how you think it would benefit their project’s maintainability, and make sure that such a PR would be welcome. ↩