Some of you read my previous post on typing.Protocols and
probably wondered: “what about zope.interface?” I’ve advocated strongly for it
in the past, but now that we have Mypy and Protocols, is it simply a relic of an
earlier time? Can we entirely replace it with Protocol?
Let’s have a look.
Typing in 2 dimensions
In the previous post I discussed structural versus nominal typing. In Mypy’s
type system, most classes are checked nominally whereas Protocol is checked
structurally. However, there’s another way that Protocol is distinct from a
normal class: normal classes are concrete types, and Protocols are
abstract.
Abstract types:
- cannot be instantiated: every instance of an abstract type is an instance of some concrete sub-type, and
- do not include (complete) implementation logic.
Concrete types:
- can be instantiated: they are complete descriptions of a type, and
- must include all their own implementation logic.
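The two lists above can be observed directly with a Protocol. This is a minimal sketch; the names Greeter, English and welcome are hypothetical and exist only for illustration.

```python
from typing import Protocol


class Greeter(Protocol):
    """Abstract: describes a shape and carries no implementation."""

    def greet(self) -> str: ...


class English:
    """Concrete: a complete type that happens to fit the Greeter shape."""

    def greet(self) -> str:
        return "hello"


def welcome(greeter: Greeter) -> str:
    # A type checker accepts any object with a matching greet() here.
    return greeter.greet()


try:
    Greeter()  # the Protocol itself cannot be instantiated
except TypeError as error:
    instantiation_error = error

print(welcome(English()))
```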
Protocols and Interfaces are both abstract, but Interfaces are nominal.
The highest level distinction between the two is that when you have a problem
that requires an abstract type, but nominal checking is preferable to
structural, Interfaces are a better solution.
Python’s built-in Abstract Base Classes are technically abstract-and-nominal as well, but they’re in a strange halfway space. They’re formally “abstract” because they can’t be instantiated, but they’re partially concrete in that they can contain any amount of implementation logic themselves. As a result, making an object which is a subtype of multiple ABCs drags in all the usual problems of conflicting namespaces within multiple inheritance.
Theoretically, there’s a way to treat ABCs as purely abstract, which is to use
ABCMeta.register, but as of this writing (March 2021) it doesn’t work with
Mypy, so within the context of
“static typing in Python” we presently have to ignore it.
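For reference, this is roughly what the runtime side of ABCMeta.register looks like. It is a sketch with hypothetical names (Greeter, English), and it only demonstrates the runtime behavior, not anything a static checker will accept.

```python
from abc import ABC, abstractmethod


class Greeter(ABC):
    @abstractmethod
    def greet(self) -> str: ...


class English:  # note: does not inherit from Greeter
    def greet(self) -> str:
        return "hello"


# Declare English a "virtual subclass"; nothing is inherited from Greeter.
Greeter.register(English)

assert issubclass(English, Greeter)
assert isinstance(English(), Greeter)
assert Greeter not in English.__mro__  # purely a declaration, no inheritance
```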
Practicalities
The first major advantage that Protocol has is that since it is now built in
to Python itself, there’s no reason not to use it. When Protocol didn’t
even exist, regardless of all the advantages of adding explicit abstract types
to your project with zope.interface, it did still have the small down-side of
requiring a new dependency, with all the minor headaches that might imply.
Beyond the theoretical distinctions, there’s a question of how well tooling
supports zope.interface. There are some clear gaps; there is not a ton of
great built-in IDE support for zope.interface; less-sophisticated linters
will sometimes still complain that methods declared on Interfaces don’t take
self as their first argument. Indeed, Mypy itself does this by default, although
more on that in a moment. Less mainstream performance-focused type-checkers like
Pyre and Pyright don’t support zope.interface
either, although their lack of support for zope.interface is just a part of a
broader problem of their lack of extensibility; they also can’t support
SQLAlchemy or the Django ORM without special-casing in the tools themselves.
But what about Mypy itself? If we have to discount ABCMeta.register due to
practical tooling deficiencies, even though it provides a built-in way to declare
a nominal-but-abstract type in principle, we need to be able to use
zope.interface within Mypy as well for a fair comparison with Protocol.
Can we?
Luckily, yes! Thanks to Shoobx, there’s a fairly actively maintained Mypy
plugin that supports zope.interface which you can use to statically check your
Interfaces.
However, this plugin does have a few key limitations as of this writing
(again, March 2021), which makes its safety guarantees a bit lower-quality
than Protocol’s.
The net result of this is that Protocols have the “home-field advantage” in
most cases; out of the box, they’ll work more smoothly with your existing
editor / linter setup, and as long as your project supports Python 3.6+, at
worst (if you can’t use Python 3.8, where Protocol is built in to typing)
you have to take a type-check-time dependency on the typing_extensions
package, whereas with zope.interface you’ll need both the run-time dependency
of zope.interface itself and the Mypy plugin at type-checking time.
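One common way to express that fallback is a version check that type checkers understand. This sketch assumes typing_extensions is installed on interpreters older than 3.8 (the release in which Protocol joined typing); Closeable is a hypothetical example protocol.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    # Third-party backport; must be installed separately on older Pythons.
    from typing_extensions import Protocol


class Closeable(Protocol):
    def close(self) -> None: ...
```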
So in a situation where both are roughly equivalent, Protocol tends to win by
default. There are undeniably big areas where Interfaces and Protocols
overlap, and in plenty of them, using Protocol is a fine idea. But there are
still some clear places that zope.interface shines.
First, let’s look at a case which Interfaces handle more gracefully than
Protocols: opting out of matching a simple shape, where the shape doesn’t
fully describe its own meaning.
Where Interfaces work best: hidden and complex meanings
The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.
Alan Perlis, “Epigrams in Programming”, Epigram 34.
The place where structural typing has the biggest advantage is when the type
system is expressive enough to fully encode the meaning of the desired
behavior within the structure of the type itself. Consider a Protocol which
describes an object that can add some integers together:
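A minimal sketch of such a Protocol follows. The names Adder, add, Calculator and total are hypothetical, chosen only to illustrate the shape being described.

```python
from typing import Protocol


class Adder(Protocol):
    def add(self, a: int, b: int) -> int:
        """Return the sum of a and b."""
        ...


class Calculator:
    # No inheritance from Adder: matching is purely structural.
    def add(self, a: int, b: int) -> int:
        return a + b


def total(adder: Adder) -> int:
    return adder.add(2, 3)
```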
It’s fairly unambiguous what adherents to this Protocol should do, and anyone
implementing such a thing should be able to clearly tell that the method is
supposed to add a couple of integers together; there’s nothing hidden about the
structure of the integers, no constraints the type system won’t let us specify.
It would be quite surprising if anything that didn’t have the intended behavior
would match this Protocol.
At the other end of the spectrum, we might have a plugin Interface that has a
lot of hidden structure. For this example, we have an Interface called
IPlugin containing a method with an easy-to-conflict-with name (“name”)
overloaded with very specific constraints on its return type: the string must
contain the dotted-path name of a Python object in an import-able module (like,
for example, "os.path.join").
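A sketch of what such an Interface declaration might look like, assuming zope.interface is installed. The docstring wording is illustrative; only the IPlugin and name identifiers come from the description above.

```python
from zope.interface import Interface


class IPlugin(Interface):
    def name():  # methods on an Interface are declared without self
        """
        Return the dotted-path name of an importable Python object,
        for example "os.path.join".
        """
```

The constraint on the returned string lives only in the docstring; nothing in the signature can express it.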
A Protocol with the same shape would match any object that happens to have a
name method returning a string, whether or not that string means what the
plugin system expects.
With Protocols, you can work around these limitations by manually making
it harder to match; adding elements to the structure that embed names relevant
to its semantics and thereby making the type behave more as if it were
nominally typed.
You could make the method’s name long and ugly instead (plugin_name_to_load,
let’s say) or add unused additional attributes (yep_i_am_a_plugin:
Literal[True]) in order to reduce the risk of accidental matches, but these
workarounds look hacky, and they have to be manually namespaced; if you want to
mark it as having semantics associated with your specific plugin system, you
have to embed the name of that system in your attributes themselves; here we’re
just saying “plugin” but if we want to be truly careful, we have to embed the
whole name of our project in there.
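To make the workaround concrete, here is a sketch of a Protocol padded with such members. Plugin, JoinPlugin, Person and load are hypothetical names; the attribute and method names are the ones suggested above.

```python
from typing import Literal, Protocol


class Plugin(Protocol):
    # Marker attribute whose only job is to make accidental matches unlikely.
    yep_i_am_a_plugin: Literal[True]

    def plugin_name_to_load(self) -> str: ...


class JoinPlugin:
    yep_i_am_a_plugin: Literal[True] = True

    def plugin_name_to_load(self) -> str:
        return "os.path.join"


class Person:
    # Has a name() method, but none of the Plugin-specific members.
    def name(self) -> str:
        return "Alice"


def load(plugin: Plugin) -> str:
    return plugin.plugin_name_to_load()


print(load(JoinPlugin()))  # accepted by a type checker
# load(Person())           # rejected by a type checker: wrong shape
```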
With Interfaces, the maintainer of each implementation must explicitly opt
in, by choosing whether to specify that they are an @implementer(IPlugin).
Since they had to import IPlugin from somewhere, this annotation carries
with it a specific, namespaced declaration of semantic intent: “I know what
the Interface IPlugin means, and I promise that I can provide it”.
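A sketch of that opt-in, assuming zope.interface is installed. JoinPlugin and Person are hypothetical classes, and IPlugin is restated so the snippet runs on its own.

```python
from zope.interface import Interface, implementer


class IPlugin(Interface):
    def name():
        """Return the dotted-path name of an importable Python object."""


@implementer(IPlugin)
class JoinPlugin:
    def name(self):
        return "os.path.join"


class Person:
    # Same method name and return type, but no declaration of intent.
    def name(self):
        return "Alice"


assert IPlugin.implementedBy(JoinPlugin)
assert IPlugin.providedBy(JoinPlugin())
assert not IPlugin.providedBy(Person())
```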
This is the most salient distinction between Protocols and Interfaces: if
you have strong reasons to want adherents to the abstract type to opt in, you
want an Interface; if you want them to match automatically, you want a
Protocol.
Runtime support
Interfaces also provide a more nuanced set of runtime checks.
You can say that an object directlyProvides an interface, allowing for some
level of (at least runtime) type safety, and ask if IPlugin is .providedBy
some object.
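A sketch of a per-object declaration, assuming zope.interface is installed. The Thing class and its instances are hypothetical, and IPlugin is restated so the snippet is self-contained.

```python
from zope.interface import Interface, directlyProvides


class IPlugin(Interface):
    def name():
        """Return the dotted-path name of an importable Python object."""


class Thing:
    def __init__(self, dotted):
        self._dotted = dotted

    def name(self):
        return self._dotted


plain = Thing("not.a.plugin")
marked = Thing("os.path.join")
directlyProvides(marked, IPlugin)  # declares this one instance only

assert IPlugin.providedBy(marked)
assert not IPlugin.providedBy(plain)
```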
You can do most of this with Protocol, but it’s awkward. The
@runtime_checkable decorator allows your Protocol to make
isinstance(x, MyProtocol) work like IMyInterface.providedBy(x), but:
- you’re still missing directlyProvides; the runtime checking is all by type, not by the individual properties of the instance;
- it’s not the default, so if you’re not the one defining the Protocol, there’s no guarantee you’ll be able to use it.
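A short sketch of both behaviors, using hypothetical names (Named, Plain, Person, Odd). Note also that the runtime check only looks for the presence of the named members, not their signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Named(Protocol):
    def name(self) -> str: ...


class Plain(Protocol):  # not decorated with @runtime_checkable
    def name(self) -> str: ...


class Person:
    def name(self) -> str:
        return "Alice"


class Odd:
    # Wrong signature and return type, but the member exists.
    def name(self, a, b, c):
        return 42


assert isinstance(Person(), Named)
assert isinstance(Odd(), Named)  # only member presence is checked

try:
    isinstance(Person(), Plain)
except TypeError:
    plain_check_failed = True  # undecorated protocols refuse isinstance()
else:
    plain_check_failed = False
```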
With Interfaces, there’s also no mandatory relationship between the
implementer (i.e. the type whose instances fit the specified shape) and the
provider (the specific object which can fit the specified shape). This means
you get features like classProvides and moduleProvides “for free”.
Interfaces work particularly well for communication between frameworks and
application code. For example, let’s say you’re evolving the meaning of an
Interface implemented by applications over time (EventHandler,
EventHandler2, EventHandler3), where the versions have similarly named and typed
methods, but subtly different expectations on their lifecycle or when precisely
the methods will be called. A framework facing this problem can use a series
of Interfaces, check at runtime to see which of these the application
implements, and be secure in the knowledge that the application has
deliberately adopted the new interface, and doesn’t just happen to have a
matching method name against an older version.
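A sketch of such a framework-side check, assuming zope.interface is installed. The handle method, the application classes and dispatch are hypothetical; only the EventHandler naming follows the paragraph above.

```python
from zope.interface import Interface, implementer


class EventHandler(Interface):
    def handle(event):
        """Called once per event, possibly before startup completes."""


class EventHandler2(Interface):
    def handle(event):
        """Called once per event, only after startup completes."""


@implementer(EventHandler)
class LegacyApp:
    def handle(self, event):
        return "legacy:" + event


@implementer(EventHandler2)
class ModernApp:
    def handle(self, event):
        return "modern:" + event


def dispatch(app, event):
    # Prefer the newest contract the application has explicitly declared.
    if EventHandler2.providedBy(app):
        return "v2 " + app.handle(event)
    if EventHandler.providedBy(app):
        return "v1 " + app.handle(event)
    raise TypeError("application declares no known event handler interface")
```

Both applications have an identical handle signature, so a structural check could not tell them apart; the declared interface is what carries the version.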
Finally, zope.interface gives you adaptation and adapter registries, which
can be a useful mechanism for doing things like templating, like a much more
powerful version of singledispatch from the standard library.
Adapter registries are nuanced, complex tools and unfortunately an example that
captures the full utility of their power would itself be commensurately
complex. However, the core of adaptation is the idea that if you have an
arbitrary object x, and you want a provider of the interface IY, you can do
the following:
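A minimal sketch of that call, assuming zope.interface is installed. The why method and the AlreadyY class are hypothetical stand-ins; here x already provides IY, so the same object comes back.

```python
from zope.interface import Interface, implementer


class IY(Interface):
    def why():
        """Explain yourself."""


@implementer(IY)
class AlreadyY:
    def why(self):
        return "because"


x = AlreadyY()
y = IY(x)  # adaptation: "give me an IY for x"
assert y is x
```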
This performs a multi-stage check:
- If x already provides IY (either via implementer, provider, directlyProvides, classProvides, or moduleProvides), it’s simply returned; so you don’t need to special-case the case where you’ve already got what you want.
- If x has a __conform__(interface) method, it’ll be called with IY as the interface, and if __conform__ returns anything non-None that result will be returned from the call to IY.
- If IY has a specially-defined __adapt__ method, it can implement its own logic for this hook directly.
- Each globally-registered function in zope.interface’s adapter_hooks will be invoked to find a function that can transform x into an IY provider. Twisted has its own global registry in this list, which is what registerAdapter manipulates.
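The __conform__ stage from the list above can be sketched as follows, assuming zope.interface is installed. Legacy, YForLegacy and Unrelated are hypothetical classes; the last line shows the optional second argument, which is returned instead of raising TypeError when no stage succeeds.

```python
from zope.interface import Interface, implementer


class IY(Interface):
    def why():
        """Explain yourself."""


@implementer(IY)
class YForLegacy:
    def __init__(self, legacy):
        self.legacy = legacy

    def why(self):
        return "wrapped " + self.legacy.label


class Legacy:
    # Does not provide IY itself, but knows how to produce something that does.
    label = "legacy"

    def __conform__(self, interface):
        if interface is IY:
            return YForLegacy(self)
        return None


class Unrelated:
    pass


adapted = IY(Legacy())
assert IY.providedBy(adapted)
assert adapted.why() == "wrapped legacy"

# With a default supplied, a failed adaptation returns it rather than raising.
assert IY(Unrelated(), None) is None
```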
But from the perspective of the caller, you can just say “I want an IY”.
With Protocols, you can emulate this with functools.singledispatch
by making a function which returns your Protocol type and registers various
types to do conversion. The place that adapter registries have an advantage is
their central nature and consistent idiom for converting to the target type;
you can use adaptation for any Interface in the same way, and any type can
participate in adaptation in the ways listed above via flexible mechanisms
depending on where it makes sense to put your implementation, whereas any
singledispatch function to convert to a Protocol needs to be bespoke
per-Protocol.
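A sketch of that emulation using only the standard library. Renderable, Text and to_renderable are hypothetical names; a second Protocol would need its own, separately written conversion function.

```python
from functools import singledispatch
from typing import Protocol


class Renderable(Protocol):
    def render(self) -> str: ...


class Text:
    def __init__(self, value: str) -> None:
        self.value = value

    def render(self) -> str:
        return self.value


@singledispatch
def to_renderable(obj: object) -> Renderable:
    # Fallback when no conversion has been registered for type(obj).
    raise TypeError(f"cannot convert {type(obj).__name__} to Renderable")


@to_renderable.register
def _from_str(obj: str) -> Renderable:
    return Text(obj)


@to_renderable.register
def _from_int(obj: int) -> Renderable:
    return Text(str(obj))
```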
Describing and restricting existing shapes
There are still several scenarios where Protocol’s semantics apply more
cleanly.
Unlike Interfaces, Protocols can describe the types of things that already
exist. To see when that’s an advantage, consider a sprawling application that
uses tons of libraries and manipulates 3D spatial data points.
There’s a convention among these disparate libraries where they all represent a
“point” as an object with .x, .y, and .z attributes which are all
floats. This is a natural enough shape, given the domain, that lots of your
libraries just fit it by accident. You want to write functions that can work
with data output by any of these libraries as long as it plausibly looks like
your own concept of a Point:
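A sketch of such a Point, with a hypothetical LibraryVertex standing in for a type from a third-party library and a hypothetical distance_from_origin as the application function that consumes it.

```python
from dataclasses import dataclass
from typing import Protocol


class Point(Protocol):
    x: float
    y: float
    z: float


@dataclass
class LibraryVertex:
    # Stand-in for a type defined by some library that has never heard of Point.
    x: float
    y: float
    z: float


def distance_from_origin(p: Point) -> float:
    return (p.x ** 2 + p.y ** 2 + p.z ** 2) ** 0.5
```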
In this case, the thing defining the Protocol is your application; the
thing implementing the Protocol is your collection of libraries. Since the
libraries don’t and can’t know about the application (the dependency arrow
points the other way), they can’t reference the Protocol to note that they
implement it.
Using Protocol, you can also restrict an existing type to preserve future
flexibility.
For example, let’s say we’re implementing a “mailbox” type pattern, where some
systems deliver messages and other systems retrieve them later. To avoid
mix-ups, the system that sends the messages shouldn’t retrieve them and vice
versa - receivers only receive, and senders only send. With Protocols, we
can describe this without having any new custom concrete types, like so:
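A sketch of two such Protocols. The Sender and Receiver names come from the discussion that follows; the method names add and pop are an assumption chosen to line up with the built-in set, and the type-variable names are hypothetical.

```python
from typing import Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


class Sender(Protocol[T_contra]):
    # Positional-only parameter, so implementations may name it differently.
    def add(self, item: T_contra, /) -> None:
        """Put an item in the mailbox."""
        ...


class Receiver(Protocol[T_co]):
    def pop(self) -> T_co:
        """Take an item out of the mailbox."""
        ...
```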
All of that code is just telling Mypy our intentions; there’s no behavior here yet.
The actual implementation is even shorter:
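A sketch of one possible implementation, under the assumption that Sender exposes add and Receiver exposes pop. The two protocols are restated compactly so the snippet runs on its own, and mailbox is a hypothetical function name.

```python
from typing import Protocol, Set, Tuple, TypeVar

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


class Sender(Protocol[T_contra]):
    def add(self, item: T_contra, /) -> None: ...


class Receiver(Protocol[T_co]):
    def pop(self) -> T_co: ...


def mailbox() -> Tuple[Sender[T], Receiver[T]]:
    # One set, handed out twice under two deliberately narrower types.
    s: Set[T] = set()
    return s, s
```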
Literally no code of our own - set already does the job we described. And
how do we use this?
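A usage sketch under the same assumptions (Sender.add, Receiver.pop, and a hypothetical mailbox function), with the definitions restated so the snippet is runnable by itself. The send and receive functions follow the names used in the next paragraph.

```python
from typing import Protocol, Set, Tuple, TypeVar

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


class Sender(Protocol[T_contra]):
    def add(self, item: T_contra, /) -> None: ...


class Receiver(Protocol[T_co]):
    def pop(self) -> T_co: ...


def mailbox() -> Tuple[Sender[T], Receiver[T]]:
    s: Set[T] = set()
    return s, s


def send(sender: Sender[int]) -> None:
    sender.add(3)
    # sender.pop()  # a type checker rejects this: Sender has no pop()


def receive(receiver: Receiver[int]) -> int:
    return receiver.pop()


# Declared up front so the checker knows which T the mailbox carries.
sender: Sender[int]
receiver: Receiver[int]
sender, receiver = mailbox()

send(sender)
print(receive(receiver))
```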
For its initial implementation, this system requires nothing beyond types
available in the standard library; just a set. However, by treating their
parameter as a Sender and a Receiver respectively rather than a Set,
send and receive prevent themselves from using any functionality from the
set passed in aside from the one method that their respective roles are
supposed to “see”. As a result, Mypy will now tell us if any code which
receives the sender object tries to remove objects.
This allows us to use existing data structures in libraries without the usual attendant problem of advertising to all clients that every tiny implementation detail of those existing structures is an intended part of the public interface. Python has always tried to make these sorts of distinctions by leaving certain things undocumented or saying narratively which things you should rely on, but it’s always hit-or-miss (usually miss) whether library consumers will see those admonitions or not. By making it a feature of the programming environment, Mypy makes it harder to ignore.
Conclusions
In modern Python code, when you have an abstract collection of behavior, you
should probably consider using a Protocol to describe it by default.
However, Interface is also staying up to date with modern Python tooling
with Mypy support, and it can be worthwhile for more sophisticated consumers
that want support for nominal typing, or that want to draw on its rich
adaptation and component registration feature-set.