This post recommends calling pygame.display.flip from a thread. I tested that extensively on macOS, Windows, and Linux before posting, but after some feedback from readers I realize that this strategy is not in fact cross-platform: specifically, the Nvidia drivers on Linux appear to either crash or display a black window if you try it. The SDL FAQ does say that you can’t call “video functions” from multiple threads, and flip does do that under the hood. I do plan to update this post again, either with a way to make this safe or with a way to use slightly more complex timing heuristics to accomplish the same thing. In the meantime, please be aware that this may cause portability problems for your code.
I’ve written about this before, but in that context I was writing mainly about frame-rate independence, and only gave a brief mention of vertical sync. The title also mentioned Twisted, and upon re-reading it I realized that many folks who might get a lot of use out of its technique would not have bothered to read it, because I made it sound like an aside about an animation technique for a game that already wanted to use Twisted for some reason, rather than a comprehensive best practice. Now that Pygame 2.0 is out, though, and the vsync=1 flag is more reliably available to everyone, I thought it would be worth revisiting.
Per the many tutorials out there, including the official one, most Pygame mainloops look like this:
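A minimal sketch of that idiom, assuming a hypothetical `handle_event` function and a caller-supplied list of `drawables` (neither is taken from any particular tutorial):

```python
def handle_event(event):
    """Hypothetical placeholder: react to one input event."""


def main(drawables=()):
    # Imported here so this sketch can be loaded without pygame installed.
    import pygame

    pygame.init()
    screen = pygame.display.set_mode((320, 240))
    running = True
    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
            handle_event(event)
        for drawable in drawables:
            drawable.draw(screen)
        # Nothing here waits for the display, so the loop spins as fast as it can.
        pygame.display.flip()
    pygame.quit()
```

Calling `main()` opens the window and runs until it is closed.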
Obviously that works okay, or folks wouldn’t do it, but it can give an impression of a certain lack of polish for most beginner Pygame games.
The thing that’s always bothered me personally about this idiom is: where does the networking go? After spending many years trying to popularize event loops in Python, I’m sad to see people implementing loops over and over again that have no way to get networking, or threads, or timers scheduled in a standard way so that libraries could be written without the application needing to manually call them every frame.
But, who cares how I feel about it? Lots of games don’t have networking[1]. There are more general problems with it. Specifically, it is likely to:
- waste power, and
- look bad.
Wasting Power
Why should anyone care about power when they’re making a video game? Aren’t games supposed to just gobble up CPUs and GPUs for breakfast, burning up as much power as they need for the most gamer experience possible?
Chances are, if you’re making a game that you expect anyone that you don’t personally know to play, they’re going to be playing it on a laptop[2]. Pygame might have a reputation for being “slow”, but for a simple 2D game with only a few sprites, Python can easily render several thousand frames per second. Even the fastest display in the world can only refresh at 360Hz[3]. That’s less than one thousand frames per second. The average laptop display is going to be more like 60Hz, or maybe 120 if you’re lucky. By rendering thousands of frames that the user never even sees, you warm up their CPU uncomfortably[4], and you use 10x (or more) as much of their battery doing useless work.
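To put rough numbers on that, here is a back-of-the-envelope calculation. The 3,000 frames-per-second render rate is an illustrative assumption standing in for “several thousand”, not a measurement:

```python
def wasted_fraction(rendered_fps, refresh_hz):
    """Fraction of rendered frames the display never gets a chance to show."""
    if rendered_fps <= refresh_hz:
        return 0.0
    return 1.0 - refresh_hz / rendered_fps


for hz in (60, 120, 360):
    share = wasted_fraction(3000, hz)
    print(f"{hz:>3} Hz display: {share:.0%} of rendered frames never shown")
```

Under that assumption, even the 360Hz display discards the large majority of the frames the loop produces.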
At some point your game might have enough stuff going on that it will run the CPU at full tilt, and if it does, that’s probably fine; at least then you’ll be using up that heat and battery life in order to make their computer do something useful. But even if it is, it’s probably not doing that all of the time, and battery is definitely a use-over-time sort of problem.
Looking Bad
If you’re rendering directly to the screen without regard for vsync, your players are going to experience Screen Tearing, where the screen is in the middle of updating while you’re in the middle of drawing to it. This looks especially bad if your game is panning over a background, which is a very likely scenario for the usual genre of 2D Pygame game.
How to fix it?
Pygame lets you turn on VSync, and in Pygame 2, you can do this simply by passing the pygame.SCALED flag and the vsync=1 argument to set_mode().
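In code, that call might look like the following sketch. The `open_vsync_window` wrapper and the window size are my own placeholders; the flag and the argument are the ones named above:

```python
def open_vsync_window(size=(320, 240)):
    # Imported here so this sketch can be loaded without pygame installed.
    import pygame

    pygame.init()
    # vsync=1 is a request rather than a guarantee, and pygame only honors it
    # in combination with the SCALED (or OPENGL) flag.
    return pygame.display.set_mode(size, flags=pygame.SCALED, vsync=1)
```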
Now your game will have silky smooth animations and scrolling[5]! Solved!
But... if the fix is so simple, why doesn’t everybody — including, notably, the official documentation — recommend doing this?
The solution creates another problem: pygame.display.flip may now block until the next display refresh, which may be many milliseconds.
Even worse: note the word “may”. Unfortunately, the behavior of vsync is quite inconsistent between platforms and drivers, so for a properly cross-platform game it may be necessary to allow the user to select a frame rate and wait on an asyncio.sleep rather than running flip in a thread. Using the techniques from the answers to this Stack Overflow question you can establish a reasonable heuristic for the refresh rate of the relevant display, but if adding those libraries and writing that code is too complex, “60” is probably a good enough value to start with, even if the user’s monitor can go a little faster. This might save a little power even in the case where you can rely on flip to tell you when the monitor is actually ready again; if your game can only reliably render 60FPS anyway because there’s too much Python game logic going on to consistently go faster, it’s better to achieve a consistent but lower framerate than to be faster but inconsistent.
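As a sketch of that fallback, the loop below paces frames with asyncio.sleep against a chosen target rate instead of relying on flip to block. The names `paced_frames` and `draw_frame` are hypothetical, and the default of 60 is the guess suggested above rather than a detected refresh rate:

```python
import asyncio


async def paced_frames(draw_frame, fps=60.0, frames=None):
    """Call draw_frame() about `fps` times per second, yielding between frames."""
    loop = asyncio.get_running_loop()
    frame_duration = 1.0 / fps
    next_deadline = loop.time()
    drawn = 0
    while frames is None or drawn < frames:
        draw_frame()
        drawn += 1
        # Use absolute deadlines so small timing errors do not accumulate.
        next_deadline += frame_duration
        delay = next_deadline - loop.time()
        if delay > 0:
            await asyncio.sleep(delay)
        else:
            # Running late: restart the schedule from now instead of catching up.
            next_deadline = loop.time()
            await asyncio.sleep(0)
    return drawn
```

Because the wait is an `await`, other tasks on the same event loop get to run between frames.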
The potential for blocking needs to be dealt with though, and it has several knock-on effects.
For one thing, it makes my “where do you put the networking” problem even worse: most networking frameworks expect to be able to send more than one packet every 16 milliseconds.
More pressingly for most Pygame users, however, it creates a minor performance headache. You now spend a bunch of time blocked in the now-blocking flip call, wasting precious milliseconds that you could be using to do stuff unrelated to drawing, like handling user input, updating animations, running AI, and so on.
The problem is that your Pygame mainloop has 3 jobs:
- drawing
- game logic (AI and so on)
- input handling
What you want to do to ensure the smoothest possible frame rate is to draw everything as fast as you possibly can at the beginning of the frame and then call flip immediately to be sure that the graphics have been delivered to the screen and they don’t have to wait until the next screen-refresh. However, this is at odds with the need to get as much done as possible before you call flip and possibly block for 1/60th of a second.
So either you put off calling flip, potentially risking a dropped frame if your AI is a little slow, or you call flip too eagerly and waste a bunch of time waiting around for the display to refresh. This is especially true of things like animations, which you can’t update before drawing, because you have to draw this frame before you worry about the next one, but waiting until after flip wastes valuable time; by the time you are starting your next frame draw, you possibly have other code which now needs to run, and you’re racing to get it done before that next flip call.
Now, if your Python game logic is actually saturating your CPU (which is not hard to do), you’ll drop frames no matter what. But there are a lot of marginal cases where you’ve mostly got enough CPU to do what you need to without dropping frames, and it can be a lot of overhead to constantly check the clock to see if you have enough frame budget left to do one more work item before the frame deadline, or, for that matter, to maintain a workable heuristic for exactly when that frame deadline will be.
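For comparison, here is a minimal sketch of that manual bookkeeping, assuming a hypothetical queue of small work items and a deadline computed by the caller:

```python
import time
from collections import deque


def run_until_deadline(work_items, deadline, clock=time.monotonic):
    """Run queued callables until the queue is empty or the deadline passes."""
    done = 0
    # The clock is consulted before every single item; that is the overhead.
    while work_items and clock() < deadline:
        work_items.popleft()()
        done += 1
    return done


queue = deque([lambda: None] * 1000)
frame_budget = 1 / 60  # assumes a 60Hz display
finished = run_until_deadline(queue, time.monotonic() + frame_budget)
```

Anything left in `queue` has to be carried over to the next frame, and the deadline itself is still only a guess at when the display will refresh.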
The technique to avoid these problems is deceptively simple, and in fact it was covered with the deferToThread trick presented in my earlier post. But again, we’re not here to talk about Twisted. So let’s do this the no-additional-dependencies, stdlib-only way, with asyncio:
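What follows is a minimal sketch of the approach rather than a drop-in implementation. The names `frame_loop`, `should_stop`, and `main` are my own, the window size is arbitrary, and calling flip off the main thread carries the portability caveat noted at the top of this post:

```python
import asyncio


async def frame_loop(draw, get_events, handle_event, flip, should_stop):
    """Draw, then overlap event handling with a flip() running in a worker thread."""
    loop = asyncio.get_running_loop()

    async def handle_events(events):
        for event in events:
            handle_event(event)

    drawn = 0
    while not should_stop():
        draw()
        # Collect input on the main thread before handing off to the worker.
        events = list(get_events())
        handling = loop.create_task(handle_events(events))
        # flip() may block until the next refresh; waiting for it in the default
        # executor leaves the event loop free for timers, networking, and input.
        await loop.run_in_executor(None, flip)
        # Do not start the next draw until this frame's events are handled.
        await handling
        drawn += 1
    return drawn


async def main():
    # Imported here so frame_loop can be loaded without pygame installed.
    import pygame

    pygame.init()
    screen = pygame.display.set_mode((480, 255), flags=pygame.SCALED, vsync=1)
    quit_seen = False

    def draw():
        screen.fill((0, 0, 0))  # placeholder for real drawing code

    def handle_event(event):
        nonlocal quit_seen
        if event.type == pygame.QUIT:
            quit_seen = True

    await frame_loop(
        draw, pygame.event.get, handle_event, pygame.display.flip, lambda: quit_seen
    )
    pygame.quit()
```

Start it with `asyncio.run(main())`. Any other coroutine scheduled on the same loop runs while flip is waiting on the display.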
Go Forth and Loop Better
At some point I will probably release my own wrapper library[6] which does something similar to this, but I really wanted to present this as a technique rather than as some packaged-up code to use, since do-it-yourself mainloops, and keeping dependencies to a minimum, are such staples of Pygame community culture.
As you can see, this technique is only a few lines longer than the standard recipe for a Pygame main loop, but you now have access to a ton of additional functionality:
- You can manage your framerate independence in both animations and game logic by just setting some timers and letting the frames update at the appropriate times; stop worrying about doing math on the clock by yourself!
- Do you want to add networked multiplayer? No problem! Networking all happens inside the event loop, so make whatever network requests you want, and never worry about blocking the game’s drawing on a network request!
- Now your players’ laptops run cool while playing, and the graphics don’t have ugly tearing artifacts any more!
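As one illustration of the first point, an animation can advance on its own timer as a task on the same event loop, independent of how often frames are drawn. `Sprite` and `animate` are hypothetical names used only for this sketch:

```python
import asyncio


class Sprite:
    def __init__(self, frame_count):
        self.frame_count = frame_count
        self.frame = 0


async def animate(sprite, seconds_per_frame, steps):
    """Advance the sprite's animation frame on a fixed timer, `steps` times."""
    for _ in range(steps):
        await asyncio.sleep(seconds_per_frame)
        sprite.frame = (sprite.frame + 1) % sprite.frame_count
```

Scheduling it with `asyncio.create_task(animate(sprite, 0.1, 10))` from inside the running loop lets the drawing code simply read `sprite.frame` whenever it draws.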
I really hope that this sees broader adoption so that the description “indie game made in Python” will no longer imply “runs hot and tears a lot when the screen is panning”. I’m also definitely curious to hear from readers, so please let me know if you end up using this technique to good effect![7]
[1] And, honestly, a few fewer could stand to have it, given how much unnecessary always-online stuff there is in single-player experiences these days. But I digress. That’s why I’m in a footnote, this is a good place for digressing.
[2] “Worldwide sales of laptops have eclipsed desktops for more than a decade. In 2019, desktop sales totaled 88.4 million units compared to 166 million laptops. That gap is expected to grow to 79 million versus 171 million by 2023.”
[3] At least, Nvidia says that “the world’s fastest esports displays” are both 360Hz and also support G-Sync, and who am I to disagree?
[4] They’re playing on a laptop, remember? So they’re literally uncomfortable.
[5] Assuming you’ve made everything frame-rate independent, as mentioned in the aforementioned post.
[6] because of course I will
[7] And also, like, if there are horrible bugs in this code, so I can update it. It is super brief and abstract to show how general it is, but that also means it’s not really possible to test it as-is; my full-working-code examples are much longer and it’s definitely possible something got lost in translation.