colugomusic

Dev Log #78 - Optimization work (boring) (Patreon)

Published:

2023-12-09 19:31:08

Imported:

2024-02

Tags:

devlog

Content

It makes me cringe a bit any time I see a video of someone using Blockhead and they start zooming or panning around in any marginally complex workspace, and inevitably the interface starts updating at about 5fps. After uploading the last build I decided to take a quick look at things to figure out where the main performance issues are. That "quick look" turned into two and a half weeks of optimization work, which is still ongoing.

In the past I have posted a bit about trying to optimize things but only now do I feel like I actually have a decent grasp on what I'm doing. In the latest build there are two places where performance problems are occurring:

1. In the Godot scene tree
2. In the Blockhead code

1. In the Godot scene tree

The Godot bottleneck is very easy to see by profiling. When the user moves the workspace around I calculate the new positions and sizes of all the visible blocks and send them to Godot. At some point in the call stack I am calling Godot's Control::set_size() and Control::set_global_position() functions and I can see in the profiler those functions are taking a lot of time.

Moving a whole bunch of objects around is usually the kind of thing that game engines can do really quickly, but importantly these are Godot control nodes I am moving around, not Spatial/Node2D nodes. Control nodes are great for complicated custom UI elements, but they are really intended for things like game menu screens and inventory systems. Blockhead isn't a game though, and I'm not really using Godot in a typical way. Crucially, every block is implemented using control nodes, some of the block types are pretty complex, and a typical Blockhead workspace can have many, many, many blocks on the screen at once.

The Godot editor is really good for quickly prototyping UI ideas and then adapting those prototypes into working implementations. That is a double-edged sword, because it's also very easy to end up with deeply nested scene hierarchies, and ultimately the more complex your control scenes are, the longer those calls to set_size()/set_global_position() are going to take. I took a quick look at the Godot source out of curiosity, and from what I can tell, any time a control node is resized it recursively visits every one of its child controls to recalculate things, even if they are not currently visible. So it makes sense that the more complex a block scene is, the slower it is to resize.
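To make that cost concrete, here is a toy model of a resize that walks the whole subtree. It is only a sketch: `ToyControl`, `resize_recursive` and `make_tree` are hypothetical names, not Godot's actual classes or code, and the behaviour is modelled on the "from what I can tell" reading above rather than on a verified description of the engine.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a UI control; NOT Godot's Control class.
struct ToyControl {
    bool visible = true;
    std::vector<std::unique_ptr<ToyControl>> children;
};

// Models a resize that revisits every descendant, whether it is visible or not.
// Returns the number of controls touched by one resize of `c`.
std::size_t resize_recursive(const ToyControl& c) {
    std::size_t visited = 1;  // the control itself
    for (const auto& child : c.children) {
        visited += resize_recursive(*child);  // visibility is deliberately ignored
    }
    return visited;
}

// Builds a tree with `depth` levels below the root and `fanout` children per node.
std::unique_ptr<ToyControl> make_tree(int depth, int fanout) {
    auto node = std::make_unique<ToyControl>();
    if (depth > 0) {
        for (int i = 0; i < fanout; ++i) {
            node->children.push_back(make_tree(depth - 1, fanout));
        }
    }
    return node;
}
```

In this model a block scene with two levels of three children each costs 13 visits per resize, and hiding part of the tree does not reduce that number. Multiply by the number of blocks on screen and the per-frame cost grows quickly.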

The good news is that I am currently using control nodes for all sorts of things for which they are not actually necessary, so I think there are a lot of easy wins to take. For example, the plugin icon on the synth/effect blocks, the border that is drawn around blocks, and many other visual elements are all currently rendered using control nodes. Even the background color of every block is drawn using a ColorRect control node. These kinds of visuals can instead be rendered entirely in code using Godot's VisualServer interface, which in theory should be much faster. With a bit of time and effort I should be able to strip down the complexity of the block scenes quite a bit by migrating as much as possible over to the VisualServer.

2. In the Blockhead code

There is also a more subtle performance issue which doesn't really show up when profiling, and that is cache locality. Blockhead from the start was written in a pretty object-oriented style, which tends to be bad for cache locality when processing large numbers of objects. Before passing the new position and size of each block to Godot, those values first have to be calculated. I try to filter out invisible blocks as early as possible and do the bare minimum amount of work to calculate everything, but at the end of the day, when you're iterating through an array of pointers to heap-allocated objects, things aren't going to be super fast.

So for the past two and a half weeks the main thing I have been doing is moving a bunch of data out of Blockhead's object-oriented world into a more data-oriented model, organizing everything so that the geometry of the workspace can be calculated as quickly as possible.
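As a rough illustration of the difference between the two layouts, here is a minimal sketch. All of the names (`BlockObject`, `BlockGeometry`, `update_geometry_oo`, `update_geometry`) and the scroll/zoom formula are assumptions made up for the example; Blockhead's real data model is not shown in this post.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Object-oriented layout (hypothetical): every block is its own heap allocation,
// so walking the list means chasing a pointer per block.
struct BlockObject {
    float start = 0.0f;         // position in workspace units
    float length = 0.0f;        // length in workspace units
    float screen_x = 0.0f;      // computed output
    float screen_width = 0.0f;  // computed output
    // ...plus whatever unrelated state the object carries around
};

void update_geometry_oo(std::vector<std::unique_ptr<BlockObject>>& blocks,
                        float scroll, float zoom) {
    for (auto& b : blocks) {
        b->screen_x = (b->start - scroll) * zoom;
        b->screen_width = b->length * zoom;
    }
}

// Data-oriented layout (hypothetical): one contiguous array per field, so the
// loop reads and writes memory sequentially.
struct BlockGeometry {
    std::vector<float> start;
    std::vector<float> length;
    std::vector<float> screen_x;      // computed output
    std::vector<float> screen_width;  // computed output
};

void update_geometry(BlockGeometry& g, float scroll, float zoom) {
    const std::size_t n = g.start.size();
    g.screen_x.resize(n);
    g.screen_width.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        g.screen_x[i] = (g.start[i] - scroll) * zoom;
        g.screen_width[i] = g.length[i] * zoom;
    }
}
```

Both functions compute the same values. The second one just keeps the inputs and outputs packed together, which is the property the cache cares about.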

The main 'objects' of a Blockhead project are:

Workspaces
Tracks
Lanes
Blocks
Block instances

Workspaces, tracks and lanes are now no longer objects (in the object-oriented sense) at all. Instead they are just represented by unique IDs which are used to look up data from the model. Blocks and block instances have also had most of their guts ripped out and moved into the data-oriented model but their 'object' counterparts still exist for now. They will likely be refactored away entirely in the future but it's not as easy as it was for the other types of object.
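A minimal sketch of what "just an ID used to look up data from the model" can look like is below. The names `TrackID`, `LaneID`, `Model`, `add_track`, `add_lane` and `track_height` are all hypothetical, and using the ID directly as an index into a vector is an assumption made to keep the example short; the post does not say how the real lookup is implemented.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical ID types: a track or lane is nothing but a number.
using TrackID = std::uint32_t;
using LaneID = std::uint32_t;

struct TrackData {
    std::vector<LaneID> lanes;  // lanes belonging to this track
};

struct LaneData {
    TrackID track;  // owning track
    float height;   // lane height in pixels (illustrative field)
};

// All state lives in the model; IDs are indices into its arrays.
struct Model {
    std::vector<TrackData> tracks;
    std::vector<LaneData> lanes;

    TrackID add_track() {
        tracks.push_back(TrackData{});
        return static_cast<TrackID>(tracks.size() - 1);
    }

    LaneID add_lane(TrackID track, float height) {
        lanes.push_back(LaneData{track, height});
        const auto id = static_cast<LaneID>(lanes.size() - 1);
        tracks.at(track).lanes.push_back(id);
        return id;
    }
};

// Free function instead of a Track::height() method: it takes the model and an ID.
float track_height(const Model& m, TrackID track) {
    float total = 0.0f;
    for (LaneID lane : m.tracks.at(track).lanes) {
        total += m.lanes.at(lane).height;
    }
    return total;
}
```

The point of the design is that there is no `Track` or `Lane` class left to allocate. Anything that used to be a method becomes a function over the model plus an ID.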

Doing all this was a lot of work and involved deliberately breaking the code-base and ripping out huge sections of code so that they could be rebuilt and replaced. For a lot of the time the code was not compiling since it's sort of an "all or nothing" change. I had a bit of dread in the pit of my stomach while I was working on this since this "cache locality" idea was really just a theory that I had based on intuition and things I have read. I had no way to really know if I was correct until the code was all compiling and working again so I could test it.

Happily, today I finally got to the point where things are compiling again, and there does seem to be a clear performance improvement (both from profiling and from just visual testing and comparing to v0.32.0). So it was not wasted time after all.

You may have noticed that when blocks get small enough on the screen, Blockhead switches to a sort of "low-detail" mode where it just represents each one as a solid colored rectangle. This was mainly added for performance reasons, because rendering those colored rectangles is much faster than having the full control node visible. Some have complained about this though, as the threshold for switching to low-detail mode is quite high. As an experiment I decreased the threshold where Blockhead switches a block to low-detail from 32 to 16 pixels, and with the optimizations things are still relatively smooth, though it's definitely still possible to make things slow down if you push things.
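The switch itself can be sketched in a few lines. The names `DetailLevel`, `kLowDetailThresholdPx` and `choose_detail` are hypothetical, and it is an assumption that the threshold is a strict "smaller than" comparison against a single on-screen size in pixels; only the 32 and 16 pixel values come from the experiment described above.

```cpp
// Hypothetical detail levels for a block.
enum class DetailLevel {
    Low,   // solid colored rectangle
    Full,  // full control node scene
};

// Experimental threshold from the post (previously 32 pixels).
constexpr float kLowDetailThresholdPx = 16.0f;

// Picks the detail level from the block's on-screen size in pixels.
// Assumption: blocks strictly smaller than the threshold drop to low detail.
DetailLevel choose_detail(float on_screen_size_px,
                          float threshold_px = kLowDetailThresholdPx) {
    return on_screen_size_px < threshold_px ? DetailLevel::Low
                                            : DetailLevel::Full;
}
```

With the new default, a block that is 20 pixels on screen keeps its full detail, whereas with the old threshold of 32 it would have been drawn as a plain rectangle.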

I think continuing to move more data from the object-oriented world over to the data-oriented world might show more minor improvements but from here I think the main issue is the complexity of the block sub-scenes. Hopefully stripping down the number of control nodes nested inside each block will give the extra performance I'm looking for.