1 million items at 8 ms! Down from 100ms! Yay! \o/
The new 60 fps record is 1.7million! But How?
With two things:
- new data structure from 100ms to 32ms
- multithreading from 32ms to 8ms
The new data structure
Old data structure: Each belt does it’s thing
New data structure: belts get grouped together
- No more memory shuffling
- No more GC spikes
- Less code pathing
- Less iterations
- Less collision checks
Why didn’t I group them earlier? The thought crossed my mind but I didn’t want to implement an incomplete version that only works on exactly belts moving to the right. The same rulesets as in Factorio must be applied. Namely inserters taking or putting items on the belt, intersections from other belts and player interaction.
I couldn’t figure out a reduced version of this algorithm and my brain could only produce a lot of conditionals and dirty hacks.Both hard to maintain when things should change.
The final concept is really simple though. Intersection logic gets applied for building the groups. Each TransportBeltGroup has exactly 1 input and 1 output. Should this value exceed, a new TransportBeltGroup gets created. In the worst case it’s 1 per tile but you can’t help that fact either way. The path vector can be applied unless it’s the leading item or a previous item can’t be moved and has gone to sleep. A lot of expensive boundingBox.Contains were eliminated here.
With a safe method to handle intersections another data restriction could be eliminated.
The old data structure didn’t exploit the simplification of the spatial geometry. It was tile based with neighbours, so to apply each physic step a lot of iterations and jumping around in memory were necessary. Also every transportbelt could only hold 2 rows with 4 items. That decision came from not thinking through the intersection system, the quadtree coordinate system for rendering and the typical low lvl OOP approach. Every men/object for itself!
So moving 1 item to the next belt not only costs 1 remove, 1 add; there was also a deferred reorder system necessary because you couldn’t just move 1 item to the next belt. The next physic step catches it otherwise and you have double movement. You could reorder the physic step list starting from the last but then the whole logic with the trailing item that triggers all other items is broken. Or the more simpler version that I used, a dirty flag.
Naturally when items move in sync an increase of 1000 reorder items got added in the queue and that only got worse triggering the sensitive GC.
With the new grouping logic I could get rid of the small 2*4 arrays and replace them with normal Lists. Iteration was faster and code paths were simplified due to more known data. Most items can just be moved along the moveVector now.
The overall reduction was 68ms! It felt a little like cheating to reduce the iteration amount by 999000 but with a simple logic system, the needed data structure for this test scenario is realistic to achieve.
On to Multithreading!
Having a very coupled data format now, active Entities can be split up into
int.Floor(activeEntitiesCount / processorCount)
. In my case the processorCount is set to 6;2 for rendering.
Threading doesn’t get blocked as it runs self contained and syncs to 16.66667ms.
So the 1000 transportBeltGroups can be split into packs of 166 physic steps per thread. The last thread has to work a little more. This can surely be optimized further but it works well for now.
Not bad for one session of programming. Writing the article took me longer than the coding. Bringing coding thought processes into a 2nd language and having some form of structure that doesn’t feel like rambling without a point is harder than I imagined. 😀
There’s not much left to optimize now but I feel like I have proven enough how far Unity can be pushed without the need of C++, custom engines or trickery of unmanaged pointers.
It’s all in the architecture!
[The following is about software architecture philosophy and fun side projects ]
The most efficient architecture builds from the top. Creating from the top requires decisions based on classes and how they are
intertwined. Which relations do they have, what data do they need, how can you create this data and when to call as few methods as possible. Also a lot of grouping. Less objects means better scaling. I applied this thinking heavily in my other projects but somehow lost track in this one.
Sometimes I just get stuck in my own design and the only solution I’m having for that right now is to realize this “moment” in time. Taking a step back, rechecking my assumptions and try to come up with a better solution. Sometimes my brain is in tunnel vision though and no better solution comes along, hardening the tunnel vision even further. Low level optimization is usually the first warning symptom, then there’s GC spikes and refactoring. The evil nature of low level optimizations is also that even if you optimize a part like crazy you only get a few ms. One optimization I had went from 42ms to 38ms. Not bad but laughable compared to the data structure optimization. It also took longer to implement. 🙂
Oh well, the tree falls not at the first stroke.
Good architecture design is hard when you’re not familiar with the spatial design. It was my first 2d project on that scale. My other big projects where a WoW clone for arena matches and a planetary city builder, both with authoritative simulation servers in pure C# sockets and the other in NodeJS with Edge. (You might lol at that WTF NO C++? 🙂 ). The architecture is quite different as it lies in having the most efficient scaling, netcode and security. You are also pretty limited in the actual simulation because if it’s too complex the scaling is pretty bad leaving only room for either, not having a lot of entities to calculate or statistic based/simplified simulation. Both are heavy restrictions for game design and not fun as a base. Every mechanic has to be meticulously implemented, otherwise scale, security or netcode breaks.
Where to go from here?
I’ll most likely scrap the online part of the planetary city builder. Having no server to rely on, much more features and simulation steps can be implemented and the overall design can be thought on a bigger scale. Having a chain of events going on instead of a progress bar is much more fun to look at. It also creates more opportunities for the player to interact and manipulate the world resulting in more complex decision making. Everything leads to a better game.
Maybe I’ll write in more detail about the planetary city builder but with the drastic design change I’ll hold off for now. Coupled with an older project I have dubbed “Sandbox”, the opportunities, especially when it comes to modding and expanding the game are through the roof.
If I have a working prototype I’ll post more info.