



This image shows some results from our frame profiler in a particle-heavy scene.







The scene has 70 fire skeletons, all firing projectiles at the player. It makes a good test because it stresses the particle system with a lot of particles but doesn't consume much GPU fill rate, which is not the area we want to test here.



As you can see, with engine multithreading off the particle optimisation gives a 1.9X speedup, and with engine multithreading on it gives a 1.5X speedup. Both optimisations together give a significant 4.5X speedup in this scene compared to 2.3.0.
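To see how these figures relate, here is a small sketch that backs out the share of the 4.5X that comes from multithreading alone. It assumes all the figures come from the same scene, that the 4.5X compares both optimisations on against 2.3.0 with neither, and that the two speedups simply multiply. The resulting number is derived under those assumptions, not one reported in this post, and the names below are made up for illustration.

```cpp
// Figures quoted in the post for the fire skeleton test scene.
constexpr double kCombinedSpeedup = 4.5;      // both optimisations vs. 2.3.0
constexpr double kParticleSpeedupMtOn = 1.5;  // particle optimisation, multithreading on

// Assumption: speedups compose multiplicatively, so dividing the combined
// speedup by one factor leaves the other factor.
constexpr double implied_speedup(double combined, double known_factor) {
    return combined / known_factor;
}

// Under that assumption, multithreading on its own would account for
// roughly a 3X speedup in this scene.
constexpr double kImpliedMultithreadingSpeedup =
    implied_speedup(kCombinedSpeedup, kParticleSpeedupMtOn);
```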



To get these gains, we used the special-purpose AVX instructions, which have been available on CPUs since roughly 2011. AVX instructions let you apply the same set of mathematical operations to a larger set of data at the same time. For example, instead of calculating the velocity of one particle, we can calculate it for four particles at once with the same number of CPU instructions.
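As a rough illustration of the "four particles at once" idea, here is a minimal sketch of a velocity update written both one particle at a time and four at a time. It is not the engine's code: the function names and the flat array layout are assumptions, and it uses the 128-bit SSE intrinsics (four floats per register) rather than AVX so that it compiles on any x86-64 target without extra compiler flags. An AVX version would follow the same shape with wider registers.

```cpp
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE/SSE2 intrinsics
#endif

// Hypothetical layout: one velocity component per particle, stored
// contiguously so neighbouring particles can be loaded together.

// Scalar reference: one particle per loop iteration.
void integrate_scalar(float* velocity, const float* acceleration,
                      float dt, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        velocity[i] += acceleration[i] * dt;
    }
}

// SIMD version: four particles per loop iteration where SSE2 is available.
void integrate_simd(float* velocity, const float* acceleration,
                    float dt, std::size_t count) {
    std::size_t i = 0;
#if defined(__SSE2__)
    const __m128 dt4 = _mm_set1_ps(dt);  // dt copied into all four lanes
    for (; i + 4 <= count; i += 4) {
        __m128 v = _mm_loadu_ps(velocity + i);      // load 4 velocities
        __m128 a = _mm_loadu_ps(acceleration + i);  // load 4 accelerations
        v = _mm_add_ps(v, _mm_mul_ps(a, dt4));      // v += a * dt, 4 lanes
        _mm_storeu_ps(velocity + i, v);             // store 4 results
    }
#endif
    // Leftover particles (or every particle when SSE2 is unavailable).
    for (; i < count; ++i) {
        velocity[i] += acceleration[i] * dt;
    }
}
```

The multiply and add in the SIMD loop are the same two operations as in the scalar loop, but each one processes four particles.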



The particle subsystem by itself is roughly 4X faster using these instructions. For CPUs without AVX support, we also have an SSE2 implementation that is roughly 2X faster than before, which will still have a fairly significant effect on your frame rate.
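Choosing between an AVX path and an SSE2 path is usually done once at startup by asking the CPU what it supports. The sketch below shows one way that could look. The enum and function names are hypothetical, the scalar fallback is an assumption for illustration, and the detection relies on the GCC/Clang `__builtin_cpu_supports` builtin, so other compilers simply report the scalar path here. It is not a description of how the engine actually does it.

```cpp
// Hypothetical set of code paths a particle update could be built for.
enum class SimdPath { Scalar, Sse2, Avx };

// Pure selection logic: prefer the widest instruction set that is supported.
constexpr SimdPath choose_simd_path(bool has_avx, bool has_sse2) {
    return has_avx ? SimdPath::Avx
                   : (has_sse2 ? SimdPath::Sse2 : SimdPath::Scalar);
}

// Runtime detection. Only implemented for GCC/Clang on x86 in this sketch.
SimdPath detect_simd_path() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return choose_simd_path(__builtin_cpu_supports("avx") != 0,
                            __builtin_cpu_supports("sse2") != 0);
#else
    return SimdPath::Scalar;
#endif
}
```

The result would typically be stored once and used to pick which version of the update function to call each frame.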



I would like to stress that this is a CPU optimisation and so it will not have any effect on frame rate if your graphics card is the bottleneck.

While the engine multithreading has had a huge effect on performance, we still have a lot more things to optimise. Our optimisation programmer, Vincent, has been attempting to optimise the particle system by using the special purpose vector instructions that are available on modern CPUs.

Path of Exile - Lead Programmer
Last bumped on Apr 14, 2018, 11:30:07 AM

Posted by

Jonathan

on Grinding Gear Games on