I’ve been excited about Unity’s Burst compiler ever since I saw it in action with my own code. Little by little, I’ve been porting some of our common code to ECS so we could easily leverage Burst. I’ve always wanted to write the most basic, generic A* algorithm in ECS and see how fast it could be when compiled with Burst. If it’s already fast enough, there would be no need to write more sophisticated A* variants like Jump Point Search. Last month, I was finally able to do this, and I was very pleased with the results.

I made a simple environment to run A* in. I used an 80×40 2D grid and added some controls for painting cells as blockers. I used 80×40 because that’s the size this map of rooms was made in. I could have made a procedural room generator, but I just wanted to test my A* code right away. (Maybe next time.) I also didn’t want to spend time making a scrollable or zoomable map. After all, I just want to see how much faster Burst-compiled A* is.

Before performance testing, I first had to verify that my ECS A* code works. As you can see in the image above, it shows the resolved path from the blue tile to the red tile. I even added handling for a destination that can’t be reached, in which case the algorithm still tries to resolve the nearest position to the target.

Red tile is unreachable

Code

Here’s the A* search code as an IJob:

[BurstCompile]
public struct AStarSearch<HeuristicCalculator, ReachabilityType> : IJob
    where HeuristicCalculator : struct, HeuristicCostCalculator
    where ReachabilityType : struct, Reachability {
    public Entity owner;
    public int2 startPosition;
    public int2 goalPosition;

    [ReadOnly]
    public ReachabilityType reachability;

    // This will be specified by client on whether it wants to include diagonal neighbors
    [ReadOnly]
    public NativeArray<int2> neighborOffsets;

    public GridWrapper gridWrapper;

    public ComponentDataFromEntity<AStarPath> allPaths;

    [ReadOnly]
    public BufferFromEntity<Int2BufferElement> allPathLists;

    public ComponentDataFromEntity<Waiting> allWaiting;

    // This is the master container for all AStarNodes. The key is the hash code of the position.
    // This will be specified by client code
    public NativeList<AStarNode> allNodes;

    // Only used for existence of position in closed set
    // This will be specified by client code
    public NativeHashMap<int2, byte> closeSet;

    private HeuristicCalculator heuristicCalculator;

    public OpenSet openSet;

    public void Execute() {
        this.allNodes.Clear();
        this.openSet.Clear();
        this.closeSet.Clear();

        DoSearch();

        // Mark as done waiting for the agent to respond
        this.allWaiting[this.owner] = new Waiting {
            done = true
        };
    }

    private void DoSearch() {
        if (!this.reachability.IsReachable(this.goalPosition)) {
            // Goal is not reachable
            this.allPaths[this.owner] = new AStarPath(0, false);
            return;
        }

        float startNodeH = this.heuristicCalculator.ComputeCost(this.startPosition, this.goalPosition);
        AStarNode startNode = CreateNode(this.startPosition, -1, 0, startNodeH);
        this.openSet.Push(startNode);

        float minH = float.MaxValue;
        Maybe<AStarNode> minHPosition = Maybe<AStarNode>.Nothing;

        // Process while there are nodes in the open set
        while (this.openSet.HasItems) {
            AStarNode current = this.openSet.Pop();
            if (current.position.Equals(this.goalPosition)) {
                // Goal has been found
                int pathCount = ConstructPath(current);
                this.allPaths[this.owner] = new AStarPath(pathCount, true);
                return;
            }

            ProcessNode(current);

            this.closeSet.TryAdd(current.position, 0);

            // We save the node with the least H so we could still try to locate
            // the nearest position to the destination
            if (current.H < minH) {
                minHPosition = new Maybe<AStarNode>(current);
                minH = current.H;
            }
        }

        // Open set has been exhausted. Path is unreachable.
        if (minHPosition.HasValue) {
            int pathCount = ConstructPath(minHPosition.Value);
            this.allPaths[this.owner] = new AStarPath(pathCount, false); // false for unreachable
        } else {
            this.allPaths[this.owner] = new AStarPath(0, false);
        }
    }

    private AStarNode GetNode(int index) {
        return this.allNodes[index];
    }

    private AStarNode CreateNode(int2 position, int parent, float g, float h) {
        int index = this.allNodes.Length;
        AStarNode node = new AStarNode(index, position, parent, g, h);
        this.allNodes.Add(node);

        return node;
    }

    // Returns the position count in the path
    private int ConstructPath(AStarNode destination) {
        // Note here that we no longer need to reverse the ordering of the path
        // We just add them as reversed in AStarPath
        // AStarPath then knows how to handle this
        DynamicBuffer<Int2BufferElement> pathList = this.allPathLists[this.owner];
        pathList.Clear();
        AStarNode current = GetNode(destination.index);
        while (current.parent >= 0) {
            pathList.Add(new Int2BufferElement(current.position));
            current = GetNode(current.parent);
        }

        return pathList.Length;
    }

    private void ProcessNode(in AStarNode current) {
        if (IsInCloseSet(current.position)) {
            // Already in closed set. We no longer process because the same node with lower F
            // might have already been processed before. Note that we don't fix the heap. We just
            // keep on pushing nodes with lower scores.
            return;
        }

        // Process neighbors
        for (int i = 0; i < this.neighborOffsets.Length; ++i) {
            int2 neighborPosition = current.position + this.neighborOffsets[i];

            if (current.position.Equals(neighborPosition)) {
                // No need to process if they are equal
                continue;
            }

            if (!this.gridWrapper.IsInside(neighborPosition)) {
                // No longer inside the map
                continue;
            }

            if (IsInCloseSet(neighborPosition)) {
                // Already in close set
                continue;
            }

            if (!this.reachability.IsReachable(current.position, neighborPosition)) {
                // Not reachable based from specified reachability
                continue;
            }

            float tentativeG = current.G + this.reachability.GetWeight(current.position, neighborPosition);
            float h = this.heuristicCalculator.ComputeCost(neighborPosition, this.goalPosition);

            if (this.openSet.TryGet(neighborPosition, out AStarNode existingNode)) {
                // This means that the node is already in the open set
                // We update the node if the current movement is better than the one in the open set
                if (tentativeG < existingNode.G) {
                    // Found a better path. Replace the values.
                    // Note that creation automatically replaces the node at that position
                    AStarNode betterNode = CreateNode(neighborPosition, current.index, tentativeG, h);

                    // Only add to open set if it's a better movement
                    // If we just push without checking, a node with the same g score will be pushed
                    // which causes infinite loop as every node will be pushed
                    this.openSet.Push(betterNode);
                }
            } else {
                AStarNode neighborNode = CreateNode(neighborPosition, current.index, tentativeG, h);
                this.openSet.Push(neighborNode);
            }
        }
    }

    private bool IsInCloseSet(int2 position) {
        return this.closeSet.TryGetValue(position, out _);
    }
}
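
The job is generic over a heuristic type, but the HeuristicCostCalculator interface itself isn't shown. As a rough sketch, assuming the interface has only the ComputeCost method that the job calls, a Manhattan distance heuristic could look something like this. The interface body and the ManhattanDistance name are assumptions made for illustration, not the actual code.

```csharp
using Unity.Mathematics;

// Assumed shape of the interface: only ComputeCost() is known from the job's calls
public interface HeuristicCostCalculator {
    float ComputeCost(int2 start, int2 goal);
}

// Hypothetical implementation: Manhattan distance between two grid positions
public struct ManhattanDistance : HeuristicCostCalculator {
    public float ComputeCost(int2 start, int2 goal) {
        // Component-wise absolute difference, then sum of both axes
        int2 delta = math.abs(goal - start);
        return delta.x + delta.y;
    }
}
```

Manhattan distance fits 4-directional movement with unit step cost. If diagonal offsets are included, it can overestimate the remaining cost, so a different heuristic would probably be the safer pick there.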

The grid is represented as a linear array of int2 (2-dimensional vectors, but with integers). This is handled by the GridWrapper that is passed in upon job creation.
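
GridWrapper isn't shown in this post, and the only member the job uses is IsInside(). To give an idea of what such a wrapper might do, here is a minimal sketch that assumes a row-major layout. The GridWrapperSketch name, its fields, and the ToIndex/ToPosition helpers are all hypothetical.

```csharp
using Unity.Mathematics;

// Hypothetical stand-in for GridWrapper. Only IsInside() appears in the job;
// the rest assumes a row-major linear layout.
public readonly struct GridWrapperSketch {
    public readonly int width;
    public readonly int height;

    public GridWrapperSketch(int width, int height) {
        this.width = width;
        this.height = height;
    }

    // True if the position falls within the grid bounds
    public bool IsInside(int2 position) {
        return position.x >= 0 && position.x < this.width
            && position.y >= 0 && position.y < this.height;
    }

    // Maps a 2D position to its index in the linear array
    public int ToIndex(int2 position) {
        return position.y * this.width + position.x;
    }

    // Maps a linear index back to its 2D position
    public int2 ToPosition(int index) {
        return new int2(index % this.width, index / this.width);
    }
}
```

For the 80×40 grid used here, that would be a linear array of 3200 cells.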

Neighbors, or adjacent nodes, are easily represented as an array of int2 offsets. This is the neighborOffsets variable in the code. To get the neighbors, the code just loops through this array and adds each offset to the currently processed position. The resulting int2 is the position of the neighbor.
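
Since the client decides whether diagonal neighbors are included, the offsets array could be built along these lines. This is only a sketch: the NeighborOffsetsSketch class and its Create method are hypothetical names, and the ordering of the offsets is arbitrary.

```csharp
using Unity.Collections;
using Unity.Mathematics;

// Hypothetical helper that builds the neighborOffsets array for the job
public static class NeighborOffsetsSketch {
    // The caller owns the returned array and must Dispose() it when done
    public static NativeArray<int2> Create(bool includeDiagonals, Allocator allocator) {
        NativeArray<int2> offsets = new NativeArray<int2>(includeDiagonals ? 8 : 4, allocator);

        // Orthogonal neighbors
        offsets[0] = new int2(1, 0);
        offsets[1] = new int2(-1, 0);
        offsets[2] = new int2(0, 1);
        offsets[3] = new int2(0, -1);

        if (includeDiagonals) {
            // Diagonal neighbors
            offsets[4] = new int2(1, 1);
            offsets[5] = new int2(1, -1);
            offsets[6] = new int2(-1, 1);
            offsets[7] = new int2(-1, -1);
        }

        return offsets;
    }
}
```

The array can be created once (for example with Allocator.Persistent), shared by every search job since it's marked [ReadOnly], and disposed when the simulation shuts down.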

The whole thing is just the baseline A* algorithm that everybody learns first.
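
The OpenSet type used by the job isn't shown either. Going by the calls the job makes (Push, Pop, HasItems, TryGet, Clear) and the comment about not fixing the heap, it behaves like a min-heap ordered by F with a lookup by position. Here is a rough sketch of that idea using managed collections to keep it short. NodeSketch and OpenSetSketch are hypothetical stand-ins, and the assumption that F = G + H is mine.

```csharp
using System.Collections.Generic;
using Unity.Mathematics;

// Hypothetical stand-in for AStarNode with only what the open set needs
public struct NodeSketch {
    public int2 position;
    public float G;
    public float H;
    public float F => this.G + this.H; // assumed definition of F
}

// Hypothetical open set: binary min-heap on F plus a lookup by position
public class OpenSetSketch {
    private readonly List<NodeSketch> heap = new List<NodeSketch>();
    private readonly Dictionary<int2, NodeSketch> byPosition = new Dictionary<int2, NodeSketch>();

    public bool HasItems => this.heap.Count > 0;

    public void Clear() {
        this.heap.Clear();
        this.byPosition.Clear();
    }

    public bool TryGet(int2 position, out NodeSketch node) {
        return this.byPosition.TryGetValue(position, out node);
    }

    public void Push(NodeSketch node) {
        // The latest pushed node wins the lookup. Older entries for the same
        // position stay in the heap and are skipped later via the closed set.
        this.byPosition[node.position] = node;
        this.heap.Add(node);

        // Sift up until the parent is no larger than the child
        int child = this.heap.Count - 1;
        while (child > 0) {
            int parent = (child - 1) / 2;
            if (this.heap[parent].F <= this.heap[child].F) {
                break;
            }
            Swap(parent, child);
            child = parent;
        }
    }

    // Removes and returns the node with the lowest F. Assumes HasItems is true.
    public NodeSketch Pop() {
        NodeSketch top = this.heap[0];
        int last = this.heap.Count - 1;
        this.heap[0] = this.heap[last];
        this.heap.RemoveAt(last);

        // Sift down until both children are no smaller than the parent
        int parent = 0;
        while (true) {
            int left = 2 * parent + 1;
            int right = left + 1;
            int smallest = parent;
            if (left < this.heap.Count && this.heap[left].F < this.heap[smallest].F) {
                smallest = left;
            }
            if (right < this.heap.Count && this.heap[right].F < this.heap[smallest].F) {
                smallest = right;
            }
            if (smallest == parent) {
                break;
            }
            Swap(parent, smallest);
            parent = smallest;
        }

        this.byPosition.Remove(top.position);
        return top;
    }

    private void Swap(int a, int b) {
        NodeSketch temp = this.heap[a];
        this.heap[a] = this.heap[b];
        this.heap[b] = temp;
    }
}
```

Managed List and Dictionary can't be used inside a Burst-compiled job, so a real version would have to be a struct built on native containers such as NativeList and NativeHashMap.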

Performance Test

The performance test is simple. Every frame, I make X A* requests, and the code executes all X requests within the same frame. A single request means picking a random start and a random end position from the map in the image above. I picked this kind of map because this is approximately what our current game maps look like.
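
To illustrate what generating the X requests for a frame could look like, here is a small sketch that picks random start and goal positions inside the grid. PathRequestSketch and RequestGeneratorSketch are hypothetical names, and the sketch makes no attempt to avoid blocker cells.

```csharp
using Unity.Mathematics;

// Hypothetical request: just a start and a goal position
public struct PathRequestSketch {
    public int2 start;
    public int2 goal;
}

public static class RequestGeneratorSketch {
    // Fills the array with random start/goal pairs inside a width x height grid.
    // The random state is passed by ref so it advances between calls.
    public static void Fill(PathRequestSketch[] requests, int width, int height,
        ref Unity.Mathematics.Random random) {
        for (int i = 0; i < requests.Length; ++i) {
            // NextInt(min, max) excludes max, so positions stay inside the grid
            requests[i] = new PathRequestSketch {
                start = new int2(random.NextInt(0, width), random.NextInt(0, height)),
                goal = new int2(random.NextInt(0, width), random.NextInt(0, height))
            };
        }
    }
}
```

Each frame, the test would then create one search job per request and make sure all of them complete before the frame ends. Note that Unity.Mathematics.Random needs a non-zero seed, e.g. new Unity.Mathematics.Random(1234).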

The machine used for the test has the following specs:

Intel i3-6100 @ 3.70GHz (4 CPUs)

16GB RAM

I ran two tests: first a non-Burst-compiled one, then a Burst-compiled one. The profiler was run against a development build of the simulation.

Non Burst

The following is how the non-Burst compiled simulation ran with a single request (X = 1):

Non Burst with single request

A single A* search takes around 1ms to 8ms.

I kept increasing the number of requests until the frame time went past 16ms (the 60fps target). That already happens at 15 requests:

Past 16ms is reached with just 15 requests

Burst

Let’s see the performance of the Burst compiled simulation with just one request:

Burst with single request

The running time goes as low as 0.04ms and as high as 0.70ms. I never saw a running time reach 1ms.

I increased the number of requests until the frame time went past 16ms. Guess how many requests it takes.

Burst compiled A* can run up to 120 requests until it reaches 16ms

I reached 120 requests before hitting that 16ms mark. That’s 8 times as many requests as non-Burst A* can handle! For me, that’s already kind of amazing. It’s more than enough for the kind of games we make (simulation games). We don’t need 120 requests per frame. Even just 20 per frame would be nice.

Final Words

While the simulation here is not really indicative of an actual game, it still showed that Burst-compiled A* is significantly faster. I won’t be able to use this in our current game because of its heavy OOP usage, but I’m excited for our next game.