Fun With Data Structures: Random Teleport Problem

Solving a real-world problem with a well-known data structure

This is the first piece in a new series: Fun With Data Structures.

In this piece, we are going to solve a real-world problem using a well-known data structure — dictionaries (or hash table).

Task

We have a multiplayer space survival game. There are players, spaceships, galaxies, planets… And the list goes on.

One day, we decide to introduce a new feature:

Every second, the game randomly selects a spaceship out of all the ships in the game and teleports it to another galaxy. Just for fun.

For the sake of simplicity, assume that every spaceship has a unique integer id and name . They are stored in a dictionary; id as key and name as value.

a sample spaceships dictionary

There are millions of spaceships in the game. Players are constantly creating, destroying, and updating spaceships.

Our task is to implement an algorithm to pick a random spaceship from this dictionary (return a random item from a dictionary).

Time complexity assumptions:

Generating random number: O(1) time

Search/insert/delete in dictionary: O(1) time

Removing an item from a list, O(n) time

Appending an item to a list, O(1) time

Sample dictionary input for spaceships :

spaceships = {

5: "Ghost",

40: "Death Star",

16: "B-wing",

...

}

Sample output: Death Star (randomly picked)

Before reading further, I highly encourage you to try to solve this task on your own.

First attempt: generate a random int as key

It seems fairly simple, right? Just create a random number and use it as key to picking an item from the dictionary. Let's assume that every spaceship has an id between 0 and 100000000 . It is easy to generate a random number in this interval:

Let's analyse this algorithm.

Time complexity: O(1)

Extra space complexity: O(1)

Wow. This is as good as it gets. But does this algorithm really work?

No.

This would work only if we were sure that there are exactly 100000001 items in the dictionary and their ids are consecutive, starting from 0. But remember, players can create or destroy spaceships any time they want. There might be just eleven spaceships at one time, and three million spaceships at another time. They have unique ids, and there can be gaps between these ids.

Second attempt: move keys to a list

OK, it is not that straightforward to randomly choose an element from a dictionary. How about lists? They have a certain order and length. We can easily choose a random element from a list.

Let’s put all the keys in the spaceships dictionary in a list and pick a random element from this list:

It definitely works. (How about data race?)

Time complexity: O(n) to copy dictionary keys to a list.

Extra space complexity: O(n) to keep keys in the list.

Well, not that bad. We can afford O(n) memory.

But remember, we are going to call this function every second. Which means it is going to cost O(n) time every second. It will be really slow if there are millions of spaceships.

Do we really need to recreate key_list every time get_random_ship is called? Can we do better?

Third attempt: mapper dictionary

Instead of creating a new key list every time, we can create it once and use it for every call.

To do that properly, we need to keep key_list and spaceships in sync:

Whenever a key is removed from spaceships , remove this key from key_list

, remove this key from Whenever a new item is added to spaceships , append its key to key_list

We can append new keys to key_list in O(1) time, but what about removing? We need to find the key in the list, which will take O(n) time. We also need to remove it, which will take another O(n) time.

If we knew the position of the key in the list, we could have just found it in O(1) time, and, with a clever trick, we could remove it in O(1) time.

If we want to access something in O(1) time, dictionaries are our best shot. It is time to introduce mapper dictionary, key_mappings , which keeps the positions of each key in key_list .

Here is a snapshot of three data structures we use:

spaceships = {

5: "Ghost",

40: "Death Star",

16: "B-wing",

} key_list = [5, 40, 16] key_mappings = {

5: 0, # Key 5 is at position 0 in `key_list`

40: 1, # Key 40 is at position 1 in `key_list`

16: 2, # Key 16 is at position 2 in `key_list`

}

By looking into key_mappings we know that key 5 is at position 0 in key_list array, key 16 is at position 2 , and so on… Now we can update key_list in O(1) time at the expense of extra O(n) space of key_mappings .

Let’s go through an example and add a new spaceship, 4: "Starfighter"

Add new value to spaceships: spaceships[4] = "Starfighter" Append 4 to key_list: key_list.append(4) Add the index of new key to mappings: key_mappings[4] = len(key_list)

spaceships = {

5: "Ghost",

40: "Death Star",

16: "B-wing",

4: "Starfighter",

} key_list = [5, 40, 16, 4] key_mappings = {

5: 0,

40: 1,

16: 2,

4: 3,

}

Let’s remove the key 40 from spaceships. We know that 40 is at position 1 in key_list . The trick is, instead of removing 40 directly from key_list , we can put the last element into this position in O(1) time.

Remove item from spaceships: del spaceships[40] Find the index of the key in key_list: index = key_mappings[40] Replace it with the last key: key_list[index] = key_list.pop()

spaceships = {

5: "Ghost",

16: "B-wing",

4: "Starfighter",

} # Notice that how we moved last element into position 1

key_list = [5, 4, 16] # Notice that we set the value of `4` to 1 and removed key `40`.

# Although we can keep key 40 there instead of removing.

key_mappings = {

5: 0,

16: 2,

4: 1,

}

Time complexity:

One time O(n) to create key_list

O(1) every time we update spaceships dictionary.

dictionary. O(1) to get a random spaceship.

Space complexity:

O(n) for key_list and O(n) for key_mappings

Follow-up task

Try to implement the same logic in a relational database where spaceships are stored in a table.