(original post on : https://en.jeffprod.com/blog/2017/solving-a-rubik-s-pocket-cube-with-a-graph-database/)

A pocket cube can have 3 674 160 different positions. For a “No-SQL” database, it’s a piece of cake.

Neo4j is a graph database. In this tool, each node could be a cube position, and eges from one node to another could be a move.

As this software can find the shortest path between two nodes, could it solve the cube with “god move” i.e. 14 at most from a given position ?

The answer is : yes, within a few milliseconds.

Here is how to do.

Cube representation

First of all, we have to code the cube description. Here is a way to do :

This is the same as :

The position of a cube is written into an array of 24 elements :

# The solved cube is as it :

$position = array(

'w','w','w','w',

'g','g','r','r',

'b','b','o','o',

'g','g','r','r',

'b','b','o','o',

'y','y','y','y'

);



# or, if we give a number for each color :

$position = array(

0,0,0,0,

2,2,4,4,

3,3,5,5,

2,2,4,4,

3,3,5,5,

1,1,1,1

);

However, this representation has a downside : imagine a solved cube in front of you. Its position is 000044335522443355221111 . Turn the whole cube in the space to have another side color in front of you. It will be for example 000033552244335522441111 . But this is the same cube ! We can’t have 2 IDs !

We can rotate the whole cube 4 times on the 6 faces. There are 24 different notation for the same cube.

To reduce this number to one, we can sort thoses 24 notations by alphabetic order : 000022443355224433551111|000033552244335522441111|000044335522443355221111... .

The first element is used as the key to identify the position (ID).

The ID for the solved cube is : 000022443355224433551111 .

Then, we can code, the following methods :

domove : to play a move, i.e. turn a face like this :

Letters mean : Front, Back, Up, Down, Left, Right. Moves are done clockwise. A “i” is counter clockwise.

In the program, moves are done by swaping the array elements. For example, a move up (U), will change the elements 0 to 3 and 4 to 11 (refer to the description upper).

undomove : to undo a move (just “domove Li” to undo L ;))

: to undo a move (just “domove Li” to undo L ;)) setpos : to set a given position to the cube.

: to set a given position to the cube. rotx : do moves U, Di, and your cube has rotated a 1/4 turn.

: do moves U, Di, and your cube has rotated a 1/4 turn. roty : do moves Li, R.

Generating all possible positions

Or how to generate 3 674 160 ids for all the positions of the cube and the links between them.

Here is the algorithm :

we start from the solved cube

save that this position is to do



while we have positions to do:



get the ID (A) of the position

add ID (A) to the nodes if not already added



for each 12 possible moves:

play the move

add the ID (B) of this new position to the nodes if not already added

save thate this position (B) is to do if not already done

save the edge A->B in the edges file

undo the move



save the position ID (A) as done

Here is the progress :

After around twenty hours, we have two CSV files of 110 and 585 Mb.

nodes.csv :

pos:ID;:LABEL

000022443355224433551111;Cube

004441330522413305225511;Cube

004441330522330522415151;Cube

000244351522343052411351;Cube

000533512244312501344215;Cube

000533512244343125011452;Cube

000433502524343522411151;Cube

002234051524305312413451;Cube

021425330515430423125014;Cube

And bingo ! there are 3 674 160 nodes ! this number has never been coded. So it really works. The algorithm is correct.

We have the edges.csv file :

:START_ID;:END_ID;:TYPE

000022443355224433551111;004441330522413305225511;MOVE

000244351522343052411351;004441330522330522415151;MOVE

000533512244312501344215;000533512244343125011452;MOVE

000433502524343522411151;002234051524305312413451;MOVE

004550212433143501341225;021425330515430423125014;MOVE

000233551524341423112054;001232540524305123113454;MOVE

021425330405130211235544;031225351045130414235240;MOVE

002341441432535535011022;024552131053404523412130;MOVE

001142542432013435355120;002341441432015355352120;MOVE

You have noticed that i had written headers to import to Neo4j.

Import files to Neo4j

If we have a temporary database, we can delete it :

sudo rm -rf /data/databases/*

Change directory to our CSV files :

cd /var/lib/neo4j/import/data

Import files to Neo4j :

/var/lib/neo4j/bin/neo4j-admin import \

--into /data/databases/graph.db \

--nodes nodes.csv \

--relationships edges.csv \

--delimiter ";" \

--array-delimiter "-" \

--quote "'"

It is really fast ! 3,6 millions nodes and 10,6 millions of edges imported in 1 minute and 8 seconds !

IMPORT DONE in 1m 8s 438ms.

Imported:

3674160 nodes

10631005 relationships

3674160 properties

Peak memory usage: 79.57 MB

NB : you have to restart Neo4j : sudo service neo4j restart .

Solving the cube

Or finding the shortest path between two nodes.

Thanks to a graphical user interface coded in Javascript for example, we let the user describe his cube position. We generate the ID of his position, for example 002140534352524214031513 . We ask Neo4j, to find the shortest path between it and the solved ID which is 000022443355224433551111

Here is the Cypher query :

MATCH (c1:Cube), (c2:Cube), p = shortestPath((c1)-[m:MOVE*]-(c2))

WHERE c1.pos="002140534352524214031513" AND c2.pos="000022443355224433551111"

RETURN p

The results comes in 0.08 seconds :

NB : A directed link is not necessary but in Neo4j links are directed. We don’t care about the directions. A move from node (A) goes to (B) and (B) goes to (A).

We could’nt add the move name to the edges [MOVE] because we have simplified 24 notations to 1.

So to find the move name (F, B, D, Di, L…) between two nodes, we can do like this :

We start from the given position, we play the 12 possible moves, and we check each ID of the new position, to match with the ID of the next node of the solution got from Neo4j.

I made some tests. Cube is really solved at most in 14 moves !

Conclusion

Try the Rubik’s pocket cube solver online.

Have fun.

References