





Cypher: Neo4j’s Graph Query Language

An Example Dataset: StarCraft

Query 1: What Units can be Built at the Barracks? The MATCH and WHERE Clauses

Query 2: Average Unit Cost

Query 3: What Buildings and Units are Unlocked by Construction of an Engineering Bay?

Query 4: Which Buildings Have No Dependencies?

Query 5: Which Units Have Additional Requirements for Construction?

Query 6: What’s the Most Expensive Unit that Can Be Built at the Factory? Ordering and Limits

Query 7: What Do the Barracks Unlock up to Two Levels Deep? Traversing Two-Level Relationships

Query 8: What Are All The Dependencies of the Starport Building? What Not To Do

Query 9: What Are All The Buildings Required to Construct a Battlecruiser? Cypher’s Shortest Path Function

Query 10: What Are All The Buildings Required to Construct a Battlecruiser and How Much Will It Cost? Cypher’s “Unwind” Function

Query 11: What Are All the Necessary Components to Build a Battlecruiser, and Where Does it Need to be Built? Multiple MATCH Clauses
Editor’s Note: Last October at GraphConnect San Francisco, Nicole White, Data Scientist at Neo Technology, delivered this presentation on how to write Cypher queries for your most common connected-data questions.

For more videos from GraphConnect SF and to register for GraphConnect Europe, check out graphconnect.com.

Cypher: Neo4j’s Graph Query Language

When you ask someone what they love about Neo4j, Cypher is always at the top. Cypher is essentially ASCII art; you draw out your desired graph pattern in your code.

A node is indicated with open and closed parentheses, a data relationship is indicated by open/close square brackets, and to specify a pattern you use hyphens in combination with the nodes and relationships.

If you want to find a “node-relationship-node” pattern, you write a pair of parentheses, a hyphen, open/close square brackets, another hyphen and then a second pair of parentheses for the node on the other side: ()-[]-().

With Cypher, you most commonly specify a relationship direction with the “greater than” (>) or “less than” (<) signs. Writing ()-[]->() says that the node on the left has an outgoing relationship to the node on the right, while ()<-[]-() says the opposite.

An Example Dataset: StarCraft

Let’s examine an example dataset from the game StarCraft. Its tech tree is a hierarchy of requirements that shows the different types of buildings you can construct and what each one requires. For example, the hierarchy indicates that to build a barracks, you first need to have a command center.

We imported this tech tree into Neo4j because it’s very good at storing, modeling and querying tree-like structures.

In this dataset, we only have two node labels, Building and Unit, while the relationship that captures the hierarchy of requirements is labeled Requires. In the resulting graph, the blue Factory node in the center is a building that requires a Barracks, which requires a Supply Depot.
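To make the ASCII-art syntax concrete, the requirement chain just described can be drawn directly as a Cypher pattern. This is a minimal sketch: the Building label and the Requires relationship type come from the dataset description, while the name property and the identifiers f, b and s are assumptions about how the data is stored.

```cypher
// Undirected pattern: any two nodes joined by any relationship.
MATCH (a)-[r]-(c)
RETURN a, r, c
LIMIT 5;

// Directed pattern: Factory requires Barracks, which requires Supply Depot.
// Assumption: buildings are identified by a `name` property.
MATCH (f:Building {name: 'Factory'})-[:Requires]->(b:Building {name: 'Barracks'}),
      (b)-[:Requires]->(s:Building {name: 'Supply Depot'})
RETURN f, b, s;
```

Reversing the arrow, as in (b)<-[:Requires]-(f), describes the same relationship read from the other end.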
There is an extensive hierarchy of building requirements that extends to the lowest node, which is typically the Supply Depot.

The graph also indicates that Units have requirements; for example, a Medevac requires a Starport, which requires a Factory.

We also have Builds relationships, which show what is built by the different Buildings. For example, the Barracks builds a Reaper, Ghost, Marauder and Marine.

Additionally, resources such as minerals and gas are required in order to create Units and Buildings. The example in the presentation shows the resource requirements for the SCV unit and the Supply Depot building.

Query 1: What Units can be Built at the Barracks? The MATCH and WHERE Clauses

The most important component when writing a query is the MATCH clause, which is where you draw the graph pattern that will be retrieved by the query. We start with a node labeled Building; the label, which is our entity type, is written after a colon inside the parentheses.

In this query we’ve indicated that we want to find a Building node that has an outgoing Builds relationship to a Unit node. We chose to use the identifiers b, r and u, which precede the colons; they are now bound to the entities and can be used in the following clauses.

Below the MATCH clause we have the WHERE clause, which indicates that the building we want is the Barracks. The RETURN clause indicates which entities we want returned to us by our query. Neo4j returns the result as a graph visualization.

Query 2: Average Unit Cost

To find out the average cost of each unit, we use the same MATCH clause but without the identifier on the Builds relationship, because we don’t need it in the RETURN clause. However, we do include the b and u identifiers because we use them in the RETURN clause.

The result of our query is a table that provides the name of each Building and the average cost of the Units built at that Building in terms of minerals and gas.

In this example, b.name is unique, but this doesn’t need to be the case. You can have multiple buildings with the same name, and your query can return all of them unless you have a uniqueness constraint on that property, such as “buildings have to be unique by the name property.”

Query 3: What Buildings and Units are Unlocked by Construction of an Engineering Bay?

In StarCraft, once you’ve constructed a certain number of buildings and units, you can construct an Engineering Bay.
This allows you to build even more components.

To find out what buildings and units can only be built once an Engineering Bay has been constructed, we need to traverse the Requires relationship up one more level from the Engineering Bay.

In Query 2, we matched Buildings to Units through a Builds relationship. In this query, we are matching Buildings with other Buildings through a Requires relationship, which is why we’ve applied the Building label to the nodes on either side of the Requires relationship. However, we’ve restricted the node on the far right to a specific building, the Engineering Bay.

Neo4j returns a graph with the Engineering Bay in the middle, surrounded by the buildings one step out that are unlocked once the Engineering Bay has been constructed.

Query 4: Which Buildings Have No Dependencies?

Which buildings can we build right away without having any other Buildings or Units on the map?

To answer this question, we include a pattern in the WHERE clause. We aren’t asking for the relationships themselves; our only requirement is the return of entities that don’t have any Building requirements.

There are three buildings that do not have any Requires relationships attached to them: the Refinery, Command Center and Supply Depot. In other words, these are the only buildings that you can construct from the very start of the game.

Query 5: Which Units Have Additional Requirements for Construction?

The prerequisite for constructing most units is simply the specific building that builds them. However, some have additional dependencies, which we express in the MATCH clause.

In the MATCH clause we’ve indicated that we want to find a Building that Builds a Unit, along with the first additional required Building for that Unit. In the first line of the MATCH clause, we’ve bound the unit node to u, which we also use in the second line of the MATCH clause.

In the WHERE clause, we use the less-than and greater-than signs together (<>, meaning “not equal”) to indicate that we don’t want results where the two buildings are the same (i.e., we don’t want units whose Requires relationship points back to the same Builder building).

Query 6: What’s the Most Expensive Unit that Can Be Built at the Factory? Ordering and Limits

The MATCH clause shows that a building builds a unit, and the WHERE clause shows that the name of the building is Factory.
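A minimal sketch of such a query is shown here. The labels and the Builds relationship come from the dataset description; the property names name, minerals and gas, and the choice to rank units by the sum of minerals and gas, are assumptions about the schema rather than details from the presentation.

```cypher
// Query 6 sketch: the most expensive unit that the Factory can build.
MATCH (b:Building)-[:Builds]->(u:Unit)
WHERE b.name = 'Factory'
RETURN u.name AS unit, u.minerals AS minerals, u.gas AS gas
// Assumption: "most expensive" means the largest minerals + gas total.
ORDER BY minerals + gas DESC
LIMIT 1;
```

Dropping the LIMIT line would return every Factory unit, ordered from most to least expensive.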
We want to return the name of the Unit, as well as the amount of minerals and gas it takes to build it.

You can use ORDER BY and LIMIT right after the RETURN clause to request the results in descending order, i.e. from most expensive to least expensive. With that ordering, LIMIT 1 keeps only the most expensive unit that can be built at this particular Building.

Query 7: What Do the Barracks Unlock up to Two Levels Deep? Traversing Two-Level Relationships

Our prior queries have all included single-level relationships (i.e., the entities have been directly related). However, this is a “variable length” query. In this case, we want to RETURN results that are separated by both one and two degrees from the Barracks. In the MATCH clause, we indicate this by including an asterisk and the number 2 in our relationship.

The results in Neo4j show that the Barracks unlock the Bunker, Orbital Command, Factory and Ghost Academy. Two steps away, the Barracks unlock the Starport and Armory.

Query 8: What Are All The Dependencies of the Starport Building? What Not To Do

To answer this query, we move from the Starport node all the way down the hierarchy, which you do by including an asterisk while omitting the “maximum” on the relationship.

This is not a good query, because an unbounded traversal can require an exhaustive search of your entire database, and it returns only the chain it was able to follow: it followed the Requires relationship all the way through the hierarchy until it didn’t have anywhere to go. This isn’t a great way to find dependencies, so let’s explore a better approach with the Battlecruiser as an example.

Query 9: What Are All The Buildings Required to Construct a Battlecruiser? Cypher’s Shortest Path Function

To address this query, we rely on Cypher’s shortestPath function, which allows you to find the single shortest path between nodes. The function takes a path pattern as its argument, and Neo4j returns the resulting path as a graph.

Query 10: What Are All The Buildings Required to Construct a Battlecruiser and How Much Will It Cost? Cypher’s “Unwind” Function

To answer this query we rely on Cypher’s UNWIND clause, which allows you to expand a collection into a sequence of rows. In the query, we grab the nodes out of a path, place those nodes into their own separate rows, and then RETURN the name and amount of resources required to build each one.

The resulting table shows the amount of required resources to build each Building in the Neo4j graph from Query 9.

Query 11: What Are All the Necessary Components to Build a Battlecruiser, and Where Does it Need to be Built? Multiple MATCH Clauses

To answer this question, we will need to write multiple MATCH and WHERE clauses.
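One way such a query could look is sketched here. It assumes a name property on nodes and uses the Supply Depot as the far end of the requirement path; both are assumptions, as are the identifiers d, p and builder.

```cypher
// Query 11 sketch: the Battlecruiser's requirement path plus the building that builds it.
MATCH (u:Unit) WHERE u.name = 'Battlecruiser'
MATCH (d:Building) WHERE d.name = 'Supply Depot'  // assumed bottom of the tech tree
MATCH p = shortestPath((u)-[:Requires*]->(d))
// A further MATCH clause reuses the already-bound identifier u.
MATCH (builder:Building)-[:Builds]->(u)
RETURN p, builder;
```

Because u is bound in the first MATCH, every later clause refers to the same Battlecruiser node rather than matching a new one.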
In this query, our first two lines are identical to the previous “shortest path” example, and we’ve bound the Unit to the identifier u. Neo4j then returns the result as a graph.

So that is how you can use multiple MATCH clauses within a Cypher query.
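As a compact recap, Queries 1 to 5 and 7 to 10 can be sketched roughly as follows. The labels (Building, Unit), the relationship types (Requires, Builds) and the identifiers b, r and u come from the walkthrough; the property names name, minerals and gas, the remaining identifiers, and the Supply Depot as the end of the Battlecruiser path are assumptions, so treat these as illustrations rather than the exact queries from the presentation.

```cypher
// Query 1: units built at the Barracks.
MATCH (b:Building)-[r:Builds]->(u:Unit)
WHERE b.name = 'Barracks'
RETURN b, r, u;

// Query 2: average unit cost per building (no identifier on Builds).
MATCH (b:Building)-[:Builds]->(u:Unit)
RETURN b.name AS building, avg(u.minerals) AS avg_minerals, avg(u.gas) AS avg_gas;

// Query 3: buildings that directly require an Engineering Bay.
MATCH (b:Building)-[r:Requires]->(e:Building)
WHERE e.name = 'Engineering Bay'
RETURN b, r, e;

// Query 4: buildings with no requirements (a pattern inside WHERE).
MATCH (b:Building)
WHERE NOT (b)-[:Requires]->(:Building)
RETURN b.name;

// Query 5: units that require a building other than the one that builds them.
MATCH (builder:Building)-[:Builds]->(u:Unit),
      (u)-[:Requires]->(req:Building)
WHERE builder <> req
RETURN u.name, builder.name, req.name;

// Query 7: everything one or two Requires hops away from the Barracks.
MATCH p = (b:Building)-[:Requires*1..2]->(:Building {name: 'Barracks'})
RETURN p;

// Query 8 (what not to do): unbounded traversal from the Starport.
MATCH p = (:Building {name: 'Starport'})-[:Requires*]->(:Building)
RETURN p;

// Query 9: a single shortest requirement path for the Battlecruiser.
MATCH (u:Unit) WHERE u.name = 'Battlecruiser'
MATCH (d:Building) WHERE d.name = 'Supply Depot'  // assumed end of the path
MATCH p = shortestPath((u)-[:Requires*]->(d))
RETURN p;

// Query 10: unwind the path's nodes into rows and list each building's cost.
MATCH (u:Unit) WHERE u.name = 'Battlecruiser'
MATCH (d:Building) WHERE d.name = 'Supply Depot'  // assumed end of the path
MATCH p = shortestPath((u)-[:Requires*]->(d))
UNWIND nodes(p) AS n
WITH n WHERE n:Building
RETURN n.name AS building, n.minerals AS minerals, n.gas AS gas;
```

Each statement is independent; in the Neo4j Browser or cypher-shell they can be run one at a time.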