Get Started with GRAKN.AI

Set up the knowledge graph platform & explore a graph

In this introductory article, I will show how to get up and running with GRAKN.AI. Before we start though, let’s quickly answer one question…

What is GRAKN.AI?

GRAKN.AI is the database for AI. It is a deductive database in the form of a knowledge graph that uses machine reasoning to simplify data processing challenges for AI applications. GRAKN.AI is composed of two parts: Grakn (the storage) and Graql (the language).

Grakn is a database in the form of a knowledge graph that uses an intuitive ontology to model extremely complex datasets. It stores data in a way that allows machines to understand the meaning of information in the complete context of its relationships. Consequently, Grakn allows computers to process complex information more intelligently with less human intervention.

Graql is a declarative, knowledge-oriented graph query language that uses machine reasoning for retrieving explicitly stored and implicitly derived knowledge from Grakn.

You can find out more on our website, but, for now, let’s assume you want to try it out. This article explains how to get set up, as you can discover a lot about Grakn just from trying it out.

Download GRAKN.AI

GRAKN.AI is supported on Mac OS X or Linux, although we do expect to add Windows support in due course. The first thing you need to do is download and install it.

For more information on how to download older versions of GRAKN.AI, compile from source code, or import the Grakn Java API library as a development dependency, please visit our Github repository.

Prerequisites

GRAKN.AI requires Java 8 (Standard Edition) with the $JAVA_HOME set accordingly. If you don’t already have this installed, you can find it here.

If you intend to build Grakn from its source code, or develop on top of it, you will also need Maven 3.
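Before starting Grakn, it can help to confirm that $JAVA_HOME is set and points at a real Java installation. The snippet below is a minimal sketch of such a check, not part of GRAKN.AI: the function name check_java_home is made up for illustration, and it only verifies that a bin/java launcher exists under $JAVA_HOME. It does not confirm that the installation is Java 8.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return a list of problems found with JAVA_HOME (an empty list means OK).

    Hypothetical helper for illustration; it is not shipped with Grakn.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME", "").strip()
    if not java_home:
        return ["JAVA_HOME is not set"]
    home = Path(java_home)
    if not home.is_dir():
        return [f"JAVA_HOME points to a missing directory: {java_home}"]
    # On Mac OS X and Linux the launcher normally lives at $JAVA_HOME/bin/java.
    if not (home / "bin" / "java").exists():
        return [f"No bin/java launcher found under {java_home}"]
    return []


if __name__ == "__main__":
    problems = check_java_home()
    print("JAVA_HOME looks usable" if not problems else "\n".join(problems))
```

If the check reports a problem, fix $JAVA_HOME in your shell profile before running the start script.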

Unzip the download into your preferred location and run the following in the terminal to start Grakn:

cd [your Grakn install directory]

./bin/grakn.sh start

This will start:

an instance of Cassandra, which serves as the supported backend for Grakn.

Grakn Engine, which is an HTTP server providing batch loading, monitoring and the browser dashboard.

Grakn Engine is configured by default to use port 4567, but this can be changed in the grakn-engine.properties file, found within the /conf directory of the installation.
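If you script against Grakn Engine, you may want to read the port from the properties file rather than hard-coding 4567. The following is a rough sketch under stated assumptions: the key name server.port and the sample file contents are assumptions made for illustration, so check your own grakn-engine.properties for the actual key. The parser is deliberately simplified and handles only key=value lines.

```python
def read_engine_port(properties_text, key="server.port", default=4567):
    """Return the port configured under `key`, or `default` if the key is absent.

    The key name "server.port" is an assumption; adjust it to match your file.
    """
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments (properties files use # or ! for comments).
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return int(value.strip())
    return default


# Hypothetical excerpt of conf/grakn-engine.properties, invented for this example.
sample = """
# engine settings
server.host=localhost
server.port=4570
"""

print(read_engine_port(sample))  # 4570
print(read_engine_port(""))      # 4567 (falls back to the default)
```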

Useful commands

To start Grakn, run grakn.sh start.

To stop Grakn, run grakn.sh stop.

To remove all graphs from Grakn, run grakn.sh clean.

Test the Graql Shell

To test that the installation is working correctly, we will load a simple ontology and some data from a file and test it in the Graql shell and Grakn Visualiser. The example we are using is part of our genealogy-graph project.

The file we will use, basic-genealogy.gql, was included in the /examples folder of the Grakn installation zip up to release 1.4.0. If your download does not include it, you are welcome to clone it from the Grakn repo on Github. In the code below, we assume that the file is located in the /examples folder.

From the terminal, type in the following to load the example graph. This starts the Graql shell in non-interactive mode, loading the specified file and exiting after the load is complete.

./bin/graql.sh -f ./examples/basic-genealogy.gql

Then type the following to start the Graql shell in its interactive (REPL) mode:

./bin/graql.sh

The Graql shell starts and you see a >>> prompt. We will type in a query to check that everything is working.

match $x isa person, has identifier $n;

You should see a printout of a number of lines of text, each of which includes a name, such as “William Sanford Titus” or “Elizabeth Niesz”.

If you see the above output then congratulations! You have set up Grakn. If you are having trouble with the setup or have any questions, please ask on our discussion forum, on Stack Overflow, or on our Slack channel.

Test the Visualiser

Since the initial writing of this blog post, Grakn visualiser has been deprecated in favour of the new Workbase.

The Grakn visualiser provides a graphical tool to inspect and query your graph data. You can open the visualiser by navigating to localhost:4567 in your web browser. The visualiser allows you to make queries or simply browse the knowledge ontology within the graph. The screenshot below shows a basic query ( match $x isa person; ) typed into the form at the top of the main pane, and visualised by pressing “Submit”:

The help tab on the main pane shows a set of key combinations that you can use to further drill into the data. You can zoom the display in and out, and move the nodes around for better visibility.

The Ontology

You can find out much more about the Grakn ontology in our documentation about the Grakn knowledge model, which states that:

“The ontology is a formal specification of all the relevant concepts and their meaningful associations in a given application domain. It allows objects and relationships to be categorised into distinct types, and for generic properties about those types to be expressed”.

You can think of the ontology as a schema that describes items of data and defines how they relate to one another. You need to have a basic understanding to make useful queries on the data, so let’s review the chunks of it that are important for this demonstration:

insert

# Entities

person sub entity

plays parent

plays child

plays spouse1

plays spouse2

has identifier

has firstname

has surname

has middlename

has picture

has age

has birth-date

has death-date

has gender;

# Resources

identifier sub resource datatype string;

firstname sub resource datatype string;

surname sub resource datatype string;

middlename sub resource datatype string;

picture sub resource datatype string;

age sub resource datatype long;

birth-date sub resource datatype string;

death-date sub resource datatype string;

gender sub resource datatype string;

# Roles and Relations

marriage sub relation

relates spouse1

relates spouse2

has picture;

spouse1 sub role;

spouse2 sub role;

parentship sub relation

relates parent

relates child;

parent sub role;

child sub role;

There are a number of things we can say about the ontology shown above:

there is one entity, person, which represents a person in the family whose genealogy data we are studying.

the person entity has a number of resources to describe aspects of them, such as their name, age, dates of birth and death, gender, and a URL to a picture of them (if one exists). Those resources are all expressed as strings, except for the age, which is of datatype long.

there are two relations that a person can participate in: marriage and parentship.

the person can play different roles in those relations, as a spouse (spouse1 or spouse2 - we aren’t assigning them by gender to be husband or wife) and as a parent or child (again, we are not assigning a gender such as mother or father at this point).

the marriage relation has a resource, which is a URL to a wedding picture, if one exists.

The Data

We will not reproduce all the data here, but here is a snippet of some of the data that you added to the graph when you loaded the basic-genealogy.gql file. Each statement adds either a person, a parentship or a marriage to the graph:

$57472 isa person has firstname "Mary" has identifier "Mary Guthrie" has surname "Guthrie" has gender "female";

$86144 has surname "Dudley" isa person has identifier "Susan Josephine Dudley" has gender "female" has firstname "Susan" has middlename "Josephine";

...

$8304 (parent: $57472, child: $41324624) isa parentship;

$24816 (parent: $81976, child: $41096) isa parentship;

...

$122884216 (spouse2: $57472, spouse1: $41406488) isa marriage;

$40972456 (spouse2: $40964120, spouse1: $8248) isa marriage;

...

Don’t worry about the numbers such as $57472. These are variables in Graql, and happen to have randomly assigned numbers to make them unique.

Querying the Graph

Let’s make some queries. You can do this in the Graql shell or in the visualiser. First, to find all the people who are married, and list their names:

match (spouse1: $x, spouse2: $y) isa marriage; $x has identifier $xi; $y has identifier $yi;

List parent-child relations with the names of each person:

match (parent: $p, child: $c) isa parentship; $p has identifier $pi; $c has identifier $ci;

Find all the people who are named ‘Elizabeth’:

match $x isa person, has identifier $y; $y value contains "Elizabeth";

Querying the graph is more fully described in the documentation.
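To make the last query concrete, here is a small Python sketch of the filtering it describes, run over identifiers that appear earlier in this article. It is only an analogy for what the query asks for, not how Grakn evaluates it, and it assumes that contains is a case-sensitive substring test on the identifier value.

```python
# Identifiers taken from the examples shown earlier in this article.
identifiers = [
    "Mary Guthrie",
    "Susan Josephine Dudley",
    "William Sanford Titus",
    "Elizabeth Niesz",
]


def match_identifier_contains(identifiers, text):
    """Return one binding per identifier $y whose value contains `text`."""
    # Assumption: `contains` behaves like a case-sensitive substring test.
    return [{"y": name} for name in identifiers if text in name]


print(match_identifier_contains(identifiers, "Elizabeth"))
# [{'y': 'Elizabeth Niesz'}]
```

Because the test is applied to the whole identifier, a person whose middle name is Elizabeth would be matched as well.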

Using Inference

We will move on to discuss the use of GRAKN.AI to infer new information about a dataset. In the ontology, we have dealt only with a person, not a man or woman, and the parentship relations were simply between parent and child roles. We did not directly add information about the nature of the parent and child in each relation: they could be father and son, father and daughter, mother and son or mother and daughter.

However, the person entity does have a gender resource, and we can use Grakn to infer more information about each relationship by using that property. The ontology accommodates the more specific roles of mother, father, daughter and son:

person

plays son

plays daughter

plays mother

plays father;

parentship sub relation

relates mother

relates father

relates son

relates daughter;

mother sub parent;

father sub parent;

son sub child;

daughter sub child;

Note: You don’t need to reload the basic-genealogy.gql file into Grakn to pick up these extra roles. We simply didn’t show this part in our earlier discussion of the ontology, to keep things as simple as possible. Also included in basic-genealogy.gql is a set of Graql rules that instruct Grakn’s reasoner on how to label each parentship relation:

$genderizeParentships1 isa inference-rule

lhs

{(parent: $p, child: $c) isa parentship;

$p has gender "male";

$c has gender "male";

}

rhs

{(father: $p, son: $c) isa parentship;};

$genderizeParentships2 isa inference-rule

lhs

{(parent: $p, child: $c) isa parentship;

$p has gender "male";

$c has gender "female";

}

rhs

{(father: $p, daughter: $c) isa parentship;};

$genderizeParentships3 isa inference-rule

lhs

{(parent: $p, child: $c) isa parentship;

$p has gender "female";

$c has gender "male";

}

rhs

{(mother: $p, son: $c) isa parentship;};

$genderizeParentships4 isa inference-rule

lhs

{(parent: $p, child: $c) isa parentship;

$p has gender "female";

$c has gender "female";

}

rhs

{(mother: $p, daughter: $c) isa parentship;};

If you’re unfamiliar with the syntax of rules, don’t worry too much about it just now. It is sufficient to know that, for each parentship relation, Graql checks whether the pattern in the first block (left hand side or lhs) can be verified and, if it can, infers the statement in the second block (right hand side or rhs) to be true, so inserts a relation between gendered parents and children.
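If it helps to see the lhs/rhs idea outside Graql, the sketch below mimics the four rules in plain Python. It is an analogy only and says nothing about how Grakn's reasoner is implemented: the ids, genders and parentships are invented for illustration, and the function name genderize_parentships is hypothetical.

```python
# Hypothetical sample data, invented for illustration.
gender = {"p1": "male", "p2": "female", "c1": "male", "c2": "female"}
parentships = [("p1", "c1"), ("p1", "c2"), ("p2", "c1"), ("p2", "c2")]

# The four rules differ only in which gender maps to which role.
PARENT_ROLE = {"male": "father", "female": "mother"}
CHILD_ROLE = {"male": "son", "female": "daughter"}


def genderize_parentships(parentships, gender):
    """For each (parent, child) pair that satisfies a rule's lhs,
    produce the gendered parentship stated in that rule's rhs."""
    inferred = []
    for parent, child in parentships:
        parent_role = PARENT_ROLE.get(gender.get(parent))
        child_role = CHILD_ROLE.get(gender.get(child))
        # The lhs only matches when both people have a known gender.
        if parent_role and child_role:
            inferred.append({parent_role: parent, child_role: child})
    return inferred


for relation in genderize_parentships(parentships, gender):
    print(relation)
# {'father': 'p1', 'son': 'c1'}
# {'father': 'p1', 'daughter': 'c2'}
# {'mother': 'p2', 'son': 'c1'}
# {'mother': 'p2', 'daughter': 'c2'}
```

A pair in which either person has no gender recorded matches none of the four rules, so nothing is inferred for it.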

Let’s test it out!

First, try making a match query to find parentship relations between fathers and sons in the Graql shell:

match (father: $p, son: $c) isa parentship; $p has identifier $n1; $c has identifier $n2;

Did you get any results? Probably not, because reasoning is not enabled by default at present, although as Grakn develops, we expect that to change. If you didn’t see any results, you need to exit the Graql shell and restart it, passing -n and -m flags to switch on reasoning (see our documentation for more information about flags supported by the Graql shell).

./bin/graql.sh -n -m

Try the query again:

match (father: $p, son: $c) isa parentship; $p has identifier $n1; $c has identifier $n2;

There may be a pause, and then you should see a stream of results as Grakn infers the parentships between male parent and child entities. It is, in effect, building new information about the family which was not explicit in the dataset.

You may want to take a look at the results of this query in the Grakn visualiser and, as with the shell, you will need to activate inference before you see any results. Browse to the visualiser at localhost:4567 and open the Config settings on the left hand side of the screen. When the page opens you will see the “Activate Inference” checkbox. Check it, and try submitting the query above or a variation of it for mothers and sons, fathers and daughters etc. Or, you can go one step further and find fathers who have the same first name as their sons:

match (father: $p, son: $c) isa parentship; $p has firstname $n; $c has firstname $n;
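The query above uses the same variable, $n, for both first names, which is what forces them to be equal. The Python sketch below illustrates that idea on invented data. The ids and names are hypothetical, each person is assumed to have at most one first name, and the code is only an analogy for the query's meaning.

```python
# Hypothetical father-son pairs and first names, invented for illustration.
father_son = [("p1", "c1"), ("p2", "c2")]
firstname = {"p1": "John", "c1": "John", "p2": "William", "c2": "Henry"}


def same_name_pairs(father_son, firstname):
    """Keep only the pairs where father and son share a first name."""
    results = []
    for father, son in father_son:
        n = firstname.get(father)
        # $n is bound by the father's first name; the son's must equal it.
        if n is not None and firstname.get(son) == n:
            results.append({"p": father, "c": son, "n": n})
    return results


print(same_name_pairs(father_son, firstname))
# [{'p': 'p1', 'c': 'c1', 'n': 'John'}]
```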

If you want to find out more about the Graql reasoner, please visit our documentation.

Data Migration

In this example we loaded data from basic-genealogy.gql directly into a graph. However, data isn’t often conveniently stored in .gql files and, indeed, this data was originally in CSV format. Please visit the migration examples to learn more about loading data into a Grakn knowledge graph.

Where Next?

This post was a very high-level overview of some of the key use cases for Grakn, and has barely scratched the surface. The rest of our developer documentation and examples are more in-depth and should answer any questions that you may have, but if you need extra information, please get in touch.

A good place to start is to explore the Grakn Quickstart and the documentation for: