Grakn is a distributed graph database with a built-in knowledge representation and reasoning system (KRR). This provides it with the topological/connective/structural benefits of a graph, and the semantic benefits of a KRR. Structured this way, Grakn is more than just a graph database: it is a fully integrated, distributed knowledge base, specifically designed for complex data.

Because of its architecture, Grakn has a great deal of compatibility with logic programs. Logic predicates like parent are closely represented by edges in a graph. In addition to this, relations in Grakn are modelled as hyper-edges, which makes it easier to represent logical predicates as they can connect an arbitrary number of nodes. As an improvement on logic programming frameworks, Grakn also takes care of the lifecycle of a knowledge base and it makes sure when new data is inserted some constraints are respected and the knowledge doesn’t become inconsistent.

Grakn is the sum of two parts: 1. Grakn, which is the storage layer; and 2. Graql, the query language. All the elements represented in Grakn are part of a defined ontology in which everything is a relation, an entitiy, or a resource. Graql also supports rules that look a lot like logic rules. Things like “If X is a parent of Y and Y is a parent of Z then X is a grandparent of X“ can be expressed by adding inference rules.

Many concepts and ideas here at Grakn Labs have very close resemblance to computational logic concepts and even the inference algorithm we use is very closely related to SLD (Selective Linear Definite) resolution, used in different forms in Prolog or Datalog. This suggests that there may be, at a deeper level, an important degree of equivalence. Exploring the similarities between our distributed knowledge base and logic programming is therefore interesting from a theoretical standpoint. It will also help make Grakn more accessible to the logic programming community.

So, let’s see: can we map a Grakn dataset into a logic program and vice-versa? I’m going to use Answer Set Programming here as a reference mainly because it makes it easier to implement the language constraints. Conveniently, there’s a nice ASP engine available online here where you can try out the things we discuss below. If you are a Prolog person you can ignore the integrity constraints or use some Abductive Logic framework implementation.

Entities and relations

Let’s go through the quickstart tutorial example and let’s start with adding a person.

$s1 isa person has firstname "Mary" has identifier "Mary Guthrie" has surname "Guthrie" has gender "female";

We are saying that there is an entity whose firstname is “Mary” and whose surname is “Guthrie” who is a “female”. It’s an existentially qualified entity, so let’s pick a Skolem constant s1 to write that in ASP. In our graph the identifier is randomly assigned (8280 in the picture).

isa(s1, person).

has(s1, firstname, "Mary").

has(s1, identifier, "Mary Guthrie").

has(s1, surname, "Guthrie").

has(s1, gender, "female").

Note that the predicates are reified here, so the more natural representation gender(s1, “female”) becomes has(s1, gender, “female”). We could express this directly but introducing the predicate “has”, makes it easier to show some of the language constraints that are enforced by Grakn. These constraints come from the fact every concept is typed and must be declared. Resources (like an identifier or a name) must be explicitly associated to entities.

In fact, you can’t really add a person in Graql before defining a person entity type like this:

person sub entity

plays parent

plays child has identifier

has firstname

has surname

has gender;

What we express in this declaration is that you cannot assign a colour to a person, only an identifier, a first name, a surname, and a gender.

We can declare this in ASP as follows:

entity(person). plays(person, parent).

plays(person, child). type_has(person, identifier).

type_has(person, firstname).

type_has(person, surname).

type_has(person, gender).

So far we have just declared that some things have a certain structure and they are allowed to exist: a person has a first name, s1 is a person, s1 has a surname and that surname is Guthrie. But these things are not related.

As mentioned before, the ontology in Grakn enforces integrity constraints based on the given declaration. Similarly, we want ASP to generate an inconsistency when the same constraints are violated. If we add has(person, blue) to our knowledge base, ASP should tell us we have an unsatisfiable program. We can enforce this as follows:

:- isa(I, EType), has(I, A, _), not type_has(EType, A).

Moving on to relations, they are declared as follows in Graql:

parentship sub relation

relates parent

relates child; parent sub role; child sub role;

And here’s how we declare an instance of the relation:

$r1 (parent: $s1, child: $s2) isa parentship;

Similarly to what we did with entities we can write in ASP

% Declaring the relation relation(parentship).

relates(parentship, parent).

relates(parentship, child). % Now our instance, s1 is s2’s parent

isa(r1, parentship).

role(r1, s1, parent).

role(r1, s2, child).

We can enforce some other implicit language constraints by declaring that instances of a role need to respect the declared type

:- role(RelInstance, _, Role), isa(RelInstance, RelType), not relates(RelType, Role).

:- role(RelInstance, I, Role), isa(I, EType), not plays(EType, Role).

The first one is establishing that we can’t have a role that is not declared for a relation and the second one enforces the constraint that an entity can only play roles that are declared. So, in this example, a parenthood can relate a parent and a child and a person can be a child or a parent.

We have quite a few other things in Grakn like inheritance for relations and entities, and resources but that’s for another blog post.

Queries

Let’s try a few queries. These refer to the full knowledge base where three people in total are present.

“List parent-child relations with the names of each person”

In Graql:

match (parent: $p, child: $c) isa parentship; $p has identifier $pi; $c has identifier $ci;

In ASP we just specify what a match is and what we want to see in the output model as follows:

match(Parent, Child) :- isa(R, parentship), role(R, Parent, parent), role(R, Child, child).

#show match/2.

The answer is:

Answer: 1

match(s1,s2) match(s2,s3)

SATISFIABLE

Telling us, that the pairs (s1,s2) and (s2,s3) match the query. Which means s1 is s2’s parent and s2 is s3’s parent.

“Find all the people who are named ‘Titus’”

In Graql:

match $x isa person, has firstname $y; $y val is “Titus”;

In ASP:

match(Person) :- isa(Person, person), has(Person, firstname, "Titus").

#show match/1.



Results in the following answer:

Answer: Answer: 1

match(s2)

SATISFIABLE

Logic programs to Grakn

If we move from logic programs to Grakn, things can be modelled quite naturally, assuming we want to enforce stricter type constraints as shown previously.

Usually a parenthood relation would be represented in a logic program as parent(s1, s2). This codifies less information than our previous definition, but let’s assume we don’t care about roles and type constraints.

The family example would be written in ASP as:

parent(s1, s2). parent(s2, s3). firstname(s1, “Mary”). gender(s1, female). …

This can be represented almost exactly as we did before in Graql. The roles are implicitly codified by the order of appearance in the predicate. So we could define them as parent_role_1, parent_role_2.

Grakn also has rules. We can represent the following rule

grandparent(X, Y) :- parent(X, Z), parent(Z, Y).

As

$grandParentship isa inference-rule

lhs

{(parent: $gp, child: $p) isa parentship;

(parent: $p, child: $c) isa parentship;}

rhs

{(grandparent: $gp, child: $c) isa grandparentship;};

Generalising, any atom (a predicate applied to its arguments like parent(s1, s2)) can be mapped directly into a Grakn relation, but not into a graph edge. Unlike other graph databases, in Grakn relations have arbitrary arity and can represent generic logic predicates.

For example

played(ronaldo, champions_league_final, real_madrid).

Is easily represented as a hyper-edge.

It’s useful to model together for which team a player played a certain game, because that allows expressing rules like this:

player_won(P, X) :- played(P, X, T), team_won(T, X).

For comparison, other graph databases would require “rephrasing” the initial predicate and the rule using an additional entity ronaldo_champions_league_final:

player_performance(ronaldo, ronaldo_champions_league_final). played(ronaldo_champions_league_final, champions_league_final). played_for(champions_league_final, real_madrid). player_won(P, X) :- player_performance(P, Performance), played(Performance, M), played_for(Performance, T), team_won(T, X).

Note that most graph databases don’t support inference rules so the player_won relation would have to be made explicit and written to database for each specific instance (number of players * number of matches extra relations to be inserted).

The rephrasing can be generalised but it always requires the creation of extra non-domain entities like ronaldo_champions_league_final.

Grakn currently doesn’t support rules with negation in the body and generic integrity constraints in its current version (0.14), but we are actively working on it.

Here is a short and simplified reference translation to sum up this section:

Logic Programming

% Resource

resource_name(a, resource_value). % Relationship

relationship_name(a, b). % Rule

p(X, Y) :- q(X, Z), r(Z, Y). % Query

?- query(X, Y).

Graql translation

# Resource

$a isa object has resource_name resource_value; # Relationship

(position_1: $a, position_2: $b) isa relationship_name; # Rule

$ruleX isa inference-rule

lhs

{(position_1: $x, position_2: $z) isa q;

(position_1: $z, position_2: $y) isa r;}

rhs

{(position_1: $x, position_2: $y) isa p;}; # Query

match (position_1: $x, position_2: $y) isa query;

Grakn for logic programmers

Grakn graphs, or more in general any graph, can be codified as logic programs. In fact, a subset of Prolog and ASP including just Horn clauses is Turing complete. What we showed here is that the link between logic programs and Grakn is very direct. Horn clauses can be translated directly, predicates can be modelled as relations, terms as entities and the truth value of an atom is equivalent to the existence of a path.

Overall, we believe Grakn is the ideal substitute for a number of problems for which ASP or Prolog fall short.

If the application requires constant knowledge updates, having a database is much better than revising and reloading a Prolog program. Grakn simplifies the process of acquiring new data by checking integrity constraints on write. This prevents the knowledge base from getting in an inconsistent state which is often hard to debug.

While it’s not trivial to integrate Prolog and ASP with distributed databases, Grakn is built to handle high volumes of data and to be horizontally scalable.

Grakn adopts some of the inference tools developed in the field of logic programming but it focuses on being a database, the focus is on accessing the data and provide the right abstractions for the application layer.

Also, given the underlying graph representation, Grakn provides analytics tool that can be useful to find interesting connections between entities of the domain outside of pure inference (e.g. finding the shortest “friend” path between users, or connected clusters of products).

In short, if you love logic programming, give Grakn a go!