Graph data processing is already an integral part of big-data analytics, with many applications in various domains including Finance, Cyber Security, Compliance, Retail, and Health Sciences. The adoption of graph processing is expected to grow further in the upcoming years, partially because graphs can naturally represent data that captures fine-grained relationships among entities, and graph analysis can provide valuable insights about such data by examining these relationships. Oracle Labs PGX has been providing graph solutions both for Big Data and for Relational Database customers. In this post, I describe our new distributed graph traversal solution, which significantly improves the performance and memory consumption of Oracle PGX's in-memory distributed graph query engine. That is especially true on very large graph queries, where competing systems either fail to execute due to memory usage (see the performance figures later in the post) or fall back to slow and inefficient disk-based joins.

Typically, graph analysis is performed with two distinct but correlated methods: computational analysis (a.k.a. graph algorithms) and pattern matching queries. Most graph engines nowadays, such as Oracle Labs PGX and Apache Spark GraphX/GraphFrames, support both. With computational analysis, the user executes various algorithms that traverse the graph, often repeatedly, and calculate certain (numeric) values to obtain the desired information, e.g., PageRank or shortest paths. Pattern matching queries are given declaratively as graph patterns. The system finds every subgraph of the target graph that is topologically isomorphic/homomorphic to the query graph and satisfies any accompanying filters. For example, the PGQL (Property Graph Query Language) pattern `MATCH (p1:person)-[:friend]->(p2:person)<-[:friend]-(p3:person) WHERE p1 <> p2 AND p2 <> p3 AND p1.name = "John Doe"`, combined with grouping on (p1, p3) and ordering by the count of matched common friends p2, returns the persons p1 (called "John Doe") and p3 who have the largest number of common friends. Such queries can be used, for example, for friend recommendation.
Graph queries are a very challenging workload because they focus on the connections in the data. By following connections, i.e., edges, graph query execution can explore large parts of the graph, generating large intermediate and final result sets with a combinatorial explosion effect. For example, on a very old snapshot of Twitter (known as the "Twitter graph" in academic graph-related research papers), a single-edge query (e.g., (v0)→(v1)) matches the whole graph, yielding 1.4 billion results, while a two-edge query (e.g., (v0)→(v1)→(v2)) returns more than nine trillion matches. Additionally, graph queries can exhibit extremely irregular access patterns and therefore require low-latency data accesses. For this reason, high-performance graph query engines try to keep data in main memory and scale out to a distributed system in order to handle graphs that exceed the capacity of a single node.

Applications in the domains mentioned above often require querying very large graph data in a fast and efficient manner. Besides Oracle's graph solutions, there are commercial alternatives such as Amazon Neptune and Neo4j, as well as open-source ones such as Apache Spark GraphFrames. The approach described in this post significantly improves the performance and memory consumption of Oracle's in-memory distributed graph query engine relative to these systems, especially on very large graph queries, where competitors either fail to execute due to memory usage (e.g., Spark GraphFrames) or fall back to slow and inefficient disk-based joins (Neptune, Neo4j).
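The combinatorial growth described above can be seen even on a toy graph. The following sketch (illustrative only; the adjacency list is made up and has nothing to do with the Twitter graph) counts single-edge versus two-edge pattern matches:

```python
# Illustrative sketch: counting pattern matches on a tiny directed graph.
# Real engines operate on billions of edges, where the same effect explodes.
adj = {
    0: [1, 2],
    1: [2, 3],
    2: [3],
    3: [0],
}

# Single-edge pattern (v0)->(v1): every edge of the graph is a match.
one_edge = sum(len(nbrs) for nbrs in adj.values())

# Two-edge pattern (v0)->(v1)->(v2): every pair of edges sharing the middle
# vertex is a match, so the result set grows combinatorially with each hop.
two_edge = sum(len(adj[b]) for nbrs in adj.values() for b in nbrs)

print(one_edge)  # 6 matches
print(two_edge)  # 8 matches, already more than the edge count
```

On the Twitter graph the same one-hop expansion takes the result set from 1.4 billion to over nine trillion.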
Traditional Distributed Graph Traversal Approaches

In a distributed system, graphs are typically partitioned across machines by vertex: each machine stores a partition of the vertices of the overall graph, plus the edges originating from those vertices. For example, in the graph below, machine 0 stores vertices v0, v1, and v2, while machine 1 holds vertices v3 and v4. The edge v0→v1 is local to machine 0, while the edge v2→v3 is remote, as it spans machines 0 and 1.

For large distributed graphs, none of the traditional graph exploration/traversal approaches is suitable for distributed queries. Breadth-first traversals and distributed joins quickly explode in terms of intermediate results and pose a performance challenge over the network. Depth-first traversals are challenging to parallelize and result in completely random data access patterns. In practice, most engines use breadth-style traversals combined with synchronous/blocking communication across machines.

Breadth-First Traversals

In breadth-first traversals, the execution expands the query in width: the query pattern is matched to the target graph edge after edge. For example, matching the pattern (a)→(b)→(c) to the example graph above could proceed by matching the edge (a)→(b) to all graph edges, namely (v0)→(v1), (v0)→(v2), etc., and then expanding these intermediate results to match the edge (b)→(c). Typically, the execution proceeds with synchronous traversals, i.e., the first edge is completely matched before moving on to the next.

Expanding the query breadth-first is not ideal for a distributed system. First, materializing large sets of intermediate results at every step leads to an intermediate-result explosion. Second, although breadth-first traversals typically have the benefit of locality (i.e., accessing adjacent edges one after the other), locality in distributed graphs is much more limited, since many of the followed edges are remote.
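As a toy illustration, here is a single-process sketch (hypothetical names, not PGX.D code) of edge-at-a-time breadth-first matching on a vertex-partitioned graph. It materializes the full intermediate result set after every edge and counts how many expansions would cross a machine boundary in a real distributed setting:

```python
# Sketch of edge-at-a-time breadth-first matching on a vertex-partitioned
# graph (all names hypothetical). Vertices are assigned to machines by a
# simple modulo rule; expanding across a partition boundary would require
# sending the partial match over the network.
adj = {0: [1, 2], 1: [2, 3], 2: [3], 3: [0], 4: [0]}
NUM_MACHINES = 2

def owner(v):
    return v % NUM_MACHINES

# Match (a)->(b)->(c): first materialize all (a)->(b) matches, then expand
# every one of them by the (b)->(c) edge. Each step holds the entire
# intermediate result set in memory at once.
step1 = [(a, b) for a in adj for b in adj[a]]
step2 = [(a, b, c) for (a, b) in step1 for c in adj.get(b, [])]

# Count how many expansions crossed a machine boundary.
remote = sum(1 for (a, b) in step1 if owner(a) != owner(b))
print(len(step1), len(step2), remote)  # 7 10 4
```

Even on this five-vertex graph, more than half of the first-step expansions are remote under this (assumed) partitioning.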
Because many of the followed edges are remote, a large part of the intermediate results produced at each step of the query must be sent to remote machines, creating large network bursts.

Distributed Joins

Graph traversals can also be expressed as relational joins. Following the edge (a)→(b) can be mapped to a join (or two) between the "vertex table" (holding the graph vertices) and the "edge table" (holding the edges). Distributed joins face the same problems as breadth-first traversals, plus an additional important one: they perform generic table joins instead of graph traversals on top of specialized graph data structures. Unsurprisingly, graph-specific data structures are much faster than generic joins (see the performance comparison of Oracle Labs PGX Distributed to Apache Spark GraphFrames later in this post).

Depth-First Traversals

In depth-first traversals, the execution expands the query in depth: the query pattern is matched to the target graph as a whole, result by result. For example, matching the pattern (a)→(b)→(c) to the example graph above could proceed by matching (v0)→(v1)→(v2), then (v0)→(v2)→(v3), and so on. The main advantage of expanding the query depth-first is that intermediate results can be eagerly expanded to final results, reducing the memory footprint of query execution. Nevertheless, depth-first traversals have two disadvantages: they do not leverage locality, and they complicate parallelism. The lack of locality stems from "edge chasing" (following one edge after the other as dictated by the query pattern) rather than accessing adjacent edges in order. Parallelism is complicated because the runtime cannot know in advance whether there is enough work for all threads at any stage of the query. For instance, the query `MATCH (p1:person)-[:friend]->(p2:person)<-[:friend]-(p3:person) WHERE p1 <> p2 AND p2 <> p3 AND p1.name = "John Doe"` from the beginning of this post will probably produce a single match for (p1).
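For contrast, here is a recursive sketch of depth-first matching of the same linear pattern (again hypothetical, single-process). Final results are emitted as soon as they are complete, so no intermediate result set is ever materialized:

```python
def dfs_match(adj, path_len, partial, emit):
    """Depth-first matching of (x0)->...->(x_path_len): each partial match
    is driven to completion before the next one is started."""
    if len(partial) == path_len + 1:
        emit(tuple(partial))   # final result, produced eagerly
        return
    for nxt in adj.get(partial[-1], []):   # "edge chasing": no locality
        partial.append(nxt)
        dfs_match(adj, path_len, partial, emit)
        partial.pop()

adj = {0: [1, 2], 1: [2, 3], 2: [3], 3: [0]}
results = []
for start in adj:
    dfs_match(adj, 2, [start], results.append)
print(len(results))  # 8: same matches as breadth-first, streamed one by one
```

The memory held at any moment is just the current partial match, but the accesses hop from vertex to vertex as the pattern dictates.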
If that single (p1) match is expanded in a depth-first manner, the number of intermediate results, and hence the available parallelism, grows only slowly.

Dynamic Asynchronous Traversals for Distributed Graphs

The Parallel Graph AnalytiX (PGX) toolkit, developed at Oracle Labs, can execute graph analysis in a distributed way (i.e., across multiple servers); we refer to this capability as PGX.D. In PGX.D, we are experimenting with a new hybrid approach to executing graph traversals that offers the best of breadth-first and depth-first traversals.

Competing graph engines face the classic trade-off between performance and memory consumption for graph query execution:

- Sacrifice performance: use a fixed memory area (typically several gigabytes) for the execution, but spill intermediate results that do not fit to disk.
- Sacrifice memory: perform the whole computation in memory; if the intermediate results do not fit in memory, the query cannot be computed on that graph.

PGX.D enables the in-memory execution of queries of any size without sacrificing memory or performance. In particular, graph queries in PGX.D:

- operate with a fixed, predefined amount of memory for storing intermediate results;
- use only this memory for computation, i.e., they do not spill any intermediate results to disk; and
- can compute queries of essentially any size, because intermediate results are turned into final results "on demand" to keep memory consumption within limits.

On the technical side, PGX.D achieves these characteristics by deploying dynamic traversals:

- depth-first execution when needed, aggressively completing intermediate results and keeping memory consumption within limits; and
- breadth-first execution when possible, avoiding the performance complexities of depth-first traversals.
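The dynamic choice between the two modes can be sketched roughly as follows. This is an illustrative simplification with a made-up memory budget; the real PGX.D runtime tracks actual memory consumption across workers and machines:

```python
BUDGET = 2  # max partial matches buffered at any time (tiny, for illustration)

def dynamic_match(adj, path_len):
    """Hybrid matching of a linear pattern (x0)->...->(x_path_len):
    expand breadth-first while under budget, switch to depth-first
    (completing partial matches, freeing memory) when the budget is hit."""
    results, buffer = [], [(v,) for v in adj]
    while buffer:
        partial = buffer.pop()
        if len(partial) == path_len + 1:
            results.append(partial)
        elif len(buffer) < BUDGET:
            # Below budget: expand breadth-first, buffering all extensions.
            buffer.extend(partial + (n,) for n in adj.get(partial[-1], []))
        else:
            # Budget exhausted: drive this partial match depth-first to
            # completion instead of allocating more intermediate results.
            stack = [partial]
            while stack:
                p = stack.pop()
                if len(p) == path_len + 1:
                    results.append(p)
                else:
                    stack.extend(p + (n,) for n in adj.get(p[-1], []))
    return results

adj = {0: [1, 2], 1: [2, 3], 2: [3], 3: [0]}
print(len(dynamic_match(adj, 2)))  # 8: same matches, bounded buffering
```

Either mode enumerates exactly the same matches; the budget only changes when they are produced and how much memory is held along the way.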
In addition, PGX.D deploys:

- asynchronous communication of intermediate results from one machine to another, so that local computation is not blocked or delayed by remote edges; and
- flow control and incremental termination, to keep global memory consumption (including messaging) within limits and to guarantee query termination (i.e., avoid deadlocks).

Example

Consider matching the pattern (a)→(b)→(c) to our example graph presented above, and consider a worker thread currently starting to match from vertex v0. The thread can bind v0 as (a) and then try to expand the edge (a)→(b). The worker could match (b) with v1. At this point, the dynamic traversal approach in PGX.D, based on how much memory the query already consumes, dictates whether the worker continues matching (a)→(b) edges (breadth) or proceeds with the (b)→(c) edge (depth). In either case, the worker will eventually match v3 for (b), in which case PGX.D simply buffers the intermediate match into a message destined for machine 1 and continues matching the next (a)→(b) edge in a breadth-expanding manner. Of course, PGX.D controls the number of outgoing messages/intermediate results in order to keep the execution within limits. You can find more details in our GRADES 2017 publication, which describes the core query runtime of PGX.D (it does not cover the local dynamicity, which is described in a follow-up publication currently under submission).

Comparing PGX.D to Open-Source Systems

We use the LDBC social network benchmark graph (scale 100: 283 million vertices, 1.78 billion edges) and queries (we adapted the queries to reflect the current features of PGX.D; the changes mainly include the removal of HAVING clauses, subqueries, and regular path queries). We compare PGX.D to Apache Spark GraphFrames (version 0.7 on top of Spark 2.4.1) and PostgreSQL (version 11.2). Both PGX.D and GraphFrames execute on 8 machines connected with InfiniBand. We perform 15 repetitions and report the median run.
Clearly, PGX.D is significantly faster than both GraphFrames and the traditional PostgreSQL RDBMS: it executes the full query suite 29.5 and 17.5 times faster than GraphFrames and PostgreSQL, respectively. In addition, PGX.D is configured to use approximately 16GB of runtime memory for intermediate results, while the other two engines are configured to use the full 756GB of memory available in each underlying machine (8 x 756GB in total for GraphFrames). As mentioned earlier in this post, GraphFrames implements graph traversals on top of distributed joins on dataframes.

Large-Scale Queries

In this experiment, we evaluate the engines with very large queries. In particular:

- Q1: Simple cycle; pattern (v1)→(v2)→(v1) with (a) a COUNT(*) aggregation and (b) AVG aggregations of vertex data;
- Q2: Two-hop match; pattern (v1)→(v2)→(v3) with (a) a COUNT(*) aggregation and (b) AVG aggregations of vertex data.

We execute these queries on graphs of increasing size:

| Graph | # Vertices | # Edges | Description |
| --- | --- | --- | --- |
| LiveJournal | 484K | 68.9M | Users and friendships |
| Uniform Random | 100M | 1B | Uniform random edges |
| Twitter | 42.6M | 1.47B | Tweets and followers |
| Webgraph-UK | 77.7M | 2.97B | 2006 .uk domains |

This experiment highlights the true need for the scalable in-memory distributed graph traversal methodology of PGX.D. As the query exploration size increases, GraphFrames and PostgreSQL cannot keep up with the workload. Even on the two smallest graphs, PGX.D is on average 48 and 115 times faster than GraphFrames and PostgreSQL, respectively; clearly, joins in PostgreSQL are significantly slower than graph traversals in PGX.D. With Q2 on Twitter and Webgraph-UK, we see that even the 8 x 756GB = 6TB of total memory (backed by 1+TB of disk) is not sufficient for GraphFrames. As in the previous experiment, PGX.D completes these queries with approximately 16GB of memory on each machine.

What's Next

So far, we have explored dynamic asynchronous traversals in PGX.D only for graph querying and pattern matching with PGQL.
We are currently exploring how to further leverage these fast in-memory explorations for machine learning. For example, we are developing large-scale random walks on top of PGX.D to serve as the backbone for graph machine-learning solutions.

Conclusions

I briefly presented our new dynamic asynchronous traversal approach for distributed graphs in PGX distributed mode (PGX.D). With this approach, PGX.D achieves fast, scalable, fully in-memory distributed graph queries with a small memory footprint, enabling graph processing on a whole new scale of graphs and queries. The hybrid/dynamic traversal functionality is on the PGX.D roadmap, so stay tuned for news on its availability. For more information and to try PGX, visit the Oracle Labs PGX page on Oracle Technology Network.