hlivy

Description

hlivy is a Haskell library that provides bindings to the Apache Livy REST API, which lets one launch Spark applications -- either interactively or in batch mode -- via HTTP requests to the Livy server running on the master node of a Spark cluster.

Usage

Start with

import Network.Livy

which brings all functionality into scope. In particular, this exposes a monad Livy that has all the capabilities required to run Livy actions with runLivy. Generally, a Livy action follows the pattern

send $ basicRequestObject requiredArg1 requiredArg2 & requestLens1 ?~ optionalArg1 & requestLens2 ?~ optionalArg2
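To make the shape of this pattern concrete, here is a minimal sketch that uses only base and does not depend on hlivy or lens. Everything in it is a hypothetical stand-in: DemoRequest, demoRequest, className, executorCores, and the locally defined ?~ are illustrative names, not part of the hlivy API. It shows only the idea of a smart constructor that takes the required arguments, followed by optional fields set through lens-style setters chained with &.

```haskell
import Data.Function ((&))
import Data.Functor.Identity (Identity (..))

-- Hypothetical request type standing in for a hlivy request object.
data DemoRequest = DemoRequest
  { drFile          :: String        -- required argument
  , drClassName     :: Maybe String  -- optional, unset by default
  , drExecutorCores :: Maybe Int     -- optional, unset by default
  } deriving (Show, Eq)

-- Smart constructor: takes only the required argument.
demoRequest :: String -> DemoRequest
demoRequest file = DemoRequest file Nothing Nothing

-- A setter in the van Laarhoven style, specialised to Identity.
type Setter s a = (a -> Identity a) -> s -> Identity s

className :: Setter DemoRequest (Maybe String)
className f r = (\v -> r { drClassName = v }) <$> f (drClassName r)

executorCores :: Setter DemoRequest (Maybe Int)
executorCores f r = (\v -> r { drExecutorCores = v }) <$> f (drExecutorCores r)

-- Local stand-in for the lens operator of the same name:
-- set an optional (Maybe) field to Just the given value.
infixr 4 ?~
(?~) :: Setter s (Maybe a) -> a -> s -> s
l ?~ v = runIdentity . l (const (Identity (Just v)))

main :: IO ()
main = print $ demoRequest "/user/hadoop/my-app.jar"
  & className ?~ "com.company.my_app"
  & executorCores ?~ 4
```

Because & binds more loosely than ?~, each `lens ?~ value` forms a function from request to request, and & threads the request through them from left to right. Fields that are never set stay at Nothing.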

This action is run simply:

let req = basicRequestObject requiredArg1 requiredArg2 & requestLens1 ?~ optionalArg1 & requestLens2 ?~ optionalArg2
resp <- runLivy env $ send req

where env is a suitable environment. Concretely, if one wanted to create an interactive session, one would do something like this:

λ => import Network.Livy
λ => -- Create a default environment
λ => env <- newEnv "localhost" 8998
λ => resp <- runLivy env $ send createSession

The response body, in this case a CreateSessionResponse, should contain the Session just created.

With this Session at hand, one can run "statements" -- snippets of Spark Scala, PySpark, SparkR, or SparkSQL -- in the given session.

λ => req = runStatement (SessionId 0) "val x = 1 + 1; println(x)" SparkSession
λ => resp <- runLivy env $ send req

This response object, in this case a RunStatementResponse, contains the information needed to check on the status of the statement or retrieve results if available.

Batch actions are organized in the Network.Livy.Client.Batch module, and are used similarly:

λ => import Control.Lens
λ => -- Application JAR in HDFS
λ => req = createBatch "/user/hadoop/my-app.jar"
λ => -- Set the optional fields with lenses, then send the request
λ => resp <- runLivy env . send $ req & cbClassName ?~ "com.company.my_app" & cbExecutorCores ?~ 4

See examples for further usage.