The number of organizations investing in Scala is exploding, and for good reason. Scala combines Object Oriented and Functional capabilities as well as immutability, which makes it an extremely powerful foundation for applications that need to run at massive scale.

This series on Scala aims to bridge the gap between theory and practice by focusing on something that is not easily found on the open internet: Actual examples of functional concepts being used in production, at scale. We’ll even give you code samples!

Part 1 of the series dives into something that’s near and dear to all of us: How to incorporate error handling as a primary concern with a minimal level of effort.

Disclaimer: This series assumes that you have a basic knowledge of Scala. You’ll still be able to read most of the examples if you don’t know Scala, but you might miss some nuances.

The Traditional Way

Exception handling. I thought this was the slickest way to do things back in the day. So you mean to tell me I can just create and throw some exception in the bowels of my application and let it bubble back up to the top? Sounds great!

Except when you put it into practice, in production, at scale.

There are two major issues with traditional exception handling in Scala: exceptions are untyped (Scala has no checked exceptions, so nothing in a method signature says what it can throw), and they sit outside regular program control flow. They “break the rules.” Let’s look at some code:

/**
 * Validates whether some string is a valid ID
 */
def parseId(idString: String): String = idString match {
  case id if id != null && id.nonEmpty && id.length <= 20 => id
  case _ => throw new RuntimeException("ID cannot be empty, null, or greater than 20 characters")
}

try {
  val parsedId = parseId("12345")
  println(s"ID $parsedId is valid!")
} catch {
  case e: RuntimeException => println(s"Error parsing ID: ${e.getMessage}")
}

Two things jump right out:

The error case is an afterthought.

It’s not part of the method signature at all. Clients of that method would have no way of knowing if a maintainer changed code somewhere down in the bowels of the application so that it throws a different exception.

Always remember this: Compile time safety is your friend. Unhandled runtime exceptions are your enemy.

This issue crosses the language barrier as well. You’ll see constructs like the following absolutely everywhere in Java:

public void myCrappyMethod() throws IOException, MalformedIdException, SomeExceptionFromTheDepths {
    File f = getFile("some/path/to/file.json");
    appendStuffToJsonFile(f);
    saveFile(f);
}

Can you tell me what method is going to throw which exception? Doesn’t really matter; we’re just going to pass them up the chain and hope for the best!

Want another issue with traditional exception handling? There are no graceful ways to accumulate errors. Sure you could smash together a list of strings before you throw an exception and add that as the message, but what if you don’t control the code throwing the exception? Now you basically need to catch the exceptions and try to add them to some kind of list of exceptions and then check if myErrorList.length > 0 somewhere down the line. Convoluted, tons of noise, and hard to maintain.
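To make that pain concrete, here is a minimal sketch of what "accumulating" errors looks like when all you have is exceptions. The object name ExceptionAccumulationSketch and the collectErrors helper are made up for illustration; parseId is the throwing version from above.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.control.NonFatal

// Hypothetical sketch: collecting errors from code that only knows how to throw.
object ExceptionAccumulationSketch {

  // Same validation rule as the throwing parseId shown earlier.
  def parseId(idString: String): String = idString match {
    case id if id != null && id.nonEmpty && id.length <= 20 => id
    case _ => throw new RuntimeException("ID cannot be empty, null, or greater than 20 characters")
  }

  // Every call needs its own try/catch, plus a mutable buffer on the side.
  def collectErrors(ids: List[String]): List[String] = {
    val errors = ListBuffer.empty[String]
    ids.foreach { id =>
      try {
        parseId(id)
      } catch {
        case NonFatal(e) => errors += s"'$id': ${e.getMessage}"
      }
    }
    errors.toList
  }

  def main(args: Array[String]): Unit = {
    val errors = collectErrors(List("12345", "", null))
    // The check the compiler will never remind you to write.
    if (errors.nonEmpty) errors.foreach(println)
  }
}
```

Nothing in the type of parseId tells a caller that the try/catch is needed, and forgetting the final check on the error list compiles just fine.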

Enter the Concept of Either

Here at Threat Stack we see a lot of errors, mainly because of the sheer volume and diversity of data that we ingest every second. Handling and collecting those errors is vital to a stable platform. The Scala standard library has a class known as Either (scala.util.Either). Either allows us to specify that a method will return either thing A or thing B. Right away, error conditions become first-class citizens and part of the method signature. You’ll see a shiny new method called getMeMyStringPeasant() below which explicitly declares that it will return Either[AppError, String]. That reads so nice, doesn’t it? Enough plain text; let’s check out some code:

// Generic application error class
case class AppError(msg: String)

def getMeMyStringPeasant(shouldFail: Boolean): Either[AppError, String] = shouldFail match {
  case true => Left(AppError("We failed you!"))
  case false => Right("We always win!")
}

val result = getMeMyStringPeasant(false)

// Since we asked for the "Good" result, we do some transformation
// and print out the result.
result.right.map(_ + " But why do I need right?")
  .right.foreach(println(_))

Remember the Either type declaration. The right side has the String. The left, or “Bad” side, would have an AppError if it failed. This example will print “We always win! But why do I need right?”. The “why do we need the right” part is answered below.

OK but how would you handle the error case? Well, since our AppError is on the left side, we use the left value. Let’s change the code up to this:

// Pass true to hit the error condition
val result = getMeMyStringPeasant(true)

result.right.map(_ + ": but why do I need right?")
  .right.foreach(println(_))

// Print the error out to the user
result.left.foreach(error => println(error.msg))

This time, it will print out “We failed you!”. You’ll notice that I left in the right side transformations and output. That’s OK since those never run when we are working with a Left value.
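If you would rather handle both sides in one expression instead of calling left and right separately, the standard library's fold on Either takes one function per side. A minimal sketch, where the object name EitherFoldSketch and the describe helper are invented for illustration:

```scala
// Hypothetical sketch: handling both sides of an Either with fold.
object EitherFoldSketch {
  final case class AppError(msg: String)

  def getMeMyStringPeasant(shouldFail: Boolean): Either[AppError, String] =
    if (shouldFail) Left(AppError("We failed you!")) else Right("We always win!")

  // The first function handles a Left, the second handles a Right.
  // Both must produce the same result type, here a String.
  def describe(result: Either[AppError, String]): String =
    result.fold(
      error => s"Error: ${error.msg}",
      value => s"Success: $value"
    )

  def main(args: Array[String]): Unit = {
    println(describe(getMeMyStringPeasant(shouldFail = false)))
    println(describe(getMeMyStringPeasant(shouldFail = true)))
  }
}
```

Because fold demands a function for each side, there is no way to "forget" the error branch and still compile.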

This works fine for a tiny example like this, but in the real world we’re usually dealing with something much more complicated. A true example could be something like: dealing with input from various sources and then calling APIs with that input or persisting them in a database somewhere. All of these actions are fraught with danger and possibilities for errors. Let’s mock out a scenario where we get an ID, need to validate/parse it, and then persist it to a backend datastore. We won’t actually persist it, but we’ll show what we want to accomplish. For multiple step processes like this, I really want to use a for-comprehension. I’m going to be calling two methods that both return an Either[AppError, _]. The underscore just shows that the right side values could be different types (and they will be). Time for some juicy code! Here’s the setup:

// Generic application error type class
trait AppError {
  val msg: String
}

// Specific error types
case class MalformedIdError(msg: String) extends AppError
case class DbError(msg: String) extends AppError

/**
 * Makes sure that the given ID is valid. A valid ID is defined as:
 * 1) Not null
 * 2) Not empty string
 * 3) Not greater than 20 characters
 * @param idString
 * @return
 */
def parseId(idString: String): Either[MalformedIdError, String] = idString match {
  case id if id != null && id.nonEmpty && id.length <= 20 => Right(id)
  case _ => Left(MalformedIdError("ID cannot be empty, null, or greater than 20 characters"))
}

/**
 * Our sassy method that tries to insert the ID into a fake database. Returns the
 * number of characters "inserted" into the DB
 */
def storeThatId(id: String): Either[DbError, Int] = {
  // Normally there would be some code for DB here but let's just keep
  // it simple. If the database is not "full" we will complete the insert

  // This mock method will check DB size by securely running our Math.random function
  def isDbFull: Boolean = {
    Math.random() * 100 > 50
  }

  isDbFull match {
    case false => Right(id.length)
    case _ => Left(DbError("Database is full! Try vacuuming or something!"))
  }
}

Well that’s some great looking code, but how do we actually use it? A for-comprehension would be perfect!

val idCreationResult = for {
  validId <- parseId("123456").right
  dbResult <- storeThatId(validId).right
} yield dbResult

// Display the results
idCreationResult.right.foreach(_ => println("Created the ID!"))
idCreationResult.left.foreach(err => println(s"Error inserting ID: ${err.msg}"))

Now, depending on how lucky you are, you’ll either see “Created the ID!” or “Error inserting ID: Database is full! Try vacuuming or something!”

But wait. What happens if parseId() fails? Well that’s the beauty of Either. It’s a form of short circuit error handling. That means on first failure, we’ll get a left value and anything after parseId() will not run. To teh code!

// Pass null into parseId()
val idCreationResult = for {
  validId <- parseId(null).right
  dbResult <- storeThatId(validId).right
} yield dbResult

// Display the results
idCreationResult.right.foreach(_ => println("Created the ID!"))
idCreationResult.left.foreach(err => println(s"Error inserting ID: ${err.msg}"))

This time you’ll see: “Error inserting ID: ID cannot be empty, null, or greater than 20 characters”. That’s a thing of beauty! We can now design code that is more modular, error-centric, typesafe, and fault-tolerant.
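If you want proof that the second step really is skipped, a call counter makes the short circuit visible. This is a sketch under a couple of assumptions: the object name ShortCircuitSketch, the storeCalls counter, and the create helper are invented for illustration, and it targets Scala 2.12 or later, where the standard library Either can be used in a for-comprehension without calling .right (on older versions, add .right as in the examples above).

```scala
// Hypothetical sketch: observing that a Left skips every later step.
// Assumes Scala 2.12+, where Either is right-biased.
object ShortCircuitSketch {
  sealed trait AppError { def msg: String }
  final case class MalformedIdError(msg: String) extends AppError

  // Counts how often the "database" step actually runs.
  var storeCalls: Int = 0

  def parseId(idString: String): Either[AppError, String] =
    if (idString != null && idString.nonEmpty && idString.length <= 20) Right(idString)
    else Left(MalformedIdError("ID cannot be empty, null, or greater than 20 characters"))

  // Always succeeds here so that the only variable is whether it gets called.
  def storeThatId(id: String): Either[AppError, Int] = {
    storeCalls += 1
    Right(id.length)
  }

  def create(rawId: String): Either[AppError, Int] =
    for {
      validId <- parseId(rawId)
      stored  <- storeThatId(validId)
    } yield stored

  def main(args: Array[String]): Unit = {
    println(create(null))
    println(s"storeThatId was called $storeCalls time(s)")
  }
}
```

After create(null), the counter is still zero: the for-comprehension never reached storeThatId.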

That wraps up the basics of Either. Keep reading to go further in depth or take a break. I’m probably not your supervisor, so I can’t tell you what to do.

At Threat Stack, we don’t use Either directly. We mostly use a class called Xor which is in the Cats library. Cats is a library that I’ve come to really love. It’s a functional utility library that aims to be simpler to use than the more well known (and highly scientific) Scalaz library.

The issue with Either is that you have to explicitly map over the right value to handle the “Good” case. Most of the time you want to map over the good case automatically and explicitly handle the “Bad” case. Xor and Either are isomorphic (any Either value can be rewritten as an Xor value, and vice versa). Xor is monadic and right-biased, meaning that you will map(), flatMap(), and foreach() over the right value by default (without having to call .right). This allows you to use them naturally in for-comprehensions to map over the good results (the most common use case). Let’s rewrite our example using Xor. To teh codez!

import cats.data.Xor

// Generic application error type class
trait AppError {
  val msg: String
}

// Specific error types
case class MalformedIdError(msg: String) extends AppError
case class DbError(msg: String) extends AppError

/**
 * Makes sure that the given ID is valid. A valid ID is defined as:
 * 1) Not null
 * 2) Not empty string
 * 3) Not greater than 20 characters
 */
def parseId(idString: String): Xor[MalformedIdError, String] = idString match {
  case id if id != null && id.nonEmpty && id.length <= 20 => Xor.right(id)
  case _ => Xor.left(MalformedIdError("ID cannot be empty, null, or greater than 20 characters"))
}

/**
 * Our sassy method that tries to insert the ID into a fake database. Returns the
 * number of characters "inserted" into the DB
 */
def storeThatId(id: String): Xor[DbError, Int] = {
  // Normally there would be some code for DB here but let's just keep
  // it simple. If the database is not "full" we will complete the insert

  // This mock method will check DB size by securely running our Math.random function
  def isDbFull: Boolean = {
    Math.random() * 100 > 50
  }

  isDbFull match {
    case false => Xor.right(id.length)
    case _ => Xor.left(DbError("Database is full! Try vacuuming or something!"))
  }
}

val idCreationResult = for {
  validId <- parseId("123456")
  dbResult <- storeThatId(validId)
} yield dbResult

// Display the results
idCreationResult.map(_ => println("Created the ID!"))
idCreationResult.leftMap(err => println(s"Error inserting ID: ${err.msg}"))

As you can see, not much changed. The return type changed from Either to Xor. That’s cool, but the really cool thing is in how we use the methods and handle results. Notice that we don’t call .right in the for-comprehension anymore. We also don’t need left or right to display the results (except for explicitly handling the error case by calling leftMap()). The other nice thing is that Xor supports infix type declarations. This means you can rewrite:

Xor[DbError, Int]

To be:

DbError Xor Int

It’s completely up to user preference, but I like the infix notation.
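Infix type notation is a Scala language feature rather than something special to Xor: any type that takes exactly two type parameters can be written this way. A small sketch using the standard library Either (the object name InfixTypeSketch and both method names are invented for illustration):

```scala
// Hypothetical sketch: prefix and infix spellings of the same two-parameter type.
object InfixTypeSketch {
  final case class DbError(msg: String)

  // Prefix notation
  def prefixStyle(id: String): Either[DbError, Int] = Right(id.length)

  // Infix notation, read as "DbError or Int"
  def infixStyle(id: String): DbError Either Int = Right(id.length)

  // Both spellings name the same type, so they are interchangeable.
  val sameType: Either[DbError, Int] = infixStyle("12345")
}
```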

Accumulating Errors

OK, let’s get fancy. We’re going to take another step down the rabbit hole to becoming insanely responsible programmers. We just looked at how to handle error flows in a short-circuit scenario. That is, once the first error is hit, we want to split off normal program flow and handle the error. That’s neat, but what if we want to try a bunch of things and report any and all errors that occurred during the operation? This is where Validated from the Cats library comes into play. Unlike Xor which is short-circuit, Validated exists to accumulate errors.

A real-world example that we have at Threat Stack is validating the incoming data from various event streams. When an invalid event comes in, we want as much information about it as possible. If multiple fields are invalid, we want to know about all of them, not just that we caught one bad field when in reality we have a larger issue that could be solved if we had more context. So how do we accomplish this? To teh codez!

Note: This is where non-Scala folk might lose a good amount of context, but the overall idea should be accessible to everyone.

Up first, let’s get some model classes built. We’ll use a generic MessageError trait and extend that for each specific type of error. In this case, we’ll have two error scenarios:

MissingFieldError: This error is generated when a required field in the message is missing.

FieldParseError: This will be generated when we try to parse a specific field but fail miserably because of any number of reasons.

import cats.Apply
import cats.data._
import cats.data.Validated.{Invalid, Valid}
import cats.std.list._
import org.joda.time.DateTime
import scala.util.Try

// Type classes for the various error cases
sealed trait MessageError {
  val field: String
  val msg: Option[String]
}

case class MissingFieldError(field: String, msg: Option[String] = None) extends MessageError {
  override def toString = s"Required field is missing: $field - ${msg.getOrElse("")}"
}

case class FieldParseError(field: String, msg: Option[String] = None) extends MessageError {
  override def toString = s"Unable to parse field: $field - ${msg.getOrElse("")}"
}

/**
 * Simple class for representing our parsed message
 */
case class ValidMessage(
  id: String,
  insertTime: DateTime,
  payload: String
)

Now that we have models, let’s define an object that will parse the fields we know about and return the result. Each getter will return a Validated[MessageError, _] (where the underscore is any type a valid field could be).

object MessageParser {

  private def getId(msg: Map[String, Any]): Validated[MessageError, String] = {
    msg.get("id") match {
      case Some(id: String) if id != null && id.nonEmpty => Valid(id)
      case Some(invalidId) => Invalid(FieldParseError("id", Some(s"Invalid ID: $invalidId")))
      case _ => Invalid(MissingFieldError("id"))
    }
  }

  private def getInsertTime(msg: Map[String, Any]): Validated[MessageError, DateTime] = {
    msg.get("insert_date") match {
      case Some(rawDateTime: String) => Try(DateTime.parse(rawDateTime)).toOption match {
        case Some(validDate) => Valid(validDate)
        case None => Invalid(FieldParseError("insert_date", Some(s"Invalid raw date: $rawDateTime")))
      }
      case Some(invalidRawDate) => Invalid(FieldParseError("insert_date", Some(s"Invalid raw date: $invalidRawDate")))
      case None => Invalid(MissingFieldError("insert_date"))
    }
  }

  private def getPayload(msg: Map[String, Any]): Validated[MessageError, String] = {
    msg.get("payload") match {
      case Some(data: String) if data != null && data.nonEmpty => Valid(data)
      case Some(invalidData) => Invalid(FieldParseError("payload", Some(s"Invalid payload: $invalidData")))
      case None => Invalid(MissingFieldError("payload"))
    }
  }

  // The parse() method that uses these private getters is defined next,
  // inside this same object, followed by the object's closing brace.

We have three fields that we will parse explicitly: Id, Insert Time, and Payload (which in this case is just some string data). We could get a lot fancier than this by having some kind of Reader type class that implicitly pulls fields out as a given type or fails, but for brevity we’ll just use this simple method of extraction. In each getter we have specific logic as to what defines each field as Valid or Invalid.
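For the curious, here is roughly what that fancier Reader idea could look like. This is only a sketch of the pattern, not what we run in production: FieldReader, its instances, and the field helper are all hypothetical names, and it uses the standard library Either with plain String errors to stay dependency-free.

```scala
import scala.util.Try

// Hypothetical sketch of a "Reader" type class that pulls a field out as a given type.
object FieldReaderSketch {

  // Knows how to turn a raw, untyped value into an A (or give up).
  trait FieldReader[A] {
    def read(raw: Any): Option[A]
  }

  object FieldReader {
    implicit val stringReader: FieldReader[String] = new FieldReader[String] {
      def read(raw: Any): Option[String] = raw match {
        case s: String if s.nonEmpty => Some(s)
        case _ => None // also covers null, which never matches a type pattern
      }
    }

    implicit val intReader: FieldReader[Int] = new FieldReader[Int] {
      def read(raw: Any): Option[Int] = raw match {
        case i: Int => Some(i)
        case s: String => Try(s.toInt).toOption
        case _ => None
      }
    }
  }

  // The requested type A picks the reader implicitly.
  def field[A](msg: Map[String, Any], name: String)(implicit reader: FieldReader[A]): Either[String, A] =
    msg.get(name) match {
      case None => Left(s"Required field is missing: $name")
      case Some(raw) => reader.read(raw).toRight(s"Unable to parse field: $name")
    }

  def main(args: Array[String]): Unit = {
    val msg: Map[String, Any] = Map("id" -> "12345", "count" -> "42")
    println(field[String](msg, "id"))
    println(field[Int](msg, "count"))
    println(field[String](msg, "payload"))
  }
}
```

The three getters would then collapse into calls like field[String](msg, "id"), at the cost of one more abstraction to explain.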

The last thing we need is the method that actually takes the raw message, and returns a Valid result or a list of errors. Here is what that method will look like:

  // ValidatedNel[E, A] is a Validated whose error side is a NonEmptyList[E] (a NEL).
  // A NEL is very cool in the sense that it statically ensures that the list
  // will contain at least 1 element. A deep dive of that is outside the scope
  // of this article but make sure to check them out
  def parse(message: Map[String, Any]): ValidatedNel[MessageError, ValidMessage] = {
    // Partially applied type for the Apply instance
    // This is an advanced Scala topic so a deep dive of this is outside the scope
    // of this article. Just know that the partially applied type basically needs
    // to say that we're going to get a ValidatedNel containing MessageError for the
    // left value and some generic value for the right (could be String, Int, DateTime, etc)
    type PartialValidatedNel[A] = ValidatedNel[MessageError, A]

    Apply[PartialValidatedNel].map3(
      getId(message).toValidatedNel,
      getInsertTime(message).toValidatedNel,
      getPayload(message).toValidatedNel
    ) {
      case (id, insertTime, payload) => ValidMessage(id, insertTime, payload)
    }
  }
} // end of MessageParser

OK. So there are some advanced concepts in there that we should probably talk about. Using Apply from the Cats library allows us to pass in our partial type and use all the goodness of Applicatives. Why do we need that? Because we want to be able to apply a function to N Validated results. In this case, we use map3 to suck in the results from each getter, and when all three are done, we get a result. If all three are Valid, that function with id, insertTime, and payload will get executed, and we will map the values to a ValidMessage. That function will not get executed if we encountered validation errors; instead, the errors from every Invalid getter are combined into one non-empty list.
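If map3 still feels like magic, it can help to see the idea stripped of the library. The following is a hand-rolled sketch of the same accumulation behavior using only the standard library; it is not how Cats implements it, and the names AccumulationSketch, Checked, and this map3 are made up for illustration.

```scala
// Hypothetical sketch: what "apply a function to three validated values" boils down to.
object AccumulationSketch {
  // A list of error messages on the left, a value on the right.
  type Checked[A] = Either[List[String], A]

  // Runs f only if all three inputs succeeded; otherwise keeps every error, in order.
  def map3[A, B, C, Z](a: Checked[A], b: Checked[B], c: Checked[C])(f: (A, B, C) => Z): Checked[Z] =
    (a, b, c) match {
      case (Right(x), Right(y), Right(z)) => Right(f(x, y, z))
      case _ => Left(List(a, b, c).collect { case Left(errors) => errors }.flatten)
    }

  val id: Checked[String] = Right("12345")
  val badDate: Checked[Long] = Left(List("Unable to parse field: insert_date"))
  val badPayload: Checked[String] = Left(List("Unable to parse field: payload"))

  // Two of the three inputs failed, so both messages survive.
  val result: Checked[(String, Long, String)] =
    map3(id, badDate, badPayload)((i, d, p) => (i, d, p))

  // All inputs succeeded, so the function runs.
  val allGood: Checked[Int] =
    map3[Int, Int, Int, Int](Right(1), Right(2), Right(3))(_ + _ + _)
}
```

The key difference from a for-comprehension is that all three inputs are evaluated before anything is combined, so a failure in the first one cannot hide a failure in the third.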

Phew, that’s some code for you! If you need the TLDR: We basically set up a case class called ValidMessage which we will populate if all fields are valid in the raw message. We then have an object aptly named MessageParser which is responsible for parsing each field and returning the results via a ValidatedNel instance which will either contain the valid message or a list of errors.

Implementation is cool but how the hell do we use this stuff? That’s the beauty of it. It all boils down to this:

// Our mock raw event
val rawEvent = Map(
  "id" -> "12345",
  "insert_date" -> DateTime.now().toString,
  "payload" -> "Damn, I hope this works!"
)

// Parse the raw event
val parseResults = MessageParser.parse(rawEvent)

parseResults.map(validMessage => println(s"Got a valid message! Here it is: $validMessage"))

This ends up printing:

“Got a valid message! Here it is: ValidMessage(12345,2016-06-29T15:52:35.157-04:00,Damn, I hope this works!)”

Fine. That’s the valid case, but where is the error accumulation that we have been promised since the start of this article? Here goes:

// Data from raw event
// Notice the invalid date and payload
val rawEvent = Map(
  "id" -> "12345",
  "insert_date" -> "foobarDate",
  "payload" -> null
)

// Parse the raw event
val parseResults = MessageParser.parse(rawEvent)

parseResults.map(validMessage => println(s"Got a valid message! Here it is: $validMessage"))

parseResults.leftMap { errors =>
  println("Ugh oh, we encountered message errors!")
  errors.unwrap.foreach(e => println(e.toString))
}

BOOM! This will end up printing:

Ugh oh, we encountered message errors!

Unable to parse field: insert_date - Invalid raw date: foobarDate

Unable to parse field: payload - Invalid payload: null

You now have a way to handle and accumulate errors in a declarative, type safe manner. That’s a hugely powerful tool to have in your arsenal!

Wrapping Up . . .

If you made it this far, you now know multiple ways to revolutionize the way you handle application errors in Scala. The main takeaway here is that you always want to explicitly declare error and success paths because you get compile time protection and uber clear code paths. You’ll never again be digging through source code to figure out which method is throwing some exception forty layers deep in a stack trace.

Ready for more? Take a look at Part 2 — Scala @ Scale: Compose Yourself!