September 4, 2015

One topic that fascinates most software engineers throughout their careers is which programming language is best. In the Developer’s Adventurous Guide to JVM languages, we looked at and compared a fleet of JVM languages including Scala, Groovy, and Clojure. However, this post, Erlang vs. Java, will be quite different from that kind of comparison. We’ll look at the Erlang VM and the Java VM and take note of similarities and differences without trying to perform an apples-to-apples comparison or say which is best or which we prefer.

Why Use Erlang?

So why Erlang? Well, ever since I saw a comic strip about writing distributed map-reduce queries in Erlang, I’ve been fascinated by the language.

Quoting the language authors: “Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability”. Since its inception the language has had a clear idea of the problems it is intended to solve. Any design trade-off could be resolved by focusing on that single goal. This focus makes the language incredibly purposeful.

The question we need to ask is whether we can take something we have written in a very opinionated language like Erlang and see how easy it is to write in a very mainstream, general purpose language like Java. In no way can I cover all the intricacies of any two languages in a single blog post, but we can certainly start the discussion! Here are the four main areas which this blog post will focus on:

Syntax and expressions

REPL

Concurrency

Other language perks

Comparison of Erlang vs. Java

Let’s start by looking at the most obvious difference that a developer sitting in front of their IDE would spot -- the syntax. For the most part I think that the syntax of a programming language is irrelevant, assuming it’s not too unreasonable and doesn’t give your teammates the necessary weapons to torment you (Scala, I’m looking at you). That said, it is a matter of taste and culture, so it matters a bit.

I’ll assume readers of this post are well versed in Java’s syntax, so I won’t talk about that too much. If you’re not, think of it as a C++-like language with a static type system, mostly object oriented, built from classes and the methods defined on them. There’s a set of keywords and you form the program by writing statements that the virtual machine will execute. Not familiar with C++ either? Sorry, we’ll call that our prerequisite for this post!

Erlang, on the other hand, is a functional language, so the code you write consists of expressions and functions organized into modules. Here’s an example of how it might look:

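-module(tut5).
-export([format_temps/1]).

%% format_temps/1 is exported above, so it must be defined for the module to compile
format_temps([]) ->                        % No output for an empty list
    ok;
format_temps([City | Rest]) ->
    print_temp(convert_to_celsius(City)),
    format_temps(Rest).

convert_to_celsius({Name, {c, Temp}}) ->   % No conversion needed
    {Name, {c, Temp}};
convert_to_celsius({Name, {f, Temp}}) ->   % Do the conversion
    {Name, {c, (Temp - 32) * 5 / 9}}.

print_temp({Name, {c, Temp}}) ->
    io:format("~-15w ~w c~n", [Name, Temp]).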



If you’re familiar with any functional programming languages, you’ll recognize the key parts of the code snippet above, which converts temperature values to the Celsius scale. For Erlang n00bs, the module is the basic compilation unit in the Erlang language, and it can contain functions that are specified by a name, a list of arguments and a list of expressions.
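For readers coming from the Java side, here is a rough sketch of the same conversion written in Java. The names Reading, Scale, toCelsius and format are made up for this illustration and are not part of any library. The point is only to show that the branch on the scale tag does the job that Erlang’s two function clauses do with pattern matching.

```java
import java.util.Locale;

// Hypothetical names, chosen for this illustration only.
final class Reading {
    enum Scale { C, F }

    final String name;
    final Scale scale;
    final double temp;

    Reading(String name, Scale scale, double temp) {
        this.name = name;
        this.scale = scale;
        this.temp = temp;
    }

    // Counterpart of the two convert_to_celsius clauses: the check on the
    // scale tag replaces the pattern match on {c, Temp} versus {f, Temp}.
    static Reading toCelsius(Reading r) {
        if (r.scale == Scale.C) {
            return r; // no conversion needed
        }
        return new Reading(r.name, Scale.C, (r.temp - 32) * 5 / 9);
    }

    // Counterpart of print_temp: the name is left-justified in 15 columns.
    static String format(Reading r) {
        return String.format(Locale.ROOT, "%-15s %s c", r.name, r.temp);
    }
}
```

A call such as Reading.toCelsius(new Reading("boston", Reading.Scale.F, 212)) yields a reading on the Celsius scale, while a reading that is already in Celsius is returned unchanged.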

Expressions are the building blocks of an Erlang program. They organize data into data types (tuples, lists, maps, etc.) and perform operations on that data, such as arithmetic or calls to other functions.

Erlang really respects the immutability of the data, making the variables assignable only once in a scope. It values concise code, so the pattern matching and the guards on the function arguments are built right into the language. Let’s jump to an example and define a function like the following:

list_max([Head|Rest], Result_so_far) when Head > Result_so_far -> body



The body of the function will be executed if you call list_max with two arguments, where the first is a non-empty list whose first element is greater than the second argument. If the arguments don’t match these conditions, this clause is skipped, and if no other clause of the function matches either, the call fails with a function_clause error. That is really cool, albeit a bit restrictive. Java has no direct equivalent for method definitions, although if you’re a fan of this type of function definition you might want to look at Scala. You’ll still be on the JVM, but you’ll have much more freedom with pattern matching.
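To make the contrast concrete, here is a minimal Java sketch of what the pattern and the guard have to become when written by hand. The class name ListMax is hypothetical, and since the post only shows one clause with a placeholder body, the recursive call and the other two branches are my assumption about how a complete list_max would look.

```java
import java.util.List;

// Hypothetical helper; only the middle branch corresponds to the clause in the post.
final class ListMax {
    // Each branch plays the role of one Erlang clause. The isEmpty check and
    // the comparison stand in for the [Head|Rest] pattern and the 'when' guard.
    static int listMax(List<Integer> list, int resultSoFar) {
        if (list.isEmpty()) {
            return resultSoFar;                // assumed clause: empty list
        }
        int head = list.get(0);
        List<Integer> rest = list.subList(1, list.size());
        if (head > resultSoFar) {
            return listMax(rest, head);        // the clause shown in the post
        }
        return listMax(rest, resultSoFar);     // assumed clause: guard is false
    }
}
```

The conditions that Erlang states in the function head end up inside the method body in Java, which is exactly the difference the paragraph above describes.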

Erlang vs. Java REPL

One of the first things you may notice when learning any language is how easy it is to run and play with small snippets of code. Many languages offer a REPL (read-eval-print loop), an environment in which you can write code, have it evaluated and see the result printed back to you. You can generally hook the REPL to an existing process and use that to investigate internal state or evaluate code that depends on the code loaded into the process.

Here’s an example of Erlang’s REPL. It’s really convenient and it can be a very powerful tool in the hands of an Erlang master.

$ erl
Erlang R14B (erts-5.8.1.1) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]

Eshell V5.8.1.1 (abort with ^G)
1> 3 + 4.
7



The latest version of Java doesn’t have a built-in REPL. However, this is merely a question of time: a REPL is being integrated into the upcoming JDK 9. In fact you can already access it with a little bit of manual sorcery.

Right now, the easiest way of getting a REPL capable of investigating the runtime state of the JVM process and evaluating code on the fly would be to use, ahem, the JavaScript console that is bundled with the JDK: jjs. While it doesn’t sound especially glorious or fancy, it is available in your JDK distribution right now and it’s pretty useful. Here’s a small hello world example:

It’s available to you now, it’s useful, and it can do math for you. The only limitation currently is that it’s not that easy to attach it to an existing JVM process. Give it a try and let me know what you think in the comments below!

Concurrency: Lightweight Processes and Message Passing

The underlying Java concurrency model is pretty clear. Essentially, a JVM has threads which can execute commands in parallel, share data, acquire and release locks and have fun together. On top of that, higher-level models such as Executors, Actors, Fibers, and more can simplify a developer's understanding of what is going on. In fact, we've looked at them in a previous post on Flavors of Concurrency in Java.
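As a reminder of what that shared-state model looks like in practice, here is a small sketch using only the standard library. SharedCounter and countInParallel are hypothetical names for this illustration. Several pool threads update one mutable field, and the lock is what keeps the result correct.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class: one mutable field shared by several threads.
final class SharedCounter {
    private final Object lock = new Object();
    private int value = 0;

    void increment() {
        synchronized (lock) { value++; }      // acquire, mutate, release
    }

    int get() {
        synchronized (lock) { return value; }
    }

    // Runs 'tasks' jobs on a pool of 'threads' threads; each job performs
    // 'perTask' increments on the same counter. Returns the final count.
    static int countInParallel(int threads, int tasks, int perTask) {
        SharedCounter counter = new SharedCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < tasks; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perTask; i++) {
                    counter.increment();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

Without the synchronized blocks the increments could interleave and updates could be lost, which is the class of problem a share-nothing model avoids by construction.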

Erlang's take on concurrency is far simpler. Immutability of data is strongly encouraged by the language itself, and Erlang doesn't allow its threads to access shared data. Since these threads don't share data, they are called processes. Don't confuse them with operating system processes, though: Erlang processes run inside a single virtual machine.

This is not a terrible restriction as message passing is a great way to make individually parallel entities communicate. After all, the OOP model was designed to encourage objects to pass messages to each other, rather than just encapsulation and creating factories.
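The same idea can be approximated in plain Java by giving a thread its own mailbox and never touching its state from outside. The sketch below is only an illustration under that assumption. Mailboxes, spawnEcho and demo are hypothetical names, and a BlockingQueue stands in for an Erlang mailbox, which it resembles only loosely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical example: threads that communicate only through queues.
final class Mailboxes {
    private static final String STOP = "stop";

    // Starts an "echo process" that reads its mailbox and answers on 'replies'.
    // Messages are immutable Strings, so nothing mutable is shared.
    static Thread spawnEcho(BlockingQueue<String> mailbox, BlockingQueue<String> replies) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String msg = mailbox.take();   // blocks until a message arrives
                    if (STOP.equals(msg)) {
                        return;
                    }
                    replies.put("echo: " + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    // Sends two messages, collects the two replies, then asks the process to stop.
    static List<String> demo() {
        BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        Thread echo = spawnEcho(mailbox, replies);
        List<String> received = new ArrayList<>();
        try {
            mailbox.put("hello");
            mailbox.put("world");
            received.add(replies.take());
            received.add(replies.take());
            mailbox.put(STOP);
            echo.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

Nothing in Java stops you from sharing mutable objects through the queue, so the discipline that Erlang enforces is only a convention here.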

So in Erlang the main concurrency pattern would be to spawn a process and pass messages to it. To manage its lifecycle and separate the business logic from, say, exception processing, you spawn another process for supervision. Ultimately you’ll end up with a tree-like structure of processes where a root process is supervising all its children.
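To give a feel for the supervision idea in Java terms, here is a deliberately tiny sketch. Supervisor and supervise are hypothetical names, and the restart policy shown (restart up to a fixed limit) is my own simplification rather than anything Erlang prescribes. The worker contains only business logic, while the supervising code decides what to do when the worker dies.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical example: a one-child "supervisor" built on plain threads.
final class Supervisor {
    // Runs 'worker' on its own thread and restarts it whenever it dies with
    // an uncaught exception, at most 'maxRestarts' times.
    // Returns how many restarts were needed before a clean exit.
    static int supervise(Runnable worker, int maxRestarts) {
        int restarts = 0;
        while (true) {
            AtomicReference<Throwable> failure = new AtomicReference<>();
            Thread child = new Thread(worker);
            // The handler records the crash instead of printing a stack trace.
            child.setUncaughtExceptionHandler((thread, error) -> failure.set(error));
            child.start();
            try {
                child.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (failure.get() == null) {
                return restarts;               // child finished normally
            }
            if (restarts == maxRestarts) {
                throw new IllegalStateException("worker kept crashing", failure.get());
            }
            restarts++;                        // crash observed: start a fresh child
        }
    }
}
```

The worker itself contains no try/catch at all, which mirrors the separation of business logic from error handling described above.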

This is a great model, right? It is simple enough to be well understood by a human, it allows and encourages supervision and error handling, and it also makes code quite readable. Also, every process takes just a bit of memory, unlike java.lang.Thread, so this design is actually much more scalable and awesome.

Now back to the Java world. While it’s true that the concurrency primitives which are provided out of the box are Threads and Executors (pools of threads fed from a task queue), more advanced models have been created and are well used.

A similar concurrency model would be an actor framework. Let’s take the Akka framework as an example to discuss here. In code, you would create an actor system with ActorSystem system = ActorSystem.create("Search"); which will manage all the actor objects. The Actor concept is actually quite similar to the Erlang process concept. Actors can send messages to each other, be organized into hierarchies to enable supervision and decide upon the appropriate action in the event of a failure.

You can find a short example of Akka code in this post on Java concurrency I mentioned earlier, or just check out their great documentation.

Additionally, if you want to take things even further you can look at the Quasar library by Parallel Universe. It works by instrumenting your code and providing lightweight threads that can exchange messages with each other over channels, very efficiently. The whole system is backed by continuations, so some methods can be suspended to give CPU cycles to other threads. It’s more complicated than this short description of course, but it is a beautiful solution, so you won’t regret even the time you spend just reading the docs for Quasar.

Perks: Code Reloading & Fault Tolerance

A really great benefit of having stateless functional code is circumventing problems when updating your live application. Erlang’s runtime is designed to keep two versions of a module’s code, the current one and the old one, loaded and executable at the same time. So whenever you need to release a new version of your software, you can just provide the definitions to the running system and the new calls will go through the new code. How cool is that!

There’s a catch though: as far as I’m aware, the new code will not necessarily be called when you invoke a function. If you use the fully qualified function name with the module name, then the call is considered external and the runtime will always direct it to the new version of the code. If you make a local call without the module name from code that is still running the old version, then the old version will be invoked, as long as it is still present in the runtime.
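The difference between the two kinds of calls can be mimicked in Java, although only as a loose analogy and not as a description of how the Erlang VM works. In the sketch below, CodeRegistry, load, call and capture are hypothetical names. Looking a function up by name on every call behaves like a fully qualified call, while holding on to a reference obtained earlier behaves like code that keeps running the old version.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntUnaryOperator;

// Hypothetical analogy only: a name-to-function table that can be updated live.
final class CodeRegistry {
    private final Map<String, IntUnaryOperator> current = new ConcurrentHashMap<>();

    // "Loading" a new version replaces what later lookups by name will see.
    void load(String name, IntUnaryOperator fn) {
        current.put(name, fn);
    }

    // Analogue of a fully qualified call: resolved through the table each time,
    // so it always reaches the newest version.
    int call(String name, int arg) {
        return current.get(name).applyAsInt(arg);
    }

    // Analogue of staying in old code: the caller keeps whatever version
    // was current when it captured the reference.
    IntUnaryOperator capture(String name) {
        return current.get(name);
    }
}
```

After loading a second version under the same name, call picks it up immediately, whereas a reference captured before the update keeps executing the first version.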

While it sounds like a half-baked solution, it is pretty impressive to have it built into the VM. Erlang’s authors were quite sure about where they wanted Erlang to shine: long-running, soft real-time systems.

The JVM of course cannot make the same claim. Because of the runtime state held in any application, replacing code becomes much harder. The closest thing to Erlang’s module reloading system is perhaps OSGi with its ability to replace bundles on the fly. In return it asks you to limit the freedom of your project architecture in favor of bundles separated by interfaces.

Of course in development you can always use JRebel to reload your code changes in the running JVM process. However, while it can save you time in development, it is not a solution to include in your production or mission critical environments.

The fault tolerance aspect of Erlang comes more from the philosophy and program structure. You use processes for any complex action in the system and all of them are hooked into a central system that supervises their behavior. Of course you can write bad programs in any language, but if you follow best practices and conventions, Erlang makes it harder for your system to crash because of an unexpected error. Then again, if you apply common sense, best practices and modern libraries, Java becomes a lot less fragile in reality.

Final Thoughts

In this post we’ve looked at Erlang the language and its virtual machine and tried to see if tasks that are easy to achieve there are equally easy in Java or on top of the JVM. Oh, and there’s even an Erlang implementation that runs on top of the JVM: Erjang. It’s capable of running Erlang programs, and although it has its own quirks and differences, it does work!

To conclude, I just want to clarify that I didn’t have any intention to point fingers, start a flamewar or say which language is better than another, but instead look at a really cool language I’m not that familiar with and see if the good parts from that language are as available and easy to use on the platform that I use and love. Not surprisingly, the JVM is a great platform and even when Java the language is not flexible enough to accommodate something others can, other JVM languages and the ecosystem come to the rescue and save the day.
