Why Does Erlang Have Weird Syntax?

I’ve heard the argument many times. People “don’t like Erlang’s syntax so [they] don’t like Erlang.” I, for instance, didn’t understand the block terminator syntax when I was first learning Erlang, so I asked Yariv Sadan about it:

Erlang syntax came from Prolog. The ‘end’ keyword is used to end code blocks that are contained inside the body of a function, and the ‘.’ symbol is used to close top-level function definitions. — Yariv Sadan

But what is the purpose of such a strange syntax? Is Erlang’s syntax really that weird or is it the single-assignment semantics which people don’t like? Erlang’s SAS derives from the functional programming paradigm which begs the real question: “Why is Erlang a functional language?”

The answer in a nutshell: To attain more language features, like message passing in Erlang’s case, the language incurs some overhead so efficiency gains must be made to compensate. It’s the same concept behind data structures: In order to know more about data, semantic restrictions can be imposed. In this case, the data is the program source itself and the imposed restriction is data immutability.

Erlang’s data immutability can be split into two levels: intra-process variable immutability and inter-process data isolation.

Why Doesn’t Erlang Have Share Data?

Concurrency is obviously the justification for Erlang’s isolation memory model, with the alternative being locking to synchronize destructive shared data operations. Lock complexity makes scaling hard because locks aren’t composable and are unassociated with the shared data. If data isolation is enforced between concurrent components (what Erlang calls processes), composibility can be maintained because destructive shared data modifications do not occur. Composability, or building instructions with instructions, is an inherent concept of programming and it’s why sequential programs are desirable.

This example excerpt from an older java.lang.StringBuffer class illustrates a common locking misconception. The author wanted to compose the length and getChars calls into one atomic operation to ensure critical data wouldn’t be modified between them. The author attempted to compose the calls by wrapping them in a synchronized method which locks this , not sb . Unfortunately, another thread could still modify sb in between the calls.

public final class StringBuffer { public synchronized StringBuffer append(StringBuffer sb) { int len = sb.length(); ... // other threads may change sb.length(), ... // so len does not reflect the length of sb sb.getChars(0, len, value, count); ... } public synchronized int length() { ... } public synchronized void getChars(...) { ... } ... }

Software transactions and message passing are fundamentally better synchronization abstractions than locks because of, among other things, composibility and accuracy, at the cost of greater overhead. STM can be built from message passing and I consider both to be acceptable replacements for lock based synchronization.

Why Does Erlang Have Data Immutability?

It seems that intra-process data isolation would be enough for a clean synchronization scheme, yet this is not entirely true. The immutable shared data model can be made even more complete when applied at the local sequential data level to achieve even more benefits:

Immutability efficiency gains compensate for message passing overhead.

Immutability of data can, in many cases, lead to execution efficiency in allowing the compiler to make assumptions that are unsafe in an imperative language. — Wikipedia To reiterate what I said earlier, the restriction of data immutability allows the compiler to know more about the program and as a result, regain some efficiency.

Immutability simplifies type policies for message passing.

In Scala, [an OOP language with message passing,] you can send between actors pointers to mutable objects. This is the classic recipe for race conditions, and it leaves you just where you started: having to ensure synchronized access to shared memory. — Yariv Sadan To expand on the type policy issue a bit more, it’s simpler to make all data immutable, both local and shared. This reduces the need for special cases intended to disallow sending pointers, or data structures containing pointers, through message passing. This congruency keeps message passing at the core of Erlang’s semantic focus.

Immutability makes Garbage Collection simpler, faster, and soft real-time.

The generational [Erlang] collector is simpler than in some languages, because there’s no way to have an older generation pointing to data in a younger generation (remember, you can’t destructively modify a list or tuple in Erlang). — James Hague Again, more restrictions means different ways of efficiency gains.

Downsides to Erlang’s Non-destructive Memory Model

Many common algorithms are designed with destructive memory in mind. As a result, Erlang doesn’t have some of the same types of data structures and libraries as imperative languages. This, combined with the confusion of learning a completely new programming paradigm, can be a large deterrent for learning functional languages. But what libraries Erlang lacks is largely made up for by an extensive actor based library platform.

Virtually all “big” Erlang libraries use Erlang’s features concurrency and fault tolerance. In the Erlang ecosystem, you can get web servers, database connection pools, XMPP servers, database servers, all of which use Erlang’s lightweight concurrency, fault tolerance, etc.

— Yariv Sadan

Erlang is clearly a domain specific language, focusing on problems that can be solved with parallelism. Erlang’s libraries favor this same domain and Erlang holds its own against even C in the targeted many-thread/process arena. Note: the “thread-ring” test on the last line of the chart:

An Erlang web server was also compared to Apache as another more practical comparison (again, intentionally a problem in Erlang’s domain). “Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections.” So, Erlang may be hard to use because of its lack of conventional libraries or its different syntax, but as I’ve hopefully shown, the differences were chosen by design with good reason. Erlang is clearly one of the strongest languages in its parallelism domain and after a little over 20 years, is a very mature language.

None Found