Late one night on an uncrowded subway car in New York City, I had my laptop open, working on a game whose deadline was drawing near. A gentleman sat next to me and, seeing the walls of colored text on my screen, asked if I was writing C++. I told him I wasn’t, and he was curious to hear what language I was using. I was working on a web game in a programming language I had designed for myself, and I told him so: it was something that I made up, I said. After looking at me for a moment, he asked, “Why would anyone do that?” I started to answer, but alas, we had arrived at his stop, and he disappeared onto the platform before I could explain myself.

In many ways, I’ve been trying to answer that man’s question for years now. The thing is, I absolutely love programming languages. I work as a graphics and video game developer, which is thrilling and challenging work, but secretly I would rather be hacking on compilers. I love languages because, of everything I’ve encountered in computing, languages are by far the weirdest. They combine the brain-bending rigor of abstract math, the crushing pressures of capitalistic industry, and the irrational anxiety of a high school prom. The decision to adopt or avoid a language is always a mix of its perceived formal power (“Does this language even have this particular feature?”), employability (“Will this language get me a job?”), and popularity (“Does anyone important use this language anymore?”). I can’t think of another engineering tool that demands similar quasi-religious devotion from its users. Programming languages ask us to reshape our minds, and that makes them deeply personal and subjective.

The field of study of programming languages is called programming language theory, or PLT. Software engineers are confronted with programming languages just about every day, but few develop a deep relationship with PLT.
Languages are primarily tools, a means to an end, and most professionals will do fine just learning to use the popular ones well enough to get their jobs done. Diving deeper into PLT, though, is a great way to grow as a developer. Not only is language design a lot of fun, but a deeper understanding of the tools you use every day will give you a better handle on them and can make learning new languages considerably easier, even if you don’t dream of becoming the next Guido van Rossum or Rich Hickey (the original authors, and Benevolent Dictators for Life, of Python and Clojure, respectively). And hey, you never know: your personal project could become the next major piece of software engineering infrastructure. It’s happened before.

What is a programming language?

So, what is a programming language? This might seem like an odd question to ask about tools this ubiquitous, but starting from a definition often helps focus the conversation. A programming language is a formal language used to communicate instructions to a computer. It is formal in that it conforms to a rigid set of rules that determine what is and is not allowed. It is a means of communication in that the primary goal of the tool is to translate ideas in a programmer’s head into a form that a computer can act on.

The fact that you are communicating with a computer is significant. Unlike with other forms of language, or even instructional arts like musical composition or screenwriting, the final agent fulfilling the instructions is not human. The result is that qualities that other forms of communication tend to depend on, like intuition, common sense, and context, are not available.

The decisive factor in what makes something a programming language (or not) is known as Turing completeness. Alan Turing’s seminal work in the 1930s included the definition of the Turing machine, a mathematical description of an abstract computer that became foundational for our understanding of how algorithms work. A Turing machine can, provably, implement any computable algorithm, and any system that can simulate the Turing machine can do so as well. Such a system is deemed Turing complete, and most programming languages have this status as a basic goal (though there are some interesting languages that do not). A deep dive into computability theory is beyond the scope of this article, but suffice it to say that a language with some notion of state (often variables or argument passing), conditionals, and some way to repeat work (loops or recursion) is most likely Turing complete.
This leaves out markup languages like HTML and configuration languages like YAML or JSON, but includes a hilarious collection of systems that are accidentally Turing complete (including an abuse of HTML and CSS).

In practice, you interact with programming languages via computer programs or software libraries into which you feed code in order to produce an effect. These implementations come in two broad forms: compilers and interpreters. Each approach has its advantages and disadvantages, and the line between the two can be quite blurry, with frameworks like Mono going so far as to offer both simultaneously.

An interpreter’s job is to take source code and immediately carry out its effects. An interpreter turns source code into an internal representation that it can use to perform the computation the source code describes. This representation will include the functions, variables, expressions, statements, and all other semantics of the source language. You can think of source code as an extreme, Turing-complete configuration file that controls the interpreter’s behavior. My first foray into language design was based on Peter Norvig’s excellent Lispy interpreter in Python, and the more recent Make a Lisp (MAL) project, a Lisp designed to teach language design and implementation, offers guides to implementing its simple language and has amassed implementations in 72 languages. The advantages of interpreters include their simplicity, the fact that they can often start executing faster than compilers, and their ability to run in environments where compiling new code is prohibited (like on iOS or most video game consoles).

This piece, however, will focus on compilers. The job of a compiler is to take source code and translate it into target code with the same meaning. Often that target code is in a lower-level language like machine code, but that isn’t always the case.
The generated target code can then be evaluated in order to carry out the computation of the original source code. Compilers can be thought of as a pipeline of transformations, starting with the programmer’s source code and proceeding through a series of internal representations that end in the desired target code, which is then handed off to another system for evaluation.

The classic example is a compiler for the C programming language, where source code written in C is compiled into machine code that a computer’s hardware can execute directly. In this case, a higher-level language is compiled into a lower-level one. C# and Java are similar, but they compile into bytecodes that are executed by the Common Language Runtime (CLR) and the Java virtual machine (JVM), respectively, as opposed to physical hardware. Bytecode resembles machine code, but it is designed to be executed by a virtual machine rather than by physical hardware. As a result, bytecode can be higher-level (i.e., it can represent constructs that hardware does not directly support) and more portable (i.e., the same bytecode can run on different machine architectures). Virtual machines like the CLR and the JVM provide cross-platform environments that handle a lot of low-level details for you while providing additional functionality like garbage collection and a type system.

There are even cases where it is desirable to compile a lower-level language into a higher-level one. The JSIL project compiles the bytecode produced from C# into JavaScript so it can run in the browser, and Emscripten does the same for C and C++. There are also situations where the same language is both the source and target language. The so-called transpilers Babel and Closure compile JavaScript into JavaScript in order to access new features of the language and implement optimizations, respectively.
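
To make the distinction between interpreters and compilers a little more concrete, here is a minimal sketch in Python of a toy arithmetic language handled both ways. It is an illustration under stated assumptions, not how any real language is implemented: the function names (tokenize, parse, interpret, compile_expr, run_vm) and the two-instruction bytecode (PUSH and APPLY) are made up for this example. The interpreter walks the internal representation and carries out the computation immediately, while the compiler translates the same representation into target code that a separate, very small virtual machine evaluates.

```python
import operator
import re

# Hypothetical toy language: integers, + - * /, and parentheses.
# "/" is integer (floor) division to keep the example small.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}

def tokenize(source):
    """Split source text into number, operator, and parenthesis tokens."""
    return re.findall(r"\d+|[-+*/()]", source)

def parse(tokens):
    """Build a nested-tuple syntax tree, honoring the usual precedence."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def atom():
        token = advance()
        if token == "(":
            node = expr()
            advance()  # consume the closing ")"
            return node
        return ("num", int(token))

    def term():
        node = atom()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, atom())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    return expr()

def interpret(node):
    """Interpreter: walk the tree and compute the result immediately."""
    if node[0] == "num":
        return node[1]
    op, left, right = node
    return OPS[op](interpret(left), interpret(right))

def compile_expr(node):
    """Compiler: translate the tree into (made-up) stack-machine bytecode."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    op, left, right = node
    return compile_expr(left) + compile_expr(right) + [("APPLY", op)]

def run_vm(bytecode):
    """A tiny virtual machine that evaluates the compiled target code."""
    stack = []
    for instruction, arg in bytecode:
        if instruction == "PUSH":
            stack.append(arg)
        else:  # "APPLY": pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

tree = parse(tokenize("2 + 3 * (4 - 1)"))
print(interpret(tree))             # interpreted directly: 11
print(run_vm(compile_expr(tree)))  # compiled, then run on the VM: 11
```

Both paths share the same front end (tokenize and parse) and differ only in what they do with the tree, which is one reason the line between the two approaches can blur. The sketch does no error handling, so malformed input such as an unbalanced parenthesis will fail with an unhelpful exception.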