Java.next

Choosing your next JVM language


I'm a failed poet. Maybe every novelist wants to write poetry first, finds he can't and then tries the short story which is the most demanding form after poetry. And failing at that, only then does he take up novel writing.

—William Faulkner, Nobel Prize-winning author

Not all writers — not even the great ones — are meant to write in every medium. Similarly, some programming languages feel more natural than others to a programmer. Some developers are born C programmers, some have an innate connection to Lisp, and others swear by Perl. The fact that no one language matches every developer's predilections helps to explain why so many computer languages exist. The implication for Java.next is that no single language will dominate the landscape, because there's no single perfect language for everyone.

The Java language might seem like a counterexample. But Java's dominance emerged from a unique set of circumstances, which Bruce Tate famously described as a "perfect storm" in his book Beyond Java. When Java was released in the mid-1990s, it faced an uphill battle for adoption. It was slower than the compiled languages popular at the time. It was memory-hungry (at a time when memory prices took a temporary upward spike). And it wasn't particularly suited to the then-predominant client/server style of development. Java's only two saving graces were its relative ease of use (through facilities such as garbage collection) and applets, which were unique at the time. If the landscape had remained static, Java likely would not have survived.

About this series: The Java™ legacy will be the platform, not the language. More than 200 languages run on the JVM, and it's inevitable that one of them will eventually supplant the Java language as the best way to program the JVM. This series explores three next-generation JVM languages: Groovy, Scala, and Clojure, comparing and contrasting new capabilities and paradigms, to provide Java developers a glimpse into their own near future.

But Java and the then-new World Wide Web were an excellent match, especially when the Servlet APIs became popular. Suddenly, the server-side deployment model mitigated many of Java's disadvantages. This combination of factors (hardware, web, paradigm) was Tate's perfect storm: Developers needed new tools to program the web, and server-side Java alleviated memory constraints and presented a simplified model for building robust web applications. By being poised at the right place at the right time, with a major company (Sun) backing it, Java became the dominant force in the software industry.

It's unlikely that a series of circumstances will line up so neatly for another language. We have entered into a polyglot world of computer languages, a trend that continues to grow. Attempts to identify the next language with the same impact as Java are doomed to fail. In your investigation of a Java.next language to adopt, focus on the aspects that resonate with you rather than looking for overwhelming popularity.

Multiparadigm languages

Many modern languages support several programming paradigms: object orientation, metaprogramming, functional, procedural, and others. Of the Java.next languages, both Groovy and Scala are multiparadigm. Groovy is an object-oriented language with functional extensions via libraries. Scala is a hybrid object-oriented and functional language, with emphasis on functional programming preferences such as immutability and laziness.

Multiparadigm languages offer immense power, giving you the ability to mix and match paradigms to closely match problems. Many developers chafe at the limitations in Java prior to Version 8. A language like Groovy provides many more facilities, including metaprogramming and functional constructs.

Although they're powerful, multiparadigm languages require more developer discipline on large projects. Because the language supports many different abstractions and philosophies, isolated groups of developers can create starkly different variants in libraries. For example, code reuse tends toward structure in the object-oriented world and toward composition and higher-order functions in the functional world. When designing your company's Customer API, you must decide on the best style and make sure everyone on the team agrees to it (and adheres to it). Many developers who moved from Java to Ruby encountered this problem, because Ruby is a multiparadigm language. C++, another multiparadigm language, caused suffering for many projects that awkwardly (and often inadvertently) tried to span the procedural and object-oriented paradigms.
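To make the contrast between the two reuse styles concrete, here is a minimal sketch in Scala. The names (ReuseStyles, Report, SalesReport, renderWith) are hypothetical and exist only for illustration; they don't come from any library discussed in this series.

```scala
object ReuseStyles {
  // Object-oriented reuse: the shared behavior lives in a base class,
  // and subclasses supply the part that varies.
  abstract class Report {
    def body: String
    def render: String = "== " + body + " =="
  }

  class SalesReport extends Report {
    def body: String = "sales"
  }

  // Functional reuse: the shared behavior is a function that receives
  // the varying part as a parameter (a higher-order function).
  def renderWith(body: () => String): String = "== " + body() + " =="

  def main(args: Array[String]): Unit = {
    println(new SalesReport().render)
    println(renderWith(() => "sales"))
  }
}
```

Both calls produce the same string, but a code base that mixes the two approaches freely leaves readers with two unrelated ways to find where behavior is defined, which is the discipline problem described above.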

One solution is to rely on engineering discipline to ensure that all developers on a project are working toward the same goal. Many developers are wary of using metaprogramming to make pervasive modifications to core classes. For example, some testing libraries add methods to Object to allow broad-scoped assertions. Unit testing enables pinpoint understanding of complex extensions, mitigating the fear of unknown side effects.

Some languages, including Clojure, impose more discipline by primarily embracing one paradigm while pragmatically supporting others. Clojure is firmly a functional Lisp for the JVM. You can interact with classes and methods from the underlying platform (and create your own if you like), but Clojure's primary support is for strongly functional paradigms such as immutability and laziness.

Killer feature: Functional programming

The embrace of functional programming is the most important future-language characteristic for most developers. I covered the functional aspects of the Java.next languages in several installments. The key to the efficacy of the functional paradigm lies in the ability to express ideas at a higher level of abstraction.

In the "Memoization and composition" installment, I converted the imperative indexOfAny() method (from the Apache Commons StringUtils library) into Clojure, yielding a shorter and simpler yet more general function. Clojure is eminently readable to the initiated, but it looks odd to non-Lisp developers. Scala is designed to be more readable to Java developers. The same indexOfAny() method, cast into Scala rather than Clojure, is shown in Listing 1.

Listing 1. A Scala indexOfAny() implementation

def indexOfAny(input : Seq[Char], searchChars : Seq[Char]) : Option[Int] = {
  def indexedInput = (0 until input.length).zip(input)
  val result = for (char <- searchChars;
                    pair <- indexedInput;
                    if (char == pair._2)) yield (pair._1)
  if (result.isEmpty)
    None
  else
    Some(result.head)
}

The purpose of the indexOfAny() method is to return the index position, within the first parameter, of any of the characters passed in the second parameter. In Listing 1, I first generate indexedInput by building a sequential range of numbers based on the length of the input string. Then I use Scala's built-in zip() function, which "zips" the two sequences together into pairs. For example, if I have the input string zabycdxx, the results in indexedInput look like Vector((0,z), (1,a), (2,b), (3,y), (4,c), (5,d), (6,x), (7,x)).

After I have the indexedInput collection, I use a for comprehension to replace the nested loops in the original version. First, I iterate over searchChars; for each of these characters, I check whether it matches the second part of each pair in indexedInput (using the Scala shorthand pair._2) and, if it does, return the index portion of the pair (pair._1). The yield keyword generates the values for the resulting list.
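It can help to see what a for comprehension stands for. The following sketch places the comprehension from Listing 1 next to a hand-written equivalent that uses flatMap, withFilter, and map, which is roughly the translation the Scala compiler performs. The object and method names (ForDesugaring, viaFor, viaCombinators) are hypothetical, chosen only for this illustration.

```scala
object ForDesugaring {
  // The comprehension from Listing 1, returning every match.
  def viaFor(input: Seq[Char], searchChars: Seq[Char]): Seq[Int] = {
    val indexedInput = (0 until input.length).zip(input)
    for (char <- searchChars; pair <- indexedInput; if char == pair._2) yield pair._1
  }

  // A hand-written equivalent: flatMap for the outer generator,
  // withFilter for the guard, and map for the yield expression.
  def viaCombinators(input: Seq[Char], searchChars: Seq[Char]): Seq[Int] = {
    val indexedInput = (0 until input.length).zip(input)
    searchChars.flatMap { char =>
      indexedInput.withFilter(pair => char == pair._2).map(pair => pair._1)
    }
  }

  def main(args: Array[String]): Unit = {
    println(viaFor("zabycdxx", "by"))
    println(viaCombinators("zabycdxx", "by"))
  }
}
```

For the input zabycdxx and the search characters by, both versions yield the indexes 2 and 3.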

In Scala, it's typical to return an Option rather than a possible null, so I return either None if no results exist or Some otherwise. The original indexOfAny() method returns only the index of the first matched character, so I return only the first element in the result (result.head). In the Clojure version, I return a list of all the matches. It is easy to convert the Scala version to do the same, as shown in Listing 2.

Listing 2. indexOfAny returning all matches

def lazyIndexOfAny(input : Seq[Char], searchChars : Seq[Char]) : Seq[Int] = {
  def indexedInput = (0 until input.length).zip(input)
  for (char <- searchChars;
       pair <- indexedInput;
       if (char == pair._2)) yield (pair._1)
}

In Listing 2, the return is a list of matches rather than only the first match. For example, the result of lazyIndexOfAny("zzabyycdxx", "by") is Vector(3, 4, 5), which gives the index within the input string of each occurrence of the target characters.
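The two listings are closely related: the first-match version can be derived from the all-matches version. The sketch below shows one way to do that with the standard headOption method, along with a caller that handles both outcomes through pattern matching. The names FirstMatch, allIndexesOfAny, firstIndexOfAny, and describe are hypothetical and used only for illustration.

```scala
object FirstMatch {
  // The logic of Listing 2, reproduced so that this sketch is self-contained.
  def allIndexesOfAny(input: Seq[Char], searchChars: Seq[Char]): Seq[Int] = {
    val indexedInput = (0 until input.length).zip(input)
    for (char <- searchChars; pair <- indexedInput; if char == pair._2) yield pair._1
  }

  // headOption replaces the explicit isEmpty / None / Some branching.
  def firstIndexOfAny(input: Seq[Char], searchChars: Seq[Char]): Option[Int] =
    allIndexesOfAny(input, searchChars).headOption

  // Callers must deal with both cases; there is no null or -1 to forget about.
  def describe(result: Option[Int]): String = result match {
    case Some(index) => s"found at index $index"
    case None        => "no match"
  }

  def main(args: Array[String]): Unit = {
    println(describe(firstIndexOfAny("zzabyycdxx", "by")))
    println(describe(firstIndexOfAny("abc", "xyz")))
  }
}
```

One detail to keep in mind: because the comprehension iterates over searchChars first, matches are ordered by search character rather than by position in the input. The first element is therefore the first occurrence of the earliest search character that matches, which is not necessarily the lowest index in the input.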

Functional programming languages give you the ability to work at a higher level of abstraction using more-powerful building blocks, such as map() in preference to loops. When you're freed from focusing on low-level code details, you can focus more clearly on more-relevant problems.
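As a small illustration of that shift in abstraction level, the following sketch solves the same task twice: once with an explicit loop and a mutable accumulator, and once with filter() and map(). The task and the names (LoopVersusMap, upperCaseLongWordsLoop, upperCaseLongWords) are made up for this example.

```scala
import scala.collection.mutable.ListBuffer

object LoopVersusMap {
  // Imperative style: manage the accumulator and the iteration by hand.
  def upperCaseLongWordsLoop(words: List[String]): List[String] = {
    val result = ListBuffer[String]()
    for (word <- words) {
      if (word.length > 3) result += word.toUpperCase
    }
    result.toList
  }

  // Functional style: filter and map state what should happen to the
  // collection and leave the iteration to the library.
  def upperCaseLongWords(words: List[String]): List[String] =
    words.filter(_.length > 3).map(_.toUpperCase)

  def main(args: Array[String]): Unit = {
    val words = List("map", "filter", "fold", "zip")
    println(upperCaseLongWordsLoop(words))
    println(upperCaseLongWords(words))
  }
}
```

Both versions return the same list; the second simply has no bookkeeping code left to get wrong.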

The functional pyramid

Computer language types generally exist along two axes, pitting strong versus weak and dynamic versus static, as shown in Figure 1.

Figure 1. Language typing characteristics

Strongly typed variables "know" their type, enabling reflection and instance checks, and they retain that knowledge. Weakly typed languages have less sense of what they point to. For example, C is a statically, weakly typed language: Variables in C are really a collection of bits that can be interpreted in various ways, to the joy and horror (sometimes simultaneously) of C developers everywhere.

Java is strongly, statically typed. You must specify variable types, sometimes repeatedly, when declaring variables. Java has gradually introduced type inference, but it doesn't go nearly as far as any of the Java.next languages in conciseness with respect to types. Scala, C#, and F# are also strongly, statically typed, but they manage with much less verbosity by using type inference. Many times, the language can discern the appropriate type, which reduces redundancy.
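A short Scala sketch shows what this looks like in practice. None of the values on the right-hand sides carries a type annotation, yet every one of them is statically typed; the inferred types appear in the comments. The object name InferenceSketch and its values are hypothetical.

```scala
object InferenceSketch {
  // The compiler infers each type from the expression on the right.
  val languages = List("Groovy", "Scala", "Clojure")  // List[String]
  val lengths = languages.map(_.length)               // List[Int]
  val byLength = languages.groupBy(_.length)          // Map[Int, List[String]]

  // Annotations remain available where they help as documentation.
  val longest: String = languages.maxBy(_.length)

  def main(args: Array[String]): Unit = {
    println(lengths)
    println(longest)
  }
}
```

Because the types are still checked at compile time, assigning lengths to a List[String] would be rejected by the compiler even though no type was ever written down for it.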

These distinctions have existed since the early eras of programming languages. However, a new aspect has entered into the equation: functional programming.

As I illustrated in "Functional coding styles," functional programming languages have a different design philosophy from imperative ones. Imperative languages try to make mutating state easier and include many features for that purpose. Functional languages try to minimize mutable state and build more general-purpose machinery. But functional doesn't dictate a typing system, as you can see in Figure 2.

Figure 2. Functional programming languages

Functional programming languages rely (and sometimes insist) on immutability. The key differentiator among languages now isn't dynamic versus static; it's imperative versus functional, with interesting implications for the way we build software.

In my blog back in 2006, I accidentally repopularized the term polyglot programming and gave it a new meaning: taking advantage of modern runtimes to create applications that mix and match languages but not platforms. This redefinition was based on the realization that the Java and .NET platforms support more than 200 languages between them, with the added suspicion that there is no "one true language" that can solve every problem. With modern managed runtimes, you can freely mix and match languages at the bytecode level, using the best one for a particular job.

After I published my blog article, my colleague Ola Bini published a follow-on post discussing his Polyglot Pyramid. The pyramid, shown in Figure 3, suggests the way people might structure applications in the polyglot world.

Figure 3. Bini's pyramid

In Bini's inverted pyramid, he suggests using more-static languages at the bottommost layers, where reliability is the highest priority. Next, he suggests using more-dynamic languages for the application layers, using simpler syntax for building things like user interfaces. Finally, atop the heap are domain-specific languages (DSLs), built by developers to succinctly encapsulate important domain knowledge and workflow. Typically, DSLs are implemented in dynamic languages to leverage some of their capabilities in this regard. Bini's goal was to put more certainty at the bedrock levels and more flexibility near the top.

Bini's pyramid was a tremendous insight added to my original post. But the landscape has changed in the intervening years. I now believe that typing is more a developer preference, distracting from the important characteristic: functional versus imperative. My new functional pyramid appears in Figure 4.

Figure 4. The functional pyramid

The resiliency we crave comes not from static typing but from embracing functional concepts at the bottom. If all of your core APIs for heavy lifting — such as data access and integration — could assume immutability, all that code would be much simpler. Of course, immutability everywhere changes the way we build databases and other infrastructure, but the result is better stability at the core.

Atop the functional core, use imperative languages to handle workflow, business rules, UIs, and other parts of the system where developer ease is a priority. As in the original pyramid, DSLs sit on top, serving the same purpose. However, I also believe that DSLs will penetrate through all the layers of our systems, all the way to the bottom. This is exemplified by the ease with which you can write DSLs in languages like Scala (functional, statically strongly typed) and Clojure (functional, dynamically strongly typed) to capture important concepts in concise ways.
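To give a sense of how little machinery a small internal DSL needs in Scala, here is a minimal sketch that lets code read like 2.minutes + 30.seconds. It relies on an implicit class to add methods to Int. The names DurationDsl, Span, and SpanBuilder are hypothetical, and this is only one of several techniques Scala offers for building DSLs.

```scala
object DurationDsl {
  // The domain concept: an immutable span of time measured in seconds.
  final case class Span(seconds: Int) {
    def +(other: Span): Span = Span(seconds + other.seconds)
  }

  // The implicit class makes the minutes and seconds methods available on Int.
  implicit class SpanBuilder(n: Int) {
    def minutes: Span = Span(n * 60)
    def seconds: Span = Span(n)
  }

  // Reads close to the way the domain expert would say it.
  val timeout: Span = 2.minutes + 30.seconds

  def main(args: Array[String]): Unit = {
    println(timeout)
  }
}
```

Here timeout evaluates to Span(150). The expression stays statically typed throughout, so a mistake such as adding a Span to a plain Int is caught at compile time.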

Building applications that adhere to this pyramid represents a big change, but the implications are fascinating. To get a glimpse of the possibilities, check out the architecture of Datomic (a commercial product). Datomic is a functional database that keeps a full-fidelity history of every change. An update doesn't destroy data; it creates a new version of the database. You can roll the database back in time to see snapshots from the past. Because you always have history, practices such as continuous software delivery, which relies on the ability to roll your database backward and forward in time, become trivial. Testing multiple versions of your application is just as easy, because you can directly synchronize schema and code changes. Datomic is built with Clojure, using functional constructs at the architectural level. And the top-of-stack implications are amazing.
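The idea that an update creates a new version rather than destroying data can be sketched in a few lines of Scala with immutable collections. To be clear, this is a toy model and not Datomic's API or architecture; VersionedStoreSketch, VersionedStore, updated, and asOf are hypothetical names invented for this illustration.

```scala
object VersionedStoreSketch {
  // Each version is an immutable Map; history keeps every version, newest last.
  final case class VersionedStore(history: Vector[Map[String, String]]) {
    def current: Map[String, String] = history.last

    // An "update" appends a new version instead of overwriting the old one.
    def updated(key: String, value: String): VersionedStore =
      VersionedStore(history :+ (current + (key -> value)))

    // Read the store as it was at an earlier version (0 is the empty store).
    def asOf(version: Int): Map[String, String] = history(version)
  }

  val empty: VersionedStore = VersionedStore(Vector(Map.empty[String, String]))

  def main(args: Array[String]): Unit = {
    val store = empty.updated("name", "Ada").updated("name", "Grace")
    println(store.current("name"))  // the latest value
    println(store.asOf(1)("name"))  // the value after the first update
  }
}
```

After the two updates, the current version maps name to Grace, while version 1 still maps it to Ada and version 0 is still empty. Nothing was overwritten, so looking back in time is just a lookup.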

Conclusion

This is the final installment of Java.next. I hope this series has sparked your interest in taking a deeper dive into the languages and concepts that I touched upon in the previous 15 installments. The programming-language terrain has changed since I began the series 18 months ago. Java 8 is a strong contender in the Java.next arena, finally adding the core functional-programming elements that will dominate the language landscape over the next few years. All four Java.next languages (Groovy, Scala, Clojure, and Java 8) have strong communities and growing user bases, with a constant stream of innovations. The landscape looks bright for JVM languages, no matter which contender (or combination of contenders) you choose.
