The recent Emerging Languages Camp at OSCON (Coverage of Day 1 and Day 2) featured a long list of talks about new languages at various stages of development, from experiments that were only a few weeks old to more mature languages like Google Go or Newspeak and veritable old-timers like D.

JRuby's Charles Nutter presented the language Mirah, which will appeal to everyone who likes Ruby syntax.



InfoQ caught up with Charles Nutter to see what differentiates Mirah from plain old JRuby and other JVM based languages.



InfoQ: So Mirah, formerly known as Duby, who is it for?

At present, it seems the best way to describe Mirah is as "a nicer way to write Java." So I'd say the primary target right now is Java developers that want a replacement for javac with a nicer syntax and a few additional features.

InfoQ: Is Mirah a language that every programmer can/should switch to or is it aimed at certain areas?

The goal is to create a language that can do everything Java can do, a few things that Ruby can do, and still be as lightweight as possible (i.e. no runtime library requirements). So anywhere you'd use Java, you should be able to use Mirah. It turns out that the Java libraries and most of the Java type system aren't really that cumbersome if you put a little sugar around them. For me, the best sugar available comes from Ruby syntax and some of Ruby's "apparent" language features. We start with Ruby as the syntax and Java or JVM bytecode as the result, and we'll see how far we can go with that.

InfoQ: How do I use Mirah and Mirah code? Is there a compiler or an interpreter?

Mirah is a statically-typed, compiled language, but it runs well as a script too. You write what mostly looks like "Ruby with a few type annotations", and then either use the "mirah" command to run it as a script or the "mirahc" command to compile it to JVM bytecode or Java source. Both commands also have a "-e" flag for doing quick command-line scripts.

InfoQ: Once the code's compiled to JVM bytecodes, how much of a runtime is there to lug around?

No language feature (other than support for dynamic invocation) imposes any library dependencies on you other than the classes you directly reference yourself. It's a primary design goal of the language to avoid, for as long as possible, any language-specific runtime. We may not be able to do that indefinitely, but it's a good thing to aim for. All the other JVM languages immediately require you to ship their runtime once you've written a single line of code. With Mirah you take source in and get executable code out, and that's what you ship. No nonsense.

InfoQ: Do the Mirah binaries lend themselves to being translated to Dex code for Android?

Certainly! The Android SDK takes in any JVM bytecode and converts it to Dalvik bytecode, so you can either compile Mirah to JVM bytecode or compile it to Java source and use javac to compile that. Once you've compiled Mirah code, it's basically indistinguishable (to JVMs and JVM-related tools) from what javac produces.

InfoQ: Could Mirah become something like Squeak's Slang for JRuby, i.e. a restricted subset of a dynamic language that's easier to compile to fast code?

That's one of the original justifications for making Mirah, and it may happen at some point. Having JRuby extensions (or JRuby itself) written in Mirah could make it more approachable to developers for whom Java syntax is undesirable. At the moment, however, we're just focusing on stabilizing Mirah itself and building out missing features.

InfoQ: Is Mirah written in Mirah (or will that happen at some point)?

There are small parts of Mirah written in Mirah now, like the tooling for the Ant task. We'd like to move toward being self-hosted, but it's not currently a primary goal. Mirah's codebase is currently almost all written in Ruby, which turns out to be a really nice language (and runtime) for building a compiler. Self-hosting might gain us bragging rights, but unless there's a compelling improvement over having the toolchain in Ruby it probably won't happen soon.

Of course, if JRuby were rewritten in Mirah, then we'd be self-hosting in a way: Mirah would bootstrap JRuby, which would bootstrap Mirah.

InfoQ: What's the delta between Mirah and Ruby? What was added to or removed from Ruby to make Mirah, from a language/grammar point of view?

At first we just focused on getting the basic structure of files to compile: classes, methods, instance variables, literals, imports. As we've gone forward, we've added Java-specific features like interfaces and Ruby-specific features like internal iteration (compiled like Java 5's "for" loops) and closures (compiled like Java's anonymous inner classes). We'll continue to add features from both languages and probably start borrowing some others like implicit conversions (from Scala) or explicitly immutable classes (from Clojure or Seph). It will be interesting to see how much we can do with just a compiler and no runtime library.

InfoQ: There was some work on other Mirah (Duby) backends, e.g. for .NET.

Yes, Jimmy [Schementi] did a prototype of a Duby backend that could output C# source. I believe it worked well enough to do simple flow control, basic math, and basic literals. I'd love to see that work start again.

InfoQ: Are there other backends? (Native or LLVM; Rubinius)?

I experimented with a C backend, but the nature of Ruby syntax means the target backend should probably be object-oriented, garbage collected, and structurally similar to Ruby or Java. A Go backend might work, for example, or an ooc backend.

Mirah could also target dynamic language runtimes like Ruby itself, ultimately bringing static type-safety at compile time to those backends. I have not considered what that might look like in practice or whether the dynamic nature of those systems (e.g. runtime mutable types) would defeat attempts to build a static language atop them.

InfoQ: How is Mirah tied to the JVM, or if it isn't, how do you keep it independent of the JVM? What types, for instance, do developers use? E.g. there is 'fixnum' as a type annotation; how is that defined?

The type names used are largely defined by the inference and code-generation phases, and can map to whatever is appropriate for the target backend. In early Mirah code, "fixnum" was used simply as an alias for "int" or "long". These days, Mirah code targeting the JVM generally just uses "int" or "long" as the types directly, and all other types are just JVM primitives ("float", "double", etc), regular JDK classes, or third-party libraries. It's not expected that you'd take a Mirah program unmodified and run it on a different backend; that's not a goal of the language. But if you know Mirah for one backend, you will have an easier time writing it on another backend, since the apparent features of the language are still the same.

InfoQ: What's the community, who are your collaborators?

Right now most of the compiler work is being done by Ryan Brown from Google, and I've been contributing when I can. We have a few other folks submitting patches and adding features, but it's a fairly young project today.

InfoQ: Are there any projects, articles, etc by collaborators that you'd like to point out?

Phil Hagelberg has a "playground for Android development" called Garrett: http://github.com/technomancy/Garrett



Another contributor who goes by "consiliens" has been working on patching Mirah to support generating GWT applications. This one is particularly interesting since GWT only processes Java *source*, so there's basically no way to build GWT applications with most other JVM languages.



John Woodell, also from Google, has been putting together the basics of a web framework called Dubious: http://github.com/mirah/dubious. He's taking a similar design approach to Mirah, starting with the structure of a "Rails-like" application and filling in just the blanks necessary to make that compile as Mirah code. Dubious is an example of building a language around your application or around a library, rather than forcing your application or library to conform to the language. An easily-understandable compiler and a flexible syntax make that possible in Mirah.

InfoQ: What do numbers look like in Mirah? Is there a numerical tower, are there fixed size values that can overflow, is there boxing etc?

When targeting the JVM, the numerical tower is just that of the JVM. You have primitive numeric types and their boxed equivalents, and overflow behaves just like when writing Java.
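Editor's sketch: since the answer above says numeric behavior is simply that of the JVM, the following Java snippet illustrates what that implies for overflow and boxing. It is not Mirah code; the class and method names are made up for illustration, and only standard JDK calls are used.

```java
public class NumericDemo {
    // Adding 1 to an int wraps around silently on overflow, as in any JVM code.
    static int wrappingIncrement(int value) {
        return value + 1;
    }

    // Math.addExact (Java 8+) is the JDK's opt-in way to detect overflow instead.
    static boolean overflowsOnIncrement(int value) {
        try {
            Math.addExact(value, 1);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Boxing wraps a primitive in an object; unboxing extracts it again.
    static int roundTripThroughBox(int value) {
        Integer boxed = Integer.valueOf(value);
        return boxed.intValue();
    }

    public static void main(String[] args) {
        System.out.println(wrappingIncrement(Integer.MAX_VALUE));    // -2147483648
        System.out.println(overflowsOnIncrement(Integer.MAX_VALUE)); // true
        System.out.println(roundTripThroughBox(42));                 // 42
    }
}
```

Code that needs arbitrary precision would presumably reach for regular JDK classes such as java.math.BigInteger, just as Java code does.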

InfoQ: Would JVM support for tagged numbers/immediate types help?

It would help us in the same way that it would help Java, in that boxed numbers would cost considerably less to construct and use. Outside that, there's no pressing need for tagged numbers, since Mirah supports Java's primitive types as well.

InfoQ: What are the metaprogramming features?

Metaprogramming comes primarily in the form of compile-time macros, which are how many of Mirah's apparent features are implemented. Closures, for example, are translated by the compiler into their equivalent anonymous inner class form. Iteration over a java.util.Collection type is translated into external iteration using java.util.Iterator. By writing the compiler in Ruby, it becomes very easy to create macros that perform these translations, allowing you to design the language to suit your needs.
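Editor's sketch: the two translations described above can be pictured in plain Java roughly as follows. This is a hand-written approximation, not actual compiler output; the Block interface and the method names are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

public class DesugarDemo {
    // Hypothetical single-method interface standing in for a closure's target type.
    interface Block {
        void call(String item);
    }

    // Internal iteration ("for each item, run the block") rewritten as
    // external iteration driven by java.util.Iterator.
    static void each(Collection<String> items, Block block) {
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            block.call(it.next());
        }
    }

    static List<String> upcaseAll(Collection<String> items) {
        final List<String> result = new ArrayList<String>();
        // The closure body becomes an anonymous inner class that captures 'result'.
        each(items, new Block() {
            public void call(String item) {
                result.add(item.toUpperCase());
            }
        });
        return result;
    }

    public static void main(String[] args) {
        System.out.println(upcaseAll(Arrays.asList("duby", "mirah"))); // [DUBY, MIRAH]
    }
}
```

Nothing here depends on a language-specific runtime: only the JDK and the classes defined in the file itself are referenced.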



I would also like to add two key features to make Mirah feel a bit more like Ruby: open classes, which would be compiled like extension methods in C#; and implicit type conversions, which would behave much like they do in Scala. These two simple features can give a statically-typed language a much more dynamic feel, and both can be supported without shipping a runtime library.
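Editor's sketch: one way to picture how both proposed features could compile without a runtime library is as ordinary static methods, with the receiver or the converted value passed explicitly. Everything below is an assumption for illustration (the names shout and intToString are hypothetical), not a description of how Mirah implements or will implement these features.

```java
public class StaticSugarDemo {
    // Hypothetical method "added" to String by an open class:
    // the receiver becomes an explicit first parameter, as with C# extension methods.
    static String shout(String self) {
        return self.toUpperCase() + "!";
    }

    // Hypothetical conversion that a compiler could insert wherever
    // a String is required but an int is supplied.
    static String intToString(int value) {
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        // Source might read like a method call on the string itself.
        System.out.println(shout("mirah"));         // MIRAH!
        // Source might call shout on an int, with the conversion applied implicitly.
        System.out.println(shout(intToString(42))); // 42!
    }
}
```

Because both rewrites resolve to static calls at compile time, the generated code would carry no dependency beyond the helper classes themselves.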



The source code for Mirah is available at GitHub.