The opinions in this article are my own, and do not necessarily reflect those of Oracle. Additionally, all information in this post is either publicly available, or a recollection of first-hand experiences leading up to the release of what are now publicly available products.

I'd like to share a little secret with you about the Oracle Cloud Infrastructure web management console (referred to hereafter as "the Console"). It's technically an open secret, I suppose. It's not something we're hiding, and it's easily discoverable; there's usually just no good reason to mention it in the course of conversation.

Until quite recently, almost the entirety of the Oracle Cloud Infrastructure web management console was written in Lisp—specifically, ClojureScript.

When Nebula shut its doors in 2015, quite a few of us joined the then-nascent Oracle Cloud Infrastructure group for what has been a career-defining effort: to build a next-generation, enterprise-focused public cloud from the ground up.

And my first move as UX architect? Why... to introduce a 59-year-old programming language into the critical path of our product, of course. It's a move that raised a few eyebrows—exactly the kind of move you'd expect a Lisp zealot to pull when nobody was looking.

Except... I wasn't a Lisp zealot. In fact, with the exception of a few for-fun Scheme sessions while reading SICP, I hadn't written a meaningful line of Lisp since taking CSCI 446 under John Paxton my freshman year of college—one I didn't even graduate from.

So... why Lisp? Every new engineer, hiring manager, or acquired company who joins asks me this question, so I figured it's time to write it all down.

But perhaps more importantly: I feel incredibly lucky to have been with Oracle these past 2+ years. The people are brilliant, and our goals are ambitious. I often feel like the dumbest person in the room (in every good way), and nowhere in my career have I worked with a group that consistently shipped so many awesome things in such a short amount of time. I hope this story conveys the sense of empowerment I think we all feel working here.

Building on Greenfield

When I joined Oracle in 2015, the Infrastructure group was just being officially formed. There were fewer than forty of us. Our charter was as nebulous as it was ambitious—and full of exactly my favorite kind of inspirational hubris. It was best explained in the form of a question that became our most powerful recruiting tool:

"What if you could build the cloud over again? What could be done better today? What opportunities and advantages exist in new technology and approaches?"

Many of us who joined—myself included—carried forward lessons learned from building the first-generation clouds at AWS, Microsoft, and Google. Over time, large systems—and the organizations that build them—become ossified; it's nearly impossible to evolve fundamental elements of the design.

But we had no systems. Heck, we barely had computers. Building this "gen-2 cloud" gave us a chance to revisit those fundamental elements and innovate in interesting ways. It was this chance to innovate that brought us off-box network virtualization, enabling customers to instantly provision bare metal servers and engineered systems into their virtual networks. It was this chance to innovate that let Jag reach back to the year 1952 and bring forward a Clos Network, giving us a massively scalable network with consistent, sub-100-microsecond latency.

And for me, a lowly UX engineer in a sea of hardcore distributed systems developers—and thus the only person with enough emotional energy to bother thinking about how truly wonderful/terrible modern web application development is—it meant a chance to lay out the architecture for what would become the interface sitting between Oracle's customers and the work of thousands of engineers.

Establishing Platform Constraints

The problem with greenfield development is it's really easy to build the wrong thing. Complete freedom from legacy systems, from customers dependent on certain behavior, from strict deadlines... these things sound wonderful in theory, but in practice, constraints are what help establish success criteria. Without them, it's easy to "chase the shiny", building the thing you want, but not necessarily the thing customers need.

And so, because nobody told me I couldn't, I got a few of us in a room and we came up with some organizational and platform constraints to help shape our thinking.

The overall goal here was simple: build service APIs good enough that any customer could use them to build their own modern management tools—including web consoles—without having to learn a dozen different API calling conventions, or resort to custom proxies or middleware.

And to help prove that we were doing the right thing, I would be our first customer.

Establishing Product Experience Constraints

Having developed multiple cloud management consoles for AWS and again for the Nebula One private cloud, I admit that I carried a nontrivial amount of bias (which I lovingly referred to as "intuition") into the product and engineering design of the Console.

Luckily, I was not alone. Members of the team had likewise worked on OpenStack, GCP, and AWS management consoles. Convincing them to do better was easy. And so, because nobody told me I couldn't, I got a few of us in a room and we came up with some product experience constraints:

- The Console would have a consistent experience across the suite of services, with the goal of enabling cross-service integration experiences without having to jump between disparate tools. Additionally, the Console would allow instant navigation between service areas. One platform, one platform experience.
- The Console would have a "push-ready", live user interface. No need to hit your browser's refresh button. No need to re-render whole pages. Data would trickle in, and update relevant parts of the Console automatically. This came from trying to retrofit pull/refresh-based experiences at previous companies. Conceptually modeling it as push and baking that into the architecture from the start prevents the kind of "one-way door" decisions that make switching over later difficult.
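To give a flavor of what "push-ready" means in practice, here is a minimal Reagent-style sketch (the namespace and data shapes are hypothetical): views dereference a reactive atom, so any data that trickles in re-renders only the affected components, with no manual refresh or change-detection wiring.

```clojure
(ns console.live-view
  (:require [reagent.core :as r]))

;; All instance data lives in one reactive atom; pushed updates land here.
(defonce instances (r/atom {}))

;; Called whenever new data arrives (e.g. from a poll loop or push channel).
(defn apply-update! [{:keys [id] :as instance}]
  (swap! instances assoc id instance))

;; Deref'ing the atom inside the view makes Reagent re-render this
;; component (and only this component) whenever the data changes.
(defn instance-table []
  [:table
   [:tbody
    (for [[id i] @instances]
      ^{:key id} [:tr [:td id] [:td (name (:state i))]])]])
```

No refresh button, no whole-page re-render: `apply-update!` is the only integration point the push path needs.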

Yeah, But... What About Lisp?

I'm trying to make a point here: this technology decision was not made in a vacuum. We began with the end in mind. Having collected a set of platform and experience goals, I began to brainstorm what kind of technology stack would let us meet these goals on our somewhat... aggressive timeline... with minimal staffing.

Let me elaborate.

At AWS, any given service team can have anywhere from one to a half-dozen UX engineers dedicated to their service's portion of the management console.

At the time, my UX engineering team at Oracle numbered two (plus a talented QA/ops engineer), and we had to provide product coverage for the entire core offering: identity and access management, virtual networking, compute, block storage, and object storage. And we had to build the UIs as the services themselves were being defined and built. And manage our own operations. And we had about four months to do it.

The platform portfolio has only continued to grow. We've since solved the "only two people" problem (and, by the way, most of them write Clojure every day, and we're hiring), but at the time, the name of the game was optimize for high leverage, high velocity.

And so, because nobody told me I couldn't, I sat in a room and came up with some design constraints that directly related to our platform, product experience, and delivery constraints:

1. We would design for low operational overhead. No complex or bespoke application servers. No stateful architecture. Push as much work either down into the infrastructure services or up into the browser as possible. This meant a single-page application—preferably one that we could eventually push out to a CDN as infrastructure-free JS/CSS/image assets. This meant creating well-designed APIs, and establishing the organizational precedent that the services needed to think of web interfaces as first-class stakeholders in their API designs.
2. We would design for ease of debugging. QA and customers would be able to take a snapshot of their application state, which engineers could use to reproduce and root-cause bugs.
3. We would design for ease of development. Bret Victor's talk Inventing on Principle had stuck with me, and I wanted to ensure that our tooling provided that "immediate connection" with the product. This meant more than just a fast write/compile/evaluate loop. It meant live coding. It meant a REPL that could evaluate changes in situ. Combined with #2, this meant being able to easily jump to a specific application state in order to develop/debug functionality.
4. And, perhaps most importantly, we would design for simplicity, rather than ease. We would eliminate common sources of bugs by removing the conditions under which they could occur, even if it meant having to ramp up on difficult or nonintuitive technologies. More on this below.
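The snapshot-for-debugging idea above falls out almost for free once the entire application state is a single immutable value. A minimal sketch of the mechanism (the names here are hypothetical; in ClojureScript you would use `cljs.reader` in place of `clojure.edn`):

```clojure
(ns console.snapshot
  (:require [clojure.edn :as edn]))

;; Assume a single atom holds the entire application state
;; (re-frame's "app-db" convention).
(defonce app-db (atom {}))

(defn snapshot
  "Serialize the whole UI state to an EDN string a user can attach to a bug."
  []
  (pr-str @app-db))

(defn restore!
  "Jump the running application to a previously captured state."
  [edn-string]
  (reset! app-db (edn/read-string edn-string)))
```

Because the state is plain data, `snapshot` and `restore!` also double as the "jump to a specific application state" tooling for development.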

Evaluating the Ecosystem

Taken together, all these things narrowed down the universe of options to a set of fairly noncontroversial decisions.

I had spent the previous year experimenting with functional reactive programming after coming across some fairly convincing arguments for its use. But, ultimately, it was a white paper and a conference talk that convinced me that the key to simplicity and correctness was orienting our architecture around the concerns of what, who, when, where, why, and how; modeling them as simple transformations on data; and ensuring these concerns were sufficiently divorced from each other.

In effect, this meant we needed to be able to design our own system rather than buy into a wholesale framework. Now... don't get me wrong: choosing popular frameworks comes with a lot of benefits. A lot of decisions are made for you, allowing you to focus purely on core application functionality rather than incidental concerns. It's easier to hire talent from the framework's community. And, perhaps most critically, it's easier to get started quickly.

But once a project scales past a couple dozen developers, a couple years, or beyond the core areas of the creators' original concerns, frameworks add a lot of drag. Large applications tend to specialize over time, making it harder and harder to rev the underlying framework. (And, I fully admit, some part of this has always been due to an organizational inability to directly contribute back to a framework in order to widen its concerns without pinning myself to a custom/patch implementation.)

We needed to be able to draw arbitrary boxes around the core parts of our application's infrastructure, and be able to replace/upgrade those boxes at the drop of a hat. This meant choosing a language/platform/community accustomed to building standalone, composable libraries.

Additionally, after years of hunting down bugs related to functions mutating input arguments, I decided immutable data structures made the list of desiderata. But it wasn't enough to simply say, "Our application will use immutable data structures." No—the entire application and all dependencies needed to standardize on this.

This was important for more than just preventing mutation bugs. It was also crucial to our goals of live coding and a reactive architecture. It turns out that when your core convention is data transformation, the only way to avoid ungodly amounts of copying is to rely on structural sharing, which requires pervasive immutability.
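To make structural sharing concrete, here is a tiny Clojure sketch (the data is hypothetical): "updating" an immutable map produces a new map, but every unchanged branch is the very same object in memory, so a transformation costs only the path it touches.

```clojure
(def before {:instances [{:id "i-1" :state :running}]
             :vcns      []})

;; assoc returns a *new* map; `before` is untouched.
(def after (assoc before :vcns [{:id "vcn-1"}]))

;; The untouched :instances branch is not copied; both maps
;; point at the same vector in memory.
(identical? (:instances before) (:instances after)) ; => true
```

This is why pervasive immutability is a prerequisite, not a nicety: structural sharing only works when nothing can mutate the shared branches out from under you.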

So, the following items were added to our list of engineering dependency constraints:

- We would choose a language/community that encouraged composable libraries over monolithic frameworks. A community that thinks in terms of libraries rather than framework integrations tends to design small, highly concern-specific APIs. Their authors assume usage in multiple contexts, and thus can't make sweeping assumptions about the environment. This tends to result in highly pluggable, highly testable systems. Naturally, this doesn't come without a tax; there are times when library APIs need to be massaged to fit the application's conventions.
- We would require immutable data structure support at the language level, rather than the library level. Ensuring that immutable data structures were part of the standard library would not only ensure better performance—it would remove the need for constant marshalling/unmarshalling at each application/dependency boundary, a constant source of errors. It also meant not having to learn multiple APIs for common structures.
- We would choose a language/libraries that aligned with our goal of creating something vaguely FRP-like. Strongly directed dataflow programming with reactively rendered, declaratively defined views would give us the "live updating" experience we wanted without having to piecemeal-integrate change detection and specialized update logic—something we had implemented at Nebula, but which was also a cause of serious maintenance burden.

This last item had some very interesting, pervasive implications. Any library that relied on owning its own non-incidental state, or that didn't provide fast, efficient setup/teardown, would almost certainly be impossible to use in this model—or would require so much somersaulting to coerce its behavior into something vaguely functional that the benefits of using it would be outweighed by the effort.

Yeah... BUT WHAT ABOUT LISP?!

This laundry list of platform, product experience, delivery, and engineering constraints took what was originally an open-world adventure and turned it into something that felt a bit less expansive. I wasn't sure I'd hit them all, but I had a decent idea of the relative importance of each item.

I compiled a list of languages, libraries, and frameworks, which had some of the usual subjects, as well as a few oddballs: JavaScript, TypeScript, CoffeeScript, Java, Dart, F#, Elm, Clojure(Script), Backbone, React, Angular, Knockout, Meteor, Reagent, re-frame, GWT.

I had experience with some of these in past lives. The original S3 and Storage Gateway Consoles were the first at Amazon to be written in Java/GWT. Nebula One used JS/Backbone.

There were others that I hadn't used, but were ranked lower based on core tenets:

Knockout and Angular were (at the time) strong proponents of two-way data binding.

Code in the Meteor community relied too heavily on environment-based conditionals peppered throughout the codebase to achieve isomorphic compilation.

Dart was promising, but too new for me to feel comfortable with it. The Dart VM wasn't likely to catch on anytime soon, and the Dart-to-JS process felt too much like GWT's spiritual successor; I was having flashbacks to my first days at S3 writing a custom linker for GWT in order to get the generated output to do what we wanted. (I admit, I still hold out a perverse hope for the widespread success of Dart; it feels like what JavaScript would and should have been were it designed, instead of evolved.)

Convincing anyone (including myself) that Elm and F#/Fable were viable for production usage would have been difficult. As beautiful as they are in practice, anything with a Hindley-Milner scent tends to scare people. I could see the fear in management's eyes: "But how will we hire people?!"

In the end, we were left with two obvious choices that met most of our constraints:

1. JavaScript/TypeScript, React, and likely Redux
2. ClojureScript and re-frame, which pulled in Reagent and React

Re-frame calls itself a framework, but is really a very lightweight, highly pluggable (either by design or by virtue of being written in Clojure) collection of libraries used to enforce event/data flow conventions. Our ability to easily sidestep the framework where needed and drop down to Reagent/React was a key factor in our decision to use it.
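For a sense of what those event/data flow conventions look like, here is a minimal re-frame-style sketch (the event, subscription, and field names are hypothetical): events are pure functions from the current app state to a new app state, subscriptions derive reactive views over that state, and Reagent components re-render when their subscriptions change.

```clojure
(ns console.instances
  (:require [re-frame.core :as rf]))

;; Event handler: a pure function of the current db and the event vector.
(rf/reg-event-db
 ::instance-updated
 (fn [db [_ instance]]
   (assoc-in db [:instances (:id instance)] instance)))

;; Subscription: a derived, reactive view over the app db.
(rf/reg-sub
 ::instances
 (fn [db _]
   (vals (:instances db))))

;; A Reagent view; re-renders automatically when the subscription changes.
(defn instance-list []
  [:ul
   (for [i @(rf/subscribe [::instances])]
     ^{:key (:id i)} [:li (:display-name i)])])

;; Somewhere in the push/update path:
;; (rf/dispatch [::instance-updated {:id "i-1" :display-name "web-1"}])
```

Everything above is ordinary library code; any piece can be swapped out, or bypassed in favor of raw Reagent/React, without fighting a framework.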

Of these two, ClojureScript edged out ahead in its pervasive use of immutable data structures. But that alone wasn't enough for me to choose a winner.

It's the People

Any large software product comes down to people: those building it, those who help you build it, and those you're building it for.

The JavaScript community's people are its greatest strength—and its greatest weakness. Their diverse backgrounds bring forth novel ideas. But JavaScript's ad-hoc, vendor-driven evolution—and consequently poor reputation—combined with a community made up primarily of people originally seen as the renegades and misfits of the software world—a sort of anti-big-software culture—means that JavaScript developers spend a lot of time inventing and reinventing things that have existed elsewhere in industry for decades, then pitting them against each other in a sort of battle royale that results in everyone scrapping their build system, compiler, DI container, data structure library, or rendering library every three to six months.

And while this sort of Darwinian model of software development is awesome for pushing forward new ideas, it comes with its own brand of chaos. Your code, at any point in time, is based on trend, rather than deep thought.

Navigating these trends is relatively simple if you know where to look: backwards. My path to JavaScript was circuitous: BASIC, Perl, PHP, C, Java, Common Lisp, Python, Ruby, and then JavaScript. Exposure to the ideas in each of these communities—in particular, the Java and Ruby communities—gives you pretty solid indicators for what will succeed.

Say what you will about Java... regardless of your opinions regarding its verbosity, one thing can't be denied: there's a lot of it. It was built with the end in mind: to be the lingua franca of the business software world. Its community builds large, mission-critical systems, spanning hundreds or thousands of developers.

So, if you see an idea in the JavaScript community promising to wrangle complexity, ease operational burden, increase debuggability, or scale development beyond a couple of people, and it's already present in the Java world, chances are it'll stick, because it's already stuck.

But much of today's JavaScript community exists because of the creation and success of Node.js, whose early community mirrored the early Ruby/Rails communities in temperament and ideals, namely: simplicity over complexity, convention over configuration, choice over mandate, and shorts over suits.

So, if you see an idea present in the JavaScript community promising disruptive power and flexibility at the cost of a little chaos, it will often win over a well-ordered monolith.

The Clojure community manages to blend all these elements much more naturally, and with greater purposeful intent, resulting in less incidental chaos. By thinking deeply on core aspects of new and interesting problems, they're able to extract highly reusable lessons that can be widely applied without disrupting existing methods. It's a kind of thought leadership that I hadn't seen in other communities. And the ClojureScript community still benefited from all the positive outcomes of the JavaScript community's Darwinian idea selection process.

To me, this created the perfect blend: cutting-edge thinking coupled with sustainable software craftsmanship; flexibility and power layered atop logic and structure.

And so, because nobody told me I couldn't, I picked Lisp.

And then we shipped a product. And built an awesome team. And made our product even better. And now, we'll continue building even more awesome stuff.

And when they ask us how we did it—why we did it—I suspect the answer will be the same: because nobody told us we couldn't.