The Economics of Programming Languages

David N. Welton

davidw@dedasys.com

2005-07-18

[ This article originally appeared in byte.com ]

I like programming languages a lot. I've used a number of them professionally, and have even written one myself - Hecl - although it borrows most of its ideas, if not source code, from Tcl. And, of course, I've taken part in my share of debates and discussions on "which language is best," a topic that doesn't have one clear answer but is often the source of heated arguments.

I recently read an interesting book, Information Rules: A Strategic Guide to the Network Economy by Carl Shapiro and Hal R. Varian (Harvard Business School Press, 1998; ISBN: 087584863X), which talks about the economics of the world of high technology. While reading it and thinking about programming languages, a number of things clicked. They aren't earth-shattering conclusions. On the contrary, a lot of them are more or less common sense, but it's nice to read that there are some methodically studied theories behind some of the intuitions, hunches and observations I've made over the years.

In this article, I'll attempt to list what I believe to be the most salient points of the economics of programming languages, and describe their effects on existing languages, as well as on those who desire to write and introduce new languages.

Languages as Products

Programming languages, like any product, have certain properties. Obviously, like any other sort of information good, production costs in the sense of making copies are essentially zero. Research and development (sunk costs) are needed to create the software itself, which means that an initial investment is required, and if the language is not successful, chances are the investment can't be recouped.

This applies to many information goods, but programming languages also have some qualities that make them special within this grouping. Namely, they are a means of directing computers and their peripherals to do useful work, and they are also a means for people to exchange ideas and algorithms for doing that work. In other words, languages go beyond simply being something that's useful; they are also a means of communication. Furthermore, in the form of collections of code such as packages, modules or libraries, programming languages are also a way to exchange useful routines that can be recombined in novel ways by other programmers, instead of simply exchanging finished applications.

Positive Network Externalities

This leads us to what is one of the most important concepts. "Network externalities," or "network effects," refers to the notion that the more people own or use something, the more valuable it is to everyone. Consider a cell phone, for instance. If you had no one to call, you might get some enjoyment out of selecting annoying ring tones in public places, but beyond that, it would be a mostly worthless chunk of plastic. A large part of the phone's value is in the ability to call your friends, and have them call you.

Programming languages' value is clearly not wholly dependent on this concept - you can certainly do a great deal even with a language that is not used by many people. However, consider all the extras you get if your language is popular: books, open source libraries, examples, discussion/support groups, job possibilities, or conversely, the ability to easily hire programmers to work on your company's product. If you use a popular language such as C, and a new networking protocol comes out, it is very likely that someone will create a library that "speaks" this protocol, and these days it's quite possible that it will even be open source. On the other hand, if you are using an obscure language, you will have to write the support for this new protocol yourself, or hope someone in a more limited group will.

The effects of this are clearly visible. Despite the vast number of languages that have been created by individuals, university researchers and corporations, the majority of programmers at any given point in time will probably use one of a relatively small set of languages. However, there are some factors that mitigate these effects, as we will see shortly.

Switching Costs

In the expense column, "switching costs," or the costs of changing programming languages, are fairly obvious. Rewriting a large piece of software in another language is fraught with pitfalls, and is likely to be extremely expensive and time consuming. Once a system is written, it may be extended or modified, but complete rewrites are rare. Would you propose rewriting a working system to your boss? "No, it won't actually do much more than what we have now, and it will take three programmers five months to complete, but it'll be in Haskell instead of Fortran!" This is why banks still have so much code written in COBOL.

For the individual, too, having to learn a new language is hard. Even for expert programmers, it's difficult to truly master many languages, as frequent practice is needed to "stay in shape." Once you've got a few under your belt, it's to your advantage to be able to use them for a while. You'd rather put "20 years Java experience" on your resume than "6 months of Forth programming," wouldn't you? Because both programmers and programs incur switching costs that can be very high, the effect is that languages last a long time, even if they are no longer "the latest thing" being taught in universities.

Switching costs are even greater for people who are not expert programmers - learning one language is already a big investment for them, and it would not be quick and painless for them to learn multiple languages, or switch to a different language. The old adage of "the best tool for the job" doesn't really fit in this case either. Since most programming languages are a great deal more complex than, say, a hammer, most people would rather just learn one or two that do a reasonably good job of handling the problems they most frequently face, yet are hopefully adaptable enough to deal with other needs that may arise.

The choice of a language to learn is consequently of greater importance to these individuals, who are, however, the least well positioned to make an informed choice, lacking the experience and ability to compare and contrast the toolchests available to them.

Once our hypothetical "part-time" programmer has selected and invested the time to learn a new language, they are likely at that point to want to defend their choice - no one likes to think that they made a bad investment. And of course they want to see it continue to go strong, so they are not put in a position of being forced to learn something else, starting from scratch.

Introducing a New Language

For those who wish to introduce and popularize a new language, a key point is how to overcome the momentum conferred on existing, popular languages by network effects and switching costs.

In this day and age, programming language implementations are mostly available for free (and in many cases even have open source implementations), meaning that it is rarely possible to compete by introducing a cheaper language and implementation. If we develop a new language, the "cheaper" and "better" portions of the equation therefore have to come from the quality of the language itself. It will have to make code:

easier to write - expanding the number of people who can use the language, and thus its value. Think of scripting languages compared to C. They are more forgiving and easier to use thanks to forms of garbage collection or reference counting, meaning that people who might not have been able to accomplish a given task with C could do it with a scripting language.

more efficient - making for faster programs that take less time or system resources to accomplish what they need. Although, as we can see from recent trends like Java, this is less important as memory prices fall and disk space is ever more plentiful.

higher quality - meaning that less programmer time is spent hunting bugs, and more on developing new features. Garbage collection, again, is one such feature that means that the programmer need worry less about keeping track of memory. Some research has gone into "provable" programs, although without any effect on mainstream techniques at this date. Not having to deal with direct pointer manipulation is also a way to eliminate a category of bugs.

more productive - languages that let you do complex things easily mean that you can do more than a competitor using a language that's slower to develop in. PHP on the web let people hack out dynamic web pages very quickly without a great deal of setup. The Tk graphical toolkit made writing GUIs very easy, propelling the Tcl language to widespread use. This is very often a question of libraries that abstract at the right level rather than the language itself; most languages, however, have a "feel", and it's the core language and libraries that set the tone.

Of course, the gains must be significant - if you only get a five percent gain, but lose out on all that existing code, or that big community of programmers who you can ask for help at a moment's notice, is it really worth it?
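To put rough numbers on that question, here is a back-of-the-envelope sketch in Python. It reuses the hypothetical rewrite from the Switching Costs section (three programmers, five months) and the five percent gain mentioned above. The function name and the model itself are assumptions made purely for illustration: it treats the gain as a flat productivity improvement and ignores everything else, such as lost libraries, retraining, and community support.

```python
def break_even_months(team_size, rewrite_months, productivity_gain):
    """Months of work after a rewrite before the gain repays its cost.

    Illustrative model only: assumes the gain is a constant fraction of
    each programmer's monthly output and that nothing else changes.
    """
    # Person-months spent on the rewrite itself.
    rewrite_cost = team_size * rewrite_months
    # Person-months saved per calendar month once the rewrite is done.
    monthly_saving = team_size * productivity_gain
    return rewrite_cost / monthly_saving


months = break_even_months(team_size=3, rewrite_months=5, productivity_gain=0.05)
print(f"Break-even after about {months:.0f} months")
```

Under these assumptions the team size cancels out, and the rewrite takes roughly 100 months, more than eight years, to pay for itself. Raise the gain to fifty percent and the same rewrite breaks even in about ten months, which is the kind of difference that makes a switch worth discussing.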

Compatibility

One way of giving yourself a boost is with some form of compatibility. C++ is an obvious example of this. Another is the major scripting languages, all of which are written in C, and provide an interface that allows them to interoperate at that level. Tcl was, for instance, conceived not as a stand-alone language, but as a library providing a flexible language to be included in larger programs, meaning that it has a C API that lets you easily interact between Tcl and C. This means that all the scripting languages have been able to take advantage of libraries written in C, greatly increasing their usefulness.
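As a small illustration of this kind of interoperation, the following sketch uses Python's standard ctypes module to call strlen from the system C library directly from a script. This is not Tcl's C API, just one convenient way to show a scripting language borrowing C code. It assumes a POSIX-like system where the C library can be located, and the wrapper name c_strlen is made up for the example.

```python
import ctypes
import ctypes.util

# Locate the system C library. If the lookup fails, fall back to the
# symbols already loaded into the running process (POSIX behaviour).
libc_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_path) if libc_path else ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Return the length of a byte string as computed by C's strlen."""
    return libc.strlen(text)


print(c_strlen(b"Tcl"))
```

The script never reimplements strlen; it simply declares the function's signature and calls the existing compiled code, which is the same leverage that lets scripting languages reuse much larger C libraries.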

Lisp, a very elegant language that is widely admired by language aficionados, has often taken the approach of being oriented to doing everything in Lisp - in the early 80s, computers were even produced that ran Lisp: "Lisp Machines"! Perhaps this desire to be Lisp "all the way down" has cost this language something in terms of its ability to co-opt through compatibility with C, but that's a complex discussion in and of itself and is probably best left for another article.

Java is another language that tends to desire "purity," although it is possible to integrate native C code with it. Its success is likely attributable to some genuine improvements over C and C++ in terms of ease and quality, and to its rigid organization, which gives teams the ability to assign tasks to less talented programmers without fear of contaminating the entire project with a hard-to-find segmentation fault. The millions of dollars that Sun Microsystems spent on marketing and on building out large standard libraries also helped the language's uptake, although this is a technique not available to many aspiring language designers.

Attack a Niche

Despite attempts to make languages flexible and fit to carry out as many different tasks as possible, as tools, languages do have areas where they are more suitable than others. Fortran is still tops for some scientific applications, but you most likely wouldn't use it for web applications. Erlang is a very robust language and platform that is ideal for applications that are not allowed to fail, yet it's likely not as easy for beginners as something like Python.

The world of computers and technology is always changing, so it follows that new niches and markets will emerge where existing languages aren't ideal. For example, scripting languages were a natural fit for the Web. Easy to use and handy for dealing with text, they made it much more pleasant to create web pages than working with C would have been. Language designers hoping to see their creation more widely used may wish to identify a particular niche where their invention is particularly well suited. The new language might not be that far out in front of the field in every way possible, but if development is concentrated on providing a system that is superior in at least one thing that it does very well, that might prove to be enough to get people using it.

PHP, for example, wasn't a novelty when it was created. As a language, it's nothing particularly innovative compared with others of its kind. However, it did dynamic web pages really well, and very easily compared to other systems available at the time. I recall setting up mod_perl and installing various components that often had version requirements. It was a pain compared to the more tightly integrated PHP setup. Another example, as mentioned above, was the combination of Tk with Tcl for scripting GUIs. Tcl is well suited to a variety of tasks, from running on Cisco routers to doing web scripting itself, but Tk was its home run, and what it became known for.

Conclusion

The notions discussed here will seem familiar to those who have dedicated some thought to the matter, but by framing them in terms of costs and benefits, I hope to give language users some insight into selecting a language that best suits their needs. This framework will also help language designers and advocates to better market their products by giving them a common language, that of economics, to discuss the challenges and opportunities inherent in the market for computer programming languages.

Besides the Information Rules book mentioned at the beginning of this article, another resource for readers interested in following up on this topic is John R. Mashey's "Languages, Levels, Libraries, and Longevity" published in ACM Queue (vol. 2, no. 9, Dec/Jan 2004-2005). Thanks also to Professor Stephen J. Turnbull, of the University of Tsukuba, for his insightful comments and suggestions.