Common Lisp code to create an n-input, m-unit, one-layer perceptron. Taken from the code of AIMA, a classic textbook in Artificial Intelligence. The whole code is available here.
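To give a flavor of what such code looks like before you open the AIMA repository, here is a minimal sketch (my own, not the AIMA code) of an n-input, m-unit, single-layer perceptron, written in Scheme since that is the dialect this series uses; the procedure names (`make-layer`, `forward`) are invented for illustration:

```scheme
;; A layer is a list of m rows of weights; each row has n+1 entries
;; (the extra entry is the bias weight). All weights start at zero.
(define (make-layer n m)
  (map (lambda (_) (make-list (+ n 1) 0)) (iota m)))

;; Dot product of two equal-length lists of numbers.
(define (dot ws xs) (apply + (map * ws xs)))

;; Threshold activation: fire (1) when the weighted sum is >= 0.
(define (step x) (if (>= x 0) 1 0))

;; Forward pass: prepend the constant bias input 1, then compute
;; each unit's output.
(define (forward layer inputs)
  (map (lambda (ws) (step (dot ws (cons 1 inputs)))) layer))

(forward (make-layer 3 2) '(1 0 1))
;; => (1 1), since all-zero weights give a weighted sum of 0
```

The AIMA code adds the learning rule on top of this; the sketch only covers the forward computation.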

If you are a programmer who reads about the history and random facts of this lovely craft, and who practices it ad honorem (just for fun), you have probably found yourself reading about a programming language called Lisp. Some praise it as a software miracle, as the best tool for programming. Some even dare to call Lisp one of the best programming languages ever invented (even if that claim doesn't make much sense). After all, before Python, Scala, and Haskell, there was programming, and before Deep Learning there was Artificial Intelligence. Here are some great hackers who love Lisp:

Paul Graham, co-founder of Y Combinator, is a big Lisp evangelist. He wrote his startup's code in Lisp. Viaweb (the startup) was co-founded with Robert Tappan Morris, a legendary hacker who allegedly released the first computer worm by accident. After being rehabilitated as a worm writer (the worm was written in C), Morris wrote the server code of the small company in Common Lisp. Viaweb was sold to Yahoo! in 1998 for $48 million. Of course, there's not enough evidence yet to say that C leads you to jail while Lisp makes you a millionaire.

Alan Kay, a pioneer in the practical aspects of OOP and the lead developer of the original Smalltalk (another software miracle?), has called Lisp the greatest single programming language ever designed. He has also compared Lisp to Maxwell's equations.

Edsger Wybe Dijkstra, a pioneer in the field of formal verification and specification, concurrency theory, and operating systems design, whose most famous work is a greedy one, once said:

Lisp has jokingly been called the most intelligent way to misuse a computer. I think that description is a great compliment because it transmits the full flavor of liberation: it has assisted a number of our most gifted fellow humans in thinking previously impossible thoughts.

Robert Floyd, a Stanford professor without a PhD (I'm just joking, but it's true), the designer of the Floyd-Warshall algorithm, a pioneer in axiomatic semantics, and the author of the best book for a hacker to learn mathematical induction from, once said:

Although my own previous enthusiasm has been for syntactically rich languages, like the Algol family, I now see clearly and concretely the force of Minsky’s 1970 Turing Lecture, in which he argued that Lisp’s uniformity of structure and power of self reference gave the programmer capabilities whose content was well worth the sacrifice of visual form.

Some CS celebrities who have treated Lisp as a miracle (sometimes): a venture capitalist, a musician, the inventor of Dijkstra's algorithm, and Robert Floyd (who was highly appreciated by Donald Knuth).

Three of our luminaries, along with Marvin Minsky (the man referred to by Floyd) and John McCarthy (the inventor of Lisp), were awarded the Turing Award. So why do so many CS celebrities speak so highly of a simple programming language? Lisp is famous nowadays because of the things others have said about it, but in the early days of AI, Lisp was the de facto language for expressing ideas related to natural language processing, computer-assisted geometry, text generation, AI planning, and automated theorem proving. Yes, there was AI before Machine Learning; indeed, there was an AI winter before the boom of neural networks and statistical approaches to AI, but that's a topic that deserves a post of its own.

The Lisp approach to AI

John McCarthy, the inventor of the term “Artificial Intelligence”, the inventor of garbage collection, and the inventor of Lisp. Marvin Minsky, the founder of the AI lab at MIT.

The progress, development, and evolution of Lisp were tightly tied to the early progress, development, and evolution of Artificial Intelligence. Two of the people mentioned above were pioneers in AI: John McCarthy, the creator of Lisp, coined the term Artificial Intelligence, while Marvin Minsky shaped the content of the new field by founding the AI lab at MIT. Many of their students developed the first digital milestones of artificial intelligence.

Programs for natural language understanding and generation, game playing (the link contains a paper from the man who introduced the term Machine Learning), theorem proving, early computer vision, symbolic mathematics (especially integration), problem solving, and knowledge representation were produced at Stanford and MIT using different dialects of Lisp as the tool to express those ideas in. Was it just a coincidence, or is there something special about the idea (not just the language) of Lisp? Here is a list of some classic AI programs that were expressed in Lisp.

A typical conversation between a human and ELIZA. The paper that introduced the program is called ELIZA — A Computer Program for the Study of Natural Language Communication between Man and Machine

MACSYMA (MAC’s symbolic manipulator) was one of the first computer algebra systems originally developed at MIT’s project MAC. MACSYMA was written in a dialect of Lisp called MacLisp, and at that time it was one of the biggest Lisp programs out there. In 1982, MACSYMA was licensed to Symbolics, a computer hardware company whose main focus was the production of machines whose architecture was optimized for the development and interpretation of Lisp programs.

A very simple session in MACSYMA.

SHRDLU was the dissertation project of Terry Winograd (later the PhD advisor of Larry Page at Stanford University) at MIT. It was written in the AI lab created by Minsky to demonstrate a dialog with the machine that could lead to actions taken by the machine in a virtual environment both agents (the human and the machine) were capable of understanding. Like MACSYMA, SHRDLU was written in MacLisp.

A sample session in SHRDLU. The program was supposed to understand and execute actions told by a human in natural language.

The progress of AI in its early days was not because of Lisp; I do think CS subjects should be agnostic of the language their ideas are expressed in. Lisp was used in the early days of AI because it was flexible enough to allow quick experimentation and prototyping (the REPL), and it introduced fundamental ideas that were cool and fresh at the time (the IF-THEN-ELSE construct, recursion, and garbage collection). Those features proved useful for expressing the kind of ideas AI people needed to express. This innovation, and the rapid adoption of Lisp for AI in labs and projects, helped the language grow and become a standard AI language.

Of course, all these programs could have been written in other languages, but Lisp was an accepted and highly praised vehicle for exploring and implementing these kinds of ideas at the time.

Lisp in the real world

At this point, you may think that Lisp was just an academic invention to teach and implement symbolic AI programs. But the rapid adoption of Lisp in academia led to a massive effort to embrace Lisp (or its descendants) in real-world, production-ready software. The following is a collection of some of those programs; most of them are still running in production environments, while the rest once powered large pieces of software in well-known projects or companies.

Keep in mind that Lisp and its dialects have evolved a lot since McCarthy first defined the language, but most of the original idea of Lisp has been preserved in its descendants. Most of Lisp's semantics has remained invariant across the implementations that were capable of powering or supporting, in one way or another, the operation of the following projects.

Some of the projects/companies whose stack has included Lisp.

One of Lisp's main virtues is that it enables a programmer to create new linguistic abstractions with ease. So it should come as no surprise that Lisp has influenced many popular programming languages; two of them, very close to the AI/Data Science/ML community (besides Lisp itself), are R and Julia.
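To make "new linguistic abstractions" concrete, here is a tiny Scheme sketch: a user-defined conditional construct built with a macro. In most languages adding a new control structure means changing the compiler; in Lisp it is a few lines of ordinary code. (The name `unless*` is chosen to avoid clashing with the `unless` that many Schemes already provide.)

```scheme
;; `unless*` runs its body only when the test is false.
;; The macro rewrites (unless* t e ...) into (if t #f (begin e ...))
;; before evaluation, so the body is never evaluated when t is true.
(define-syntax unless*
  (syntax-rules ()
    ((_ test body ...)
     (if test #f (begin body ...)))))

(unless* (> 1 2)
  (display "1 is not greater than 2")
  (newline))
```

Because macros operate on code-as-data (plain lists), the same mechanism scales from small conveniences like this up to whole embedded languages.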

R was originally written as a very simple Lisp interpreter, using as references a chapter of a very popular introductory textbook on computer science and a really good but surprisingly little-known book on programming languages. Lisp had an enormous influence on the development and conception of the first R implementation, as Ross Ihaka (the creator of R) has documented many times.

Julia's development was heavily inspired by the same Lisp dialect that inspired R. That influence was so big that the language developers decided to write some parts of the language pipeline in it: the Julia parser is written entirely in Scheme and is evaluated using femtolisp, a Lisp dialect written by one of the language's designers.

Another language worth mentioning is Lush, a scientific, object-oriented programming language designed for prototyping numerical analysis, computer vision, and machine learning programs. It was designed and implemented by Yann LeCun, the man behind the introduction of Convolutional Neural Networks to Computer Vision (along with Kunihiko Fukushima), and the current director of Facebook's AI lab.

If Lisp is so great, why isn't TensorFlow's main language Lisp?

Most of the programs mentioned earlier made heavy use of symbolic manipulation. As Carlos E. Perez mentions in his post The Many Tribes of Artificial Intelligence, before the ML and Neural Network boom there were symbol-based approaches to AI, which combined symbolic manipulation of elements with collections of rules modeled to encapsulate the behavior of an intelligent system. The problem in those days was not the efficient computation of numerical problems, but the manipulation and synthesis of symbols.

Just as C, C++, and Fortran shine in numerical computation, where performance matters the most, Lisp shines in symbolic manipulation. One of Lisp's greatest strengths is its ability to handle symbols and lists efficiently.
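Here is a small taste of what "symbolic manipulation" means in practice: a sketch of symbolic differentiation in Scheme, the kind of symbol-and-list processing systems like MACSYMA were built on (this is the classic textbook version, not MACSYMA's actual code). Expressions are plain lists, so `(+ x (* x x))` stands for x + x².

```scheme
;; Differentiate expr with respect to var, for expressions built
;; from numbers, symbols, binary + and binary *.
(define (deriv expr var)
  (cond ((number? expr) 0)                      ; d(c)/dx = 0
        ((symbol? expr) (if (eq? expr var) 1 0)); d(x)/dx = 1
        ((eq? (car expr) '+)                    ; sum rule
         (list '+ (deriv (cadr expr) var)
                  (deriv (caddr expr) var)))
        ((eq? (car expr) '*)                    ; product rule
         (list '+
               (list '* (cadr expr) (deriv (caddr expr) var))
               (list '* (deriv (cadr expr) var) (caddr expr))))
        (else (error "unknown expression" expr))))

(deriv '(+ x (* x x)) 'x)
;; => (+ 1 (+ (* x 1) (* 1 x)))
```

The result is unsimplified, but notice how the program both consumes and produces ordinary Lisp data: code and symbolic expressions share one representation, which is exactly why this style of AI work fit Lisp so naturally.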

Lisp is not a perfect language; it has many flaws (lots of dialects, a lack of well-known libraries, a weird syntax that does not help attract newcomers, dynamic typing, etc.), but it was a well-suited tool for the problems AI pioneers were trying to tackle in those days, in the same way that C/C++ or Fortran are a perfect choice for implementing the underpinnings of a Deep Learning system (TensorFlow is implemented in both C++ and Python). There's no single Swiss-army-knife programming language; we need to pick the language that best suits the particular task we're approaching.

Exploring AI with Lisp

The whole idea of this series is to use Lisp, more specifically its dialect Scheme, to explore ideas related to Artificial Intelligence (AI is much more than programming, and AI programming is much more than Lisp). The goal is to learn together about classical AI concepts such as general problem solving, text generation, symbolic mathematics, knowledge representation, expert systems, search, NLP, logical and stochastic reasoning, and game playing, and even "contemporary" topics such as neural networks, using the Scheme programming language to express those ideas.

Let’s begin our journey exploring Artificial Intelligence using Lisp. Your homework for the next post in the series is to install MIT-Scheme on your machine.
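On most Unix-like systems the installation is a one-liner; the package names below are the usual ones in the Debian/Ubuntu and Homebrew repositories (check your platform's package manager if yours differs):

```shell
# Debian/Ubuntu
sudo apt-get install mit-scheme

# macOS with Homebrew
brew install mit-scheme

# then start the REPL
mit-scheme
```

Once the REPL starts, you can quit it with `(exit)`.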