Jan 21, 2013

Today I would like to introduce an idea that I’ve been playing around with as a thought experiment for years, but that has finally become a reality. Imagine a programming language designed specifically for teaching young computer science students a solid foundation in sound computer science topics as well as practical techniques useful in creating rock-solid industrial systems. Below, I’ll outline the features of Enfield.

the Enfield logo

Syntax

To start, I’m a strong believer that when learning new, and at times complex, topics it’s outright detrimental to overload students with frivolous syntax rules. Therefore, Enfield is designed with minimal syntax rules, and what’s more minimal than a Lisp-like syntax? An example function definition in Enfield is as follows:

    (define (fibo number)
      (letrec ([f (lambda (number n i)
                    (if (<= number 0)
                        i
                        (f (- number 1) (+ n i) n)))])
        (f number 1 0)))

    (fibo 500)
    ;=> 139423224561697880139724382870407283950070256587697307264108962948325571622863290691557658876222521294125

There are a few interesting points about this snippet:

- A Lisp syntax
- Bignums
- Tail-recursion
- Lexical scope

In each case a pedagogist can take the opportunity to deeply explore these important topics. However, focusing again on syntax for a moment, while I do think that a simplified syntax is best, it’s important to allow maximum flexibility. It would be nice for Enfield to provide macros in addition to lexer and parser hooks (another chance to dive deeply here) in order to bend the syntax to one’s will, allowing a multitude of syntactic forms. One interesting possibility, and one that I think is sorely lacking from CS education, is a special form of Enfield that allows a student to write essays in a textual language containing embedded source code. Too often CS education focuses on mathematical formalisms and code to the dismissal of the spoken and written word. Enfield can help!
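To make the macro point concrete, here is a minimal sketch of how a student might add a new syntactic form. It assumes Enfield macros behave like Racket's define-syntax-rule; the while form and the count-up helper are hypothetical names invented for this illustration.

```racket
#lang racket

;; Hypothetical `while` form built from a pattern-based macro.
;; The loop re-checks `condition` before every pass over the body.
(define-syntax-rule (while condition body ...)
  (let loop ()
    (when condition
      body ...
      (loop))))

;; Illustrative helper: collect 0, 1, ..., limit-1 using the new form.
(define (count-up limit)
  (define i 0)
  (define seen '())
  (while (< i limit)
    (set! seen (cons i seen))
    (set! i (+ i 1)))
  (reverse seen))

(count-up 3) ;=> '(0 1 2)
```

A loop like this is a nice discussion starter: the language itself never needed a built-in while, because the student could grow one.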

Essential building blocks

Key to any college programming course is that the language of choice provide the essential building blocks of computation. Ideally, an introductory course would require students to not only learn a language, but to also create an interpreter for a subset of the language using the language itself. Enfield should therefore provide a set of basic building blocks useful for building Enfield in Enfield. Features such as metaprogramming, code as data, concurrency primitives, continuations and parsing libraries are just a few that would allow a deep understanding of not only Enfield, but language design and implementation in the abstract.
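As a rough illustration of the "Enfield in Enfield" exercise, the sketch below evaluates a tiny subset of the language (numbers, booleans, variables, if, lambda and application) by treating code as quoted data. The names tiny-eval, lookup and base-env are my own inventions for this example, and it assumes pattern matching in the style of Racket's match.

```racket
#lang racket

;; Look up a variable in an association-list environment.
(define (lookup env name)
  (cond [(assq name env) => cdr]
        [else (error 'lookup "unbound variable: ~a" name)]))

;; Evaluate a quoted expression from a tiny subset of the language.
(define (tiny-eval expr env)
  (match expr
    [(? number?) expr]
    [(? boolean?) expr]
    [(? symbol?) (lookup env expr)]
    [`(if ,test ,then ,alt)
     (if (tiny-eval test env)
         (tiny-eval then env)
         (tiny-eval alt env))]
    [`(lambda (,params ...) ,body)
     ;; A closure: extend the defining environment with the arguments.
     (lambda args
       (tiny-eval body (append (map cons params args) env)))]
    [`(,rator ,rands ...)
     (apply (tiny-eval rator env)
            (map (lambda (r) (tiny-eval r env)) rands))]))

;; A few host-language primitives exposed to the interpreted language.
(define base-env
  (list (cons '+ +) (cons '- -) (cons '* *) (cons '<= <=)))

(tiny-eval '((lambda (x y) (+ x (* y 2))) 1 20) base-env) ;=> 41
```

Even a toy like this surfaces real design questions (lexical scope, evaluation order, what counts as a primitive) that a class could spend weeks on.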

A plethora of libraries

While it’s a great idea to keep the syntax and logical footprint of Enfield small, in no way should the language be minimal in its possibilities. A language for learning should provide a large spectrum of possibility, from offering easy exploration of the untyped lambda calculus to first-class access to performant linear algebra libraries via its packaged offerings, package repositories and popular source repositories.

Hardcore IDE

A first-class IDE is essential to the learning process. As amazing IDEs like LispWorks, Mathematica Notebooks and ObjectStudio have shown, there is amazing potential in a programming environment tightly integrated with the language on which it operates. An Enfield IDE should have the standard fare, including breakpoints, value inspection, import resolution, expression evaluation and all of the other features found in modern IDEs. Additionally, a first-class graphical capability allowing students to plot graphs and explore the shape of algorithmic execution is key.

An illustration of a ‘hypothetical’ IDE session of a student creating a Tetris game.

There is a lot to learn from new IDE technologies like Light Table in developing an Enfield IDE.

A plethora of paradigms

While Enfield supports functional styles, and any curriculum should make them the primary focus, it should also support numerous other paradigms.

Query languages and logic

Logic programming languages should be readily available or easily made using Enfield. Ideally a curriculum built around Enfield would include sections on miniKanren (supplemented with appropriate miniKanren literature), Prolog, SQL and Datalog. A solid foundation in logical thinking and declarative programming is key for any computer scientist.
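For a taste of what an embedded logic language could look like, here is a small Prolog-style sketch. It assumes the racklog library is installed and stands in for whatever logic system a course would actually pick; the %parent and %grandparent relations and the names in them are made up for the example.

```racket
#lang racket
(require racklog)

;; Facts: a hypothetical parent relation.
(define %parent
  (%rel ()
    [('alice 'bob)]
    [('bob 'carol)]))

;; Rule: x is a grandparent of z if x is a parent of some y
;; who is in turn a parent of z.
(define %grandparent
  (%rel (x y z)
    [(x z) (%parent x y) (%parent y z)]))

;; Query: who is alice a grandparent of?
(%which (who) (%grandparent 'alice who)) ;=> '((who . carol))
```

The declarative style is the lesson here: the student states what a grandparent is and lets the system search for the answer.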

OOP

In addition to covering the standard academic topics such as functional programming, logic and continuations, Enfield should allow industry-proven object-oriented techniques. A simple example might look as follows:

    (define animal-interface (interface () say))

    (define cat%
      (class* object% (animal-interface)
        (super-new)
        (define/public (say)
          (display "meeeeew!"))))

    (define tom (new cat%))

    (send tom say)
    ;; meeeeew!

This is a work in progress.

Metal

Lest I be accused of favoring high-level programming over bare metal techniques, I should say that every programmer should have exposure to low-level programming. Ideally Enfield would have a first-class FFI used to access C-level libraries created by the students. A different world unfolds when one needs to manage memory manually and tweak algorithms to their utmost to milk every last bit of speed. Enfield should facilitate that experience rather than deny it.

Other paradigms

In addition to the paradigms above, Enfield should also support others including, but not limited to: prototype-based object programming (sample below), dataflow, imperative and parallel programming.

    (define-object account (*the-root-object*)
      (balance set-balance! 0)
      ((payment! self resend amount)
       (self 'set-balance! (+ (self 'balance) amount))))

    (define a1 (account 'clone))
    (define a2 (account 'clone))

    (a1 'payment! 100)
    (a2 'payment! 200)

    (a1 'balance)
    ;; => 100

    (a2 'balance)
    ;; => 200

    (a1 'payment! -20)
    (a1 'balance)
    ;; => 80

Of course, every section on a given paradigm should include deep discussion and exercises around the benefits and drawbacks of each paradigm. Using a multi-paradigm programming language as a teaching tool allows students to experience a wide breadth of paradigms allowing them to develop a true understanding of what each offers, its drawbacks and its strong points. The last thing we want are programmers versed in only one or two paradigms.

Creating robust software

A college curriculum is a delicate balance between academic and industry focuses. I’m of the opinion that a well-balanced curriculum focuses on core concepts, paradigms and histories to start and only then dives into “real-world” concerns for senior year projects or courses and graduate programs. However, along the way Enfield courses could indeed introduce and foster an understanding of techniques used for creating robust, real-world systems. As with the lessons on paradigms, a college curriculum should feature deep discussions and essays outlining the shortfalls and benefits of the techniques outlined below. It would be a tragedy to produce programmers who think that any one of the techniques below is wholly sufficient in ensuring robust software systems.

Unit testing

An Enfield unit testing framework could be developed from scratch with supplemental material from luminaries such as Kent Beck. Any included or available Enfield unit test framework should provoke much respectful in-class debate, something else sorely lacking from typical college curricula and Internet message boards.
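To give a sense of how small a from-scratch starting point could be, here is a minimal sketch of a test recorder. The names run-test, results and failed-tests are hypothetical and not part of any existing Enfield library; the fibo function is the one from the Syntax section.

```racket
#lang racket

;; Recorded outcomes as (name . passed?) pairs, newest first.
(define results '())

;; Run a thunk, compare its value to the expected one, record the outcome.
(define (run-test name thunk expected)
  (define passed? (equal? (thunk) expected))
  (set! results (cons (cons name passed?) results))
  passed?)

;; Names of the tests that failed, in the order they were run.
(define (failed-tests)
  (for/list ([r (reverse results)]
             #:unless (cdr r))
    (car r)))

;; The function under test, as defined earlier in the post.
(define (fibo number)
  (letrec ([f (lambda (number n i)
                (if (<= number 0)
                    i
                    (f (- number 1) (+ n i) n)))])
    (f number 1 0)))

(run-test "fibo of 0" (lambda () (fibo 0)) 0)
(run-test "fibo of 10" (lambda () (fibo 10)) 55)
(run-test "deliberately wrong" (lambda () (fibo 3)) 99)

(failed-tests) ;=> '("deliberately wrong")
```

From here a class could debate what is missing: exception handling, fixtures, reporting, and whether any of it proves the code correct.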

Contracts

A little-used but powerful technique for software verification is the use of software contracts, that is, patterns for declaring function and method input/output constraints and object invariants. A thorough investigation of contracts on functions (including higher-order functions) compared to those on classes involved in hierarchies is fertile ground for deep discussion and debate. A simple example of what a contracts system might look like would be as follows:

    (provide
     (contract-out
      [argmax
       (->i ([f (-> any/c real?)] [lov (and/c pair? list?)]) ()
            (r (f lov)
               (lambda (r)
                 (define f@r (f r))
                 (and (is-first-max? r f@r f lov)
                      (dominates-all f@r f lov)))))]))

    (define (dominates-all f@r f lov)
      (for/and ([v lov]) (>= f@r (f v))))

    (define (is-first-max? r f@r f lov)
      (eq? (first (memf (lambda (v) (= (f v) f@r)) lov)) r))

Contracts in Enfield should be applicable post-facto so that students can apply constraints to previous exercises or external libraries. This would be a fun and challenging exercise.

Static typing

Enfield wouldn’t be complete without the ability to provide some level of static typing. Ideally, its static checks, like contracts, should be applicable to existing libraries either piecemeal or in whole. The exercise of applying static types to functions created from whole-cloth or pre-existing should provide a nice counter-point to the dynamic experience provided by Enfield by default. Static checks in Enfield would not look very different from regular source:

    (: sum-list ((Listof Number) -> Number))
    (define (sum-list l)
      (cond [(null? l) 0]
            [else (+ (car l) (sum-list (cdr l)))]))

Class focus on topics revolving around polymorphic dispatch, gradual typing, subtyping and union types should provide heated discussion.

Practical concerns

Rather than a purely academic language, Enfield should be reasonably practical. As a senior project or special course students could engage in building a webserver using pure Enfield. They would start with as few lines of code as possible…

    (define (start req)
      (response/xexpr
       `(html (head (title "Hello Cleveland!"))
              (body (p "Hi!")))))

… and build up to more complicated offerings. Maybe using the robust FFI a student could create Node.rkt! Additionally, Enfield should be simple enough to allow the student to write a subset of it to target the web-browser via JavaScript.

Conclusion

I’ve outlined a complete programming language, runtime and tools that would, could and should be used in colleges across the world. From my perspective a multi-paradigm approach is ideal to provide a breadth of experience, to generate class discussion and debate and to minimize the tangential complexities that come from switching languages.

If you’d like to discuss Enfield, its features or any of the ideas presented herein then feel free to comment on this post or email me at the address at the top of my blog.

:F

thanks to Brian T Rice for inspiring this post with a well-timed comment on the Twitters

update feb.2013: I have completed the implementation of Enfield! The complete implementation of Enfield (in Racket) is on Github. Patches welcomed!