W - A simple programming language

So why would anyone in his right mind create a whole new programming language? Isn't BASIC or C++ good enough already?

Well, almost. You see, recently I got my hands on two vintage Hewlett-Packard handheld computers, an HP-95LX and an HP-200LX, and I promptly fell in love with both. Both teeny DOS-based machines are surprisingly useful, to the extent that even their little keyboards are eminently practical. (HP keyboard quality helps.) And both lack a decent programming language.

That is not to say that you cannot write code for these HP handhelds using any compiler that can produce DOS programs, such as Microsoft's Visual C++ 1.52c (the last 16-bit version of that development system). But you'll never actually install Visual C++ on an HP-95LX. Apart from the fact that this machine doesn't run Windows, it also lacks the substantial number of MIPS and megabytes that this development system requires in order to run.

How about BASIC? Well, if you're lucky and you can find your old floppy disks, you may be able to locate an old copy of GWBASIC or some other "lean-and-mean" text-based BASIC interpreter. But an interpreted language is inherently inefficient, and, well, requires the interpreter to run. OK, so maybe I could find a BASIC compiler somewhere if I really looked hard, but who the devil wants to write programs in BASIC anyway?

No, I was looking for something much simpler and more efficient. So, like any good programmer, I decided to stop looking and start doing. Having built my own computer not too long ago, it was time to put down my screwdriver and embark on another project: building a programming language and compiler from scratch. A compiler that actually fits on a single sheet of paper. (Okay, I admit, I DID have to use a very small font for that. But it's still readable, even if you need to put on your reading glasses!)

What I had in mind was a C-like language, but not quite C itself. Just like C, W would have functions and compound statements; local and global variables; pointers and expressions. Unlike C, however, W would be a keyword-less and typeless language.

Typeless and keyword-less, you ask? What for? Well, I admit to having been influenced both by C's predecessor, BCPL (itself a typeless language), and by Richard Bartle's MUDDLE, an object-oriented programming language designed especially for writing Multi-User Dungeon (MUD) games. A language without data types and keywords has a great degree of clean elegance, something I was trying to reproduce in W.

This language has only one data type: a 16-bit word (hence the name, W). Every symbol, be it the name of a function, a local variable, or a global variable, is in fact just a substitute for a 16-bit quantity that may be a value or an address in memory. This restriction may seem too severe at first; how can a language like this, for instance, effectively handle character strings? As it turns out, it's not nearly as difficult as it may appear. W may be elegant, but it's also practical. As these pages demonstrate, it is a language that is sufficiently powerful to compile its own compiler and produce usable code.

A First Example

Ever since Kernighan and Ritchie's "bible" on the C programming language first saw the light of day, it has been traditional to introduce a programming language through a simple program that just prints "Hello, World!" on the computer screen. Here is this program in W:

    write := 0x8B55, 0x8BEC, 0x085E, 0x4E8B, 0x8B04, 0x0656, 0x00B8, 0xCD40, 0x7321, 0x3102, 0x8BC0, 0x5DE5, 0x90C3

    _() := { write(1, "Hello, World!\r\n", 15) }

Short as this program might be, it already demonstrates a few key characteristics of W.

First, W is a language without a standard library. Interfacing with the operating system is the programmer's responsibility. In the present case, we wish to use the operating system to print a 15-byte message on standard output. For MS-DOS, one possible implementation in machine language would call the appropriate Interrupt 21h function (function 40h, write to a file or device) as follows:

    0100 55        PUSH  BP
    0101 8BEC      MOV   BP,SP
    0103 8B5E08    MOV   BX,[BP+08]
    0106 8B4E04    MOV   CX,[BP+04]
    0109 8B5606    MOV   DX,[BP+06]
    010C B80040    MOV   AX,4000
    010F CD21      INT   21
    0111 7302      JNB   0115
    0113 31C0      XOR   AX,AX
    0115 8BE5      MOV   SP,BP
    0117 5D        POP   BP
    0118 C3        RET
    0119 90        NOP

It is the bytes of this machine code subroutine that are assigned to the symbol write in the W program above.
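To see how the disassembly turns into the list of numbers in the W program, here is a small sketch in Python (not W; the helper name pack_words is made up for this illustration). It assumes only that the 8086 stores 16-bit words low byte first, and that the trailing NOP is presumably there to round the routine up to an even number of bytes.

```python
import struct

# Machine code bytes of the write routine, copied from the disassembly above.
MACHINE_CODE = bytes.fromhex(
    "55"        # PUSH BP
    "8BEC"      # MOV  BP,SP
    "8B5E08"    # MOV  BX,[BP+08]
    "8B4E04"    # MOV  CX,[BP+04]
    "8B5606"    # MOV  DX,[BP+06]
    "B80040"    # MOV  AX,4000
    "CD21"      # INT  21
    "7302"      # JNB  0115
    "31C0"      # XOR  AX,AX
    "8BE5"      # MOV  SP,BP
    "5D"        # POP  BP
    "C3"        # RET
    "90"        # NOP  (assumed to be padding to an even byte count)
)


def pack_words(code):
    """Group bytes into little-endian 16-bit words (hypothetical helper)."""
    if len(code) % 2:
        raise ValueError("code length must be even; pad with a NOP (0x90)")
    return [word for (word,) in struct.iter_unpack("<H", code)]


words = pack_words(MACHINE_CODE)
print(", ".join(f"0x{w:04X}" for w in words))
```

Running it prints the same thirteen values that follow write := in the Hello, World! program, starting with 0x8B55 and ending with 0x90C3.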

Second, a W program at the topmost level essentially consists of declarative statements in the following form:

symbol := definition

Both the symbol write and the function _() are defined in this fashion.

Third, each W program must contain a function named _(), which is where program execution will begin.

Compiling this program with the W compiler produces the following output:

    C:\>w hello
    Address map (global symbols):
    =============================
    0120 (code) _
    013B (heap) write

The compiled result is an MS-DOS executable, hello.com, a file 100 bytes in length.
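The numbers in the address map and the 100-byte file size can be cross-checked with a little arithmetic. The sketch below, in Python rather than W, assumes the usual DOS convention that a .COM image is loaded at offset 0100h. What each region actually holds (startup code, string storage) is an inference from the figures alone, not something the compiler output states.

```python
COM_LOAD_OFFSET = 0x0100  # DOS loads a .COM image at offset 100h, after the PSP
FILE_SIZE = 100           # size of hello.com, from the text
ADDR_MAIN = 0x0120        # "_" (code), from the address map
ADDR_WRITE = 0x013B       # "write" (heap), from the address map
WRITE_BYTES = 2 * 13      # write is defined as 13 sixteen-bit words
MESSAGE_BYTES = 15        # length of the "Hello, World!" message with CR/LF

# First address past the end of the loaded image.
end_of_image = COM_LOAD_OFFSET + FILE_SIZE

# Bytes in the file before _() begins (assumed to be compiler startup code).
preamble = ADDR_MAIN - COM_LOAD_OFFSET

# Bytes between the start of _() and the start of write.
main_size = ADDR_WRITE - ADDR_MAIN

# Bytes left over after the write routine.
remaining = end_of_image - (ADDR_WRITE + WRITE_BYTES)

print(hex(end_of_image), preamble, main_size, remaining)
```

The leftover after the write routine comes to 15 bytes, the same as the message length, which at least suggests that the string literal is stored at the very end of the file. That reading is a guess; only the arithmetic is certain.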

The following pages provide a more in-depth introduction to W, from both a user's and a compiler programmer's perspective.

If you wish to download the W compiler and experiment with it yourself, feel free to do so by clicking this link.