The origin of CAR and CDR in LISP

alt.folklore.computers

The use of the keywords CAR and CDR to denote the head and tail of a list in LISP has always struck me as odd.

According to the LISP 1.5 Programmer's Manual (ISBN 0 262 13011 4), `A' and `D' stand for `address' and `decrement'. Page 36 says:

Lists are not stored in the computer as sequences of BCD characters, but as structural forms built out of computer words as parts of trees. In representing list structures, a computer word will be depicted as a rectangle divided into two sections, the address and decrement.

+-----------+-----------+
|   add.    |   dec.    |
+-----------+-----------+

Copy Address Register

Copy Decrement Register.

David Udin wrote:

David Udin

Charles Richmond replied to David

Edward Rice replied to David

Fredrick Backman replied to Charles

David Udin wrote in reply to questions by Kevin D. Quitt (italics):

On early IBM machines: 704, 709, 7040, 7090, 7094. Perhaps someone can tell us which was the first used for implementing lisp; I first encountered it on the 7094. Yes, it had only a 32k address space. The remaining bits were used to mark words during garbage collection and, as I recall, to indicate whether a word was an atom or s-expression.

Comment by Alan Kotok:

As I recall, the first LISP implementations were being done around 1961, which probably meant for the 7090. Steve Russell worked on it and maybe he can comment.

And that's why the PDP-10 was so popular for lisp, because it had 36 bit words, and therefore a 256K workspace--you could actually do something significant on it without worrying (too much). The PDP-10 also had really nifty half-word instructions which made the implementation faster and safer.

Rumor at the time had it that Alan R. Kotok, (one of?) the designer(s) of the PDP-10, was influenced by the needs of lisp implementation in designing the PDP-10.

Comment by Alan Kotok:

Indeed, that's me (without the R.). I was chief architect of the PDP-10, many, many years ago. The architecture dated back to 1963 and the PDP-6, which I worked on with Gordon Bell. And, yes, facilitating a good LISP implementation was an important consideration.

Comment by Alan Kotok:

Basically true. Perhaps Pete cares to amplify.

Comment by Pete Samson:

Yes, the subway program filled the 256K-word Fabri-Tek memory. To reduce the time spent garbage collecting, some of the numerical routines were written by hand in LAP (Lisp Assembly Program) to avoid consing of intermediate numerical results. Probably just visiting each station on the subway map could have been done in less than 24 hours, but we undertook a slightly different problem, to travel over each segment of right-of-way; it took more than 25 hours (all on a single fare, of course). Fortunately, the New York subway system is open all night.

Bill Sudbrink wrote:

Charles Richmond wrote

These names are hold-overs from the original implementation of LISP on the IBM 704. That machine had partial-word instructions to reference the address and decrement parts of a machine location. The a of CAR comes from "address", the d of CDR comes from "decrement". The c and r come from "contents of" and "register". Thus CAR could be read "contents of address part of register".

Bill Sudbrink replied again:

Allan J. Baum wrote:

The offset (or address part) of an inst. was in the low half. The decrement field was in the high half.

In index register ops, the decr. field was used to compare/modify (by addition) index registers, while the address in the low half was generally used as a branch displacement. Non index-reg ops used this field as extended opcode & indirect specifier selector.

Which still begs the question of why it was called the decrement field. Index regs could be transferred from either half of the accum or memory.

Ron Hunsinger replies:

Bingo! You've got it.

Except that the decrement didn't have to come from an index register. It could come from the decrement field of the instruction, as a literal.

The 704 and successor machines did not have clean addressing modes or a consistent instruction format, but many instructions broke the 36-bit instruction word into fields of 3:15:3:15.

3: Opcode (or opcode family in most cases, with the rest of the opcode coming from the other fields)

15: Decrement field, used in many instructions to provide a literal value to be subtracted from an effective address or a register. In other instructions, this was an extension of the opcode field.

3: Index field. The 7090 had three index registers, each of which corresponded to one bit of this field. Setting multiple bits logically ORed the registers to get the index value to be subtracted. (I could be wrong about this.) The 7094 had 4 additional index registers (hence the extra 4 in the model number), and the index field selected one of the 7 index registers (or 0 for no indexing). In some instructions, this was part of the opcode.

15: Address field, used in most instructions to specify a base address.
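The 3:15:3:15 split described above can be sketched in a few lines of Python. This is only an illustration, not code from any of the machines discussed: the names `Fields`, `split_word`, `prefix` and `tag` are assumptions made for the example, and the bit positions simply follow the description in this thread (address in the low half, decrement in the high half).

```python
from typing import NamedTuple

MASK3 = (1 << 3) - 1     # three-bit field
MASK15 = (1 << 15) - 1   # fifteen-bit field


class Fields(NamedTuple):
    """Hypothetical container for the four fields of a 36-bit word."""
    prefix: int      # 3 bits: opcode or opcode family
    decrement: int   # 15 bits: high half
    tag: int         # 3 bits: index field
    address: int     # 15 bits: low half


def split_word(word: int) -> Fields:
    """Split a 36-bit word into 3:15:3:15 fields, high bits first."""
    if not 0 <= word < (1 << 36):
        raise ValueError("word must fit in 36 bits")
    return Fields(
        prefix=(word >> 33) & MASK3,
        decrement=(word >> 18) & MASK15,
        tag=(word >> 15) & MASK3,
        address=word & MASK15,
    )


def join_word(prefix: int, decrement: int, tag: int, address: int) -> int:
    """Inverse of split_word: pack the four fields back into one word."""
    return (
        ((prefix & MASK3) << 33)
        | ((decrement & MASK15) << 18)
        | ((tag & MASK3) << 15)
        | (address & MASK15)
    )
```

Because each octal digit is three bits, the fields line up with octal digits: `split_word(0o300001000002)` gives prefix 3, decrement 1, tag 0 and address 2.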

(In reply to the last paragraph:)

Because of how the field was used in instruction words.

There were instructions to make it easier to modify the decrement and address fields of data words, because back then self-modifying code was the norm, and nobody bothered to distinguish between instructions and data.

Email from Alan Kotok

Email from Pete Samson

Email from Steve Russell

The 704 family (704, 709, 7090) had "Address" and "Decrement" fields that were 15 bits long in some of the looping instructions. There were also special load and store instructions that moved these 15-bit addresses between memory and the index registers (3 on the 704, 7 on the others).

We had devised a representation for list structure that took advantage of these instructions.

Because of an unfortunate temporary lapse of inspiration, we couldn't think of any other names for the 2 pointers in a list node than "address" and "decrement", so we called the functions CAR for "Contents of Address of Register" and CDR for "Contents of Decrement of Register".

After several months and giving a few classes in LISP, we realized that "first" and "rest" were better names, and we (John McCarthy, I and some of the rest of the AI Project) tried to get people to use them instead.

Alas, it was too late! We couldn't make it stick at all. So we have CAR and CDR.

As the 704 has 36 bit words, there were 6 bits in the list nodes that were not used. Our initial implementation did not use them at all, but the first garbage collector, commissioned in the summer of 1959, used some of them as flags.

Atoms were indicated by having the special value of all 1's in car of the first word of the property list. All 0's was NIL, the list terminator.
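As a rough illustration of the representation described above, here is a minimal Python sketch of list nodes packed into 36-bit words, with the first pointer in the address field and the second in the decrement field. It is not the original implementation: the `memory` list, `cons`, `make_atom` and `is_atom` are hypothetical names chosen for the example, and the six spare bits are simply left at zero.

```python
MASK15 = (1 << 15) - 1

NIL = 0              # all 0's: the list terminator
ATOM_MARK = MASK15   # all 1's in a 15-bit field: marks an atom

# Hypothetical word store. Cell 0 is left unused so pointer 0 can mean NIL.
memory = [0]


def cons(first: int, rest: int) -> int:
    """Allocate one word: 'first' in the address field, 'rest' in the decrement."""
    memory.append(((rest & MASK15) << 18) | (first & MASK15))
    return len(memory) - 1


def car(cell: int) -> int:
    """Contents of the address field (low 15 bits) of the word at 'cell'."""
    return memory[cell] & MASK15


def cdr(cell: int) -> int:
    """Contents of the decrement field (bits 18..32) of the word at 'cell'."""
    return (memory[cell] >> 18) & MASK15


def make_atom() -> int:
    """An atom: first word of its property list has all 1's in the car."""
    return cons(ATOM_MARK, NIL)


def is_atom(cell: int) -> bool:
    return car(cell) == ATOM_MARK


# Build the two-element list (a b) and walk it from left to right.
a = make_atom()
b = make_atom()
lst = cons(a, cons(b, NIL))

elements = []
node = lst
while node != NIL:
    elements.append(car(node))
    node = cdr(node)
```

After the loop, `elements` holds the pointers to the two atoms in order, which is the "first" and "rest" reading of CAR and CDR.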

We were attempting to improve on "IPL-V", (for Interpretive Processing of Lists - version 5) which ran on a 650. I believe that the 0 list terminator was used there, but I believe that the all 1's flag for atoms was original.

Hope this is enlightening.

Question by Richard Simmons to Steve Russell

Tom Eggers (here at the University of Colorado at Colorado Springs), tells me that "you were there" when they invented lisp, and you would know the answer to this question.

In the IBM 7094, the Contents of the Decrement Register and Contents of the Address Register were thus:

+---+---------------+---+---------------+
|   |      CDR      |   |      CAR      |
+---+---------------+---+---------------+

Answer from Steve Russell

I believe that we started writing list structures in the "natural" (to us English speakers) order, from left to right, before we had fixed on implementation details on the 704. (In fact, I believe that IPL-V wrote them that way.)

I don't remember how we decided to use the address for the first element, but I suspect it had to do with guessing that it would be referenced more, and that there were situations where a memory cycle would be saved when the pointer was in the address.

Hope this is sufficient. Sorry it's not a better story.
