Xerox Alto file system archive



Paul McJones

Revised 9 November 2017



Contents

A walk through the archive: people and software

Hardware

BCPL software

Mesa software

Smalltalk software

Lisp software



Oral histories

More on the archive

File names

Dump/load files

Disk-image files

Viewable formats

Raw files: endianness

File types

Provenance

A walk through the archive: people and software

In 1984, Butler W. Lampson, Robert W. Taylor, and Charles P. Thacker received the ACM Software System Award "For conceiving and guiding the development of the Xerox Alto System, which clearly demonstrates that a distributed personal computer system could provide a desirable and practical alternative to time-sharing."

Here is Butler Lampson's original memo motivating the project:

Why Alto. Xerox internal memorandum, December 19, 1972. Author's web site



Hardware

"Although a number of people in CSL and SSL contributed to the specification of the new system, Butler Lampson, Alan Kay, and Robert Taylor were the individuals primarily responsible for shaping the design. To the extent that CSL had project managers, I filled that role. My task was to convert the vision of Lampson, Kay, and Taylor into working hardware." [Thacker 1988]

Charles P. Thacker. Personal Distributed Computing: The Alto and Ethernet Hardware. In Adele Goldberg, editor. A History of Personal Workstations, Addison-Wesley, 1988. Co-author's website

Butler W. Lampson. Personal Distributed Computing: The Alto and Ethernet Software. In Adele Goldberg, editor. A History of Personal Workstations, Addison-Wesley, 1988. Author's web site

Two-part video recording of this talk: Collection of Computer History Museum

"The concept and structure of the Alto are due primarily to Chuck Thacker, Ed McCreight, Butler Lampson, and Alan Kay. The hardware described in this paper was designed by the authors together with Roger Bates, Tat Lam, Bob Metcalfe, and Severo Ornstein. The working environment, network, software, and microcode that grew on the Alto are due to hard work and fine craftsmanship contributed by many members of the Computer Science Laboratory and System Science Laboratory of the Xerox Palo Alto Research Center." [Thacker et al. 1979]

C. P. Thacker, E. M. McCreight, B. W. Lampson, R. F. Sproull, and D. R. Boggs. Alto: A personal computer. Xerox PARC report CSL-79-11, August 1979. Reprinted in Computer Structures: Principles and Examples, second edition, ed. Siewiorek, Bell and Newell, McGraw-Hill, 1981, pages 549-572. Author's web site

Robert M. Metcalfe and David R. Boggs. Ethernet: distributed packet switching for local computer networks. Xerox PARC report CSL-75-7, November 1975. PDF at bitsavers.org. Also published in Communications of the ACM, Volume 19, Number 7 (July 1976), pages 395-404. ACM Digital Library

Hardware documentation: http://www.bitsavers.org/pdf/xerox/alto/

BCPL

BCPL was designed by Martin Richards at MIT in 1967 based on his previous work implementing CPL at the University of Cambridge. The language and its compiler were designed to be easily ported to new computers. At PARC, BCPL was used for early work on Data General Novas, using a compiler ported by James Curry based on the TX-2 implementation at MIT Lincoln Lab. When the Alto project began, it was natural to continue using BCPL, which was used for most of the original system, server, and application software.

Operating System (OS), packages, and utilities

The Alto Operating System (OS) was designed by Butler Lampson, based on Stoy and Strachey's OS6. It was implemented by Lampson with Gene McDaniel, Robert F. Sproull, and David R. Boggs.

Butler Lampson. Alto OS Design Notes. Internal memo, Xerox PARC, February 1973. Author's web site

Sources: [Indigo]<AltoSource>OSSOURCES.DM!2>

Manual: [_cd8_]<altodocs>os.press!2

B. Lampson and R. Sproull. An open operating system for a single-user machine. ACM Operating Systems Review, Volume 13, Number 5 (Dec. 1979), pages 98-105. Author's web site



Additional system facilities in the form of packages or libraries were designed and implemented by a number of people, including Basic File System (BFS) (Lampson), B-trees (Ed McCreight), floating-point arithmetic (Sproull), sorting (McCreight), and cubic splines (Patrick C. Baudelaire, Robert Flegal, and Robert F. Sproull).

A number of utility programs were written for the Alto OS.

James H. Morris, Jr. designed and implemented the first version of the Scavenger, which scanned a disk pack and restored its file system to a consistent state. David R. Boggs later reimplemented the Scavenger.

CopyDisk was designed and implemented by David R. Boggs. As its name implies, it could copy a filesystem from one disk pack to another, initially on an Alto with two disk drives. Later versions of CopyDisk could operate between two Altos connected by the Ethernet, or between an Alto running CopyDisk and another Alto running the IFS file server.

The Swat debugger was designed and implemented by Morris, and was later rewritten by Boggs.

The Alto assembler was written by Ed McCreight.

Communication

"Pup is the name of an internet packet format (PARC Universal Packet), a hierarchy of protocols, and a style of internetwork communication. The fundamental abstraction is an end-to-end media-independent internetwork datagram. Higher levels of functionality are achieved by end-to-end protocols that are strictly a matter of agreement among the communicating end processes." [Boggs et al. 1979]

Pup was designed and first implemented (in BCPL) by David R. Boggs, John F. Shoch, Edward A. Taft, and Robert M. Metcalfe.

Alto sources: [Indigo]<AltoSource>PUPSOURCES.DM!4>

Documentation: [_cd8_]<pup>

TENEX sources: [_cd8_]<pup>

David R. Boggs, John F. Shoch, Edward A. Taft, and Robert M. Metcalfe. Pup: An Internetwork Architecture. Xerox PARC report CSL-79-10, July 1979. PDF at bitsavers.org. Also published in IEEE Transactions on Communications, COM-28 (1980), pages 612-623. IEEE Xplore

The Alto File Transfer Program (FTP) was designed and implemented by David R. Boggs.

The Alto Telnet program (Chat) was designed and implemented by Robert F. Sproull. Sproull later enhanced Chat to allow the Alto to serve as a graphics terminal to a remote computer, usually MAXC, PARC’s timesharing system that ran Tenex and Interlisp. The system is described in:

Robert F. Sproull. Raster Graphics for Interactive Programming Environments. Xerox PARC report CSL-79-6, June 1979. PDF at bitsavers.org

Printing

"In the meantime, various people at Xerox were building a series of experimental raster printers. The first of these was called XGP, the Xerox Graphics Printer, and had a resolution of 192 dots to the inch. Xerox made XGP's available to certain universities, and by 1972 they were in use at Carnegie-Mellon, Stanford, MIT, Caltech, and the University of Toronto. Each of those organizations produced its own hardware and software interfaces. The XGP is historically interesting only because it is the first raster printer to gain substantial use by computer scientists, and was the arena in which a lot of mistakes were made and a lot of lessons learned. To replace the XGP, Xerox PARC developed a new printer called EARS, and then another newer printer called Dover. After the agony of converting software from XGP to EARS, various Xerox people realized that applications programs generating files for the XGP or for EARS should not be tied to the device properties of the printer itself. Bob Sproull and William Newman, of Xerox PARC, developed a relatively device-independent page image description scheme, called "Press format", which was used to instruct raster printers what to print."

[Brian Reid, "PostScript and Interpress: a comparison", LASER-LOVERS distribution list, March 1985]

The Alto device-independent print file format (Press) was designed by William Newman and Robert F. Sproull; Joe Maleson extended it with a capability for digitally sampled (scanned) images, which were printed via halftoning.

The PressEdit print file manipulation program was designed and implemented by William M. Newman.

The Fred font editor was designed and implemented by Patrick C. Baudelaire.

The PrePress font manipulation program was designed and written by Robert F. Sproull, then extended by Lyle Ramshaw.

The Spruce imager for Dover printers was designed and implemented by Robert F. Sproull and Dan Swinehart.

The Press imager was designed and implemented by Robert F. Sproull and Patrick C. Baudelaire. It converted Press files to raster page images in various formats (black and white, color, different sizes, scanning directions, etc.). The bit maps could then be transmitted to printers of various sorts, often using a custom hardware interface. Although it was slow, the Press imager allowed experimentation with different color correction, halftoning, and scan-conversion algorithms. It was used with the Slot/3100 and the Pimlico color printer.

Storage

The Alto "Interim" File Server (IFS) was designed and implemented by David R. Boggs and Ed Taft.

Applications

Bravo was the first "WYSIWYG" word processing system. It was designed by Butler Lampson and Charles Simonyi, and was implemented by Simonyi, Thomas J. Malloy, Carol Hankins, Greg Kusnick, Kate Rosenbloom, and Bob Shur. Simonyi later moved to Microsoft, where he led the application software group, including Microsoft Word.

Sources: [Indigo]<AltoSource>BRAVOSOURCES.DM!1>

Manual: [_cd8_]<altodocs>bravomanual.press!2

File format: [_cd8_]<altodocs>bravo-file-format.press!2

Charles Simonyi. Meta-Programming: a Software Production Method. Ph.D. Thesis, Stanford University and Xerox PARC report CSL-76-7, December 1976. Bravo began as Project B for the thesis research. PDF at PARC.com

The Gypsy publication system was designed and implemented by Larry Tesler and Tim Mott. Gypsy used Bravo's text-editing routines, but provided the first modeless user interface, based on a model of selection, copy and paste.

The Markup bitmap editor was designed and implemented by William M. Newman.

The Draw vector editor was designed and implemented by Patrick C. Baudelaire.

The SIL editor and CAD system was designed and implemented by Charles P. Thacker. Ed McCreight and Roger Bates contributed additional CAD tools, including Analyze, Gobble, Route, Build, and NetDelays.

Charles P. Thacker. SIL-a simple illustrator for CAD. In Sheldon S. L. Chang, Editor-in-Chief, Fundamentals Handbook of Electrical Computer Engineering, Volume 3, pages 477-489, John Wiley & Sons, 1983. Copyright © by John Wiley & Sons, Inc. Posted here by permission of the publisher. PDF

Sources, documentation: [_cd6_]<sil>, http://www.bitsavers.org/pdf/xerox/alto/SilMemos.pdf

The Descriptive Directory System (DDS) was designed and implemented by L. Peter Deutsch.

The Neptune file manipulation system was designed and implemented by Keith Knox.

Mesa

The Mesa programming language was designed by Charles M. Geschke, Butler Lampson, Jim Mitchell, James H. Morris, Jr., and Edwin H. Satterthwaite, with contributions from Alan Kay, Charles Simonyi, and John Wick. Mesa evolved from the Modular Programming Language (MPL), which was part of the Modular Programming System (MPS) project carried out jointly by PARC and the SRI International Augmentation Research Center (ARC). One of the goals of MPS was to facilitate migrating ARC's oNLine System (NLS) from the PDP-10 to smaller computers. MPL was designed by Butler Lampson and James G. Mitchell with contributions from others at SRI and PARC.

As the Alto project proceeded, it was decided to retarget the MPL compiler to the Alto. The new language, renamed Mesa, had a richer type system and stronger type checking than MPL. Its syntax was based on Pascal; its type system was influenced by Pascal and Algol 68. Mesa supported modular programming with separate interface and implementation modules, features that in turn influenced Wirth's Modula-2.

Chuck Geschke and Ed Satterthwaite designed and implemented the Mesa compiler for the Alto, and Richard Johnson and John Wick wrote a Mesa version of the Alto operating system, which also served as the runtime for Mesa programs. By the summer of 1976, the Mesa compiler had been rewritten in Mesa and brought up on the Alto. Mesa was used for much of the later Alto software, such as the Laurel email client and the Grapevine distributed email transport and name service. It was also used for products such as the Star Office Automation system, and a successor language, Cedar Mesa, was used for many later research projects running on successors to the Alto.

The Laurel electronic mail client was designed and implemented by Doug Brotz, Roy Levin, Mike Schroeder, and Ben Wegbreit.

The Grapevine distributed mail transport and name service was designed and implemented by Andrew Birrell, Roy Levin, Roger Needham, and Michael Schroeder.

Sources: [_cd8_]<grapevine>

Andrew D. Birrell, Roy Levin, Michael D. Schroeder, and Roger M. Needham. Grapevine: an exercise in distributed computing. Communications of the ACM, Volume 25, Number 4 (April 1982), pages 260-274. ACM Digital Library, Author's web site

The Alto Gateway was designed and implemented by Hal Murray.

Smalltalk

In 1987, Adele Goldberg, Daniel H.H. Ingalls, Jr., and Alan C. Kay received the ACM Software System Award for "seminal contributions to object-oriented programming languages and related programming techniques. The theories of languages and development systems known as 'Smalltalk' laid the foundation for explorations in new software methodologies, graphical user interface designs, and forms of on-line assistance to the software development process."

Smalltalk-72 was used fairly extensively at PARC, but its syntax and semantics differed considerably from those of modern Smalltalk. Smalltalk-76 introduced classes, and Smalltalk-80 is the version that became widely known and used outside of Xerox PARC.

Lisp

"The ByteLisp project began in 1973 as an outgrowth of the author's previous work on small Lisp systems and of other work being done at Xerox on personal computing. The proposed architecture of the system was described in a paper published in August 1973. A small group led by the author began implementation of the system around that time, first on a Data General Nova and then on the Alto hardware ... which became available in 1974. The system was running around mid-1975, but far too slowly to be usable. In the course of the next year and a half, the group rewrote most of the non-Lisp-implemented kernel of the system in Lisp, added new microcode for the arithmetic functions based on some dynamic measurements, wrote down a precise definition of the lnterlisp dialect we were implementing, and designed and built a novel garbage collection method. The system reached essentially its present form in early 1977; by that time it had successfully run a number of large Interlisp programs, although still too slowly for any real use. Since then we have done no work on this system. Other papers at this conference describe an implementation of essentially the same system architecture on newer hardware. ... In addition to the author [L. Peter Deutsch], Dan Bobrow made major contributions to the design of the Alto Lisp system. Willie Sue Haugeland [now Willie Sue Orr], the other member of the original implementation group, has been largely responsible for the implementation throughout the development of the system. J Moore wrote the precise specification of Interlisp, without which the project could not have hoped to emulate existing Interlisp adequately. Larry Masinter and Warren Teitelman provided invaluable help in debugging the specification and moving the Interlisp system towards machine-independence." [Deutsch 1980]

So far, none of the source code for the Alto implementation of Lisp has been located. However, the system was the basis for Interlisp-D, which ran on various microcoded "D machines", including the Dorado, Dolphin, Dandelion, and Daybreak. A later implementation of the virtual machine in C led to an implementation for Sun and other Unix workstations and Linux.

Oral histories

More on the archive

File names

The files in the archive are named according to the conventions of the Alto Operating System. Each file has a name consisting of letters (upper and lower case can be used interchangeably), digits, and any of these special characters: + - . ! $

The name is usually divided into two parts by a period: the main name before the period, and the extension after the period. A name can also have a version number, which is a number that comes at the end of the name, preceded by an exclamation point. For example, in the name "bravomanual.press!2", the main name is "bravomanual", the extension is "press", and the version number is 2.

When Alto files were stored on an IFS file server, they were grouped in directories and subdirectories. A directory name can contain the same characters as a file name. A directory path is a list of names of directories, with "<" at the beginning and ">" after each directory name. An example is "<BravoX>Fonts>".

The archive includes files that had been stored on several different IFS file servers, so a full name for a file specifies a server (with square brackets around the name), a directory path, and a file, for example: "[Indigo]<BravoX>NEPTUNE>NEPTUNE-MANUAL.PRESS!2". The servers present in this archive include Filene, Ibis, Indigo, Io, Ivy, and Pixel. There are two additional "pseudo-servers", _cd6_ and _cd8_, corresponding to additional files that had been restored in an earlier project at Xerox PARC.
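
As a rough illustration of these naming conventions, the following Python sketch splits a full name into server, directory path, main name, extension, and version. The names parse_alto_name and AltoName are hypothetical and are not part of the archive's tooling. Splitting at the last period when a name contains more than one is an assumption, since the conventions above only say that a name is usually divided by a period.

```python
from typing import List, NamedTuple, Optional


class AltoName(NamedTuple):
    """Parts of a full archive file name (hypothetical helper type)."""
    server: Optional[str]
    directories: List[str]
    main: str
    extension: Optional[str]
    version: Optional[int]


def parse_alto_name(full_name: str) -> AltoName:
    rest = full_name.strip()

    # Optional server name in square brackets, e.g. "[Indigo]".
    server = None
    if rest.startswith("["):
        close = rest.index("]")
        server = rest[1:close]
        rest = rest[close + 1:]

    # Optional directory path: "<" at the start, ">" after each directory.
    directories: List[str] = []
    if rest.startswith("<"):
        parts = rest[1:].split(">")
        directories, rest = parts[:-1], parts[-1]

    # Optional version number after the last exclamation point.
    version = None
    head, bang, tail = rest.rpartition("!")
    if bang and tail.isdigit():
        rest, version = head, int(tail)

    # Assumption: the extension follows the last period in the name.
    main, dot, ext = rest.rpartition(".")
    if not dot:
        main, extension = rest, None
    else:
        extension = ext

    return AltoName(server, directories, main, extension, version)


if __name__ == "__main__":
    print(parse_alto_name("[Indigo]<BravoX>NEPTUNE>NEPTUNE-MANUAL.PRESS!2"))
```

Because upper and lower case can be used interchangeably, a caller comparing names or extensions would probably want to fold case first; the sketch leaves the original spelling untouched.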

The archive has a web page listing the servers. For each server there is a web page listing the top level directories, with an additional web page for each subdirectory. Each directory page lists subdirectories and files in separate sections. Each file is generally presented in two different ways: in a viewable format (e.g., HTML, PDF), and in its original format (referred to as "raw" in the directory listings).

Dump/Load files

An Alto file with the extension "dm" or "DM" is called a Dump/Load file and contains a collection of files, analogous to a Windows "zip" file or a Unix "tar" file. In this archive, each Dump/Load file has been unpacked to a new directory with the same name as the Dump/Load file. For example, corresponding to "[Indigo]<AltoSource>ALTOUSERSHANDBOOK.DM!2" is "[Indigo]<AltoSource>ALTOUSERSHANDBOOK.DM!2>". The corresponding web URL is ".../Indigo/AltoSource/ALTOUSERSHANDBOOK.DM!2_", with an underscore at the end.

Disk-image files

An Alto file with the extension "altodisk" or "bfs" or "copydisk" is an image of an entire Alto disk pack, analogous to an "iso" CD-ROM image. In this archive, each disk image file has been unpacked to a new directory with a name based on that of the disk image file. For example, corresponding to "[Indigo]<BasicDisks>BcplProg.BFS!13" is "[Indigo]<BasicDisks>BcplProg.BFS!13>". The corresponding web URL is ".../Indigo/BasicDisks/BcplProg.BFS!13_", with an underscore at the end.
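
The mapping from a Dump/Load or disk-image file name to the URL of its unpacked directory can be sketched in Python as follows. This is inferred only from the two examples given here; the function names are hypothetical, the elided URL prefix ("...") is left to the caller, and any URL escaping the web server may apply is not modeled.

```python
# Extensions that mark a file unpacked into a directory in the archive.
CONTAINER_EXTENSIONS = {"dm", "altodisk", "bfs", "copydisk"}


def is_container(file_name: str) -> bool:
    """True if file_name (without server or path) is a Dump/Load or disk-image file."""
    base, bang, _ = file_name.rpartition("!")
    if not bang:                      # no version number present
        base = file_name
    _, dot, ext = base.rpartition(".")
    # Case is folded because upper and lower case are interchangeable.
    return bool(dot) and ext.lower() in CONTAINER_EXTENSIONS


def unpacked_relative_url(full_name: str) -> str:
    """Return the URL path, relative to the archive root, of the unpacked directory."""
    name = full_name.strip()
    if not name.startswith("["):
        raise ValueError("expected a full name starting with [server]")
    close = name.index("]")
    server = name[1:close]
    segments = name[close + 1:].lstrip("<").split(">")
    if not is_container(segments[-1]):
        raise ValueError("not a Dump/Load or disk-image file: " + segments[-1])
    # Directories become URL path segments; a trailing underscore marks
    # the unpacked directory, as in the examples in the text.
    return "/".join([server] + segments) + "_"


if __name__ == "__main__":
    print(unpacked_relative_url("[Indigo]<AltoSource>ALTOUSERSHANDBOOK.DM!2"))
```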

Viewable formats

Many of the files (such as most of the program source code files) consisted of ASCII text; each of these has been rendered as an HTML file with a <pre> element surrounding the body of the file.

Some files (such as documentation, memos, and, sometimes, program source code) were created by the Bravo word processor, which embedded formatting information along with the text. Each of these files has been rendered as an HTML file that uses CSS styles to mimic the intended Bravo formatting (such as font changes, bold and italic, and indentation).

Some files originally created by document editors such as Bravo had been converted to the Press device-independent print file format. Each of these has been rendered as a PDF file that attempts to mimic the intended Press formatting (including font changes, bold and italic, etc.) and graphical elements (including rectangles, Bezier curves, and scanned images).

All other files have been rendered as an "octal dump" of up to the first 100,000 bytes, using the Unix command line "od -t oC -N 100000".

The renderings of Bravo and Press files are still buggy and incomplete, but in most cases are good enough to convey the intended content. In particular, support is lacking for the various Alto fonts: encodings, metrics (widths), and glyphs. The Bravo document profile is currently ignored, as are named tabs. Press graphics (rectangles, dots, and objects, that is, filled paths consisting of straight and curved line segments) are supported.

Viewers are still needed for Sil documents and for fonts (.al and .strike for the screen, and various formats for the printer; see http://www.bitsavers.org/pdf/xerox/alto/printing/).

Raw files: endianness

Each link of the form (raw) in the directory listings gives access to the original sequence of bytes for the corresponding file. This is only useful if you understand the file type of that file. Note that the Alto had 16-bit words and was "big-endian": the high-order byte of a 16-bit word is stored at an even byte address, and the low-order byte of the same word is stored at the next higher odd byte address.
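
For readers working with the raw files on a modern (usually little-endian) machine, here is a minimal Python sketch of reassembling 16-bit Alto words from the byte sequence. The function names are hypothetical, and padding an odd-length file with a zero byte is an assumption made only so the example handles any input.

```python
import struct
from typing import List


def bytes_to_words(raw: bytes) -> List[int]:
    """Interpret raw bytes as big-endian 16-bit words."""
    if len(raw) % 2:
        raw += b"\x00"  # assumption: pad a trailing odd byte with zero
    count = len(raw) // 2
    # ">" selects big-endian order; "H" is an unsigned 16-bit integer.
    return list(struct.unpack(f">{count}H", raw))


def words_to_bytes(words: List[int]) -> bytes:
    """Inverse of bytes_to_words for even-length data."""
    return struct.pack(f">{len(words)}H", *words)


if __name__ == "__main__":
    data = bytes([0x12, 0x34, 0xAB, 0xCD])
    # The byte at the even address (0x12) is the high-order half of word 0.
    print([oct(w) for w in bytes_to_words(data)])
```

Octal is used in the printout because Alto documentation conventionally writes word values in octal.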

File types

Often, the type of the file can be determined from the extension in the file's name. Here is a short glossary of file types:

.al         Alto screen font represented as a bitmap for each character
.altodisk   See .bfs
.asm        Alto assembler source program
.bcd        Relocatable object file created by Mesa compiler
.bcpl       BCPL source program
.bfs        Contents of a complete Alto disk pack represented as a file
.boot       A file from which an Alto can be boot-loaded (operating system, utility, etc.)
.bootmesa   A configuration file used by the Mesa linker to build a bootfile
.br         Relocatable object file created by BCPL compiler
.bravo      Document created by Bravo word processor (viewable as HTML in the archive)
.cm         Command file interpreted by Alto Executive
.copydisk   See .bfs
.d          BCPL definitions, incorporated into a program via a get statement
.decl       Alias for .d
.dm         Dump/Load file, packaging a set of files
.image      Executable form of program written in Mesa (started with RunMesa.run)
.laurel     Relocatable object file created by Mesa compiler intended as an add-on to the Laurel mail client application
.mac        Tenex PDP-10 assembly language
.mc         Microcode assembler source program -- specifically for D0 ?????
.mesa       Mesa source program
.mu         Alto microcode assembler source program
.press      Device-independent print-ready file, created by various applications (viewable as PDF in the archive)
.pub        Document prepared as input to PUB formatting program (running on PDP-10 rather than Alto)
.run        Executable form of program written in BCPL and/or assembler (created by BLDR)
.strike     Another format for Alto screen font, optimized for use with BitBlt (bit-boundary block transfer)
.symbols    Symbolic debugging information associated with a Mesa .image file
.syms       Symbolic debugging information associated with a BCPL .run file

Provenance

The files in this archive originally lived on a set of IFS file servers at Xerox PARC. Over the years, various techniques were used to archive the files to offline storage in the form of 9-track magnetic tapes. At first this was done by a program called BSYS running on MAXC, a computer that ran the TENEX operating system. BSYS was later replaced by a program called Archivist, which ran on a Dorado (a much faster successor to the Alto) and allowed a user to specify a set of files to be archived. By around 1991 the IFS servers and the Archivist program were no longer being used, and a new suite of programs was written to transfer files from the old 9-track tapes to 8mm digital tape cartridges. The transfer preserved the original record structure as defined by the Archivist program but included one extra record consisting of a file named rosetta.tar containing, as explained in a README file:

... the C sources for the programs used to read the maxc and archivist 9 track, 1/2" tapes, the shell scripts used to drive those programs during the transfer of the 9 track, 1/2" tape data to the 8mm tapes, and the Cedar/Mesa sources for the Archivist program which was originally used to archive the data to 9 track, 1/2" tape.

A collection of the 8mm tapes was later transferred to CD-ROM, again preserving the Archivist record structure. In 2011, Xerox PARC donated this CD-ROM to the Computer History Museum, as lot X6195.2011. In addition to the copies of the Archivist records, that CD-ROM included two additional directories, cd6 and cd8, that had been restored from archive tapes at some time before 2011. Starting in the fall of 2013, a program named restore_alto_files was written to read the archive tape images, unpack Alto Dump/Load files and Alto disk images, and create web pages for browsing the directories and viewing various file types (including Bravo and Press).

Here are copies of rosetta.tar.gz and restore_alto_files.tar.gz.