Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site uw-beaver
Path: utzoo!watmath!clyde!cbosgd!cbdkc1!desoto!packard!ihnp1!ihnp4!houxm!vax135!cornell!uw-beaver!laser-lovers
From: laser-lovers@uw-beaver
Newsgroups: fa.laser-lovers
Subject: PostScript and Interpress: a comparison
Message-ID: <889@uw-beaver>
Date: Fri, 1-Mar-85 19:08:05 EST
Article-I.D.: uw-beave.889
Posted: Fri Mar 1 19:08:05 1985
Date-Received: Sun, 3-Mar-85 03:18:18 EST
Sender: daemon@uw-beaver
Organization: U of Washington Computer Science
Lines: 712

From: Brian Reid <reid@Glacier>

This essay offers a comparison of two modern schemes for controlling what laser printers print. One scheme, called PostScript, is offered by Adobe Systems, Inc.; the other scheme, called Interpress, is offered by the Xerox Corporation. A discussion of these two schemes has provoked a considerable amount of interest in this forum recently. I have for some time been promising (threatening?) to provide my interpretation of the difference between the two systems. It is long enough and detailed enough that you will certainly never want to read another word on the topic after you read it, but given the nature of computer mail systems you almost certainly will be given the opportunity.

--------------------------------------------------------------------------

To a first order, PostScript and Interpress are indistinguishable. What I mean by that is that by comparison with all other current techniques for page image representation, the two can be considered to be nearly identical. I believe that it is worth looking at how they got to be that way; their similarities and differences can best be understood with a proper historical perspective.

Part I: History

The Evans and Sutherland Computer Corporation has for quite a number of years sold very expensive, very powerful graphics devices for CAD/CAM and for real-time simulation.
The CAD/CAM machine is called The Picture System; the simulation machines are custom-built for each application. Custom simulation graphics machines are used for such purposes as providing the windshield graphics for military flight simulation systems--emulating what a pilot would see if he were looking out the window of a real airplane. These graphics systems use a very clever graphics model, developed by Ivan Sutherland and others, which is based on coordinate system transformations and line drawing. Although the Evans and Sutherland company is primarily in Salt Lake City, they had a small research office in Mountain View (California) in the early 1970's. John Warnock was in charge of it, and John Gaffney worked for Warnock. One of the activities of the Mountain View office was to develop software for producing 3-dimensional graphical databases both for the Picture System and for the simulation machines. Working with Warnock, Gaffney had by 1975 programmed and documented and released the first version of a programming language that was called "The Evans and Sutherland Design System". Gaffney came to E&S from graduate school at the University of Illinois, where he had used the Burroughs B5500 and B6500 computers. Their stack-oriented architectures made a big impression on him. He combined the execution semantics of the Burroughs machines with the evolving Evans and Sutherland imaging models, to produce the Design System. Like all successful software systems, the Design System slowly evolved as it was used, and many people contributed to that evolution. John Warnock joined Xerox PARC in 1978 to work for Chuck Geschke. There he teamed up with Martin Newell in producing an interpreted graphics system called JAM. "JAM" stands for "John And Martin". 
JAM had the same postfix execution semantics as Gaffney's Design System, and was based on the Evans and Sutherland imaging model, but augmented the E&S imaging model by providing a much more extensive set of graphics primitives. Like the later versions of the Design System, JAM was "token based" rather than "command line based", which means that the JAM interpreter reads a stream of input tokens and processes each token completely before moving to the next. Newell and Warnock implemented JAM on various Xerox workstations; by 1981 JAM was available at Stanford on the Xerox Alto computers, where I first saw it. In the meantime, various people at Xerox were building a series of experimental raster printers. The first of these was called XGP, the Xerox Graphics Printer, and had a resolution of 192 dots to the inch. Xerox made XGP's available to certain universities, and by 1972 they were in use at Carnegie-Mellon, Stanford, MIT, Caltech, and the University of Toronto. Each of those organizations produced its own hardware and software interfaces. The XGP is historically interesting only because it is the first raster printer to gain substantial use by computer scientists, and was the arena in which a lot of mistakes were made and a lot of lessons learned. To replace the XGP, Xerox PARC developed a new printer called EARS, and then another newer printer called Dover. After the agony of converting software from XGP to EARS, various Xerox people realized that applications programs generating files for the XGP or for EARS should not be tied to the device properties of the printer itself. Bob Sproull and William Newman, of Xerox PARC, developed a relatively device-independent page image description scheme, called "Press format", which was used to instruct raster printers what to print. As part of an extensive grant program to selected universities, Xerox donated Dover printers and made documentation of the Press format available under a nondisclosure agreement. 
As far as I know, that nondisclosure agreement has never been lifted, though information about Press format has been widely enough distributed that by 1982 researchers at the Swiss Federal Institute of Technology (EPFL) at Lausanne had given conference papers about their own independent implementation of Press format. Press format was a smashing success; it revolutionized laser printing technology in the academic and research communities, and stimulated a large number of people to think about issues of device-independent print graphics. Nevertheless, Press format had its limitations, and various people felt the need to revise the basic design. Sproull left Xerox in 1978 to become a professor of computer science at CMU. Newman returned home to England to become an independent consultant. Martin Newell left Xerox to join Cadlinc Corp. Warnock and Geschke remained at Xerox. While at CMU, Sproull began making plans for a new version of Press that would combine the graphics model of JAM with the page image description properties of Press. Sproull returned to Xerox for a sabbatical leave in 1982, and enlisted the help of Butler Lampson in the creation of the new page image description language that Warnock dubbed "Interpress". The name caught on. While it is difficult to separate the contributions made by Sproull and Lampson, it is not incorrect to say that Lampson and Warnock produced the execution model of Interpress while Sproull and Warnock produced the imaging model. It is also approximately correct to characterize this first version of Interpress as being derived from the graphics model and execution model of JAM with additional protection and security mechanisms derived from experience with programming languages like Euclid and Cedar, and a careful silence on the issue of fonts. The trio worked under Geschke's direction, and Geschke was responsible for refereeing disagreements and for making certain that the resulting design was acceptable to the rest of Xerox. 
My own involvement with the Interpress effort is difficult to explain. Sproull was my thesis adviser at CMU; we had discussed many of the issues in page description languages at length. As a consultant to PARC during the Interpress design work, my primary activity was one of writing or rewriting the Interpress materials. I also represented a "consumer" point of view rather than a "designer" point of view, and often complained about aspects of the evolving language. I feel uncomfortable discussing the issues involved in the transition of Interpress from an artifact of the research lab to a marketable product. I shall therefore not discuss them. During this transition phase Geschke and Warnock left PARC (December 1982) to start Adobe Systems, Sproull returned to CMU (June 1983), and Lampson left PARC to join DEC Research (November 1983). Warnock had various philosophical differences with the final Interpress design, and he voiced those differences to the rest of the Interpress group at every opportunity. At Adobe, Geschke and Warnock saw the opportunity to try again, with a design group composed of people who shared his ideology. They enlisted Doug Brotz, a Xerox PARC researcher who had had no involvement with any of the Press/JAM/Interpress world, to join them in developing a new page description language named PostScript, based on combining the execution model and imaging model of JAM with a protection structure more reminiscent of C or the Unix shell than of Euclid or Cedar. While not at all a copy of JAM, PostScript resembles JAM more than it resembles Interpress. PostScript also embraced various Unix notions, such as the use of text streams to convey information. On March 15, 1984, Adobe shipped its first PostScript manual to a potential customer. That PostScript manual was printed on a PostScript printer using a Times Roman font licensed from Allied corporation and digitized by Adobe. 
At that time all aspects of the Interpress project were still very proprietary, and it appeared to me that Xerox had no interest in releasing them. However, on April 25, 1984, I received a Xerox press release announcing the availability of Interpress documentation. I finally managed to get my hands on a copy of the Interpress documentation in February of 1985, and was quite surprised to discover that the Interpress documentation had not been printed on an Interpress printer, but was instead printed on a Press format printer, using the same Times-like and Helvetica-like fonts that I had become familiar with at CMU and Stanford on the Dover printers.

----------------------------------------------------------------------

Part II: Comparison

Part I outlined the history of PostScript and of Interpress, as I have been able to determine it. With that historical background, I now offer a comparison of the two languages.

While there are quite a number of extant schemes for the description of printed images, most of them are better described as "data structures" than as "languages". In particular, only PostScript and Interpress are directly executable.

Languages can be compared at several different levels. Languages have a lexical representation, a syntax, a semantic model, an intended style of usage, and implementation considerations.

LEXICAL CONSIDERATIONS

The lexical properties of a language define the way the tokens of the language are represented in terms of bits, bytes, or characters. The FORTRAN language was defined in terms of a particular character set, which the implementor was expected to use. The ALGOL language was defined in terms of keywords and symbols, and the language definition left the implementor free to choose how he would represent those keywords in terms of characters available on his computer. For example, the FORTRAN definition of a "DIMENSION" statement is that it is the letter "D" followed by the letter "I" followed by the letter "M", etc.
The ALGOL definition of the "BEGIN" keyword was merely that it was a keyword; the ALGOL standard document used boldface to identify keywords. When ALGOL is implemented on computers whose character sets include boldface, the implementors normally use the boldface characters as a way of identifying keywords. When ALGOL is implemented on other computers, the implementors choose other schemes for identifying keywords, such as putting them in quotes or putting them in all capital letters. Both PostScript and Interpress have an operator called MOVETO, and in both languages it does exactly the same thing, which is identical to what the MOVETO operator did on the Evans and Sutherland hardware that spawned this graphics model. Let's look at how that operator would be represented in the two languages. The PostScript language is defined in terms of characters, like FORTRAN. The definition of the PostScript operator "MOVETO" is the letter "M" followed by the letter "O" followed by the letter "V", etc. The Interpress language is defined in terms of keywords; the definition of the Interpress operator "MOVETO" is that it is a keyword in the ALGOL sense. The Interpress 2.1 standard suggests that MOVETO can be represented with the serial number 25 in a standard encoding that the standard provides, but the definition of the MOVETO keyword is independent of the choice of encoding. Since PostScript is defined in terms of sequences of characters, it is always possible to assume that a PostScript file can be transmitted over any link capable of sending characters, and can be stored in any device capable of holding characters. Since Interpress is defined more abstractly, it is not necessarily possible to make any assumptions at all about a particular Interpress file. 
However, any Interpress encoding can be translated into any other Interpress encoding, so it is always possible to take an Interpress file and translate it into a stream of characters which will then have properties identical to PostScript's. Conversely, it is always possible to translate a PostScript program into a tokenized keyword form, though the PostScript standard does not suggest any particular tokenization scheme. It is worth mentioning that the word "token" is slightly overloaded here. A "tokenization scheme" is a means of doing data compression, wherein a sequence of characters is called a "token" and is replaced by a token number, which will occupy less space. However, a language can have tokens without having a tokenization scheme. Both PostScript and Interpress have an execution semantics that is defined in terms of things called "tokens". The Interpress tokens are normally represented by tokenization schemes--i.e. replaced with integers--while the PostScript tokens are normally left as sequences of characters. In later sections of this message the word "token" will be used to mean either the PostScript kind of token or the Interpress kind of token; by the time they get to the interpreter they are roughly the same thing. The Interpress 2.1 standard defines a particular encoding of Interpress, and gives bit and byte formats, decimal integer operator numbers, and so forth. This encoding is a full binary encoding, using all 8 bits of each byte, which means that it cannot always be sent over a serial character link. The Interpress standard encoding of a page description normally occupies a smaller number of bytes than the equivalent PostScript character representation. This is possible because binary encodings make more efficient use of the bits. Interpress files are clearly intended to be transmitted via XNS protocols over Ethernet. 
In its current form, without further processing or re-encoding, Interpress is not suitable for transmission over character-protocol lines. PostScript files are clearly intended to be transmitted over character-protocol lines. Like all character stream protocols, PostScript can also be transmitted over Ethernet, but a PostScript file will use more bytes than the corresponding Interpress file.

Text files such as PostScript sources are highly redundant (i.e. they make inefficient use of their bits) and can be run through data compression programs (such as the Unix "compact" program) to reduce the amount of space they occupy in storage and during transfer. Data compression techniques will probably not yield much further compression of Interpress files, because the information is already quite tightly packed. After compression of both, the PostScript and Interpress representations of an image will likely occupy approximately the same number of bits.

SYNTACTIC CONSIDERATIONS

The syntactic issues (or issues of syntax, if you will) of a language are the means by which an interpreter for the language distinguishes variables from operators from constants from function calls from quoted strings, and by which it determines whether or not a certain sequence of characters or tokens is in fact a "legal" construct in the language.

As languages in general go, both PostScript and Interpress are remarkably free of syntax. As token-oriented postfix languages, each token of the language is "executed" as soon as it is identified, and that execution will either succeed or fail depending on the state of the execution environment at that point. Nevertheless, both languages have a small amount of syntax, though they differ radically in the nature and application of this syntax. In fact, the primary area in which the PostScript language and the Interpress language are incontrovertibly and irrevocably different is in their syntax.
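The token-at-a-time execution model described above can be sketched in a few lines of Python. This is only an illustration of postfix execution under my own assumptions, not a model of either real interpreter: the function `run`, the operator names `add`, `mul`, and `moveto`, and the "current point" kept in a Python dictionary are all hypothetical.

```python
# Minimal sketch of a token-oriented postfix interpreter (hypothetical,
# for illustration only; it is neither PostScript nor Interpress).

def run(source):
    """Execute whitespace-separated tokens one at a time, left to right."""
    stack = []
    state = {"current_point": None}

    def op_add():
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)

    def op_mul():
        b, a = stack.pop(), stack.pop()
        stack.append(a * b)

    def op_moveto():
        y, x = stack.pop(), stack.pop()
        state["current_point"] = (x, y)

    operators = {"add": op_add, "mul": op_mul, "moveto": op_moveto}

    for token in source.split():
        if token in operators:
            # An operator is executed as soon as it is identified; it
            # succeeds or fails depending on the state of the stack.
            operators[token]()
        else:
            # Anything that is not an operator is treated as a number.
            stack.append(float(token))
    return stack, state


stack, state = run("10 20 add 3 mul 72 moveto")
print(stack, state["current_point"])
```

There is no grammar to check in advance: a program such as "add" on its own is perfectly well formed and simply fails when the operator finds too few operands on the stack.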
As explained above (Lexical Considerations), PostScript is defined in terms of character sequences. A PostScript program is a series of character tokens, separated by white space characters. That program is fed to an interpreter to be executed; the interpreter reads in the characters and assembles them into words (i.e. tokens), then looks up the tokens in dictionaries to determine their meaning. In this regard PostScript is similar to many other programming or command languages: if the PostScript interpreter sees the command "MOVETO", it finds the current definition of that string, and then performs whatever action is requested in that definition. By contrast, Interpress is defined in terms of byte codes, which behave more like the instruction codes of a hardware interpreter than like a traditional programming language. Instead of the letters "MOVETO", an Interpress file will have a byte whose binary value is 25; the number 25 is then used to index an operation code table which directs the interpreter to the program implementing the MOVETO operation. The byte codes of Interpress can be viewed as a compiled form of the character codes of PostScript. One could imagine a translator that passed over a PostScript file, looked up each name, and produced an output file whose contents were the binary identification of the thing found during the lookup. In fact, the Interpress standard document explains that the two forms are equivalent, and the Introduction to Interpress document explains how to write a program to convert one to the other. There is, however, a crucial difference between the PostScript and Interpress naming schemes that makes them very different, and makes impossible the above-mentioned imagined compiler to translate PostScript into Interpress. That difference is best understood as a semantic difference, and will be explained in the next section. Returning to syntactic issues, an Interpress file has what is called "static structure" or "lexical structure".
This means that you can look at an Interpress file and make structural assumptions about what you find there. For example, an Interpress file is defined to be a sequence of "bodies"; each body is a sequence of operators and operands. The first body is the "preamble", or setup code; all following bodies correspond to printed pages. If an Interpress file has 11 bodies, then it will print as 10 pages. By contrast, a PostScript file has no fixed lexical structure; it is just a stream of tokens to be processed by the interpreter. PostScript prints a page whenever the SHOWPAGE operator is executed. If a PostScript file contains a loop from 1 to 10, with a SHOWPAGE operator inside the loop, then it will print 10 pages even though there is only one actual call to SHOWPAGE in the file. However, since PostScript is a textual language, and since it has a "comment" facility like the C /*....*/ or Pascal {...}, it is possible for the creator of a PostScript file to represent whatever additional information is desired. It is a slight misnomer to call this a comment facility, because the normal use of the word "comment" in programming languages implies that the contents of the comment are irrelevant. PostScript comments are irrelevant in the sense that they do not affect the image produced by a PostScript file, but they do convey machine-readable information about the structure of the document. A PostScript client is free to choose any structuring scheme that he wants, and the tool that he has available to implement this structuring scheme is the PostScript comment. There is a particular "standard" structuring convention documented along with PostScript by which page boundaries and other lexical information can be marked. A PostScript file that follows that convention is called a "conforming" file, but it is a convention and not a rule; the printed image produced by a nonconforming PostScript file will be identical to that produced by the equivalent conforming PostScript file. 
Conversely, the structure of a PostScript file, as represented by the structuring convention, is completely independent of the appearance of the page images--the actual PostScript text appears to be a series of comments as far as the structuring systems are concerned. The technique of mixing two different languages in one file, so that a processor for one language sees the text of the other language as comments, is not new. Perhaps the most widely-known instance of this scheme is Don Knuth's "WEB" system, in which Pascal and TEX are woven together in such a way that the Pascal program looks like a comment to the TEX interpreter and the TEX source looks like a comment to the Pascal compiler. This absence of fixed lexical structure in PostScript is a two-edged sword. On the one hand, it offers more flexibility in creating page images, especially repetitive ones; on the other hand, it provides more opportunities to make mistakes. One final syntactic issue is perhaps worth mentioning, though it could also be considered a semantic issue. Interpress does not support "variables" so much as it supports "registers", in the hardware sense. All storage in Interpress is accessed by address and not by name. What would be called a "local variable" in a programming language is represented in Interpress by an integer subscript into the procedure's frame. All programming languages must ultimately reduce their variable names into memory locations; Interpress asks that this translation be performed by the creator of the Interpress file and not by the interpreter. An obvious benefit of this approach is efficiency--no name lookups need be performed as the file is being printed. An obvious drawback of this approach is the restricted name space available to the programmer and the extra care that must be taken to manage addresses instead of names. By contrast, PostScript supports ordinary named variables. 
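The contrast between registers and named variables can be made concrete with a small Python sketch. It is hypothetical throughout: the frame size, the index `LINE_WIDTH`, the name `linewidth`, and the two-level list of dictionaries are my own illustrative choices, not anything taken from either standard.

```python
# Hypothetical sketch: the same "local variable" stored two ways.

# Interpress style: whoever generated the file has already turned the
# name into an integer frame index, so the interpreter only indexes.
frame = [None] * 4
LINE_WIDTH = 2                      # index chosen by the file's creator
frame[LINE_WIDTH] = 0.5
width_from_frame = frame[LINE_WIDTH]

# PostScript style: the interpreter resolves the name at run time by
# searching dictionaries, most recently pushed first.
dict_stack = [{"linewidth": 1.0}, {}]   # [outer, inner]
dict_stack[-1]["linewidth"] = 0.5


def lookup(name):
    """Return the innermost definition of name, searching outward."""
    for d in reversed(dict_stack):
        if name in d:
            return d[name]
    raise KeyError(name)


width_from_name = lookup("linewidth")
print(width_from_frame, width_from_name)
```

The indexed form needs no search at print time, which is the efficiency benefit mentioned above; the named form pays for a lookup on every reference but lets the author of the file work with names instead of addresses.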
SEMANTICS

Since both Interpress and PostScript derive their semantics from the same source, it stands to reason that the semantics would be similar. Both use similar graphical semantics, the same imaging model, and both use very similar execution semantics. The differences are minor, though one could imagine that the consequences of those differences might be major.

There are two substantive differences between the graphical semantics of PostScript and Interpress 2.1, namely that Interpress has no facility for describing curves, and the Interpress standard is completely silent on the issue of fonts. A curve can of course be approximated with a series of line segments, and if the line segments are short enough the resulting appearance will be identical, but many classes of curved lines, such as those appearing in fonts, can be described very succinctly in terms of the PostScript CURVETO operator while requiring a tedious collection of short line segments to describe in Interpress. Because of the importance of fonts to printed images, this seemingly minor omission could possibly have major consequences.

On the issue of fonts, the Interpress standard states only that a font is an operator that will be executed for you when appropriate, and that the operators for that font are defined "in the Environment". A PostScript font is just an ordinary PostScript defined operator, and the PostScript manual gives explicit instructions for creating user-defined fonts and making those font definitions be part of a PostScript file. One could imagine that it is possible to write an Interpress composed operator (in Interpress, of course) to behave like a user-defined font, but the Interpress implementations do not currently have any mechanism for recognizing that an operator is in fact a user-defined font and should therefore receive any kind of special treatment.
This is not a deficiency in Interpress, just a silence, accompanied by a deficiency in current implementations (this and other implementation issues are discussed in the last section). There are three consequential differences between PostScript execution semantics and Interpress execution semantics: user-defined operators, the nature of the "firewalls" between pieces of the program, and error recovery. In Interpress, a user-defined operator is syntactically different from an intrinsic operator, and requires an explicit "DO" operator to call it. In PostScript a user-defined operator is syntactically identical to an intrinsic operator, and in fact any intrinsic operation can be redefined by simply making a new entry for that operator's name in the appropriate dictionary. This is stylistically similar to the difference in lexical structure: Interpress guarantees that if byte code 25--the MOVETO operator--is found in a file, it will, when executed, perform a standard MOVETO. PostScript guarantees nothing because it enforces nothing. If you want to redefine the meaning of MOVETO, then you can do so, and when the characters "M O V E T O" are found in a PostScript file, the redefined operator will be executed instead. To execute a PostScript user-defined operator you just include its name, the same way you execute any other operator. To execute an Interpress user-defined operator, you execute the DO operator (or a variation of it), after pushing onto the stack the thing that you want to execute. Analogously with the static structural issues, the PostScript user-defined-operator scheme offers more flexibility than Interpress but carries with it more dangers. Like the old saw about giving one enough rope to hang himself, the additional flexibility of the PostScript scheme requires discipline on the part of the user.
Furthermore, just as PostScript has a convention for the voluntary inclusion of static structure in a file, it has a mechanism by which a PostScript program can reference the true built-in version of an operator and not the current, possibly user-redefined, version of an operator. From the point of view of language design, this scheme is not terribly elegant, but it is quite practical, as it provides a mechanism for the solution of all of the problems associated with operator redefinition and the prevention thereof. It is this ability to redefine built-in operators that makes the compilation of a textual PostScript file into an encoded Interpress file (mentioned above under Syntactic Considerations) impossible. A static analysis cannot determine the operator that will be executed when the textual token is interpreted. By contrast, it is easy to translate Interpress into PostScript, because all of Interpress' semantic capabilities have direct equivalents in PostScript, and the lexical translation is straightforward. Interpress has a distinction between "bodies" and "operators". A "body" is a sequence of Interpress tokens. The Interpress operator "MAKESIMPLECO" (make simple composed operator) translates a body into an operator. Like all other Interpress operators that reference bodies--referred to in the Interpress standard as "body operators"--the MAKESIMPLECO operator is prefix and not postfix. This was done to make it easier for small computers to implement Interpress interpreters; it has the interesting side-effect of making it impossible for an Interpress program to generate and then execute a piece of Interpress source code. I would guess that the entire reason for the distinction between Interpress bodies and operators is to enable a clean prefix implementation of body operators while at the same time permitting the more conventional postfix use of expressions of type "operator". By contrast, PostScript represents operator bodies as arrays of PostScript tokens.
The PostScript lexical scanner processes a body by building an array out of the tokens that it finds in the input stream; that body is then handled as an ordinary data value in the language, and it can be stored into variables, executed, modified, searched or searched for, etc. The translation of a body into something like an Interpress operator consists merely of returning the address where the body is stored; that can be handled by the PostScript type system and does not require a special conversion operator. Consequently, a PostScript program is able to generate an array of PostScript operators, however it so chooses, and then declare that array to be a new PostScript operator and have it be executed just like any other PostScript operator. The second important semantic difference between PostScript and Interpress is the set of mechanisms that they offer for protecting one piece of the file from side effects in another. As you might be able to guess if you have read this far, the Interpress protection mechanism is static and mandatory while the PostScript protection mechanism is dynamic and optional. This kind of mechanism is often referred to as a "firewall". An Interpress file consists of a series of bodies. Each body is executed completely independently of each other body. In particular, at the beginning of each page body, the execution environment is restored to the state that it had at the end of execution of the preamble, so that each page body is executed as if it were the only page in the document. There is absolutely nothing that the code in one Interpress page can do that will have any effect on the execution of the code in any other Interpress page, and the Interpress language guarantees that independence. This permits, for example, the pages to be executed or printed in any order, front to back or back to front, or in folios of 16 pages at a time, with complete confidence that the appearance of the pages will not change. 
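The page-independence guarantee just described can be sketched in Python as follows. This is an assumption-laden illustration rather than how any Interpress printer works: `run_master`, the dictionary used as the execution environment, and the `font` and `margin` entries are all hypothetical names.

```python
import copy

# Hypothetical sketch of mandatory per-page firewalls: every page body
# starts from a fresh copy of the state left behind by the preamble.


def run_master(preamble, pages, order=None):
    """Run the preamble once, then each page body in the given order."""
    env = {}
    preamble(env)
    if order is None:
        order = range(len(pages))
    results = {}
    for i in order:
        page_env = copy.deepcopy(env)   # restore the post-preamble state
        results[i] = pages[i](page_env)
    return results


def preamble(env):
    env["font"] = "Times"
    env["margin"] = 72


def page_one(env):
    env["margin"] = 36                  # invisible to every other page
    return (env["font"], env["margin"])


def page_two(env):
    return (env["font"], env["margin"])


forward = run_master(preamble, [page_one, page_two])
backward = run_master(preamble, [page_one, page_two], order=[1, 0])
print(forward == backward)
```

Because no page can see another page's changes, running the bodies back to front produces the same result for each page as running them front to back.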
By contrast, a PostScript file has no static structure, so there is no convenient place to build automatic firewalls. PostScript provides, instead, two pairs of operators by which a PostScript user can build his own firewalls wherever he wants them. There is an operator called SAVE, and another operator called RESTORE. The RESTORE operator restores the execution state of the machine back to what it was when the last SAVE operator was executed. Thus, if a PostScript user wants to have pages that are firewalled against each other, then he puts a SAVE operator at the beginning of the page and a RESTORE operator at the end of the page. If the PostScript user wants to play tricks, and build PostScript files that do bizarre things with the execution state between pages, he is free to do so by leaving out the SAVE and RESTORE. By now you can probably see the fundamental philosophical difference between PostScript and Interpress. Interpress takes the stance that the language system must guarantee certain useful properties, while PostScript takes the stance that the language system must provide the user with the means to achieve those properties if he wants them. With very few exceptions, both languages provide the same facilities, but in Interpress the protection mechanisms are mandatory and in PostScript they are optional. Debates over the relative merits of mandatory and optional protection systems have raged for years not only in the programming language community but also among owners of motorcycle helmets. While the Interpress language mandates a particular organization, the PostScript language provides the tools (structuring conventions and SAVE/RESTORE) to duplicate that organization exactly, with all of the attendant benefits. However, the PostScript user need not employ those tools. 
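The optional nature of the PostScript firewall can be illustrated with a toy model in Python. This is a sketch under my own assumptions, not the real operators: the `Machine` class, its stack of saved snapshots, and the `margin` entry are hypothetical, and the model follows only the description given here (RESTORE returns to the state at the last SAVE).

```python
import copy

# Hypothetical model of an interpreter whose state can be saved and
# restored on request; nothing forces a page to use either operation.


class Machine:
    def __init__(self):
        self.state = {"margin": 72}
        self._saved = []

    def save(self):
        self._saved.append(copy.deepcopy(self.state))

    def restore(self):
        self.state = self._saved.pop()


def page(machine, firewalled):
    """Run one page that halves the margin as a side effect."""
    if firewalled:
        machine.save()
    margin_seen = machine.state["margin"]
    machine.state["margin"] = margin_seen // 2
    if firewalled:
        machine.restore()
    return margin_seen


m = Machine()
protected = [page(m, True) for _ in range(3)]

m = Machine()
leaky = [page(m, False) for _ in range(3)]
print(protected, leaky)
```

With the SAVE/RESTORE bracket every page sees the same margin; without it each page inherits whatever the previous page left behind, which is exactly the freedom, and the hazard, discussed above.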
Before taking a stand on this issue, you must remember that neither Interpress nor PostScript is engineered to be a general-purpose programming language, but rather to be a scheme for the description of page images, so it is not necessarily valid to apply programming language lore to these two systems. The third area in which there are significant semantic differences between PostScript and Interpress is in error handling and error recovery. The Interpress 2.1 standard is slightly vague as to what happens when various error conditions occur; one assumes that the implementors of Interpress printers will do something reasonable. The PostScript language provides a user-extensible error-recovery mechanism that is keyed on PostScript's ability to redefine intrinsic operators. Whenever an error of any kind occurs in PostScript, be it the printer out of paper, the file asking for a font that doesn't exist, or a division by zero, the PostScript interpreter responds by executing an "error operator". If the error operator has not been redefined, then some standard action is taken; sometimes the standard action is to do nothing, while sometimes the standard action is to abort or to retry. The standard action is merely the execution of the error operator. The Interpress documentation does not offer much explanation, one way or another, of error handling. The Interpress standard describes certain kinds of error conditions that can occur, such as "appearance error" or "master error", but does not specify exactly what will happen if those errors occur. I assume that the reason the standard is vague is to provide leeway to the implementors in error handling. The Interpress language standard does not describe any technique by which an Interpress master can control or modify the error recovery actions. When a PostScript error occurs, an error operator is executed. There is a set of built-in error operators provided as part of PostScript, and documented like all other operators. 
If a PostScript user wants to change the error handling of a PostScript printer, he simply changes the dictionary entry for the relevant error operator. Depending on the relative position of that redefinition with respect to SAVE and RESTORE operators in the PostScript file, the redefinition will have a certain lifetime. A SAVE and RESTORE pair is wrapped around each separate file printed by a PostScript printer, so that the redefinition does not carry over to other jobs. The manager of an installation can change the overall default of the printer by sending it a redefinition, during printer startup, before entering the SAVE/RESTORE loop around each print job. Like so much of PostScript's flexibility, the ability to redefine operators is a two-edged sword. Redefining an operator can be used to advantage by clever and knowledgeable users, and it can be used as a technique for fixing bugs in a PostScript implementation. For example, if an accounting package were not provided as part of a PostScript implementation, the owners of a PostScript printer could add page accounting to their printer by downloading a redefinition of the SHOWPAGE operator that kept accounting information. However, a user might be able to disable that accounting by doing yet another redefinition that disabled the installation's accounting. To circumvent this class of problem, PostScript provides a mechanism for declaring certain objects to be read-only, or execute-only. The management of a shared PostScript printer can specify that part of its power-up or restart sequence is to load a configuration file; that configuration file can redefine certain operators--for the purpose of bug fixing or accounting or any other reason--and then, if desired, mark the redefined operators read-only so that they cannot be further redefined. As a language mechanism this is very clumsy, but as an operational technique it is effective. 
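The accounting example above can be sketched as follows. This is a toy Python model under stated assumptions, not the PostScript mechanism: `OperatorDict`, `ReadOnlyError`, and the per-name locking are hypothetical, and the real language attaches its access restrictions to objects in ways that differ in detail. The sketch shows an installation wrapping the page-output operator with a counter, locking the new definition, and a later attempt at redefinition being refused.

```python
class ReadOnlyError(Exception):
    """Raised when a locked operator name is redefined."""

class OperatorDict:
    """Toy operator dictionary with per-name read-only locking."""

    def __init__(self):
        self._ops = {}
        self._locked = set()

    def define(self, name, proc):
        if name in self._locked:
            raise ReadOnlyError(name)
        self._ops[name] = proc

    def lookup(self, name):
        return self._ops[name]

    def make_readonly(self, name):
        self._locked.add(name)

    def execute(self, name, *args):
        return self._ops[name](*args)

ops = OperatorDict()
ops.define("showpage", lambda: "page printed")  # stand-in for the built-in

# Installation startup: wrap the operator with accounting, then lock it.
page_count = {"n": 0}
original_showpage = ops.lookup("showpage")

def counting_showpage():
    page_count["n"] += 1
    return original_showpage()

ops.define("showpage", counting_showpage)
ops.make_readonly("showpage")

# A later job tries to bypass the accounting and is refused.
try:
    ops.define("showpage", lambda: "free page")
    bypass_blocked = False
except ReadOnlyError:
    bypass_blocked = True

result = ops.execute("showpage")
```

The same dictionary-entry replacement, without the lock, is the model for changing an error operator: the entry is swapped and the interpreter runs whatever it finds there.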
IMPLEMENTATION ISSUES The implementation considerations are the most difficult to review and compare, because it is next to impossible to determine the reason for some annoying property of an implementation; it is also not entirely proper to criticize a language for the state of its implementation. Nevertheless, the history of programming languages has repeatedly shown that good implementations of languages have longer-lasting impact than good designs. For example, I quite commonly encounter people who choose to run VMS on their Vax systems instead of Unix and who offer the explanation that they do this because the VMS implementation of Fortran is so good that their programs will run a lot faster. Naturally, other people have other reasons; this is just an example. The Interpress documentation is peppered with "fine print" explaining the possible limitations of various possible Interpress printers, and a chapter of the Interpress standard is devoted to a discussion of the various ways to subset Interpress so that stripped-down versions of the language can be implemented. Indeed, as of today (March 1, 1985) I am not aware of the existence of any printer that implements the full Interpress 2.1 language defined in the standard. Certainly none is offered now as a product, and if one has been announced the announcement has not yet reached me. The Xerox 8044 "Star" printer and the 5700 and 2700 printers all implement various subsets of Interpress. Perhaps there are others. The only one of these that I have used to any extent is the 8044. It implements a textual subset of Interpress, with the capability of a certain amount of line graphics, and has some unknown capacity for more sophisticated graphics. It does not implement very many of the features that distinguish Interpress from the older Press format, and in fact has some surprising limitations. For example, Interpress provides the ability to get rounded ends on line segments. 
The 8044 implementation of Interpress that I experimented with faked the circular arcs with sections of a 9-sided polygon. The Interpress standard promises the ability to rotate the coordinate system through arbitrary angles; all of the existing implementations of Interpress limit coordinate system rotations to multiples of 90 degrees. Xerox quite likely has been developing true Interpress printers, which implement the full documented language, but none has been demonstrated or announced. By contrast, the PostScript documentation makes no mention of any subset, or of any implementation restrictions. The entire PostScript language was fully implemented before any PostScript documentation was distributed or any printers shipped. There are four PostScript printers announced and demonstrated by three OEM vendors: the Apple LaserWriter (300 dots/inch), the QMS 1200A (300 dots/inch), the Mergenthaler P300 phototypesetter (2540, 1270, or 635 dots/inch), and the Mergenthaler P101 phototypesetter (1270 or 635 dots/inch). The Apple printer has been shipped to customers, the QMS printers are in Beta test, and the Mergenthaler machines will be shipped to customers by Fall of 1985. All implementations of PostScript printers can print any PostScript file, with no restrictions save the availability of fonts as licensed to that manufacturer. Circles come out as circles. A PostScript file that has been proof-printed on an Apple LaserWriter can be typeset on a Mergenthaler P101 without making any changes to the file. Naturally all device-independent page representation schemes have this ability as their goal, and many claim to be able to do it, or claim that they could do it if they had all of the necessary fonts available in all of the requisite sizes. The current set of PostScript printers actually does it.
Given that Xerox has been working on Interpress for about twice as long as Adobe has been working on PostScript, and many of the graphics techniques necessary for the implementation are copiously described in the open literature, I find it surprising that there are no true Interpress printers on the market. I am puzzled by this, and as a student of programming languages I am very interested in learning whether or not there are any properties of the Interpress language itself that are somehow contributing to this difficulty, or whether this is just the usual sluggishness that one expects from all large companies. Brian Reid R...@SU-Glacier.ARPA Computer Systems Laboratory decwrl!glacier!reid Stanford University 415/323-6100

Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.2 9/18/84; site uw-beaver Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!ihnp4!houxm!vax135!cornell! uw-beaver!laser-lovers From: laser-lovers@uw-beaver Newsgroups: fa.laser-lovers Subject: Interpress and PostScript, A Second Comparison Message-ID: <892@uw-beaver> Date: Sun, 3-Mar-85 16:15:23 EST Article-I.D.: uw-beave.892 Posted: Sun Mar 3 16:15:23 1985 Date-Received: Mon, 4-Mar-85 20:31:45 EST Sender: daemon@uw-beaver Organization: U of Washington Computer Science Lines: 486 From: mendelson...@XEROX.ARPA I've been scooped!! I was about to send the message that is presented below the dotted line when Brian Reid's masterful message "PostScript and Interpress: a comparison", appeared in my mail. So I'll preface my original message with these remarks. I want to compliment Brian on a piece of work of outstanding excellence. I appreciate its objectivity. I agree almost completely with his technical characterizations of the differences between the two languages. I believe that that agreement will show up clearly in what I wrote, but my characterization is not nearly so scholarly as his. I hope it is as objective as his. We do have some differences of opinion and of knowledge, but I do not propose to address them here. We'll leave them for another day, after both comparisons have been presented. Jerry Mendelson --------------------------- The following information was prepared by Jerry Mendelson, and is submitted to Laser Lovers as a contribution to the general discussion of the characteristics of the two printing languages, Interpress and PostScript. Let me state up front that I am a retired Xerox Engineering Fellow, and that I am currently working as a consultant to Xerox. This report was commissioned by Xerox, but these are my observations, not those of Xerox. 
Xerox did not exercise technical control over the content, nor editorial control over the presentation, of the following information. I have tried to be as analytic, objective, and honest as I possibly could, but I know far more about Interpress than I do about PostScript. I welcome equally honest and objective responses to this discussion containing corrections, additions, other perspectives, what have you. I trust that these discussions will lead to well reasoned recommendations on a standard for the description of documents for representation on imaging devices. INTRODUCTION A comparison of Interpress and PostScript is perhaps most appropriately begun by recounting the history of the development of the two languages. The following additional material on the history of the developments of Interpress and PostScript is based on my first hand knowledge of events that have transpired within Xerox during the past ten years, plus direct conversations with the key participants in the developments of these two languages. It is factual and complete to the best of my knowledge. Others might be able to provide additional information of which I am unaware. MORE ON HISTORY A FORTH like graphics/printing language was developed by, among others, John Warnock before coming to Xerox/PARC. After coming to Xerox, John put together another similar language which he called JaM. Long before JaM was developed another printing format (Press) had been developed and placed in use at Xerox. It was largely the brainchild of Bob Sproull who is now at Carnegie-Mellon. Variants of this effort run most of the Xerox internal printers on the original 3 MBit/Second Ethernet which is still operational within Xerox. A graphics research program known as Cedar Graphics was also pursued at PARC during this same time frame. 
Finally, leveraging off of all of these earlier activities, what I designate as Research Interpress was created as a combined effort of a number of people at PARC under the management direction of Chuck Geschke. The principal developers were Bob Sproull, Butler Lampson, John Warnock, and Brian Reid, with significant participation from Bob Ayers. (There may have been others whom I have unintentionally slighted because I am simply unaware of their roles.) The resulting Interpress language contains benefits which derive from all of those very talented people. During the course of these developments Xerox permitted publication of some of the Cedar Graphics research effort in a paper authored by John Warnock and Douglas Wyatt titled "A Device Independent Graphics Imaging Model for Use with Raster Devices", in the July 1982 Computer Graphics Volume 16, number 3, pp. 313-320. Xerox also permitted Dr. Leo Guibas, a Stanford professor, to use some of the Interpress related research work as course material for a course at Stanford. I led the engineering effort that extracted a suitable subset of the Research Interpress language, and pushed it through the corporate standards activity and into the product line. The first subset of this language was published internal to Xerox in a Xerox System Integration Standard (XSIS 048201), titled Interpress 82 Electronic Printing Standard, dated January 1982. Subsequent backward compatible revisions have been internally released. The current externally released version of Interpress is designated as Interpress Electronic Printing Standard (XSIS 048404), Version 2.1, dated April 1984, and is contained in the documentation set available from Xerox. The Printing Instructions portion of Interpress was added to Research Interpress, and included in the internally published Version 2.0 in June, 1983. 
After Interpress 82 had been extracted, pushed through the corporate standards process, and incorporated into products, but before Interpress, Version 2.0 had been created, Chuck Geschke and John Warnock left PARC and formed Adobe Systems. They developed PostScript by extending a combination of the work that Warnock had done prior to his work at Xerox, plus the material that Xerox had permitted to enter the public domain through the publication of the above referenced paper, plus the material that Xerox had released to the Stanford course. They carefully avoided including any of the advanced features that Sproull, Lampson, and Warnock had incorporated in Interpress but that Xerox had not publicly released. GENERAL OVERVIEW Again, some caveats about this general overview. It is not presented as fact, but rather as a set of observations as interpreted by one reviewer who has more familiarity with Interpress than with PostScript. Another reviewer might reach a quite different set of conclusions. Also, please note that this discussion deals with attributes of the languages, and does not address issues related to particular implementations of the language. With their thread of common history it is quite natural that Interpress and PostScript turn out to be very similar languages. Their fundamental concepts and structures are substantially the same. I can only speculate about the causes for their differences, but it appears to me that the following issues are fundamental: 1. The developers of the two languages had different perceptions of the application environments in which they were intended to operate. 2. Adobe was precluded from pursuing some of the elegant Xerox proprietary techniques that were not part of the public domain. As a consequence Interpress appears to be richer in higher level structure. PostScript appears to be richer in its more basic structures. These issues are discussed below. 
Application Environment Considerations The languages appear to have distinctively different viewpoints of their application environments, and of how they properly fit into those environments. The designers of Interpress took the position that the functions of creation and composition are properly the role of creation devices. Printers should print, and should do so in a highly efficient and productive manner. Interpress was designed to describe the results of the creation process to printers in a printer independent fashion. The language recognized that there were some things in the printing domain that the printer could best deal with, and that the creator would not necessarily have knowledge about. It, therefore, made provision for the creator to describe how the document description should adapt itself to the specifics of what it finds in the printer's environment, without knowing a priori what that environment would be. Interpress can be used with a single printer that is tightly coupled to the creator source, but it is also designed so that it can operate in a networked environment in which there may be a multiplicity of printer/servers. In this latter environment the creator may be highly decoupled from the destination printer. Further, in such an environment the printer/server must at all times be in control of its own actions, and an Interpress document may not be allowed to cause the execution of operations that interfere with that control. Interpress is designed for machine to machine communication with no human interaction. In fact, Interpress is designed under the assumption that the document may be printed on a wide variety of printers over an extended period of time, i.e. stored over an extended period of time, later retrieved, and then printed. The designers of PostScript appear to have taken a different viewpoint. PostScript appears to assume an environment in which there is not so clear a separation between the creator and the printer. 
The creator appears to be in much closer contact with a specific target printer, and possesses much more information on the details of that printer's environment. In fact, a PostScript master appears to have the ability practically to take over complete control of the printer's resources, e.g. open, close, read, and write files. PostScript enables an almost interactive environment with a human creator in the loop. While the language is printer independent, a specific PostScript master appears to be more closely coupled to a target printer than does an Interpress master. PostScript enables much greater print time program control to the PostScript master, and relies on that master to perform many of the actions that Interpress defers to the printer. Further, Adobe has chosen to enrich the more basic aspects of their language to make it a more general purpose composition language. An example of an ideal application for PostScript would be that of a fully functioned Graphics Composition station working in conjunction with a designated printer of known characteristics. It would clearly be a better language for that application than would Interpress. Computing Load Allocation The printing of any complex page creates a significant computing task. Interpress's design tends to force this task to the creator side. PostScript's design tends to push this task to the printer side. As a result Interpress masters tend to enable a high printer throughput capability. PostScript masters tend to impose higher computing loads that lead to slower throughput capabilities at the printer. Note the use of the word "tends" in those last two sentences. What I am trying to say is that the normal use of the natural characteristics of the two languages would lead to the suggested results. However, these are very general observations about the inherent characteristics of the languages. 
It is clearly possible to create a PostScript master that is highly efficient, and that does not force heavy computation loads on the printer. It is also clearly possible to create an Interpress master that does force heavy computation loads on the printer. IDENTITIES AND STRONG SIMILARITIES The two languages are strongly similar in so many ways that an exhaustive presentation is beyond the scope of this brief analysis. Here are some major points of identity: Both Interpress and PostScript are document description languages. They use a language to describe how to construct the image they want printed on the page. They are device-independent. The description of the page image is in terms of the image, not of the device that the image is to be created on. It is up to the printer to interpret this description, and to create an image that matches it to the best of the printer's capability to do so. Both languages contain two distinct parts, a general purpose programming portion, and a special purpose image generating portion. They both use a FORTH-like postfix notation language, i.e. a reverse Polish notation in which the operands precede their associated operators in the presentation sequence. They both use a stack-oriented processing structure. Operands are pushed onto a stack when they are received. Operands are popped from the stack when their associated operator is received. Operator execution results are generally pushed back onto the stack. They both transmit their document representations to the printer in byte streams. These byte streams are broken down into "tokens", explicitly defined in the Interpress implementation, implicitly defined in the PostScript implementation. Both token streams can be generated on the generator side, and executed on the printer side, in one pass implementations. Both languages operate on "typed" operands. Operand types are generally equivalent in the two languages. Both languages have a generalized array processing capability. 
Such an array is a collection of objects, not necessarily of the same type. Both languages provide means to access arrays by an integer index that designates a specific array entry by its relative position in the array. Arrays may also be organized by using key, value pairs in arbitrary, but paired locations. A value is then located by designating its associated key. Both languages use a universal coordinate system as a reference framework. Both languages are heavily dependent on the use of identical forms of transformation matrices. These are matrices that represent the transformation from a user coordinate system to the reference framework coordinates, and thence to the printer coordinate system. Both use transformations to rotate, scale, and position objects on the page. Both languages enable the construction of procedures that can be repetitively invoked. Procedures can be invoked by mechanisms that save and restore the printer environment so that their effects can be isolated from the rest of the page. Both languages use very similar imaging models. The model is one in which the image is incrementally built starting with a blank page. A page image element, e.g. a character, a pixel array, or a geometrically defined graphic structure, is added to the page by the following process. A "stencil" representing the object is obtained, scaled to size, rotated for orientation, and positioned at a desired location on the page. An opaque "ink" of any color is then "painted" through the stencil, overwriting anything that is currently present on the page. Colors do not mix, they overwrite. Both languages maintain a set of imaging control parameters in an array. Both provide operators that are used to change the state of these parameters. Both languages provide for formally structured representations of fonts. These representations include font name and font metrics as well as font shapes. 
Both languages enable the creation of fonts with arbitrary character generation techniques, subject only to the constraint that the generation process is describable within the language. PostScript includes language mechanisms for the control and use of a font cache which dynamically stores raster encoded character instances. MAJOR DIFFERENCES In spite of their common ancestry the two languages do contain some major and significant differences. These are catalogued in the following paragraphs. The general sequence of presentation is to present the strengths of Interpress first, followed by those of PostScript. Interpress Strengths Total Environment Interpress has all of the essential qualities of a stand-alone language. However, Interpress has been designed so that it also fits well into an applications and network environment such as the total XNS environment. PostScript is a purely stand-alone language. It makes little, if any, explicit provision to deal with its environment. However, it does contain a number of capabilities which would enable it to do so. Their explicit use for this purpose is not described within the available Adobe documentation. Examples of the integrability of Interpress with its environment abound. It contains a sequenceInsertFile function that enables the inclusion of files accessible to the printer within its total environment. Such files can contain Interpress masters, fragments of Interpress masters, or even printer dependent object code that represents previously decomposed Interpress masters, e.g. forms. The printer can utilize the full XNS path name so that an Interpress master can reach out into the entire XNS universe that is attached to the printer network. Interpress takes explicit steps to establish a universal name space and to provide for a central registry mechanism for the distribution and control of names in this space. 
This provides a means for establishing a uniform environment for an extended family of distributed printers. Interpress contains printer instructions (see below) that are merged with those transmitted by the Printing Protocol that is used to invoke the printing of a document. This enables the document to contain some printing instructions that are document dependent, and the transmission protocol to provide some printing instructions that are user dependent. Interpress makes provision for the interrogation of the printer environment from the master so that the master can adapt itself to that environment. Page Independence Probably the most important difference between the two languages is the fact that an Interpress document possesses a well-defined structure. Among the attributes of this structure is that it guarantees page independence. Page independence means that the language description of each page is totally independent of that of any other page. Ensuring page independence is critically important for a number of reasons. It enables the decomposition and printing of documents in arbitrary page order. (Many printers find it desirable to print last page first. Full duplex printing may require unusual page printing sequences.) It enables the creation of utility routines to manipulate Interpress documents. Such routines can perform such tasks as creating a new Interpress master by pulling together pieces from existing masters, or creating two-up, head-to-toe, or signature masters from existing masters. Because of page independence these operations can be executed without any consideration of the Interpress representation of the document. They need only parse the Interpress master to locate the page breaks which are clearly delimited. PostScript documents are specifically free form and unstructured. The language enables page independence to be achieved through externally imposed programming disciplines, but the language itself provides no such guarantee. 
In fact, in the absence of a carefully controlled programming discipline it is practically certain that PostScript defined documents will not exhibit page independence. Printer Instructions Interpress contains an extensive set of Printing Instructions. These instructions enable the master to control the actions of the printer, e.g. to invoke two-sided printing or special finishing such as stapling. They also provide information necessary for the effective use of the printer within a multi-user environment, e.g. who printed the document, what its name is, whom to charge for the printing, the provision of passwords to control who can authorize the final printing, and so on. Printing Instructions also enable the declaration of resources that the document will require, e.g. the files that it will use, the fonts and font sizes that it will use. Not only do Printing Instructions enable the control of the printing environment, they also enable an up-front determination of the ability of a given printer to print a document and/or enable it to gather the resources it needs to do so in the most efficient fashion. These might be used by the printer itself, or by a printing environment manager/dispatcher who determines which of a number of printers should receive the document for printing. PostScript makes no explicit provision for any of these functions. Priority Important The availability of multiple colors with opaque printing quality means that rules must be established for the printing of overlapping objects of different colors. PostScript takes no explicit cognizance of this fact. The unambiguous appearance of the final page can only be obtained if one assumes that PostScript defined objects are printed in the sequence in which they are generated. Some printers may not find it convenient to print objects in the order in which they are created within the master. 
Interpress contains an imaging parameter, priorityImportant, that is used to designate when the printing sequence must match the Interpress presentation sequence. When priorityImportant is not true the printer is free to image objects in a sequence of its own choosing. This can increase printer performance in many cases. Compactness Interpress specifies a compact encoding for the transmission of a document. This encoding reduces to one or two bytes the number of bytes required to designate an Interpress operator. It also includes special encoding notations for compact representations of sequences of literals. These encoding techniques substantially reduce the number of bits required to either store or transmit a document. Simple utility routines are available for the bi-directional translation from this compact format to full English language form ASCII character representations suitable for human usage. PostScript only employs this latter form of more verbose representation. PostScript Strengths File Handling PostScript contains a much more extensive and more powerful set of file handling capabilities than those provided within Interpress. This reflects the viewpoint of the language that the creator is closely coupled to a dedicated printer. It permits the master to take over control of the printer's file resources at print time. The Interpress philosophy of multiple printers, highly decoupled in space and time, and possibly with different file handling techniques precludes this capability. The Interpress philosophy of not letting the document take over the machine resources also precludes this capability. General Purpose Programming Capability PostScript provides a much richer set of general purpose processing capabilities. It provides a programmer with a rich set of arithmetic, control, looping, string processing, conversion between object types, signalling, etc., operations. 
Its use of full ASCII text representation makes it convenient for human programmers to write programs in the language. All of this is in keeping with the PostScript viewpoint previously described. Use of these capabilities can impose heavy processing loads on the printer. Interpress provides a sufficiently rich set of general purpose processing capabilities to meet the needs of the environment for which it was designed. It has been explicitly constrained in these capabilities so as to reduce the processing load on the printer, and, in keeping with the Interpress philosophy, force the processing load to the creator. Procedure Calling Both PostScript and Interpress provide procedure calling capabilities. Both make provision for the establishment of a working space containing local variables for the procedure. Both make provision for saving and restoring the state of the imager control parameters so that the side effects of procedures can be well controlled. Interpress very tightly couples all of these effects in its procedure creation and calling mechanisms. PostScript provides a more flexible, and in many cases more efficient, capability by separating these effects. Graphic Imaging Capabilities PostScript has a much greater set of graphic image creating capabilities than that of the currently released version of Interpress. These include the ability to invoke circles and Bezier curves in either solid or dashed representations. PostScript also includes a clipping region capability, i.e. a defined curve whose interior defines the region in which the current imaging is permitted to occur. PostScript contains a richer set of line joining constructs, including mitering, bevelling, and rounding. The currently released version of Interpress only contains the ability to generate straight line bounded graphic constructs with mitering and rounding at line joins. 
Circles, conics, and Bezier curves are included in Research Interpress, and are scheduled for later releases of the published language. Thus, the two languages will ultimately be brought into much closer alignment in their graphic imaging capabilities. PostScript offers these capabilities now, and Interpress won't do so until later. SUMMARY In summary, Interpress was designed for use by a large variety of document creation and document output devices, to be suitable as a general purpose electronic printing standard, a language for printers capable of high performance. It was not designed as a general purpose composition language to be used directly by a programmer. The Interpress model assumes that the user creates Interpress Masters via any variety of document editing and composition systems, including graphics composition languages. It is clearly meant to be a printing standard with emphasis on an organization that presents a clear separation between the processes that Xerox believes belong in the creation domain, and processes that Xerox believes belong in the printer's domain. PostScript, on the other hand, appears to have been designed for use both as a programmer's composition language and as a language for printers in a tightly coupled stand-alone configuration. It does not draw clear distinctions between creation domain processes and printer domain processes. It is an excellent language for full capability graphics composition applications.