ELS 2014May 5–6, Paris, France.

2014

ASDF 3, or Why Lisp is Now an Acceptable Scripting Language

(Extended version)

François-René RideauGoogletunes@google.com

ASDF, the de facto standard build system for Common Lisp, has been vastly improved between 2009 and 2014. These and other improvements finally bring Common Lisp up to par with "scripting languages" in terms of ease of writing and deploying portable code that can access and "glue" together functionality from the underlying system or external programs. "Scripts" can thus be written in Common Lisp, and take advantage of its expressive power, well-defined semantics, and efficient implementations. We describe the most salient improvements in ASDF and how they enable previously difficult and portably impossible uses of the programming language. We discuss past and future challenges in improving this key piece of software infrastructure, and what approaches did or didn’t work in bringing change to the Common Lisp community.

Introduction

As of 2013, one can use Common Lisp (CL)Common Lisp, often abbreviated CL, is a language defined in the ANSI standard X3.226-1994 by technical committee X3J13. It’s a multi-paradigm, dynamically-typed high-level language. Though it’s known for its decent support for functional programming, its support for Object-Oriented Programming is actually what remains unsurpassed still in many ways; also, few languages even attempt to match either its syntactic extensibility or its support for interactive development. It was explicitly designed to allow for high-performance implementations; some of them, depending on the application, may rival compiled C programs in terms of speed, usually far ahead of "scripting" languages and their implementations. There are over a dozen maintained or unmaintained CL implementations. No single one is at the same time the best, shiniest, leanest, fastest, cheapest, and the one ported to the most platforms. For instance, SBCL is quite popular for its runtime speed on Intel-compatible Linux machines; but it’s slower at compiling and loading, won’t run on ARM, and doesn’t have the best Windows support; and so depending on your constraints, you might prefer Clozure CL, ECL, CLISP or ABCL. Or you might desire the technical support or additional libraries from a proprietary implementation. While it’s possible to write useful programs using only the standardized parts of the language, fully taking advantage of extant libraries that harness modern hardware and software techniques requires the use of various extensions. Happily, each implementation provides its own extensions and there exist libraries to abstract over the discrepancies between these implementations and provide portable access to threads (bordeaux-threads), Unicode support (cl-unicode), a "foreign function interface" to libraries written in C (cffi), ML-style pattern-matching (optima), etc. A software distribution system, Quicklisp, makes it easy to install hundreds of libraries that use ASDF. The new features in ASDF 3 were only the last missing pieces in this puzzle. to portably write the programs for which one traditionally uses so-called "scripting" languages: one can write small scripts that glue together functionality provided by the operating system (OS), external programs, C libraries, or network services; one can scale them into large, maintainable and modular systems; and one can make those new services available to other programs via the command-line as well as via network protocols, etc.

The last barrier to making that possible was the lack of a portable way to build and deploy code so a same script can run unmodified for many users on one or many machines using one or many different compilers. This was solved by ASDF 3.

ASDF has been the de facto standard build system for portable CL software since shortly after its release by Dan Barlow in 2002 (Barlow 2004). The purpose of a build system is to enable division of labor in software development: source code is organized in separately-developed components that depend on other components, and the build system transforms the transitive closure of these components into a working program.

ASDF 3 is the latest rewrite of the system. Aside from fixing numerous bugs, it sports a new portability layer UIOP. One can now use ASDF to write Lisp programs that may be invoked from the command line or may spawn external programs and capture their output ASDF can deliver these programs as standalone executable files; moreover the companion script cl-launch (see cl-launch) can create light-weight scripts that can be run unmodified on many different kinds of machines, each differently configured. These features make portable scripting possible. Previously, key parts of a program had to be configured to match one’s specific CL implementation, OS, and software installation paths. Now, all of one’s usual scripting needs can be entirely fulfilled using CL, benefitting from its efficient implementations, hundreds of software libraries, etc.

In this article, we discuss how the innovations in ASDF 3 enable new kinds of software development in CL. In What ASDF is, we explain what ASDF is about; we compare it to common practice in the C world; this section does not require previous knowledge of CL. In ASDF 3: A Mature Build, we describe the improvements introduced in ASDF 3 and ASDF 3.1 to solve the problem of software delivery; this section requires some familiarity with CL though some of its findings are independent from CL; for a historical perspective you may want to start with appendices A to F below before reading this section. In Code Evolution in a Conservative Community, we discuss the challenges of evolving a piece of community software, concluding with lessons learned from our experience; these lessons are of general interest to software programmers though the specifics are related to CL.

This is the extended version of this article. In addition to extra footnotes and examples, it includes several appendices with historical information about the evolution of ASDF before ASDF 3. There again, the specifics will only interest CL programmers, but general lessons can be found that are of general interest to all software practitioners. Roughly in chronological order, we have the initial successful experiment in Appendix A: ASDF 1, a defsystem for CL; how it became robust and usable in Appendix B: ASDF 2, or Productizing ASDF; the abyss of madness it had to bridge in Appendix C: Pathnames; improvements in expressiveness in Appendix D: ASDF 2.26, more declarative; various failures in Appendix E: Failed Attempts at Improvement; and the bug that required rewriting it all over again in Appendix F: A traverse across the build.

All versions of this article are available at http://fare.tunes.org/files/asdf3/: extended HTML, extended PDF, short HTML, short PDF (the latter was submitted to ELS 2014).

1 What ASDF is

1.1 ASDF : Basic Concepts

1.1.1 Components

ASDF is a build system for CL: it helps developers divide software into a hierarchy of components and automatically generates a working program from all the source code.

Top components are called systems in an age-old Lisp tradition, while the bottom ones are source files, typically written in CL. In between, there may be a recursive hierarchy of modules that may contain files or other modules and may or may not map to subdirectories.

Users may then operate on these components with various build operations, most prominently compiling the source code (operation compile-op) and loading the output into the current Lisp image (operation load-op).

Several related systems may be developed together in the same source code project. Each system may depend on code from other systems, either from the same project or from a different project. ASDF itself has no notion of projects, but other tools on top of ASDF do: Quicklisp (Beane 2011) packages together systems from a project into a release, and provides hundreds of releases as a distribution, automatically downloading on demand required systems and all their transitive dependencies.

Further, each component may explicitly declare a dependency on other components: whenever compiling or loading a component(as contrasted with running it) relies on declarations or definitions of packages, macros, variables, classes, functions, etc., present in another component, the programmer must declare that the former component depends-on the latter.

1.1.2 Example System Definitions

Below is how the fare-quasiquote system is defined (with elisions) in a file fare-quasiquote.asd. It contains three files, packages, quasiquote and pp-quasiquote (the .lisp suffix is automatically added based on the component class; see Appendix C: Pathnames). The latter files each depend on the first file, because this former file defines the CL packagesPackages are namespaces that contain symbols; they need to be created before the symbols they contain may even be read as valid syntax.Each CL process has a global flat two-level namespace: symbols, named by strings, live in packages, also named by strings; symbols are read in the current *package*, but the package may be overridden with colon-separated prefix, as in other-package:some-symbol. However, this namespace isn’t global across images: packages can import symbols from other packages, but a symbol keeps the name in all packages and knows its "home" package; different CL processes running different code bases may thus have a different set of packages, where symbols have different home packages; printing symbols on one system and reading them on another may fail or may lead to subtle bugs.:

( defsystem "fare-quasiquote" ... :depends-on ( "fare-utils" ) :components ( ( :file "packages" ) ( :file "quasiquote" :depends-on ( "packages" ) ) ( :file "pp-quasiquote" :depends-on ( "quasiquote" ) ) ) )

Among the elided elements were metadata such as :license "MIT", and extra dependency information :in-order-to ((test-op (test-op "fare-quasiquote-test"))), that delegates testing the current system to running tests on another system. Notice how the system itself depends-on another system, fare-utils, a collection of utility functions and macros from another project, whereas testing is specified to be done by fare-quasiquote-test, a system defined in a different file, fare-quasiquote-test.asd, within the same project.

The fare-utils.asd file, in its own project, looks like this (with a lot of elisions):

( defsystem "fare-utils" ... :components ( ( :file "package" ) ( :module "base" :depends-on ( "package" ) :components ( ( :file "utils" ) ( :file "strings" :depends-on ( "utils" ) ) ... ) ) ( :module "filesystem" :depends-on ( "base" ) :components ... ) ... ) )

This example illustrates the use of modules: The first component is a file package.lisp, that all other components depend on. Then, there is a module base; in absence of contrary declaration, it corresponds to directory base/; and it itself contains files utils.lisp, strings.lisp, etc. As you can see, dependencies name sibling components under the same parent system or module, that can themselves be files or modules.

1.1.3 Action Graph

The process of building software is modeled as a Directed Acyclic Graph (DAG) of actions, where each action is a pair of an operation and a component. The DAG defines a partial order, whereby each action must be performed, but only after all the actions it (transitively) depends-on have already been performed.

For instance, in fare-quasiquote above, the loading of (the output of compiling) quasiquote depends-on the compiling of quasiquote, which itself depends-on the loading of (the output of compiling) package, etc.

Importantly, though, this graph is distinct from the preceding graph of components: the graph of actions isn’t a mere refinement of the graph of components but a transformation of it that also incorporates crucial information about the structure of operations.

Unlike its immediate predecessor mk-defsystem, ASDF makes a plan of all actions needed to obtain an up-to-date version of the build output before it performs these actions. In ASDF itself, this plan is a topologically sorted list of actions to be performed sequentially: a total order that is a linear extension of the partial order of dependencies; performing the actions in that order ensures that the actions are always performed after the actions they depend on.

It’s of course possible to reify the complete DAG of actions rather than just extracting from it a single consistent ordered sequence. Andreas Fuchs did in 2006, in a small but quite brilliant ASDF extension called POIU, the "Parallel Operator on Independent Units". POIU compiles files in parallel on Unix multiprocessors using fork, while still loading them sequentially into a main image, minimizing latency. We later rewrote POIU, making it both more portable and simpler by co-developing it with ASDF. Understanding the many clever tricks by which Andreas Fuchs overcame the issues with the ASDF 1 model to compute such a complete DAG led to many aha moments, instrumental when writing ASDF 3 (see Appendix F: A traverse across the build).

Users can extend ASDF by defining new subclasses of operation and/or component and the methods that use them, or by using global, per-system, or per-component hooks.

ASDF is an "in-image" build system, in the Lisp defsystem tradition: it compiles (if necessary) and loads software into the current CL image, and can later update the current image by recompiling and reloading the components that have changed. For better and worse, this notably differs from common practice in most other languages, where the build system is a completely different piece of software running in a separate process.Of course, a build system could compile CL code in separate processes, for the sake of determinism and parallelism: our XCVB did (Brody 2009); so does the Google build system. On the one hand, it minimizes overhead to writing build system extensions. On the other hand, it puts great pressure on ASDF to remain minimal.

Qualitatively, ASDF must be delivered as a single source file and cannot use any external library, since it itself defines the code that may load other files and libraries. Quantitatively, ASDF must minimize its memory footprint, since it’s present in all programs that are built, and any resource spent is paid by each program.This arguably mattered more in 2002 when ASDF was first released and was about a thousand lines long: By 2014, it has grown over ten times in size, but memory sizes have increased even faster.

For all these reasons, ASDF follows the minimalist principle that anything that can be provided as an extension should be provided as an extension and left out of the core. Thus it cannot afford to support a persistence cache indexed by the cryptographic digest of build expressions, or a distributed network of workers, etc. However, these could conceivably be implemented as ASDF extensions.

1.2 Comparison to C programming practice

Most programmers are familiar with C, but not with CL. It’s therefore worth contrasting ASDF to the tools commonly used by C programmers to provide similar services. Note though how these services are factored in very different ways in CL and in C.

To build and load software, C programmers commonly use make to build the software and ld.so to load it. Additionally, they use a tool like autoconf to locate available libraries and identify their features.ASDF 3 also provides functionality which would correspond to small parts of the libc and of the linker ld. In many ways these C solutions are better engineered than ASDF. But in other important ways ASDF demonstrates how these C systems have much accidental complexity that CL does away with thanks to better architecture.

Lisp makes the full power of runtime available at compile-time, so it’s easy to implement a Domain-Specific Language (DSL): the programmer only needs to define new functionality, as an extension that is then seamlessly combined with the rest of the language, including other extensions. In C, the many utilities that need a DSL must grow it onerously from scratch; since the domain expert is seldom also a language expert with resources to do it right, this means plenty of mutually incompatible, misdesigned, power-starved, misimplemented languages that have to be combined through an unprincipled chaos of expensive yet inexpressive means of communication.

Lisp provides full introspection at runtime and compile-time alike, as well as a protocol to declare features and conditionally include or omit code or data based on them. Therefore you don’t need dark magic at compile-time to detect available features. In C, people resort to horribly unmaintainable configuration scripts in a hodge podge of shell script, m4 macros, C preprocessing and C code, plus often bits of python, perl, sed, etc.

ASDF possesses a standard and standardly extensible way to configure where to find the libraries your code depends on, further improved in ASDF 2. In C, there are tens of incompatible ways to do it, between libtool, autoconf, kde-config, pkg-config, various manual ./configure scripts, and countless other protocols, so that each new piece of software requires the user to learn a new ad hoc configuration method, making it an expensive endeavor to use or distribute libraries.

ASDF uses the very same mechanism to configure both runtime and compile-time, so there is only one configuration mechanism to learn and to use, and minimal discrepancy.There is still discrepancy inherent with these times being distinct: the installation or indeed the machine may have changed. In C, completely different, incompatible mechanisms are used at runtime (ld.so) and compile-time (unspecified), which makes it hard to match source code, compilation headers, static and dynamic libraries, requiring complex "software distribution" infrastructures (that admittedly also manage versioning, downloading and precompiling); this at times causes subtle bugs when discrepancies creep in.

Nevertheless, there are also many ways in which ASDF pales in comparison to other build systems for CL, C, Java, or other systems:

ASDF isn’t a general-purpose build system. Its relative simplicity is directly related to it being custom made to build CL software only. Seen one way, it’s a sign of how little you can get away with if you have a good basic architecture; a similarly simple solution isn’t available to most other programming languages, that require much more complex tools to achieve a similar purpose. Seen another way, it’s also the CL community failing to embrace the outside world and provide solutions with enough generality to solve more complex problems.ASDF 3 could be easily extended to support arbitrary build actions, if there were an according desire. But ASDF 1 and 2 couldn’t: their action graph was not general enough, being simplified and tailored for the common use case of compiling and loading Lisp code; and their ability to call arbitrary shell programs was a misdesigned afterthought (copied over from mk-defsystem) the implementation of which wasn’t portable, with too many corner cases.

At the other extreme, a build system for CL could have been made that is much simpler and more elegant than ASDF, if it could have required software to follow some simple organization constraints, without much respect for legacy code. A constructive proof of that is quick-build (Bridgewater 2012), being a fraction of the size of ASDF, itself a fraction of the size of ASDF 3, and with a fraction of the bugs — but none of the generality and extensibility (See package-inferred-system).

ASDF it isn’t geared at all to build large software in modern adversarial multi-user, multi-processor, distributed environments where source code comes in many divergent versions and in many configurations. It is rooted in an age-old model of building software in-image, what’s more in a traditional single-processor, single-machine environment with a friendly single user, a single coherent view of source code and a single target configuration. The new ASDF 3 design is consistent and general enough that it could conceivably be made to scale, but that would require a lot of work.

2 ASDF 3: A Mature Build

2.1 A Consistent, Extensible Model

Surprising as it may be to the CL programmers who used it daily, there was an essential bug at the heart of ASDF: it didn’t even try to propagate timestamps from one action to the next. And yet it worked, mostly. The bug was present from the very first day in 2001, and even before in mk-defsystem since 1990 (Kantrowitz 1990), and it survived till December 2012, despite all our robustification efforts since 2009 (Goldman 2010). Fixing it required a complete rewrite of ASDF’s core.

As a result, the object model of ASDF became at the same time more powerful, more robust, and simpler to explain. The dark magic of its traverse function is replaced by a well-documented algorithm. It’s easier than before to extend ASDF, with fewer limitations and fewer pitfalls: users may control how their operations do or don’t propagate along the component hierarchy. Thus, ASDF can now express arbitrary action graphs, and could conceivably be used in the future to build more than just CL programs.

The proof of a good design is in the ease of extending it. And in CL, extension doesn’t require privileged access to the code base. We thus tested our design by adapting the most elaborate existing ASDF extensions to use it. The result was indeed cleaner, eliminating the previous need for overrides that redefined sizable chunks of the infrastructure. Chronologically, however, we consciously started this porting process in interaction with developing ASDF 3, thus ensuring ASDF 3 had all the extension hooks required to avoid redefinitions.

See the entire story in Appendix F: A traverse across the build.

2.2 Bundle Operations

Bundle operations create a single output file for an entire system or collection of systems. The most directly user-facing bundle operations are compile-bundle-op and load-bundle-op: the former bundles into a single compilation file all the individual outputs from the compile-op of each source file in a system; the latter loads the result of the former. Also lib-op links into a library all the object files in a system and dll-op creates a dynamically loadable library out of them. The above bundle operations also have so-called monolithic variants that bundle all the files in a system and all its transitive dependencies.

Bundle operations make delivery of code much easier. They were initially introduced as asdf-ecl, an extension to ASDF specific to the implementation ECL, back in the day of ASDF 1.Most CL implementations maintain their own heap with their own garbage collector, and then are able to dump an image of the heap on disk, that can be loaded back in a new process with all the state of the former process. To build an application, you thus start a small initial image, load plenty of code, dump an image, and there you are. ECL, instead, is designed to be easily embeddable in a C program; it uses the popular C garbage collector by Hans Boehm & al., and relies on linking and initializer functions rather than on dumping. To build an application with ECL (or its variant MKCL), you thus link all the libraries and object files together, and call the proper initialization functions in the correct order. Bundle operations are important to deliver software using ECL as a library to be embedded in some C program. Also, because of the overhead of dynamic linking, loading a single object file is preferable to a lot of smaller object files. asdf-ecl was distributed with ASDF 2, though in a way that made upgrade slightly awkward to ECL users, who had to explicitly reload it after upgrading ASDF, even though it was included by the initial (require "asdf"). In May 2012, it was generalized to other implementations as the external system asdf-bundle. It was then merged into ASDF during the development of ASDF 3 (2.26.7, December 2012): not only did it provide useful new operations, but the way that ASDF 3 was automatically upgrading itself for safety purposes (see Upgradability) would otherwise have broken things badly for ECL users if the bundle support weren’t itself bundled with ASDF.

In ASDF 3.1, using deliver-asd-op, you can create both the bundle from compile-bundle-op and an .asd file to use to deliver the system in binary format only.

Note that compile-bundle-op, load-bundle-op and deliver-asd-op were respectively called fasl-op, load-fasl-op and binary-op in the original asdf-ecl and its successors up until ASDF 3.1. But those were bad names, since every individual compile-op has a fasl (a fasl, for FASt Loading, is a CL compilation output file), and since deliver-asd-op doesn’t generate a binary. They were eventually renamed, with backward compatibility stubs left behind under the old name.

2.3 Understandable Internals

After bundle support was merged into ASDF (see Bundle Operations above), it became trivial to implement a new concatenate-source-op operation. Thus ASDF could be developed as multiple files, which would improve maintainability. For delivery purpose, the source files would be concatenated in correct dependency order, into the single file asdf.lisp required for bootstrapping.

The division of ASDF into smaller, more intelligible pieces had been proposed shortly after we took over ASDF; but we had rejected the proposal then on the basis that ASDF must not depend on external tools to upgrade itself from source, another strong requirement (see Upgradability). With concatenate-source-op, an external tool wasn’t needed for delivery and regular upgrade, only for bootstrap. Meanwhile this division had also become more important, since ASDF had grown so much, having almost tripled in size since those days, and was promising to grow some more. It was hard to navigate that one big file, even for the maintainer, and probably impossible for newcomers to wrap their head around it.

To bring some principle to this division (2.26.62), we followed the principle of one file, one package, as demonstrated by faslpath (Etter 2009) and quick-build (Bridgewater 2012), though not yet actively supported by ASDF itself (see package-inferred-system). This programming style ensures that files are indeed providing related functionality, only have explicit dependencies on other files, and don’t have any forward dependencies without special declarations. Indeed, this was a great success in making ASDF understandable, if not by newcomers, at least by the maintainer himself;On the other hand, a special setup is now required for the debugger to locate the actual source code in ASDF; but this price is only paid by ASDF maintainers. this in turn triggered a series of enhancements that would not otherwise have been obvious or obviously correct, illustrating the principle that good code is code you can understand, organized in chunks you can each fit in your brain.

2.4 Package Upgrade

Preserving the hot upgradability of ASDF was always a strong requirement (see Upgradability). In the presence of this package refactoring, this meant the development of a variant of CL’s defpackage that plays nice with hot upgrade: define-package. Whereas the former isn’t guaranteed to work and may signal an error when a package is redefined in incompatible ways, the latter will update an old package to match the new desired definition while recycling existing symbols from that and other packages.

Thus, in addition to the regular clauses from defpackage, define-package accepts a clause :recycle: it attempts to recycle each declared symbol from each of the specified packages in the given order. For idempotence, the package itself must be the first in the list. For upgrading from an old ASDF, the :asdf package is always named last. The default recycle list consists in a list of the package and its nicknames.

New features also include :mix and :reexport. :mix mixes imported symbols from several packages: when multiple packages export symbols with the same name, the conflict is automatically resolved in favor of the package named earliest, whereas an error condition is raised when using the standard :use clause. :reexport reexports the same symbols as imported from given packages, and/or exports instead the same-named symbols that shadow them. ASDF 3.1 adds :mix-reexport and :use-reexport, which combine :reexport with :mix or :use in a single statement, which is more maintainable than repeating a list of packages.

2.5 Portability Layer

Splitting ASDF into many files revealed that a large fraction of it was already devoted to general purpose utilities. This fraction only grew under the following pressures: a lot of opportunities for improvement became obvious after dividing ASDF into many files; features added or merged in from previous extensions and libraries required new general-purpose utilities; as more tests were added for new features, and were run on all supported implementations, on multiple operating systems, new portability issues cropped up that required development of robust and portable abstractions.

The portability layer, after it was fully documented, ended up being slightly bigger than the rest of ASDF. Long before that point, ASDF was thus formally divided in two: this portability layer, and the defsystem itself. The portability layer was initially dubbed asdf-driver, because of merging in a lot of functionality from xcvb-driver. Because users demanded a shorter name that didn’t include ASDF, yet would somehow be remindful of ASDF, it was eventually renamed UIOP: the Utilities for Implementation- and OS- PortabilityU, I, O and P are also the four letters that follow QWERTY on an anglo-saxon keyboard.. It was made available separately from ASDF as a portability library to be used on its own; yet since ASDF still needed to be delivered as a single file asdf.lisp, UIOP was transcluded inside that file, now built using the monolithic-concatenate-source-op operation. At Google, the build system actually uses UIOP for portability without the rest of ASDF; this led to UIOP improvements that will be released with ASDF 3.1.2.

Most of the utilities deal with providing sane pathname abstractions (see Appendix C: Pathnames), filesystem access, sane input/output (including temporary files), basic operating system interaction — many things for which the CL standard lacks. There is also an abstraction layer over the less-compatible legacy implementations, a set of general-purpose utilities, and a common core for the ASDF configuration DSLs.ASDF 3.1 notably introduces a nest macro that nests arbitrarily many forms without indentation drifting ever to the right. It makes for more readable code without sacrificing good scoping discipline. Importantly for a build system, there are portable abstractions for compiling CL files while controlling all the warnings and errors that can occur, and there is support for the life-cycle of a Lisp image: dumping and restoring images, initialization and finalization hooks, error handling, backtrace display, etc. However, the most complex piece turned out to be a portable implementation of run-program.

With ASDF 3, you can run external commands as follows:

( run-program ` ( "cp" "-lax" "--parents" "src/foo" , destination ) )

On Unix, this recursively hardlinks files in directory src/foo into a directory named by the string destination , preserving the prefix src/foo . You may have to add :output t :error-output t to get error messages on your *standard-output* and *error-output* streams, since the default value, nil , designates /dev/null . If the invoked program returns an error code, run-program signals a structured CL error , unless you specified :ignore-error-status t .

This utility is essential for ASDF extensions and CL code in general to portably execute arbitrary external programs. It was a challenge to write: Each implementation provided a different underlying mechanism with wildly different feature sets and countless corner cases. The better ones could fork and exec a process and control its standard-input, standard-output and error-output; lesser ones could only call the system(3) C library function. Moreover, Windows support differed significantly from Unix. ASDF 1 itself actually had a run-shell-command, initially copied over from mk-defsystem, but it was more of an attractive nuisance than a solution, despite our many bug fixes: it was implicitly calling format; capturing output was particularly contrived; and what shell would be used varied between implementations, even more so on Windows.Actually, our first reflex was to declare the broken run-shell-command deprecated, and move run-program to its own separate system. However, after our then co-maintainer (and now maintainer) Robert Goldman insisted that run-shell-command was required for backward compatibility and some similar functionality expected by various ASDF extensions, we decided to provide the real thing rather than this nuisance, and moved from xcvb-driver the nearest code there was to this real thing, that we then extended to make it more portable, robust, etc., according to the principle: Whatever is worth doing at all is worth doing well (Chesterfield).

ASDF 3’s run-program is full-featured, based on code originally from XCVB ’s xcvb-driver (Brody ASDF 3.0.3, it can also control the standard-input and error-output. It accepts either a list of a program and arguments, or a shell command string. Thus your previous program could have been: 3’sis full-featured, based on code originally from’s 2009 ). It abstracts away all these discrepancies to provide control over the program’s standard-output, using temporary files underneath if needed. Since3.0.3, it can also control the standard-input and error-output. It accepts either a list of a program and arguments, or a shell command string. Thus your previous program could have been:

( run-program ( format nil "cp -lax --parents src/foo ~S" ( native-namestring destination ) ) :output t :error-output t )

where (UIOP)’s native-namestring converts the pathname object destination into a name suitable for use by the operating system, as opposed to a CL namestring that might be escaped somehow.

You can also inject input and capture output:

( run-program ' ( "tr" "a-z" "n-za-m" ) :input ' ( "uryyb, jbeyq" ) :output :string )

returns the string "hello, world" . It also returns secondary and tertiary values nil and 0 respectively, for the (non-captured) error-output and the (successful) exit code.

run-program only provides a basic abstraction; a separate system inferior-shell was written on top of UIOP , and provides a richer interface, handling pipelines, zsh style redirections, splicing of strings and/or lists into the arguments, and implicit conversion of pathnames into native-namestrings, of symbols into downcased strings, of keywords into downcased strings with a - - prefix. Its short-named functions run , run/nil , run/s , run/ss , respectively run the external command with outputs to the Lisp standard- and error- output, with no output, with output to a string, or with output to a stripped string. Thus you could get the same result as previously with:

( run/ss ' ( pipe ( echo ( uryyb ", " jbeyq ) ) ( tr a-z ( n-z a-m ) ) ) )

Or to get the number of processors on a Linux machine, you can:

( run ' ( grep -c "^processor.:" ( < /proc/cpuinfo ) ) :output #' read )

2.7 Configuration Management

ASDF always had minimal support for configuration management. ASDF 3 doesn’t introduce radical change, but provides more usable replacements or improvements for old features.

For instance, ASDF 1 had always supported version-checking: each component (usually, a system) could be given a version string with e.g. :version "3.1.0.97", and ASDF could be told to check that dependencies of at least a given version were used, as in :depends-on ((:version "inferior-shell" "2.0.0")). This feature can detect a dependency mismatch early, which saves users from having to figure out the hard way that they need to upgrade some libraries, and which.

Now, ASDF always required components to use "semantic versioning", where versions are strings made of dot-separated numbers like 3.1.0.97. But it didn’t enforce it, leading to bad surprises for the users when the mechanism was expected to work, but failed. ASDF 3 issues a warning when it finds a version that doesn’t follow the format. It would actually have issued an error, if that didn’t break too many existing systems.

Another problem with version strings was that they had to be written as literals in the .asd file, unless that file took painful steps to extract it from another source file. While it was easy for source code to extract the version from the system definition, some authors legitimately wanted their code to not depend on ASDF itself. Also, it was a pain to repeat the literal version and/or the extraction code in every system definition in a project. ASDF 3 can thus extract version information from a file in the source tree, with, e.g. :version (:read-file-line "version.text") to read the version as the first line of file version.text. To read the third line, that would have been :version (:read-file-line "version.text" :at 2) (mind the off-by-one error in the English language). Or you could extract the version from source code. For instance, poiu.asd specifies :version (:read-file-form "poiu.lisp" :at (1 2 2)) which is the third subform of the third subform of the second form in the file poiu.lisp. The first form is an in-package and must be skipped. The second form is an (eval-when (...) body...) the body of which starts with a (defparameter *poiu-version* ...) form. ASDF 3 thus solves this version extraction problem for all software — except itself, since its own version has to be readable by ASDF 2 as well as by who views the single delivery file; thus its version information is maintained by a management script using regexps, of course written in CL.

Another painful configuration management issue with ASDF 1 and 2 was lack of a good way to conditionally include files depending on which implementation is used and what features it supports. One could always use CL reader conditionals such as #+(or sbcl clozure) but that means that ASDF could not even see the components being excluded, should some operation be invoked that involves printing or packaging the code rather than compiling it — or worse, should it involve cross-compilation for another implementation with a different feature set. There was an obscure way for a component to declare a dependency on a :feature, and annotate its enclosing module with :if-component-dep-fails :try-next to catch the failure and keep trying. But the implementation was a kluge in traverse that short-circuited the usual dependency propagation and had exponential worst case performance behavior when nesting such pseudo-dependencies to painfully emulate feature expressions.

ASDF 3 gets rid of :if-component-dep-fails: it didn’t fit the fixed dependency model at all. A limited compatibility mode without nesting was preserved to keep processing old versions of SBCL. As a replacement, ASDF 3 introduces a new option :if-feature in component declarations, such that a component is only included in a build plan if the given feature expression is true during the planning phase. Thus a component annotated with :if-feature (:and :sbcl (:not :sb-unicode)) (and its children, if any) is only included on an SBCL without Unicode support. This is more expressive than what preceded, without requiring inconsistencies in the dependency model, and without pathological performance behavior.

2.8 Standalone Executables

One of the bundle operations contributed by the ECL team was program-op, that creates a standalone executable. As this was now part of ASDF 3, it was only natural to bring other ASDF-supported implementations up to par: CLISP, Clozure CL, CMUCL, LispWorks, SBCL, SCL. Thus UIOP features a dump-image function to dump the current heap image, except for ECL and its successors that follow a linking model and use a create-image function. These functions were based on code from xcvb-driver, which had taken them from cl-launch.

ASDF 3 also introduces a defsystem option to specify an entry point as e.g. :entry-point "my-package:entry-point". The specified function (designated as a string to be read after the package is created) is called without arguments after the program image is initialized; after doing its own initializations, it can explicitly consult *command-line-arguments*In CL, most variables are lexically visible and statically bound, but special variables are globally visible and dynamically bound. To avoid subtle mistakes, the latter are conventionally named with enclosing asterisks, also known in recent years as earmuffs. or pass it as an argument to some main function.

Our experience with a large application server at ITA Software showed the importance of hooks so that various software components may modularly register finalization functions to be called before dumping the image, and initialization functions to be called before calling the entry point. Therefore, we added support for image life-cycle to UIOP. We also added basic support for running programs non-interactively as well as interactively based on a variable *lisp-interaction*: non-interactive programs exit with a backtrace and an error message repeated above and below the backtrace, instead of inflicting a debugger on end-users; any non-nil return value from the entry-point function is considered success and nil failure, with an appropriate program exit status.

Starting with ASDF 3.1, implementations that don’t support standalone executables may still dump a heap image using the image-op operation, and a wrapper script, e.g. created by cl-launch, can invoke the program; delivery is then in two files instead of one. image-op can also be used by all implementations to create intermediate images in a staged build, or to provide ready-to-debug images for otherwise non-interactive applications.

Running Lisp code to portably create executable commands from Lisp is great, but there is a bootstrapping problem: when all you can assume is the Unix shell, how are you going to portably invoke the Lisp code that creates the initial executable to begin with?

We solved this problem some years ago with cl-launch. This bilingual program, both a portable shell script and a portable CL program, provides a nice colloquial shell command interface to building shell commands from Lisp code, and supports delivery as either portable shell scripts or self-contained precompiled executable files.cl-launch and the scripts it produces are bilingual: the very same file is accepted by both language processors. This is in contrast to self-extracting programs, where pieces written in multiple languages have to be extracted first before they may be used, which incurs a setup cost and is prone to race conditions.

Its latest incarnation, cl-launch 4 (March 2014), was updated to take full advantage of ASDF 3. Its build specification interface was made more general, and its Unix integration was improved. You may thus invoke Lisp code from a Unix shell:

cl -sp lisp-stripper \ -i "(print-loc-count \"asdf.lisp\")"

You can also use cl-launch as a script "interpreter", except that it invokes a Lisp compiler underneath: The Unix expert may note that despite most kernels coalescing all arguments on the first line (stripped to 128 characters or so) into a single argument, cl-launch detects such situations and properly restores and interprets the command-line.

#!/usr/bin/cl -sp lisp-stripper -E main (defun main (argv) (if argv (map () 'print-loc-count argv) (print-loc-count *standard-input*)))

In the examples above, option -sp, shorthand for --system-package, simultaneously loads a system using ASDF during the build phase, and appropriately selects the current package; -i, shorthand for --init evaluates a form at the start of the execution phase; -E, shorthand for --entry configures a function that is called after init forms are evaluated, with the list of command-line arguments as its argument.Several systems are available to help you define an evaluator for your command-line argument DSL: command-line-arguments, clon, lisp-gflags. As for lisp-stripper, it’s a simple library that counts lines of code after removing comments, blank lines, docstrings, and multiple lines in strings.

cl-launch automatically detects a CL implementation installed on your machine, with sensible defaults. You can easily override all defaults with a proper command-line option, a configuration file, or some installation-time configuration. See cl-launch --more-help for complete information. Note that cl-launch is on a bid to homestead the executable path /usr/bin/cl on Linux distributions; it may slightly more portably be invoked as cl-launch.

A nice use of cl-launch is to compare how various implementations evaluate some form, to see how portable it is in practice, whether the standard mandates a specific result or not:

for l in sbcl ccl clisp cmucl ecl abcl \ scl allegro lispworks gcl xcl ; do cl -l $l -i \ '(format t "'$l': ~S~%" `#5(1 ,@`(2 3)))' \ 2>&1 | grep "^$l:" # LW, GCL are verbose done

cl-launch compiles all the files and systems that are specified, and keeps the compilation results in the same output-file cache as ASDF 3, nicely segregating them by implementation, version, ABI, etc.Historically, it’s more accurate to say that ASDF imported the cache technology previously implemented by cl-launch, which itself generalized it from common-lisp-controller. Therefore, the first time it sees a given file or system, or after they have been updated, there may be a startup delay while the compiler processes the files; but subsequent invocations will be faster as the compiled code is directly loaded. This is in sharp contrast with other "scripting" languages, that have to slowly interpret or recompile everytime. For security reasons, the cache isn’t shared between users.

ASDF 3.1 introduces a new extension package-inferred-system that supports a one-file, one-package, one-system style of programming. This style was pioneered by faslpath (Etter 2009) and more recently quick-build (Bridgewater 2012). This extension is actually compatible with the latter but not the former, for ASDF 3.1 and quick-build use a slash "/" as a hierarchy separator where faslpath used a dot ".".

This style consists in every file starting with a defpackage or define-package form; from its :use and :import-from and similar clauses, the build system can identify a list of packages it depends on, then map the package names to the names of systems and/or other files, that need to be loaded first. Thus package name lil/interface/all refers to the file interface/all.lisp Since in CL, packages are traditionally designated by symbols that are themselves traditionally upcased, case information is typically not meaningful. When converting a package name to system name or filename, we downcase the package names to follow modern convention. under the hierarchy registered by system lil , defined as follows in lil.asd as using class package-inferred-system :

( defsystem "lil" ... :description "LIL: Lisp Interface Library" :class :package-inferred-system :defsystem-depends-on ( "asdf-package-system" ) :depends-on ( "lil/interface/all" "lil/pure/all" ... ) ... )

The :defsystem-depends-on ("asdf-package-system") is an external extension that provides backward compatibility with ASDF 3.0, and is part of Quicklisp. Because not all package names can be directly mapped back to a system name, you can register new mappings for package-inferred-system . The lil.asd file may thus contain forms such as:

( register-system-packages :closer-mop ' ( :c2mop :closer-common-lisp :c2cl ... ) )

Then, a file interface/order.lisp under the lil hierarchy, that defines abstract interfaces for order comparisons, starts with the following form, dependencies being trivially computed from the :use and :mix clauses:

( uiop:define-package :lil/interface/order ( :use :closer-common-lisp :lil/interface/definition :lil/interface/base :lil/interface/eq :lil/interface/group ) ( :mix :fare-utils :uiop :alexandria ) ( :export ... ) )

This style provides many maintainability benefits: by imposing upon programmers a discipline of smaller namespaces, with explicit dependencies and especially explicit forward dependencies, the style encourages good factoring of the code into coherent units; by contrast, the traditional style of "everything in one package" has low overhead but doesn’t scale very well. ASDF itself was rewritten in this style as part of ASDF 2.27, the initial ASDF 3 pre-release, with very positive results.

Since it depends on ASDF 3, package-inferred-system isn’t as lightweight as quick-build, which is almost two orders of magnitude smaller than ASDF 3. But it does interoperate perfectly with the rest of ASDF, from which it inherits the many features, the portability, and the robustness.

2.11 Restoring Backward Compatibility

ASDF 3 had to break compatibility with ASDF 1 and 2: all operations used to be propagated sideway and downward along the component DAG (see Appendix F: A traverse across the build). In most cases this was undesired; indeed, ASDF 3 is predicated upon a new operation prepare-op that instead propagates upward.Sideway means the action of operation o on component c depends-on the action of o (or another operation) on each of the declared dependencies of c. Downward means that it depends-on the action of o on each of c’s children; upward, on c’s parent (enclosing module or system). Most existing ASDF extensions thus included workarounds and approximations to deal with the issue. But a handful of extensions did expect this behavior, and now they were broken.

Before the release of ASDF 3, authors of all known ASDF extensions distributed by Quicklisp had been contacted, to make their code compatible with the new fixed model. But there was no way to contact unidentified authors of proprietary extensions, beside sending an announcement to the mailing-list. Yet, whatever message was sent didn’t attract enough attention. Even our co-maintainer Robert Goldman got bitten hard when an extension used at work stopped working, wasting days to figure out the issue.

Therefore, ASDF 3.1 features enhanced backward-compatibility. The class operation implements sideway and downward propagation on all classes that do not explicitly inherit from any of the propagating mixins downward-operation , upward-operation , sideway-operation or selfward-operation , unless they explicitly inherit from the new mixin non-propagating-operation . ASDF 3.1 signals a warning at runtime when an operation class is instantiated that doesn’t inherit from any of the above mixins, which will hopefully tip off authors of a proprietary extension that it’s time to fix their code. To tell ASDF 3.1 that their operation class is up-to-date, extension authors may have to define their non-propagating operations as follows:

(defclass my-op (#+asdf3.1 non-propagating-operation operation) ())

This is a case of "negative inheritance", a technique usually frowned upon, for the explicit purpose of backward compatibility. Now ASDF cannot use the CLOS Meta-Object Protocol (MOP), because it hasn’t been standardized enough to be portably used without using an abstraction library such as closer-mop, yet ASDF cannot depend on any external library, and this is too small an issue to justify making a sizable MOP library part of UIOP. Therefore, the negative inheritance is implemented in an ad hoc way at runtime.

3 Code Evolution in a Conservative Community

3.1 Feature Creep? No, Mission Creep

Throughout the many features added and tenfold increase in size from ASDF 1 to ASDF 3, ASDF remained true to its minimalism — but the mission, relative to which the code remains minimal, was extended, several times: In the beginning, ASDF was the simplest extensible variant of defsystem that builds CL software (see Appendix A: ASDF 1, a defsystem for CL). With ASDF 2, it had to be upgradable, portable, modularly configurable, robust, performant, usable (see Appendix B: ASDF 2, or Productizing ASDF). Then it had to be more declarative, more reliable, more predictable, and capable of supporting language extensions (see Appendix D: ASDF 2.26, more declarative). Now, ASDF 3 has to support a coherent model for representing dependencies, an alternative one-package-per-file style for declaring them, software delivery as either scripts or binaries, a documented portability layer including image life-cycle and external program invocation, etc. (see ASDF 3: A Mature Build).

3.2 Backward Compatibility is Social, not Technical

As efforts were made to improve ASDF, a constant constraint was that of backward compatibility: every new version of ASDF had to be compatible with the previous one, i.e. systems that were defined using previous versions had to keep working with new versions. But what more precisely is backward compatibility?

In an overly strict definition that precludes any change in behavior whatsoever, even the most uncontroversial bug fix isn’t backward-compatible: any change, for the better as it may be, is incompatible, since by definition, some behavior has changed!

One might be tempted to weaken the constraint a bit, and define "backward compatible" as being the same as a "conservative extension": a conservative extension may fix erroneous situations, and give new meaning to situations that were previously undefined, but may not change the meaning of previously defined situations. Yet, this definition is doubly unsatisfactory. On the one hand, it precludes any amendment to previous bad decisions; hence, the jest if it’s not backwards, it’s not compatible. On the other hand, even if it only creates new situations that work correctly where they were previously in error, some existing analysis tool might assume these situations could never arise, and be confused when they now do.

Indeed this happened when ASDF 3 tried to better support secondary systems. ASDF looks up systems by name: if you try to load system foo, ASDF will search in registered directories for a file call foo.asd. Now, it was common practice that programmers may define multiple "secondary" systems in a same .asd file, such as a test system foo-test in addition to foo. This could lead to "interesting" situations when a file foo-test.asd existed, from a different, otherwise shadowed, version of the same library, resulting in a mismatch between the system and its tests.Even more "interesting" was a case when you’d load your foo.asd, which would define a secondary system foo-test, at the mere reference of which ASDF would try to locate a canonical definition; it would not find a foo-test.asd, instead Quicklisp might tell it to load it from its own copy of foo.asd, at which the loading of which would refer to system foo, for which ASDF would look at the canonical definition in its own foo.asd, resulting in an infinite loop. ASDF 2 was robustified against such infinite loops by memoizing the location of the canonical definition for systems being defined. To make these situations less likely, ASDF 3 recommends that you name your secondary system foo/test instead of of foo-test, which should work just as well in ASDF 2, but with reduced risk of clash. Moreover, ASDF 3 can recognize the pattern and automatically load foo.asd when requested foo/test, in a way guaranteed not to clash with previous usage, since no directory could contain a file thus named in any modern operating system. In contrast, ASDF 2 has no way to automatically locate the .asd file from the name of a secondary system, and so you must ensure that you loaded the primary .asd file before you may use the secondary system. This feature may look like a textbook case of a backward-compatible "conservative extension". Yet, it’s the major reason why Quicklisp itself still hasn’t adopted ASDF 3: Quicklisp assumed it could always create a file named after each system, which happened to be true in practice (though not guaranteed) before this ASDF 3 innovation; systems that newly include secondary systems using this style break this assumption, and will require non-trivial work for Quicklisp to support.

What then, is backward compatibility? It isn’t a technical constraint. Backward compatibility is a social constraint. The new version is backward compatible if the users are happy. This doesn’t mean matching the previous version on all the mathematically conceivable inputs; it means improving the results for users on all the actual inputs they use; or providing them with alternate inputs they may use for improved results.

3.3 Weak Synchronization Requires Incremental Fixes

Even when some "incompatible" changes are not controversial, it’s often necessary to provide temporary backward compatible solutions until all the users can migrate to the new design. Changing the semantics of one software system while other systems keep relying on it is akin to changing the wheels on a running car: you cannot usually change them all at once, at some point you must have both kinds active, and you cannot remove the old ones until you have stopped relying on them. Within a fast moving company, such migration of an entire code base can happen in a single checkin. If it’s a large company with many teams, the migration can take many weeks or months. When the software is used by a weakly synchronized group like the CL community, the migration can take years.

When releasing ASDF 3, we spent a few months making sure that it would work with all publicly available systems. We had to fix many of these systems, but mostly, we were fixing ASDF 3 itself to be more compatible. Indeed, several intended changes had to be forsaken, that didn’t have an incremental upgrade path, or for which it proved infeasible to fix all the clients.

A successful change was notably to modify the default encoding from the uncontrolled environment-dependent :default to the de facto standard :utf-8; this happened a year after adding support for encodings and :utf-8 was added, and having forewarned community members of the future change in defaults, yet a few systems still had to be fixed (see Encoding Support).

On the other hand, an unsuccessful change was the attempt to enable an innovative system to control warnings issued by the compiler. First, the *uninteresting-conditions* mechanism allows system builders to hush the warnings they know they don’t care for, so that any compiler output is something they care for, and whatever they care for isn’t drowned into a sea of uninteresting output. The mechanism itself is included in ASDF 3, but disabled by default, because there was no consensually agreeable value except an empty set, and no good way (so far) to configure it both modularly and without pain. Second, another related mechanism that was similarly disabled is deferred-warnings, whereby ASDF can check warnings that are deferred by SBCL or other compilers until the end of the current compilation-unit. These warnings notably include forward references to functions and variables. In the previous versions of ASDF, these warnings were output at the end of the build the first time a file was built, but not checked, and not displayed afterward. If in ASDF 3 you (uiop:enable-deferred-warnings), these warnings are displayed and checked every time a system is compiled or loaded. These checks help catch more bugs, but enabling them prevents the successful loading of a lot of systems in Quicklisp that have such bugs, even though the functionality for which these systems are required isn’t affected by these bugs. Until there exists some configuration system that allows developers to run all these checks on new code without having them break old code, the feature will have to remain disabled by default.

3.4 Underspecification Creates Portability Landmines

The CL standard leaves many things underspecified about pathnames in an effort to define a useful subset common to many then-existing implementations and filesystems. However, the result is that portable programs can forever only access but a small subset of the complete required functionality. This result arguably makes the standard far less useful than expected (see Appendix C: Pathnames). The lesson is don’t standardize partially specified features. It’s better to standardize that some situations cause an error, and reserve any resolution to a later version of the standard (and then follow up on it), or to delegate specification to other standards, existing or future.

There could have been one pathname protocol per operating system, delegated to the underlying OS via a standard FFI. Libraries could then have sorted out portability over N operating systems. Instead, by standardizing only a common fragment and letting each of M implementations do whatever it can on each operating system, libraries now have to take into account N*M combinations of operating systems and implementations. In case of disagreement, it’s much better to let each implementation’s variant exist in its own, distinct namespace, which avoids any confusion, than have incompatible variants in the same namespace, causing clashes.

Interestingly, the aborted proposal for including defsystem in the CL standard was also of the kind that would have specified a minimal subset insufficient for large scale use while letting the rest underspecified. The CL community probably dodged a bullet thanks to the failure of this proposal.

3.5 Safety before Ubiquity

Guy Steele has been quoted as vaunting the programmability of Lisp’s syntax by saying: If you give someone Fortran, he has Fortran. If you give someone Lisp, he has any language he pleases. Unhappily, if he were speaking about CL specifically, he would have had to add: but it can’t be the same as anyone else’s.

Indeed, syntax in CL is controlled via a fuzzy set of global variables, prominently including the *readtable*. Making non-trivial modifications to the variables and/or tables is possible, but letting these modifications escape is a serious issue; for the author of a system has no control over which systems will or won’t be loaded before or after his system — this depends on what the user requests and on what happens to already have been compiled or loaded. Therefore in absence of further convention, it’s always a bug to either rely on the syntax tables having non-default values from previous systems, or to inflict non-default values upon next systems. What is worse, changing syntax is only useful if it also happens at the interactive REPL and in appropriate editor buffers. Yet these interactive syntax changes can affect files built interactively, including, upon modification, components that do not depend on the syntax support, or worse, that the syntax support depends on; this can cause catastrophic circular dependencies, and require a fresh start after having cleared the output file cache. Systems like named-readtables or cl-syntax help with syntax control, but proper hygiene isn’t currently enforced by either CL or ASDF, and remains up to the user, especially at the REPL.

Build support is therefore strongly required for safe syntax modification; but this build support isn’t there yet in ASDF 3. For backward-compatibility reasons, ASDF will not enforce strict controls on the syntax, at least not by default. But it is easy to enforce hygiene by binding read-only copies of the standard syntax tables around each action. A more backward-compatible behavior is to let systems modify a shared readtable, and leave the user responsible for ensuring that all modifications to this readtable used in a given image are mutually compatible; ASDF can still bind the current *readtable* to that shared readtable around every compilation, to at least ensure that selection of an incompatible readtable at the REPL does not pollute the build. A patch to this effect is pending acceptance by the new maintainer. Note that for full support of readtable modification, other tools beside ASDF will have to be updated too: SLIME, the Emacs mode for CL, as well as its competitors for VIM, climacs, hemlock, CCL-IDE, etc.

Until such issues are resolved, even though the Lisp ideal is one of ubiquitous syntax extension, and indeed extension through macros is ubiquitous, extension though reader changes are rare in the CL community. This is in contrast with other Lisp dialects, such as Racket, that have succeeded at making syntax customization both safe and ubiquitous, by having it be strictly scoped to the current file or REPL. Any language feature has to be safe before it may be ubiquitous; if the semantics of a feature depend on circumstances beyond the control of system authors, such as the bindings of syntax variables by the user at his REPL, then these authors cannot depend on this feature.

3.6 Final Lesson: Explain it

While writing this article, we had to revisit many concepts and pieces of code, which led to many bug fixes and refactorings to ASDF and cl-launch. An earlier interactive "ASDF walk-through" via Google Hangout also led to enhancements. Our experience illustrates the principle that you should always explain your programs: having to intelligibly verbalize the concepts will make you understand them better.

Appendices

4 Appendix A: ASDF 1, a defsystem for CL

4.1 A brief history of ASDF

In the early history of Lisp, back when core memory was expensive, all programs fit in a deck of punched cards. As computer systems grew, they became files on a tape, or, if you had serious money, on a disk. As they kept growing, programs would start to use libraries, and not be made of a single file; then you’d just write a quick script that loaded the few files your code depended on. As software kept growing, manual scripts proved unwieldy, and people started developing build systems. A popular one, since the late 1970s, was make.

Ever since the late 1970s, Lisp Machines had a build system called DEFSYSTEM. In the 1990s, a portable free software reimplementation, mk-defsystem, became somewhat popular. By 2001, it had grown crufty and proved hard to extend, so Daniel Barlow created his own variant, ASDF, that he published in 2002, and that became an immediate success. Dan’s ASDF was an experiment in many ways, and was notably innovative in its extensible object-oriented API and its clever way of locating software. See Appendix A: ASDF 1, a defsystem for CL.

By 2009, Dan’s ASDF 1 was used by hundreds of software systems on many CL implementations; however, its development cycle was dysfunctional, its bugs were not getting fixed, those bug fixes that existed were not getting distributed, and configuration was noticeably harder that it should have been. Dan abandoned CL development and ASDF around May 2004; ASDF was loosely maintained until Gary King stepped forward in May 2006. After Gary King resigned in November 2009, we took over his position, and produced a new version ASDF 2, released in May 2010, that turned ASDF from a successful experiment to a product, making it upgradable, portable, configurable, robust, performant and usable. See Appendix B: ASDF 2, or Productizing ASDF.

The biggest hurdle in productizing ASDF was related to dealing with CL pathnames. We explain a few salient issues in Appendix C: Pathnames.

While maintaining ASDF 2, we implemented several new features that enabled a more declarative style in using ASDF and CL in general: declarative use of build extensions, selective control of the build, declaration of file encoding, declaration of hooks around compilation, enforcement of user-defined invariants. See Appendix D: ASDF 2.26, more declarative.

Evolving ASDF was not monotonic progress, we also had many failures along the way, from which lessons can be learned. See Appendix E: Failed Attempts at Improvement.

After fixing all the superficial bugs in ASDF, we found there remained but a tiny bug in its core. But the bug ended up being a deep conceptual issue with its dependency model; fixing it required a complete reorganization of the code, yielding ASDF 3. See Appendix F: A traverse across the build.

4.2 DEFSYSTEM before ASDF

Ever since the late 1970s, Lisp implementations have each been providing their variant of the original Lisp Machine DEFSYSTEM (Moon 1981). These build systems allowed users to define systems, units of software development made of many files, themselves often grouped into modules; many operations were available to transform those systems and files, mainly to compile the files and to load them, but also to extract and print documentation, to create an archive, issue hot patches, etc.; DEFSYSTEM users could further declare dependency rules between operations on those files, modules and systems, such that files providing definitions should be compiled and loaded before files using those definitions.

Since 1990, the state of the art in free software CL build systems was mk-defsystem (Kantrowitz 1990).The variants of DEFSYSTEM available on each of the major proprietary CL implementations (Allegro, LispWorks, and formerly, Genera), seem to have been much better than mk-defsystem. But they were not portable, not mutually compatible, and not free software, and therefore mk-defsystem became de facto standard for free software. Like late 1980s variants of DEFSYSTEM on all Lisp systems, it featured a declarative model to define a system in terms of a hierarchical tree of components, with each component being able to declare dependencies on other components. The many subtle rules constraining build operations could be automatically deduced from these declarations, instead of having to be manually specified by users.

However, mk-defsystem suffered from several flaws, in addition to a slightly annoying software license. These flaws were probably justified at the time it was written, several years before the CL standard was adopted, but were making it less than ideal in the world of universal personal computing. First, installing a system required editing the system definition files to configure pathnames, and/or editing some machine configuration file to define "logical pathname translations" that map to actual physical pathnames the "logical pathnames" used by those system definition files. Back when there were a few data centers each with a full time administrator, each of whom configured the system once for tens of users, this was a small issue; but when hundreds of amateur programmers were each installing software on their home computer, this situation was less than ideal. Second, extending the code was very hard: Mark Kantrowitz had to restrict his code to functionality universally available in 1990 (which didn’t include CLOS), and to add a lot of boilerplate and magic to support many implementations. To add the desired features in 2001, a programmer would have had to modify the carefully crafted file, which would be a lot of work, yet eventually would probably still break the support for now obsolete implementations that couldn’t be tested anymore.

4.3 ASDF 1: A Successful Experiment

In 2001, Dan Barlow, a then prolific CL hacker, dissatisfied with mk-defsystem, wrote a new defsystem variant, ASDF.In a combined reverence for tradition and joke, ASDF stands for "Another System Definition Facility", as well as for consecutive letters on a QWERTY keyboard. Thus he could abandon the strictures of supporting long obsolete implementations, and instead target modern CL implementations. In 2002, he published ASDF, made it part of SBCL, and used it for his popular CL software. It was many times smaller than mk-defsystem (under a thousand line of code, instead of five thousand), much more usable, easy to extend, trivial to port to other modern CL implementations, and had an uncontroversial MIT-style software license. It was an immediate success.

ASDF featured many brilliant innovations in its own right.

Perhaps most importantly as far as usability goes, ASDF cleverly used the *load-truename* feature of modern Lisps, whereby programs (in this case, the defsystem form) can identify from which file they are loaded. Thus, system definition files didn’t need to be edited anymore, as was previously required with mk-defsystem, since pathnames of all components could now be deduced from the pathname of the system definition file itself. Furthermore, because the truename resolved Unix symlinks, you could have symlinks to all your Lisp systems in one or a handful directories that ASDF knew about, and it could trivially find all of them. Configuration was thus a matter of configuring ASDF’s *central-registry* with a list of directories in which to look for system definition files, and maintaining "link farms" in those directories — and both aspects could be automated. (See Configurability for how ASDF 2 improved on that.)

Also, following earlier suggestions by Kent Pitman (Pitman 1984), Dan Barlow used object-oriented style to make his defsystem extensible without the need to modify the main source file.Dan Barlow may also have gotten from Kent Pitman the idea of reifying a plan then executing it in two separate phases rather than walking the dependencies on the go. Using the now standardized Common Lisp Object System (CLOS), Dan Barlow defined his defsystem in terms of generic functions specialized on two arguments, operation and component, using multiple dispatch, an essential OO feature unhappily not available in lesser programming languages, i.e. sadly almost all of them — they make do by using the "visitor pattern". Extending ASDF is a matter of simply defining new subclasses of operation and/or component and a handful of new methods for the existing generic functions, specialized on these new subclasses. Dan Barlow then demonstrated such simple extension with his sb-grovel, a system to automatically extract low-level details of C function and data structure definitions, so they may be used by SBCL’s foreign function interface.

4.4 Limitations of ASDF 1

ASDF was a great success at the time, but after many years, it was also found to have its downsides: Dan Barlow was experimenting with new concepts, and his programming style was to write the simplest code that would work in the common case, giving him most leeway to experiment. His code had a lot of rough edges: while ASDF worked great on the implementation he was using for the things he was doing with it, it often failed in ugly ways when using other implementations, or exercising corner cases he had never tested. The naïve use of lists as a data structure didn’t scale to large systems with thousands of files. The extensibility API while basically sane was missing many hooks, so that power users had to redefine or override ASDF internals with modified variants, which made maintenance costly.

Moreover, there was a vicious circle preventing ASDF bugs from being fixed or features from being added (Rideau 2009): Every implementation or software distribution (e.g. Debian) had its own version, with its own bug fixes and its own bugs; so developers of portable systems could not assume anything but the lowest common denominator, which was very buggy. On the other hand, because users were not relying on new features, but instead wrote kluges and workarounds that institutionalized old bugs, there was no pressure for providers to update; indeed the pressure was to not update and risk be responsible for breakage, unless and until the users would demand it. Thus, one had to assume that no bug would ever be fixed everywhere; and for reliability one had to maintain one’s own copy of ASDF, and closely manage the entire build chain: start from a naked Lisp, then get one’s fixed copy of ASDF compiled and loaded before any system could be loaded.It was also impossible to provide a well configured ASDF without pre-loading it in the image; and it was impossible to upgrade ASDF once it was loaded. Thus Debian couldn’t reliably provide "ready to go" images that would work for everyone who may or may not need an updated ASDF, especially not with stability several years forward. In the end, there was little demand for bug fixes, and supply followed by not being active fixing bugs. And so ASDF development stagnated for many years.

5 Appendix B: ASDF 2, or Productizing ASDF

In November 2009, we took over ASDF maintainership and development. A first set of major changes led to ASDF 2, released in May 2010. The versions released by Dan Barlow and the maintainers who succeeded him, and numbered 1.x are thereafter referred to as ASDF 1. These changes are explained in more detail in our ILC 2010 article (Goldman 2010).

5.1 Upgradability

The first bug fix was to break the vicious circle preventing bug fixes from being relevant. We enabled hot upgrade of ASDF, so that users could always load a fixed version on top of whatever the implementation or distribution did or didn’t provide. In ASDF 3, some of the upgrade complexity described in our 2010 paper was done away with: even though CL makes dynamic data upgrade extraordinarily easy as compared to other languages, we found that it’s not easy enough to maintain; therefore instead of trying hard to maintain that code, we "punt" and drop in-memory data if the schema has changed in incompatible ways; thus we do not try hard to provide methods for update-instance-for-redefined-class. The only potential impact of this reduction in upgrade capability would be users who upgrade code in a long-running live server; but considering how daunting that task is, properly upgrading ASDF despite reduced support might be the least of their problems. To partly compensate this issue, ASDF 3 preemptively attempts to upgrade itself at the beginning of every build (if an upgrade is available as configured) — that was recommended but not enforced by ASDF 2. This reduces the risk of either having data to drop from a previous ASDF, or much worse, being caught upgrading ASDF in mid-flight. In turn, such special upgrading of ASDF itself makes code upgrade easier. Indeed, we had found that CL support for hot upgrade of code may exist but is anything but seamless. These simpler upgrades allow us to simply use fmakunbound everywhere, instead of having to unintern some functions before redefinition. Soon enough, users felt confident relying on bug fixes and new features, and all implementations started providing ASDF 2.

These days, you can (require "asdf") on pretty much any CL implementation, and start building systems using ASDF. Most implementations already provide ASDF 3. A few still lag with ASDF 2, or fail to provide ASDF; the former can be upgraded with (asdf:upgrade-asdf); all but the most obsolete ones can be fixed by an installation script we provide with ASDF 3.1.

Upgradability crucially decoupled what ASDF users could rely on from what implementations provided, enabling a virtuous circle of universal upgrades, where previously everyone was waiting for others to upgrade, in a deadlock. Supporting divergence creates an incentive towards convergence.

5.2 Portability

A lot of work was spent on portability. Originally written for sbcl, ASDF 1 eventually supported 5 more implementations: allegro, ccl, clisp, cmucl, ecl. Each implementation shipped its own old version, often slightly edited; system definition semantics often varied subtly between implementations, notably regarding pathnames. ASDF 2.000 supported 9 implementations, adding: abcl, lispworks, gcl; system definition semantics was uniform across platforms. ASDF 2.26 (last in the ASDF 2 series) supported 15, adding: cormanlisp, genera, mkcl, rmcl, scl, xcl. Since then, new implementations are being released with ASDF support: mocl, and hopefully soon clasp.

ASDF as originally designed would only reliably work on Unix variants (Linux, BSD, etc., maybe cygwin, and now also MacOS X, Android, iOS). It can now deal with very different operating system families: most importantly Windows, but also the ancient MacOS 9 and Genera.

Portability was achieved by following the principle that we must abstract away semantic discrepancies between underlying implementations. This is in contrast with the principle apparently followed by ASDF 1, to "provide a transparent layer on top of the implementation, and let users deal with discrepancies". ASDF 2 thus started growing an abstraction layer that works around bugs in each implementation and smoothes out incompatibilities, which made the ASDF code itself larger, but allowed user code to be smaller for portable results.

The greatest source of portability woes was in handling pathnames: the standard specification of their behavior is so lacking, and the implementations so differ in their often questionable behavior, that instead of the problem being an abundance of corner cases, the problem was a dearth of common cases. So great is the disaster of CL pathnames, that they deserve their own appendix to this article (see Appendix C: Pathnames).

Lisp programmers can now "write once, run anywhere", as far as defining systems go. They still have to otherwise avoid non-standardized behavior and implementation-specific extensions if they want their programs to be portable. There are a lot of portability libraries to assist programmers, but neither ASDF nor any of them can solve these issues intrinsic to CL, short of portably implementing a new language on top of it.

Portability decoupled which implementation and operating system were used to develop a system from which it could be compiled with, where previously any non-trivial use of pathnames, filesystem access or subprocess invocation was a portability minefield.

5.3 Configurability

ASDF 1 was much improved over what preceded it, but its configuration mechanism was still lacking: there was no modular way for whoever installed software systems to register them in a way that users could see them; and there was no way for program writers to deliver executable scripts that could run without knowing where libraries were installed.

One key feature introduced with ASDF 2 (Goldman 2010) was a new configuration mechanism for programs to find libraries, the source-registry, that followed this guiding principle: Each can specify what they know, none need specify what they don’t.

Configuration information is taken from multiple sources, with the former partially or completely overriding the latter: argument explicitly passed to initialize-source-registry, environment variable, central user configuration file, modular user configuration directory, central system configuration files, modular system configuration directories, implementation configuration, with sensible defaults. Also, the source-registry is optionally capable of recursing through subdirectories (excluding source control directories), where *central-registry* itself couldn’t. Software management programs at either user or system level could thus update independent configuration files in a modular way to declare where the installed software was located; users could manually edit a file describing where they manually downloaded software; users could export environment variables to customize or override the default configuration with context-dependent information; and scripts could completely control the process and build software in a predictable, deterministic way; it’s always possible to take advantage of a well-configured system, and always possible to avoid and inhibit any misconfiguration that was out of one’s control.

A similar mechanism, the output-translations, also allows users to segregate the output files away from the source code; by default it uses a cache under the user’s home directory, keyed by implementation, version, OS, ABI, etc. Thus, whoever or whichever software manages installation of source code doesn’t have to also know which compiler is to be used by which user at what time. The configuration remains modular, and code can be shared by all who trust it, without affecting those who don’t. The idea was almost as old as ASDF itself, but previous implementations all had configurability issues.Debian’s common-lisp-controller (CLC) popularized the technique as far back as 2002, and so did the more widely portable asdf-binary-locations (ABL) after it: by defining an :around method for the output-files function, it was possible for the user to divert where ASDF would store its output. The fact that this technique could be developed as an obvious extension to ASDF without the author explicitly designing the idea into it, and without having to modify the source code, is an illustration of how expressive and modular CLOS can be. But apart from its suffering from the same lack of modularity as the *central-registry*, CLC and ABL also had a chicken-and-egg problem: you couldn’t use ASDF to load it, or it would itself be compiled and loaded without the output being properly diverted, negating any advantage in avoiding clashes for files of other systems. ABL thus required special purpose loading and configuration in whichever file did load ASDF, making it not modular at all. CLC tried to solve the issue by managing installation or all CL software; it failed eventually, because these efforts were only available to CL programmers using Debian or a few select other Linux software distributions, and only for the small number of slowly updated libraries, making the maintainers a bottleneck in a narrow distribution process. CLC also attempted to institute a system-wide cache of compiled objects, but this was ultimately abandoned for security issues; a complete solution would have required a robust and portable build service, which was much more work than justified by said narrow distribution process. The solution in ASDF 2 was to merge this functionality in ASDF itself, according to the principle to make it as simple as possible, but no simpler. But whereas ASDF 1 followed this principle under the constraint that the simple case should be handled correctly, ASDF 2 updated the constraint to include handling all cases correctly. Dan Barlow’s weaker constraint may have been great for experimenting, it was not a good one for a robust product. Another, more successful take on the idea of CLC, is Zach Beane’s Quicklisp (2011): it manages the loading and configuration of ASDF, and can then download and install libraries. Because it does everything in the user’s directory, without attempts to share between users and without relying on support from the system or software distribution, it can be actually ubiquitous. Thanks to ASDF 2’s modular configuration, the Quicklisp-managed libraries can complement the user’s otherwise configured software rather than one completely overriding the other.

Configurability decoupled use and installation of software: multiple parties could now each modularly contribute some software, whether applications, libraries or implementations, and provide configuration for it without being required to know configuration of other software; previously, whoever installed software couldn’t notify users, and users had to know and specify configuration of all software installed.

5.4 Robustness

ASDF 1 used to pay little attention to robustness. A glaring issue, for instance, was causing much aggravation in large projects: killing the build process while a file was being compiled would result in a corrupt output file that would poison further builds until it was manually removed: ASDF would fail the first time, then when restarted a second time, would silently load the partially compiled file, leading the developer to believe the build had succeeded when it hadn’t, and then to debug an incomplete system. The problem could be even more aggravating, since a bug in the program itself could be causing a fatal error during compilation (especially since in CL, developers can run arbitrary code during compilation). The developer, after restarting compilation, might not see the issue; he would then commit a change that others had to track down and painfully debug. This was fixed by having ASDF compile into a temporary file, and move the outputs to their destination only in case of success, atomically where supported by implementation and OS. A lot of corner cases similarly had to be handled to make the build system robust.

We eventually acquired the discipline to systematically write tests for new features and fixed bugs. The test system itself was vastly improved to make it easier to reproduce failures and debug them, and to handle a wider variety of test cases. Furthermore, we adopted the policy that the code was not to be released unless every regression test passed on every supported implementation (the list of which steadily grew), or was marked as a known failure due to some implementation bugs. Unlike ASDF 1, that focused on getting the common case working, and letting users sort out non-portable uncommon cases with their implementation, ASDF 2 followed the principle that code should either work of fail everywhere the same way, and in the latter case, fail early for everyone rather than pass as working for some and fail mysteriously for others. These two policies led to very robust code, at least compared to previous CL build systems including ASDF 1.

Robustness decoupled the testing of systems that use ASDF from testing of ASDF itself: assuming the ASDF test suite is complete enough, (sadly, all too often a preposterous assumption), systems defined using ASDF 2 idioms run the same way in a great variety of contexts: on different implementations and operating systems, using various combinations of features, after some kind of hot software upgrade, etc. As for the code in the system itself — it might still require testing on all supported implementations in case it doesn’t strictly adhere to a portable subset of CL (which isn’t automatically enforceable so far), since the semantics of CL are not fully specified but leave a lot of leeway to implementers, unlike e.g. ML or Java.

5.5 Performance

ASDF 1 performance didn’t scale well to large systems: Dan Barlow was using the list data structure everywhere, leading to worst case planning time no less than O(n⁴) where n is the total size of systems built. We assume he did it for the sake of coding simplicity while experimenting, and that his minimalism was skewed by the presence of many builtin CL functions supporting this old school programming style. ASDF did scale reasonably well to a large number of small systems, because it was using a hash-table to find systems; on the other hand, there was a dependency propagation bug in this case (see Appendix F: A traverse across the build). In any case, ASDF 2 followed the principle that good data structures and algorithms matter, and should be tailored to the target problem; it supplemented or replaced the lists used by ASDF 1 with hash-tables for name lookup and append-trees to recursively accumulate actions, and achieved linear time complexity. ASDF 2 therefore performed well whether or not the code was split in a large number of systems.

Sound performance decoupled the expertise in writing systems from the expertise in how systems are implemented. Now, developers could organize or reorganize their code without having to shape it in a particular way to suit the specific choice of internals by ASDF itself.

5.6 Usability

Usability was an important concern while developing ASDF 2. While all the previously discussed aspects of software contribute to usability, some changes were specifically introduced to improve the user experience.

As a trivial instance, the basic ASDF invocation was the clumsy (asdf:operate ’asdf:load-op :foo) or (asdf:oos ’asdf:load-op :foo). With ASDF 2, that would be the more obvious (asdf:load-system :foo)load-system was actually implemented by Gary King, the last maintainer of ASDF 1, in June 2009; but users couldn’t casually rely on it being there until ASDF 2 in 2010 made it possible to rely on it being ubiquitous. Starting with ASDF 3.1, (asdf:make :foo) is also available, meaning "do whatever makes sense for this component"; it defaults to load-system, but authors of systems not meant to be loaded can customize it to mean different things..

ASDF 2 provided a portable way to specify pathnames by adopting Unix pathname syntax as an abstraction, while using standard CL semantics underneath. It became easy to specify hierarchical relative pathnames, where previously doing it portably was extremely tricky. ASDF 2 similarly provided sensible rules for pathname types and type overrides. (See Appendix C: Pathnames.) ASDF made it hard to get pathname specifications right portably; ASDF 2 made it hard to get it wrong or make it non-portable.

Usability decoupled the knowledge of how to use ASDF from both the knowledge of ASDF internals and CL pathname idiosyncrasies. Any beginner with a basic understanding of CL and Unix pathnames could now use ASDF to portably define non-trivial systems, a task previously reserved to experts and/or involving copy-pasting magical incantations. The principle followed was that the cognitive load on each kind of user must be minimized.

6 Appendix C: Pathnames

6.1 Abstracting over Pathnames

The CL standard specifies a pathname class that is to be used when accessing the filesystem. In many ways, this provides a convenient abstraction, that smoothes out the discrepancies between implementations and between operating systems. In other ways, the standard vastly underspecifies the behavior in corner cases, which makes pathnames quite hard to use. The sizable number of issues dealt by ASDF over the years were related to pathnames and either how they don’t match either the underlying operating system’s notion, or how their semantics vary between implementations. We also found many outright bugs in the pathname support of several implementations, many but not all of which have been fixed after being reported to the respective vendors.

To make ASDF portable, we started writing abstraction functions during the end year of ASDF 1, they grew during the years of ASDF 2, and exploded during the heavy portability testing that took place before ASDF 3. The result is now part of the UIOP library transcluded in ASDF 3. However, it isn’t possible to fully recover the functionality of the underlying operating system in a portable way from the API that CL provides. This API is thus what Henry Baker calls an abstraction inversion (Baker 1992) or, in common parlance, putting the cart before the horse.

UIOP thus provides its functionality on a best-effort basis, within the constraint that it builds on top of what the implementation provides. While the result is robust enough to deal with the kind of files used while developing software, that makes UIOP not suitable for dealing with all the corner cases that may arise when processing files for end-users, especially not in adversarial situations. For a complete and well-designed reimplementation of pathnames, that accesses the operating system primitives and libraries via CFFI, exposes them and abstracts over, in a way portable across implementations (and, to the extent that it’s meaningful, across operating systems), see IOLib instead. Because of its constraint of being self-contained and minimal, however, ASDF cannot afford to use IOLib (or any library).

All the functions in (UIOP) have documentation strings explaining their intended contract.

6.2 Pathname Structure

In CL, a pathname is an object with the following components:

A host component, which is often but not always nil on Unix implementations, or sometimes :unspecific; it can also be a string representing a registered "logical pathname host" (see Logical Pathnames); and in some implementations, it can be a string or a structured object representing the name of a machine, or elements of a URL, etc.

A device component, which is often but not always nil on Unix implementations, can be a string representing a device, like "c" or "C" for the infamous C: on Windows, and can be other things on other implementations.

A directory component, which can be nil (or on some implementations :unspecific), or a list of either :absolute or :relative followed by "words" that can be strings that each name a subdirectory, or :wild (wildcard matching any subdirectory) or :wild-inferiors (wildcard matching any recursive subdirectory), or some string or structure representing wildcards. e.g. (:absolute "usr" "bin"), or (:relative ".svn"). For compatibility with ancient operating systems without hierarchical filesystems, the directory component on some implementations can be just a string, which is then an :absolute.

A name component, which is often a string, can be nil (or on some implementations :unspecific), or :wild or some structure with wildcards, though the wildcards can be represented as strings.

A type component, which is often a string, can be nil (or on some implementations :unspecific), or :wild or some structure with wildcards, though the wildcards can be represented as strings.

A version component, which can be nil (or on some implementations :unspecific), and for backward compatibility with old filesystems that supported this kind of file versions, it can be a positive integer, or :newest. Some implementations may try to emulate this feature on a regular Unix filesystem.

Previous versions of SCL tried to define additional components to a pathname, so as to natively support URLs as pathnames, but this ended up causing extra compatibility issues, and recent versions fold all the additional information into the host component.

:unspecific is only fully or partly supported on some implementations, and differs from nil in that when you use merge-pathnames, nil means "use the component from the defaults" whereas :unspecific overrides it.

Already, UIOP has to deal with normalizing away the ancient form of the directory component and bridging over ancient implementations that don’t support pathnames well with function make-pathname*, or comparing pathnames despite de-normalization with function pathname-equal.

6.3 Namestrings

The CL standard accepts a strings as a pathname designator in most functions where pathname is wanted. The string is then parsed into a pathname using function parse-namestring. A pathname can be mapped back to a string using namestring. Due to the many features regarding pathnames, not all strings validly parse to a pathname, and not all pathnames have a valid representation as a namestring. However, the two functions are reasonably expected to be inverse of each other when restricted to their respective output domains.Still, on many implementations, filesystem access functions such as probe-file or truename will return a subtly modified pathname that looks the same because it has the same namestring, but doesn’t compare as equal because it has a different version of :version :newest rather than :version nil. This can cause a lot of confusion and pain. You therefore should be careful to never compare pathnames or use them as keys in a hash-table without having normalized them, at which point you may as well use the namestring as the key.

These functions are completely implementation-dependent, and indeed, two implementations on the same operating system may do things differently. Moreover, parsing is relative to the host of a default pathname (itself defaulting to *default-pathname-default