The Explorer

The Adventures of a Pythonista in Schemeland/23

by Michele Simionato

May 19, 2009



Summary

In episode 19 I noted that the R6RS module system allows for separate compilation, but I did not mention the subtleties associated with it. This episode discusses the topic, the concept of visit time and the intricacies of the "import" semantics.


Separate compilation and import semantics

Scheme is all about times: there is a run-time, an expand-time, and a discrete set of times associated with the meta-levels. When separate compilation is taken into consideration, there is also another set of times: the times when the libraries are separately compiled. Finally, if the separately compiled libraries define macros which are used in client code, there is yet another set of times, the visit times.

To explain what the visit time is, suppose you have a low level library L, compiled yesterday, defining a macro you want to use in another middle level library M, to be compiled today. The compiler needs to know about the macro defined in L at the time of the compilation of M, because it has to expand code in M. Therefore, the compiler must look at L and re-evaluate the macro definition today (the process is called visiting). The visit time is different from the time of the compilation of L, as it happens just before the compilation of M.

Here is a concrete example. Consider the following low level library L, defining a macro m and an integer variable a:

    #!r6rs
    (library (experimental L)
      (export m a)
      (import (rnrs) (sweet-macros))
      (def-syntax m
        (begin
          (display "visiting L\n")
          (lambda (x) #f)))
      (define a 42)
      (display "L instantiated\n"))

You may compile it with PLT Scheme:

    $ plt-r6rs --compile L.sls
     [Compiling /usr/home/micheles/gcode/scheme/experimental/L.sls]
    visiting L

Since the right hand side of a macro definition is evaluated at compile time, the message visiting L is printed during compilation, as expected. Consider now the following middle level library M using the macro m:

    #!r6rs
    (library (experimental M)
      (export a)
      (import (rnrs) (experimental L))
      (m) ; this line is expanded at compile-time
      (display "M instantiated\n") ; at run-time
      )

In this example the compiler needs to visit L in order to compile M. This is actually what happens:

    $ plt-r6rs --compile M.sls
     [Compiling /usr/home/micheles/gcode/scheme/experimental/M.sls]
    visiting L

If you comment out the line with the macro call, the compiler does not need to visit L anymore; some implementations may take advantage of this fact (Ypsilon and Ikarus do). However, PLT Scheme will continue to visit L in any case.

The mysterious import semantics

It is time to ask ourselves the crucial question: what does it mean to import a library? For a Pythonista, things are very simple: importing a library means executing it at run-time. For a Schemer, things are somewhat more complicated: importing a library implies that some basic operations are performed at compile time - such as looking at the exported identifiers and at the dependencies of the library - but there is also a lot of unspecified behavior which may happen both at compile-time and at run-time. In particular, at compile-time a library may be only visited (i.e. its macro definitions are re-evaluated), only instantiated, or both. Different things happen in different situations, and in the same situation different implementations can perform different operations.

The example of the previous section is useful in order to get a feeling of what is portable behavior and what is not. Let me first consider what happens in Ikarus. If I want to compile L and M in Ikarus, I need to introduce a helper script H.ss, since Ikarus has no direct way to compile a library from the command line. Here is the script:

    $ cat H.ss
    #!r6rs
    (import (rnrs) (experimental M))
    (display a)

Here is what I get:

    $ ikarus --compile-dependencies H.ss
    visiting L
    Serializing "/home/micheles/gcode/scheme/experimental/M.sls.ikarus-fasl" ...
    Serializing "/home/micheles/gcode/scheme/experimental/L.sls.ikarus-fasl" ...

Ikarus is lazier than PLT: for instance, if you comment out the line invoking the macro in M.sls and you recompile the dependencies, then the library L is not visited. Neither PLT nor Ikarus instantiates L in order to compile M (it is not needed), but Ypsilon does. You may check that if you introduce a dummy macro in M, depending on the variable a defined in L (for instance if you add a line (def-syntax dummy (lambda (x) a))), then the library L must be instantiated in order to compile M, and all implementations do so.
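
The dummy-macro experiment can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard define-syntax instead of def-syntax from sweet-macros, and the (for ... run expand) import is an addition made here so that the variable a is also available at expand time on implementations with explicit phasing, such as PLT. It relies on the library L shown earlier.

```scheme
#!r6rs
(library (experimental M)
  (export a)
  ;; Assumption: L is imported for both run and expand, so that the
  ;; transformer below may refer to a under explicit phasing.
  (import (rnrs) (for (experimental L) run expand))
  ;; The right-hand side of the macro definition refers to the variable a,
  ;; which belongs to L: this is what forces L to be instantiated
  ;; when M is compiled. The macro dummy itself is never invoked.
  (define-syntax dummy (lambda (x) a))
  (m)                           ; expanded at compile time
  (display "M instantiated\n")) ; evaluated at run time
```

Recompiling M with this change should make the message L instantiated appear during compilation, in addition to visiting L.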

Let us consider the peculiarities of Ypsilon, now. Ypsilon does not have a switch to compile a library without executing it - even if this is possible by invoking the low level compiler API - so we must execute H.ss to compile its dependencies:

    $ ypsilon --r6rs H.ss
    L instantiated
    visiting L
    M instantiated
    42

There are several things to notice here, since the output of Ypsilon is quite different from the output of Ikarus

    $ ikarus --r6rs-script H.ss
    L instantiated
    42

and the output of PLT:

    $ plt-r6rs H.ss
    visiting L
    visiting L
    L instantiated
    M instantiated
    42

The first thing to notice is that both in Ikarus and in PLT we relied on the fact that the libraries were precompiled, so in order to perform a fair comparison we must run Ypsilon again (this second time the libraries L and M will be precompiled):

    $ ypsilon --r6rs H.ss
    L instantiated
    M instantiated
    42

You may notice that this time the library L is not visited: it was visited the first time, in order to compile M, but there is no need to do so now. During the compilation of M the macros have been expanded, and the byte-code of M contains the expanded version of the library; moreover, the helper script H does not use any macro, so it does not really need to visit L or M to be compiled. The same happens for Ikarus. PLT instead visits L twice to compile H.ss. In PLT all dependencies (both direct and indirect) are always visited when compiling. Only if we compile the script once and for all

    $ plt-r6rs --compile H.ss
     [Compiling /usr/home/micheles/gcode/scheme/experimental/H.ss]
     [Compiling /home/micheles/.plt-scheme/4.1.5.5/collects/experimental/M.sls]
    visiting L
    visiting L

the visiting L message will not be printed:

    $ plt-r6rs H.ss
    L instantiated
    M instantiated
    42

More implementation-dependent details

Having performed the right number of compilations, the outputs of PLT and Ypsilon are now the same; nevertheless, the output of Ikarus is different, since Ikarus does not instantiate the middle level library M. The reason is the implicit phasing semantics of Ikarus (other implementations based on psyntax would exhibit the same behavior): the helper script H.ss is printing the variable a, which really comes from the library L. Ikarus is clever enough to recognize this fact and lazy enough to avoid instantiating M without need. On the other hand, Ypsilon performs eager instantiation and it instantiates (once) all the libraries it imports (both directly and indirectly), even at compile time and even in situations when the instantiation would not be needed for the compilation of the client library.

As you see, Scheme implementations have a lot of latitude in such matters. The implementations based on psyntax are the smartest out there, but being smart is not always the same thing as being good. It is good to avoid instantiating a library if the instantiation is really unneeded; it is bad if the library has some side effect, since the side effect will mysteriously disappear. In our example the side effect is just printing the message M instantiated; in more sophisticated examples the side effect could be writing a log on a database, or initializing some variable, or registering an object, or something else.

For instance, suppose you want to collect a bunch of functions into a global registry acting as a dictionary of functions. You may do so as follows:

    (library (my-library)
      (export)
      (import (rnrs) (registry))
      (registry-set! 'f1 (lambda (x) 'something1))
      (registry-set! 'f2 (lambda (x) 'something2))
      ...
      )

The library here does not export anything, since it relies on side effects to populate the global registry of functions; the idea is to access the functions later, with a call of the kind (registry-ref <func-name>).
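
For concreteness, a registry library of this kind could be sketched as below. The text only names (registry), registry-set! and registry-ref; the implementation based on an eq-hashtable, the internal name table, and the choice of returning #f for a missing name are all assumptions made for illustration.

```scheme
#!r6rs
(library (registry)
  (export registry-set! registry-ref)
  (import (rnrs))
  ;; Hypothetical global table mapping symbols to procedures.
  (define table (make-eq-hashtable))
  ;; Store func under the symbol name, replacing any previous entry.
  (define (registry-set! name func)
    (hashtable-set! table name func))
  ;; Return the procedure stored under name, or #f if there is none.
  (define (registry-ref name)
    (hashtable-ref table name #f)))
```

With such a library, (registry-ref 'f1) returns the procedure registered under f1 only if the library that performed the registration has actually been instantiated, which is exactly the point at stake here.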

As it stands, this design is not always portable to systems based on psyntax, because such systems will not instantiate the library (the library does not export any variable, so nothing of the library can be used in client code!). This can easily be fixed by introducing an initialization function to be exported and called explicitly from client code, which is a good idea in any case.

Analogously, a library based on side effects at visit time, i.e. in the right hand side of macro definitions, is not portable, since systems based on psyntax will not visit a library with macros which are not used. This is relevant if you want to use the technique described in the "You want it when?" paper: in order to make sure that the technique works on systems based on psyntax, you must make sure that the library exports at least one macro which is used in client code. Curious readers will find the gory details in a thread on the PLT mailing list.

Generally speaking, you cannot rely on the number of times a library will be instantiated, even within the same implementation! Abdulaziz Ghuloum gave a nice example in the Ikarus and PLT lists. You have the following libraries:

    (library (T0) (export) (import (rnrs)) (display "T0\n"))
    (library (T1) (export) (import (for (T0) run expand)))
    (library (T2) (export) (import (for (T1) run expand)))
    (library (T3) (export) (import (for (T2) run expand)))

and the following script:

    #!r6rs
    (import (T3))

Running the script (without precompilation) results in printing T0:

    0 times for Ikarus and Mosh
    1 time for Larceny and Ypsilon
    10 times for plt-r6rs
    13 times for mzscheme
    22 times for DrScheme

T0 is not printed in psyntax-based implementations, since it does not export any identifier that can be used. T0 is printed once in Larceny and Ypsilon, since they are single instantiation implementations with eager import. The situation in PLT Scheme is subtle, and you can find a detailed explanation of what is happening in another thread on the PLT mailing list. Otherwise, you will have to wait for the next (and last!) episode of this series, where I will explain the reason why PLT is instantiating (and visiting) modules so many times.

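Coming back to the registry example, the fix suggested earlier (an initialization function exported by the library and called explicitly by client code) might look like the following sketch. The name init! is an assumption made for illustration, as is the import of (rnrs) alongside (registry); neither comes from an existing library.

```scheme
#!r6rs
(library (my-library)
  (export init!)
  (import (rnrs) (registry))
  ;; Hypothetical initializer: the registrations now happen when client
  ;; code calls init!, not as a side effect of instantiating the library.
  (define (init!)
    (registry-set! 'f1 (lambda (x) 'something1))
    (registry-set! 'f2 (lambda (x) 'something2))))
```

Client code would then import (my-library) and call (init!) before any (registry-ref ...) lookup. Since init! is an exported identifier which is actually used, even the lazy psyntax-based implementations have a reason to instantiate the library.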

About the Blogger

Michele Simionato started his career as a Theoretical Physicist, working in Italy, France and the U.S. He turned to programming in 2003; since then he has been working professionally as a Python developer and now he lives in Milan, Italy. Michele is well known in the Python community for his posts in the newsgroup(s), his articles and his Open Source libraries and recipes. His interests include object oriented programming, functional programming, and in general programming methodologies that enable us to manage the complexity of modern software development.

This weblog entry is Copyright © 2009 Michele Simionato. All rights reserved.