[Chicken-users] macro systems and chicken (long)

From: Alex Shinn
Subject: [Chicken-users] macro systems and chicken (long)
Date: Fri, 04 Apr 2008 20:56:59 +0900
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1.50 (darwin)

There seems to be a lot of confusion in the Chicken community, and the Lisp community in general, about the different macro systems, so I thought I'd provide some background information and discussion of the eggs available in Chicken and their uses.

--- Background ---

There are two completely orthogonal aspects of macro systems - whether they are hygienic or unhygienic, and whether they are low-level or high-level.

Low-level means direct manipulation of sexps to produce sexps - you're generating code expressions by hand. High-level means you use some higher abstraction like templating - the underlying processing may or may not make use of sexps at all. Low-level of course offers the most control. High-level has nice benefits such as providing a location in source code for line-number debug info, and easier analysis by other tools like analysers and editors.

Neither of these has anything to do with hygiene. Hygiene is a relatively newer concept, so all the old macro systems were either unhygienic + low-level or unhygienic + high-level. defmacro is the former - it's low-level manipulation of sexps. The C preprocessor can be thought of as the weakest, most poorly designed instance of unhygienic high-level macros. It's a templating system without any kind of destructuring, conditionals or polymorphism. Other alternatives like the m4 macro preprocessor and pretty much every assembly preprocessor are more powerful instances of high-level macro systems.

Anyway, in the Lisp community we had defmacro, and it was good. You had to be careful to use gensyms, and never to shadow or redefine any core procedures anywhere in your program, but if you stuck to those rules there weren't many problems.

Then Scheme came along, and nicely unified the CL namespace mess into a single consistent namespace. The problem was this made conflicts much more likely.
It became much more important to be able to automatically avoid problems, without burdening the programmer with mentally keeping track of everything in all lexical scopes. Thus hygiene was born. A good description of why hygiene is necessary can be found at http://community.schemewiki.org/?hygiene-versus-gensym.

A very brief time-line:

  1986: Kohlbecker - introduced the idea of hygiene, low-level, used an O(n^2) coloring algorithm
  1987: Kohlbecker - introduced declare-syntax, high-level, the precursor to syntax-rules
  1988: Bawden & Rees - "Syntactic closures," low-level, faster than Kohlbecker's algorithm
  1991: Clinger & Rees - Explicit renaming, low-level, based on syntactic-closures but also supports syntax-rules
  1992: Dybvig - Syntax-case, primary motivation to remove the distinction between low-level and high-level

You can find the papers for these at library.readscheme.org.

--- Using The Low-level Systems ---

I'm assuming everyone is familiar with syntax-rules, if not there are good tutorials available elsewhere. I'm also going to skip Kohlbecker's original system since it isn't used anywhere.

The syntactic closures idea is very simple. Instead of the macro just being passed the expression to transform, it's passed the expression plus environment information. You can think of it like

  (define-syntax foo
    (lambda (form usage-environment macro-environment)
      ...))

which is indeed how it's implemented, but you never use that directly, you use one of the transformer abstractions.

The most basic is sc-macro-transformer ("sc" is for syntactic closures). A good discussion can be found at

  http://community.schemewiki.org/?syntactic-closures

or in the MIT Scheme reference manual

  http://www.gnu.org/software/mit-scheme/documentation/mit-scheme-ref/SC-Transformer-Definition.html

but basically the idea is you write macros like

  (define-syntax foo
    (sc-macro-transformer
      (lambda (form usage-environment)
        ...)))

You can then manipulate FORM as a normal sexp just like in defmacro.
The resulting sexp is then interpreted in the macro's syntactic environment. To make parts of FORM refer to their bindings in the calling environment, you need to wrap them in syntactic-closures with the USAGE-ENVIRONMENT parameter. As an example,

  (define-syntax swap!
    (sc-macro-transformer
      (lambda (form env)
        (let ((a (make-syntactic-closure env '() (cadr form)))
              (b (make-syntactic-closure env '() (caddr form))))
          `(let ((value ,a))
             (set! ,a ,b)
             (set! ,b value))))))

FORM is the full form (swap! var1 var2), so we're binding A to var1 and B to var2, in the context of the usage environment. The other identifiers in the returned sexp (LET, VALUE and SET!) all refer to the original macro environment, so even if they had been locally shadowed in the usage environment, this will still work.

The second argument to make-syntactic-closure (just '() above) is used when you want to deliberately break hygiene. See the other links for details.

The next transformer is rsc-macro-transformer, which is essentially the reverse - the env parameter is the macro environment, and bare identifiers in the result are implicitly handled in the usage environment.

  (define-syntax swap!
    (rsc-macro-transformer
      (lambda (form env)
        (let ((a (cadr form))
              (b (caddr form))
              (value (make-syntactic-closure env '() 'value))
              (let-r (make-syntactic-closure env '() 'let))
              (set!-r (make-syntactic-closure env '() 'set!)))
          `(,let-r ((,value ,a))
             (,set!-r ,a ,b)
             (,set!-r ,b ,value))))))

Here A and B are just passed as-is, and the normal Scheme constructs (LET and SET!) need to explicitly refer to the macro environment. It looks a little more busy - since most of what you write in a macro expansion will be new code, rather than rearranging the old code.

However, if you look at that, the reason we make VALUE a syntactic-closure is so that it won't conflict with any instances of VALUE in A or B. An alternate way of achieving the same result would be to use gensym.

Now, if you're programming by the old defmacro conventions of never redefining or shadowing core forms and functions like LET and SET!, they would have the same meaning in both environments. So, a safe-only-by-convention way of writing this is:

  (define-syntax swap!
    (rsc-macro-transformer
      (lambda (form env)
        (let ((a (cadr form))
              (b (caddr form))
              (value (gensym)))
          `(let ((,value ,a))
             (set! ,a ,b)
             (set! ,b ,value))))))

But that's exactly the way you write this in defmacro! People who argue against hygiene saying "you can have defmacro when you take it from my cold, dead hands" are simply unaware that you can do *exactly* the same style of programming with hygiene. The only extra code above is the rsc-macro-transformer line.

  ============================================================
  = IF YOU FIND HYGIENE CONFUSING, WRITE EVERYTHING WITH     =
  = RSC-MACRO-TRANSFORMER AS THOUGH IT WERE DEFMACRO. YOU    =
  = CAN ADD IN HYGIENE SEAMLESSLY IF AND WHEN ANY PROBLEMS   =
  = ARISE.                                                   =
  ============================================================

The next transformer is er-macro-transformer, where "er" stands for "Explicit Renaming."

  (define-syntax swap!
    (er-macro-transformer
      (lambda (form rename compare)
        (let ((a (cadr form))
              (b (caddr form)))
          `(,(rename 'let) ((,(rename 'value) ,a))
             (,(rename 'set!) ,a ,b)
             (,(rename 'set!) ,b ,(rename 'value)))))))

The result is handled just like in rsc-macro-transformer - raw identifiers are handled in the usage environment. Instead of a reference to the macro environment, we're given a RENAME procedure which explicitly makes a syntactic closure for the macro environment. RENAME is referentially transparent, so even though it's called twice on VALUE above the results are the same, where sameness is by comparison with the COMPARE procedure. I.e.

  (compare (rename 'foo) (rename 'foo)) => #t

Though you can of course rename everything you need once outside to preserve readability of the expression.
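
To make that last point concrete, here is a minimal sketch of the explicit renaming SWAP! with each rename done once up front. It assumes an implementation that provides er-macro-transformer (for instance through the syntactic-closures or riaxpander eggs). The %-prefixed names and the SWAPPED-PAIR procedure are purely illustrative and not part of any API.

```scheme
;; Sketch only: assumes er-macro-transformer is available.
;; %let, %set! and %value are illustrative local names.
(define-syntax swap!
  (er-macro-transformer
    (lambda (form rename compare)
      (let ((a (cadr form))
            (b (caddr form))
            (%let (rename 'let))
            (%set! (rename 'set!))
            (%value (rename 'value)))
        ;; Bare A and B are resolved in the usage environment;
        ;; the renamed identifiers refer to the macro environment.
        `(,%let ((,%value ,a))
           (,%set! ,a ,b)
           (,%set! ,b ,%value))))))

;; The caller's VALUE does not clash with the macro's renamed VALUE.
(define (swapped-pair)
  (let ((value 1) (other 2))
    (swap! value other)
    (list value other)))
```

With the renames hoisted, the quasiquoted template reads almost like the defmacro version, and (swapped-pair) should still return (2 1) even though the caller's own variable is called VALUE.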

--- The Hybrid System ---

The syntax-case macro system is sort of a hybrid, combining high-level and low-level features. Our example would become:

  (define-syntax swap!
    (lambda (stx)
      (syntax-case stx ()
        ((swap! a b)
         (syntax
           (let ((value a))
             (set! a b)
             (set! b value)))))))

The SYNTAX-CASE form destructures the STX input just like SYNTAX-RULES does. However, the body isn't a template, but rather is evaluated normally. SYNTAX is another special form that works to instantiate a template. Because this use of SYNTAX occurs lexically inside the (swap! a b) pattern, the instances of A and B in the syntax template hygienically refer to those parameters of the macro. If you moved the SYNTAX to a helper function it would break. So you can think of SYNTAX-CASE as unhygienically inserting some lexical environment information that SYNTAX refers to.

The nice thing is that this example actually does more than any of our previous SWAP! definitions in that it checks syntax and will signal a syntax error if not given two arguments.

On the other hand, for this example SYNTAX-RULES beats everybody:

  (define-syntax swap!
    (syntax-rules ()
      ((swap! a b)
       (let ((value a))
         (set! a b)
         (set! b value)))))

The advantage of SYNTAX-CASE over SYNTAX-RULES is that you don't have to just use SYNTAX, you can perform some arbitrary computation on sexps and then convert it to syntax. The basic pattern here would be:

  (define-syntax swap!
    (lambda (stx)
      (syntax-case stx ()
        ((swap! a b)
         (let ((a (syntax-object->datum (syntax a)))
               (b (syntax-object->datum (syntax b))))
           (datum->syntax-object
             (syntax swap!)
             `(let ((value ,a))
                (set! ,a ,b)
                (set! ,b value))))))))

That is, you destructure with SYNTAX-CASE, access the destructured info with SYNTAX, convert these to sexps with syntax-object->datum, perform arbitrary defmacro-style computations, and then convert it back to syntax with datum->syntax-object. Got it?
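
A quick way to convince yourself of the hygiene claims is to call the SYNTAX-RULES version of SWAP! on a variable that is itself named VALUE. The following is a minimal sketch in portable Scheme; SWAP-DEMO is just an illustrative name.

```scheme
(define-syntax swap!
  (syntax-rules ()
    ((swap! a b)
     (let ((value a))
       (set! a b)
       (set! b value)))))

;; The user's VALUE and the VALUE introduced by the template are
;; kept apart by the expander, so the swap still works.
(define (swap-demo)
  (let ((value 'x) (other 'y))
    (swap! value other)
    (list value other)))
```

Here (swap-demo) should return (y x). A naive defmacro with the same template would instead expand to (let ((value value)) (set! value other) (set! other value)), which leaves the caller's two variables unswapped.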

There are also utilities to streamline this somewhat like quasisyntax (which even gets its own new read syntax) and with-syntax, and a whole huge library of stuff. It's a very large and baroque system.

For pure template-style syntax, SYNTAX-RULES wins, and for pure low-level handling the other systems win because they don't get in your way as much. SYNTAX-CASE has a niche in medium-level complexity macros that benefit from destructuring plus a small amount of computation. On the other hand, if you don't tightly bind one specific destructuring idiom to your macro system, you can take your pick of any external matching or syntax-verifying libraries you want (e.g. use explicit renaming macros with the MATCH macro for destructuring).

Oh, there's another really serious problem here. Remember how all the syntactic closures based transformers were wrapped in a macro like sc-macro-transformer or er-macro-transformer? Well, the syntax case macro system doesn't have that:

  (define-syntax foo
    (lambda (stx)
      ...))

i.e. a macro is _always_ _explicitly_ a procedure of one argument, which is a syntax object (and contains all the syntax case semantics thereof). So if you implemented your system such that macros take three arguments (the form and the two environments), you're screwed. You have to resort to very ugly hacks to get these to work together.

So to summarize, SYNTAX-CASE does let you write both high and low level macros and preserve hygiene, and has some nice ideas, but I really dislike it and discourage its use for the following reasons:

  1) very, very large and baroque API and reader extensions
  2) forces a single destructuring idiom tightly integrated with the macro system, when this should be a purely orthogonal concept
  3) makes it very difficult to play along with alternate macro systems
  4) implicit unhygienic interaction between SYNTAX-CASE and SYNTAX, and in general confusing semantics
  5) identifier syntax (another huge, ugly can of worms I won't even get into here)

--- Macros in Chicken ---

OK, so now what macro systems are available in Chicken and how do we use them?

Core chicken by itself has define-macro, which is unhygienic. All of the alternative systems hook into Chicken by registering themselves as the macro expander, thus effectively throwing away any existing macros. They then reload their own versions of the standard Chicken macros (use, cond-expand, when, unless, define-macro, etc.). Thereafter (yes, load order matters here) any new macros are defined in terms of the new macro system. The hygienic systems mostly do provide a define-macro definition, but it should be avoided, as it becomes even more fragile than usual when combined with hygienic macros (interleaving them basically just doesn't work).

The alternative hygienic macro systems are:

                       syntax-rules?  low-level?  compiled-macros?
  alexpander                 O            X              X
  syntax-case                O            O              X
  simple-macros              O            O              X
  syntactic-closures         O            O              X
  riaxpander                 O            O              O

alexpander is a simple, lightweight implementation of syntax-rules only, with a few extensions, written by Al* Petrofsky. It's the only option here that doesn't have any low-level macros.

syntax-case is as bashed thoroughly above. If you use it I reserve the right to taunt you endlessly :P

simple-macros is a more recent system by Andre van Tonder which contains a full implementation of syntax-case.
So it's an even bigger system, and I believe more semantically complex, but I'm not too familiar with it.

The syntactic-closures egg is the original implementation by Bawden, modified heavily by Chris Hanson, and is the macro system currently used in MIT Scheme. It provides all three of the transformers (sc-, rsc- and er-) described above, as well as SYNTAX-RULES. It's the most light-weight of the low-level hygienic macro systems.

The riaxpander egg is a recent, clean implementation of syntactic-closures by Taylor Campbell, and is fully compatible with the syntactic-closures egg. I also recently added support for compiled macros so it loads an order of magnitude faster than any of the above systems.

Which should you use? If you're defining your own local macros for compile-time it's not really a big deal - use whatever is most convenient.

If you're _exporting_ macros, then this becomes an important decision. Exporting unhygienic macros is a bad idea. If at all possible, exported macros should be written with SYNTAX-RULES because that's universally supported by the alternatives.

If you really need to use low-level macros, then you have to choose between the syntax-case API and the syntactic-closures API. Obviously I prefer the latter :) You *don't* need to specify an implementation in either case - the user can choose whichever s/he prefers. For example, if we made a swap egg that exported our swap macro as an explicit renaming macro, then someone who wanted to use it would write

  (use syntactic-closures swap)

or

  (use riaxpander swap)

If you want to be very friendly, you can actually support all of the systems with judicious use of COND-EXPAND. If you look at the source to the matchable egg, it's 95% SYNTAX-RULES with a couple of COND-EXPANDed definitions for either syntax-case, syntactic-closures, or pure syntax-rules (the alexpander case).
If you look at the test egg, it just provides a few macros which are fully COND-EXPANDed to support any system - including Chicken's core define-macro.

--
Alex


