( defn morph-form [tree pred f] ( loop [loc ( zip /seq-zip tree)] ( if ( zip /end? loc) ( zip /root loc) ( recur ( zip /next ( if (pred ( zip /node loc)) ( zip /replace loc (f ( zip /node loc))) loc)))))) ( defn remove-form [tree pred] ( loop [loc ( zip /seq-zip tree)] ( if ( zip /end? loc) ( zip /root loc) ( recur ( zip /next ( if (pred ( zip /node loc)) ( zip /remove loc) loc)))))) ( defn is-form? [& s] ( fn [f] ( and (seq? f) (some true? (map #(= % (first f)) s)))))

During each pass we iterate over the nodes in the form using morph-form and remove-form, they both take a s-expression and a predicate if the predicate returns true, morph-form will call f passing the current node as an argument and replace that node with f's return value, remove-form on the other hand does what its name suggests and removes the node when predicate returns true.

( defn dispatch-reader-macro [ch fun] ( let [dm (.get ( doto (.getDeclaredField clojure.lang.LispReader "dispatchMacros" ) (.setAccessible true )) nil )] (aset dm (int ch) fun))) ( defn native-string [rdr letter-u] ( loop [s (str ) p \space c (char (.read rdr))] ( if ( and (= c \# ) (= p \> )) s ( recur (str s p) c (char (.read rdr)))))) (dispatch-reader-macro \< native-string)

We install a custom reader macro, what it does is turn everything between #< and ># into a string, this makes life so much easier when you need to embed native code into a function, otherwise it is a nightmare to indent native code in a string.

( defn process [form] ( ->> (expand-macros form) (add-built-in) (expand-macros) (vector->list) ( let->fn ) (do->fn) (closure-conversion) (symbol-conversion) (vector->list)))

Forms go through eight transformations before they are passed to the code generation phase.

( defn expand-macros [form] ( let [macros ( ->> (read-string (str \( (read-from-url "runtime.clj" ) \) )) ;; get built in macros (filter (is-form? 'defmacro)) ;; merge user defined macros (concat (filter (is-form? 'defmacro) form))) form (remove-form form (is-form? 'defmacro)) temp-ns (gensym)] (create-ns temp-ns) ( binding [ *ns* (the-ns temp-ns)] (refer 'clojure.core :exclude (concat (map second macros) ['fn 'let 'def])) (use 'clojure.contrib.macro-utils) ( doseq [m macros] (eval m))) ( let [form (morph-form form (apply is-form? (map second macros)) ( fn [f] ( binding [ *ns* (the-ns temp-ns)] (macroexpand-all f))))] (remove-ns temp-ns) form)))

First we read all the macros present in runtime.clj then add to that user defined macros, they are evaluated in a temporary namespace, using morph-form we iterate all the macros used in the code that we are compiling and expand them in the temporary namespace then the node is replaced with its expanded form.

( defn add-built-in ([form] ( let [built-in ( ->> (read-string (str \( (read-from-url "runtime.clj" ) \) )) (filter (is-form? 'defn)) (reduce ( fn [h v] (assoc h (second v) v)) {})) fns (ref {'list (built-in 'list)}) form (add-built-in form built-in fns)] (concat (vals @fns) form))) ([form built-in fns] (morph-form form symbol? #( do ( if-let [f (built-in % )] ( when (not (@fns % )) ( do ( dosync (alter fns assoc % f)) (add-built-in (expand-macros (drop 3 f)) built-in fns)))) % ))))

In order to keep the generated C++ code compact only the functions used will be present in the generated source file. Which means if you don't use println anywhere in the code it won't be defined in the final C++ file, but if you use it, it and everything it uses will be defined, in the case of println it will pull apply, print and newline with it.

( defn vector->list [form] (morph-form form vector? #(reverse (into '() % ))))

Since there is no support for vectors, they are converted to lists.

( defn let->fn [form] (morph-form form (is-form? 'let) ( fn [[_ bindings & body]] ( let [bindings (partition 2 bindings) vars (flatten (map first bindings)) defs (map #(cons 'define-var % ) bindings) body-fn (cons (concat ['fn vars] body) vars)] (list (concat ['fn []] defs [body-fn]))))))

let forms are transformed into nested functions which are then called immediately, bindings are setup in the outer function, expressions are placed in the inner function which takes the bindings as arguments.

So following form,

( let->fn '( let [a 1 b 2] (+ a b)))

after transformation becomes,

(( fn [] ( define-var a 1) ( define-var b 2) (( fn (a b) (+ a b)) a b)))

( defn do->fn [form] (morph-form form (is-form? 'do) #(list (concat ['fn []] (rest % )))))

A similar method is used for the do form, expressions are wrapped in a fn that takes no parameters and executed in place.

(do->fn '( do (+ 1 1)))

(( fn [] (+ 1 1)))

( defn lambda-defined? [fns env args body] ( let [f (concat [env args] body) name (reduce ( fn [h v] ( let [[_ n & r] v] ( if (= r f) n))) nil @fns)] ( when name (apply list 'lambda-object name env)))) ( defn define-lambda [fns env args body] ( let [n (gensym)] ( dosync (alter fns conj (concat ['define-lambda n env args] body))) (apply list 'lambda-object n env))) ( defn closure-conversion ([form] ( let [fns (ref []) form (closure-conversion form fns)] (vector->list (concat @fns form)))) ([form fns & env] (morph-form form (is-form? 'fn) ( fn [[_ args & body]] ( let [env ( if (nil? env) '() (first env)) body (closure-conversion body fns (concat args env))] ( if-let [n (lambda-defined? fns env args body)] n ( define-lambda fns env args body)))))))

closure-conversion handles the problem of free variables,

( defn make-adder [x] ( fn [n] (+ x n)))

in the above snippet x is a free variable, the function make-adder returns, has to have a way of referencing that variable when it is used. The way we do this is that, every function will pass its arguments to inner functions (if any) it contains.

(closure-conversion '( fn [x] ( fn [n] (+ x n))))

Above form will be converted to,

( define-lambda G__265 (x) (n) (+ x n)) ( define-lambda G__266 () (x) (lambda-object G__265 x))

What this means is, define a functor named G_ 265 that holds a reference to x, and another functor G_ 266 that has no state. When we create an instance of G_ 265 we pass x to its constructor. Since every thing is already converted to fns this mechanism allows variables to be referenced down the line and solves the free variable problem.

( defn symbol-conversion [form] ( let [c (comp #(symbol (escape { \- \_ \* "_star_" \+ "_plus_" \/ "_slash_" \< "_lt_" \> "_gt_" \= "_eq_" \? "_QMARK_" } (str % ))) #( cond (= 'not % ) '_not_ :default % ))] (morph-form form symbol? c)))

Final step converts all symbols that are not legal C++ identifiers into valid ones.