From: oleg AT okmij.org
To: caml-list AT inria.fr
Subject: [Caml-list] ANN: Brand-new BER MetaOCaml for OCaml 4.00.1
Date: 31 Jan 2013 07:49:03 -0000

BER MetaOCaml N100 is now available. It is a strict superset of OCaml
4.00.1, extending it with staging annotations to construct and run
typed code values. BER MetaOCaml has been completely re-implemented
and thus caught up with OCaml. For those who don't know what staging
or MetaOCaml is, a short introduction follows the news summary.

BER MetaOCaml is in the spirit of the original MetaOCaml by Taha and
Calcagno, but has been completely re-written using different
algorithms and techniques. BER MetaOCaml has been re-structured to
minimize the number of the changes to the OCaml type-checker and to
separate the `kernel' (type-checking and constructing code values)
from the `user-level'. Various ways of running the code -- compiling
to byte-code or machine instructions and executing them, or
translating code values to C or LLVM, or printing them -- can be done
in `user-level' libraries, without any need to hack into (Meta)OCaml.
(Printing and byte-code execution are included in BER N100.) This
release of BER MetaOCaml is meant to incite future research into
type-safe meta-programming.

To the user, the two major differences of BER N100 from the old
MetaOCaml are:

 -- constructor restriction: all data constructors and record labels
    used within brackets must come from the types that are declared in
    separately compiled modules.
 -- scope extrusion check: attempting to build code values with
    unbound or mistakenly bound variables (which is possible with
    mutation or other effects) is caught early, raising an exception
    with good diagnostics.

Both are explained after the short introduction to staging. Smaller
visible differences are better printing of cross-stage persistent
values and the full support for labeled arguments. A long-standing
problem with records with polymorphic fields has fixed itself. The BER
N100 code is now extensively commented, and has a regression test
suite.
BER N100 is much less invasive into OCaml: compare the size of
the patch to the main OCaml type-checker typing/typecore.ml, which
contains the bulk of the changes for type checking the staging
constructs. In the previous version BER N004, the patch had 564 lines
of additions, deletions and context; now, only 328 lines.
(The core MetaOCaml kernel is trx.ml, with 1800 lines.)

BER MetaOCaml N100 is available:

 -- as a set of patches to the OCaml 4.00.1 distribution.
    See the INSTALL document in that archive. You need the source
    distribution of OCaml 4.00.1, see the following URL for details.
 -- as a GIT bundle containing the changes relative to OCaml 4.00.1.
    First, you have to obtain the base
        git clone https://github.com/ocaml/ocaml.git -b 4.00 ometa4
    then switch to tag 4.00.1 and then apply the bundle.

Jacques Carette has extensively re-written the printing of code
values, and is currently maintaining this part. I'm grateful to him
for encouragement and discussions.


Introduction to staging and MetaOCaml

The standard example of meta-programming -- the running example of
A.P.Ershov's 1977 paper that begat meta-programming -- is the power
function, computing x^n. In OCaml:

    let square x = x * x
    let rec power n x =
      if n = 0 then 1
      else if n mod 2 = 0 then square (power (n/2) x)
      else x * (power (n-1) x)
    (* val power : int -> int -> int = <fun> *)

Suppose our program has to compute x^7 many times. We may define

    let power7 x = power 7 x

In MetaOCaml, we may also specialize the power function to a
particular value n, obtaining the code which will later receive x
and compute x^n. We re-write power n x annotating expressions as
computed `now' (when n becomes known) and `later' (when x is given).

    let rec spower n x =
      if n = 0 then .<1>.
      else if n mod 2 = 0 then .<square .~(spower (n/2) x)>.
      else .<.~x * .~(spower (n-1) x)>.
    (* val spower : int -> ('cl, int) code -> ('cl, int) code = <fun> *)

The two annotations, or staging constructs, are brackets .< e >. and
escape .~e . Brackets .< e >. `quasi-quote' the expression e,
annotating it as computed later.
Escape .~e, which must be used within brackets, tells that e is
computed now but produces the result for later. That result, the
code, is spliced into the containing bracket.

The inferred type of spower is different. The result is no longer an
int, but ('cl, int) code -- the code of expressions that compute an int.
The first type argument to 'code', often named 'cl, is a so-called
environment classifier and can be skipped on first reading. The type
of spower spells out which argument is received now, and which
later. To specialize spower to 7, we define

    let spower7_code = .<fun x -> .~(spower 7 .<x>.)>.;;
    (*
    val spower7_code : ('cl, int -> int) code = .<
      fun x_1 ->
       (x_1 *
         (((* cross-stage persistent value (id: square) *))
           (x_1 *
             (((* cross-stage persistent value (id: square) *))
               (x_1 * 1)))))>.
    *)

and obtain the code of a function that will receive x and return
x^7. Code, even of functions, can be printed, which is what the
MetaOCaml toplevel did. The print-out contains so-called cross-stage
persistent values, or CSP, which `quote' present-stage values such as
square to be used later. One may think of CSP as references to
`external libraries' -- only in our case the program acts as a library
for the code it generates.

If we want to use the thus specialized x^7 now, in our code, we should
compile spower7_code and link it back to our program. This is called
`running the code':

    let spower7 = .! spower7_code
    (*
    val spower7 : int -> int = <fun>
    *)

The specialized spower7 has the same type as the partially applied
power7 above. They behave identically. However, power7 x will do
recursion on n, checking n's parity, etc. In contrast, specialized
spower7 has no recursion (as can be seen from spower7_code).
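
For readers without a MetaOCaml installation, the difference between
power7 and the specialized spower7 can be approximated in plain OCaml.
The following is only a rough sketch under stated assumptions: a small
expression tree stands in for typed code values, and an interpreter
stands in for `running the code'. The names expr, show and eval are
hypothetical and not part of MetaOCaml, and none of the typing
guarantees of brackets and escapes are reproduced.

```ocaml
(* A toy stand-in for code values: a tiny expression tree.
   expr, show and eval are hypothetical names, not MetaOCaml API. *)
type expr =
  | Var of string
  | Int of int
  | Mul of expr * expr
  | Square of expr

(* Mirrors spower: all recursion on n happens now; the result is
   a tree describing the computation to be done later on x. *)
let rec spower n x =
  if n = 0 then Int 1
  else if n mod 2 = 0 then Square (spower (n / 2) x)
  else Mul (x, spower (n - 1) x)

(* Print the generated expression. *)
let rec show = function
  | Var v -> v
  | Int i -> string_of_int i
  | Mul (a, b) -> "(" ^ show a ^ " * " ^ show b ^ ")"
  | Square e -> "(square " ^ show e ^ ")"

(* Interpret the tree; the sketch has a single variable, so every
   Var is looked up as the supplied value x. *)
let rec eval x = function
  | Var _ -> x
  | Int i -> i
  | Mul (a, b) -> eval x a * eval x b
  | Square e -> let v = eval x e in v * v

let spower7_code = spower 7 (Var "x")
let spower7 x = eval x spower7_code

let () =
  print_endline (show spower7_code);
  Printf.printf "%d\n" (spower7 2)
```

The printed tree contains no trace of n: the parity tests and the
recursion were spent while building it, which is the point the
spower7_code print-out makes.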
All operations on n have been done when the spower7_code was computed,
producing the straight-line code spower7 that operates only on x.

MetaOCaml supports an arbitrary number of later stages, letting us
write not only code generators but also generators of code generators,
etc.


Data constructor restriction

BER MetaOCaml N100 imposes the restriction that all data constructors
and record labels used within brackets must come from the types that
are declared in separately compiled modules. For example, the
following all work:

    . .;;
    . .;;
    . .;;
    ..;;
    let open Complex in ..;;
    . .;;

because data types bool, option, list, Complex are either from
Pervasives or defined in the (separately compiled) standard library.
However, the following are not allowed and are flagged as compile-time
errors:

    type foo = Bar;;
    . .
    module Foo = struct exception E end;;
    . .

The type declaration foo or the module declaration Foo must be moved
into a separate file. The corresponding .cmi file must also be
available at run-time: either placed into the same directory as the
executable, or somewhere within the OCaml library search path.


Scope extrusion test

Although MetaOCaml permits manipulation and splicing of open code, its
type system statically ensures that only closed code can be printed or
run: we can't run the code we haven't finished constructing. For
example,

    .<fun x -> .~(let y = .! .<x>. in .<y>.)>.;;

gives a type error since .<x>. is obviously open code.

This static guarantee holds only for pure code. Effects such as
storing code values in mutable cells void the guarantee. Here is the
example using the _old_ MetaOCaml from 2006 (version 3.09.1 alpha
030). (The problem can be illustrated more simply, but the following
example is more realistic and devious.)

    let c =
      let r = ref .<fun z -> z>. in
      let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in
      .<fun x -> .~f (.~(!r) 1)>. ;;
    (*
    val c : ('a, '_b -> int) code =
      .<fun x_1 -> ((fun x_2 -> 0) ((fun y_3 -> x_2) 1))>.
    *)

One must look hard to see that x_2 is actually unbound in the resulting
code. The problem is revealed when we attempt to run that code:

    .! c;;
    (*
    Characters 77-78:
    Unbound value x_2
    Exception: Trx.TypeCheckingError.
    *)

Since we get the error anyway (without much diagnostics though), one
may discount the problem. Alas, sometimes scope extrusion results in
no error -- just in wrong results.

    let c1 =
      let r = ref .<fun z -> z>. in
      let _ = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in
      !r;;
    (*
    val c1 : ('a, '_b -> '_b) code = .<fun y_3 -> x_2>.
    *)

We then use c1 to construct the code c2:

    let c2 = .<fun y -> fun x -> .~c1>.;;
    (*
    val c2 : ('a, 'b -> 'c -> '_d -> '_d) code =
      .<fun y_1 -> fun x_2 -> fun y_3 -> x_2>.
    *)

which contains no unbound variables and can be run without problems.

    (.! c2) 1 2 3;;
    (* - : int = 2 *)

It is most likely that the user did not intend for 'fun x ->' in c2 to
bind x in c1. This is a blatant violation of lexical scope. And yet we
get no error or other overt indication that something went wrong.

BER MetaOCaml N100 has none of this. Although the type system still
permits code values with escaped variables, attempting to use such
code in any way -- splice, print or run -- immediately raises an
exception. For example, entering the expression c in the top-level BER
MetaOCaml N100 gives

    Exception: Failure
    Scope extrusion at Characters 89-99:
      let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in
                                   ^^^^^^^^^^
    for the identifier x_7 bound at Characters 74-75:
      let f = .<fun x -> .~(r := .<fun y -> x>.; .<0>.)>. in

The exception message tells which variable got away, where it was
bound and where it eloped.

The file NOTES.txt in the BER MetaOCaml distribution describes the
features of BER MetaOCaml in more detail and outlines directions for
further development. Hopefully the release of BER MetaOCaml N100 will
stimulate using and researching typed meta-programming.
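
The idea of catching an escaped variable at the moment the code is
used, rather than when it is run, can be illustrated in plain OCaml.
The following is a toy sketch only, and it is an assumption that it
resembles BER N100 in spirit; it is not the algorithm BER MetaOCaml
uses. Code values are modelled as a small tree, the hypothetical
helper lam gives each binder a fresh name and records it as live
while its body is being built, and the hypothetical helper check
rejects any code value that mentions a binder which is no longer
live. The example reproduces the c1/c2 scenario above.

```ocaml
(* Toy model of code values; expr, lam, check and Scope_extrusion
   are hypothetical names, not BER MetaOCaml API. *)
type expr =
  | Var of string
  | Int of int
  | Lam of string * expr

exception Scope_extrusion of string

let counter = ref 0
(* Binders whose bodies are currently under construction. *)
let live : string list ref = ref []

(* Free variables of an expression. *)
let rec free = function
  | Var v -> [v]
  | Int _ -> []
  | Lam (v, b) -> List.filter (fun w -> w <> v) (free b)

(* Using a code value: every free variable must still be live. *)
let check e =
  List.iter
    (fun v -> if not (List.mem v !live) then raise (Scope_extrusion v))
    (free e);
  e

(* Build a function: fresh binder name, live only during the body. *)
let lam name body =
  incr counter;
  let v = Printf.sprintf "%s_%d" name !counter in
  let saved = !live in
  live := v :: saved;
  let b =
    (try check (body (Var v))
     with ex -> live := saved; raise ex) in
  live := saved;
  Lam (v, b)

(* As in c1: the variable x leaks out through the mutable cell r. *)
let c1 =
  let r = ref (lam "z" (fun z -> z)) in
  let _ = lam "x" (fun x -> r := lam "y" (fun _ -> x); Int 0) in
  !r

(* As in c2: splicing c1 under a new 'fun x' is now rejected. *)
let result =
  try
    ignore (lam "y" (fun _ -> lam "x" (fun _ -> check c1)));
    "no error"
  with Scope_extrusion v -> "scope extrusion of " ^ v

let () = print_endline result
```

Because every binder gets a fresh name, the new 'fun x' in the last
step cannot silently capture the stale variable from c1; the check
sees a variable that belongs to no binder currently being built and
raises the exception instead.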