MyDef Manual

1 Introduction

1.1 Introduction to MyDef

MyDef is a general-purpose preprocessor: it processes input and generates output, rearranging blocks of text according to a small but powerful set of preprocessing directives and expanding macros that are marked with a special syntax. MyDef adds a meta-layer on top of any programming language, which allows you to factor code and customize syntax at a higher level of abstraction.

A typical programming language consists of a semantics layer and a syntax layer. The former defines entities such as data types, variables, and functions, and how they behave; the latter defines the textual form that describes these entities. MyDef works purely on the syntax layer and provides extra control over how code is written and read.

At its base level, MyDef is used for code factoring and code rearrangement. Factoring covers cases such as boilerplate and repetitive code. Rearrangement covers cases such as organizing code in a top-down form, or grouping semantically related definitions, types, variables, and code together. With MyDef, it is possible to put all feature-related code in a single file, e.g. feature_A.def, so that including or excluding a feature becomes a matter of including, or commenting out the inclusion of, feature_A.def in the main file. This is in contrast with the common practice of scattering feature-related code across the source with #ifdef.

1.2 Problems and Bugs

If you encounter problems with MyDef, please feel encouraged to raise an issue at https://github.com/hzhou/MyDef/issues. You are also welcome to send e-mail to mydef at hzsolutions.net. However, there is no guarantee that your issues or questions will be addressed within any particular time frame.

Because MyDef works only on the syntax layer, almost all of its errors result in syntax errors in the output, and typical language compilers are very good at catching and reporting those. Syntax errors are generally easy to fix. The base features of MyDef are fairly robust. However, MyDef development is constantly adding and experimenting with extra features. In addition, because MyDef is flexible, users can develop custom plug-ins that introduce features that are fragile in nature. If you encounter errors when using a certain feature, then in addition to learning more about the feature, there is always the option of bypassing it altogether. MyDef's syntax is designed to be distinct from the syntax of most languages, so you can always write your code in vanilla form and MyDef will pass it to the output directly.

1.3 Using this manual

This manual contains a number of examples of MyDef input and output, and a simple notation is used to distinguish input, output, and error messages from MyDef. Examples are set out from the normal text and shown in a fixed-width font, like this:

    page: test
        module: perl

        $print Hello World!

To illustrate command line examples, a shell prompt '$ ' is shown along with the command line input, while the program output is shown without the prompt, like this:

    $ mydef_run test.def
    PAGE: t
      --> [./t.pl]
    perl ./t.pl
    Hello World!

2 Installations

MyDef repositories are separated into a base repository and individual output-module repositories. The base repository implements features that are common to all output modules. Output modules depend on the MyDef base code and implement language-specific or application-specific features.

2.1 Install base MyDef

Check that your system has perl, make, and git installed. To install MyDef for the first time, use the bootstrap script as follows:

    $ git clone https://github.com/hzhou/MyDef.git
    $ cd MyDef
    $ sh bootstrap.sh

By default, bootstrap.sh installs into $HOME/bin, $HOME/lib/perl5, and $HOME/lib/MyDef. For MyDef to work, you also need to set the following environment variables, preferably inside your ~/.bashrc:

    export PATH=$HOME/bin:[rest of your path]
    export PERL5LIB=$HOME/lib/perl5
    export MYDEFLIB=$HOME/lib/MyDef

It is also recommended to keep the MyDef repository after installation, because you will need the base repository to install additional output modules. For that purpose, you will also need to set the environment variable MYDEFSRC:

export MYDEFSRC=[path to your MyDef repository]

The MyDef repository can be updated by running git pull. The bootstrap process creates a Makefile, so after an update, or after editing files locally, you can simply run make to install the changes:

    $ make
    $ make install

2.2 What is installed

The following are installed in $HOME/bin (they are all Perl scripts):

mydef_page: compiles from .def to output(s).

mydef_make: checks the .def files in the current folder and the config file, if it exists, and outputs a Makefile.

mydef_run: convenience script to compile and run a single program.

mydef_install: installs files into $MYDEFLIB, $PERL5LIB, or the first location of $PATH.

The following are installed in $PERL5LIB :

mydef.pm: defines global variables, loads output_modules, etc.

MyDef/parseutil.pm: defines routines for loading .def files.

MyDef/compileutil.pm: defines routines for translating into output, defines macros and preprocessing functions.

MyDef/dumpout.pm: defines routines for final output.

MyDef/utils.pm: defines some helper routines.

MyDef/output_general.pm: the default output module.

MyDef/output_perl.pm: output module for perl code.

The following are installed in $MYDEFLIB :

std_general.def: macros automatically loaded by output_general.pm; by default it is empty.

std_perl.def: macros automatically loaded by output_perl.pm.

These are the essential files needed for basic MyDef functions. Additional files may also be installed, which add debugging functions or extra def libraries.

2.3 Install additional output_modules

The base MyDef installs output_general.pm, which is used by default when no module option is given in the config file, in the .def source, or on the command line of mydef_page or mydef_run. output_general only translates base MyDef macros and preprocessing directives, and it can be used to generate any text file, including .txt, .pl, .c, or source code for any programming language. The extension of each output file can be specified individually inside the .def source. By default, .txt is assumed.

Although the output_general module can be used for any programming language, language-specific or application-specific output modules can be developed to add further features. For example, the output_perl and output_c modules can automatically add semicolons or curly braces as needed, so the programmer can optionally omit them. output_c can also manage automatic variable and function declarations with some type inference logic. As another example, output_win32 adds features that automatically manage WNDPROC message handlers to make win32 programming more flexible.

Other than output_general and output_perl, all output modules have their own repositories and need to be installed individually. To install these additional output modules, make sure that you have an up-to-date base MyDef repository and that the environment variable $MYDEFSRC points to its location. The installation process is very similar for all modules. For example, the output_c module can be installed like this:

    $ git clone https://github.com/hzhou/output_c.git
    $ cd output_c
    $ mydef_make
    $ make
    $ make install

It will compile and install output_c.pm into $PERL5LIB/MyDef/, and std_c.def and some other standard def libraries into $MYDEFLIB.

The following output_modules are currently available from https://github.com/hzhou/:

output_c, output_python, output_java, output_www, output_win32, output_xs, output_fortran, output_pascal, output_tcl, output_go, output_rust, output_glsl

Not all modules are equally developed or tested.

3 Basic MyDef syntax

3.1 Basic structure of .def file

The basic structure of a .def file is indentation based. An example .def file may look like this:

    include: common.def
    include: macros/utils.def

    page: test
        Line 1
        Line 2
        $call A
        lines after

    subcode: A
        line a - $(msg_a)
        line b - $(msg_b)

    macros:
        msg_a: this is message a
        msg_b: this is message b

At the top indentation level, only the headings include:, page:, subcode:, and macros: are recognized by parseutil.pm. However, any additional lines not under these headings do not cause a MyDef error; they are silently ignored. So Knuth-style literate programming can be achieved with MyDef by freely mixing documentation (at the top indentation level or under any unrecognized heading) and code (under the recognized headings).

Explicit comments can be introduced with # at any indentation level, between lines of code, or at the end of any line.

include: adds files to a list, and they are loaded in the order in which they were added. Note that this differs from the C preprocessor: a file can be included multiple times but will only be loaded once. An included file is searched for in the current working directory, in the include_path setting of the config file, and in the environment variable $MYDEFLIB.

page: defines the output file name and its main code. Each line in the main code will be copied, macro-expanded, or translated, and finally dumped to the output file. For output modules other than output_general, additional lines may be inserted into or appended to the main code. For example, output_perl will automatically insert the #! line and use strict; for perl scripts.

subcode: defines blocks of code that can be inserted into the main code or into another subcode with syntax like $call codename. The syntax resembles function definitions and function calls in a programming language. However, it is more appropriate to think of a subcode as merely a multi-line macro that is expanded at block or line level. Its usage is very similar to that of inline macros, and it is expanded regardless of the underlying language syntax. Subcodes allow code to be reorganized according to a higher logic than the semantics of a programming language.

macros: defines a set of name-value definitions that are used as inline macros. To use these macros and have them expanded in the output, use syntax like $(msg_a). This syntax was chosen to avoid collisions with common language syntax. Having macros stand out makes the subtle differences between macros and programming language entities such as variables and functions explicit, which helps with both writing and reading programs.

The code block under page: is actually a subcode named main. In fact, it is equivalent to declaring the main code like this:

    page: test
        subcode: main
            line 1
            line 2
            ...

Upon compilation, each page output looks for a main subcode. If none is found, the output will be blank regardless of how many subcodes or macros are defined in the def files.

Not all def files contain page: directives. In practice, most def files contain only macro and subcode definitions and are brought into the main def file with include:.

The overall structure of def files is declarative. The order of include:, page:, subcode:, and macros: does not matter in most cases. The few cases where it does matter are discussed in the following chapters, together with the relevant details.

The indentation needs to be strictly observed directly under page:, subcode:, and macros:. More specifically, all lines that are supposed to be under the directive need to be indented at least as deeply as the first line under it. Otherwise, part of the code block or definitions may be ignored during parsing. The code under subcode, which includes the main code under page, may contain additional indentation within. All of this internal indentation is preserved and passed to the output. However, at parsing time, the leading spaces of each line are stripped and encoded merely as an indentation level. Upon output, the indentation is added back by inserting 4 spaces for each indentation level. So the output code may look slightly different from what is written in the subcode. For most languages, the exact leading spaces are insignificant and only the indentation level matters, so MyDef's indentation treatment should not pose any problem. In fact, it helps to make the output code more readable.
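The strip-and-re-indent behavior described above can be illustrated with a short Python sketch. This is not MyDef's parser (which is written in Perl); the function names and the rule that every deeper indent opens exactly one new level are assumptions made for illustration.

```python
def encode_indentation(lines):
    """Return (level, text) pairs, discarding the exact leading spaces.

    Assumption: any line indented deeper than the current level opens
    one new level, however many extra spaces it uses.
    """
    stack = [0]      # indent widths of the currently open levels
    encoded = []
    for raw in lines:
        text = raw.lstrip(" ")
        if text == "":
            # Blank line: keep it at the current level.
            encoded.append((len(stack) - 1, ""))
            continue
        width = len(raw) - len(text)
        # Close levels that are deeper than this line.
        while len(stack) > 1 and width < stack[-1]:
            stack.pop()
        # Open a new level if this line is deeper than the current one.
        if width > stack[-1]:
            stack.append(width)
        encoded.append((len(stack) - 1, text))
    return encoded


def dump_indentation(encoded, unit="    "):
    """Re-emit the lines with 4 spaces per indentation level."""
    return [unit * level + text if text else "" for level, text in encoded]
```

Input that mixes 2-space and 3-space indents comes out with a uniform 4 spaces per level, which is why the output can look slightly different from the source while keeping the same nesting.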

3.2 Scope

Both subcode and macros can be nested and they follow the visibility of the scope. It is possible to structure code like this:

    macros:
        A: global macro

    subcode: A
        global subcode. Inline macros and subcodes have different namespaces.

    page: test
        m: page-level macros can be directly defined here, but only directly under the page: directive

        macros:
            p: this is also a page-scope macro

        subcode: main
            These subcodes (including main) are defined at the page scope
            $call A
            $call a
            $call c

        subcode: a
            This is at page scope
            subcode: c
                nested subcode
            macros:
                q: only visible inside subcode: a

The scopes are observed at expansion time. In the above example, subcode: a inside page is normally not visible to subcode: A. However, when A is called inside the page, the scope of subcode: A becomes nested under the page scope, and subcode: a becomes accessible. While this expansion-time scoping rule is flexible and can be used to enhance code readability in certain scenarios, it is also very easy to abuse it and create spaghetti code. Apply common sense.

3.3 Dynamic inline macros

The macros: directive is not the only way to define inline macros. Because a macros: directive has to be placed away from where the macros are used, it is mostly useful for macros that need to be visible across larger scopes. For local use, there is another syntax:

    subcode: A
        $(set:a=A_very_long_array_variable[i])
        $(a) = $(a) * $(a)

In a global context, long and descriptive names are desirable; in a small local context, short names are desirable. Macros provide a simple and straightforward solution. It is convenient to define a macro right where it is used, so programmers do not have to scroll far to infer its meaning. The local macro automatically expires at the end of the block context, which removes the worry of name pollution.

3.4 Subcode with parameters

There are two ways to pass parameters into a subcode. We can always set macros right before the $call, as described in the previous section. The drawback of this method is that the signature of the parameters is not defined where the subcode is defined. This can be remedied by adding comments, but it is still less than ideal. The calling interface is also somewhat clumsy.

Alternatively, we can define the subcode with parameters directly, using the following syntax:

    subcode: A(greet, name)
        printf("$(greet), $(name)!\n");

    page: t
        $call A, Hello, World
        $call A, Howdy, Guest

Multiple parameter names can be defined on the subcode: line, inside parentheses and separated by commas. Correspondingly, $call needs to append the actual values for the parameters after the subcode name, separated by commas. In the implementation, parameters are essentially the same as inline macros and are used in the same way.

The number of values passed in at $call time must match the number of parameters defined for the subcode; a mismatch raises an error. The values at $call are split simply by comma (and any spaces that follow). This can pose a problem when a value itself contains a comma. In practice, a few heuristic rules in the implementation make these situations tolerable. For one, if the comma is enclosed in parentheses or quotation marks, it is not treated as a separator:

    subcode: A(greeting)
        printf($(greeting));

    page: t
        $call A, "Hello, World!\n"

Of course, this will not automatically solve all use cases. In some cases you may even want the comma to act as a separator despite the quotation marks or parentheses around it. For tricky situations, MyDef may simply choose not to support them. Supporting tricky situations often complicates both code and usage, and sometimes explicit non-support is the better option. You can always avoid a tricky feature by not using it at all: write your code directly and let MyDef pass it through to the output.

That said, if you want the last parameter to absorb all remaining text as a single value, there is a syntax for that:
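To make the comma heuristic concrete, here is a minimal Python sketch of splitting on commas while ignoring those inside quotation marks or parentheses. It illustrates the idea only and is not MyDef's actual splitting code; the function name split_params is hypothetical, and backslash-escaped quotes and apostrophes inside words are deliberately not handled.

```python
def split_params(text):
    """Split on commas that are outside quotes and parentheses."""
    parts, current = [], []
    depth = 0        # parenthesis nesting depth
    quote = None     # the quote character we are inside of, if any
    for ch in text:
        if quote:
            current.append(ch)
            if ch == quote:
                quote = None          # closing quote
        elif ch in "\"'":
            quote = ch                # opening quote
            current.append(ch)
        elif ch == "(":
            depth += 1
            current.append(ch)
        elif ch == ")":
            depth = max(depth - 1, 0)
            current.append(ch)
        elif ch == "," and depth == 0:
            # A top-level comma ends the current value.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts
```

A sketch like this also shows why the heuristic cannot satisfy every use case: once a quote or parenthesis is open, there is no way to ask for a split inside it.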

    subcode: A(greet, @names)
        $print $(greet), $(names)!

    page: t
        $call A, Hello, Alice, Bob, and Carol

Prepending @ to the last parameter name in the subcode: definition signals the MyDef compiler to absorb all remaining text at $call time. Obviously, @param only works for the last (or only) parameter. In addition, @param also signals the MyDef compiler that an empty value is acceptable for that parameter, so MyDef will not raise an error even when the value is missing at $call time. So use it with caution.
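The binding rules for plain and @-prefixed parameters can be sketched in Python as follows. The function bind_params is hypothetical and uses a plain comma split (no quote or parenthesis handling) to keep the example short; it only mirrors the rules stated above and is not taken from the MyDef source.

```python
def bind_params(names, arg_text):
    """Map parameter names to values; a final '@name' absorbs the rest."""
    absorb = bool(names) and names[-1].startswith("@")
    fixed = names[:-1] if absorb else names
    if absorb:
        # Split only as often as needed for the fixed parameters,
        # leaving the remainder intact for the absorbing one.
        pieces = [p.strip() for p in arg_text.split(",", len(fixed))]
    else:
        pieces = [p.strip() for p in arg_text.split(",")]
    if pieces == [""]:
        pieces = []
    if absorb:
        if len(pieces) < len(fixed):
            raise ValueError("missing values for fixed parameters")
        bound = dict(zip(fixed, pieces))
        # An absent value for the '@' parameter is allowed and becomes empty.
        bound[names[-1][1:]] = pieces[len(fixed)] if len(pieces) > len(fixed) else ""
        return bound
    if len(pieces) != len(names):
        raise ValueError("parameter count mismatch")
    return dict(zip(names, pieces))
```

The empty-value case is the one that calls for caution: a forgotten argument no longer produces an error, it silently expands to nothing.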

3.5 Block call

One of the main purposes of subcode is to capture text patterns. A common pattern consists of an opening part and a closing part. In many languages and other implementations, this kind of pattern requires two or more separate subcode definitions that are also called separately, which is ugly. For this type of pattern, MyDef supports the &call syntax, like this:

    subcode: tag(name)
        <$(name)>
            BLOCK
        </$(name)>

    page: t
        &call tag, html
            &call tag, body
                &call tag, h1
                    Hello, World!

&call requires that an indented block follow the call. The code in that block is inserted into the subcode expansion, replacing the keyword BLOCK. BLOCK is one of a few MyDef keywords that are special in addition to the $(macro...) syntax.

The code that replaces BLOCK will have its own scope and is nested inside the subcode being called, and therefore can access all macros and subcodes that are defined within. For example, you may use block call purely for context:

    subcode: Parsing(line)
        BLOCK
        subcode: parse_string
            $if $(line)=~/"([^"]*)"/
                BLOCK
        subcode: parse_number
            ...

    page: t
        &call Parsing, text
            &call parse_string
                $print got string $1

3.6 Map call

For subcodes with a single parameter, it is sometimes desirable to call the subcode several times with different values in a concise way. For this, we can use the $map call:

    subcode: greet(name)
        push @guests, "$(name)"
        $print Hello $(name)!

    page: t
        $map greet, Alice, Bob, Carol

Less commonly, the $map syntax can also be used with subcodes that take multiple parameters, by sharing the common parameters:

    subcode: A(greet, name)
        $print $(greet), $(name)!

    page: t
        $map A(Hello), Alice, Bob, Carol

One of the benefits of the $map call is that it allows certain uses of macros:

    macros:
        Guests: Alice, Bob, Carol

    subcode: A(greet, name)
        $print $(greet), $(name)!

    page: t
        $map A(Hello), $(Guests)

It should be noted that there are preprocessing directives, introduced later, that achieve a similar result.

3.7 Nest call

A similar but even more exotic syntax is the $nest call:

    subcode: each(A)
        for(int i_$(A)=0;i_$(A)<n_$(A);i_$(A)++){
            $(set:i=$(A)_list[i_$(A)])
            $call append_string, s, $(i)
            BLOCK

    page: t
        $map init_list, A, B, C, D
        $nest each, A, B, C, D
            $print s

$nest is syntactic sugar for:

    &call each, A
        &call each, B
            &call each, C
                &call each, D
                    $print s
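The rewriting that $nest performs can be sketched in a few lines of Python. The helper expand_nest is hypothetical and works on plain strings; it only shows how one $nest line unfolds into a chain of nested &call lines with the body placed at the innermost level.

```python
def expand_nest(subcode, values, body_lines, unit="    "):
    """Unfold '$nest subcode, v1, v2, ...' into nested '&call' lines."""
    lines = []
    for depth, value in enumerate(values):
        # Each value adds one more level of nesting.
        lines.append(unit * depth + f"&call {subcode}, {value}")
    for body in body_lines:
        # The body sits inside the innermost call.
        lines.append(unit * len(values) + body)
    return lines
```

Calling expand_nest("each", ["A", "B", "C", "D"], ["$print s"]) produces the five lines shown in the desugared form above.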

3.8 Multiply defined subcode

A common usage pattern is to use a subcode as a placeholder, for example:

    page: t
        $call @main_init
        $call main_loop
        $call finish

main_init is a placeholder that will hold initialization code related to many different features. The @ prepended to the name suppresses the warning when the subcode is not defined, which is perfect for placeholders.

However, it is not desirable to mix code that belongs to different semantic groups, even though it all needs to run in the same run-time context. For that, MyDef allows a subcode to be defined multiple times.

    subcode:: main_init
        feature A related initialization

    subcode: other_subcode_related_to_feature_A
        ...

    subcode:: main_init
        feature B related initialization

    subcode: other_subcode_related_to_feature_B
        ...

Pay attention to the double colons after subcode. If the double colons are omitted, only the first subcode: definition will get loaded. This is where the loading order of include files matters. Unintentional subcode name collisions should be avoided.

Finally, in the main .def file, we can simply include the files:

    include: feature_A.def

    page: t
        $call @main_init
        ...

It should be mentioned that in addition to subcode::, which signifies concatenation, there are also subcode:@ and subcode:-. Both advise MyDef on what to do when subcode names collide. subcode:@ signifies lower priority. It is used to supply a default subcode definition in a library, and any colliding subcode definition will overwrite it. subcode:- is similar to subcode:: and signals that the code should be concatenated. Unlike the double colon, subcode:- always prepends the code to whichever definitions were loaded earlier. It goes without saying that the effect of these special syntaxes depends on the loading order of the MyDef parser and can be difficult to follow. In general, their use should be discouraged.
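As a reading aid, the collision rules can be modeled with a small Python registry. This is a sketch under stated assumptions, not the parser's real logic: the function define_subcode and its mode strings are hypothetical, and the handling of combinations the text does not spell out (for example, a subcode:@ default arriving after a normal definition, or subcode:: meeting a default) is a guess.

```python
def define_subcode(registry, name, lines, mode=":"):
    """Record one subcode definition.

    mode ":"   plain definition, the first one loaded wins
    mode "::"  append to earlier definitions
    mode ":-"  prepend to earlier definitions
    mode ":@"  low-priority default, replaced by any other definition
    """
    entry = registry.get(name)
    if entry is None or entry["default"]:
        if mode == ":@" and entry is not None:
            return  # assumption: a second default does not replace the first
        # First definition, or a real definition replacing a default.
        registry[name] = {"lines": list(lines), "default": mode == ":@"}
    elif mode == "::":
        entry["lines"].extend(lines)
    elif mode == ":-":
        entry["lines"][:0] = lines
    # Otherwise (":" or ":@" colliding with a normal definition): ignored.
```

Because every branch depends on what was loaded before, the result changes with the include order, which is the reason the text recommends against relying on these forms.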

3.9 subcode: _autoload

There is a special subcode called _autoload. If defined, it is called automatically before the page's main code is loaded.

The function of _autoload is similar to that of the multiply defined subcodes above, except that it is called automatically. Given its role, subcode: _autoload can be defined in multiple places, and the code will be concatenated even when the subcode: line does not use double colons.

3.10 DUMP_STUB

DUMP_STUB achieves an effect very similar to that of multiply defined subcodes. It is used like this:

    page: t
        DUMP_STUB main_init
        $call feature_A
        $call feature_B

    subcode: feature_A
        $(block:main_init)
            code initialization related to feature A
        feature A implementation

    subcode: feature_B
        $(block:main_init)
            code initialization related to feature B
        feature B implementation

DUMP_STUB is a keyword for placeholders that will be replaced in the end with a named block of code. The syntax $(block:...) places the code into that named block.

$(block:...) integrates more tightly with the other relevant code and shares the same subcode scope. Sometimes this is more readable than the alternative of having a separate subcode.

3.11 Inline macros with parameters

It is also possible to define and use inline macros with parameters. For example:

    $(set:m=$1[$2*$(N)+$3])
    $for i=0:$(N)
        $for j=0:$(N)
            $for k=0:$(N)
                $(m:C,i,j) += $(m:A,i,k) * $(m:B,k,j)

Inline macro parameters are not explicitly declared; they are simply marked by $1 through $9 and replaced at expansion time with values from a comma-separated list.

Only a single digit following $ is significant. In the following example, $11 is read as parameter $1 followed by a literal 1:

    macros:
        A: $11

    subcode: test
        This is $(A:x)
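The single-digit rule is easy to see in a regular-expression sketch. The Python function below is a hypothetical stand-in for the expansion step, not MyDef code; leaving a placeholder untouched when no value is supplied is an assumption.

```python
import re


def expand_with_params(definition, args):
    """Replace $1..$9 in a macro definition with positional values."""
    def substitute(match):
        index = int(match.group(1))
        if index <= len(args):
            return args[index - 1]
        return match.group(0)  # assumption: leave unmatched placeholders as-is

    # Exactly one digit is consumed after '$', so "$11" is "$1" then "1".
    return re.sub(r"\$([1-9])", substitute, definition)
```

With this rule, a definition of $11 called with the single value x yields x1, and a definition such as $1[$2*N+$3] called with C, i, j yields C[i*N+j].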

3.12 Preprocessing: $(if:...)

Basic switching logic is supported with the preprocessing directives if:..., elif:..., and else:

    $(if:!A)
        Code list when macro $(A) is not defined
    $(elif:A=0)
        Code list when $(A) equals "0" (as text)
    $(elif:A=othercase)
        Code list when $(A) equals "othercase"
    $(elif:A>10)
        Code list when $(A) is numerically > 10
    $(else)
        Code list for the remaining cases

The logic is similar to the switch statement in most programming languages.

We should explain how the condition is evaluated. Here we believe it is easier to point to where the logic is implemented. You should have kept a copy of the MyDef repository; if not, it is easy to clone it again from GitHub. The logic for $(if:...) conditionals is in the source file macros_compile/preproc.def; search for fncode: testcondition.

    fncode: testcondition($cond, $has_macro)
        ...
        $elif $cond=~/^\s*!(.*)/
            return !testcondition($1)
        $elif $cond=~/ or /
            my @nlist=split / or /, $cond
            $foreach $n in @nlist
                $if testcondition($n)
                    return 1;
            return 0;
        $elif $cond=~/ and /
            my @nlist=split / and /, $cond
            $foreach $n in @nlist
                $if !testcondition($n)
                    return 0
            return 1;
        $elsif $cond=~/^([01])$/
            return $1
        ...

So the $(if:!...), $(if:... or ...), and $(if:... and ...) syntaxes are supported. However, the condition is not parsed as an expression, so nested combinations of !, or, and and will probably break.

The preprocessing switch is mainly used for testing macros, so it will not work as intended if anything other than a simple macro name is given on the left side of a comparison operator (e.g. =, <, >, <=, >=). The right side of the operator is treated as a normal expression. For example, in $(if:A>$(a)), $(a) has to be written as a macro expansion, while A is assumed to be a macro name and is written without the $(...) wrapper.

As discussed, $(if:0) or $(if:1) should not work, as neither 0 nor 1 is a macro name. However, it is often useful to use preprocessing to disable or enable a block of code. Therefore, both $(if:0) and $(if:1) are specially allowed. Any other direct numbers will not work.

To be complete, here is the additional syntax supported in macros_compile/preproc.def:

    ...
    $elsif $cond=~/^hascode:\s*(\w+)/
        my $codelib = get_def_attr("codes", $1)
        $if $codelib
            return 1
    $elsif $cond=~/^(string|number|word):(.*)/
        my $test=$1
        my $t=get_def($2)
        $if $test eq "string" and $t=~/^['"]/
            return 1
        $elif $test eq "number" and $t=~/^\d+/
            return 1
        $elif $test eq "word" and $t=~/^[a-zA-Z_]\w*$/
            return 1

We can use this syntax like this:

    $(if:hascode:parse_cond)
        $call parse_cond

    $(if:string:P)
        n = int($(P));

    $(if:number:P)
        s = $(P)+'';

    $(if:word:P)
        code list

These special syntaxes are only useful in special cases. They are also listed here to show how to hack or extend MyDef if necessary.

There is also a simple regex condition, defined in fncode: testcondition . Just for convenience, here is the snippet:

    fncode: test_op($a, $test)
        $if $test=~/^:(\d+)/
            $test=$'
            $a=substr($a, 0, $1);
        $if $test=~/^\s*(!?)~(.*)/
            my ($not, $b) = ($1, $2)
            $if $b=~/(.*)\$$/
                if($a=~/$1$/){ return !$not;}
            $else
                if($a=~/^$b/){ return !$not;}
            return $not
        $elif $test=~/^\s*in\s+(.*)/
            return test_in($a, $1)
        $elif $test=~/^\s*([!=<>]+)(.*)/
            ...

For the up-to-date implementation, please refer directly to the source.

3.13 Preprocessing: $(for:...)

The $(for:...) syntax provides a quick way to multiplex code:

    $(for:a in Alice, Bob, Carol)
        Hi, $(a)

    $(set:namelist=Alice,Bob,Carol)
    $(for:a in $(namelist))
        $(if:a=Alice)
            Hi $(a), I love you!
        $(else)
            Hi $(a)!

There is a shortcut syntax for specifying a simple range-based list:

    $(for:a in 1-11)
        $print $(a)

    $(for:a in a-z)
        $print $(a)

Sometimes we need to multiplex over multiple lists:

    $(for:a, i in x,y,z and 0,1,2)
        $(a) = $(i)

There is also a syntax for using $(for:...) anonymously:

    $(for:Alice, Bob, Carol)
        Hi $1

$1 is used to refer to the implicit macro for each case.

Anonymous multiplexing over multiple lists also works:

    $(for:x,y,z and i,j,k)
        A_$1 = $2;

Up to nine lists are allowed, but you probably want to limit yourself to far fewer.
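The anonymous form over several lists behaves like a lock-step walk over the lists, with $1, $2, and so on standing for the current item of each list. The Python sketch below illustrates that reading; expand_for is a hypothetical helper, and stopping at the shortest list when the lengths differ is an assumption, since the text does not say what MyDef does in that case.

```python
import re


def expand_for(spec, body):
    """Expand one body line over ' and '-separated, comma-separated lists."""
    lists = [[item.strip() for item in part.split(",")]
             for part in spec.split(" and ")]
    if len(lists) > 9:
        raise ValueError("at most nine lists, one per $1..$9")

    def expand_row(row):
        def substitute(match):
            index = int(match.group(1))
            return row[index - 1] if index <= len(row) else match.group(0)
        return re.sub(r"\$([1-9])", substitute, body)

    # zip walks the lists in lock step (and stops at the shortest one).
    return [expand_row(row) for row in zip(*lists)]
```

For instance, expand_for("x,y,z and i,j,k", "A_$1 = $2;") gives one line per position: A_x = i; then A_y = j; then A_z = k;.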

3.14 Preprocessing: $(set:...)

We already learned that $(set:...) can be used to set local macros within the current scope. What if we want to set a macro in the parent scope? We can use $(set-1:...):

    $call set_macros
    $print A = $(A)

    subcode: set_macros
        $(set-1:A=3.14)

A direct $(set:...) will not work here, as the macro would expire upon exiting the subcode.

We also can set macros at global level using $(setmacro:...) :

    subcode: section(title)
        $(setmacro:id+=1)
        Section $(id). $(title)

    page: t
        $call section, Hello
        $call section, Bye

We haven't introduced the += operator yet. Intuitively, += increments the macro definition numerically.

+= works for $(set:...) and $(set-1:...) as well. However, it is difficult to keep track of macros consistently across scopes, so it makes the most sense to use these operators only in $(setmacro:...).

In addition to +=, there are also -=, *=, and /=, which do the corresponding arithmetic updates. They are not as useful as += though.

There is also a string concatenation operator .=, which appends the new definition to the old one.

Very often when we concatenate strings, we would like a separator between the items. For that, we have another syntax:

    subcode: add_name(name)
        $(setmacro:namelist[,]=$(name))
        Hi $(name)!

    page: t
        $call add_name, Alice
        $call add_name, Bob
        We have gathered: $(namelist).
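The update operators can be summarized with a small Python model that stores macro values as text. It is a sketch only: update_macro is a hypothetical name, only +=, .=, and the [sep]= form are modeled (the other arithmetic operators would follow the same pattern), and treating an undefined macro as 0 for += and restricting values to integers are assumptions.

```python
def update_macro(macros, name, op, value):
    """Apply '=', '+=', '.=' or '[sep]=' to a dict of text-valued macros."""
    old = macros.get(name)
    if op == "=":
        macros[name] = value
    elif op == "+=":
        # Assumption: an undefined macro counts as 0, values are integers.
        macros[name] = str(int(old or "0") + int(value))
    elif op == ".=":
        macros[name] = (old or "") + value
    elif op.startswith("[") and op.endswith("]="):
        sep = op[1:-2]  # the text between the brackets
        # The separator is only inserted between items, not before the first.
        macros[name] = value if old is None else old + sep + value
    else:
        raise ValueError("unsupported operator: " + op)
```

Applied to the add_name example, two updates with the operator [,]= and the values Alice and Bob leave namelist set to Alice,Bob.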

Because of the scoping rules, we very rarely need to unset macros. When we do, it is often only to shield certain macros from child scopes. For that, we have $(unset:...):

    $(unset:A,B,C)
    $call subcode_that_checks_macros

Note that $(unset:...) only unsets macros in the current scope, but will effectively shield macros defined in parent scopes. It only accepts a list of simple words.

$(set:...) sets one macro at a time, so it does not need any delimiters, and you can easily define macros that contain commas.

Sometimes it is desirable to set multiple macros in a single line. For that, we can use $(mset:...):

    $(mset:A=x,B=y,C=z)

With $(mset:...) you can only set macros in the current scope, and the definitions cannot contain commas.

3.15 Preprocessing: scope rules

Both $(for:...) and $(if:...) contain nested blocks. It should be noted that $(for:...) creates additional child scopes, but $(if:...) (and elif/else) does not.

The latter is useful in cases such as the following:

    subcode: greet(type, name)
        $(if:type=hot)
            $(set:greeting=It is such an honor to meet)
            $(set:ending=!!!)
        $(else)
            $(set:greeting=Hello,)
            $(set:ending=.)
        $(greeting) $(name)

3.16 Special macro syntax

MyDef implements a small set of special macro syntax for convenience:

    Type name 10 times: $(x10:NAME )
    Type name 10 times: $(x10, :NAME)
    List of digits: $(join:pat $1:, :1-3)
    Duplex list: $(join:$1_$2:, :x-z and 1-3)
    Multiplex list: $(join:$1_$2:, :x-z mul 1-3)
    $(set:A=word)
    Perl one-liners: $(eval:ucfirst("$(A)"))

MyDef is written in Perl, so it is trivial to add eval and expose the entire Perl engine. There is also a similar interface for eval subcode, which essentially allows direct perl code to manipulate text transformations. However, it is necessary to understand MyDef's internal details to work effectively with the rest of MyDef processing. We'll discuss details in later chapters.

For more up-to-date details, please refer to the source.

4 Invoking MyDef

4.1 config file

The suite of MyDef tools will always check for a config file (a text file named config ) in the same directory where the main .def file is located.

Strictly speaking, a config file is not always necessary. MyDef options can be set in three places: in side .def source, in the command line option, and in the config file. config file is a convenient mean to share default settings across entire project.

When config file is missing and if we invoke mydef_make to create Makefile , it will always create a config file, so further invocation of mydef_make will not prompt you questions again and again.

Basic config file only contains two settings. Following is the one for the MyDef repository:

output_dir: MyDef module: perl

output_dir specifies the directory where the output files should be put. If this option is missing, the default is to put output files in the same directory of the .def sources. For big projects, some separation is desirable.

module specifies the output module to use. When this option is missing, the default is to use output_general . MyDef is written in Perl, so perl it is (mean to use output_perl ).

Another often used options is include_path , which is a ' : ' separated list of directories for searching for include: files. The option if provided gets prepended to environment variable MYDEFLIB . The current folder will always get searched first.
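The resulting search order can be sketched in Python. The two helpers below are hypothetical and are not part of MyDef; they assume that MYDEFLIB may itself be a ':'-separated list and that the first directory containing the file wins.

```python
import os


def build_search_path(include_path=None, mydeflib=None):
    """Return directories in search order: current, include_path, MYDEFLIB."""
    dirs = ["."]  # the current folder is always searched first
    if include_path:
        dirs += [d for d in include_path.split(":") if d]
    if mydeflib:
        # Assumption: MYDEFLIB may also hold several ':'-separated entries.
        dirs += [d for d in mydeflib.split(":") if d]
    return dirs


def find_include(name, dirs):
    """Return the first existing path for an include file, or None."""
    for directory in dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A typical call would be build_search_path(config.get("include_path"), os.environ.get("MYDEFLIB")), where config is whatever dictionary holds the parsed config file.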

output_general module do not need many options, but many language and application specific output modules do need additional options to control behaviors. config file can arbitrary options, and it is the output module's responsibility to check these options.

Internally, MyDef does not really differentiate between macros and options. Both are simply name-value pairs. Everything specified in MyDef is internally available in a global hash reference variable, $MyDef::var.

4.2 mydef_page

mydef_page is the MyDef compiler. It reads in a .def file (and all the .def files included within), translates it, and writes the output files, one per page: directive.

$ mydef_page -m[module] -o[outputdir] src.def

-m specifies the output module to use. -o specifies the directory that output files will go into. There should not be spaces between the option and its value.

Both options can be omitted. If the module is missing, it is taken from the config file. If config does not exist or the module option is not set within, mydef_page will check for the module option inside the source .def file. If it is still not found, mydef_page will complain and exit.

However, when the module option is given via command line or config file, that module will be used. And when the module option under page: directive mismatches this module, the corresponding page will be ignored. For example:

    $ mydef_page -mgeneral t.def
    skipped 1 pages (due to module mismatch), use -m to override default module.

It is possible to have multiple pages defined in a single .def file. All the pages that match the module option will get compiled in a single invocation. If multiple pages require different output modules, multiple invocations of mydef_page with different module options are necessary to compile every page.
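To make the selection rule concrete, here is a minimal Perl sketch of the behavior described in this section. It is an illustration under stated assumptions, not the code in mydef_page: the helper name select_pages and the page hashes are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: decide which pages get compiled.
# $default is the module from -m or, failing that, from config (may be undef).
# Each page is a hash with a name and, optionally, its own module setting.
sub select_pages {
    my ($default, @pages) = @_;
    my (@compile, @skipped);
    foreach my $p (@pages) {
        my $module = $default // $p->{module};
        die "no module specified for page $p->{name}\n" unless defined $module;
        if (defined $p->{module} && $p->{module} ne $module) {
            push @skipped, $p->{name};    # module mismatch: page is ignored
        }
        else {
            push @compile, $p->{name};
        }
    }
    return (\@compile, \@skipped);
}

my @pages = (
    { name => "a", module => "perl" },
    { name => "b", module => "general" },
    { name => "c" },
);
my ($compile, $skipped) = select_pages("general", @pages);
print "compile: @$compile\n";    # compile: b c
print "skipped: @$skipped\n";    # skipped: a
```

With no default module (undef), each page falls back to its own module setting, and a page without one triggers the error.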

-o specifies the output location. If it is not specified, mydef_page will similarly look for output_dir in the config file. If that is not found either, the output location is set to the current working directory.

If the source .def file also specifies output_dir under the page: directive, it will depend on whether absolute or relative path is given.

If an absolute path is given, it will be used as the output location.

If a relative path is given instead, it will be appended to the path given on the command line or config file, or the default . (current directory).
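The two cases can be sketched with File::Spec from the Perl core. This is a rough illustration of the rule, not MyDef's implementation; the helper name resolve_output_dir is hypothetical, and the sample paths assume a Unix-like system.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: combine the base directory (from -o, config, or ".")
# with the output_dir given under a page: directive.
sub resolve_output_dir {
    my ($base, $page_dir) = @_;
    $base = "." unless defined $base;
    return $base unless defined $page_dir;
    return $page_dir if File::Spec->file_name_is_absolute($page_dir);
    return File::Spec->catdir($base, $page_dir);
}

print resolve_output_dir("out", "/var/www"), "\n";    # absolute path is used as is
print resolve_output_dir("out", "html"), "\n";        # relative path is appended to the base
```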

When everything goes well, it looks like the following:

    $ mydef_page t.def
    PAGE: t
      --> [./t.pl]

If multiple pages are being compiled, each of them will be listed.

If MyDef encounters errors, there will be error messages:

    $ mydef_page t.def
    PAGE: t
    [t.def:4] Code A not found!
    [t.def:5] Macro B not defined
      --> [./t.pl]

The output file will still be produced despite the errors. It is important to pay attention to error messages.

The file locations are usually reported along with errors. However, it should be noted that the current implementation of file location tracking is buggy and often can be off by a few lines. Nevertheless, you should be able to locate the error with the information given in the error message. This bug certainly will get fixed at some point. However, since it will not affect the actual compilation, only the error reporting, it is currently not at a high priority.

Although it is possible, it is not recommended to output scripts (such as perl/python scripts) directly into the installation destination. It is prudent to always test locally and run the extra install steps when satisfied. The exception is generating web pages. Web pages need to be viewed through a browser anyway, so it is convenient to have output_dir point directly to your web root.

4.3 mydef_make

When projects grow to contain more than one output file, a Makefile is helpful to make edit-compile-debug cycle easier. We believe in keeping things simple, and believe a simplified, hand-edited Makefile is better than using typical build tools. Build tools generate Makefile without going over the decisions. In the effort to accommodate the most complicated projects, they make the most complicated solution for even the simplest projects. That is the opposite of "keeping it simple".

Nevertheless, writing a Makefile by hand can be tedious. mydef_make can be used to generate the initial Makefile by scanning all .def files within the directory as well as all sub-directories. It will generate all the rules from def files to output files using mydef_page commands. For simple projects, the Makefile from mydef_make will be good enough. Both the MyDef project and its output module projects are examples where mydef_make is good enough.

When it is not good enough, you are encouraged to hand edit your Makefile. If you insist on using build tools, you can always opt to keep the def source files outside of the build tree and have mydef_page output into the build tree. That way, your build tool can work as if you were directly editing the code rather than generating it from def files. There may even appear to be no difference to your colleagues. One could edit the Makefile and have make automatically invoke the build tool upon updates.

If you run mydef_make the first time (when config file does not exist):

    $ mydef_make
    Please enter the path to compile into [out]:
    Please enter module type [perl]:
    output_dir: out
    Create output folder out ...

mydef_make prompts you for output_dir and module type. It creates config file and a Makefile . If the output_dir does not exist, it will create it for you.

If you run it again, it will only update Makefile :

    $ mydef_make
    output_dir: out

4.4 mydef_run

At the opposite end from building big projects, you may be coding a single program that consists of a single file (not necessarily a single def file). In this case, invoking build tools or even generating a Makefile appears to be overkill. All we would like is a single keypress from source to result.

mydef_run is specifically for this purpose. It can be invoked like this:

    $ mydef_run t.def
    PAGE: t
      --> [out/t.pl]
    perl out/t.pl
    Hello World!

Basically, it compiles from def to output, then applies a set of heuristics based on the output file type and invokes the compiler or interpreter to run the program.

Here is an example with C:

    $ mydef_run t.def
    PAGE: t
      --> [out/t.c]
    gcc -std=c99 -O2 -oout/t out/t.c && out/t
    Hello World!

Here is an example with Java:

    $ mydef_run t.def
    PAGE: t
      --> [out/t.java]
    javac out/t.java && cd out && java t
    Hello World!

Of course, not every type of file has a way to be run:

    $ mydef_run t.def
    PAGE: t
      --> [./t.txt]
    do not know how to run it

Well, at least it is honest and it compiles for you.
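The idea behind the heuristics can be pictured as a lookup from file extension to command. The sketch below only reproduces the three transcripts above; the helper name run_command is hypothetical, and the real rules (which live in mydef_run.def) are surely richer.

```perl
use strict;
use warnings;

# Hypothetical lookup: map an output file to the command that runs it.
# The command strings mirror the transcripts shown above.
sub run_command {
    my ($file) = @_;
    my ($dir, $name, $ext) = $file =~ m{^(.*)/([^/]+)\.(\w+)$}
        or return;
    return "perl $file" if $ext eq "pl";
    return "gcc -std=c99 -O2 -o$dir/$name $file && $dir/$name" if $ext eq "c";
    return "javac $file && cd $dir && java $name" if $ext eq "java";
    return;    # unknown type
}

foreach my $f ("out/t.pl", "out/t.c", "out/t.java", "./t.txt") {
    my $cmd = run_command($f);
    print defined $cmd ? "$cmd\n" : "do not know how to run it\n";
}
```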

If you configure the command to a hotkey in your editor, then the edit-debug cycle is shortened into a single keypress. If you use vim , put this in .vimrc :

:nmap <F5> :!mydef_run %<CR>

For details of the heuristics or customization, refer to mydef_run.def in the source repository.

It is possible to set the command line that mydef_run runs directly in the source. For example:

    page: t
        run: cat t.txt

        $print Hello World!

    $ mydef_run t.def
    PAGE: t
      --> [./t.txt]
    cat t.txt
    $print Hello World!

Or you may simply specify an argument:

    page: t
        module: perl
        arg: P1 P2 P3

        $foreach $a in @ARGV
            $print argument $a

    $ mydef_run t.def
    PAGE: t
      --> [./t.pl]
    perl ./t.pl P1 P2 P3
    argument P1
    argument P2
    argument P3

As a bonus, do you know that you can directly use mydef_run on .pl , .c , or any source code with a known extension? Try it.

5 output_general

In this chapter, we will explain the internals of MyDef .

output_general is the default output module. It essentially only uses the syntax we have covered so far; nothing more.

However, much of the power of MyDef eventually comes from language- or application-specific output modules. So at some point, you will need to understand some of MyDef's inner workings. The output_general module is the starting point for every output module.

5.1 output_general.def

    include: output.def

    page: output_general, output_main
        type: pm
        output_dir: lib/MyDef
        ext: txt
        package: MyDef::output_general

The actual code is in a subcode output_main defined in output.def .

5.2 output.def

File output.def provides boiler-plate code that most output modules will share:

    subcode: output_main
        $global $debug=0
        $global $out
        $global $mode
        $global $page
        $call @package_globals

        $sub get_interface
            return (\&init_page, \&parsecode, \&set_output, \&modeswitch, \&dumpout);

        $sub init_page($t_page)
            $page=$t_page

        $sub set_output($newout)
            $out = $newout

        $sub modeswitch($mode, $in)
            $call @modeswitch

        $sub parsecode($l)
            $if $l=~/^\$warn (.*)/
                $call warn, $1
                return
            $elif $l=~/^\$template\s+(.*)/
                return
            $call parsecode_debug
            $call parsecode_eval
            $call parsecode

        $sub dumpout($f, $out)
            my $dump={out=>$out,f=>$f}
            $call @dumpout
            MyDef::dumpout::dumpout($dump);

        $call single_blocks
        $call @support_subs

        1;

subcode: parsecode is where we apply preprocessing logic to each line. output.def provides a default stub which simply pushes the line straight to @$out. It is supposed to be overridden in the actual output module. For output_general, it is good enough:

    subcode:@ parsecode
        push @$out, $l

6 output_perl

output_perl source is kept within the base MyDef repository because, well, MyDef is coded in perl and would require output_perl to compile.

output_perl is also one of the most mature modules, due to the mandatory round-trip testing. In addition, because Perl is a very high-level language that embraces practical expressiveness, output_perl is relatively simple.

Perl is a general purpose programming language and therefore provides the usual variables, functions, scopes, if-else branches, loops, etc. The custom syntax created with output_perl can be a good example for output modules for other languages.

6.1 It is OK to write vanilla Perl

    page: t
        my $a = 0.1;
        my $b = 0.2;
        if ($a+$b == 0.3){
            print "$a + $b = 0.3!\n";
        }
        else{
            print "Total failure!\n";
        }

    $ mydef_run -mperl t.def
    PAGE: t
      --> [./t.pl]
    perl ./t.pl
    Total failure!

And you can check the output t.pl is (almost) exactly as you wrote:

    #!/usr/bin/perl
    use strict;
    my $a = 0.1;
    my $b = 0.2;
    if ($a+$b == 0.3){
        print "$a + $b = 0.3!\n";
    }
    else{
        print "Total failure!\n";
    }

It only adds the #! line and use strict; . It is recommended that strict should always be used. However, if you disagree, you could set the relax option:

    page: t
        module: perl
        relax: 1
        ...

6.2 $print

print is the most fundamental and useful statement in a language, not because it is essential in the final program, but because it is essential for providing feedback during code development. Programming is a process.

Perl is one of the most straightforward languages. The "Hello world" program listed in the first chapter of "Programming Perl" is:

    print "Howdy, world!\n";

To most programmers, this is as basic as it can be. Nevertheless, over the coming years of experience, a few wishes do pop up: is it necessary to type those quotation marks, the '\n', and the semicolon? The annoyance arises because we do forget or mistype them from time to time and then have the compiler bug us with errors.

Turns out, in most cases, no, it isn't necessary. So in output_perl , we introduced a hack:

    $print Howdy, world!

In modern versions of Perl, there is a new function say , which essentially is print_ln (print with newline). $print achieves the same without it.

There is more to it. What is the output of this:

    $print Howdy, world!\n

You can quickly check it yourself, but the answer is, no, it won't print two newlines. There is a little intelligence in output_perl that only adds a newline when the newline is missing. By the way, both of the following work as well:

    $print "Howdy, world!"
    $print "Howdy, world!\n"

A little flexibility goes a long way.

Now what if we want to print without newline?

    $print Howdy, -
    $print world

A trailing '-' signals that we do not want a newline. We usually want to print with a newline, but occasionally we don't. So the philosophy here is to make the default fit the common scenario, and go the extra step for special cases.
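To give a feel for how little logic this takes, here is a simplified Perl sketch of the rules covered so far: optional quotes, a newline added only when missing, and the trailing '-'. It is not the code in parsecode_print; the function name translate_print is hypothetical, and how whitespace before the '-' is treated is an assumption.

```perl
use strict;
use warnings;

# Hypothetical sketch: turn the text after "$print " into a Perl print statement.
sub translate_print {
    my ($arg) = @_;
    $arg =~ s/^"(.*)"$/$1/;    # surrounding quotes are optional
    if ($arg =~ s/-$//) {
        # trailing '-': no newline (assumed to drop only the dash itself)
    }
    elsif ($arg !~ /\\n$/) {
        $arg .= '\n';          # add \n only when it is missing
    }
    return qq{print "$arg";};
}

print translate_print($_), "\n"
    for 'Howdy, world!', 'Howdy, world!\n', '"Howdy, world!"', 'Howdy, -';
```

The first three inputs all yield the same statement, print "Howdy, world!\n"; and the last one yields print "Howdy, "; with no newline.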

Mixing variables into the string still works. In addition, there is a little hack to make print in color easier:

    my $name = "world";
    $print Howdy, $green{$name}!

Supported color names include: red, green, yellow, blue, magenta, cyan. You will have trouble if you have variables named after these colors and want to print them. Hopefully that is rare, but if you do, remember that vanilla Perl's print is still there.

Perl's variable interpolation in strings is great. But sometimes we still want to print with a format. For that, we need to remember another function: printf. $print supports that as well:

    $print "Pi = %.2f", 3.1415926

It is equivalent to:

    printf "Pi = %.2f\n", 3.1415926;

One more thing: sometimes we want to print to a filehandle, such as STDERR. We can still use $print for that. All we need is to set a special macro:

    $(set:print_to=STDERR)
    $print Error: did you just want an error?

That's it for $print! If you want further detail, consult the source in output_perl.def directly and search for subcode: parsecode_print. The implementation of all this flexibility is not complicated, thanks to Perl's built-in convenience.

6.3 Optional semicolons

In the previous section, we saw that $print not only lets you omit the quotation marks and '\n', but also the semicolon. In fact, with output_perl, you can optionally omit semicolons for almost every normal perl statement. Try this:

    my $name = "Alice"
    if ($name eq "Alice"){
        $print Howdy, $name
    }
    else{
        $print Nice to meet you, $name!
    }

Surely, MyDef did not add ' ; ' to every line. It employs some simple heuristics. In fact, you can just check what it does at the source:

    subcode: check_termination
        $if $l=~/^\s*$/
        $elif $l=~/^\s*(for|while|if|else if)\s*\(.*\)\s*$/
        $elif $l=~/^\s*}/
        $elif $l!~/[,:\(\[\{;]\s*$/
            $l.=";";
        $else
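For readers who want to experiment with the heuristics outside of MyDef, the same four tests can be written as a plain Perl function. The regular expressions are copied from the subcode above; wrapping them in a standalone function that returns the line is our own framing.

```perl
use strict;
use warnings;

# Plain-Perl rendering of the heuristics: return the line, with ";"
# appended only when it looks like an unterminated statement.
sub check_termination {
    my ($l) = @_;
    if ($l =~ /^\s*$/) {
        # blank line: leave alone
    }
    elsif ($l =~ /^\s*(for|while|if|else if)\s*\(.*\)\s*$/) {
        # control statement header without an opening brace: leave alone
    }
    elsif ($l =~ /^\s*}/) {
        # line starting with a closing brace: leave alone
    }
    elsif ($l !~ /[,:\(\[\{;]\s*$/) {
        $l .= ";";
    }
    return $l;
}

print check_termination($_), "\n"
    for 'my $name = "Alice"', 'if ($name eq "Alice"){', '}', 'else{', 'my @a = (1,';
```

Only the first sample line gains a semicolon; the others end in a brace or a comma, or start with a closing brace, and pass through unchanged.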

6.4 $if , $elif , and $else

These are syntactic sugar for if-elsif-else statements. They allow us to write these control statements in the style of Python rather than C's curly brace style. Here is an example:

    for(my $i=1; $i<100; $i++){
        $if $i % 15 == 0
            $print fizbuzz
        $elif $i % 3 == 0
            $print fiz
        $elif $i % 5 == 0
            $print buzz
        $else
            $print $i
    }

Obviously there is a similar indentation-based syntax for the for-loop, but since that is in the next section, we show the example as above to demonstrate that there is nothing wrong with mixing vanilla perl code with the special syntax introduced by output_perl.

One of the top bugs is mistyping == as = in if conditions. So output_perl checks for that when given the chance and gives you a warning:

    my $i=1
    $if $i = 15
        $print fizbuzz

    $ mydef_run t.def
    PAGE: t
    [t.def:6] assignment in condition [$i = 15]?
      --> [./t.pl]
    perl ./t.pl
    fizbuzz

It is only a warning, but if you paid attention, you caught the bug!
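The manual does not show how the check is done, so the following is purely a guess at what such a text-level test could look like: flag a lone = that is not part of ==, !=, <=, >=, =~ or =>. The function name looks_like_assignment is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical check (not taken from output_perl): does the condition
# contain a single "=" that is not part of a comparison or binding operator?
sub looks_like_assignment {
    my ($cond) = @_;
    return $cond =~ /(?<![=!<>])=(?![=~>])/ ? 1 : 0;
}

foreach my $cond ('$i = 15', '$i == 15', '$i <= 15', '$l =~ /^x/') {
    printf "%-12s %s\n", $cond, looks_like_assignment($cond) ? "warn" : "ok";
}
```

Being a pure text test, it would also fire on an = inside a string literal, which is one reason a check like this can only ever be a warning.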

6.5 $for

First, let's review the vanilla Perl syntax:

    for(my $i=0; $i<10; $i++){
        $print $i
    }

MyDef works purely at the text level; it does not truly understand the language semantics as a compiler does. It only understands the text via heuristics -- which is more similar to how the human mind works. As such, MyDef's special syntax does not always work. We start with the vanilla Perl syntax example to emphasize that you can always fall back on it if the special syntax becomes more trouble than help.

That said, let's see what output_perl offers:

    $for my $i=0; $i<10; $i++
        $print $i

Just like the $if syntax, we now can write it in Python style and save the braces. Further:

    $for my $i=0:10:1
        $print $i

That works. In fact, we can drop the keyword my and the increment by 1 is actually the default:

    $for $i=0:10
        $print $i

In fact, the loop variable $i is also the default, so we can drop that too:

    $for 0:10
        $print $i

Or:

    $for 10
        $print $i

The output is exactly the same for all above examples. We did not change Perl, only the syntax.

Just to make sure, the step part works:

    $for $i=0:10:2
        $print $i

The decreasing for-loop works too, but it is a bit tricky. It is tricky because there does not appear to be an established convention. For example, what do we think the following does?

    $for $i=10:0:-1
        $print $i

The current implementation in output_perl prints 10 , 9 , ..., 0 . Yep, looped 11 times instead of 10 times. We find it easier to reason by taking the literal meaning of from 10 down to 0 . Many may disagree. As said, tricky. But remember, to be clear, you always can and maybe should use the more verbose version:

    $for $i=10; $i>0; $i--
        $print $i
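As a way of summarizing the range forms, here is a Perl sketch that maps each of them to a C-style loop header. It follows the behavior described in this section (increasing ranges stop before the end value, decreasing ranges include it), but the function name for_header and the exact header text it produces are assumptions; the real translation in output_perl may differ.

```perl
use strict;
use warnings;

# Hypothetical sketch: map a $for range spec to a C-style loop header.
# Handles only the range forms, not the "init; cond; step" form.
sub for_header {
    my ($spec) = @_;
    # Optional "my", optional loop variable; $i is the default variable.
    my ($var, $range) = $spec =~ /^(?:my\s+)?(\$\w+)\s*=\s*(.*)$/
        ? ($1, $2) : ('$i', $spec);
    my @p = split /:/, $range;
    # A single number N means 0:N; the step defaults to 1.
    my ($start, $end, $step) = @p == 1 ? (0, $p[0], 1) : ($p[0], $p[1], $p[2] // 1);
    return $step < 0
        ? sprintf('for(my %s=%s; %s>=%s; %s-=%d)', $var, $start, $var, $end, $var, abs($step))
        : sprintf('for(my %s=%s; %s<%s; %s+=%d)', $var, $start, $var, $end, $var, $step);
}

print for_header($_), "\n"
    for 'my $i=0:10:1', '$i=0:10', '0:10', '10', '$i=0:10:2', '$i=10:0:-1';
```

The first four specs produce the same header, for(my $i=0; $i<10; $i+=1), which is the point made above: only the syntax changes. The last one produces for(my $i=10; $i>=0; $i-=1), the inclusive countdown that loops 11 times.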

6.6 $while

$while exists so we can write while-loops in Python style too.

    $while 1
        $print MyDef FOREVER!
        last

6.7 $foreach

foreach reads better with a keyword in :

    $foreach $c in split //, "Hello, world!"
        $print $c

It supplies the my keyword and drops parentheses and braces.

6.8 fncode and function arrangement

Curly braces are also used in defining Perl functions -- sub :

    $sub F($x)
        return $x * $x

    $print "3: %d", F(3)

Let's check its Perl equivalent:

    sub F {
        my ($x)=@_;
        return $x * $x;
    }

    printf "3: %d\n", F(3);

You may recall the criticism that Perl does not have function signatures. As far as syntax goes, this lets us define functions with signatures.

The $sub syntax works, but behaves as a normal statement -- just like if or while. Semantically it is a bit unsatisfactory, as $sub merely defines the function rather than having a run-time action as other control statements do. Ideally, we would like to have a declarative syntax for functions and move them outside of normal code blocks. In MyDef, we can use fncode:

    page: t
        $print "3: %d", F(3)

    fncode: F($x)
        return $x * $x

If you are writing a .pm module, all the fncode: functions are included in the output. However, if you are writing a .pl script, which is the default, only those functions that are used will be included. output_perl detects which functions are being used with a very simple heuristic:

    fncode: check_fcall($l)
        $while $l=~/\b(\w+)\(/g
            $call add_function, $1

In case the heuristic doesn't work well and misses a function that you do use, you can always add the function explicitly with $list:

    page: t
        $list F
        $print "3: %d", F(3)

    fncode: F($x)
        return $x * $x
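To see both what the check_fcall heuristic catches and what it misses, the same regular expression can be tried in plain Perl. The %used hash here is just a stand-in for whatever add_function records internally, which the manual does not show.

```perl
use strict;
use warnings;

my %used;    # stand-in (assumed) for the bookkeeping done by add_function

# Plain-Perl rendering of check_fcall: every word immediately followed
# by "(" is treated as a function call.
sub check_fcall {
    my ($l) = @_;
    while ($l =~ /\b(\w+)\(/g) {
        $used{$1}++;
    }
}

check_fcall('printf "3: %d\n", F(3);');
check_fcall('my $y = G ($x);');      # space before "(": missed by the heuristic
check_fcall('if(defined($y)){');     # keywords and built-ins are picked up too
print join(", ", sort keys %used), "\n";    # F, defined, if
```

The call to G is missed because of the space before the parenthesis, which is exactly the kind of case where $list helps. The spurious hits on if and defined should be harmless as long as no fncode: has one of those names.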

6.9 $global and $use

There is this religion that we should avoid global variables. The practitioners of MyDef do not believe in extreme stances. Globals are certainly convenient. If we do use globals, we would definitely like to define them close to where they are relevant rather than pile all of them at the top. The language semantics, on the other hand, make it sensible to have all globals piled at the top. To help, output_perl provides $global to make the declaration (and initialization) of global variables declarative.

    page: t
        $print "3: %d", F(3)

    fncode: F($x)
        $global $Offset = 10
        return $x * $x + $Offset

We certainly do not want to re-declare and reset $Offset every time the function is called. The equivalent Perl output is:

    use strict;
    our $Offset = 10;

    sub F {
        my ($x)=@_;
        return $x * $x + $Offset;
    }

    printf "3: %d\n", F(3);

Similarly, $use makes importing packages declarative, so you can embed a package dependency into the relevant code and be confident that they will all be gathered in their rightful place.

6.10 std_perl.def

Finally, not all customization needs to happen at the internal level. We can have customization coded in MyDef macro syntax and included in the def file. For each output module, there is one standard library that is always included automatically. For perl, that is std_perl.def. It is located at deflib/ in the repository and at $MYDEFLIB in your installed destination.

It is easy enough for you to open that file and see for yourself what is in it. It is a collection of convenience macros, some frequently used and some rarely used. The rarely used ones will probably get pruned away at some point.

The two most used subcodes are subcode: open_r and subcode: open_w. subcode: open_r allows short code for opening a file and reading it line by line:

    &call open_r, t.txt
        $if /(\w+), (\w+)/
            $print Hello $2 $1!

subcode: open_w is similar but obviously it can't write line by line for you:

    &call open_w, t.txt
        $(set:print_to=Out)
        $print Hello, world!
        $print Hello, world, again.

In is the default input file handle, and Out is the default output file handle.
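For orientation, here is roughly what the two subcodes might expand to in vanilla Perl. This is a guess based on the handle names In and Out and on the description above, not a copy of std_perl.def; the sample file content and the @greetings array are only there to make the sketch self-contained.

```perl
use strict;
use warnings;

# Roughly what "&call open_w, t.txt" with print_to=Out might expand to (assumed).
open Out, ">", "t.txt" or die "Can't write t.txt: $!\n";
print Out "world, Howdy\n";
close Out;

# Roughly what "&call open_r, t.txt" might expand to (assumed):
# the indented block runs once per line, with the line in $_.
my @greetings;
open In, "<", "t.txt" or die "Can't open t.txt: $!\n";
while (<In>) {
    if (/(\w+), (\w+)/) {
        push @greetings, "Hello $2 $1!";
    }
}
close In;

print "$_\n" for @greetings;    # Hello Howdy world!
```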

If you just want to read the whole file into a scalar, you can use subcode: get_file_in_t:

    $call get_file_in_t, t.txt
    $print [$t]

There is also get_file_lines, which slurps the entire text file into the @lines array.

    $call get_file_lines, t.txt
    $print "Got %d lines", $#lines+1

7 output_c

[To be continued.]

8 output_www

[To be continued.]

9 output_python

[To be continued.]

10 output_java

[To be continued.]