Obfuscated Tiny C Compiler

What is it?

My goal was to write the smallest C compiler that is able to compile itself. I chose a subset of C that was general enough to write a small C compiler in. Then I extended the subset until I reached the maximum size allowed by the contest: 2048 bytes of C source, excluding the ';', '{', '}' and space characters.
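The size rule can be sketched as a small counting function. This is only an illustration of the rule as stated here, not the contest's official tool: the name contest_size is hypothetical, and treating every whitespace character (not just ' ') as a "space character" is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the bytes of `src` that would be charged against the limit:
 * everything except ';', '{', '}' and whitespace.
 * Assumption: "space characters" means anything isspace() accepts. */
size_t contest_size(const char *src)
{
    size_t n = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (c == ';' || c == '{' || c == '}' || isspace(c))
            continue;           /* free characters, not counted */
        n++;
    }
    return n;
}
```

Under these assumptions, the source text "main() { return 0; }" is charged for 13 bytes: "main()", "return" and "0".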

I chose to generate i386 code. The original OTCC code could only run on i386 Linux because it relied on endianness and unaligned access. It generated the program in memory and launched it directly. External symbols were resolved with dlsym().
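To illustrate what resolving an external symbol with dlsym() looks like, here is a minimal sketch in ordinary C. It is not taken from the OTCC source: lookup_symbol and call_atoi_dynamically are hypothetical names, and looking symbols up through dlopen(NULL, ...) is just one possible choice.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef int (*str_to_int_fn)(const char *);

/* Look up `name` among the symbols visible to the running program
 * (the program itself plus the shared libraries it loaded, e.g. libc).
 * Returns NULL when the symbol is unknown. */
void *lookup_symbol(const char *name)
{
    void *self = dlopen(NULL, RTLD_LAZY);   /* handle for the main program */

    if (self == NULL)
        return NULL;
    return dlsym(self, name);
}

/* Resolve libc's atoi by name at run time and call it. */
int call_atoi_dynamically(const char *digits)
{
    str_to_int_fn fn;
    void *sym = lookup_symbol("atoi");

    if (sym == NULL)
        return -1;
    /* POSIX-style conversion from an object pointer to a function pointer. */
    *(void **)&fn = sym;
    return fn(digits);
}
```

On older glibc versions this has to be linked with -ldl, as otcc.c is.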

In order to have a portable version of OTCC, I made a variant called OTCCELF. It is only a little larger than OTCC, but it directly generates a dynamically linked i386 ELF executable from a C source without relying on any binutils tools! OTCCELF was tested successfully on i386 Linux and on Sparc Solaris.
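One way to see why byte order matters for portability: a tool that runs on a big-endian host such as Sparc cannot read or write the little-endian fields of an i386 ELF file as native integers. The sketch below is a hypothetical helper (is_i386_elf is not part of OTCCELF) that checks a file header one byte at a time, so it behaves the same on any host. The offsets and constants come from the ELF format: the magic bytes, ELFCLASS32 = 1, ELFDATA2LSB = 1, and e_machine at offset 18 with EM_386 = 3.

```c
#include <stddef.h>

/* Return 1 if `buf` starts like a 32-bit little-endian ELF file for i386.
 * Fields are read byte by byte, so the host's endianness and alignment
 * rules do not matter. */
int is_i386_elf(const unsigned char *buf, size_t len)
{
    unsigned machine;

    if (len < 20)
        return 0;                           /* too short to hold e_machine */
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                           /* bad magic number */
    if (buf[4] != 1)                        /* EI_CLASS: ELFCLASS32 */
        return 0;
    if (buf[5] != 1)                        /* EI_DATA: ELFDATA2LSB */
        return 0;
    machine = buf[18] | ((unsigned)buf[19] << 8);   /* e_machine, little-endian */
    return machine == 3;                    /* EM_386 */
}
```

Feeding it the first 20 bytes of an executable produced by OTCCELF would be a quick sanity check of the output.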

NOTE: My other project, TinyCC, which is a fully featured ISO C99 compiler, was written by starting from the source code of OTCC!

Download

Original OTCC version (runs only on i386 Linux): otcc.c (link it with -ldl).

OTCC with i386 ELF output (should be portable): otccelf.c.

Example of C program that can be compiled: otccex.c.

[New] The non-obfuscated versions are finally available: otccn.c and otccelfn.c. These non-obfuscated versions do not self-compile. They are provided for documentation purposes.

gcc -O2 otcc.c -o otcc -ldl
gcc -O2 otccelf.c -o otccelf

./otccelf otccelf.c otccelf1

C Subset Definition

Expressions: binary operators, by decreasing priority order: '*' '/' '%', '+' '-', '>>' '<<', '<' '<=' '>' '>=', '==' '!=', '&', '^', '|', '=', '&&', '||'. '&&' and '||' have the same semantics as in C: left-to-right evaluation and early exit. Parentheses are supported.

Unary operators: '&', '*' (pointer indirection), '-' (negation), '+', '!', '~', postfix '++' and '--'. Pointer indirection ('*') only works with an explicit cast to 'char *', 'int *' or 'int (*)()' (function pointer).

'++', '--', and unary '&' can only be used with a variable lvalue (left value). '=' can only be used with a variable or '*' (pointer indirection) lvalue.

Function calls are supported with the standard i386 calling convention. Function pointers are supported with an explicit cast. Functions can be used before being declared.
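As an illustration of how a priority list like this one can drive a parser, here is a small recursive evaluator in ordinary C. It is a sketch under stated assumptions, not OTCC's parser: it handles only decimal constants, parentheses, unary '-', and the one-character binary operators '*' '/' '%', '+' '-', '&', '^', '|', in the same relative order as above. The names eval, parse_expr, parse_unary and level_of are hypothetical.

```c
#include <ctype.h>

static const char *cur;             /* cursor into the expression text */

static void skip_spaces(void)
{
    while (*cur == ' ')
        cur++;
}

/* Priority level of a binary operator: 1 binds tightest, 0 = not an operator. */
static int level_of(int c)
{
    switch (c) {
    case '*': case '/': case '%': return 1;
    case '+': case '-':           return 2;
    case '&':                     return 3;
    case '^':                     return 4;
    case '|':                     return 5;
    default:                      return 0;
    }
}

static int parse_expr(int level);

/* Constant, parenthesized expression, or unary minus. */
static int parse_unary(void)
{
    int v = 0;

    skip_spaces();
    if (*cur == '(') {
        cur++;
        v = parse_expr(5);
        skip_spaces();
        if (*cur == ')')
            cur++;
        return v;
    }
    if (*cur == '-') {
        cur++;
        return -parse_unary();
    }
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* Parse a left-associative chain of operators of priority `level`;
 * operands are parsed at the next tighter level. */
static int parse_expr(int level)
{
    int lhs, rhs, op;

    if (level == 0)
        return parse_unary();
    lhs = parse_expr(level - 1);
    for (;;) {
        skip_spaces();
        op = *cur;
        if (level_of(op) != level)
            return lhs;
        cur++;
        rhs = parse_expr(level - 1);
        switch (op) {
        case '*': lhs *= rhs; break;
        case '/': lhs = rhs ? lhs / rhs : 0; break;  /* sketch: x/0 yields 0 */
        case '%': lhs = rhs ? lhs % rhs : 0; break;
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '&': lhs &= rhs; break;
        case '^': lhs ^= rhs; break;
        default:  lhs |= rhs; break;                 /* '|' */
        }
    }
}

/* Evaluate a constant expression such as "1 + 2 * 3". */
int eval(const char *text)
{
    cur = text;
    return parse_expr(5);
}
```

Each priority level is one recursion depth, so adding an operator only means adding a case to level_of and to the switch.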

Types: only signed integer ('int') variables and functions can be declared. Variables cannot be initialized in declarations. Only old K&R function declarations are parsed (implicit integer return value and no types on arguments).

Any function or variable from the libc can be used because OTCC uses the libc dynamic linker to resolve undefined symbols.

Instructions: blocks ('{' '}') are supported as in C. 'if' and 'else' can be used for tests. The 'while' and 'for' C constructs are supported for loops. 'break' can be used to exit loops. 'return' is used for the return value of a function.

Identifiers are parsed the same way as C. Local variables are handled, but there is no local name space (not a problem if different names are used for local and global variables).

Numbers can be entered in decimal, hexadecimal ('0x' or '0X' prefix), or octal ('0' prefix).
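The same three notations are what the standard strtol() function accepts when it is called with base 0, so the rule can be tried out with a one-line wrapper. This is only an illustration: parse_number is a hypothetical name and nothing here is taken from the OTCC source.

```c
#include <stdlib.h>

/* Convert a C-style integer constant to its value.
 * With base 0, strtol() picks the base from the prefix:
 * "0x"/"0X" = hexadecimal, leading "0" = octal, otherwise decimal. */
long parse_number(const char *text)
{
    return strtol(text, NULL, 0);
}
```

For example, parse_number("0x1F") is 31, parse_number("017") is 15 and parse_number("42") is 42.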

'#define' is supported without function-like arguments. No macro recursion is tolerated. Other preprocessor directives are ignored.

C strings and C character constants are supported. Only the '\n', '\"', '\'' and '\\' escapes are recognized.
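A decoder for such a small set of escapes fits in a few lines. The sketch below is a hypothetical helper (unescape is not a function from OTCC): a backslash followed by 'n' becomes a newline, and a backslash followed by any other character yields that character unchanged. The second rule covers the quote and backslash escapes; applying it to every other character as well is an assumption of this sketch.

```c
#include <stddef.h>

/* Copy `src` to `dst`, decoding backslash escapes.
 * `dst` must have room for strlen(src) + 1 bytes.
 * Returns the length of the decoded string. */
size_t unescape(const char *src, char *dst)
{
    size_t n = 0;

    while (*src != '\0') {
        char c = *src++;

        if (c == '\\' && *src != '\0') {
            c = *src++;             /* character after the backslash */
            if (c == 'n')
                c = '\n';           /* the only escape that changes value */
        }
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```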

C Comments can be used (but no C++ comments).

No error is displayed if an incorrect program is given.

Memory: the code, data, and symbol sizes are limited to 100KB (it can be changed in the source code).

OTCC Invocation

otcc prog.c [args]...

Examples:

Sample compilation and execution: otcc otccex.c 10

Self compilation: otcc otcc.c otccex.c 10

Self compilation iterated... otcc otcc.c otcc.c otccex.c 10

OTCC can also be used as a script interpreter by putting this line first in the C source file:

#!/usr/local/bin/otcc

OTCCELF Invocation

otccelf prog.c prog
chmod 755 prog

Note that even if the generated i386 code is not as good as GCC's, the resulting ELF executables are much smaller for small sources. Try this program:

#include <stdio.h>

main()
{
    printf("Hello World\n");
    return 0;
}

Compiler          Executable size (in bytes)
OTCCELF           424
GCC (stripped)    2448

Links

License