As programmers, we are expected at work or at school to follow best practices and to comment our code wisely, so that whoever needs to re-read it later can do so. To take a break from all those constraints, we can head to the IOCCC, the International Obfuscated C Code Contest.

In this post, we are going to focus on the IOCCC 1986 winner in the Worst abuse of the C preprocessor category. The code was written by James Hague.

Starting from the given source and observing its output, we will explain how it works.

The Code

Here it is in all its obfuscated glory:

#define DIT (
#define DAH )
#define __DAH ++
#define DITDAH *
#define DAHDIT for
#define DIT_DAH malloc
#define DAH_DIT gets
#define _DAHDIT char
_DAHDIT _DAH_[]="ETIANMSURWDKGOHVFaLaPJBXCYZQb54a3d2f16g7c8a90l?e'b.s;i,d:"
;main DIT DAH{_DAHDIT
DITDAH _DIT,DITDAH DAH_,DITDAH DIT_,
DITDAH _DIT_,DITDAH DIT_DAH DIT
DAH,DITDAH DAH_DIT DIT DAH;DAHDIT
DIT _DIT=DIT_DAH DIT 81 DAH,DIT_=_DIT
__DAH;_DIT==DAH_DIT DIT _DIT DAH;__DIT
DIT'\n'DAH DAH DAHDIT DIT DAH_=_DIT;DITDAH
DAH_;__DIT DIT DITDAH
_DIT_?_DAH DIT DITDAH DIT_ DAH:'?'DAH,__DIT
DIT' 'DAH,DAH_ __DAH DAH DAHDIT DIT
DITDAH DIT_=2,_DIT_=_DAH_; DITDAH _DIT_&&DIT
DITDAH _DIT_!=DIT DITDAH DAH_>='a'? DITDAH
DAH_&223:DITDAH DAH_ DAH DAH; DIT
DITDAH DIT_ DAH __DAH,_DIT_ __DAH DAH
DITDAH DIT_+= DIT DITDAH _DIT_>='a'? DITDAH _DIT_-'a':0
DAH;}_DAH DIT DIT_ DAH{ __DIT DIT
DIT_>3?_DAH DIT DIT_>>1 DAH:' 'DAH;return
DIT_&1?'-':'.';}__DIT DIT DIT_ DAH _DAHDIT
DIT_;{DIT void DAH write DIT 1,&DIT_,1 DAH;}

Apart from the peculiar formatting, what stands out is the number of “unnecessary” macros and the repeated use of variations on DIT and DAH.

The output

If we compile the code at this point, we see many warnings, among them two for the implicit declarations of __DIT and _DAH. Once it is compiled, we can run the program: as we feed it lines of ASCII characters, it prints sequences of . and -.

$ ./a.out
hello, world

.... . .-.. .-.. --- --..-- .-- --- .-. .-.. -..

It looks like Morse code, and an online Morse decoder confirms it: the output decodes back to HELLO, WORLD.

De-Obfuscating

Let’s first do the preprocessor’s job and replace the macros with their values. After a bit of reformatting, this is what we have:

char _DAH_[] = "ETIANMSURWDKGOHVFaLaPJBXCYZQb54a3d2f16g7c8a90l?e'b.s;i,d:";

main()
{
    char *_DIT, *DAH_, *DIT_, *_DIT_, *malloc(), *gets();

    for (_DIT = malloc(81), DIT_ = _DIT++; _DIT == gets(_DIT); __DIT('\n'))
        for (DAH_ = _DIT; *DAH_; __DIT(*_DIT_ ? _DAH(*DIT_) : '?'), __DIT(' '), DAH_++)
            for (*DIT_ = 2, _DIT_ = _DAH_;
                 *_DIT_ && (*_DIT_ != (*DAH_ >= 'a' ? *DAH_ & 223 : *DAH_));
                 (*DIT_)++, _DIT_++)
                *DIT_ += (*_DIT_ >= 'a' ? *_DIT_ - 'a' : 0);
}

_DAH(DIT_)
{
    __DIT(DIT_ > 3 ? _DAH(DIT_ >> 1) : ' ');
    return DIT_ & 1 ? '-' : '.';
}

__DIT(DIT_)
char DIT_;
{
    (void) write(1, &DIT_, 1);
}

Slightly better.

We see the three functions we expected: main, _DAH, and __DIT. We also see an external variable, _DAH_, which holds a long string. __DIT looks like the putchar function from the standard library: it prints one char at a time. And what about _DAH?

Dive into _DAH