I studied linear algebra decades ago and I have forgotten most of what I learned. I can't even remember how to invert a matrix now (//shame). However, I still remember how to multiply matrices. More specifically, I remember this:

C[i,j] = A[i,k] * B[k,j]

"But this isn't matrix multiplication. Matrix multiplication is a summation!" -- If you cry out, I commend you for being a master of linear algebra; but the above equation is matrix multiplication; it is written in a convention called Einstein notation. A notation is simply a set of syntactical conventions. For example, when two variables are written together, a multiplication is implied. In Einstein notation, the convention is a loop over each subscript index, and a summation over the subscripts that appear only on the right side of the assignment. In programming terms, what I mean is:

$for i=0:Ni
    $for j=0:Nj
        C[i,j] = 0.0
        $for k=0:Nk
            C[i,j] += A[i,k] * B[k,j]

History of Einstein notation

I do not really know the history of Einstein notation. In fact, I did not even know the notation had a name until recently. All I remember is this way of deriving matrix operations that my math teacher introduced to us a long, long time ago, and I have been very impressed with it ever since. But recently, I was surprised to find that most of my friends do not know about this convention, which prompted me to search the internet and find this catchy name -- "Einstein notation" -- and this Wikipedia page: http://en.wikipedia.org/wiki/Einstein_notation. Apparently, it was Einstein who introduced this notation to the public (physicists) in 1916 when he wrote about the general theory of relativity. The Wikipedia page mentions that it is a notational subset of "Ricci calculus". So my math teacher probably did not learn it from Einstein; but the name is catchy (at least to us common mortals), so I have started to use "Einstein notation" now.

Why I was so impressed with it

Because it is concise and contains all the necessary information for me to understand the matrix operation.

Most of my friends who use Matlab use this notation:

C = A * B

To me, this is like black magic. There is no clue on the surface to tell me what A, B, or C are, or what is really going on in the expression. Yes, I can read the whole program and simply hold in my working memory that A, B, and C are all matrices, and further, I can retrieve from my long-term math memory that what goes on is rows of A multiplied with columns of B and accumulated into C. -- That is not trivial by my standards. All the matrix implementations in the various programming languages I have encountered so far are like this: abstract magic. But with Einstein notation, all the important information is on the surface: C[i,j] = A[i,k] * B[k,j] is matrix A multiplied by matrix B and assigned into matrix C. And A[i] = M[i,j] * B[j] is vector B multiplied by matrix M and assigned into vector A. Not only do I not have to hold the context of the whole program to understand these expressions, I also know what is going on underneath, relying only on basic arithmetic. That is, even if I do not have any matrix library available, I know how to calculate them using basic arrays, loops, summation, and multiplication. That is power! To illustrate, rather than searching for magic words in typical libraries, I understand the following:

s = A[i] * B[i]                                      (dot product)
A[i] = M[i,j] * B[j]                                 (matrix times vector)
C[i,j] = A[i,k] * B[k,j]                             (matrix multiplication)
C[i] = A[[i+1]] * B[[i+2]] - A[[i+2]] * B[[i+1]]     (cross product)

[EDIT: The cross product expression is a bit of a stretch, where the subscript '[]' denotes a 'modulo 3' operator. I do use it personally when I need to figure out the component relationships.]

[EDIT2: Mathematicians out there will defend the magic of abstraction. Please do not get me wrong -- I am not attacking it. I myself use abstract symbolic matrix algebra more often than I use Einstein notation. However, like that special hammer, the appreciation and usefulness of a tool is not directly related to how often it is used.]

The way I want to program

For all those years, Einstein notation worked for me in my mind and on paper, but when I write programs (in C, as in my line of work efficiency trumps everything), I still need to write like this:

int main(int argc, char** argv){
    float A[4] = {2, 3, 4, 0};
    float B[4] = {1, 2, 5, -1};
    float sum = 0;
    int i;
    float C[4];
    for(i=0; i<2; i++){
        int k_ij;
        int j;
        k_ij = i * 2;
        for(j=0; j<2; j++){
            int k_ik;
            int k_kj;
            float sum = 0;
            int k;
            k_ik = i * 2;
            k_kj = +j;
            for(k=0; k<2; k++){
                sum += A[k_ik] * B[k_kj];
                k_ik++;
                k_kj += 2;
            }
            C[k_ij] = sum;
            k_ij++;
        }
    }
}

That is ugly; that is painful; that makes me want to abandon my basic knowledge of how to multiply matrices and go use Matlab (and deal with its awful string handling, its slowness, and the yearly license, and, more importantly, give up my feeling of power and surrender to magic)!

I know what I want. I want to write my program like this:

float A[4]={2, 3, 4, 0}
float B[4]={1, 2, 5, -1}
float C[4]
Einstein_notation: C[i,j]=A[i,k]*B[k,j]

Introducing MyDef -- Two layers of programming

I am very proud to announce that I can do almost just that! It needs some magic of my own, that is, MyDef.

Let us separate programming into two layers. The lower layer deals with language units that correspond to units in assembly or machine code. This includes data types, first-order functions, data structures, branches, and loops. Typical library implementations are in this lower layer as well, because they are compiled as binary blocks and called by a (or a set of) assembly instructions. The upper layer deals with language text. C's preprocessor is in the upper layer. The various syntactic sugars in many modern languages also belong in the upper layer. Bluntly, all compile-time-only language features are upper layer features. While most languages distinguish themselves by their lower layer design choices, functional programming languages such as Lisp are known for their upper layer features.

Conventionally, programming languages do not separate the two layers, and consequently, features in both layers become constraints on each other, and language design choices are often a compromise. For example, C has an almost complete representation of hardware units (which take all shapes and complexities/flexibilities). With that, its higher level constructs are very limited (or terrible). Its include mechanism is so basic that it becomes a major problem for large projects. Its macros are so semantics-less that they can be very obscure. On the other end of the spectrum, Lisp is designed for its upper layer elegance. Its macros are structured all the way to the top. However, to achieve that, it limits its lower level representations. In Lisp, code is data, so it is s-expressions all the way down. In the middle of the spectrum, there is the set of dynamic languages. They strike a middle ground by abstracting away the lower level hardware with virtual machines and providing "eval" and its family to let language features manipulate compile-time structure -- dynamically. With that, they have a lever to adjust the balance between the features of the two layers. This is the famous "waterbed" theory -- by Larry Wall, the father of Perl.

I agree with the waterbed theory, but the theory applies to everything, including our daily life, such as natural language. Yet I personally never had a dilemma in choosing between flexibility and specificity. How do we manage that? We manage it with context. We can think abstractly -- no data types; then we can think in detail -- how the integer changes in bits; and we don't even realize that we have been switching contexts!

It would be too ambitious to emulate that kind of power of the human brain, but my solution is to separate contexts. My theory is that typical languages have to observe the waterbed theory because it is all one context to the user/programmer. Under one context, the programmer has to worry about expressibility and precision at the same time, which necessarily requires balance. But let's separate the upper layer of programming from the lower layer. Now, with two layers, in the upper layer (with enough editor and language feature clues) the user/programmer is clear that it is an upper layer, and he can embrace ambiguity for expressive power or ideology. Then, as a process of refactoring/debugging, he can go to the lower layer to verify accuracy. Just like how we use our natural language: when we speak, we say whatever is in our mind directly, then we check the audience's response and adjust/develop our context or conventions until the ambiguity is automatically resolved correctly most of the time.

Confused? Let me illustrate. Say we are inventing a syntax. If it is the only context, we need to worry about its corner cases, because otherwise we won't know for sure the program is not broken in situations that we did not test. And the reality is that we can never test all cases, and there is often a nearly unlimited number of corner cases -- it is just hard to invent languages given one and only one context. But if it is merely an upper layer, there are no dire consequences in using high-powered syntax, or sometimes even arbitrary syntax (think slang). We can verify its safety by examining its lower layer translation -- hopefully, that lower layer language is clearly and unambiguously designed and its compiler can offer good verification. The worst case is no worse than using the lower layer language directly.

Introducing MyDef. Unlike typical programming languages, MyDef is a pure upper layer -- I find that name too obscure, so I often call it the lexical layer (disregarding that word's ambiguity) -- a pure lexical layer implementation.

Introducing MyDef -- its thinnest form

I hate myself when I talk in the clouds. In reality, MyDef is simply a flexible layer that translates from the way you want to program into a form the underlying language compiler will accept (exactly the form you would write directly in that language). Disregarding how it sounds when I talk in the clouds, in reality it is very simple to start using MyDef. Let me show you an example of MyDef in its thinnest form:

page: matrix

    int main(){
        float A[4] = {2, 3, 4, 0};
        float B[4] = {1, 2, 5, -1};
        float sum = 0;
        int i;
        float C[4];
        for(i=0; i<2; i++){
            int k_ij;
            int j;
            k_ij = i * 2;
            for(j=0; j<2; j++){
                int k_ik;
                int k_kj;
                float sum = 0;
                int k;
                k_ik = i * 2;
                k_kj = +j;
                for(k=0; k<2; k++){
                    sum += A[k_ik] * B[k_kj];
                    k_ik++;
                    k_kj += 2;
                }
                C[k_ij] = sum;
                k_ij++;
            }
        }
        for(i=0; i<4; i++){
            printf("C[%d]: %f\n", i, C[i]);
        }
    }

The thin layer is so thin that it is literally transparent. But it still needs to be translated:

$ mydef_page -mc matrix.def
PAGE: matrix --> [./matrix.c]
$ gcc -o matrix matrix.c && ./matrix
C[0]: 17.000000
C[1]: 1.000000
C[2]: 4.000000
C[3]: 8.000000

Introducing MyDef -- $sumcode

Ha! You are under-impressed, aren't you?

I think we can draw an analogy between MyDef and Emacs. At its most basic, Emacs is not far from Notepad. But Emacs has a layer of Lisp that, when embraced, can achieve anything, limited only by the user's imagination and mastery.

On day 1, you don't have to worry about all the complicated, fancy stuff that MyDef allows you to do. You just program in your usual way, as if it were Notepad inside Emacs. Slowly, you can learn and try stuff in MyDef, and before long, you can write a program like this:

include: c/matrix.def

page: matrix, basic_frame

    $local float A[4]={2, 3, 4, 0}
    $local float B[4]={1, 2, 5, -1}
    $call set_matrix, A, 2, 2
    $call set_matrix, B, 2, 2
    $local float C[4]
    $call set_matrix, C, 2, 2
    $sumcode C[i,j]=A[i,k]*B[k,j]
    $call matrix_print, C

I admit it is not as ideal as the math notation -- I have to tell MyDef the dimensions of my A, B, and C, and I don't really want to omit the "*" and use true subscripts -- but it is very close to the way I wanted to program. With the same compile cycle:

$ mydef_page -mc matrix.def
PAGE: matrix --> [./matrix.c]
$ gcc -o matrix matrix.c && ./matrix
Matrix C: 2 x 2
     17      1
      4      8

If you don't want to get your hands wet, I paste "matrix.c" here for your reference:

void f_matrix_print(float* pf_A, int m, int n);

int main(int argc, char** argv){
    float A[4] = {2, 3, 4, 0};
    float B[4] = {1, 2, 5, -1};
    float C[4];
    int i_i;
    for(i_i=0; i_i<2; i_i++){
        int k_ij;
        int i_j;
        k_ij = i_i * 2;
        for(i_j=0; i_j<2; i_j++){
            int k_ik;
            int k_kj;
            float sum = 0;
            int i_k;
            k_ik = i_i * 2;
            k_kj = +i_j;
            for(i_k=0; i_k<2; i_k++){
                sum += A[k_ik] * B[k_kj];
                k_ik++;
                k_kj += 2;
            }
            C[k_ij] = sum;
            k_ij++;
        }
    }
    printf("Matrix C: %d x %d\n", 2, 2);
    f_matrix_print(C, 2, 2);
    return 0;
}

void f_matrix_print(float* pf_A, int m, int n){
    int i;
    for(i=0; i<m; i++){
        if(m > 8 && i == 3){
            puts(" ...    ...");
            i += m - 5;
        }
        else{
            int j;
            for(j=0; j<n; j++){
                if(n > 8 && j == 3){
                    printf(" ... ");
                    j += n - 5;
                }
                else{
                    printf(" %6g ", pf_A[i*n+j]);
                }
            }
            puts("");
        }
    }
}

I admit: just as the underlying C language is not a pure lower layer, my upper layer of MyDef is not a pure upper layer either -- it did sneak in a subroutine, "f_matrix_print", which should live in a lower level library. For practical purposes, you don't mind, do you?

More examples in C

Just to illustrate that the idea of Einstein notation ($sumcode) is not limited to matrices (or vectors), the concept applies to general indexed arrays and loops:

$local int a[100]
$sumcode a[i]=i+1
$sumcode sum = a[i]
$print "Sum of 1..100 is $sum"

You have no trouble understanding what it does, right? Yep, if you let it, MyDef will do some basic type inference. On the other hand, MyDef will leave all lines that end with ';', or that it doesn't understand, as they are, so you can almost ignore MyDef if your usual C programming is full of semicolons.

Examples in Perl

Not all applications, or all parts of a program, are efficiency critical, and for non-critical programming, my favorite language is Perl. Like C, Perl does not try to dictate what my programming paradigm should be, leaving me more room in the lexical layer.

page: test

    $my @a[10]
    $sumcode $a[i]=i
    $sumcode print "i: $a[i]\n"
    $sumcode(10) print "sumcode simple: i\n"
    $my @a[2,2], @T[2,2]=(0,1,1,0)
    $sumcode $a[i,j]=i+j
    $sumcode print "i, j: $a[i,j]\n"
    $print
    $my @b[2,2]
    $sumcode $b[i,j]=$T[i,k]*$a[k,j]
    $sumcode print "i, j: $b[i,j]\n"

Again, I need to invent some syntax to tell MyDef my intended dimensions for the arrays. We go through a similar compile cycle (which can be integrated into your editor, I believe):

$ mydef_page -mperl test.def
PAGE: test --> [./test.pl]
$ perl test.pl
0: 0
1: 1
2: 2
3: 3
4: 4
5: 5
6: 6
7: 7
8: 8
9: 9
sumcode simple: 0
sumcode simple: 1
sumcode simple: 2
sumcode simple: 3
sumcode simple: 4
sumcode simple: 5
sumcode simple: 6
sumcode simple: 7
sumcode simple: 8
sumcode simple: 9
0, 0: 0
0, 1: 1
1, 0: 1
1, 1: 2
0, 0: 1
0, 1: 2
1, 0: 0
1, 1: 1

I have only implemented $sumcode in C and Perl, as they are what I needed. In principle, it is not difficult to implement for any other language. Think about it: it is just basic text manipulation (but you need that layer to allow you to manipulate).

PS:

I hope you like what I showed here. I would like to remind you that I am not just showing you language syntax candy. I am showing you an idea: that you do not have to suppress your own intuition about what is the best way to program (unless you are young and just starting out, in which case you should empty your bottle and learn). On the other hand, if you believe in yourself, then I encourage you to pick up MyDef (or Lisp if you like it, or any other lexically empowering layer) and program the way you think.