I’ve spent around a year now fiddling with and eventually doing real

data analytic work in the J programming language. J is one of

those languages which produces a special enthusiasm from its users and

in this way it is similar to other unusual programming languages like

Forth or Lisp. My peculiar interest in the language was due to no

longer having access to a Matlab license, wanting an array oriented

language to do analysis in, and an attraction to brevity and the point

free programming style, two aspects of programming which J emphasizes.

I’ve been moderately happy with it, but after about a year of light

work in the language and then a month of work-in-earnest (writing

interfaces to gnuplot and hive and doing Bayesian inference and

spectral clustering) I now feel I am in a good position to offer a

friendly critique of the language.

First, The Good

J is terse to nearly the point of obscurity. While terseness is not a

particularly valuable property in a general purpose programming

language (that is, one meant for Software Engineering), there is a

case to be made for it in a data analytical language. Much of my work

involves interactive exploration of the structure of data and for that sort

of workflow, being able to quickly try a few different ways of

chopping, slicing or reducing some big pile of data is pretty

handy. That you can also just copy and paste these snippets into some

analysis pipeline in a file somewhere is also nice. In other words,

terseness allows an agile sort of development style.

Much of this terseness is enabled by built in support for tacit

programming. What this means is that certain expressions in J are

interpreted at function level. That is, they denote, given a set of

verbs in a particular arrangement, a new verb, without ever explicitly

mentioning values.

For example, we might want a function which adds up all the maximum

values selected from the rows of an array. In J:

+/@:(>./"1)

J takes considerable experience to read, particularly in tacit

style. The above denotes, from RIGHT to LEFT: for each row ( "1 )

reduce ( / ) that row using the maximum operation >. and then ( @: )

reduce ( / ) the result using addition ( + ). In English, this means:

find the max of each row and sum the results.

Note that the meaning of this expression is itself a verb, that is

something which operates on data. We may capture that meaning:

sumMax =: +/@:(>./"1)

Or use it directly:

+/@:(>./"1) ? (10 10 $ 10)
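For readers who do not read J, the same idea can be approximated with ordinary functions. The Python sketch below is only an analogue, not a translation of J's semantics: `compose`, `row_maxima` and `sum_max` are hypothetical names, and a list of lists stands in for a J table.

```python
def compose(f, g):
    """Return the function x -> f(g(x)), loosely like J's @: conjunction."""
    return lambda x: f(g(x))


def row_maxima(rows):
    """Maximum of each row, loosely like >./"1 applied to a table."""
    return [max(row) for row in rows]


# Hypothetical name mirroring the sumMax verb defined above.
# Like the J expression, it is built without mentioning any data.
sum_max = compose(sum, row_maxima)

table = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
print(sum_max(table))  # row maxima are 4, 9 and 6, so this prints 19
```

As in J, `sum_max` is itself a value that can be named, passed around, or applied directly.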

Tacit programming is enabled by a few syntactic rules (the so-called

hooks and forks) and by a bunch of function level operators called

adverbs and conjunctions. For instance, @: is a conjunction roughly

denoting function composition, while the expression +/ % # is a fork

denoting the average operation. What makes it a fork is that it is three

expressions denoting verbs, separated by spaces.
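A monadic fork (f g h) applied to y means (f y) g (h y). As a rough illustration, here is that rule written as a higher-order function in Python. The names `fork` and `mean` are hypothetical, and only the one-argument case of a fork is covered.

```python
import operator


def fork(f, g, h):
    """Monadic fork: y -> g(f(y), h(y)), mirroring J's (f g h) y."""
    return lambda y: g(f(y), h(y))


# Analogue of +/ % # : the sum, divided by the count.
mean = fork(sum, operator.truediv, len)

print(mean([1, 2, 3, 4]))  # 10 / 4, prints 2.5
```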

The details obscure the value: it’s nice to program at function level

and it is nice to have a terse denotation of common operations.

J has one other really nice trick up its sleeve called verb

rank. Rank itself is not an unusual idea in data analytic languages:

it just refers to the length of the shape of the matrix; that is, its

dimensionality.

We might want to say a bit about J’s basic evaluation strategy before

explaining rank, since it makes the origin of the idea more clear. All

verbs in J take one or two arguments on the left and the right. Single

argument verbs are called monads, two argument verbs are called dyads.

A single verb can often be used either way, and we call the

invocation itself monadic or dyadic accordingly. Most of J’s built-in operators

are both monadic and dyadic, and often the two meanings are unrelated.

NB. monadic and dyadic invocations of <

4 < 3 NB. evaluates to 0

<3 NB. evaluates to 3, but in a box.

Given that the arguments (usually called x and y respectively) are

often matrices it is natural to think of a verb as some sort of matrix

operator, in which case it has, like any matrix operation, an expected

dimensionality on its two sides. This is sort of what verb rank is

like in J: the verb itself carries along some information about how

its logic operates on its operands. For instance, the built-in verb

-: (called match) compares two things structurally. Naturally, it

applies to its operands as a whole. But we might want to compare two

lists of objects via match, resulting in a list of results. We can

do that by modifying the rank of -:

x -:"(1 1) y

The expression -:"(1 1) denotes a version of match which applies to

the elements of x and y, each treated as a list. Rank in J is roughly

analogous to the use of repmat, permute and reshape in Matlab: we can

use rank annotations to quickly describe how verbs operate on their

operands in hopes of pushing looping down into the C engine, where

it can be executed quickly.
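To make the idea concrete, here is a deliberately small Python sketch of what modifying the rank of match does when both operands are tables. It is an assumption-laden simplification: `match` and `rank1` are hypothetical names, lists of lists stand in for arrays, and only the "rank 1 on both sides of a table" case is handled, whereas real J rank works for arbitrary dimensions.

```python
def match(x, y):
    """Compare two whole values structurally, standing in for J's -: ."""
    return x == y


def rank1(verb):
    """Lift a two-argument function so it pairs up the rows of two tables."""
    def lifted(x, y):
        if len(x) != len(y):
            raise ValueError("operands must have the same number of rows")
        return [verb(x_row, y_row) for x_row, y_row in zip(x, y)]
    return lifted


x = [[1, 2, 3], [4, 5, 6]]
y = [[1, 2, 3], [4, 5, 7]]

print(match(x, y))         # whole tables differ: False
print(rank1(match)(x, y))  # row by row: [True, False]
```

J reports 1 and 0 rather than True and False, and the loop runs inside the interpreter rather than in user code, but the shape of the result is the same.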

To recap: array orientation, terseness, tacit programming and rank are

the really nice parts of the language.

The Bad and the Ugly

As a programming environment J can be productive and efficient, but it

is not without flaws. Most of these have to do with irregularities in

the syntax and semantics which make the language confusing without

offering additional power. These unusual design choices are

particularly apparent when J is compared to more modern programming

languages.

Fixed Verb Arities

As indicated above, J verbs, the nearest cousin to functions or

procedures from other programming languages, have arity 1 or

arity 2. A single symbol may denote verbs of both arities, in

which case context determines which function body is executed.

There are two issues here, at least. The first is that we often want

functions of more than two arguments. In J the approach is to pass

boxed arrays to the verb. There is some syntactic sugar to support

this strategy:

multiArgVerb =: monad define

'arg1 arg2 arg3' =. y

NB. do stuff

)

If a string appears as the left operand of the =. operator, then

simple destructuring occurs. Boxed items are unboxed by this

operation, so we typically see invocations like:

multiArgVerb('a string';10;'another string')

But note that the expression on the right (starting with the open

parentheses) just denotes a boxed array.
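The pattern is close to tuple unpacking in other languages. A rough Python analogue, in which `multi_arg_verb` and `packed` are hypothetical names and a tuple stands in for the boxed array:

```python
def multi_arg_verb(y):
    # Unpack the single argument into three names, much as
    # 'arg1 arg2 arg3' =. y does for a boxed list in J.
    arg1, arg2, arg3 = y
    return f"{arg1} / {arg2} / {arg3}"


# The caller still passes exactly one value: a packed collection.
packed = ("a string", 10, "another string")
print(multi_arg_verb(packed))  # a string / 10 / another string
```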

This solution is fine, but it does short-circuit J’s notion of verb

rank: we may specify the rank with which the function operates on

its left or right operand as a whole, but not on the individual

“arguments” of a boxed array. But nothing about the concept of rank

demands that it be restricted to one or two argument functions: rank

entirely relates to how arguments are extracted from array valued

primitive arguments and dealt to the verb body. This idea can be

generalized to functions of arbitrary argument count.
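As a sketch of that generalization, the row-wise lifting idea extends to any number of arguments with very little machinery. The Python below is illustrative only: `each_row` and `weighted_dot` are hypothetical names, lists of lists stand in for tables, and only a single fixed rank (rows of tables) is modelled.

```python
def each_row(verb):
    """Lift an n-argument function so it is applied to corresponding rows."""
    def lifted(*tables):
        if len({len(t) for t in tables}) > 1:
            raise ValueError("all operands must have the same number of rows")
        return [verb(*rows) for rows in zip(*tables)]
    return lifted


def weighted_dot(a, b, w):
    """A three-argument operation on three equal-length lists."""
    return sum(ai * bi * wi for ai, bi, wi in zip(a, b, w))


a = [[1, 2], [3, 4]]
b = [[1, 1], [2, 2]]
w = [[1, 0], [0, 1]]

print(each_row(weighted_dot)(a, b, w))  # [1, 8]
```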

Apart from this, there is the minor gripe that denoting such single

use boxed arrays with ; feels clumsy. Call that the Lisper’s bias:

the best separator is the space character.1

A second, related problem is that you can’t have a

zero argument function either. This isn’t the only language where

this happens (Standard ML and OCaml also have this tradition, though I

think it is weird there too). The problem in J is that it would feel

natural to have such functions and to be able to mention them.

Consider the following definitions:

o1 =: 1&-
o2 =: -&1
(o1 (0 1 2 3 4)); (o2 (0 1 2 3 4))
┌────────────┬──────────┐
│1 0 _1 _2 _3│_1 0 1 2 3│
└────────────┴──────────┘

So far so good. Apparently using the & conjunction (called “bond”)

we can partially apply a two-argument verb on either the left or the

right. It is natural to ask what would happen if we bonded twice.

(o1&1)

o1&1

Ok, so it produces a verb.

3 3 $ '' ;'o1' ;'o2' ;'right' ;((o1&1 (0 1 2 3 4)) ; (o2&1 (0 1 2 3 4)) ;'left' ; (1&o1 (0 1 2 3 4)) ; (1&o2 (0 1 2 3 4)))
┌─────┬────────────┬────────────┐
│     │o1          │o2          │
├─────┼────────────┼────────────┤
│right│1 0 1 0 1   │1 0 _1 _2 _3│
├─────┼────────────┼────────────┤
│left │1 0 _1 _2 _3│_1 0 1 2 3  │
└─────┴────────────┴────────────┘

I would describe these results as goofy, if not entirely impossible to

understand (though I challenge the reader to do so). However, none of

them really seem right, in my opinion.

I would argue that one of two possibilities would make some sense.

(1&-)&1 -> 0 (e.g., 1-1)

(1&-)&1 -> 0"_ (that is, the constant function returning 0)

That many of these combinations evaluate to o1 or o2 is doubly

confusing because it ignores a value AND because we can denote

constant functions (via the rank conjunction), as in the expression

0"_ .
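Something close to these two possibilities is what partial application gives in a language with ordinary first-class functions. A minimal Python sketch using the standard `functools.partial`: `o1` and `o2` mirror the verbs above, and `bonded_twice` is a hypothetical name.

```python
from functools import partial
import operator

o1 = partial(operator.sub, 1)  # y -> 1 - y, like 1&-


def o2(y):
    """y -> y - 1, like -&1."""
    return y - 1


print([o1(y) for y in range(5)])  # [1, 0, -1, -2, -3]
print([o2(y) for y in range(5)])  # [-1, 0, 1, 2, 3]

# Supplying the remaining argument leaves a function of no arguments.
bonded_twice = partial(o1, 1)
print(bonded_twice())  # 1 - 1, prints 0
```

Here the twice-bonded function takes zero arguments, which is exactly the kind of function J has no way to express.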

Generalizations

What this is all about is that J doesn’t handle the idea of a

function very well. Instead of having a single, unified abstraction

representing operations on things, it has a variety of different ideas

that are function-like (verbs, conjunctions, adverbs, hooks, forks,

gerunds) which in a way puts it ahead of a lot of old-timey languages

like Java 7 without first-class functions, but ultimately this

handful of disparate techniques fails to achieve the conceptual unity

of first-class functions with lexical scope.

Furthermore, I suggest that nothing whatsoever would be lost (except

J’s interesting historical development) by collapsing these ideas

into the more typical idea of closure capturing functions.

Other Warts

Weird Block Syntax

Getting top-level2 semantics right is hard in any

language. Scheme is famously ambiguous on the subject, but at

least for most practical purposes it is comprehensible. Top-level has

the same syntax and semantics as any other body of code in Scheme

(with some restrictions about where define can be evaluated) but in

J neither is the same.

We may write block strings in J like so:

blockString =: 0 : 0
Everything in here is a block string.
)

When the evaluator reads 0:0 it switches to sucking up characters

into a string until it encounters a line with a ) as its first

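character. The expression 3 : 0 does the same except the resulting string is

turned into a verb. Its sibling 4 : 0 defines a dyadic verb, which is what a body using both x and y calls for:

plus =: 4 : 0
x+y
)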

However, we can’t nest this syntax, so we can’t define non-tacit

functions inside non-tacit functions. That is, this is illegal:

plus =: 4 : 0
plusHelper =. 4 : 0
x+y
)
x plusHelper y
)

This forces the programmer to do a lot of lambda lifting

manually, which also forces them to bump into the restrictions on

function arity and their poor interaction with rank behavior, for if

we wish to capture parts of the private environment, we are forced to

pass those parts of the environment in as an argument, forcing us to

give up rank behavior or forcing us to jump up a level to verb

modifiers.
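For readers unfamiliar with the term, lambda lifting means turning a nested function that captures local variables into a top-level function that receives them explicitly. A small Python illustration of the before and after, with all names hypothetical:

```python
def scale_and_shift_nested(values, scale, shift):
    """Nested helper: scale and shift are captured from the enclosing scope."""
    def helper(v):
        return v * scale + shift
    return [helper(v) for v in values]


def lifted_helper(env, v):
    """Lifted helper: the captured environment is now an explicit argument."""
    scale, shift = env
    return v * scale + shift


def scale_and_shift_lifted(values, scale, shift):
    env = (scale, shift)  # package up what the helper used to capture
    return [lifted_helper(env, v) for v in values]


print(scale_and_shift_nested([1, 2, 3], 2, 3))  # [5, 7, 9]
print(scale_and_shift_lifted([1, 2, 3], 2, 3))  # [5, 7, 9]
```

In Python the lifted form is merely more verbose. In J the extra `env` argument has to occupy one of the verb's only two operand slots, which is where the friction with rank comes from.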

Scope

Of course, you can define local functions if you do it tacitly:

plus =: 4 : 0
plusHelper =. +
x plusHelper y
)

But, even if you are defining a conjunction or an adverb, where you

are able to “return” a verb, you can’t capture any local functions –

they disappear as soon as execution leaves the conjunction or adverb

scope.

That is because J is dynamically scoped, so any capture has to be

handled manually, using things like adverbs, conjunctions, or the good

old fashioned fix f. , which inserts values from the current scope

directly into the representation of a function. Essentially all modern

languages use lexical scope, which is basically a rule which says: the

value of a variable is exactly what it looks like from reading the

program. Dynamic scope says: the value of the variable is whatever

its most recent binding is.
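The difference is easy to see side by side. Python is lexically scoped, so the sketch below shows lexical scope directly and simulates dynamic scope with an explicit stack of bindings. The simulation (`bindings`, `lookup`, `call_with`) is a hypothetical teaching device, not a description of how J implements name lookup.

```python
# Lexical scope: n is resolved where add is defined.
def make_adder(n):
    def add(y):
        return y + n
    return add


add5 = make_adder(5)
n = 100  # an unrelated global with the same name
print(add5(1))  # the captured n is 5, prints 6

# Simulated dynamic scope: names are resolved against the call stack.
bindings = [{"n": 100}]


def lookup(name):
    """Return the most recent binding of name."""
    for frame in reversed(bindings):
        if name in frame:
            return frame[name]
    raise NameError(name)


def dyn_add(y):
    return y + lookup("n")


def call_with(frame, f, *args):
    """Call f with extra bindings that vanish when the call returns."""
    bindings.append(frame)
    try:
        return f(*args)
    finally:
        bindings.pop()


print(call_with({"n": 5}, dyn_add, 1))  # sees the caller's n, prints 6
print(dyn_add(1))                       # binding is gone, prints 101
```

The last line is the situation described above: once execution leaves the scope that supplied the binding, the function no longer sees it.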

Recapitulation!

The straight dope, so to speak, is that J is great for a lot of

reasons (terseness, rank) but also has a lot of irregular language

features (adverbs, conjunctions, hooks, forks, etc) which could be

folded all down into regular old functions without harming the

benefits of the language, and simplifying it enormously.

If you don’t believe that regular old first-class functions with

lexical scope can get us where we need to go, check out my

tacit-programming libraries in R and JavaScript. I

even wrote a complete, if ridiculously slow implementation of J’s

rank feature, literate-style, here.

Footnotes

1 It bears noting that ; in an expression like (a;b;c)

is not a syntactic element, but a semantic one. That is, it is the

verb called “link” which has the effect of linking its arguments into

a boxed list. It is evaluated like this:

(a;(b;c))

(a;b;c) is nice looking but a little strange: In an expression

(x;y) the effect depends on whether y is already boxed: x is always boxed regardless, but y is boxed only if it wasn’t boxed before.

2 Top level? Top-level is the context where everything

“happens,” if anything happens at all. Tricky things about top-level

are like: can functions refer to functions which are not yet defined,

if you read a program from top to bottom? What about values? Can you

redefine functions, and if so, how do the semantics work? Do functions

which call the redefined function change their behavior, or do they

continue to refer to the old version? What if the calling interface

changes? Can you check types if you imagine that functions might be

redefined at any time? If your language has classes, what about

instances created before a change in the class definition? Believe it or

not, Common Lisp tries to let you do this – and it’s confusing!

On the opposite end of the spectrum are really static languages like

Haskell, wherein type enforcement and purity ensure that the top-level

is only meaningful as a monolith, for the most part.