K

K is an interactive array and list processing language of mostly functional style, designed and implemented by a single person, Arthur Whitney. Being an experienced APL programmer, Whitney at one time created a dialect of APL called A, aimed at high productivity in numerically intensive applications, including time series data analysis. Later on, around 1993, he undertook a new, more radical redesign effort in the same direction, which resulted in creating K. Still later, Whitney founded a company of his own to which the use of K was (and is) of central importance. One consequence of this is that the language became known to the wide programming community.

(A itself was extended by other people into the language A+, which is now free and open source – see the Links section of the APL article.)

K is known for its extreme efficiency in financial computations and other areas where large volumes of data have to be analysed. It is the core technology on which several highly efficient financial products are founded, the main being an in-memory, column-based relational DBMS (Kdb). Also characteristic of K are its terseness, as well as the low volume of its implementation (~200KB, including inter-process communication, a Web interface, and a graphical user interface).

Data and Functions

Although APL was a major inspiration in the design of K, much of its legacy was abandoned in favour of introducing less restrictive data and program structures. In this respect, another important source of influence was Lisp, but K is substantially different from both these languages to be considered a close relative to any of them. The two most important novelties in K with respect to APL were replacing arrays with heterogeneous lists as the most fundamental data structure and the use of functions with arbitrary number of arguments.

As homogeneous lists (vectors) are vastly useful, they are supported by special cases – including at syntax level – of the many K operators that work with lists of general kind. Thus K's generality and economy of expression does not compromise its practicality and execution speed.

Another distinguishing feature of K is the use of dictionaries: associative tables whose keys are symbols, i.e., internalized strings. In turn, dictionaries are the building material of a hierarchically organized global data space called the K-tree.

As with J and some APLs, grammatical terms such as noun, verb, and adverb are used to describe K, although K does not go as far as J in this respect. A program entity that can be immediately executed is either a verb or a function: it applies to nouns (data items) to produce nouns. An adverb is a meta-, or higher-order function which applies to a verb, function or noun to produce a new verb or function. There are only six predefined adverbs and (unlike J) there is no provision for user-defined ones.

Some verbs are said to be ‘atomic’, in the sense that they apply to the atoms that ultimately constitute a list, regardless of how deep and varied the hierarchy of that list is. For example, given a: (1; (2; 3; (4 5; 6); 7); 8; 9) , then -a is (-1; (-2; -3; (-4 -5; -6); -7); -8; -9) , 100+a is (101; (102; 103; (104 105; 106); 107); 108; 109) , and 10 20 30 40 + a is (11; (22; 23; (24 25; 26); 27); 38; 49) . Other verbs apply to a list as a whole, e.g. #a is 4 ( a 's size (at top-level) is 4).

Although there is no distinguishing of verbs by ranks of application other than being atomic or non-atomic (unlike J, where verbs do have ranks associated with them), the level of nesting at which a verb applies can be controlled by modifying the verb by means of adverbs (see the Examples section below).

Verbs are designated by non-alphanumeric characters, such as +, %, !, #, $, etc. Almost all verbs are monadic or dyadic, and most verb designators are ambivalent, having, as in APL and J, both monadic and dyadic meanings.

In general, non-alphanumeric characters are being heavily overloaded. Each character represents two or more distinct functions – verbs or adverbs – that are related to each other but differ depending on the type and the structure of their arguments. Which is being referred to is determined according to the particular context. An extreme case is @ , denoting five different verbs, and of those verbs there are still variants to distinguish among.

Unlike APL, but similar to J and Nial, K only uses ASCII characters.

There is a standard library of functions called ‘system functions’ whose names are letter sequences, but which are similar to verbs: both dyadic verbs and dyadic system functions can be – and commonly are – used as infix operators.

A programmer-defined function cannot be written infix, but, unlike verbs and system functions, it can have an arbitrary number of arguments.

Verbs and functions (but not adverbs) are data values, and can be designated by function atoms as well as by expressions. Examples of expressions that designate functions are + – a function atom denoting a verb, +/ (where / is an adverb) – a derived verb, and {x+5*y-z} – a general-form function expression (anonymous function). Assigning, e.g., a:+ , d:+/ , and f:{x+5*y-z} the three names a , d , and f themselves become function atoms. Like other data, verbs and functions can be named, stored in data structures, or passed as arguments to other functions.

The third of the above functions can be applied {x+5*y-z}[-2;4;3] or f[-2;4;3] to compute (evaluating right-to-left) (-2)+(5*(4-3)) , i.e. 3 .

In order to simplify function definitions, it is assumed that if a function does not explicitly list its parameters, the names x , y , and z can be used to denote, correspondingly, a first, second, and third parameter. For instance, the function {x+5*y-z} could have been defined equivalently as {[p;q;r]5+p*q-r} .

Function application can be partial, thus producing projections that are themselves functions. Taking once more f from above, f[;10;] or f[;10] is equivalent to {x+5*10-y} , so that f[;10][1;2] yields 1+5*10-2 , i.e. 41 ; and h:f[4;7;] or h:f[4;7] is the one-argument function {4+5*7-x} , so h[3] or h 3 yields 24 .

A dyadic verb, e.g. - , can be partially aplied to its left argument, such as in d:7- , so that d 10 yields -3 . As dyads can be applied using the general syntax for function application – such as in -[2;5] – partial application to the right argument of a dyad is possible as well: -[7;] or -[7] is the same as d , while -[;7] is ‘the function that subtracts 7’ (hence -[;7] 10 is 3 ).

Functions can be defined locally. A local function can access the local variables of an enclosing function but is unable to change them. The two (zero-argument) functions f1: {a:3; g:{a+5}; g[]} and f2: {a:3; g:{a:5}; g[]; a} each have a local variable a and a local function g ; both initialize a to 3 and then call g . f1 's g returns 8 , and so does f1 itself, while, within f2 , g assigns 5 to its own copy of a , not f2 's a , so f2[] returns 3 rather than 5 .

A sequence of monadic verbs with a possible dyad at the end is a composition of those verbs; e.g., u: #*|: defines u as composition of |: (reverse), * (first), and # (count), so that, given a: ("ab";"c";"def") , then u a is ‘the count of the first upon reversal of a ’, i.e. the size of "def" , which is 3 . Similarly, given b: (10;20 30) , then (|:'|,)[a;b] computes the composition of , (the dyad join), | (reverse), and |:' (reverse each) – a dyad, applied on a and b , which is (30 20;10;"fed";"c";"ba") . (In both examples, : within |: is used to force the verb | to be interpreted as a monad, as by default ambiguities are resolved in favour of dyads.)

A composition a 1 a 2 …a n of adverbs preceded by a verb v is interpreted as the verb (…((v a 1 )a 2 )…)a n (adverbs apply in left-to-right order, transforming v in stages). The Examples section below illustrates this.

The K-tree

All code and data in K programs is organized in a hierarchical name space, called the K-tree. The root of the K-tree is a directory where other directories, as well as ordinary data, can be stored. Those directories can have their own directories and other data etc.

In fact, directories are themselves data objects: a directory is nothing but a global variable whose value happens to be a dictionary. Thus every global variable belongs to a directory, and the entries in a global dictionary, being at the same time entries in a directory, are, in turn, the global variables of that (nested) directory. At any moment of the program execution one of the directories is current, so that variable names (simple or partial) can be resolved relative to it. Within a program, the directories on the K-tree are being created, removed, and made current as needed.

In order to avoid name conflicts, library or other K code and data can be loaded at distinct nodes on the K-tree. This mechanism proves very effective in implementing scoping and modularization policies, even simple object-oriented models.

Every object on the K-tree, including the directories, can have attributes associated with it. The attributes are also stored on the K-tree, relative to the respective objects. In general, what attribute values and for what purposes are used is up to the programmer, possible uses including e.g. documentation and handling administrative information about a system.

There are attributes with predefined meaning, e.g. controlling the display of the corresponding objects, in particular – that in the GUI. The GUI is managed entirely in this way, thus it is purely data driven and declarative.

Two other predefined kinds of attributes, the so called dependencies and triggers, provide a spreadsheet-like behaviour in programs by relating global variables to each other. Both dependencies and triggers are expressions, each associated (as an attribute) with a global variable.

A trigger is executed whenever its variable receives a value, and is most often used for setting the value of another global variable. Evaluation of a dependency takes place due to any of the global variables in it changing its value. The value of the dependency expression eventually becomes the value of the dependent variable. However, unlike that of a trigger, the evaluation of a dependency only occurs when the dependent variable is actually referenced. In other words, triggers evaluate eagerly and dependencies lazily.

Environment

K is indistinguishable from its implementation, so it is worth mentioning the features of the language that characterize it as a programming environmet.

Normally, K's user interacts with it through the K console, which provides a REPL and commands for loading scripts, as well as for debugging, displaying help, and some others. An executing K program can load from files subroutines, in K source or compiled, in order to extend itself dynamically.

K also incorporates a simple but useful GUI which can be used, in parallel with the K console, for data entry and visualization. The display of a variable's value within the corresponding GUI widget(s) is automatically synchronized with possible changes occurring programmatically, including setting it directly in the console. Conversely, if the value of a variable gets changed from within the GUI, the change is automatically seen by all computations making use of that variable.

Examples

• Given a list of strings, join them into one, using the string ", " as a separator:

2_,/", ",/:

This is a composition of three verbs: ", ",/: , ,/ , and 2_ . The verb ,/: is the verb , (‘join’) modified by the adverb /: (‘each-right’), so that join applies to a left argument and each of the items of the right argument (the given list). ", ",/: is a partial application of ,/: to ", " : the result is a verb appending ", " to the front of each string in the list. ,/ is the verb , modified by the adverb / (‘over’) which makes it apply between each two adjacent items in a list: it joins together all of them into one. Finally, 2_ is a partial application of _ (‘cut’) with left argument 2 : it removes from the result the very first two characters ", " .

• The following three functions do simple computations on polynomials, based on the Horner's rule:

eval: {[t;c]{y+x*t}/c} divide: {[t;c]{y+x*t}\c} shift: {[t;c]{(0,t*x)+x,y}/c}

For a number t and a polynomial p(x) with coefficients given as a list c :

eval computes the value of the polynomial at an argument x= t ,

divide finds the polynomial quotient q(x)=p(x)/(x–t) and remainder r of dividing p(x) with x–t, and

shift finds the polynomial s(x), such that s(x–t)=p(x).

E.g., if t is 2 and c is 2 3 -5 1 (p(x) = 2x3+3x2–5x+1), then

p(2) = 2⋅23+3⋅22–5⋅2+1 = 19,

q(x) = 2x2+7x+9, r = 19 (as p(x) = (2x2+7x+9)(x–2)+19), and

s(x) = 2x3+15x2+31x+19 (as 2(x–2)3+15(x–2)2+31(x–2)+19 = 2x3+3x2–5x+1),

so:

eval[2;2 3 -5 1] returns 19 ;

divide[2;2 3 -5 1] returns 2 7 9 19 ;

shift[2;2 3 -5 1] returns 2 15 31 19 .

In eval , the anonymous function {y+x*t} is applied over the list of coefficients with the adverb / – the accumulated result of this application is the value p(t). divide only differs from eval by using the adverb \ (‘scan’) in place of / (on the same function). shift also uses / but on a different function, which, in a sense, is an inverse to that within eval and divide .

• Flattening a list:

{ :[ @x ; x ; ,/_f'x ] }

‘Flattening’ means obtaining a list of all the atoms of a list – no matter how deep they are nested in its sublists – preserving the relative order in the original list. The above function is recursive and implements a straightforward logic: in a conditional expression, if the argument is an atom ( @x ), x itself is returned; otherwise the same function ( _f is a name for self-reference) applies to each (note the adverb ' , ‘each’) item of x , and the results are collected together in a single list by joining them ( ,/ ).

There is an even shorter solution, though:

,//

The verb ,// reads (,/)/ ; the , in ,/ is the dyad ‘join’, and ,/ itself designates either a monad or a dyad. In the context of ,// this ambivalence remains undecided, and in such a case it is assumed that the monadic variant of ,/ applies. Thus, we have an expression of the form m/ , where m is a monad. The specific rule of applying / to a monad is that the latter is called repeatedly on the argument, gradually transforming it in this way until the result equals the (last or initial) argument.

Applied to the list ((1;2 3;"4");5;("six";`seven)) , the result is (1;2;3;"4";5;"s";"i";"x";`seven)

It should be mentioned that the two functions behave differently when applied to an atom: the former one returns the atom, while the latter returns a list of that atom.

• Cartesian product of a list of lists:

{(,()){,/(,:'y),/:\:x}/|x}

The heart of this function is the verb ,/:\: , implementing a cartesian product of two lists. It is ,/: , i.e. join-with-each-right (see above), modified with \: (‘each-left’), so that each item of the left argument is combined with each item of the right – which is what a cartesian product is. The overall function works by applying an inner function, {,/(,:'y),/:\:x} , through / , repeatedly over the reversed ( | ) list of lists x . Starting from an one-item list with an empty list as an item ( ,() ) – which is the cartesian product of an empty list – the inner function gradually enlarges this cartesian product ( x ) by taking the next list y and joining each of its items with each of the items of x (thus obtaining a new value of x ). Just before doing ,/:\: , each item of y gets ‘promoted’ by nesting it in a list of itself ( ,:'y ). Conversely, after each join, applying ,/ (see above) is needed so that an unnecessary level of list nesting in the product is removed.

Applied to the list ((1;2 3;"4");5;("six";`seven)) of 3 items (themselves having 3, 1, and 2 items, correspondingly), the cartesian product is a list of 6(=3⋅1⋅2) items, of 3 elements each:

((1;5;"six")

(1;5;`seven)

(2 3;5;"six")

(2 3;5;`seven)

("4";5;"six")

("4";5;`seven))



Changes to the Language

K has been mildly revised several times. A very successful and relatively widely used version is K3. The language description in the present article refers mostly to this variant of the language. Further evolution lead to a more recent variant known as K4, and a superset, which is also a syntactic variant of K4, named q.

K4/q is a change over K3 in a number of significant ways, such as:

A number of new datatypes have been added; namely, those defined in SQL 3, and common to most databases.

Tables, such as in relational databases, are first class data in q. More generally, by including features for querying a relational database, q integrates general-purpose and database programming into one language.

(Previously, there was a separate language for database programming, Ksql, implemented in K. Merging the two languages resulted in both increased productivity and reduced program size.)

(Previously, there was a separate language for database programming, Ksql, implemented in K. Merging the two languages resulted in both increased productivity and reduced program size.) Operator ambivalence – so characteristic of the APL-like languages – is avoided in q by giving word names to all monadic operators. Symbols such as + , @ , # , $ , /: etc. only denote dyadic operators, and operators whose names are words are only monadic. (Ambivalence is still present in K4.)

, , , , etc. only denote dyadic operators, and operators whose names are words are only monadic. (Ambivalence is still present in K4.) Nested functions in K4 and q cannot refer to surrounding function's local variables. (Often, the lack of this ability can be circumvented by making use of function projection.)

The support for GUI is dropped from K4 (and q).

Also removed is the possibility to attach attributes to data objects.

The owner of K and q presently offers a database programming system, Kdb+, of which q is the programming language. The system allows one to create DBMSs with the q language integrated in them, contributing to very high performance.

Only q, not K, is currently being offered as a programming language to use, within Kdb+ or otherwise. However, q's implementation is actually one of K4. The differences of q with respect to K4 – its specific syntax and its table/query processing capabilities – are implemented as libraries written in K4.

In fact, besides under q, K4 can still be used directly: the interpreter can switch between q and K modes and run scripts written in K. K4 is not officially documented, though.

Links of Relevance:

Kx Systems: the company that produces K

Introductory articles – somewhat dated but nevertheless rather informative: K, by Arthur Whitney A Shallow Introduction to the K Programming Language

Two interviews with Arthur Whitney

Documentaion for K K user manual: language overview, some primitive functions, operators K reference manual: the complete reference of the language

Free for non-commercial use versions of q (Kdb+)

Kona: an open-source implementation of K (K3/K4)

oK: an open-source implementation of K (K5) in JavaScript, including an online interpreter

boykobbatgmaildotcom