The Concatenative Language XY

XY is a family of array-oriented, concatenative programming languages with first-class continuations. XY 1 has quotations, lists, functions, and patterns. XY 2 is flat. XY 0 has quotations and shuffle-symbols but dispenses with lists and patterns.

Elements

The elements of XY are words. A word is a unary function which takes and returns a pair of the form:

[stack queue]

The stack is a list which represents the past of the computation.

The queue is a list which represents the future of the computation.

XY is written in K in continuation-passing style: the queue returned by an application step determines the next application step. A call to a defined word is implemented by pushing the definition of that word onto the queue. In this sense, XY is "stackless."

The datatypes of XY are the atoms and lists of K: integer, float, character, symbol, dictionary, null, function, and list. The elementary datatypes are integer, float, character, and symbol. A vector is a list of atoms of uniform elementary datatype.

The twenty K verbs are the primitive functions of XY:

~ ! @ # $ % ^ & * - _ = + | : < , > . ?

The eight words of the XY core are implemented directly in K:

-> <- => <= / \ ; {

The K adverbs, and the operators and combinators of Joy are implemented in XY.

Notation

Punctuation symbols:

() [ ] " '

Words are separated by blanks:

12 4 + *

Blanks may be elided around punctuation symbols:

[1 2 3[4 5]] (1 2 3(4 5))

Characters:

'a

Strings:

"abc\"def"

A string is a list of characters:

'a'b'c "abc" *: 'a

Backquote is both a primitive (enclose/disclose) and, prefixed to a list, notation for functions:

`[1 2 3] `[1 2 3] 4:: 7

Verbs

Each K verb v has three forms:

v dyad [X^a^b^v Y] -> [X^a v b Y] v: monad [X^a^v Y] -> [X^v a Y] v. commute dyad [X^a^b^v Y] -> [X^b v a Y]

Literals

_n null _n 0s empty symbol ` 0S empty symbol vector 0#` 0V empty integer vector !0 0I max int 0I -0I 1+min int -0I 0N min int 0N 0v empty float vector 0#0. 0i infinity 0i -0i negative infinity -0i 0n NaN 0n

Lists and Quotations

XY supports notation for both lazy lists ("quotations") and eager lists ("lists").

Quotations are written using square brackets:

[2 3 +] [2 3 +]

Lists are written using parentheses:

(2 3 +) [5]

A list such as (2 3 +) is recursively transformed at parse-time into the expression

[[2 3 +]] [i] infra

Evaluation

In the course of execution, XY processes two objects: a list of values X and a list of tokens z^Y, where z is the next token to be processed and Y is a list of the remaining tokens. X is the result stack and Y is the input queue. Evaluation of z consists of applying z to the list [X Y]. The result of the application is [X' Y'], a list containing a new stack X' and a new queue Y'.

z is either an atom or a list. An atom is either an integer, a float, a character, a dictionary, null, a function, or a symbol. A symbol is either defined or undefined.

If z is anything other than a defined symbol, its evaluation rule is:

[X z^Y] -> [X^z Y]

That is, z is popped from the queue and pushed onto the stack.

XY maintains a mapping W from symbols to evaluation rules. For each defined symbol, there is an evaluation rule in W. For example, the evaluation rule for the primitive + is:

[X^x1^x2 Y] -> [X^x1+x2 Y]

That is, + takes X^x1^x2 and Y, pops x1 and x2, adds them, pushes the result onto X, and returns the new stack and the unaltered queue Y.

Projection

Primitives like + and patterns like { [a b c] b a + c * } have valence. + has valence 2: it expects at least two values on the stack. If the stack contains fewer than the expected number of values, the function is projected onto the available values to produce a closure:

clear 2 + * [2 + *] 3 swap i [5 *] 4 swap i 20 clear 2 { [a b c] b c + a * } [2 { [a b c] b c + a * }] 3 swap i [3 2 { [a b c] b c + a * }] 4 swap i 20 clear 2 { [a b] a b + } { [a b] a b * } [2 { [a b] a b + } { [a b] a b * }] 3 swap i [5 { [a b] a b * }] 4 swap i 20

Core

<- stack [X^z Y] -> [z Y] <- replaces the stack with the item found at the top of the stack: 10 20 [2 3 +] <- 2 3 + <- is Joy's "unstack" operator.

-> queue [X^z Y] -> [X z] -> replaces the queue with the item found at the top of the stack: 10 20 [2 3 +] -> 30 40 50 10 20 5 -> has no analogue in Joy.

<= uncache [X Y^z] -> [X^z Y] <= moves the tail of the queue to the top of the stack: 1 2 3 <= 4 5 1 2 3 5 4 <= has no analogue in Joy.

=> cache [X^z Y] -> [X Y^z] => moves the top of the stack to the tail of the queue: 1 2 3 => 4 5 1 2 4 5 3 => has no analogue in Joy.

/ use [X^z Y] -> [X z,Y] / prepends to the queue the item found at the top of the stack. 10 20 [2 3 +] / 30 40 50 10 20 5 30 40 50 / is Joy's "i" combinator.

\ mention [X z^Y] -> [X^z Y] \ appends to the stack the item found at the top of the queue. 10 20 \ + 30 40 10 20 + 30 40 \ has no analogue in Joy.

` enclose [X^z Y] -> [X^{z} Y] disclose [X^{z} Y] -> [X^z Y] ` turns a list into a function and a function into a list. ` has no effect on non-functional atoms. `[1 2 3] @: 1 `[1 2 3] ` @: -1 10 ` 10 ~ 1 N.B. ` is self-dual.

; define [X name^..code..^;^Y] -> [X Y] All definitions in XY are inline: ; add-mul + * ; 2 3 4 add-mul 14 Definitions can nest: ; xxx \; yyy 2 + \; ; xxx 3 yyy 5 Undefine with: ; foo ; E.g., ; add-mul ; add-mul add-mul

{ pattern [X template^..code..^}}^Y] -> [X^Z Y] The template is a list of names, possibly nested. { matches names in the template to elements on the stack, and then substitutes those values for matching names in the code. The result of the subtitution Z is prepended to the queue: 10 20 30 { [a b] a b b a } 10 20 30 30 20 The template may be nested: 10 [20 30] { [a [b c]] [a b][a c] } [10 20] [10 30] A capital letter may occur as the final element in a template list, in which case it denotes the 'cons' of the list: [10 20 30] { [[a A]] A a } [20 30] 10 If the stack and queue have the form X^x1^..^xn {^[^a1^..^an^]^..^}^Y then _x is X, _y is Y, and _z is {^[^a1^..^an^]^..^}. That is, _x is the stack after the items matching names in the template have been popped, _y is the queue after the pattern has been popped, and _z is the pattern being processed. Substitution of values for names (including _x, _y, and _z) is scoped to one level of pattern. That is, if P is a pattern whose template contains the name N, and Q is a pattern occuring within P whose template contains N, then the subtitution of a value on the stack for N occurs in P but not Q. The empty pattern {} is the identity function, or "no op".

Periphery

The following words are provisionally included in the current implementation:

.g get [X^".." Y] -> [X^[value] Y] .s set [X^value^".." Y] -> [X Y]

.i input [X Y] -> [X^".." Y], ".." accepted .o output [X^a Y] -> [X Y], a printed

.p parse ` [X^".." Y] -> [X^[..] Y] .r represent [X^[..] Y] -> [X^".." Y]

.v chase ` [X^i^v Y] -> [X^v/i Y] .v! chase! [X^i^v Y] -> [X^v\i Y]

.m fsm [X^j^i^m Y] -> [X^j m/i Y] .m! fsm! [X^j^i^m Y] -> [X^j m\i Y]

.ea evaluate alternate [X^p^q Y] -> [X p^Y] or [X q^Y] .xy xy step [X^X0^Y0 Y] -> [X^X1^Y1 Y]

.a amend [X^a^b^c Y] -> [X^@[a;b;:;c] Y] .d dmend [X^a^b^c Y] -> [X^.[a;b;:;c] Y]

Operational

XY contains the following operational words:

:cmd execute command [X^".." Y] -> [X Y], \.. :k exit to k [X Y] -> [X Y] :exit exit to sysem [X Y] -> :error signal error [X^".." Y] -> [X Y], signal error :stop stop in K [X Y] -> [X Y] :load load .xy script [X^".." Y] -> [X' Y'] :save save the stack [X^".." Y] -> [X Y] :trace set trace on/off [X^n Y] -> [X Y] :words defined symbols [X Y] -> [X^[words in W] Y]

Scripts

XY supports scripts. An XY script is a text file which contains XY code. Typically, the filename ends in ".xy", although this is not necessary.

On initialization, XY loads the system script xy.xy. The system script loads further scripts containing core definitions, including stack manipulation words like dup and swap, K adverbs like over, each, and prior, and Joy combinators like linrec and dip.

Comments are written using //:

// this is a comment

A comment ends at the next newline.

In a script, \\ at the beginning of a line terminates interpretation:

: \\ unevaluated :

Programming

In Factor, the words >r and r> are used to store and retrieve items from the stack. The analogous words in XY are => and <=, which use the tail of the queue instead of a separate stack. As of 1.7, these words have been moved to the core, but they can be re-defined as non-primitives:

; => { [] _y -cons -> } ; ; <= { [] _y -uncons -> } ;

A trace shows how they operate:

40 :trace <enter> for next step, any character to exit 1 2 3 => 4 5 <= 6 7 8 : 1 2 3 => 4 5 <= 6 7 8 1 : 2 3 => 4 5 <= 6 7 8 1 2 : 3 => 4 5 <= 6 7 8 1 2 3 : => 4 5 <= 6 7 8 1 2 3 : { [] _y -cons -> } 4 5 <= 6 7 8 1 2 3 : [4 5 <= 6 7 8] -cons -> 4 5 <= 6 7 8 1 2 3 [4 5 <= 6 7 8] : -cons -> 4 5 <= 6 7 8 1 2 3 [4 5 <= 6 7 8] : |: cons |: -> 4 5 <= 6 7 8 1 2 3 [8 7 6 <= 5 4] : cons |: -> 4 5 <= 6 7 8 1 2 3 [8 7 6 <= 5 4] : { [a b] [a] b , } |: -> 4 5 <= 6 7 8 1 2 : [3] [8 7 6 <= 5 4] , |: -> 4 5 <= 6 7 8 1 2 [3] : [8 7 6 <= 5 4] , |: -> 4 5 <= 6 7 8 1 2 [3] [8 7 6 <= 5 4] : , |: -> 4 5 <= 6 7 8 1 2 [3 8 7 6 <= 5 4] : |: -> 4 5 <= 6 7 8 1 2 [4 5 <= 6 7 8 3] : -> 4 5 <= 6 7 8 1 2 : 4 5 <= 6 7 8 3 1 2 4 : 5 <= 6 7 8 3 1 2 4 5 : <= 6 7 8 3 1 2 4 5 : { [] _y -uncons -> } 6 7 8 3 1 2 4 5 : [6 7 8 3] -uncons -> 6 7 8 3 1 2 4 5 [6 7 8 3] : -uncons -> 6 7 8 3 1 2 4 5 [6 7 8 3] : |: uncons |: -> 6 7 8 3 1 2 4 5 [3 8 7 6] : uncons |: -> 6 7 8 3 1 2 4 5 [3 8 7 6] : { [[a A]] \ a A } |: -> 6 7 8 3 1 2 4 5 : \ 3 [8 7 6] |: -> 6 7 8 3 1 2 4 5 3 : [8 7 6] |: -> 6 7 8 3 1 2 4 5 3 [8 7 6] : |: -> 6 7 8 3 1 2 4 5 3 [6 7 8] : -> 6 7 8 3 1 2 4 5 3 : 6 7 8 1 2 4 5 3 6 : 7 8 1 2 4 5 3 6 7 : 8 1 2 4 5 3 6 7 8 : 1 2 4 5 3 6 7 8

Continuations

At any point in the course of execution, the stack and the queue together make up the current continuation. Four functions for manipulating the current continuation are defined in xy/call.xy.

call/cc [X^[p]^n Y] -> [X^[X^Y^n^continue/cc] Y] { [q n] [_x _y n continue/cc] q i } call/cc expexts a quotation q and a non-negative integer n. call/cc returns [_x _y n continue/cc] and then executes q.

continue/cc [X^a1^..^an Y] -> [X' Y'] { [x y n] x <- n -: _x # i y -> } continue/cc expects the old stack x, the old queue y, and a non-negative integer n. continue/cc sets the new stack to the old stack, takes n elements from the top of what was the new stack, and sets the new queue to the old queue.

call/pc [A^m^X^m^[p]^n Y^m^B] -> [A^[X^Y^n^continue/pc] B] { [m q n] _x m -take_ _y m take_ n \continue/pc , q i } call/pc expects a "mark" object, a quotation q, and a non- negative integer n. call/pc constructs a quotation [X Y n continue/pc] where X is the stack back to the mark object (or the whole stack if the mark object isn't found,) and Y is the queue ahead to the mark object (or the whole queue if the mark object isn't found.) call/pc returns the the quotation, and then executes q.

continue/pc [X^a1^..^an Y] -> [X' Y'] { [x y n] x i n -: _x # i y i } continue/pc expects the partial old stack x, the partial old queue y, and a non-negative integer n. continue/pc appends the partial stack to the current stack, appends to that n elements from the top of the current stack, and prepends the partial queue to the current queue

Implementation

At the present time, the implementation consists of a single script containing approximately 160 lines, including whitespace and comments.

The K process contains 24 functions and four global variables:

a b c d e f g h i j k l m n o p q r s t v w x y

L S T W

L is a list of literals and their XY names. S is the result-stack, defined as a global in order to maintain state across interactions with the interpreter. T is the trace value. W is the mapping from names to evaluation rules.

Interpretation in XY consists of the standard read-eval-print loop. The three steps correspond to three functions: the parser p, the evaluator e, and the function r, which takes a list (either the stack or the queue) and represents it as a string, suitable for printing.

The evaluator takes a stack x and a queue y:

e:{({{if[T>0;t[x;y]];a[*y,();x,()](1_ y),()}}.)/(x;y)}

e repeatedly evaluates x and y until one of the following conditions occurs: either the result of the evaluation matches the previous result, or the result of the evaluation matches the initial values of x and y.

At each step, e checks to see whether the trace variable T is positive. If it is, it displays the current states of x and y and waits for the user to continue (press Enter) or exit the trace (type a character, press Enter.)

t:{if[(#x)|#y;`0:(T$r x)," : ",(-T)$r y;:[#y;if[#0:`;T::0];`0:,""]]}

It then applies the following algorithm: if the queue z is empty or the stack y is null, it returns the current stack and queue; else if the next element to be evaluated x is not a defined symbol, it appends it to the stack and returns that and the queue; else it applies the evaluation rule for that symbol:

a:{:[(~#z)&_n~x;(y;z);~4=4:x;(y,,x;z);(#**W)=i:W[0]?x;(y,x;z);W[1;i][y;z]]}

Use

Download and install a copy of the K interpreter from http://www.kx.com.

Download the directory http://www.nsl.com/k/xy/.

Configure a command prompt or shell to start in the directory xy.

At the command line, type

k xy

An XY script is a text file with the qualifier .xy. To start XY with the script my.xy:

k xy my.xy

To exit to K from XY:

:k

To exit to the command prompt:

:exit

The default state of the XY interpreter is "error-trapping off". In this state, a K error will cause the XY interpreter to suspend at the error:

-1 !: domain error {[f;n;x]:[n=1;f@*-1#x;f .(-n)#x]} ^ >

To turn error-trapping on:

"e 1" cmd -1 !: domain error

The system word 'stop' is useful for inspecting the stack and queue in K:

2 3 :stop 4 5 type error {`0;(x;y)} ^ > x 2 3 > y 4 5

To continue past the stop-point:

> : : 2 3 4 5

Oddities

Strings, definitions, and patterns automatically close:

2 3 "abc 2 3 "abc" ; f 2 + 10 f 12 2 3 { [a b] a b + 5

In XY 2.0, quotations and lists automatically open:

2 3 4 ] [2 3 4] 2 3 + ) [5]

Flatness

XY 2.0 is a flat version of XY.

The idea is due to Billy Tanksley: a language is flat if, for any sequence S of program tokens, all partitions of S are semantically equivalent. For example,

2 [3 +] i 5 ; f 2 [ 3 ; ; g + ] i ; f g 5

Flat + Markers

In XY 2.0a, [ and ] are functions. [ places a marker [[ on the stack, and ] enlists everything back to the last [[ and places it on the queue.

[ \[ [ [ [ ] [ \] ] [ ] ]

Also note:

; h 1 2 3 ; [ h ] [1 2 3] [ \h ] [h]

The punctuation symbols [ ] ( ) `[ `( are promoted to the core. { and ; are redefined. `{ is added to the vocabulary as a core primitive. Spaces around these primitives may be elided.

[ quote [X Y] -> [X^[[ Y] `[ quote function [X Y] -> [X^`[[ Y] places the quotation marker [[ or `[[ on the stack

] quoted [X^[[^Z Y] -> [X [Z]^Y] [X^`[[^Z Y] -> [X [Z]^`^Y] ] finds everything back to the last [[ or `[[ and places it it (as a list) on the queue, followed by ` if `[[

( list [X Y] -> [X^(( Y] `( list function [X Y] -> [X^`(( Y] places the list marker (( or `(( on the stack

) listed [X^((^Z Y] -> [X [Z]^/^infra^Y] [X^`((^Z Y] -> [X [Z]^/^infra^`^Y] ) finds everything back to the last (( or `(( and places it (as a list) on the queue, followed by [/]^infra or [/]^infra^` if `(

{ pattern [X Y] -> [X^{{ Y] `{ pattern function [X Y] -> [X^`{{ Y] places the pattern marker {{ or `{{ on the stack

} patterned [X^{{^Z Y] -> [X {Z}^Y] [X^`{{^Z Y] -> [X {Z}^`^Y] } finds everything back to the last {{ or `{{ and places its analysis on the queue, followed by ` if `{{

In addition to the new core words, flat XY reserves the following symbols for internal use:

[[ quotation marker `[[ quotation function marker (( list marker `(( list function marker {{ pattern marker `{{ pattern function marker ;; definition marker \; definition quotation

Since blanks around [ ] ( ) { } and ; can be elided, it is not possible to directly produce the marker symbols. For example,

\]] []]

because \] pushes ] onto the stack, then ] creates the list []]. ]] cannot be a word in XY.

Note that strings are tokens. For example, you may not say:

; f "this is ; ; g a string" ; f g "this is a string"

Flat - Markers

XY can be flattened without the use of special marker symbols. In XY 2.0b, we treat [ and similar symbols as literals. They all have the semantic rule:

[X z^Y] -> [X^z Y]

Then, for example, ] finds everything on the stack up to the last [, enlists it, and places it on the queue. The only restriction in the markerless implementation is that [ and similar symbols cannot themselves be quoted.

Flat - Markers + Lazy

The major drawback of flatness in XY is that quotations are no longer fully lazy. To produce the unit list [a] where a has been defined to have some value we must say [\a]. Alternatively, we may require that, if [ (or the marker produced by [) is on the stack, and we want the value of a word to be placed on the stack, then we must explicitly say so. For example,

; a 1 2 3 ; a 1 2 3 [ a ] [a] [ a / ] [1 2 3]

Again:

; f [1 2 ; ; g 3 4 5] ; f g [1 2 g / [1 2 3 4 5]

In order to combine flatness and lazy quotation, we must extend the core word / by giving it some of the functionality of .g:

/ use [X^z Y] -> [X z^Y] / prepends to the queue the value of the item found at the top of the stack. 10 20 [2 3 +] / 30 40 50 10 20 5 30 40 50 ; foo 2 + ; 3 \ foo / 5 / is similar to Joy's "i" combinator.

This is implemented in the preferred version, XY 2.0.

In this version, [ ( { `[ `( and `{ are literals, with the semantic rule [X z^Y] -> [X^z Y]. ; is a literal if ; is not on the stack. If any one of these symbols is on the stack, all symbols other than ] ) } ; \ and / are treated as literals.

] quote [X^[^Z Y] -> [X [Z]^Y] [X^`[^Z Y] -> [X [Z]^`^Y] ] finds everything back to the last [ or `[ and places it it (as a list) on the queue, followed by ` if `[

) list [X^(^Z Y] -> [X [Z]^/^infra^Y] [X^`(^Z Y] -> [X [Z]^/^infra^`^Y] ) finds everything back to the last ( or `( and places it (as a list) on the queue, followed by [/]^infra or [/]^infra^` if `(

} pattern [X^{^Z Y] -> [X {Z}^Y] [X^`{^Z Y] -> [X {Z}^`^Y] } finds everything back to the last { or `{ and places its analysis on the queue, followed by ` if `{

For and against flatness

I am aware of three arguments for the desirability of flatness:

Theoretical smoothness: All words are functions. Blank is the only punctuation symbol.

Finer-grained factoring: Words may be factored on any boundary, including those which occur within quotations, lists, patterns, and word-definition. Flatness obviates the need for a macro-facility.

Simplified implementation: The parser reduces to cut-on-word-boundaries and cast-word-to-type.

Against flatness, there are two objections:

The logic of the evaluator is more complex. It now has to know whether it is in "list formation" mode or not.

Factoring within what would otherwise be punctuation symbols leads to confusing code. At best, fine-grained factoring isn't useful.

Revisions

XY 0 has quotations and shuffle notation, but not lists or patterns. Shuffle-symbols start and end with { and }, use blank for delimiter, have [ and ] for < and >, and ( and ) for _x and _y. For example,

10 20 30 {x ([x])} 40 50 10 20 [10 20] [30] [40 50] 40 50

XY 0 adds two primitives to the core (this is experimental):

( stack* [X Y] -> [X^[X] Y] ( quotes the stack: 1 2 3 ( 4 5 6 1 2 3 [1 2 3] 4 5 6

) queue* [X Y] -> [X^[Y] Y] ) quotes the queue: 1 2 3 ) 4 5 6 1 2 3 [4 5 6] 4 5 6

For example,

10 ) #: [20 30] 40 10 2 [20 30] 40

Note on the asymmetry of stack and queue

[]-parsing is done at run-time. The queue* primitive (and the use of ) within shuffle-symbols) currently parses the stream of tokens before pushing the queue onto the stack. This decision keeps XY 0 consistent with XY 1, but the alternative is certainly defensible. For example, if []-parsing is deferred, then in

10 ) #: [20 30] 40 10 6 [20 30] 40

#: sees six tokens - [ 20 30 ] 40 - instead of two objects - [20 30] 40. In theory, the alternate approach, perhaps augmented by primitive words for grouping, might allow the programmer additional opportunities to write new parsing words.

Links

XY 2.0

XY 2.0 (flat): http://www.nsl.com/k/xy/xy2.k

XY 2.0a (flat, markers, semi-lazy): http://www.nsl.com/k/xy/xy2a.k

XY 2.0b (flat, semi-lazy): http://www.nsl.com/k/xy/xy2b.k

Modified scripts for 2.0a,b: http://www.nsl.com/k/xy/xy2.xy

XY 0.0

XY 0.0: http://www.nsl.com/k/xy/xy0.k

XY 0 definitions: http://www.nsl.com/k/xy/xy0.xy

MvT reproducing programs: http://www.nsl.com/k/xy/eg/rep.xy

XY 1.7

Acknowledgements

Thanks to the usual suspects on the concatenative languages mail group: Chris Double, Slava Pestov, Billy Tanksley, Chris Cogan, and Manfred von Thun.

Late thanks to Alex Belanger for finding and fixing typos and inconsistencies, and for picking up the ideas in this paper and carrying them forward.

Copyright (c) 2004, Stevan Apter. All Rights Reserved.