Strings

Version 1.5 of uLisp adds support for strings. A string can consist of an arbitrary number of ASCII characters, and the storage required is 2 bytes per character plus four or five bytes. The following section describes how strings are implemented.

16th April 2017: This description has been updated to match uLisp 1.8.

11th August 2017: Note that in the 32-bit versions of uLisp each object consists of two 4-byte cells, and four ASCII characters are packed into each object.

String representation

As with all objects in uLisp, the head of a string object consists of two 2-byte cells. Strings are identified by an '8' in the left cell, and the right cell holds a pointer to the characters in the string.

The type enum is now:

enum type {ZERO=0, SYMBOL=2, NUMBER=4, STREAM=6, STRING=8, PAIR=10};

A null string just has NULL in the right cell.

In a string of one or more characters the right cell points to a linked list of objects, one object for each pair of characters. This avoids the need to have a separate storage area for strings, and allows strings to be garbage collected in the same way as other objects.

For example, creating a string with:

(defvar str "hello")

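would give a head object whose right cell points to a chain of three objects holding the character pairs "he", "ll", and "o" (the last object is only half full).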

In a string the cells are linked together using car pointers, rather than the usual cdr pointers, so that the characters won't be affected when the top bit of the car cell is marked during garbage collection.

Garbage collection

An additional test in markobject() handles the garbage collection of strings:

void markobject (object *obj) {
  MARK:
  if (obj == NULL) return;
  if (marked(obj)) return;
  object* arg = car(obj);
  unsigned int type = obj->type;
  mark(obj);
  if (type >= PAIR || type == ZERO) { // cons
    markobject(arg);
    obj = cdr(obj);
    goto MARK;
  }
  if (type == STRING) {
    obj = cdr(obj);
    while (obj != NULL) {
      arg = car(obj);
      mark(obj);
      obj = arg;
    }
  }
}

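This simply steps along the string until it reaches a NULL pointer, marking each character object as it goes. Each car pointer is read before the object is marked, because marking sets the top bit of the car cell.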

Reading a string

The utility function readstring() reads in a string up to a specified delimiter and returns the string object: