SIOD: Scheme in One Defun

This document is © 1996-2007 by George J. Carrette, All Rights Reserved.

File          Format            Description
siod.tgz      gzip tar          Source code, all versions
siod.zip      INFO-ZIP archive  Source code, all versions
winsiod.html  document          Windows binaries and unpacking instructions


Apology, discussion and motivation

How can we possibly hope to make use of any of this stuff in work we are doing in other courses or in our jobs?

The problem being that both the commercial and non-commercial lisp offerings at the time seemed to want to take over your entire programming environment if you wanted to use lisp at all. A completely satisfactory solution to this problem will not be fully available again until a lisp system is written that uses the same compiler back-end, debugging and runtime support as all the other languages on a popular machine. But I digress.

Knowing that the use of lisp didn't have to be so intrusive, but having very little practical evidence at hand to actually prove the fact, I decided one day, while I was keeping the laboratory section room open for course EK201, to sit down and implement a demonstration made up of a simple cons, arithmetic, read, eval, and print in straightforward C programming style, where the garbage collector was stop-and-copy and could only run in the context of the toplevel loop. Defining and executing fib resulted in a code coverage of over 95% of the lines of the C program. Hence SIOD, Scheme in One Day, a name borrowed from a previous Scheme interpreter I had done to test the Lispmachine microcode compiler: Scheme in One Defun.

The motivation behind SIOD remains a small footprint, in every sense of the word, at runtime, at compile time, and in cognitive attention required to understand how the system works enough to be able to extend it as well as the author would have done the work himself.

About eight years have passed since that initial release. It has been possible to add a feature or two without contributing to the cause of software bloat, with the code segment of the libsiod shared library remaining under 75K bytes on a prototypical comparison machine like a VAX. Furthermore, as the richness of the C runtime library available on most systems has improved over time, SIOD remains a useful kind of glue to have in a software engineer's toolbox.

Please forgive the lack of full compliance with IEEE or R4RS standards. Perhaps one of these days.

Building from Sources

Make                           OS
Digital Equipment Corporation  DIGITAL UNIX (OSF/1)
Linux                          Linux
Hewlett-Packard Company        HP-UX
Sun Microsystems               Solaris
Silicon Graphics               IRIX
Digital Equipment Corporation  OpenVMS
Microsoft                      Windows 95

Release Notes

1.0 April 1988. Initial release.

1.1 April 1988. Macros, predicates, load. Better number recognizer in read, provided siod.scm file.

1.2 April 1988. Name changes as requested by JAR@AI.AI.MIT.EDU, plus some bug fixes.

1.3 May 1988. Changed env to use frames instead of alist. define now works properly.

1.4 November 1989. This release is functionally the same as release 1.3 but has been remodularized in response to people who have been incorporating SIOD as an interpreted extension language in other systems.

1.5 November 1989. Added the -g flag to enable mark-and-sweep garbage collection. The default is stop-and-copy. (Note: the default was later changed to mark-and-sweep.)

2.0 December 1989. Set_Repl_Hooks, catch & throw.

2.1 December 1989. Additions to SIOD.SCM: Backquote, cond.

2.2 December 1989. User Type extension. Read-Macros. (From C-programmer level).

2.3 December 1989. save-forms. load with argument t, comment character, faster intern. -o flag gives obarray size. default 100.

2.4 April 1990. speed up arithmetic and the evaluator. fixes to siod.scm. no_interrupt around calls to C I/O. gen_readr.

2.5 September 1990. numeric arrays in siod.c

2.6 March 1992. remodularize .h files, procedure prototypes. gc, eval, print hooks now table-driven.

2.7 March 1992. hash tables, fasload.

2.8 April 1992. bug fixes.

2.9 August 1992. added trace.c, fseek, ftell, some fixes.

3.0 May 1994. Windows NT port. some cleanups. SQL support, more string stuff. Heap management flexibility, default to mark-and-sweep, suggestions by some code reviewers for comp.sources.unix.

3.1x June 1995. Verbose flag to suppress file loading and error messages, along with enhanced command-line interface made siod useful for writing scripts. Support for more C library and unix programming functionality, including regular expressions and sockets. Debugging hooks *eval-history-ptr*.

3.2 June 1996. shared library modularization, dynamic linking interface for extensions. documentation in html.h Lexical closure support at c-programmer level. Arithmetic cleanup. parser:XXXXX extension. command "compiler" interface.

3.4 Feb 1997. Windows NT/95 cleanup.

3.5 5-MAY-97 fixes, plus win95 "compiler" to create exe files.

3.6 5-APR-2007. Upload to CodePlex.Com, port to Visual C++ 2005 Express Edition.

What is Scheme?

QBASIC.EXE input: PRINT (1 + 2) * (3 + 4)
SIOD input:       (* (+ 1 2) (+ 3 4))
SIOD result:      21

QBASIC.EXE input: PRINT "HELLO-" + "BUDDY"
SIOD input:       (string-append "HELLO-" "BUDDY")
SIOD result:      "HELLO-BUDDY"

QBASIC.EXE input: FUNCTION f(x)
                    IF x < 2 THEN
                      f = x
                    ELSE
                      f = f(x - 1) + f(x - 2)
                    END IF
                  END FUNCTION
SIOD input:       (define (f x)
                    (if (< x 2)
                        x
                        (+ (f (- x 1)) (f (- x 2)))))
SIOD result:      #<CLOSURE>

QBASIC.EXE input: PRINT f(20)
SIOD input:       (f 20)
SIOD result:      6765

Reference Section for built-in procedures

U

(%%%memref address)

(%%closure env code)

(%%closure-code closure)

(%%closure-env closure)

(%%stack-limit amount silent)

(* x1 x2 ...)

*after-gc*

(set! *after-gc* '(if (< (gc-info 4) 5000) (allocate-heap)))

*args*

(*catch tag body ...)

*env*

*eval-history-ptr*

*pi*

*plists*

(*throw tag value)

*traced*

(+ x1 x2 ...)

(- x1 x2 ...)

(/ x1 x2 ...)

(< x y)

(<= x y)

(= x y)

(> x y)

(>= x y)

(F_GETLK fd ltype whence start len)

U

(F_SETLK fd ltype whence start len)

U

(F_SETLKW fd ltype whence start len)

U

(abs x)

(access-problem? filename method)

U

(if (access-problem? "x.y" "r") (error "can't read x.y"))

(acos x)

(alarm seconds flag)

U

(allocate-heap)

(and form1 form2 form3 ...)

(append l1 l2 l3 l4 ...)

(append '(a b) '(c d)) => (a b c d)

(apply function arglist)

(apropos substring)

(aref array index)

(array->hexstr string)

(aset array index value)

(ash value bits)

(asin x)

(ass key alist function)

(define (assq x alist) (ass x alist eq?))

(assoc key alist)

(assq key alist)

(assv key alist)

(atan x)

(atan2 x y)

(base64decode x)

(base64encode x)

(begin form1 form2 ...)

(benchmark-eval nloops exp env)

(benchmark-funcall1 nloops f arg1)

(benchmark-funcall2 nloops f arg1 arg2)

(bit-and x y)

(bit-not x)

(bit-or x y)

(bit-xor x y)

(butlast x)

(bytes-append x1 x2 ...)

(caaar x)

(caadr x)

(caar x)

(cadar x)

(caddr x)

(cadr x)

(car x)

(cdaar x)

(cdadr x)

(cdar x)

(cddar x)

(cdddr x)

(cddr x)

(cdr x)

(chdir path)

U

(chmod path mode)

U

(chmod f (encode-file-mode (append '(XUSR XGRP XOTH) (cdr (assq 'mode (stat f))))))

(chown path uid gid)

U

(closedir stream)

U

(cond clause1 clause2 ...)

(predicate-expression form1 form2 ...)

(cons x y)

(cons 1 (cons 2 (cons 3 ())))

(1 2 3)

(cons-array dimension kind)

(copy-list x)

(cos x)

(cpu-usage-limits soft-limit hard-limit)

U

(crypt key salt)

U

(current-resource-usage kind)

U

(datlength data ctype)

(datref data ctype index)

(decode-file-mode x)

(define subform1 subform2)

(define variable value)

(define (procedure-name arg1 arg2 ...) form1 form2 ...)

(delete-file path)

(delq element list)

(encode-file-mode list)

(encode-open-flags list)

U

(endpwent)

U

(env-lookup identifier environment)

(eof-val)

(eq? x y)

(equal? x y)

(eqv? x y)

errobj

(error message object)

(define (error message object)
  (if (> (verbose) 0)
      (writes nil "ERROR: " message "\n"))
  (set! errobj object)
  (*throw 'errobj (cons message object)))

(eval expression environment)

(eval (read-from-string "(+ 1 2)"))

(exec path args env)

U

(exit status)

U

(exp x)

(fast-load path noeval-flag)

(fast-print object state)

(fast-read state)

(fast-save filename forms nohash-flag comment-string)

(fchmod filedes mode)

U

(fclose stream)

U

(fflush stream)

U

(file-times path)

U

(first x)

(fmod x y)

U

(fnmatch pattern string flags)

U

(fopen path mode)

U

(fork)

U

(fread size-or-buffer stream)

U

(fseek file offset direction)

U

(fstat stream)

U

(ftell stream)

U

(fwrite data stream)

(gc)

(gc-info item)

Item  Value
0     true if copying gc, false if mark and sweep
1     number of active heaps
2     maximum number of heaps
3     number of objects per heap
4     amount of consing of objects before next gc

(gc-status [flag])

(get object key)

(getc stream)

U

(getcwd)

U

(getenv name)

U

(getgid)

U

(getgrgid gid)

U

(getpass prompt)

U

(getpgrp)

U

(getpid)

U

(getppid)

U

(getpwent)

U

(getpwnam username)

U

(getpwuid)

U

(gets stream)

(getuid)

U

(gmtime value)

U

(hexstr->bytes str)

(href table key)

(define (href table key) (cdr (assoc key (aref table (sxhash key (length table))))))

(hset table key value)

(html-encode str)

(if predicate-form true-form false-form)

(intern str)

(kill pid sig)

U

(lambda (arg1 arg2 ...) form1 form2 ...)

(mapcar (lambda (x) (* x x)) '(1 2 3))

(1 4 9)

(larg-default list index default-value)

(last list)

(last-c-error)

U

(lchown path owner group)

U

(length object)

(let (binding1 binding2 ...) form1 form2 ...)

(let ((x 10) (y 20)) (+ x y))

(let* (binding1 binding2 ...) form1 form2 ...)

(let* ((x 10) (y (+ x 10))) (+ x y))

(letrec (binding1 binding2 ...) form1 form2 ...)

(link existing-file entry-to-create)

U

(list item1 item2 ...)

(lkey-default list index default-value)

(load fname noeval-flag search-flag)

(load-so fname init_fcn)

(localtime value)

U

(log x)

(lref-default list index default-fcn)

(lstat path)

U

(make-list length element)

(mapcar fcn list1 list2 ...)

(max x1 x2 ...)

(md5-final state)

(define (md5 str) (let ((s (md5-init))) (md5-update s str) (array->hexstr (md5-final s))))

(md5-init)

(md5-update state string length)

(member key list)

(memq key list)

(memv key list)

(min x1 x2 ...)

(mkdatref ctype ind)

(mkdir path mode)

U

(mktime alist)

U

(nconc l1 l2)

(nice increment)

U

nil

(not x)

(nreverse list)

(nth index list)

(null? x)

(number->string x base width precision)

(number? x)

(opendir path)

U

(or form1 form2 ...)

(os-classification)

(pair? x)

(parse-number str)

(pclose stream)

U

(popen command type)

U

(pow x y)

(prin1 object stream)

(print object stream)

(print-to-string object string no-trunc-flag)

(prog1 form1 form2 form3 ...)

(putc char stream)

U

(putenv setting)

U

(putprop object value key)

(puts string stream)

U

(qsort list predicate-fcn access-fcn)

Example                        Result
(qsort '(3 1 5 4 2) <)         (1 2 3 4 5)
(qsort '((3 a) (2 b)) < car)   ((2 b) (3 a))

(quit)

(quote x)

(rand modulus)

(random modulus)

(read stream)

(read-from-string string)

(readdir directory-stream)

(readline stream)

(define (load-spread-sheet filename)
  (if (>= (verbose) 2)
      (writes nil ";; loading spread sheet " filename "\n"))
  (let ((result nil)
        (line nil)
        (f (and (not (equal? filename "-")) (fopen filename "r"))))
    (while (set! line (readline f))
      (set! result (cons (strbreakup line "\t") result)))
    (and f (fclose f))
    (nreverse result)))

(readlink path)

U

(realtime)

(rename from-path to-path)

U

(require path)

(require-so path)

(require-so (so-ext 'name))

(rest x)

(reverse x)

(rld-pathnames)

(rmdir path)

U

(runtime)

(save-forms filename forms how)

(sdatref spec data)

(set! variable value)

(set-car! cons-cell value)

(set-cdr! cons-cell value)

(set-eval-history length circular-flag)

(define (fib x) (if (< x 2) x (+ (fib (- x 1)) (fib (- x 2)))))
(set-eval-history 200)
(fib 10)
(mapcar (lambda (x) (if (pair? x) (car x) x)) *eval-history*)

(set-symbol-value! symbol value env)

(setprop obj key value)

(setpwent)

U

(setuid x)

U

(sin x)

(siod-lib)

(sleep n)

(so-ext path)

(sqrt x)

(srand seed)

U

(srandom seed)

U

(stat path)

(strbreakup string sep)

(strbreakup "x=y&z=3" "&") => ("x=y" "z=3")

(strcat str1 str2)

U

(strcmp str1 str2)

U

(strcpy str1 str2)

U

(strcspn str indicators)

U

(strftime format-string alist)

(strftime "%B" '((mon . 3))) => "April"

U

(string->number str radix)

(string-append str1 str2 str3 ...)

(string-dimension str)

(string-downcase str)

(string-length str)

(string-lessp str1 str2)

(string-search key str)

(string-trim str)

(string-trim-left str)

(string-trim-right str)

(string-upcase str)

(string? x)

(strptime str format alist)

(cdr (assq 'mon (strptime "March" "%B"))) => 2

U

(strspn str indicators)

(define (string-trim-left x) (substring x (strspn x " \t")))

U

(subset pred-fcn list)

(subset number? '(1 b 2 c)) => (1 2)

(substring str start end)

(substring-equal? str str2 start end)

(swrite stream table form)

(sxhash data modulus)

(symbol-bound? symbol env)

(symbol-value symbol env)

(symbol? x)

(symbolconc arg1 arg2 ...)

(symlink contents-path link-path)

U

(system arg1 arg2 ...)

U

t

(tan x)

(the-environment)

(trace fcn1 fcn2 ...)

(trunc x)

(typeof x)

(unbreakupstr list sep)

(define (save-spread-sheet filename data)
  (if (>= (verbose) 2)
      (writes nil ";; saving spread sheet " filename "\n"))
  (let ((result data)
        (f (and (not (equal? filename "-")) (fopen filename "w"))))
    (while result
      (writes f (unbreakupstr (car result) "\t") "\n")
      (set! result (cdr result)))
    (and f (fclose f))))

(ungetc char stream)

(unix-ctime x)

U

(unix-time)

U

(unix-time->strtime x)

(unlink path)

U

(untrace fcn1 fcn2 ...)

(url-decode str)

(url-encode str)

(utime path modification-time access-time)

U

(verbose arg)

Verbose Level  Effect on System
0              No messages.
1              Error messages only.
2              Startup messages, prompts, and evaluation timing.
3              File loading and saving messages.
4 (default)    Garbage collection messages.
5              Display of data loaded from files and fetched from databases.

(wait pid options)

U

(while pred-form form1 form2 ...)

(writes stream data1 data2 data3 ...)

Reference Section for extension-provided procedures

Extension: acct

(decode_acct string)

(decode_tacct string)

UTMP_FILE

WTMP_FILE

(endutent)

U

(getutent)

U

(setutent)

U

(utmpname path)

U

Extension: gd

U

gdBrushed

gdFont.h

gdFont.w

gdFontGiant

gdFontLarge

gdFontMediumBold

gdFontSmall

gdFontTiny

gdImageArc

gdImageChar

gdImageCharUp

gdImageColorAllocate

gdImageColorClosest

gdImageColorExact

gdImageColorTransparent

gdImageCreate

gdImageCreateFromGif

gdImageCreateFromXbm

gdImageFill

gdImageFillToBorder

gdImageFilledPolygon

gdImageFilledRectangle

gdImageGif

gdImageGifmem

gdImageInterlace

gdImageLine

gdImagePolygon

gdImageRectangle

gdImageSetPixel

gdImageString

gdImageStringUp

gdPoint

gdPoint.x

gdPoint.y

gdStyled

gdStyledBrushed

gdTiled

gdTransparent

Extension: ndbm

U

DBLKSIZ

DBM_INSERT

DBM_REPLACE

PBLKSIZ

(dbm_close handle)

(dbm_delete handle key)

(dbm_dirfno handle)

(dbm_error handle)

(dbm_fetch handle key)

(dbm_firstkey handle)

(dbm_nextkey handle)

(dbm_open path open-flags mode)

(dbm_pagfno handle)

(dbm_rdonly handle)

(dbm_store handle key data flag)

Extension: parser_pratt

(pratt_read_token buffer token-table stream)

Example, hello.scm

#!/usr/local/bin/siod -v01,-m2 -*-mode:text;parser:pratt-*-
main() :=
 {writes(nil,"Hello Scheme World.\n");
  fflush(nil);
  writes(nil,"fib(20) = ",fib(20),"\n");
 }
$
fib(x) := if x < 2 then x else fib(x-1) + fib(x-2)
$

Extension: regex

U

(regcomp pattern flags)

(regerror code handle)

(regexec handle string flags)

Extension: sql_oracle

Extension: sql_rdb

Extension: sql_sybase

(sybase-close [handle])

(sybase-execute [handle] string cmd-type key1 arg1 key2 arg2 ...)

(sybase-open key1 arg1 key2 arg2 ...)

(sybase-status [handle])

Extension: ss

U

(get-protocol-name number)

(get-service-name port-number)

(gethostbyaddr x)

(gethostbyname name)

(mapcar inet_addr (cdr (assq 'addr_list (gethostbyname (gethostname)))))

(gethostname)

(inet_addr str)

(s-accept stream)

(s-close stream)

(s-drain stream)

(s-force-output stream)

(s-getc stream)

(s-gets stream)

(s-open address port listen-flag)

(s-putc char stream)

(s-puts string stream)

(s-read size-or-buffer stream)

(s-read-sexp stream)

(gethostid)

(wsa-data)

Extension: tar

*tar-header-size*

(checksum-tar-header string)

(decode-tar-header string)

Command interfaces and some scripts provided

Command           Purpose
siod              the interpreter
csiod             command linker for siod scripts
cp-build          a file copy command with versioning and audit trail
ftp-cp            passive-mode ftp copy
ftp-put           passive-mode ftp put with rename
http-get          command-line http client
http-stress       stress an http server
proxy-server      serializing, logging, http proxy server
snapshot-dir      create a snapshot of a directory hierarchy.
snapshot-compare  compare hierarchy snapshots, with options.

Some scheme coded library modules

Name              Purpose
cgi-echo.scm      example cgi script, echo the environment
find-files.scm    works like the unix find command, but provides lisp data.
fork-test.scm     example use of fork
hello.scm         an example command using infix syntax
http-server.scm   useful as a socket example
http-stress.scm   http client with stress features
http.scm          more http client examples
ftp.scm           support for file transfer protocol
maze-support.scm  cgi script example, provides a run-maze subroutine
parser_pratt.scm  interface to infix language parser
pop3.scm          A pop3 client
pratt.scm         infix language parser
selfdoc.scm       create a table of built-in procedures
siod.scm          mostly obsolete collection of utility subroutines
smtp.scm          smtp client subroutines
sql_oracle.scm    utilities for oracle database client
sql_rdb.scm       utilities for rdb database client
sql_sybase.scm    utilities for sybase database client
piechart.scm      a CGI script that returns a piechart as a GIF

Garbage Collection

By the time SIOD had been out for a year there had been enough complaints about the lack of a fully available GC that I was motivated to utilize a stack heuristic, since I had no intention of maintaining any explicit book-keeping code in the source. The published arguments in favor of the conservative approach described by Hans Boehm of Xerox PARC then reduced this design decision to a no-brainer, and his implementation suggested the use of setjmp as a sufficiently portable way for C code to get at the machine register set without introducing assembly language. This added only about 300 bytes of VAX instructions to the size of the SIOD runtime system.

There are two storage management techniques which may be chosen at runtime by specifying the -g argument flag.

-g1 is stop-and-copy. This is the simplest and most portable implementation. GC is only done at toplevel.

-g0 (the default) is mark-and-sweep. GC is done at any time, required or requested. However, the implementation is not as portable.

If you get strange errors on a machine architecture not listed then you may be forced to use -g1 until you investigate and contact the author for advice.

Stop and Copy

The real tricks in handling garbage collection are (in a copying gc):

keeping track of locations containing objects

parsing the heap (in the space scanning)

The procedure gc_protect is called once (e.g. at startup) on each global location which will contain a lisp object.

That leaves the stack. The belief is that if we had chosen not to use the argument and return-value passing mechanism provided by the C-language implementation (also known as the "machine stack" and "machine procedure calling mechanism"), this lisp would be larger, slower, and rather more difficult to read and understand. Furthermore it would be considerably more painful to *add* functionality in the way of SUBR's to the implementation.

Aside from writing a very machine and compiler specific assembly language routine for each C-language implementation, embodying assumptions about the placement choices for arguments and local values, etc, we are left with the following limitation:

YOU CAN GC ONLY AT TOP-LEVEL

However, this fits in perfectly with the programming style imposed in many user interface implementations including the MIT X-Window Toolkit. In the X Toolkit, a callback or work procedure is not supposed to spend much time implementing the action. Therefore it cannot have allocated much storage, and the callback trampoline mechanism can post a work procedure to call the garbage collector when needed.

Our simple object format makes parsing the heap rather trivial. In more complex situations one ends up requiring object headers or markers of some kind to keep track of the actual storage lengths of objects and what components of objects are lisp pointers.
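As a rough illustration of the two tricks named above (tracking the locations that hold objects, and parsing newspace cell by cell), here is a self-contained toy copying collector in C. It is a sketch under stated assumptions and not SIOD's code: the cell layout, the use of array indices instead of pointers, and the names protect, new_cell, relocate and collect are all invented for this example, with protect standing in for the role that gc_protect plays.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CELLS 32
#define MAX_ROOTS  8
#define NIL        (-1)

/* Every object has the same shape, so newspace can be parsed cell by cell. */
typedef struct {
    int value;    /* non-pointer payload */
    int car, cdr; /* indices of other cells in the same space, or NIL */
    int forward;  /* NIL, or the cell's new index once it has been copied */
} cell;

static cell space_a[HEAP_CELLS], space_b[HEAP_CELLS];
static cell *heap = space_a;   /* space we allocate from */
static cell *other = space_b;  /* space we copy into during a collection */
static int heap_used = 0;
static int *roots[MAX_ROOTS];  /* registered global locations */
static int nroots = 0;

/* Register a location that may hold a cell index (hypothetical analogue
   of calling gc_protect once per global location). */
void protect(int *location) {
    assert(nroots < MAX_ROOTS);
    roots[nroots++] = location;
}

int new_cell(int value, int car, int cdr) {
    int idx;
    assert(heap_used < HEAP_CELLS);
    idx = heap_used++;
    heap[idx].value = value;
    heap[idx].car = car;
    heap[idx].cdr = cdr;
    heap[idx].forward = NIL;
    return idx;
}

/* Copy one cell to newspace unless it already moved; return its new index. */
static int relocate(int idx, int *to_used) {
    if (idx == NIL) return NIL;
    if (heap[idx].forward == NIL) {
        other[*to_used] = heap[idx];
        heap[idx].forward = (*to_used)++;
    }
    return heap[idx].forward;
}

/* Collect at "top level": only the registered roots are treated as live.
   Returns the number of cells that survived. */
int collect(void) {
    int to_used = 0, scan, i;
    cell *tmp;
    for (i = 0; i < nroots; ++i)
        *roots[i] = relocate(*roots[i], &to_used);
    for (scan = 0; scan < to_used; ++scan) {   /* parse newspace */
        other[scan].car = relocate(other[scan].car, &to_used);
        other[scan].cdr = relocate(other[scan].cdr, &to_used);
    }
    tmp = heap; heap = other; other = tmp;     /* flip the spaces */
    heap_used = to_used;
    return to_used;
}
```

For instance, after `int root = NIL; protect(&root);`, allocating a few unreferenced cells and then storing a two-cell list in root, a call to collect() returns 2 and root is updated to the list's new position. Because only registered locations are treated as live, collect must not run while C locals still hold cell indices, which is the top-level restriction in miniature.
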

Because of the usefulness of strings, they were added by default into SIOD 2.6. The implementation requires a hook that calls the C library memory free procedure when an object is in oldspace and never got relocated to newspace. Obviously this slows down the stop-and-copy GC, and removes one of the usual advantages it has over mark-and-sweep.

Mark and Sweep

Another advantage of the mark-and-sweep storage management technique is that only one heap is required.

The main disadvantages are:

start-up cost to initially link freelist. (can be avoided by more general but slower NEWCELL code).

does not COMPACT or LOCALIZE the use of storage. This is poor engineering practice in a virtual memory environment.

the entire heap must be looked at, not just the parts with useful storage.

In general, mark-and-sweep is slower in that it has to look at more memory locations for a given heap size; however, the heap size can be smaller for a given problem being solved. More complex analysis is required when READ-ONLY, STATIC, storage spaces are used (which we do not support, currently). Additionally the most sophisticated stop-and-copy storage management techniques take into account considerations of object usage temporality.

The technique assumes that all machine registers the GC needs to look at will be saved by a setjmp call into the save_regs_gc_mark data, and that everything else is on the C runtime stack. Hence we have some assumptions that impact portability.

Porting

If your system or C runtime needs to poll for the interrupt signal mechanism to work, then define INTERRUPT_CHECK to be something useful.

The STACK_LIMIT and STACK_CHECK macros may need to be conditionized. They currently assume stack growth downward in virtual address. The subr (%%stack-limit setting non-verbose) may be used to change the limits at runtime.
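To make the growth-direction assumption concrete, the following sketch computes a limit and performs the check for either direction. The function names and the grows_down parameter are hypothetical and are not SIOD's STACK_LIMIT or STACK_CHECK macros; addresses are handled as uintptr_t values so that the comparison is well defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the address beyond which the stack counts as exhausted.
   start:      address recorded near the base of the stack at startup
   amount:     bytes of stack we allow ourselves to use
   grows_down: nonzero if the stack grows toward lower addresses */
uintptr_t stack_limit(uintptr_t start, size_t amount, int grows_down) {
    assert(amount > 0);
    return grows_down ? start - amount : start + amount;
}

/* Nonzero when `here` (the address of some local in the current frame)
   has passed the limit in the direction of stack growth. */
int stack_exhausted(uintptr_t here, uintptr_t limit, int grows_down) {
    return grows_down ? here < limit : here > limit;
}
```

In a real program `here` would be obtained as `(uintptr_t)&some_local` inside the function doing the check; on a port where the stack grows upward, only the grows_down argument changes.
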

The stack and register marking code used in the mark-and-sweep GC is unlikely to work on machines that do not keep the procedure call stack in main memory at all times. It is assumed that setjmp saves all registers into the jmp_buf data structure. If your target machine architecture is radically different, such as using linked procedure call frames of some kind, not organized as a stack, then it would be best if you could find vendor-supported routines for walking these frames, such as would be utilized by a debugger. The mark_locations procedure can then be invoked multiple times with the proper start and end addresses.

If the stack is not always aligned (in LISP-PTR sense) then the gc_mark_and_sweep procedure will not work properly unless steps are taken to work around the problem.

Example, assuming a byte addressed 32-bit pointer machine:

stack_start_ptr: [LISP-PTR(4)]
                 [LISP-PTR(4)]
                 [RANDOM(4)]
                 [RANDOM(2)]
                 [LISP-PTR(4)]
                 [LISP-PTR(4)]
                 [RANDOM(2)]
                 [LISP-PTR(4)]
                 [LISP-PTR(4)]
stack_end:       [LISP-PTR(4)]

As mark_locations goes from start to end it will get off proper alignment somewhere in the middle, and therefore the stack marking operation will not properly identify some valid lisp pointers.

Fortunately there is an easy fix to this. A more aggressive use of our mark_locations procedure will suffice.

For example, say that there might be 2-byte quantities inserted into the stack. Then use two calls to mark_locations, as in THINK_C on the Macintosh:

mark_locations(((char *)stack_start_ptr) + 0,((char *)&stack_end) + 0);
mark_locations(((char *)stack_start_ptr) + 2,((char *)&stack_end) + 2);

If we think there might be 1-byte quantities, then 4 calls are required:

mark_locations(((char *)stack_start_ptr) + 0,((char *)&stack_end) + 0);
mark_locations(((char *)stack_start_ptr) + 1,((char *)&stack_end) + 1);
mark_locations(((char *)stack_start_ptr) + 2,((char *)&stack_end) + 2);
mark_locations(((char *)stack_start_ptr) + 3,((char *)&stack_end) + 3);
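The effect of those repeated calls can be reproduced in a self-contained model. The sketch below is not SIOD's mark_locations: scan_range and scan_all_offsets are hypothetical names, the "heap" is just an address interval, and a byte buffer stands in for the stack. It only shows why a pointer stored at an odd offset is missed by a single pass and found once every possible offset is scanned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Conservative scan of [start, end) in pointer-sized steps. Returns how
   many words fall inside the assumed heap interval [lo, hi). memcpy is
   used so that unaligned starting offsets are read safely. */
size_t scan_range(const unsigned char *start, const unsigned char *end,
                  uintptr_t lo, uintptr_t hi) {
    size_t hits = 0;
    assert(start <= end);
    while ((size_t)(end - start) >= sizeof(uintptr_t)) {
        uintptr_t w;
        memcpy(&w, start, sizeof w);
        if (w >= lo && w < hi) ++hits;
        start += sizeof(uintptr_t);
    }
    return hits;
}

/* Repeat the scan at every offset that is a multiple of `granule` bytes,
   where granule is the smallest quantity the compiler may push
   (2 or 1 in the examples above). */
size_t scan_all_offsets(const unsigned char *start, const unsigned char *end,
                        size_t granule, uintptr_t lo, uintptr_t hi) {
    size_t hits = 0, off;
    assert(granule > 0);
    for (off = 0; off < sizeof(uintptr_t); off += granule) {
        if ((size_t)(end - start) < off) break;
        hits += scan_range(start + off, end, lo, hi);
    }
    return hits;
}
```

With a zeroed buffer that holds one in-range value at byte offset 2, scan_range starting at offset 0 reports no hits, while scan_all_offsets with a granule of 2 reports exactly one. The cost is proportional to the number of offsets scanned, which is why the text uses two calls for 2-byte quantities and four calls only when 1-byte quantities are possible.
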

Porting to tiny machines

Some things to consider:

changing the double data type in struct obj into a float.

changing mark and type to bytes, or bitfields in a byte.

reducing default sizes of heap, interned numbers, obarray_dim.

excluding modules such as slibu.c and md5.c

making sure constant strings are allocated by the linker into read-only data sections.

There are certainly other ways to organize data, avoiding the use of the C programming struct support, and utilizing carefully contrived macro definitions instead.

Writing extensions in the C programming language

There are three common reasons for wanting to write an extension to the system using the C programming language:

For runtime efficiency.

To take advantage of operating system, or other runtime library provided functionality.

To play games with evaluator semantics.

Some examples of the first class are the functions memq and nth; study them. These extensions are straightforward, and easy to debug from the C language debugger, with the functions err0, pr, and prp being provided to call back into the lisp runtime system from the C debugger.

Some examples from the second class are the ndbm and regex modules, and the support for commercial database client interfaces. In many cases it is convenient to define new scheme data types to encapsulate the complex state of an API. Study how to utilize allocate_user_tc, set_gc_hooks and set_print_hooks. Careful ordering of storage allocation and interrupt management are important. Also don't forget that most C programming API functions do not handle being longjmp'd through very well, so beware of how you handle callbacks and SIGINT.

Functions such as get_c_string, get_c_string_dim, get_c_long, get_c_double, and get_c_file are usually all you need to get at the data you require to get the job done. But beware of spoofing the garbage collector. For example, never do something equivalent to this:

{LISP x;
 char *z;
 x = strcons(100,NULL);
 z = get_c_string(x);
 /* no further references to x, but z is used */
}

Because there are no further references to x, the C compiler might very well reuse the location on the stack in which x resided. If there is any other consing then the garbage collector will go off at some point in the future inside this function, and it will free the memory pointed to by z. A potential example of this sort of thing is the built-in procedure lexec. In theory a C compiler might store envp and gcsafe in the same memory location. But of course for other reasons it is impossible for that to cause problems unless get_c_string was extended to invoke the evaluator in some cases.

If you want to play with evaluator semantics you need to study the leval function and perhaps the lapply function too. The tc_fsubr object is the conventional way to extend an evaluator, but the tc_msubr is more powerful and allows for a modular tail recursion. The set_eval_hooks function allows for arbitrary evaluation semantics when the first element of a form evaluates to a new datatype.

User Type Extension

a user_relocate, takes an object and returns a new copy.

a user_scan, takes an object and calls relocate on its subparts.

a user_mark, takes an object and calls gc_mark on its subparts or it may return one of these to avoid stack growth.

a user_free, takes an object to hack before it gets onto the freelist.

set_gc_hooks(type, user_relocate_fcn, user_scan_fcn, user_mark_fcn, user_free_fcn, &kind_of_gc);

The variable kind_of_gc should be a long. It will receive 0 for mark-and-sweep, 1 for stop-and-copy. Therefore set_gc_hooks should be called AFTER process_cla. You must specify a relocate function with stop-and-copy. The scan function may be NULL if your user types will not have lisp objects in them. Under mark-and-sweep the mark function is required but the free function may be NULL.
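The rules in the preceding paragraph (relocate required under stop-and-copy, mark required under mark-and-sweep, the others optional) can be pictured as a check made when hooks are stored in a per-type table. The sketch below is purely illustrative: gc_hooks, register_gc_hooks, run_mark_hook and identity_hook are hypothetical names, every hook is given the same simplified signature, and none of this is SIOD's actual set_gc_hooks implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_USER_TYPES 16   /* illustrative table size, not SIOD's */

typedef void *(*hook_fn)(void *obj);   /* simplified: one shape for all hooks */

typedef struct {
    hook_fn relocate, scan, mark, free_obj;
} gc_hooks;

static gc_hooks hook_table[MAX_USER_TYPES];

/* kind_of_gc: 0 for mark-and-sweep, 1 for stop-and-copy, as in the text.
   Stores the hooks and returns 1 if the required ones are present,
   otherwise leaves the table untouched and returns 0. */
int register_gc_hooks(int type, gc_hooks h, long kind_of_gc) {
    assert(type >= 0 && type < MAX_USER_TYPES);
    if (kind_of_gc == 1 && h.relocate == NULL) return 0; /* copying needs relocate */
    if (kind_of_gc == 0 && h.mark == NULL) return 0;     /* mark-and-sweep needs mark */
    hook_table[type] = h;
    return 1;
}

/* Table-driven dispatch as a collector might do it: look the hook up by
   type code. A mark hook may hand back a subpart for the caller to mark
   next, which avoids recursion; NULL means nothing further to mark. */
void *run_mark_hook(int type, void *obj) {
    assert(type >= 0 && type < MAX_USER_TYPES);
    return hook_table[type].mark ? hook_table[type].mark(obj) : NULL;
}

/* Trivial example hook: treats the object itself as the part to continue with. */
void *identity_hook(void *obj) {
    return obj;
}
```

Registering a set of hooks that has only a mark function succeeds when kind_of_gc is 0 and fails when it is 1, which mirrors why the collector kind has to be known before the hooks are installed.
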

You might also want to extend the printer. This is optional.

set_print_hooks(type,fcn);

LIBSIOD use as an extension language for C programs

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "siod.h"

int main(int argc,char **argv)
{int j,retval = 0;
 long iobufflen = 1000;
 char *iobuff,*sargv[4];
 sargv[0] = argv[0];
 sargv[1] = "-v0";
 sargv[2] = "-g0";
 sargv[3] = "-h10000:10";
 siod_init(4,sargv);
 iobuff = (char *) malloc(iobufflen);
 retval = 0;
 for(j=1;j<argc;++j)
  {sprintf(iobuff,"(*catch 'errobj (begin %s))",argv[j]);
   printf("Evaluating %s, ",argv[j]);
   retval = repl_c_string(iobuff,0,0,iobufflen);
   printf("retval = %d\n%s\n",retval,iobuff);}
 return(retval);}

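The program above formats each command-line argument into a fixed 1000-byte buffer with sprintf, so a sufficiently long argument could overrun it. One possible guard, sketched here with a hypothetical helper name and assuming a C99 library that provides snprintf, builds the same wrapped expression and reports when it does not fit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "(*catch 'errobj (begin EXPR))" into buf.
   Returns 0 on success, -1 if the result (including the terminating
   NUL) would not fit in buflen bytes. */
int wrap_in_catch(char *buf, size_t buflen, const char *expr) {
    int n;
    assert(buf != NULL && expr != NULL);
    n = snprintf(buf, buflen, "(*catch 'errobj (begin %s))", expr);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

A caller would test the return value before passing the buffer on for evaluation and skip or report any argument for which it is -1.
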
Implementation of EVAL and environment representation

C Name         Arguments      Scheme Name
leval          form,env       eval
symbol_value   sym,env        symbol-value
symbol_boundp  sym,env        symbol-bound?
setvar         sym,value,env  set-symbol-value!
envlookup      sym,env        env-lookup

The most common value to pass for env, especially when used from C programs is the value NIL, the empty list. This NIL represents the global, or toplevel environment.

If you go beyond considering the NIL environment then you can get into areas of the system which are subject to change, although not often. With release 1.0 of SIOD in April of 1988 the environment was a pure association list. But with release 1.3 in May of 1988 it was changed to the list of frames as described in "The Art of the Interpreter." Over the last 9 years that hasn't changed. For example:

> (%%closure-env (let ((x 3)) (lambda ())))
(((x) 3))
> (let ((x 3)) (the-environment))
(((x) 3))

In an environment frame the car is a list of symbols and the cdr is a list of values. The env-lookup procedure returns a list such that car can be used to obtain the value, and set-car! can be used to assign the value.

> (env-lookup 'x '( ( (a b x c d) 1 2 3 4 5 ) ))
(3 4 5)

The env-lookup does not work on the global environment. This could be considered an architecture bug. The global environment is represented by actual slots in the symbol structure, rather than as entries in some general frame representation. If SIOD had a "locative" data type then env-lookup might well return that. But either way there is a dichotomy between local and global environment representation which is usually considered to be a bad thing, even though it is a classic implementation technique.
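The frame representation for local environments can be modelled in a few lines of C. The sketch below is an assumption-laden toy, not SIOD's envlookup: the cell type, the fixed pool allocator and all function names are invented here, and symbols are compared with strcmp where a real lisp would compare interned symbols by identity. It shows only the shape of the search, where each frame is a pair of a symbol list and a value list, and the result is the tail of the value list whose car is the binding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy cons cell: an atom carries text, a pair carries car/cdr. */
typedef struct cell {
    const char *atom;        /* non-NULL for atoms (symbol or number text) */
    struct cell *car, *cdr;  /* used by pairs; NULL plays the role of () */
} cell;

#define POOL_CELLS 64
static cell pool[POOL_CELLS];
static size_t pool_used = 0;

static cell *alloc_cell(void) {
    assert(pool_used < POOL_CELLS);
    return &pool[pool_used++];
}

cell *mk_atom(const char *text) {
    cell *c = alloc_cell();
    c->atom = text; c->car = NULL; c->cdr = NULL;
    return c;
}

cell *cons(cell *a, cell *d) {
    cell *c = alloc_cell();
    c->atom = NULL; c->car = a; c->cdr = d;
    return c;
}

/* Build a list of atoms from an array of strings. */
cell *list_of(const char *const *items, size_t n) {
    cell *l = NULL;
    while (n > 0) l = cons(mk_atom(items[--n]), l);
    return l;
}

/* env is a list of frames; each frame is (symbols . values).
   Returns the tail of the value list whose car is the binding, or NULL
   when the symbol is not bound in any frame. */
cell *env_lookup(const char *sym, cell *env) {
    cell *names, *vals;
    for (; env != NULL; env = env->cdr) {
        names = env->car->car;
        vals = env->car->cdr;
        for (; names != NULL && vals != NULL;
             names = names->cdr, vals = vals->cdr)
            if (strcmp(names->car->atom, sym) == 0)
                return vals;
    }
    return NULL;
}
```

With a single frame built from the symbols a b x c d and the values 1 2 3 4 5, looking up x returns the tail beginning at 3, matching the transcript above. Assigning through the returned cell's car changes the binding in place, which is the role set-car! plays in the Scheme-level interface.
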

Possibly just-as-good in practice would be to allow an environment frame to be an efficient test=EQ? hash-table.

A future direction to take in SIOD is most likely to involve embracing operating-system-specific environment representations, when appropriate, especially those having to do with underlying library and dynamic linking implementation.

Windows NT and Windows 95 Configuration

To enable usage from within Microsoft Internet Information Server, the registry key is HKEY_LOCAL_MACHINE, SYSTEM, CurrentControlSet, Services, W3SVC, Parameters, Script Map. Create a new string value:

name: .smd
data: c:\siod\siod.exe -v0,-m3 %s

To enable usage from the Command Prompt (Windows NT only) or the Windows GUI, it is easiest to use the File Types tab you get by viewing options of My Computer. You will want to create a new type with associated file extension SMD:

action: open
application: c:\siod\siod.exe -v01,-m2

Note the different level of main program and verbosity between web server and command usage. This is recommended.

The siod.mak file is used with the Microsoft Visual C++ 4.0 development environment. Executable files may also be created. See the winsiod.html support document.

Unix configuration

In all versions beware that LD_LIBRARY_PATH must be set to include the current directory "." first if the development libsiod is to be found first. Otherwise rebuilding it will have no effect at runtime.

In OSF1 everything works without a glitch when the default installation targets are chosen.

In Solaris I found that I had to make a soft link from /usr/lib/libsiod.so to /usr/local/lib/libsiod.so. The diagnostic ldd -s /usr/local/bin/siod, shows that the default lib is only /usr/lib. Make a note to look into setting RPATH in the LD. Setting flag -R /usr/local/lib/siod, would also help remove a kludge from load_so in the slibu.c file.

In Linux you must run the ldconfig command after installing siod. Try ldconfig -v.

References

Contributors

Paul Stodghill

Bob Bane

Barak Pearlmutter

Craig Denson

Philip G Wilson

Leo Harten

Philippe Laliberte

andreasg

Acknowledgements

This software contains code derived from the RSA Data Security Inc. MD5 Message-Digest Algorithm.

This winsiod precompiled version of SIOD package contains software written and copyrighted by Henry Spencer. See hs_regex.html.