SimpSamp provides three categories of functions and macros.

Functions and macros for a wider variety of input and output types, optimized for each combination. Faster, but less flexible.

Functions that can be used with a few different types of input sets.

The result of mutating a set while it is being iterated upon is undefined.

Whenever a parameter calls for a function, any function designator (such as a symbol naming a function) may be used.

Whenever a random sample is requested with a size larger than the input set, the requested size is limited to the size of the input set. In this case, the resulting sample is the entire input set.

The order of elements within a random sample is not guaranteed to be random, even though it may appear so at first glance with some functions and macros. The only thing that is random is which subset is selected, out of all the possible subsets of the input set that have the requested size.
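The two points above (the requested size being limited to the size of the input, and the order of the result not being random) can be seen in a minimal sketch of selection sampling over a vector. SKETCH-SELECTION-SAMPLE is a hypothetical helper written for this illustration and is not part of SimpSamp; it only shows the general technique, under the assumption that the input size is known up front.

```lisp
;; Hypothetical helper for illustration only; not part of SimpSamp's API.
(defun sketch-selection-sample (items n)
  "Return a vector holding a simple random sample of N elements of the
vector ITEMS.  N is limited to the length of ITEMS."
  (let* ((size (length items))
         (n (min n size))                 ; an oversized request is limited
         (result (make-array n :fill-pointer 0)))
    (loop for i from 0 below size
          for needed = (- n (fill-pointer result))
          while (plusp needed)
          ;; Select element I with probability needed / remaining.
          when (< (random (- size i)) needed)
            do (vector-push (aref items i) result))
    result))
```

Because elements are visited in input order, the chosen elements come out in input order as well: which subset is chosen is random, but its ordering is not. Requesting more elements than the input holds, as in (sketch-selection-sample #(1 2 3) 10), returns the whole input.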

When a "random sample" is said to be taken, this is shorthand for a simple random sample without replacement.

For other types of SET, FUNCTION is applied with one argument: the value of the current element.

When SET is a mapping type (e.g. a hash table), FUNCTION is applied with two arguments: the key and the value of the current entry.

The function to call with each entry of the selected random sample. The arguments given to FUNCTION vary with the type of SET.

Iterates over a random sample of a set. For each iteration, FUNCTION is called with the current entry of the sample. The result of each call to FUNCTION is discarded—results are not accumulated.
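The calling convention described above can be sketched in plain Common Lisp. CALL-ON-ENTRIES is a hypothetical name used only for this illustration; it visits every entry rather than a random sample, and is meant to show only how the argument count depends on the type of the input.

```lisp
;; Hypothetical helper for illustration only; not part of SimpSamp's API.
(defun call-on-entries (fn input)
  "Call FN on every entry of INPUT, discarding the results."
  (etypecase input
    (hash-table (maphash fn input))   ; FN receives two arguments: key, value
    (sequence (map nil fn input)))    ; FN receives one argument: the element
  (values))

;; FN may be any function designator, including a symbol.
(let ((table (make-hash-table)))
  (setf (gethash :a table) 1)
  (call-on-entries (lambda (key value) (format t "~S => ~S~%" key value))
                   table))
(call-on-entries 'print '(1 2 3))
```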

Other than the resulting sequence type, the parameters and semantics are identical to those of SAMPLE. See: Section 3.2.1, “SAMPLE: Samples as Sequences”

When SET is a mapping type (e.g. a hash table), the returned elements will be CONS cells of the form (KEY . VALUE).
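As a rough illustration of that entry shape, the following sketch turns every entry of a hash table into a (KEY . VALUE) cons cell. HASH-TABLE-ENTRIES is a hypothetical helper, not a SimpSamp function, and it collects all entries rather than a sample.

```lisp
;; Hypothetical helper for illustration only; not part of SimpSamp's API.
(defun hash-table-entries (table)
  "Return a list of (KEY . VALUE) cons cells, one per entry of TABLE."
  (loop for key being the hash-keys of table using (hash-value value)
        collect (cons key value)))
```

Each element can then be taken apart with CAR and CDR, or with (destructuring-bind (key . value) entry ...).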

SimpSamp offers a few generic functions, which are explained in this section. These generic functions have methods which specialize on LIST , VECTOR , and HASH-TABLE .

EXHAUSTED?-FN is always called before each call to NEXT-FN. Once EXHAUSTED?-FN has returned true, NEXT-FN will not be called again.

NEXT-FN is a function of no arguments that returns the next element of the set, advancing through the set. EXHAUSTED?-FN is a function of no arguments that returns a generalized boolean indicating whether the end of the set has been reached.
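A minimal sketch of this iterator form follows: a pair of closures over a list, and a consumer that calls EXHAUSTED?-FN before every call to NEXT-FN. The consumer uses reservoir sampling (Algorithm R), a standard technique for inputs whose size is not known in advance; this section does not state which algorithm SimpSamp uses for such iterators, so treat that choice as an assumption. Both function names are hypothetical.

```lisp
;; Hypothetical helpers for illustration only; not part of SimpSamp's API.
(defun make-list-iterator (list)
  "Return two closures over LIST as multiple values: NEXT-FN and EXHAUSTED?-FN."
  (let ((remaining list))
    (values (lambda () (pop remaining))       ; NEXT-FN: yield and advance
            (lambda () (endp remaining)))))   ; EXHAUSTED?-FN: end reached?

(defun sketch-reservoir-sample (n next-fn exhausted?-fn)
  "Return a vector of up to N entries chosen uniformly from the iterator."
  (let ((reservoir (make-array n :fill-pointer 0))
        (seen 0))
    ;; EXHAUSTED?-FN is consulted before every call to NEXT-FN.
    (loop until (funcall exhausted?-fn)
          do (let ((entry (funcall next-fn)))
               (incf seen)
               (if (< (fill-pointer reservoir) n)
                   (vector-push entry reservoir)        ; fill the reservoir
                   (let ((j (random seen)))             ; 0 <= j < seen
                     (when (< j n)
                       (setf (aref reservoir j) entry))))))
    reservoir))
```

For example, (multiple-value-call #'sketch-reservoir-sample 2 (make-list-iterator '(a b c d))) returns a vector of two of the four symbols.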

NEXT-FN may produce multiple values. Together, we refer to the sequence of values as an entry of the set. Depending on the function or macro operating on the set, the values may be collected into a list that becomes the value of a single variable, multiple variables may be bound to the individual values, or a single variable may be bound to the primary return value only.
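The three treatments of a multi-valued entry correspond to standard Common Lisp idioms. In this sketch, NEXT-ENTRY-DEMO is a hypothetical stand-in for a NEXT-FN that produces two values.

```lisp
;; Hypothetical stand-in for a NEXT-FN producing two values.
(defun next-entry-demo ()
  (values :apple 3))

(defun three-views ()
  "Show the three ways one entry's values can reach the caller's variables."
  (list
   ;; 1. All values collected into a list bound to a single variable.
   (let ((entry (multiple-value-list (next-entry-demo))))
     entry)                                            ; => (:APPLE 3)
   ;; 2. One variable per value.
   (multiple-value-bind (name count) (next-entry-demo)
     (cons name count))                                ; => (:APPLE . 3)
   ;; 3. A single variable bound to the primary value only.
   (let ((primary (next-entry-demo)))
     primary)))                                        ; => :APPLE
```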

SIZE is the number of entries in the input set. NEXT-FN is a function of no arguments that returns the next entry of the set, advancing through the set.

A Set Specification expresses an input set to take a sample of. The Set Specification syntax depends on what kind of input is being operated upon.

Expresses the set to take the sample of. Only hash tables are supported as input. See: Section 3.3.6.1, “Set Specification”

For each selected entry, only the primary value returned by the NEXT-FN function of the iterator is represented in the returned sequence.

In no case should it be more efficient to request a list than to request the same thing as a vector. For best performance, it is recommended that you use vectors unless your situation demands otherwise.

For each selected entry, all the values returned by the NEXT-FN function of the iterator are given to FUNCTION as corresponding arguments.

The function to call with each entry of the selected random sample. The arguments given to FUNCTION vary with each MAP-SAMPLE-OF-* function:

Iterates over a random sample of a set. For each iteration, FUNCTION is called with the current entry of the sample as one or more arguments. The result of each call to FUNCTION is discarded—results are not accumulated.

A form to be evaluated when iteration ends. The result of this form is used as the result of the DO-SAMPLE-OF-* macro. If not given, there are no values in the result of the macro.

Useful in conjunction with EFFECTIVE-N-VAR to initialize any output data structures (but this technique is mostly useful to other functions in SimpSamp that use the DO-SAMPLE-OF-* macros).

When EFFECTIVE-N-VAR is given, it is bound for the evaluation of INITIAL-FORM, RESULT-FORM, and BODY.

A symbol naming a variable to be bound with the effective number of entries of the selected sample. (The result of N-FORM is limited by the actual size of the input set.)

The DO-SAMPLE-OF-* macros evaluate each element of the set specification as an expression to produce the corresponding value to use.

Evaluated to produce the size of the sample to take of the input set.

It is unspecified whether fresh bindings are made for each evaluation of BODY, or whether the same bindings are reused each time.

Either a list of symbols naming variables to be bound to each corresponding value returned by the iterator for each chosen entry, or a symbol naming a variable to be bound to a list of all the values returned by the iterator for each chosen entry.

Expresses which variables to bind to each entry of the selection for the evaluation of BODY. The syntax varies with each macro.

All the macros establish an implicit block named NIL. BODY may terminate the loop prematurely and specify the resulting values of the macro with RETURN.
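The standard DOLIST macro also establishes an implicit block named NIL, so it serves as a familiar analogy for how RETURN behaves here. The sketch below uses DOLIST only; FIRST-EVEN is a hypothetical function name, and the assumption is that RETURN inside the body of a DO-SAMPLE-OF-* macro works the same way.

```lisp
;; Analogy using the standard DOLIST macro, which likewise establishes
;; an implicit block named NIL.  FIRST-EVEN is a hypothetical example.
(defun first-even (list)
  "Return the first even integer in LIST, or NIL if there is none."
  (dolist (x list)
    (when (evenp x)
      (return x))))     ; leaves the loop early and supplies its value
```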

These macros iterate over a random sample of a set. For each iteration, one or more variables are bound to the current entry of the sample, and a given body of code is evaluated.

The following table lists the available functions and macros, listed according to the type of input they operate on (row-wise) and how they emit their output (column-wise).

Most of the functions and macros provided by SimpSamp are specialized for their input and output types. Each one is optimized for that combination of input and output type, and can only operate on the type of input set it was designed for.

3.4. Low-Level Macros

The generic functions and specialized functions/macros in SimpSamp depend on two core algorithms, both implemented as macros.

The algorithms are implemented as macros because we want to stream the input of the algorithms, element by element, from other components, and (in the case of selection sampling) stream the output of the algorithms to other components. This is unnecessarily slow (on SBCL) when done the functional way. With macros, different parts can still be composed (sample a vector → sample an integer range → sample an abstract iterator), and functional interfaces are provided that present a clean interface to the fast internal macro-based code.

I've found that using macros to layer functionality gives me the same speed as a hand-written function that does exactly what I want. During prototyping, using functional techniques slowed things down by up to about 25%.

This solution isn't the most elegant, but the speed matters for where I intended to use the library.

3.4.1. DO-SAMPLE-OF-ITERATOR-EXPR: Selection Sampling

DO-SAMPLE-OF-ITERATOR-EXPR Syntax.

(DO-SAMPLE-OF-ITERATOR-EXPR (VARS N-FORM SIZE-FORM NEXT-FORM &key EFFECTIVE-N-VAR INITIAL-FORM RESULT-FORM) &body BODY) => (The result of RESULT-FORM, or no values if not given.)

This macro iterates over a random sample of a set given as an expression of its size and an expression which evaluates to the next entry of the set, advancing through the set. For each iteration, one or more variables are bound to the current entry of the sample, and a given body of code is evaluated.

This macro is nearly identical in syntax and semantics to the DO-SAMPLE-OF-ITERATOR macro. The key difference is that this macro takes the next entry of the input set by evaluating NEXT-FORM each time, whereas with DO-SAMPLE-OF-ITERATOR, NEXT-FN is evaluated once to produce a function which is called each time.

With this difference highlighted, the specification of this macro defers to: Section 3.3.2, “DO-SAMPLE-OF-*: The Iteration Macros”
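To make the idea of re-evaluating NEXT-FORM concrete, here is a much-reduced sketch of a macro with a similar shape. SKETCH-DO-SAMPLE-EXPR is hypothetical and is not SimpSamp's implementation: it binds a single variable, supports only a RESULT-FORM keyword, and omits EFFECTIVE-N-VAR and INITIAL-FORM. It is meant only to show how a macro can splice NEXT-FORM and BODY straight into a selection sampling loop, with no function call per entry.

```lisp
;; Hypothetical, simplified sketch; not SimpSamp's actual macro.
(defmacro sketch-do-sample-expr ((var n-form size-form next-form
                                  &key (result-form '(values)))
                                 &body body)
  (let ((n (gensym "N"))
        (remaining (gensym "REMAINING"))
        (needed (gensym "NEEDED")))
    `(let* ((,n ,n-form)
            (,remaining ,size-form)
            (,needed (min ,n ,remaining)))   ; limit N to the input size
       ;; LOOP establishes an implicit block named NIL, so RETURN works.
       (loop while (plusp ,needed)
             do (let ((,var ,next-form))     ; NEXT-FORM evaluated per entry
                  ;; Select with probability needed / remaining.
                  (when (< (random ,remaining) ,needed)
                    (decf ,needed)
                    ,@body))
                (decf ,remaining))
       ,result-form)))

;; Usage: NEXT-FORM is an ordinary expression, here (pop source).
(defun demo-sample-expr ()
  (let ((source (list 10 20 30 40 50))
        (picked '()))
    (sketch-do-sample-expr (x 2 5 (pop source)
                            :result-form (nreverse picked))
      (push x picked))))
```

Calling (demo-sample-expr) returns a list of two of the five numbers, in their original relative order. Entries that are not selected are still consumed from the input, since NEXT-FORM must be evaluated to advance past them.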