Schani's Old Stuff

Introduction

I've written lots of software over the past years. Some of it was intended to be released in a nice package to the world at some point, some was even meant to be useful, but most of it was written because I was curious how it would turn out and/or because I was bored at the time. As it happens, I have released and continue to maintain a few projects, which you can find on my homepage, but there's also a lot of stuff that didn't make it, in most cases because I was too lazy to get it into a form fit for public consumption. I still think that most of these pieces of software - some of them very simple, others not so - will be of interest to somebody, but since I'm still too lazy I have chosen to create this page as a place for my abandoned software. Just because I put a project on this page doesn't mean that I don't care about it, or don't want to be bothered with it any more, though, so if you have any questions, comments, or even consider using some piece of this software for whatever purpose, don't hesitate to mail me.

Legal Conditions

All the code on this page is free software distributed under the terms of the GNU General Public Licence.

Contents

Minesweeper Solver

Solving a Minesweeper configuration, i.e., given a configuration, finding out which fields are certain to be mines and which are certain not to be, is really very easy. Here's an algorithm to do it: Put together a list of all unvisited fields which have at least one visited neighbour.

For these fields, try out all combinations of assigning each field either bomb status or no-bomb status. Eliminate all combinations where the resulting configurations contain trivial contradictions (i.e., where visited fields have either too many or too few neighbouring bombs).

For all the remaining combinations (the ones without contradictions), find out which fields have the same status in all combinations. If a field has bomb status in all combinations, it is certain to contain a bomb. If it has no-bomb status in all combinations, it is certain not to contain a bomb. About all the other fields, no certain statement can be made.

This algorithm will solve all solvable Minesweeper configurations. Unfortunately, it's also very inefficient. For example, take this configuration:

    -----------
    -8-8-8-8-8-
    -----------
    -----------
    -8-8-8-8-8-
    -----------
    -----------
    -8-8-8-8-8-
    -----------

The - signs are fields which are not yet visited, the 8s are visited fields with 8 mines in their neighbourhood. Although it's obvious that all unvisited fields contain mines, the above algorithm would have to try 2^84 combinations to find that out. Assuming you have a 10 GHz Pentium 6 and your implementation can generate and check each combination in only 10 cycles, you'll have to wait about 613 million years for the solution.

Even worse, it could be that there is no algorithm for solving a Minesweeper configuration which has better worst-case asymptotic performance than the one above. Richard Kaye has proven that determining whether a Minesweeper configuration is consistent, i.e., whether it could arise during normal play and does not contradict itself, is NP-complete. Note that this does not mean that solving a configuration requires exponential time, but it does at least make it somewhat likely.

It's not surprising, then, that my solver doesn't solve all solvable configurations. For those that it does solve, it's reasonably efficient, though. It can solve an expert board (99 mines in 30x16 fields) from start to finish in less than one tenth of a second, if it's successful. The solver works by making deductions about minimum and maximum numbers of mines in sets of fields, very similar to how I usually think when I play the game.
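To make the exhaustive procedure concrete, here is a small Python sketch of it (mine, not the original program; all names are made up). It enumerates every mine/no-mine assignment of the frontier fields and keeps only the consistent ones:

```python
from itertools import product

def neighbours(pos):
    """The up-to-eight neighbours of a field at (row, col)."""
    r, c = pos
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def brute_force_solve(visited, frontier):
    """visited maps a visited field to the number of mines it still
    misses among its unvisited neighbours; frontier is the list of
    unvisited fields with at least one visited neighbour.
    Returns (fields certain to be mines, fields certain to be clear)."""
    consistent = []
    for bits in product((0, 1), repeat=len(frontier)):
        assignment = dict(zip(frontier, bits))
        # discard assignments with trivial contradictions
        if all(sum(assignment.get(n, 0) for n in neighbours(v)) == missing
               for v, missing in visited.items()):
            consistent.append(assignment)
    mines = [f for f in frontier if all(a[f] for a in consistent)]
    clear = [f for f in frontier if not any(a[f] for a in consistent)]
    return mines, clear
```

On a tiny board where a visited field shows 1 and has a single unvisited neighbour, this reports that neighbour as a certain mine; on the big configuration above it would, as described, grind through 2^84 assignments.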
Here's an example (a + signifies a known mine):

    +33+
    ---+

Via the left 3 we know that the three unvisited fields contain exactly two mines, while via the right 3 we know that the two right unvisited fields contain exactly one mine. If we combine that information, we can conclude that the left unvisited field contains a mine, although we cannot say which of the other two contains one.

In general, let's say we have a set N with |N| fields, about which we know that it contains at least Nl and at most Nu mines. We also have a set M with |M| fields, containing at least Ml and at most Mu mines.

We'll first consider the intersection I of the two sets, which contains all fields that are in both N and M. How many mines does this set contain at a minimum? Assume N contains its minimum number of mines, Nl. Also assume that as many of those mines as possible are in the fields that are not in the intersection I, of which there are |N|-|I| (the number of fields in N but not in I). Then I must contain at least Nl-(|N|-|I|) mines. The same argument for the set M gives another lower bound of Ml-(|M|-|I|) mines. Note that both of these numbers could be negative, so our third lower bound is 0. The best lower bound we can give, therefore, is the largest of these three numbers.

Next we'll see what we can say about the maximum number of mines in I. One upper bound is Nu, the maximum number of mines in N. Another is Mu. And of course I cannot contain more mines than it has fields, so the third upper bound is |I|. The best upper bound we can give is the smallest of these three numbers.

Having established lower and upper mine bounds Il and Iu for the intersection I, we'll now consider the difference set D=N-M, which contains all fields in N which are not in M.
To establish its lower bound, assume that N contains as few mines as possible (Nl), and that as many of N's mines as possible are in the intersection I (that's the part of N that is not in D). We know that latter number, too: it's Iu, the maximum number of mines in I. That gives us a lower bound of Nl-Iu, which can unfortunately be negative, so the second lower bound is 0. We can establish the upper bound very similarly: assume that N contains as many mines as possible (Nu), and that as few of them as possible are in the intersection I (Il), giving an upper bound of Nu-Il, with the other upper bound being |D|, the number of fields in the difference set.

The algorithm my solver uses is this: Given a configuration, for each visited field which neighbours at least one non-visited field, build a set containing all its neighbouring non-visited fields. The lower and upper bounds of this set are both the number of mines the visited field still misses in its neighbourhood, i.e., the number on the field minus the number of already identified mines in its neighbourhood. Put all those sets into the list Q.
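These bound rules translate directly into code. Here is a Python sketch (my formulation, with made-up names, not the original solver's code):

```python
def intersection_bounds(n_size, n_lo, n_hi, m_size, m_lo, m_hi, i_size):
    """Mine bounds for the intersection I of N and M,
    given the sizes and the mine bounds of N and M."""
    lo = max(n_lo - (n_size - i_size),  # N's mines pushed outside I
             m_lo - (m_size - i_size),  # M's mines pushed outside I
             0)
    hi = min(n_hi, m_hi, i_size)
    return lo, hi

def difference_bounds(n_lo, n_hi, i_lo, i_hi, d_size):
    """Mine bounds for the difference D = N - M,
    given N's bounds and the intersection's bounds."""
    return max(n_lo - i_hi, 0), min(n_hi - i_lo, d_size)
```

For the +33+ example above: N is the three unvisited fields with bounds (2, 2), M the two right ones with bounds (1, 1), and I = M. The intersection bounds come out as (1, 1), and the difference set (the single left field) gets bounds (1, 1), i.e., a certain mine.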

Set D to be the empty list.

Repeat the following until Q is empty: Take one set N out of Q and for each set M in D do: Build the difference set N-M according to the description above. If there is no set in D containing the same fields as this difference set, add it to D. If there is, and its bounds are at least as good as those of our new N-M, do nothing. If there is, but at least one of the bounds of N-M is better than the old bound, take the old set out of D, set its bounds to the improved values (if only one new bound is better, then only that bound is improved, i.e., a new bound that is worse than the old one is not put in), and put the set into Q. Then do the same for the set M-N.

When the algorithm terminates, the list D will contain zero or more sets with lower and upper bounds. Of real interest are only those sets whose lower and upper bounds are equal and are also equal to the number of fields in the set, or equal to zero. If both bounds are equal to the number of fields in the set, we know that all fields contain mines. If both bounds are zero, none of them do.

The lower and upper bounds of the other sets can at best be interpreted as approximate probabilities for the presence of mines. If a set containing one field has a lower bound of 0 and an upper bound of 1, it is wrong to conclude that the field has a 50% chance of containing a mine; an exhaustive algorithm might find that that field must contain a mine, for example. In fact, a program for finding the exact probabilities in all configurations must be at least as inefficient as a program for solving all solvable configurations (after all, solving a configuration is nothing else than finding the probabilities for those fields which are certain to contain mines and those which are certain not to, which are 100% and 0%, respectively).
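A compact Python sketch of this worklist (my reconstruction from the description above, not the original code) might look as follows; sets of fields are frozensets, and d maps each set to its best known bounds:

```python
def propagate(initial_sets):
    """initial_sets: list of (frozenset of fields, lower, upper).
    Returns a dict mapping each derived set to its tightest known bounds."""
    q = list(initial_sets)
    d = {}

    def update(s, lo, hi):
        # add s to d, or tighten its bounds; requeue it when improved
        if not s:
            return
        if s not in d:
            d[s] = (lo, hi)
            q.append((s, lo, hi))
        else:
            old = d[s]
            new = (max(old[0], lo), min(old[1], hi))
            if new != old:
                d[s] = new
                q.append((s, *new))

    while q:
        n, n_lo, n_hi = q.pop()
        for m, (m_lo, m_hi) in list(d.items()):
            i = n & m
            if not i or n == m:
                continue
            # bounds for the intersection, as derived above
            i_lo = max(n_lo - len(n - m), m_lo - len(m - n), 0)
            i_hi = min(n_hi, m_hi, len(i))
            # bounds for both difference sets
            update(n - m, max(n_lo - i_hi, 0), min(n_hi - i_lo, len(n - m)))
            update(m - n, max(m_lo - i_hi, 0), min(m_hi - i_lo, len(m - n)))
        update(n, n_lo, n_hi)
    return d
```

Fed the two sets from the +33+ example, it derives bounds (1, 1) for the singleton set holding the left unvisited field, i.e., a certain mine.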
To illustrate this point a bit further, here's a configuration my solver cannot solve:

    ++4-2
    --++-
    +6--3
    ++22+

After finishing its algorithm, my solver proclaims that the second field in the second row (the one directly above the 6) contains at least 0 mines and at most 1. While this is certainly true (the bounds my solver produces are always correct, though not necessarily the tightest ones possible), the probability that that field contains a mine is nowhere near 50%. The exact probability is in fact 100%, meaning that field must contain a mine. You can easily verify this by examining the consequences of there being no mine in that field.

Obviously, there is room for improvement in my solver. One improvement which would probably take care of most cases arising when solving random mine fields would be, in case the above algorithm fails to unearth some certain mines or clear fields, to go through all unvisited fields neighbouring at least one visited field and try out the consequences of there being a mine in that field, and of there being none. If one of these two scenarios resulted in a trivial contradiction, the other would inevitably reflect the truth (assuming, of course, that the whole configuration does not in itself contain a conflict). Note that if neither of the two possibilities results in a contradiction, no conclusion can be drawn. Of course, that new solver would still not be able to solve all configurations, although it would take care of the example above.

Download

The 12 Coins Problem

The 12 coins problem can be stated as follows: You have 12 similar-looking coins, 11 of which have exactly the same weight. The 12th coin is a counterfeit and has a different weight, but you don't know whether it's lighter or heavier than the others. You have a balance scale and are allowed 3 weighings to determine the counterfeit coin.

This is a description of my solution. It turns out that there are analytical solutions which don't require the help of a computer and which extend to larger numbers of coins, but those are not the solution I found. I figured out (or, rather, I hoped) that it might be possible to set up three weighings in advance and determine the counterfeit coin from the results. By "in advance" I mean that the configuration of the second weighing doesn't depend on the result of the first, etc. My instincts told me that in each of the weighings I should weigh 4 coins against 4 others and leave the remaining 4 coins alone.

Let's name the 12 coins "A" to "L" and set up the weighings like this:

      ABCDEFGHIJKL
    1 000011112222
    2 011100021222
    3 101201202012

Line 1 is the configuration for the first weighing, line 2 for the second and line 3 for the third. A "0" in a coin's column means that it isn't weighed in that round, a "1" means it's on the left scale, and a "2" means it's on the right scale. Hence, according to this table, we first weigh EFGH against IJKL, then BCDI against HJKL, and finally ACFK against DGIL.

Now, let's say in our first weighing the left scale is heavier (we denote that as "1"), in the second one the right scale is heavier (denoted as "2") and in the third weighing they are level ("0"). Let's assume that the counterfeit is lighter than the other coins. That means it must have been on the right scale in the first weighing, on the left in the second and on the bench in the third, i.e., it should have the numbers "210" in its column. Referring to the table we note that there is no such coin. Let's assume that it was heavier, then.
In this case it should have "120" in its column, and indeed coin H (and only coin H) fits our criteria. The question is now: would we always be so lucky to find one, and only one, matching coin if we weighed according to this table? It turns out that there are three (very obvious) criteria that our table must satisfy for that to be the case: In each weighing, there must be the same number of coins on each scale (in our case, that number is always 4).

There must be no two (different) coins which are on the same scale in every weighing, because we couldn't differentiate between them.

There must be no two (different) coins which are on opposite scales (or both on the bench) in every weighing, because we couldn't tell whether one of them is a lighter or the other a heavier counterfeit.

As if by magic, the table above satisfies these criteria, as well as a fourth one: each coin is weighed at least once. This fourth criterion ensures that we can always tell whether the counterfeit is lighter or heavier (if one coin were never weighed, we might be able to tell it's the counterfeit because all the others have the same weight, but we couldn't tell whether it's lighter or heavier).

The above table was calculated with my little program, which you can download below. It uses a simple brute-force approach, eventually trying out all possibilities. For only 12 coins this works very well and gives solutions instantly, but of course it doesn't scale. Note that if you discard the fourth criterion you can extend the table with a coin which isn't weighed at all, and you'd have a solution to the same problem with 13 coins.

Download
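The table and the decoding procedure are easy to check mechanically. Here is a Python sketch (mine, not the downloadable program) that verifies the four criteria and identifies the counterfeit from the three weighing results:

```python
TABLE = ["000011112222",   # first weighing
         "011100021222",   # second weighing
         "101201202012"]   # third weighing
COINS = "ABCDEFGHIJKL"
FLIP = {"0": "0", "1": "2", "2": "1"}   # swap left and right scale

def column(i):
    """The weighing pattern of the i-th coin, e.g. '120' for H."""
    return "".join(row[i] for row in TABLE)

def check_table():
    cols = [column(i) for i in range(len(COINS))]
    # criterion 1: equally many coins on each scale per weighing
    assert all(row.count("1") == row.count("2") for row in TABLE)
    # criterion 2: no two coins share the same pattern
    assert len(set(cols)) == len(cols)
    # criterion 3: no pattern is the mirror image of another
    assert not set(cols) & {"".join(FLIP[ch] for ch in c) for c in cols}
    # criterion 4: every coin is weighed at least once
    assert "000" not in cols

def identify(outcome):
    """outcome: three results, '1' = left scale heavier, '2' = right
    scale heavier, '0' = level.  Returns (coin, 'heavier'/'lighter')."""
    for i, coin in enumerate(COINS):
        if column(i) == outcome:            # on the heavy side each time
            return coin, "heavier"
        if column(i) == "".join(FLIP[ch] for ch in outcome):
            return coin, "lighter"          # on the light side each time
    return None
```

check_table() passes, and identify("120") returns ('H', 'heavier'), matching the example above.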

Lisp Interpreters

When I learn a new programming language, the first program I write in that language is usually an interpreter for a small subset of Lisp or Scheme. I have found that to be a good way to quickly familiarize myself with most aspects of a new language. These are some of the results of my efforts.

This is an interpreter for a really simple dynamically scoped Scheme dialect. It only runs with Gforth, because it uses Gforth's structs to implement its data structures. One of the more involved parts of this interpreter is the reader, where I had to do quite a lot of stack juggling to keep everything in line. It doesn't look very involved now, but I remember spending quite some time thinking about the stack layout for the reader routines.

Download

This is a very efficient interpreter for a small statically scoped subset of Scheme. In the hopelessly contrived recursive Fibonacci benchmark it beats Guile by about a factor of 2.5. It operates by compiling Scheme code into an intermediate data structure which can be executed more efficiently. Most importantly, no symbol lookup needs to happen during execution. I think it's a very nice little piece of software, so I'll present the more important parts of it here.

The two main data types are lisp_expr, which describes all Scheme data (including uncompiled code, which is of course nothing else but lists), and lisp_code, which describes compiled code. We'll start with lisp_expr:

    type lisp_expr = Nil
                   | Cons of lisp_expr * lisp_expr
                   | Int of int

This is simple enough. Nil is the empty list, a Cons is a list cell and an Int is an integer. We need integers so that the fib benchmark will run. It goes on:

                   | Symbol of symbol

This is a symbol, which, as we'll see below, has a name. Now it gets more complicated, though:

                   | Builtin of (lisp_expr array -> lisp_expr)

A builtin is a function that's built into the Scheme interpreter, hence the name. A more modern name would probably be "native function". For example, the function "+" for adding integers is a builtin. Builtins cannot be written in Scheme, but they can be passed around and stored in variables (i.e., they are first-class values), which is why they need to be included in this type. A builtin is simply a function taking an array of values (the arguments) and returning a value (the result).

                   | Closure of lisp_code * (lisp_expr array list)

This, finally, is a Scheme function, complete with environment. The first part, of type lisp_code, is the code of the function. We'll see the definition of that type below. The second part is the environment of the function (or actually closure).
If you're not familiar with closures and environments, here's a short introduction. Let's say you have this function:

    (lambda (x) (lambda (y) (+ x y)))

It's a function that takes one argument x and returns another function, or closure (a "closure" is what we call code together with the environment it needs to execute). You can call this resulting closure with another argument y and it'll give you the sum of the two arguments. Note that you don't need to remember the argument x you gave to the first function, because it's contained in the closure it returned. The value of this argument is therefore contained in the environment of the returned closure.

Of course environments can contain more than one value. Most importantly, environments can be deep, as this example illustrates:

    (lambda (a b)
      (lambda (c d)
        (lambda (e f)
          (- (+ (- a b) (- c d)) (+ e f)))))

When you call this function you'll get a closure whose environment contains the arguments a and b. This closure, when called, returns yet another closure, whose environment contains the arguments c and d, but also a and b. Let's say you called the function above with the arguments 1 and 2, and the resulting closure with 3 and 4. You'd get a closure whose environment is the list of arrays [ [| 3; 4 |]; [| 1; 2 |] ]. The most "recent" arguments are always at the front of the list, which is why 3 and 4 come first.

Note that the environment does not contain the names of the arguments. Instead, the compiler figures out automatically where in the environment it needs to look for the value of an argument. Note also that the closure does not specify how many arguments it takes, which means that the interpreter cannot catch the error of giving too many or too few arguments to a closure. This is only intentional insofar as I was too lazy to make it any fancier.
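In Python terms (my illustration, not part of the original interpreter), the deep environment from this example and the lookup the compiler arranges for look like this:

```python
def lookup(env, depth, index):
    """Fetch a local variable: env is a list of argument arrays,
    most recent frame first; depth selects the frame, index the slot."""
    return env[depth][index]

# calling the example function with (1, 2) and the resulting closure
# with (3, 4) yields this environment:
env = [[3, 4], [1, 2]]

assert lookup(env, 1, 0) == 1   # a: one frame deep, first slot
assert lookup(env, 0, 1) == 4   # d: current frame, second slot
```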
Next are symbols, which are easy:

    and symbol_value = Value of lisp_expr
    and symbol = { name: string ; mutable value: symbol_value };;

A symbol has a name, obviously, but also a value. All symbols are stored in a global symbol table, and a symbol's value is the value of the global variable with that name. Whenever the reader encounters a symbol, it looks it up in the symbol table, and if it's not there, a new entry is made (that process is called "interning" the symbol). This means that whenever you use a new symbol you get a new global variable (whose default value is Nil), which is usually not what you want, but I was too lazy to change this. The easiest way would probably be to add an Undefined alternative to the symbol_value type, so that the interpreter can give an error message whenever an undefined global variable is referenced.

We now come to the definition of compiled code:

    and lisp_code = Quote of lisp_expr

This is just a quoted expression, like 'a (which is syntactic sugar for (quote a)).

                  | Global of symbol

This is a reference to a global variable.

                  | Global_set of symbol * lisp_code

This is code for setting the value of a global variable, the Scheme syntax for which is (define name value).

                  | Var of int * int

This is a reference to a local variable, i.e., to a value in the current environment. The first of the two integers says how deep into the environment the interpreter must reach, while the second says which element of the resulting array is to be fetched. In the environment example above, to get to the value of a, we'd have to take the second array and from that the first element, hence the numbers would be 1 and 0 (since the first element has number 0).

                  | If of lisp_code * lisp_code * lisp_code

This is a simple conditional. The first lisp_code is the condition, the second is the code for the consequent and the third for the alternative.

                  | Application of lisp_code * (lisp_code array)

This is a function application.
The first lisp_code is supposed to evaluate to a function, i.e., either a builtin or a closure, and the array contains the code for the arguments.

                  | Build_closure of lisp_code

This, finally, is closure-generating code, i.e., a lambda expression. Again, it doesn't say how many arguments the closure is supposed to take, which is neither an oversight nor a feature, but merely the result of my laziness.

Now that we're through with the data structures, we're ready to discuss the interpreter, which consists of the classical functions eval and apply. I won't discuss the compiler because it's not as pretty as the interpreter. The only really "complicated" thing in the compiler is keeping track of local symbols, so that it knows where in the environment the interpreter has to look for a variable's value. Anyway, here's the interpreter:

    let rec eval x e =

eval takes a lisp_code x and an environment e in which to execute x. For a top-level expression, this environment is of course the empty list. eval returns the result of the execution of the code, which is a lisp_expr.

      match x with
        Quote q -> q

Quoting an expression is really easy - just return it.

      | Global { value = Value v } -> v

Referencing a global variable simply means getting its value.

      | Global_set (s, c) -> s.value <- Value (eval c e) ; Nil

Setting a global variable, on the other hand, means setting its value (and returning Nil). The value it is set to is the result of executing the corresponding code in the current environment.

      | Var (depth, index) -> (nth e depth).(index)

Since we know exactly where in the environment to look for a local variable, this is really easy. First we select the array out of the list, and then we select the element out of the array.

      | If (cond, cons, alt) -> if eval cond e == Nil then eval alt e else eval cons e

This is one of the two more complicated rules. First we evaluate the condition.
If it's Nil, i.e., false, we evaluate the alternative; otherwise, i.e., if it's true, we evaluate the consequent.

      | Application (f, args) -> apply (eval f e) (Array.map (function a -> eval a e) args)

Applying a function means first evaluating the function and all its arguments, and then calling apply, which performs the application once we know the function and the values of its arguments. We'll investigate apply below (it's very simple).

      | Build_closure c -> Closure (c, e)

Building a closure just means putting its code and the current environment into a Closure.

Here's apply, which takes a function (in the form of a lisp_expr) and an array of arguments:

    and apply f a =
      match f with
        Builtin b -> b a

The function can be a builtin, in which case we just use native OCaml function application.

      | Closure (c, e) -> eval c (a :: e)

The function can also be a closure, in which case we need to make a new environment for it first, which really only means putting the argument array in front of the closure's existing environment. Then we evaluate its code in this new environment.

      | _ -> raise Hell;;

Applying anything else (like an integer) is an error.

Download
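For readers more comfortable with Python, here is a free rendition of the same eval/apply core (my sketch, not the original: globals and symbols are left out, and builtins are plain Python functions):

```python
class Quote:            # a quoted constant
    def __init__(self, value): self.value = value

class Var:              # local variable: (depth into env, index in frame)
    def __init__(self, depth, index): self.depth, self.index = depth, index

class If:               # conditional
    def __init__(self, cond, cons, alt): self.cond, self.cons, self.alt = cond, cons, alt

class Application:      # function application
    def __init__(self, f, args): self.f, self.args = f, args

class BuildClosure:     # a lambda expression
    def __init__(self, code): self.code = code

class Closure:          # code plus the environment it was built in
    def __init__(self, code, env): self.code, self.env = code, env

NIL = None              # plays the role of Nil (and of false)

def eval_code(x, env):
    if isinstance(x, Quote):
        return x.value
    if isinstance(x, Var):                      # direct environment lookup
        return env[x.depth][x.index]
    if isinstance(x, If):                       # Nil is false, everything else true
        if eval_code(x.cond, env) is NIL:
            return eval_code(x.alt, env)
        return eval_code(x.cons, env)
    if isinstance(x, Application):              # evaluate function, then arguments
        return apply_fn(eval_code(x.f, env),
                        [eval_code(a, env) for a in x.args])
    if isinstance(x, BuildClosure):
        return Closure(x.code, env)
    raise ValueError("unknown code")

def apply_fn(f, args):
    if callable(f):                             # a builtin
        return f(args)
    if isinstance(f, Closure):                  # new frame in front of its environment
        return eval_code(f.code, [args] + f.env)
    raise TypeError("not a function")

# hand-compiled (((lambda (x) (lambda (y) (+ x y))) 1) 2):
add = lambda args: args[0] + args[1]
code = Application(
    Application(
        BuildClosure(BuildClosure(
            Application(Quote(add), [Var(1, 0), Var(0, 0)]))),
        [Quote(1)]),
    [Quote(2)])
```

Running eval_code(code, []) yields 3, with the inner body seeing the environment [[2], [1]], just as in the OCaml version.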