The domain in which I did my research in cognitive science is the Bongard Problems. These are problems on visual pattern recognition that appeared first in the appendix of a book published by the Russian scientist M. M. Bongard in 1967, in what was then the USSR. They became more widely known to the western world when D. R. Hofstadter mentioned them in his book, “Goedel, Escher, Bach: an Eternal Golden Braid”, and speculated on how an automated system could be built to solve such problems. Rather than tiring the reader with words, I prefer to show what Bongard Problems (BP’s) are by presenting a — rather trivial — BP, below.

There are six boxes on the left, and another six on the right. The ones on the left conform to a pattern, or rule, that describes them, and the task of the problem-solver is to find this pattern, or rule. As an aid, the six boxes on the right do not conform to the same pattern. Sometimes they conform to a different pattern (as is the case in the above example), while in some cases the rule that describes them is simply the negation of the rule on the left. (Thus, the six boxes on the left are the ones that, somehow, have the official status of defining the pattern.) There is nothing magic about the number six — it’s just that Bongard decided to provide six examples and six counter-examples for each of his 100 problems in the original publication (see link for references at bottom) and so BP’s are normally designed with 12 boxes thereafter.

As mentioned, the above example is trivial. Not all BP’s are so easy, though. Let us see a problem of medium difficulty:



Did it take you more than a few seconds to see what is going on? If your answer is “yes”, you are probably a “regular” BP-solver, like me. If it took you less than five seconds, though, you probably belong to a small percentage of people who have a knack at solving such problems. But don’t hurry to celebrate, yet! I don’t mean you are the most intelligent person in general that ever walked on Earth! Maybe you are — I don’t know — but clearly, solving BP’s fast is evidence for only some specific aspects of human intelligence. Also, you don’t need to have a degree in higher education to solve such problems; my own research has shown that even otherwise average people can sometimes show a surprisingly excellent ability for solving BP’s. (See the link to my dissertation, near the end of this page.)

Note, please, that if you want to see more BP’s that I (arbitrarily) characterize as belonging to a medium level of difficulty, you may click on the example above. The same remark holds for all the other BP’s shown on this page. You can also check the link to the BP-index, at the bottom of this page.

Now, not all problems can be handled so easily. When M. M. Bongard published his book he included 100 BP’s in the appendix, some of which were rather hard. Later, D. R. Hofstadter engaged himself in a brief period of BP-design mania, during which he created 56 new problems (still called “BP’s”, but available in print only through CRCC — his research center; even better, check my index to all BP’s, or my own appendix to my dissertation, below), some of which I never managed to solve. (I mean I have spent more than half an hour, or over one hour in some cases, and still did not come up with a solution.) Want to try your teeth at some mind-bogglers? Watch this:

Click on the picture to see more. Although Bongard included the answers at the end of his appendix, Hofstadter did not, so the ones I could not handle belong to Hofstadter’s collection. Now, those few readers who happen to know that I did my work under Hofstadter’s research group might wonder: “Why didn’t you ask him, then?!” Well, I found out he never wrote the answers down, and when he looked at one of his problems ( this one) he managed to solve it as if it was somebody else’s, apparently having forgotten all about it after he created them, more than two decades earlier. So I adopted a better strategy: after having this page on the web since 12/1999, I received an avalanche of answers to all problems! Thanks to everyone for the solutions!

Parenthetically, I would like to announce here that the following people have sent me their answers, timing information, and comments on various Bongard problems:

Linda Snow Timothy Reed Bryan Bergert Darius Bacon Annemieke van de Visse Robert Wasierski Priscila Farias Gordon Rios Laurent Mazuré Joel den Boer Philip Meier Ryan Ziegler Blake Hodgetts David Ashley Frank Hofmann Wim Van Laer Michael Roberts Christophe Pernaud Ryan R Newton Joseph Insana Peter Shanahan Glenn Faubert David Sunderland Alison Frane Kevin Jackson-Mead Matt Ryan Kerin Schiesser Claudia Mastroianni Erich Cranor Barbara Bartlett Linda Forman Alex Fink Seo Sanghyeon Facundo Aguero Frédéric Faucheux Andreas Gunnarsson Alain Riccobene Damien R. Sullivan Matthew Hales Robert BL ...For the continuation of this table, please click here...

I am very grateful to all of you for your extremely valuable contributions.

My research task was of course not to solve such problems (! !), but to write a program that would be able to do so. Readers not familiar with programming might think this must be easy, probably misled by the apparent effortlessness with which we handle at least the easiest BP’s. Other, computer-savvy readers might think it is possible to write such a program, but then, what do we gain out of it? Here is my basic premise:

I see Bongard problems not as a mere collection of cute visual puzzles, but as a gateway that allows us to get a glimpse at the foundations of cognition: a set of principles that are as fundamental for cognitive science as Newton’s laws are for physics. Bongard problems themselves are solved primarily as a consequence of following some fundamental principles of cognition.

Meanwhile, I gave a name to my program. I called it Phaeaco. Click on this link to see where this weird name comes from. (Warning: for your entertainment only; no deep cognitive science issues here.)

Before proceeding, let me mention that Phaeaco’s input consists of raw bit-maps, i.e., a Bongard problem is a collection of 12 square black-and-white boxes, each of which has a resolution of 100 pixels per side (just as you see them on this page). There is no other high level information included in the input, such as lines, intersection points, etc. — just pixels. I imposed this form on the input for several reasons:

to be able to get my hands dirty with low level image-processing (okay, just a tad of dirt, no big deal), to have the ability to simply scan the original (Bongard / Hofstadter) problems and feed them to Phaeaco; or just draw problems of my own on a sheet of paper and feed them in, too, with no further manual preprocessing, to introduce “noise” in the input (*) . Finally, and most importantly, to avoid the cheating which is unavoidable if the input is specified in some formulaic way, for example, through first-order logic formulas. By the way, this is the input specification preferred in RF4 ( Saito and Nakano, 1993). Follow this link to see why, in my opinion, first-order logic as input-specification entails covert cheating.

Starting at the level of individual pixels, however, is the least important of the cognitive issues involved. Some other issues of major importance include:

Creating and utilizing a visual representation scheme (*) on the fly. This means that Phaeaco constructs internal structures that represent the input given to it, in some abstract but invariant form. “Invariant” here means that although the input might show triangles in various sizes, orientations, locations, and so on, Phaeaco manages to activate a fixed, invariant internal structure that represents the concept “triangle”. The creation of such conceptual representations is performed “on the fly”, meaning that Phaeaco initially knows nothing about triangles; but, if presented with a small number of them, it gradually creates the internal pattern that represents the concept, generalizing it as more examples appear. Phaeaco’s representational scheme facilitates pattern representation, pattern matching, and cognitive categorization, among other things. (*) Possessing a long term memory and being able to recall similar visual patterns that have been encountered in the past, learning new patterns and new approaches to solutions to problems in the process. In particular, Phaeaco includes a tutoring (non-problem-solving) session where the program’s mentor (a human interactor) is in a position to teach visual patterns which can be used later for solving BP’s (see this page for more on Phaeaco’s Mentor section). Being able to abstract and generalize, and see known and familiar shapes as if they were something else. For example, to see an elongated ellipse as one of the two pillars of an arch; or a large square with a small notch as if it were simply a square with a notch, instead of a weird polygonal shape; or a collection of dots arranged in a circle as if they were a circle; and so on.

A program that could handle all of the above cognitive issues would be much more powerful and cognitively interesting than a program that “merely” solves BP’s. At an earlier stage in its development, my project included also a linguistic component, one that could allow Phaeaco to learn words and phrases for its visual patterns, and use them for expressing the solutions. Later, I had to abandon this direction, after receiving advice from my research supervisors, else my project would never end. I have now postponed the linguistic component (which I regard as extremely important, not for Bongard problems, but for human cognition in general) to a future time of pursuing my research interests.

Download my dissertation: In May, 2006, I completed my dissertation, and in the following month I graduated from Indiana University. For those of you who would like to take a look at my dissertation, here it is (click on the PDF icon):

The above PDF file is 14 MB long. My dissertation is written in a down-to-earth way: there is no “academese”, technical, or other obscure language in it. I hope that, if you are interested in cognition, it makes for a pleasant reading.

If you reached this point and wish to learn more on my doctoral research, you may select one of the following links:

Interface of Phaeaco and program information.

Current state of performance: see how many and which problems Phaeaco can solve, and how fast.

Phaeaco’s source code : not provided, reason largely implicit in the “important last note”, below.

Index of all Bongard problems known to the author.

Some things that are not allowed in BP’s

Design your own Bongard Problems!

Fundamental principles of cognition: I tried to distill some important parts of my dissertation in this text.

Virtual Dialog: a BP-related science-fictionish text I wrote for commemorating the 50th anniversary from Turing’s 1950 paper (the “Turing test”)

Bibliography and related web links





See also:

Simplified image-processing algorithms (general, not directly related to BP’s)



Important last note: In January, 2008, I wrote and uploaded a web page with the title:

Why I stopped working on the Bongard problems

The short answer is, I got scared of some ethical issues involved in this line of research. Click on the above link to learn what those ethical issues are and read the long answer.

Footnotes (clicking on the caret (^) on the left of the footnote brings back to the text) :

(^) At a first glance, one might think there is no noise in the input at all, since the pixels can be either black or white, with no other colors or gray scale values allowed. At the level of individual pixels it is true that there is no noise. However, at every other level there is only noise to be found: Straight lines are never quite straight (e.g., if drawn by hand, or due to low scanning resolution), so the program has to figure out when a line should be seen as straight; lines can have varying width, which should normally be ignored, unless it gets too thick, so we have a specially-featured line; small filled circles are often mere irregular blobs of collections of pixels; bigger circles might look like polygons, or like no circles at all; lines may not meet perfectly at vertex points, or end perfectly at end-points (little serifs may appear), or they may have interruptions; erratic pixels may appear here and there — once again due to scanning noise; and so on. Yet, in all these cases our human visual system has no trouble recognizing the pattern, or shape, that “should be” there, because we are usually guided by top-down forces. My program should be able to do that, too.

(^) What is a representational scheme? Well, when we see a triangle, and remember it later, what sort of information is stored in our minds? Pixel-based image? A mathematical-propositional abstraction? Nothing that can be described at a higher level, just neurons that fire in synchrony, responding to similar inputs? I believe, none of the three ideas is exactly right. My representational scheme holds a middle ground, between symbolic and neuronal computing.