If DL is a Bubble – What Comes Next?

Driven by a rapid sequence of flashy successes in the wake of the deep learning (DL) breakthrough, gigantic amounts of money and effort are currently being invested in AI. The general feeling is that the few companies and countries that dominate in terms of data and resources are dividing the cake among themselves while all others are irretrievably left behind. This perspective rests on the expectation that DL is it, and that grand application fields such as autonomous driving, household or battlefield robots, and rapid digitization and automation are critically dependent on this technology. Its successes are indeed impressive. Machines now surpass humans in intellectual achievements such as chess or Go, natural language translation is rapidly improving, and autonomous driving is being demonstrated if not yet widely deployed. And the basic idea of DL is very convincing: a standard system is presented with a corpus of solved problems and is trained by a standard procedure until it reproduces the sample solutions and solves all similar problems.

Shortcomings of present AI

Here I would like to argue that, in spite of appearances, the true AI revolution has yet to arrive; that the present technology is a pedestrian approach; and that the spectacular achievements on the AI front are merely due to brute force overcoming fundamental shortcomings of the technology. Indeed, the field seems to be facing a singularity of its own kind, a singularity not of exploding achievement but of exploding effort for diminishing functional return. According to a recent MIT Technology Review report, the amount of processing power, electric energy and carbon footprint guzzled by leading-edge projects has risen by four orders of magnitude within just two years. Overall, the power consumption of AI-related computing is developing into a major item in the world's energy bill. And although the main point of AI ought to be replacing human intellectual effort by automated processes, investment in human resources in the field is also rising steeply, taxing the worldwide pool of IT talent to its limit, as reflected in the average and top salaries being paid. And finally, the hunger for human-derived sample data (the "oil of the age of AI") is creating its own set of serious problems.

All of this investment would be fine if it were just financing a one-time research effort. Physicists, in their quest to accelerate elementary particles to near the speed of light, are also willing to put up with their own kind of singularity of effort. But with the current technology, AI is not a one-time effort. Each new problem – or problem variant! – needs new attention, adaptation of the "standard" system and "standard" learning procedure and, above all, new data and expensive training runs. It is telling that a large part of the DeepMind team that created the winning Go system consisted of Go masters, probably for a good reason! In view of the varying demands of applications, the standard system is not so standard after all, its architecture needing to be adapted to the problem. (This adaptation of the architecture can be automated as a search process; that search process is, however, very computing-intensive; see the entry "transformer network with architecture search" in the MIT report.) Moreover, the neural kernels of present AI systems are rapidly being surrounded by growing algorithmic shells, for administering, selecting and presenting the training data and for fitting the neural system into the application environment. This algorithmic shell is needed to make up for fundamental shortcomings of the neural data format the field traditionally adheres to, see below. The effort of creating the algorithmic shell, the need to adapt the neural architecture, and the effort of collecting and curating training data are driving up the cost of AI in its present form.

The public as well as investors are apparently willing to put up with this. That would be fine if AI were really filling the bill. Let us take autonomous driving as a sample domain, a field into which industry is heavily investing at present. Traffic, especially in inner cities, is a jungle of situations, each needing specific behavior to be dealt with. Currently, systems have to be trained or programmed separately for thousands of specific traffic situations. This is not only very expensive and cumbersome, but an open-ended game, so that self-driving cars are bound to run unprepared into ever-new problem cases. Chris Urmson, former head of Google's self-driving car project (now Waymo) and currently CEO of his own company with the same goal, estimates that autonomous cars will come into their own only in 30 to 50 years, and according to Rodney Brooks, roboticist of world renown, AGI (artificial general intelligence) will not come to pass before the year 2200 or 2300.



What’s the Alternative?



Compare this to human drivers. They must also learn to deal with unusual situations. But for them it suffices to run into a single case of a particular type to be able to generalize to all analogous situations. This ability is based on comprehension of events in terms of abstract concepts. Thus, when a brick threatens to fall off a truck in front of you, you recognize it as a hard and heavy object threatening to hit your car and you attempt to steer your car clear of its likely trajectory. In short, you capture the situation and its imminent development in terms of an interacting array of conceptual elements. In our brain, each of these conceptual elements and descriptors of relations is abstract enough to apply to vast ranges of other situations in spite of variation in detail. This gives us the ability to comprehend and to respond to totally new situations with novel combinations of previously learned concepts and relations.

The example makes it clear that realizing AI requires matching the infinite creativity the world has in generating challenges with an equally rich and creative system, one able to comprehend situations in terms of conceptual constructs.



As soon as technology with that capability arrives on the scene, the present climate of investment in big data and software development will be exposed as a bubble.

How to unlock that potential



What does it take to unlock that potential? Why hasn't it happened yet? The problem can be traced back to two potent misconceptions that are currently blocking progress. All it takes to unleash a profound disruption is to put them out of the way. One of these misconceptions is the all-pervasive intelligent-design attitude; the other is the single-neuron dogma strictly adhered to by DL and its ilk.



The intelligent-design attitude is incarnated in direct algorithms: algorithms whose problem-solving idea originates in the brain of the human designer. The alternative is systems that let ideas arise by emergence (as they do, of course, in the brain). Such systems are realized in the computer by indirect algorithms, algorithms that merely set the stage for the emergence of the ideas proper: they define data elements and selective mechanisms that arrange those elements into ordered patterns. Indirect algorithms have been studied for a long time, under names such as evolutionary or genetic programming, reinforcement learning and, yes, deep learning! But something is evidently still missing. Emergence can be seen as constrained search, where the search space is defined by the data structure of the system. This search space can be likened to a fishing net: to catch the fish (the cognitive structure solving the problem at hand) the net must be large enough, but if the net is too large the solution is never found. Using bits as data elements, the most general choice possible, leads to endless search in combinatorial spaces (the nemesis of classical AI). Using neurons while sticking to the single-neuron dogma, on the other hand, makes the net too narrow. (The success of AlphaGo is due to a combination of a problem-adapted data structure and lavish computing power.)
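As an illustrative sketch (my own example, not from the text), a minimal genetic algorithm makes the notion of an indirect algorithm concrete: the program defines only the data elements (bit strings) and the selective mechanisms (fitness, selection, crossover, mutation); the solution itself emerges from the search. The OneMax task and all parameter values below are arbitrary toy choices.

```python
import random

def evolve(length=20, pop_size=30, generations=200, seed=0):
    """Minimal genetic algorithm, an 'indirect' algorithm: the code sets
    the stage (data elements plus selective mechanisms) and the ordered
    pattern, here a string of all 1-bits, emerges rather than being
    directly constructed."""
    rng = random.Random(seed)
    fitness = sum  # OneMax toy problem: maximize the number of 1-bits
    # Data elements: a population of random bit strings.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        best = max(pop, key=fitness)
        if fitness(best) == length:
            break  # optimum found
        def pick():  # selective mechanism: tournament selection of size 2
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = [best[:]]  # elitism: carry the current best over unchanged
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]   # one-point crossover
            if rng.random() < 0.3:        # occasional single-bit mutation
                i = rng.randrange(length)
                child[i] ^= 1
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve()
print(sum(best))  # typically reaches the optimum well before 200 generations
```

Nothing in the code says how to build a good bit string; the combination of variation and selection finds it, which is exactly the sense in which the designer's idea is replaced by emergence.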



What is wrong with the current neural nets? Neurons as data elements, as atoms of meaning, correspond to what we find experimentally in the brain and are thus certainly a good start. But the single-neuron dogma goes too far by insisting that for any decision to be made there must be a single neuron dedicated to that decision. If the decision is recognizing a cat, it needs a cat neuron. So far so good. But if the decision is "hit the brake if a brick is threatening your car", it needs a "hitting the brake if a brick threatens your car" neuron, and so on! To avoid this absurdity, current systems limit the neural kernel to the representation of what is called a bag of features, while for the representation of whole situations they rely on a shell of hand-crafted direct algorithms: voilà, the pedestrian solution again!

The brain does it differently. It has a system of interactions that gives neurons the ability to dynamically form structured connectivity patterns at all levels of complexity, up to the representation of the whole scene and its conceptual interpretation! And these structures emerge in the brain spontaneously, by network self-organization, totally without intelligent design.
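As a toy illustration of network self-organization (again my own sketch, not from the text), a single linear neuron trained with Oja's Hebbian rule develops structured connectivity purely from input correlations, with no wiring designed in. The input layout and parameters are arbitrary illustrative choices.

```python
import random

def self_organize(steps=3000, lr=0.01, seed=1):
    """One linear neuron with 6 inputs, learning by Oja's rule (a
    Hebbian rule with built-in weight normalization). Inputs 0-2
    always fire together; inputs 3-5 are independent noise. The
    weight pattern organizes itself around the correlated group."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in range(6)]  # unstructured start
    for _ in range(steps):
        g = rng.choice([-1.0, 1.0])  # shared signal driving inputs 0-2
        x = [g, g, g] + [rng.choice([-1.0, 1.0]) for _ in range(3)]
        y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
        # Oja's rule: Hebbian growth (y * x) minus a decay term (y^2 * w)
        # that keeps the weight vector bounded.
        w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

w = self_organize()
# Weights onto the correlated inputs grow large (about +-1/sqrt(3));
# weights onto the noise inputs decay toward zero.
```

The point of the sketch is that the connectivity structure (strong weights on the coherent group, weak weights elsewhere) is nowhere specified in the code; it emerges from the statistics of the input, a miniature of the self-organization argued for above.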



What can be done?



To realize this in electronics is the next big thing beyond DL. To get there means crossing cultural divides. The necessary ideas have long since been formed within their own scientific or engineering traditions, but in order to come to fruition they need to be merged into one conceptual whole. Let me just name a few of those traditions: emergence in general (but studied specifically also as network self-organization in the ontogenesis of the nervous system), schema-based understanding (philosophy, psychology, cognitive science), compositionality (linguistics, cognitive science), the Gestalt phenomenon (psychology), recognition-by-components (visual psychophysics), indirect algorithms (declarative languages of computer science, neural networks), and neuromorphic computing (information technology).



Overcoming cultural divides just doesn't happen within the routines of research funding agencies or big companies, and it cannot be forced by government-initiated mega-projects.

What is needed instead is a small team and a few decisive ingredients: vision, a few highly motivated talents, and a modest amount of funding.

This is best organized as a start-up company with the mandate of creating a rapid sequence of proofs of principle and a growing prototype. What is required is not a massive effort but conceptual coherence, which can be attained only in a team of no more than a dozen members.



As argued elsewhere, an artificial vision system, created by training in a virtual-reality environment, would demonstrate the possibility and style of intelligence in silico. Our children demonstrate that it can be done. All it takes to bring computer vision to life is a homogeneous data-and-process architecture as the basis for merging re-implementations of the numerous functional components developed over the decades into one coherent whole. Immediate applications abound, among them autonomous driving and household robots.



A Matter of a Trillion Euros



A convincing demonstration of AI that deserves the name will trigger a tsunami, creating a totally new field of technology. The economic impact cannot be overstated. Digitization means, to a large extent, turning our smartphones, homes, cars, cities, corporations, administrations and traffic systems into organisms. The potential in terms of cost reduction is enormous (the state of Estonia estimates that digitizing its administration saves 3% of GDP, enough to pay for the country's military defense). With the present technology the cost of this transformation would be prohibitive, but if the process is automated, so that human involvement is reduced to setting general goals, it will be a breeze. We are speaking of an economic potential worth trillions of Euros, unleashed by an original investment of a few million.

