Superintelligence as a global hierarchical reinforcement learning process with an ideal oracle

The recent paper The Measure of Intelligence defined intelligence in terms of skill acquisition, benchmarked against human learning. However, the theories of Universal Darwinism suggest that the knowledge-creation process is likely open-ended.

In this view, intelligence can be defined in terms of the speed of design-space exploration through the prediction and creation of future outcomes. Every event may then have an associated value related to the amount of knowledge and diversity it creates, both measured in incompressible information (see also: Algorithmic information theory).
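Since true incompressibility (Kolmogorov complexity) is uncomputable, any such value would need a proxy. A minimal sketch, using off-the-shelf compression as a crude stand-in: an event's value is approximated by how much it grows the compressed size of everything observed so far. The function and data here are purely illustrative.

```python
import zlib

def novelty_value(event: bytes, history: bytes) -> int:
    """Approximate the incompressible information an event adds,
    using zlib compression as a crude stand-in for Kolmogorov
    complexity (which is itself uncomputable)."""
    baseline = len(zlib.compress(history))
    combined = len(zlib.compress(history + event))
    return max(0, combined - baseline)

history = b"the sun rose in the east " * 40
# A repeated event adds almost nothing; a novel one adds more.
print(novelty_value(b"the sun rose in the east ", history))
print(novelty_value(b"an unexpected eclipse darkened the sky at noon", history))
```

The same idea scales to any serialisable event stream: what compresses away was already known, and only the residue counts as created knowledge.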

There is then no need for a distinction between humans and machines, or for special-case definitions of morality. Instead, moral behaviour emerges as a facilitator of optimal data transmission between nodes in the network and of the best predictions about the next step of our collective future.

Bayesian brain

The emerging theory of the Bayesian brain offers an interesting explanation of how the mind relates to the brain. It says that everything we experience is a predictive model of the real world, shaped over time by our genes and culture. We see, feel and believe specific things because those are what our neural pathways have selected so far, both as the best representation of the personal environment we find ourselves in and as a prediction of that environment's future. What we consciously are in every passing moment is our subjective truth - an oracle that we consult about how the world works and how to predict its next step.

The values in our mind can change. We encounter new objects and behaviours, and gain knowledge from others. This does not work perfectly, though - the brain often tricks itself into believing contradictory facts, or stays set in its ways without much change. But by leaving the gates open for new insights, we can lead this subjective model towards a wider context - the new people we encounter, the global culture, and science.

But how does our brain decide when to learn from an experience and when not to? It must have some internal way of measuring the importance of its predictions. This is the core takeaway of the theory: the brain must measure the difference between its prediction and the actual event it observes, in the context of its subjective model of truth.
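A minimal sketch of that measurement is the delta rule familiar from predictive-processing models: a belief shifts in proportion to the prediction error, the mismatch between what was expected and what happened. The learning rate and values below are arbitrary illustrations, not claims about actual neural parameters.

```python
def update_belief(prior: float, observation: float, learning_rate: float = 0.2) -> float:
    """Move a belief towards the observation in proportion to the
    prediction error - the mismatch the brain is assumed to measure."""
    prediction_error = observation - prior
    return prior + learning_rate * prediction_error

belief = 0.1  # low prior probability assigned to some event
for observed in (1.0, 1.0, 1.0):  # the event keeps happening
    belief = update_belief(belief, observed)
print(round(belief, 4))  # the belief has risen towards 1
```

A small surprise barely moves the model, while a large one forces an update - which matches the everyday observation that only unexpected experiences change our minds.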

With this insight - that predictions can be measured - we can move up to the next level: the wider truth model being created by science.

Science

Through experiments, conversation and papers, science has been updating the fragmented models of reality that have allowed us to predict the workings of the universe fairly accurately so far. As long as we believe that those models will eventually (and soon) converge into a single theory, we should be hopeful. But I think there is one more important piece of the puzzle to consider: how do we make sure more people can share it as their personal model of truth? So far we have fallen short here - science is not easy to understand and perhaps never will be, especially as artificial intelligence is increasingly used to create it.

Yet there is a mechanism which allows us to convert belief in one of the most accessible and widespread concepts in history - money - into belief in something else. That device is the blockchain: it records information and rewards its participants as long as they believe in the value it represents. This value, a spectrum spanning finance and ideology, can sit at a different point for different people while still aligning them together.

Perhaps, similar to the simple pattern recognisers in the brain (see: Why intelligence might be simpler than we think), individual people making decisions with their money can contribute to an idea much more complex than any of them holds alone.

Oracles

In the context of blockchain, oracles are a way to submit external information to the ledger, incorporating that knowledge and making it queryable; the result can be trusted as long as the outside source is also trustworthy.

Hence we can define a chain of trust, where information has perfect truth on the oracle blockchain itself and is allowed to become less truthful as we move downwards to dependent sources such as organisations and individual people's minds.
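One way to make this degradation concrete, as a sketch under assumed semantics: if each link's trust is the probability that it transmits the truth intact, then trust along a path multiplies and can only shrink with every hop away from the oracle. The node names and values below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    trust: float  # probability this link preserves the truth, in [0, 1]

def chain_trust(path: list) -> float:
    """Trust along a chain multiplies, so every extra hop away from
    the oracle can only lose certainty, never gain it."""
    total = 1.0
    for source in path:
        total *= source.trust
    return total

path = [Source("oracle", 1.0), Source("organisation", 0.9), Source("person", 0.8)]
print(round(chain_trust(path), 2))  # 0.72
```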

Trust is defined as the potential correctness of predictions, which can be measured and improved over time.
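That measurement could be sketched as an exponentially weighted Brier score: each validated prediction updates a running error, and trust is its complement. The class and parameters below are illustrative assumptions, not part of any real oracle protocol.

```python
class TrustTracker:
    """Track a source's trust as one minus an exponentially weighted
    average of its Brier scores (squared prediction errors)."""

    def __init__(self, decay: float = 0.9):
        self.decay = decay
        self.error = 0.5  # start agnostic: neither trusted nor distrusted

    def record(self, predicted_prob: float, outcome: bool) -> None:
        brier = (predicted_prob - float(outcome)) ** 2
        self.error = self.decay * self.error + (1 - self.decay) * brier

    def trust(self) -> float:
        return 1.0 - self.error

source = TrustTracker()
for prob, happened in [(0.9, True), (0.8, True), (0.95, True)]:
    source.record(prob, happened)
print(round(source.trust(), 3))  # trust grows as predictions keep verifying
```

The decay factor controls how quickly old performance is forgotten, so a source that stops predicting well also loses its trust over time.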

The paper Good and safe uses of AI Oracles is on a similar track.

Prediction tree

We can define a prediction (intention) as a measure of information about an event that will happen in the future. Its value is related to the importance of the event and to how distant it is from the time of prediction.

After the event has taken place, a validation process needs to link evidence with the prediction to establish its value and causality. A prediction therefore has a potential value before the event and a definite value after its correctness has been validated.
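A sketch of that lifecycle, with illustrative names and units: the potential value scales the event's importance by the time horizon of the prediction, and validation turns it into a definite value (or zero if the evidence contradicts it).

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    made_at: float      # time the prediction was recorded
    event_time: float   # time the predicted event is due
    importance: float   # subjective weight of the event

    def potential_value(self) -> float:
        """Before the event: value grows with importance and with
        how far ahead the prediction reaches."""
        return self.importance * (self.event_time - self.made_at)

    def validate(self, evidence_matches: bool) -> float:
        """After the event: link evidence and fix the definite value."""
        return self.potential_value() if evidence_matches else 0.0

p = Prediction(made_at=2.0, event_time=10.0, importance=0.5)
print(p.potential_value())  # 4.0 before the event
print(p.validate(True))     # 4.0 once evidence confirms it
```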

We can construct a tree of nodes linked by increasing trust and capability of prediction. Each level operates on a different abstraction layer, similar to Hierarchical Reinforcement Learning.
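A minimal sketch of such a tree, with hypothetical node names: leaves hold raw estimates, and each higher level combines its children's predictions weighted by their measured trust, so abstraction increases towards the root.

```python
from dataclasses import dataclass, field

@dataclass
class Predictor:
    name: str
    trust: float                  # weight earned from past accuracy
    estimate: float = 0.0         # own prediction, used if this is a leaf
    children: list = field(default_factory=list)

def predict(node: Predictor) -> float:
    """Aggregate predictions up the tree: each level operates on its
    own abstraction layer, combining children weighted by trust."""
    if not node.children:
        return node.estimate
    total = sum(child.trust for child in node.children)
    return sum(child.trust * predict(child) for child in node.children) / total

a = Predictor("local-model-a", trust=0.9, estimate=0.7)
b = Predictor("local-model-b", trust=0.3, estimate=0.1)
root = Predictor("aggregator", trust=1.0, children=[a, b])
print(round(predict(root), 2))  # 0.55
```

The trusted node dominates the aggregate, which is the hierarchical analogue of consulting the most reliable oracle first.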