What is AI?

Artificial Intelligence (AI) impacts almost every aspect of our lives, yet it often goes unnoticed. Search engines, automatic translation and speech recognition are all examples of applications that would not have been possible without recent advances in AI techniques. But what is an AI?

Reinforcement Learning in a nutshell

An AI agent is a computer program designed to solve a certain task without being told explicitly how to do so. Instead, the agent learns how to solve the task by observing an expert, or by being rewarded when it behaves appropriately. The latter approach is called Reinforcement Learning (RL): the agent learns only from a numerical reward signal that the environment sends in response to the agent’s actions. If its actions successfully perform the task at hand, the agent receives a reward. If not, it receives negative feedback. RL is the preferred approach when there is no expert to show us the solution. It is often compared to training an animal by feeding it when it obeys verbal commands.

RL is a powerful tool because it enables the machine to learn a task that we are not capable of performing ourselves. The catch is that, with almost no prior knowledge of the problem, RL algorithms tend to be slow. They typically need to perform many actions before they learn a good behavior, because they must also discover how the environment reacts to their actions.
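To make the reward-driven loop concrete, here is a minimal sketch in Python. It assumes a toy environment with two actions whose success probabilities (0.2 and 0.8) are hidden from the agent, and it uses a simple epsilon-greedy rule. The function name, the reward values of +1 and -1, and all parameters are illustrative choices, not taken from any particular library.

```python
import random

def run_epsilon_greedy(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn which action pays off most often using only the reward signal."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions        # how many times each action was tried
    estimates = [0.0] * n_actions   # running average reward per action
    total_reward = 0.0
    for _ in range(steps):
        # Occasionally try a random action, otherwise pick the best estimate so far.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])
        # The environment answers with a number: +1 on success, -1 otherwise.
        reward = 1.0 if rng.random() < reward_probs[action] else -1.0
        # Incremental update of the running average for the chosen action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward

# Hypothetical toy problem: the agent never sees these probabilities directly.
estimates, counts, total_reward = run_epsilon_greedy([0.2, 0.8])
print(estimates, counts, total_reward)
```

The agent starts with no idea which action is better and has to spend some of its actions finding out, which is the slowness mentioned above.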

Bayesian RL

Bayesian Reinforcement Learning (BRL) is a subfield of RL in which some knowledge about the environment is available in advance. BRL algorithms explicitly encode their knowledge of the environment in a probability distribution, called a belief.

After each observation, most of these algorithms refine that belief to reflect what they learnt from it, enabling them to make better-informed decisions in the future. By explicitly stating their belief, and their uncertainty about it, BRL algorithms are able to better balance exploration actions (meant to learn more about the environment) and exploitation actions (meant to maximise immediate rewards according to the current belief).
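The idea of a belief that is refined after each observation can be illustrated with a deliberately simple case. The sketch below assumes a two-action problem with success/failure rewards, keeps one Beta distribution per action as the belief, and chooses actions by Thompson sampling. This is a textbook technique picked for brevity, not one of the algorithms studied in the paper, and every name and parameter here is an illustrative assumption.

```python
import random

def thompson_sampling(reward_probs, steps=2000, prior=(1.0, 1.0), seed=0):
    """Keep a Beta belief per action and act by sampling from it."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    # Belief: Beta(alpha, beta) over each action's unknown success probability.
    # The prior encodes what is known in advance; (1, 1) means "no idea".
    alphas = [prior[0]] * n_actions
    betas = [prior[1]] * n_actions
    pulls = [0] * n_actions
    for _ in range(steps):
        # Draw one plausible success probability per action from the belief,
        # then act as if that draw were true. Uncertain actions sometimes
        # draw high values, which is what drives exploration.
        samples = [rng.betavariate(a, b) for a, b in zip(alphas, betas)]
        action = max(range(n_actions), key=lambda i: samples[i])
        success = rng.random() < reward_probs[action]
        # Bayesian update: a success raises alpha, a failure raises beta.
        if success:
            alphas[action] += 1.0
        else:
            betas[action] += 1.0
        pulls[action] += 1
    posterior_means = [a / (a + b) for a, b in zip(alphas, betas)]
    return posterior_means, pulls

# Hypothetical toy problem with hidden success probabilities 0.2 and 0.8.
posterior_means, pulls = thompson_sampling([0.2, 0.8])
print(posterior_means, pulls)
```

As the belief about an action sharpens, its samples stop varying much, so the agent gradually shifts from exploring to exploiting without any separate exploration parameter.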

Our new contribution

In other fields of AI, such as supervised learning, innovation has been fostered by public benchmarks, well-established test protocols and free implementations of popular algorithms, which together allow the empirical validation of any new algorithm. In BRL, these elements for defining and measuring progress do not exist. Instead, each new algorithm is usually tested only on a few arbitrarily selected test problems, and the code is often unavailable or not maintained. This makes it hard for anyone to know which BRL algorithm is currently the best, since no two papers share the same implementation of a test problem or the same performance metric. Additionally, because there is no free library gathering standard BRL algorithms, researchers often need to re-implement existing BRL algorithms to benchmark their new contributions. This has a prohibitive time cost and is prone to implementation errors. In such a context, how could the BRL community easily achieve measurable progress?

This is the problem that motivated our latest paper, entitled “Benchmarking for Bayesian Reinforcement Learning”. In this work, we offer two main contributions. First, we designed a BRL comparison methodology that fairly compares BRL algorithms on large sets of problems, in order to address the issues discussed above. Computation time constraints are also part of the protocol. The second contribution is an open source library: Benchmarking tools for Bayesian Reinforcement Learning (BBRL). Rather than asking BRL researchers to implement all the algorithms, along with the benchmark itself, we decided to make our code available to the community. Our library already contains many test problems, along with most of the state-of-the-art RL algorithms. It also provides tools for detailed computation time analysis, to address the large disparity in time requirements of BRL algorithms. Users are encouraged to add their own favorite algorithm to see how it competes with other popular algorithms. By using this library for their work, any BRL scientist can now easily make their results reproducible and open to scrutiny. The code is available, with complete documentation, on GitHub: https://github.com/mcastron/BBRL/
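The general shape of such a comparison (same set of problems for every algorithm, with both score and computation time recorded) can be sketched in a few lines of Python. This is only an illustration of the idea under simplifying assumptions: it does not use or mirror the BBRL library's API, and the agents, the toy problem generator and all names below are hypothetical.

```python
import random
import time

class RandomAgent:
    """Baseline that ignores rewards and picks actions uniformly at random."""
    def __init__(self, n_actions, rng):
        self.n_actions, self.rng = n_actions, rng
    def act(self):
        return self.rng.randrange(self.n_actions)
    def observe(self, action, reward):
        pass  # never learns

class GreedyAgent:
    """Tries each action once, then always picks the best running average."""
    def __init__(self, n_actions, rng):
        self.counts = [0] * n_actions
        self.means = [0.0] * n_actions
    def act(self):
        for action, count in enumerate(self.counts):
            if count == 0:
                return action
        return max(range(len(self.means)), key=lambda a: self.means[a])
    def observe(self, action, reward):
        self.counts[action] += 1
        self.means[action] += (reward - self.means[action]) / self.counts[action]

def benchmark(agent_classes, n_problems=500, n_actions=3, horizon=100, seed=0):
    """Run every agent on the same sampled problems; report score and time."""
    problem_rng = random.Random(seed)
    # Each toy problem is a list of hidden success probabilities, one per action.
    problems = [[problem_rng.random() for _ in range(n_actions)]
                for _ in range(n_problems)]
    results = {}
    for agent_cls in agent_classes:
        rng = random.Random(seed + 1)  # fresh generator, same seed for each agent
        start = time.perf_counter()
        total = 0.0
        for probs in problems:
            agent = agent_cls(n_actions, rng)
            for _ in range(horizon):
                action = agent.act()
                reward = 1.0 if rng.random() < probs[action] else 0.0
                agent.observe(action, reward)
                total += reward
        results[agent_cls.__name__] = {
            "mean_return": total / n_problems,
            "seconds": time.perf_counter() - start,
        }
    return results

results = benchmark([RandomAgent, GreedyAgent])
for name, stats in results.items():
    print(name, stats)
```

Reporting the time next to the score matters because an algorithm that scores slightly higher may need far more computation to do so.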

In our paper, we demonstrate the potential of the BBRL library through a thorough analysis of all currently available algorithms and test problems. This work led to many insights, such as which algorithms are the most vulnerable to inaccuracy in the prior knowledge, or which algorithms benefit the most from more online computation time.