If you are only interested in watching the results of the simulation, you can skip to the end of this article.

In the most abstract sense, markets are known to be self-organizing through supply and demand. What happens if we train an AI to compete in a virtual market with 1000 customers, 100 companies and a specific set of rules that simulate a simplified real-world market? Well, with a little bit of reinforcement learning magic we were able to create such an AI, and the results are fascinating to say the least.

THE SETUP

The first step was to create a virtual market for the trading to take place, which we built in Unity. The market is populated by 1000 customers, 100 companies and an AI company. A set of predefined rules ties everything together.

Each regular company starts with a cash balance of $100,000, a random price for the product it sells, and an “investing tendency”. We define investing tendency as the number of products the company orders every time it sells out. The larger the order, the lower the production cost per unit and therefore the greater the potential profit. The risk of a high investing tendency is that the company might run out of cash before it manages to sell all of its stock, in which case it goes bankrupt. A bankrupt company is removed from the market.
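To make the company rules concrete, here is a minimal Python sketch (the project itself runs in Unity, so this is an illustration, not the original code). The starting balance comes from the description above. The cost curve in `unit_cost`, the default price, and the exact bankruptcy trigger (the order is paid up front and the company cannot afford it) are assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class Company:
    balance: float = 100_000.0     # starting cash, as stated in the article
    price: float = 150.0           # random in the article; fixed here for clarity
    investing_tendency: int = 100  # units ordered on every restock
    stock: int = 0
    bankrupt: bool = False

    def unit_cost(self) -> float:
        # Hypothetical cost curve: each extra unit ordered shaves $0.25 off
        # the per-unit cost, down to a floor of $60.
        return max(60.0, 120.0 - 0.25 * self.investing_tendency)

    def restock(self) -> None:
        # Assumption: the whole order is paid up front, and a company that
        # cannot afford its order goes bankrupt.
        order_cost = self.unit_cost() * self.investing_tendency
        if order_cost > self.balance:
            self.bankrupt = True
            return
        self.balance -= order_cost
        self.stock += self.investing_tendency

    def sell_one(self) -> None:
        # Sell a single unit and reorder as soon as the shelf is empty.
        if self.bankrupt or self.stock <= 0:
            raise ValueError("nothing to sell")
        self.stock -= 1
        self.balance += self.price
        if self.stock == 0:
            self.restock()
```

With these made-up numbers, an order of 100 units costs $95 per unit, while an order of 2000 units would cost only $60 per unit but exceeds the starting balance, which illustrates the trade-off between cheaper production and bankruptcy risk.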

Each customer starts with a random Maximum Purchase Price, which is the most the customer is willing to pay for the product. The customer also has a Current Purchase Price, which is the amount they are willing to pay for the product today. This price is set equal to the product price every time a trade takes place. The goal of every customer is to buy one product by the end of each day. If a few days pass without a purchase, the customer is considered unfit for this market and is removed.
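The customer rules can be sketched in the same spirit. This is a hedged Python illustration, not the project's Unity code. The field names are hypothetical, the purchase test (buy when the product price does not exceed the Current Purchase Price) is our reading of the rule, and `patience_days=3` stands in for the unspecified "few days".

```python
from dataclasses import dataclass


@dataclass
class Customer:
    max_price: float                # Maximum Purchase Price
    current_price: float            # Current Purchase Price (today's limit)
    days_without_purchase: int = 0

    def will_buy(self, product_price: float) -> bool:
        # Assumed rule: a trade happens when the asking price is within
        # what the customer is willing to pay today.
        return product_price <= self.current_price

    def record_purchase(self, product_price: float) -> None:
        # After a trade, the Current Purchase Price becomes the product price.
        self.current_price = product_price
        self.days_without_purchase = 0

    def record_failed_day(self) -> None:
        self.days_without_purchase += 1

    def should_leave(self, patience_days: int = 3) -> bool:
        # "A few days" is not quantified in the article; 3 is a placeholder.
        return self.days_without_purchase >= patience_days
```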

The day ends when every customer has either bought a product or visited all available companies without finding an acceptable price. Companies that sold at least one product raise their prices and investing tendency, whereas those that sold nothing lower their prices. Customers that did not buy a product raise their Current Purchase Price.
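The end-of-day adjustments could look roughly like the following Python sketch. The direction of each change follows the rules above. The step sizes (`price_step`, `tendency_step`, `step`) are invented for illustration, and capping the customer's price at the Maximum Purchase Price is our inference from its definition.

```python
from typing import Tuple


def end_of_day_company(price: float, investing_tendency: int, units_sold: int,
                       price_step: float = 5.0,
                       tendency_step: int = 1) -> Tuple[float, int]:
    """Return the company's (price, investing_tendency) for the next day."""
    if units_sold > 0:
        # Sold something: raise both the price and the investing tendency.
        return price + price_step, investing_tendency + tendency_step
    # Sold nothing: lower the price (never below zero), keep the tendency.
    return max(0.0, price - price_step), investing_tendency


def end_of_day_customer(current_price: float, max_price: float, bought: bool,
                        step: float = 5.0) -> float:
    """Return the customer's Current Purchase Price for the next day."""
    if bought:
        return current_price
    # No purchase: be willing to pay more tomorrow, up to the maximum.
    return min(max_price, current_price + step)
```

For example, `end_of_day_company(130.0, 10, units_sold=3)` returns `(135.0, 11)`, while a customer with a limit of 122 who failed to buy at 120 moves up to 122 rather than 125.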

Every few days, new customers and companies enter the market to help maintain a dynamic nature.

Snapshot of the virtual market running

THE AI

We built the neural network using Unity’s ML-Agents toolkit, which in turn uses Proximal Policy Optimization (PPO) as its trainer.

Every day the AI observes the following market variables:

Market Info: number of competitors, number of customers, average product price in the market.

AI Company Info: product selling price, production cost, investing tendency.

Based on the above observations, the AI chooses to increase or decrease either the price of the product or the investing tendency.
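The agent's interface can be pictured as six observations and four discrete actions. The Python sketch below only illustrates that shape. It does not use the ML-Agents API, and the names `Observation`, `Action`, `to_vector` and `apply_action`, along with the step sizes and lower bounds, are hypothetical.

```python
from enum import IntEnum
from typing import List, NamedTuple, Tuple


class Observation(NamedTuple):
    num_competitors: int       # market info
    num_customers: int
    avg_market_price: float
    selling_price: float       # AI company info
    production_cost: float
    investing_tendency: int


class Action(IntEnum):
    INCREASE_PRICE = 0
    DECREASE_PRICE = 1
    INCREASE_TENDENCY = 2
    DECREASE_TENDENCY = 3


def to_vector(obs: Observation) -> List[float]:
    # Flatten the daily observation into the numeric vector a policy consumes.
    return [float(value) for value in obs]


def apply_action(price: float, tendency: int, action: Action,
                 price_step: float = 5.0,
                 tendency_step: int = 1) -> Tuple[float, int]:
    # Assumed step sizes and bounds: price stays >= 0, tendency stays >= 1.
    if action == Action.INCREASE_PRICE:
        return price + price_step, tendency
    if action == Action.DECREASE_PRICE:
        return max(0.0, price - price_step), tendency
    if action == Action.INCREASE_TENDENCY:
        return price, tendency + tendency_step
    return price, max(1, tendency - tendency_step)
```

Each day the policy would receive `to_vector(obs)` and return one of the four actions, which changes exactly one of the two levers.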

By the end of each day we keep track of the progress of the AI and reward it accordingly based on the following Reward Functions:

The hyperparameters used for training are the following:

It takes around 200,000 trials for the neural network to train and produce decent results.

For the simulation results, we recommend watching the video at the end of this article.

THE SIMULATION WITHOUT THE AI

Before introducing our AI, we needed to run the simulation without it and see how the market normally behaves. The final results are shown in the picture below. As you can see, competition among companies for the 1000 customers causes the product price to keep falling. This is exactly why competition is good for customers. You can also see how customers and companies that were not fit for this market left and were replaced by new ones. Overall, this virtual market with 1000 customers can sustain about 96 competing companies that sell their products at a price point of $130.

Simulation results without AI

THE SIMULATION WITH THE AI

In addition to the other graphs, we introduce a Market Share graph to show the growth of our AI in the market. The results, which can be seen in the picture below, are very intriguing! The AI managed to disrupt the market, causing the product price to fall below the barrier of $130, which in turn caused a lot of the companies to go bankrupt because they were unable to keep up. From that point on, our AI was able to acquire almost every single new customer, which led to a massive increase in its market share. Our AI controls, or at least has a major influence on, the product price in the market. It uses this unfair control to produce higher profits and keep new competition from entering the market. Truly fascinating!

Simulation results with AI

CONCLUSION

This has been our first attempt at creating such an AI. We are satisfied with the results, but we might expand its capabilities by introducing it to a more complicated market setup. We are also considering releasing an interactive simulation online. If you are interested in seeing something like that, do not hesitate to let us know.

For the simulation and additional commentary, we recommend watching the complementary video of this project: