My journey into machine learning began in the summer of 2016. It all started at a barbecue party at the home of my fiancée’s aunt and uncle in northern Stockholm. I was sitting outside at a garden table together with the older men of her family. These are old and tough Finnish men; her granddad (96 years old) fought in the war against the Russians. As you can imagine, as the new kid on the block, I was keeping a low profile and my mouth shut. They were discussing their greatest pastime: harness racing.

Harness racing

Harness racing is one of the largest sports in Sweden and Finland. It’s a kind of horse racing, yet different from regular horse racing. In harness racing, the driver does not sit on top of the horse. Instead, the driver sits on a cart attached to the horse. The horses are not allowed to run as fast as they want. If a horse gallops, it’s disqualified. The horses have to run at a trot. It is similar to the walking competitions that you see at the Olympic Games, where competitors are not allowed to run. It is a sport as much of control as of speed.

Harness racing is popular because of betting.

Harness racing as a country

In 2018 a staggering 16 billion Swedish kronor was put on betting tickets. That’s equivalent to about 1.8 billion USD.

If harness racing were a country, then the Country of Harness Racing would rank as the 170th largest economy in the world. A bit smaller than Belize.

Back to the barbecue party. I was sitting around listening to these men discuss the upcoming races. I was semi-conscious and was zoning in and out of the conversation.

Then suddenly, an idea struck me. The best idea of my life. At least that’s what I thought at the time.

AI and harness racing

This was the summer of 2016. A couple of months earlier, an artificial intelligence called AlphaGo had beaten one of the world’s best Go players. It was all over the news.

This Chinese board game is so complex that it had so far eluded being modeled. In late 2015, this changed. But it was not the first time that a computer had beaten humans at a game like this. In 1997, an AI called Deep Blue beat Garry Kasparov, the chess grandmaster. IBM’s artificial intelligence Watson smashed two grand champions of Jeopardy! back in 2011.

So, on that warm summer day, this was the idea that struck me. If artificial intelligence can beat the smartest players in these games, then betting on harness racing should be a walk in the park.

Betting on harness racing is not like playing at the casino. At the casino, you cannot win. You will lose sooner or later. Probably sooner. You are up against the full force of statistics. There can’t be any winning strategies.

In harness racing, you are playing against all the other players. You are not playing against the house. The odds of any given horse winning are directly related to the amount of money on it.

It is in theory possible to come up with a winning strategy. It just needs to be better than everyone else’s approaches.

The second thing going for this idea was the incredible amount of data available. Every detail about every race is recorded and made available online. Everything you could ever want to know about the horses, drivers, trainers, tracks, weather conditions, and records. With about 10 to 15 races per day in Sweden, it’s heaven for a data scientist. You can even find videos of the horses warming up before the race. The complete genealogy of every horse is available. Their complete family trees, all available online!

Good thing horses generally don’t know about GDPR.

But the fact that made me the most confident about this idea was that harness racing is not one bit sexy.

If you are an up-and-coming machine learning researcher who’s into sports betting, then you are not looking at harness racing. You might be looking at football, basketball, baseball or even regular horse racing. But not harness racing.

My investigation confirmed this. Not a single mention of harness racing.

This was the perfect setup for applying AI. The only problem was: I didn’t know a single thing about artificial intelligence.

Early morning studying

Books and online courses to learn machine learning and AI

So, I got studying.

I started to wake up early in the mornings to get a couple of hours in before my family woke up and I had to leave for my nine-to-five. At five in the morning I was taking online courses and reading books.

I spent my entire family vacation in Greece, much to my family’s chagrin, re-reading my old statistics books from university. I read books on deep learning, data science, and data mining. I learned Python, TensorFlow and scikit-learn. I was trying to soak up as much as possible in as short a time as possible to solve this one specific problem. I attended meet-ups, seminars, and lectures, and went to conferences.

What I found surprised me. It was not that hard to get started. I could achieve a lot with very little time invested. Sure, there are five-year master’s programs teaching these things at universities worldwide. I am sure those are great educations. But training like that was not needed to get started solving my problem. Machine learning and AI are so much more than math. There is also the craftsmanship and engineering. Theory is one thing, but building an AI or machine learning application from scratch takes engineering skills. Good thing that I had that part of the puzzle already in place, with a software engineering education and ten years of experience building enterprise software.

I did not need to understand all the ins and outs of what was happening under the hood. The entry barrier was much lower than I had expected. It was possible for me, and I would guess for most, to jump right in.

Winning strategy

Back to harness racing.

The odds of a particular horse winning a race are a direct function of the amount of money on that horse. Which means that the odds reflect the consensus of those betting. Average Joe. John and Jane Doe.
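To make the relationship between money and odds concrete, here is a minimal Python sketch of pool-based (often called pari-mutuel) odds. The function name, the `takeout` parameter, and the pool amounts are all assumptions made up for illustration; the article does not state how large a cut the operator keeps, so the sketch defaults to none.

```python
def pool_odds(pool, takeout=0.0):
    """Return decimal win odds per horse from the money bet on each.

    pool: dict mapping horse name -> total amount bet on that horse.
    takeout: fraction kept by the operator (hypothetical parameter,
    not taken from the article; defaults to zero).
    """
    total = sum(pool.values())
    payout_pool = total * (1.0 - takeout)
    # The more money on a horse, the lower its odds.
    return {horse: payout_pool / amount
            for horse, amount in pool.items() if amount > 0}


# Made-up amounts, for illustration only.
pool = {"Horse A": 5000, "Horse B": 3000, "Horse C": 2000}
odds = pool_odds(pool)

# The favorite is simply the horse with the lowest odds,
# i.e. the one carrying the most money.
favorite = min(odds, key=odds.get)
```

In this toy pool, half of all the money is on Horse A, so it pays the least and is the crowd's favorite.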

So, how often are they correct? How often is the consensus right? How often does the favorite horse win a race?

I looked at all the races in Sweden since 1995 and found that the favorite horse won about 37% of the time. That’s not bad.
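A count like this is simple to compute once the race records are in hand. The sketch below is not the original analysis; it assumes a hypothetical record layout (a dict of final odds per horse plus the winner's name) and uses four made-up races.

```python
def favorite_win_rate(races):
    """Share of races won by the horse with the lowest odds.

    races: list of dicts with "odds" (horse -> decimal odds) and
    "winner" (horse name). This layout is an assumption.
    """
    hits = 0
    for race in races:
        favorite = min(race["odds"], key=race["odds"].get)
        if favorite == race["winner"]:
            hits += 1
    return hits / len(races)


# Made-up races, for illustration only.
races = [
    {"odds": {"A": 2.0, "B": 3.5, "C": 6.0}, "winner": "A"},
    {"odds": {"A": 4.0, "B": 1.8, "C": 7.0}, "winner": "C"},
    {"odds": {"A": 2.5, "B": 3.0, "C": 5.0}, "winner": "B"},
    {"odds": {"A": 1.9, "B": 4.0, "C": 8.0}, "winner": "A"},
]
rate = favorite_win_rate(races)
```

Here the favorite wins two of the four toy races; the same function run over real historical records would give the kind of percentage quoted above.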

This number became a fixation of mine. An obsession. It was the number to beat. The only thing I thought about for several months.

While looking at those historic races, I also simulated betting a dollar on each of them, picking the favorite horse to win in every one of those races. The result was staggering. A fictitious betting account would have made a hefty profit with that simple strategy. I did not even simulate reinvesting the earnings, as there would not have been enough money in the world to cover the winnings. It was straightforward flat betting of a dollar per race across about 26,000 races.
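The mechanics of such a flat-betting simulation can be sketched in a few lines of Python. This is an illustration under assumptions, not the code actually used: the record layout, the function name, and the three races are made up, and the odds are treated as decimal odds that include the returned stake.

```python
def simulate_flat_betting(races, stake=1.0):
    """Bet a fixed stake on the favorite in every race.

    Returns (profit, return on investment). Winnings are never
    reinvested; every race gets the same stake.
    """
    staked = 0.0
    returned = 0.0
    for race in races:
        odds = race["odds"]
        favorite = min(odds, key=odds.get)  # lowest odds = favorite
        staked += stake
        if favorite == race["winner"]:
            # Decimal odds: the payout already includes the stake.
            returned += stake * odds[favorite]
    profit = returned - staked
    return profit, profit / staked


# Made-up races, for illustration only; they say nothing about real results.
races = [
    {"odds": {"A": 2.0, "B": 3.5, "C": 6.0}, "winner": "A"},
    {"odds": {"A": 4.0, "B": 1.5, "C": 7.0}, "winner": "C"},
    {"odds": {"A": 2.5, "B": 3.0, "C": 5.0}, "winner": "A"},
]
profit, roi = simulate_flat_betting(races)
```

Keeping the stake fixed makes the outcome easy to read: the profit is just the sum of payouts minus the number of races bet on.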