What is Machine Learning?

To understand what machine learning is, you need to look at the limitations of traditional software systems. Typically, software is written with clear rules and decision-making hard-coded into the system. The developer writes all of the logic that decides how an output is generated from one or more inputs.
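As a minimal sketch of what hard-coded decision-making looks like, the example below prices a parcel using fixed rules. The function name, destinations and prices are all invented for illustration; the point is only that every input and every branch has to be written out by the developer in advance.

```python
def shipping_cost(weight_kg, destination):
    """Hard-coded rules: every decision is spelled out by the developer."""
    if destination == "UK":
        cost = 3.0
    elif destination == "EU":
        cost = 8.0
    else:
        # A destination the developer did not anticipate cannot be handled.
        raise ValueError(f"No rule defined for destination: {destination}")

    # Fixed surcharge rule for heavier parcels.
    if weight_kg > 2.0:
        cost += 4.0
    return cost


print(shipping_cost(1.5, "UK"))  # 3.0
print(shipping_cost(3.0, "EU"))  # 12.0
```

Any input the rules do not cover, such as a new destination, fails until someone writes another rule for it.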

Traditional software is good at:

Mathematics

Business processes

Communications

Structured data storage and manipulation

Pitfalls of traditional software:

All inputs and decision-making logic need to be known in advance

Systems grow in complexity, so bugs and errors become more likely

Machine learning tries to address these pitfalls by mimicking nature. A child learns to recognise objects from previous experience, by playing or watching and identifying repeatable patterns. There is no clear logic that defines a chair, since they are all different, yet a person knows it’s a chair by comparing it with the chairs they have seen before. Machine learning does the same, allowing machines to carry out tasks in a similar way to how a human would.

“Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.” – Wikipedia

Machine learning systems are good at:

Pattern detection

Auto categorisation

Image recognition

Voice recognition

How does it work?

ML is usually achieved by feeding the system vast amounts of training data. This training data can be collected by recording events or by having humans input data manually. For example, an image recognition system is trained to recognise road signs by providing it with hundreds of thousands of human-confirmed images of road signs. The system then extracts features from these images, such as the colours and arrangement of pixels, and stores them in a structured format. The more data you feed the system, the better it can identify the common patterns in the data for each type of image. A sneaky way of training these systems is by asking unsuspecting members of the public. When you complete an online form and “verify you’re a human” by clicking images of road signs, you’re actually contributing to the ML systems that power driverless car programmes.
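The train-then-recognise idea described above can be sketched in a few lines. This is a toy illustration under heavy assumptions, not how a production system works: each "image" is just a short list of (red, green, blue) pixels, the only feature extracted is the average colour, and the function names and labels are hypothetical. Real systems extract far richer features from far more data.

```python
from collections import defaultdict
from math import dist


def extract_features(image):
    """Reduce an image (a list of RGB pixels) to its average colour."""
    count = len(image)
    return tuple(sum(pixel[channel] for pixel in image) / count
                 for channel in range(3))


def train_centroids(labelled_images):
    """Store one averaged feature vector per label (the 'structured format')."""
    grouped = defaultdict(list)
    for image, label in labelled_images:
        grouped[label].append(extract_features(image))
    return {
        label: tuple(sum(values) / len(values) for values in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(model, image):
    """Pick the label whose stored features are closest to the new image."""
    features = extract_features(image)
    return min(model, key=lambda label: dist(model[label], features))


# Tiny, made-up training set: mostly red images versus mostly blue images.
training_data = [
    ([(200, 20, 20), (220, 30, 30), (210, 25, 25), (255, 255, 255)], "stop sign"),
    ([(190, 10, 10), (230, 40, 40), (205, 20, 20), (255, 255, 255)], "stop sign"),
    ([(20, 40, 200), (30, 50, 220), (25, 45, 210), (255, 255, 255)], "info sign"),
]

model = train_centroids(training_data)
unseen_image = [(230, 10, 10), (230, 10, 10), (230, 10, 10), (255, 255, 255)]
print(predict(model, unseen_image))  # stop sign
```

Adding more labelled images refines the stored averages, which is a very small-scale version of "the more data you feed the system, the better it gets".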

Algorithms

There is a whole range of algorithms, which you can dive into on Wikipedia if you’re interested. Most of them have been around for decades, some even for centuries. The only thing that is different today is that we have the computing power and the vast training data sets available to put them to advanced use.

The simplest algorithm for beginners to grasp is Naïve Bayes, which is best known for its speed and straightforward implementation, albeit at a cost in accuracy. Don’t be put off by the fancy mathematical formula! YouTube is a great place to learn about it. Essentially, it’s a formula that combines several individual probabilities to determine the overall probability that an entity belongs to a given classification.
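To make the idea of "combining probabilities" concrete, here is a minimal sketch of a Naïve Bayes text classifier in plain Python. The training sentences, labels and function names are made up for illustration. It assumes the usual "naïve" simplification that each word contributes independently, uses add-one smoothing so that a word never seen under a label does not zero out the result, and adds logarithms rather than multiplying raw probabilities to avoid very small numbers.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Count how often each label and each word-per-label occurs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest combined (log) probability."""
    label_counts, word_counts, vocabulary = model
    total_documents = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_documents)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Likelihood of the word under this label, with add-one smoothing.
            smoothed = word_counts[label][word] + 1
            score += math.log(smoothed / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training examples.
examples = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status and agenda", "ham"),
]

nb_model = train_naive_bayes(examples)
print(classify(nb_model, "buy cheap money"))        # spam
print(classify(nb_model, "monday meeting agenda"))  # ham
```

The speed the algorithm is known for comes from the fact that training is just counting, and classification is a handful of additions per word.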

Platforms

This may look like a technology which is only available to large-budget projects. It isn’t. It has become surprisingly accessible over the last year or two. The most notable platforms are IBM Watson and Microsoft Cognitive Services. Both platforms provide a simple-to-use API for training and running ML and AI related functions. Applications can be built very quickly to incorporate image recognition, auto categorisation, voice recognition, predictions and recommendation systems.

A few external links to some inspiring machine learning demonstrations:

At TheTin we had a go at predicting the EU referendum. We used Microsoft tools for the training and prediction analysis and generated pretty good results. We got the final prediction wrong, but in retrospect we identified a few potential improvements which would have produced true-to-life results. We also harvested a lot of data which will make our next attempt even stronger!