What is Machine Learning?

How does machine learning work?

Andy Gray

Co-Founder, CEO & CTO

Machine Learning

Machine learning is the name for the set of algorithms that adapt their outputs based on the data they are given. Most computer programs and algorithms are fixed, like a maths formula where 1 + 1 will always equal 2. Machine learning algorithms, by contrast, can be “trained” to give the answers we want. This is super useful because we don’t have to know how to get from the inputs to the outputs; we can leave that up to the machine learning to figure out. Increasingly, the results are better than anything our best programmers have ever coded, and for some problems they surpass the performance of the best humans. So how does it work?

If I feed the machine learning algorithm a set of inputs and an answer:

Input 1	Input 2	Answer
1	1	2
3	2	5
5	1	6

It might learn that it can match every answer by adding the inputs together, and so for future sets of inputs it would add them. If we had another set of training inputs:

Input 1	Input 2	Answer
1	1	0
3	2	1
5	1	4

It might learn that the best way to get from the inputs to the answers is to adjust its internal algorithm to subtract the second input from the first, and to do the same for future inputs.
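To make this less abstract, here is a minimal sketch of one algorithm arriving at addition for the first table and subtraction for the second. It is an illustration under stated assumptions, not a description of any particular product: we assume the answer is a weighted sum `w1 * Input1 + w2 * Input2` and adjust the two weights with plain gradient descent. The function name `train`, the learning rate and the number of steps are all choices made for this example.

```python
def train(examples, steps=2000, learning_rate=0.05):
    """Fit answer ~ w1 * input1 + w2 * input2 by gradient descent."""
    w1, w2 = 0.0, 0.0  # the internal state before any training
    n = len(examples)
    for _ in range(steps):
        grad1 = grad2 = 0.0
        for input1, input2, answer in examples:
            error = (w1 * input1 + w2 * input2) - answer
            # Gradient of the mean squared error for each weight
            grad1 += 2 * error * input1 / n
            grad2 += 2 * error * input2 / n
        # Nudge the weights in the direction that reduces the error
        w1 -= learning_rate * grad1
        w2 -= learning_rate * grad2
    return w1, w2


addition_examples = [(1, 1, 2), (3, 2, 5), (5, 1, 6)]
subtraction_examples = [(1, 1, 0), (3, 2, 1), (5, 1, 4)]

add_weights = train(addition_examples)
sub_weights = train(subtraction_examples)
print("addition table    ->", [round(w, 3) for w in add_weights])
print("subtraction table ->", [round(w, 3) for w in sub_weights])
```

The same code ends up with weights close to (1, 1) for the first table and close to (1, -1) for the second. Nothing in the program mentions addition or subtraction; the behaviour comes entirely from the examples.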

The really clever part is the algorithm that, given a set of inputs, determines what the right answer should be for a wide range of inputs and problem types. This is the machine learning algorithm, and there are many different types: deep neural networks, extreme gradient boosting and k-means, to name just a few.

[Image: machine learning vs programming]

There are two broad categories of machine learning algorithms:

Supervised machine learning

This is where you have examples of both the inputs and the answers you want given those inputs, as is the case in the addition example above. You feed these to the algorithm in a “training” phase, and the algorithm adapts itself in different ways depending on what type of machine learning algorithm it is. Once the training is complete, the aim is that the machine learning algorithm will have learned to produce the right answers for future, as yet unseen, sets of inputs.
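As a rough sketch of this train-then-predict pattern, here is a tiny nearest-neighbour classifier. The data, the labels and the function names `fit` and `predict` are made up for illustration. For this kind of algorithm, “training” simply means remembering the examples, which is about the simplest way an algorithm can adapt to data.

```python
import math


def fit(inputs, answers):
    """'Training' for nearest neighbour: remember every example."""
    return list(zip(inputs, answers))


def predict(model, new_input):
    """Return the answer of the stored example closest to new_input."""
    closest_input, closest_answer = min(
        model, key=lambda example: math.dist(example[0], new_input)
    )
    return closest_answer


# Hypothetical training data: (height_cm, weight_kg) -> label
training_inputs = [(30, 4), (35, 5), (60, 25), (65, 30)]
training_answers = ["cat", "cat", "dog", "dog"]

model = fit(training_inputs, training_answers)
print(predict(model, (32, 6)))    # an input the model has not seen
print(predict(model, (58, 22)))   # another unseen input
```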

Unsupervised machine learning

These are machine learning algorithms that try to provide insight about data without having a defined answer to replicate. There are a number of theoretical uses, but in practice they are mostly used to find natural groups or clusters within data. This definition often leads people to think that customer segmentation and other sorts of grouping exercises are a natural fit for unsupervised machine learning, but in fact the best results are most often achieved using supervised machine learning methods.
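For a feel of what finding natural groups means, here is a minimal sketch of k-means, one of the algorithm types named earlier. The data points are hypothetical and the starting centroids are picked by hand to keep the example deterministic; real implementations usually choose starting points more carefully and stop once the groups no longer change.

```python
import math


def k_means(points, centroids, iterations=10):
    """Group points around centroids using the basic k-means loop."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid  # an empty cluster keeps its centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical data: (visits per month, average spend); no answers supplied
points = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 78)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

Note that nobody told the algorithm which customers belong together. It only ever saw the inputs.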

A machine learning model

Once a machine learning algorithm has been trained with data to provide answers to a particular problem, we often refer to this as a “model”. The model is the algorithm and the collection of internal states that make it respond to inputs in a given way.
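One way to picture “the algorithm and the collection of internal states” is the sketch below. The weighted-sum algorithm and the two sets of weights are assumptions chosen to echo the addition and subtraction examples; they are not taken from a real trained system.

```python
import json


def predict(state, input1, input2):
    """The algorithm: a fixed recipe that consults the learned state."""
    return state["w1"] * input1 + state["w2"] * input2


# Two hypothetical sets of internal state, as training might have produced
adder_state = {"w1": 1.0, "w2": 1.0}
subtractor_state = {"w1": 1.0, "w2": -1.0}

print(predict(adder_state, 5, 1))       # behaves like addition
print(predict(subtractor_state, 5, 1))  # behaves like subtraction

# Because the state is just data, a model can be saved and restored
saved = json.dumps(subtractor_state)
restored = json.loads(saved)
print(predict(restored, 3, 2))
```

The code in `predict` never changes. Swapping the state is what turns one model into another.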

Problems you can solve with machine learning

Our take is that there are 3 main applications for machine learning.

1 Predicting the future (it’s got a little harder in 2020)
2 Hyper-personalisation / segmentation to 1
3 Automation (Generally the highest ROI)

Training data

This is the set of inputs and outputs that we use to train the supervised machine learning model. You will need a sufficient number of examples for the model to perform well on data it hasn’t seen before. In the subtraction example above, many equations could have fit the 3 sets of inputs and answers. For example:

Input1 - Input2 = Answer

Or

Input1 - Input2 + (Input2 - 1) x (Input2 - 2) = Answer

Either one of these formulas fits all the training examples listed above, because the extra term in the second one is zero whenever Input2 is 1 or 2. On new data, though, the second one would give some very strange results if we were expecting a simple subtraction: for inputs 4 and 3 it returns 3 rather than 1. By having more training examples, we limit how wrong a model can be while still fitting the answers.
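The point that several rules can fit the same handful of examples is easy to check directly. In the sketch below, `rule_b` is a hypothetical rival rule written for this illustration: it agrees with subtraction on every training example but not elsewhere.

```python
training_examples = [(1, 1, 0), (3, 2, 1), (5, 1, 4)]


def rule_a(input1, input2):
    """Plain subtraction."""
    return input1 - input2


def rule_b(input1, input2):
    """Hypothetical rival: subtraction plus a term that is zero
    whenever input2 is 1 or 2, as it is in every training example."""
    return input1 - input2 + (input2 - 1) * (input2 - 2)


for rule in (rule_a, rule_b):
    fits = all(rule(a, b) == answer for a, b, answer in training_examples)
    print(rule.__name__, "fits the training data:", fits)

# A new input outside the training examples exposes the difference
print(rule_a(4, 3), rule_b(4, 3))

# One extra training example is enough to rule out the rival here
extra_input1, extra_input2, extra_answer = (4, 3, 1)
print(rule_b(extra_input1, extra_input2) == extra_answer)
```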

Test set

Because we can’t rely on machine learning to produce a perfect model for our data, we need some way to validate that what has been built is fit for purpose. Typically we hold back some of the data from the training process, say 20%, and use this to validate the created solution. We run this 20% of the data through the model to get answers, which we will call Predictions. We can then compare these Predictions to the real answers and see how good the model is.
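Holding back a test set can be sketched in a few lines. Everything here is illustrative: the data set is generated for the example, `split_train_test` is a hypothetical helper, and the `model` function is a stand-in for whatever a real training step would produce.

```python
import random


def split_train_test(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold back a fraction for testing."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed: repeatable split
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical data set: ten (input1, input2, answer) subtraction examples
examples = [(a, b, a - b) for a, b in zip(range(10, 20), range(1, 11))]
train_set, test_set = split_train_test(examples)


def model(input1, input2):
    """Stand-in for a trained model (an assumption for this sketch)."""
    return input1 - input2


# Run the held-back inputs through the model and compare with the answers
for input1, input2, answer in test_set:
    prediction = model(input1, input2)
    print(f"prediction={prediction} answer={answer}")
```

The shuffle matters: if the data is sorted by date or by customer, taking the last 20% without shuffling could give a test set that looks nothing like the training data.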

Depending on the problem we’re solving, it could be a case where we care about how many times the model produced answers that were exactly right, or we might care about the average error of an answer, etc. We can use different statistical measures to evaluate the performance depending on what we want to use it for. So we use this test set to score, evaluate and compare the models.
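To show how the choice of measure changes the verdict, here is a small sketch with two hypothetical sets of predictions scored two ways: the share of exactly right answers, and the average size of the error.

```python
def accuracy(predictions, answers):
    """Share of predictions that are exactly right."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def mean_absolute_error(predictions, answers):
    """Average size of the miss, ignoring its direction."""
    return sum(abs(p - a) for p, a in zip(predictions, answers)) / len(answers)


# Hypothetical predictions from two models on the same test answers
answers = [10, 20, 30, 40]
model_a = [10, 20, 30, 80]   # usually exact, one large miss
model_b = [11, 19, 31, 41]   # never exact, always close

for name, predictions in (("model_a", model_a), ("model_b", model_b)):
    print(name, accuracy(predictions, answers),
          mean_absolute_error(predictions, answers))
```

Here `model_a` wins on exact answers while `model_b` wins on average error, so which one is “best” depends on what the predictions will be used for.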

Better models translate directly to better ROI, so ensuring you have the best model is important.

Machine learning isn’t just models

While algorithms like deep neural networks are amazingly versatile in their applications to business problems, we achieve these results by transforming all the various inputs (pictures, videos, text, demographics) into numeric representations: sequences of numbers between 0 and 1. How we do this greatly affects the performance of the models, so part of the machine learning model creation process is figuring out how to transform your data to get the best results.
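Two common transformations of this kind are sketched below: rescaling a numeric column into the 0 to 1 range, and turning a column of categories into 0s and 1s (often called one-hot encoding). The column names and values are hypothetical, and these are only two of many possible transformations.

```python
def min_max_scale(values):
    """Rescale numbers so the smallest becomes 0 and the largest 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # avoid dividing by zero
    return [(v - low) / (high - low) for v in values]


def one_hot(values):
    """Turn categories into rows of 0s and 1s, one column per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical demographic columns
ages = [18, 30, 42, 66]
regions = ["north", "south", "north", "west"]

print(min_max_scale(ages))
print(one_hot(regions))
```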

Algorithms like deep neural networks are highly configurable. They consist of a user-defined number of layers of different types, with different numbers of neurons on each layer, among a whole host of other parameters. Different options for these values mean that the network trains differently and can produce better models for different business problems. Extreme gradient boosting has a completely different set of parameters that determine how well it will be able to learn to solve a particular problem.

These complexities are why creating machine learning models has traditionally been the domain of data scientists, who have wide-ranging experience in all these techniques and what works best. Now, with AutoML, the barrier to entry is about as much information as is in this article.

AutoML (Automated Machine Learning)

This is the field of algorithms and solutions that use machine learning and AI to figure out all the data cleaning and transforming steps, choose the right machine learning algorithms, and set the right network depth, learning rate, solver and other gubbins: in short, all the stuff you used to need a data scientist to do.
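The simplest version of this idea is to try every combination of settings and keep whichever scores best on held-back data. The sketch below does exactly that for two settings of a toy model. It is only a sketch of exhaustive grid search under made-up data and names; real AutoML systems search far larger spaces, including data transformations and the choice of algorithm, and generally use smarter search strategies.

```python
import itertools


def train(examples, learning_rate, steps):
    """Fit answer ~ w * x with gradient descent; return the weight w."""
    w = 0.0
    for _ in range(steps):
        gradient = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= learning_rate * gradient
    return w


def validation_error(w, examples):
    """Mean absolute error of the model on held-back examples."""
    return sum(abs(w * x - y) for x, y in examples) / len(examples)


# Hypothetical data following answer = 3 * x
training = [(1, 3), (2, 6), (3, 9)]
validation = [(4, 12), (5, 15)]

# The search space: every combination of these settings gets tried
search_space = {"learning_rate": [0.001, 0.01, 0.1], "steps": [10, 100]}

results = []
for learning_rate, steps in itertools.product(*search_space.values()):
    w = train(training, learning_rate, steps)
    results.append((validation_error(w, validation), learning_rate, steps))

best_error, best_rate, best_steps = min(results)
print(f"best: learning_rate={best_rate}, steps={best_steps}, "
      f"error={best_error:.4f}")
```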

With the right AutoML solution, non-experts can get industry-leading results with very little machine learning experience. Models produced in the Kortical platform typically rank in the top 3% of competitive data scientists with no human intervention and, with a few pointers from our team, often win outright.

AIaaS (AI as a service)

Taking this a step further, a full end-to-end solution that encompasses not only building the models but also hosting them, managing projects, infrastructure and operations is often referred to as AI as a Service.

This greatly eases usage: easy-to-use integrations with tools such as PowerBI and Excel mean that once the models have been created, they can be put to use straight away without needing to get coders involved at all.

[Image: Kortical timeline]

MLOps (Machine learning operations)

Consumer and market behaviours change over time and this can mean our ML models can get out of date. If we created a model to predict toilet paper consumption in February 2020 based on historical data, we might find that its predictions are way off by March. To counter this we need a continuous loop of feeding new data back into the system to create new models that are more accurate for the new market conditions.

The management, governance, monitoring, health management, deployment, diagnostics and business metrics of this process, through the entire model lifecycle, are often referred to as MLOps.
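One small piece of such a loop is noticing when a model has gone stale. The sketch below compares recent prediction error against the error measured when the model was first validated. The figures, the function names and the tolerance of 1.5x are all assumptions for illustration; a production setup would track many more signals than this.

```python
def mean_absolute_error(predictions, actuals):
    """Average size of the gap between predictions and what happened."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


def needs_retraining(baseline_error, recent_error, tolerance=1.5):
    """Flag the model when recent error exceeds tolerance x baseline."""
    return recent_error > baseline_error * tolerance


# Hypothetical weekly demand: model predictions vs actual units sold
baseline_error = mean_absolute_error([100, 105, 98], [102, 103, 99])
march_error = mean_absolute_error([101, 104, 100], [180, 240, 210])

if needs_retraining(baseline_error, march_error):
    print("Error has drifted: retrain on recent data")
else:
    print("Model still within tolerance")
```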

Explainable AI

Until recently, machine learning was seen as a black box technology. Now certain techniques exist to allow users to interrogate the process from input to answer. This has a number of key benefits including:

1 Allowing us to uncover and remove unwanted bias from the data
2 Allowing us to respond to GDPR / regulator requests about the decision making process
3 Allowing senior stakeholders to understand how the machine learning works and build enough confidence to sign off budget to put ML solutions live and delivering ROI (an often overlooked benefit)
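For the simplest kind of model, interrogating the path from input to answer can be as direct as the sketch below, which splits a weighted-sum prediction into the contribution of each input. The weights and field names are hypothetical. More complex models such as deep neural networks need more involved techniques, so treat this as the easy case rather than a general method.

```python
def explain(weights, bias, inputs):
    """Split a linear model's prediction into per-input contributions."""
    contributions = {name: weights[name] * value
                     for name, value in inputs.items()}
    prediction = bias + sum(contributions.values())
    return prediction, contributions


# Hypothetical learned weights for a credit-style score
weights = {"income_k": 0.5, "missed_payments": -10.0, "years_at_address": 2.0}
bias = 20.0

prediction, contributions = explain(
    weights, bias,
    {"income_k": 40, "missed_payments": 2, "years_at_address": 5},
)
print("prediction:", prediction)
# Largest absolute contribution first
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>18}: {value:+.1f}")
```

A breakdown like this shows which inputs pushed the answer up or down, which is the kind of evidence a regulator or a budget holder tends to ask for.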

[Image: platform model explanation]

Kortical

Kortical is an easy-to-use AIaaS platform that incorporates AutoML, MLOps and model explainability to make the creation of world-class machine learning models and solutions very easy. Check out our getting started page to see how easy it can be.

You can prototype a full web hosted ML app in a day or do a full enterprise AI transformation in 6 months. This is 16x faster than industry averages.

Using tried and tested technology massively de-risks projects. Kortical projects have a 92% success rate vs 15% average for the industry.

The machine learning models and solutions produced on Kortical’s platform have convincingly won competitions against the biggest players in the market. This means that people who are not data scientists can focus on solving the domain-specific parts of the problem and let Kortical handle the machine learning.

The Kortical team have wide-ranging experience getting leading results across many industries and are always there to help out.

If you are looking to deliver more ROI with ML faster, then please do get in touch below.

Get In Touch

Whether you're just starting your AI journey or looking for support in improving your existing delivery capability, please reach out.