Ten weeks ago I broke my hand…what a journey of inconvenience and pain it has been, haha. Only now am I able to type again, sort of. With this improving dexterity in mind, I wanted to celebrate by writing a couple of posts that I’ve been meaning to get around to for a while.
In this one, we’re going to cover the basics of artificial intelligence — more specifically, machine learning.
Why? Simply because I interview quite a few potential product managers (for AI products) each week, and it's surprising how many apply without even a grasp of the fundamentals.
I cover the fundamentals of AI problem framing, as well as basic algorithms and how to assess their success. Hopefully this gets you thinking; there's a wealth of courses out there, online and free.
It’s all about prediction
The human brain is exceptionally good at many things: conscious thought, emotions, memory, control of movement, not to mention the five senses that let us take in the world around us. It’s also good at detecting patterns in the countless signals that come in through those five senses. But the brain isn’t perfect, and there are some problems that are so complex that they are better suited for computers than for people. This is the realm of machine learning.
Machine learning is a bit of a misnomer, conjuring up images of I, Robot, Wall-E, or HAL. In reality it's just as powerful, but doesn't have quite the same star power: it refers to any systematic way of finding patterns or similarities in data, usually for the purpose of making predictions.
The core of machine learning is its ability to predict.
Decision making at any level requires some form of prediction. When you think of the terms machine learning (ML), natural language processing (NLP), object recognition and autonomous driving — all that you’re seeing is prediction at work.
Skipping the history lesson: this particular branch of artificial intelligence, machine learning, has improved substantially, to the point where many things we thought of as inherently human problems just a decade ago can now be done by machines.
Over the years the cost of generating these predictions has decreased.
As it's gotten cheaper and better, machines are doing more of it. That means businesses, and individual workers, need to figure out how to take advantage of the technology to stay competitive.
One interesting side note, in terms of AI getting better, is the success of teams competing in the ImageNet challenge. In just seven years, the winning accuracy in classifying objects in the dataset rose from 71.8% to 97.3%, surpassing human performance on the task and illustrating how more data and better models lead to better predictions.
All human activities can be described by five high-level components: data, prediction, judgment, action, and outcomes.
As machine learning improves, the value of human prediction skills is decreasing, because machine prediction provides a cheaper and better substitute. However, this does not spell the end for human jobs, as many suggest. That's because the value of human judgment skills is increasing. Judgment is a complement to prediction, so when the cost of prediction falls, demand for judgment rises.
We want more human judgment.
What problems should AI solve?
When building any product in a human-centred way, the most important decisions you’ll make are: Who are your users? What are their values? Which problem should you solve for them? How will you solve that problem? How will you know when the experience is “done”?
AI is a general purpose technology, not confined to the likes of Facebook and Google. What was born in these tech companies has now spilled over to other industries, and the effects will be magnified in the coming decade as manufacturing, retailing, transportation, finance, health care, law, advertising, insurance, entertainment, education, and virtually every other industry transform their core processes and business models to take advantage of AI.
AI can be used to improve business performance. Take predictive maintenance, for example, where machine learning's ability to analyse large amounts of high-dimensional data from audio and images makes it effective at detecting anomalies in factory assembly lines or aircraft engines.
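To give a flavour of what that looks like in practice, here's a minimal anomaly-detection sketch using scikit-learn's IsolationForest. The sensor readings are synthetic stand-ins; a real system would use features extracted from actual vibration, temperature, or audio signals.

```python
# A minimal anomaly-detection sketch for predictive maintenance.
# The readings below are synthetic; in practice you'd use real sensor features.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
normal_readings = rng.normal(loc=0.0, scale=1.0, size=(500, 4))  # healthy machine
faulty_readings = rng.normal(loc=4.0, scale=1.0, size=(10, 4))   # degraded machine

model = IsolationForest(contamination=0.05, random_state=42)
model.fit(normal_readings)

# predict() returns 1 for normal points and -1 for anomalies
print(model.predict(faulty_readings))
```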
In logistics, AI can optimise routing of delivery traffic, improving fuel efficiency and reducing delivery times. In customer service management, AI has become a valuable tool in call centres, thanks to improved speech recognition.
In sales, combining customer demographic and past transaction data with social media monitoring can help generate individualised “next product buy” recommendations, which many retailers now use routinely.
Such practical AI use cases and applications can be found across all sectors of the economy and multiple business functions, from marketing to supply chain operations. In many of these use cases, machine learning techniques primarily add value by improving on traditional analytics techniques.
All of these solutions began with understanding existing workflows.
Map existing workflows
Mapping the existing workflow for accomplishing a task can be a good way to find opportunities for AI to improve the experience. Talk to people and understand how they currently complete a process. You’ll better understand the necessary steps and be able to identify steps that can be automated or augmented. If you already have a working AI-powered product, test your assumptions with user research.
Decide if AI adds unique value
Once you identify the aspect you want to improve, you'll need to determine which of the possible solutions require AI, which are meaningfully enhanced by AI, and which don't benefit from AI or are even degraded by it. It's important to question whether adding AI to your product will actually improve it. Often a rule- or heuristic-based solution will work just as well as, if not better than, an AI version. A simpler solution has the added benefit of being easier to build, explain, debug, and maintain. Take time to critically consider how introducing AI to your product might improve, or regress, your user experience.
When AI is probably better:
- Recommending different content to different users
- Prediction of future events
- Personalisation improves the user experience
- Natural language understanding
- Recognition of an entire class of entities
- Detection of low occurrence events that change over time
- An agent or bot experience for a particular domain
- Showing dynamic content is more efficient than a predictable interface
When AI is probably not better:
- Maintaining predictability
- Providing static or limited information
- Minimising costly errors
- Complete transparency
- Optimising for high speed and low cost
- Automating high value tasks
Assess automation versus augmentation
When you’ve found the problem you want to solve and have decided that using AI is the right approach, you’ll then evaluate the different ways AI can solve the problem and help users accomplish their goals. One large consideration is if you should use AI to automate a task or to augment a person’s ability to do that task themselves.
There are some tasks people would love AI to handle entirely, but many activities people want to do themselves. In those latter cases, AI can help them perform the same tasks, but faster, more efficiently, or sometimes even more creatively. When done right, automation and augmentation work together to both simplify and improve the outcome of a long, complicated process.
Design and evaluate the reward function
The reward function is how an AI defines success and failure. You'll want to design this function deliberately, optimising for long-term user benefit by imagining the downstream effects of your product and limiting potentially negative outcomes.
Weighing up false positives and negatives
Finally, and briefly: when you're thinking about what problems AI should solve, you need to think about false positives and false negatives. Many AI models predict whether or not a given object or entity belongs to a certain category. These kinds of models are called binary classifiers. We'll use them as a simple example for understanding how AI can be right or wrong.
When binary classifiers make predictions, there are four possible outcomes:
- True positives – when the model correctly predicts a positive outcome
- True negatives – when the model correctly predicts a negative outcome
- False positives – when the model incorrectly predicts a positive outcome
- False negatives – when the model incorrectly predicts a negative outcome
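In code, these four counts fall straight out of a confusion matrix. The labels below are purely illustrative.

```python
# Counting the four outcomes with scikit-learn's confusion_matrix.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # what actually happened
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # what the model predicted

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
```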
When defining the reward function, you’ll be able to weigh outcomes differently. Weighing the cost of false positives and false negatives is a critical decision that will shape your users’ experiences. It is tempting to weigh both equally by default. However, that’s not likely to match the consequences in real life for users.
For example, is a fire alarm that goes off when there's no fire worse than one that stays silent during a real fire? Both are incorrect, but one is much more dangerous. On the other hand, occasionally recommending a song that a person doesn't like is hardly a disaster; they can just skip it. You can mitigate the negative effects of these kinds of errors by including confidence indicators alongside a given output or result.
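One simple way to make that trade-off explicit is to assign each error type a cost and compare models on their total cost. The numbers below are arbitrary placeholders, chosen so that a missed fire (false negative) is far more expensive than a false alarm (false positive).

```python
# A sketch of weighing false positives and false negatives differently.
def expected_cost(fp, fn, fp_cost=1.0, fn_cost=50.0):
    """Total cost of a model's errors under the chosen weights."""
    return fp * fp_cost + fn * fn_cost

# Model A: more false alarms, fewer missed fires.
# Model B: fewer false alarms, more missed fires.
print("Model A cost:", expected_cost(fp=20, fn=1))  # 20*1 + 1*50  = 70
print("Model B cost:", expected_cost(fp=5, fn=4))   # 5*1  + 4*50 = 205
```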
So, now onto the good stuff… a brief introduction to the basic approaches of modern machine learning. It's worth keeping in mind that this is a dynamic space; what is popular today could be replaced next week.
An overview of the basic algorithms
Supervised vs. Unsupervised vs. Reinforcement
Models for supervised learning try to predict a certain value using the values in an input dataset. The model attempts to establish a relationship between a target feature (i.e. the feature being predicted) and the predictor features. These models have a clear focus on what they want to learn and how they want to learn it.
Examples include predicting whether a tumour is malignant or benign (classification), predicting the future price of a stock (regression), handwriting recognition (classification), and fraud detection (classification).
Supervised machine learning is only as good as the data used to train it. If the training data is of poor quality, the predictions will be too. Model performance can be evaluated by comparing predicted values against actual values and counting the misclassifications.
Standard algorithms include:
- Naive Bayes
- k-nearest neighbour (kNN)
- Decision tree
- Linear regression
When conducting supervised learning, the main considerations are model complexity and the bias-variance tradeoff; note that the two are interrelated.
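To make that concrete, here's a minimal sketch of supervised classification using one of the algorithms above (a decision tree) on scikit-learn's built-in breast cancer dataset; the shallow `max_depth` is just an illustrative choice to keep the model simple.

```python
# A minimal supervised-learning sketch: a decision tree classifier trained
# on a labelled dataset (malignant vs benign tumours), then evaluated by
# comparing its predictions against held-out actual labels.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = DecisionTreeClassifier(max_depth=3, random_state=0)  # shallow tree: lower variance
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))
```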
Models for unsupervised learning help in finding groups or patterns in unknown objects by grouping similar objects together. Examples include customer segmentation and recommender systems. There are two types of unsupervised learning problems — clustering and association.
Any unknown or unlabelled dataset is given to the model as input and records are grouped. It is difficult to measure whether the model did something useful or interesting; the homogeneity of the records grouped together is often the only measure. It is more difficult to implement than supervised learning.
Standard algorithms include:
- Principal component analysis (PCA)
- Self-organising map (SOM)
- Apriori algorithm
- DBSCAN
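As a quick taste of unsupervised learning, here's a minimal clustering sketch using DBSCAN from the list above. The two synthetic blobs stand in for something like two customer segments; no labels are provided, the algorithm simply groups similar records together.

```python
# A minimal unsupervised-learning sketch: DBSCAN grouping similar records
# together without any labels.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
segment_a = rng.normal(loc=[0, 0], scale=0.3, size=(100, 2))
segment_b = rng.normal(loc=[5, 5], scale=0.3, size=(100, 2))
X = np.vstack([segment_a, segment_b])

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
print("Clusters found:", set(labels))  # -1 would mean noise points
```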
With reinforcement learning, a machine learns to act on its own to achieve given goals. This type of learning is used when there is no label telling the model what the correct answer is; the model has to make the call itself, getting rewarded when it acts correctly and punished when it doesn't.
The model learns and updates itself through reward and punishment. It is evaluated by means of the reward function after it has had some time to learn. Of the three approaches, it is the most complex to understand and apply.
Some standard algorithms are Q-learning and Sarsa.
Practical applications include self-driving cars, intelligent robots, and AlphaGo Zero.
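To give a feel for how reward and punishment drive the learning, here's a toy tabular Q-learning sketch. The corridor environment, rewards, and hyperparameters are all invented for illustration; real problems use far richer environments.

```python
# A toy Q-learning sketch: an agent learns to walk right along a short
# corridor to reach a goal square.
import numpy as np

n_states, n_actions = 5, 2             # positions 0..4; actions: 0 = left, 1 = right
goal = 4
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

for episode in range(300):
    state = 0
    while state != goal:
        # epsilon-greedy: explore sometimes, and whenever the values are still tied
        if rng.random() < epsilon or Q[state, 0] == Q[state, 1]:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(Q[state]))

        next_state = max(0, state - 1) if action == 0 else min(goal, state + 1)
        reward = 1.0 if next_state == goal else 0.0

        # Q-learning update: nudge the estimate towards reward + discounted future value
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(np.argmax(Q, axis=1))  # expect 1 ("move right") for every state before the goal
```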
Evaluating these algorithms
Without a proper evaluation of these algorithms using a variety of metrics, you risk deploying a model that makes poor predictions on unseen data. This happens when a model memorises its training data rather than learning from it, so it can't generalise to data it hasn't seen before.
In a classification task, the precision for a class is the number of true positives (i.e. the number of items correctly labelled as belonging to the positive class) divided by the total number of elements labelled as belonging to the positive class (i.e. the sum of true positives and false positives, which are items incorrectly labelled as belonging to the class). High precision means that an algorithm returned substantially more relevant results than irrelevant ones.
Recall, on the other hand, is defined as the number of true positives divided by the total number of elements that actually belong to the positive class (i.e. the sum of true positives and false negatives, which are items which are not labelled as belonging to the positive class but should have been). High recall means that an algorithm returned most of the relevant results.
To fully evaluate the effectiveness of a model, it's necessary to examine both precision and recall. Unfortunately, they are often in conflict: improvements in one lead to a reduction in the other, and vice versa.
The F1 score is the harmonic mean of precision and recall; it reaches its best value at 1 (perfect precision and recall) and its worst at 0. Because it's a harmonic mean, it punishes extreme values: a model can't make up for terrible recall with excellent precision, or vice versa.
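Here's how those three metrics look in code, reusing the illustrative labels from the confusion-matrix sketch earlier.

```python
# Computing precision, recall, and F1 with scikit-learn.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP) = 3/4
print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN) = 3/4
print("F1 score: ", f1_score(y_true, y_pred))         # harmonic mean = 0.75
```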
The ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.
Finally, there's the area under the ROC curve, or AUC. An ROC curve is a two-dimensional depiction of classifier performance. To compare classifiers, we may want to reduce ROC performance to a single scalar value representing expected performance; a common method is to calculate the area under the ROC curve, abbreviated AUC.
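A quick AUC sketch follows; note that it's computed from the model's predicted scores or probabilities rather than hard labels, since the ROC curve is traced out by sweeping the decision threshold. The scores here are invented for illustration.

```python
# Computing AUC from predicted probabilities rather than hard labels.
from sklearn.metrics import roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]  # model's predicted probabilities

print("AUC:", roc_auc_score(y_true, y_scores))  # 0.9375 for these made-up scores
```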
This was a very (very) brief introduction to some of the algorithmic approaches within machine learning. In future posts, I'm going to put together practical "work along" examples where we actually build some basic models and deploy them to AWS.
Please do let me know if you found the post helpful!