What Are Supervised and Unsupervised Learning in Machine Learning?

(Source: https://unsplash.com/photos/z4H9MYmWIMA)

Machine Learning has undeniably enriched the world since its introduction. Because it is an application-oriented set of concepts, the end goal of an ML application determines the algorithms to be used, or at least the family of methods they come from.

One aspect of Machine Learning troubles most newcomers, to the point where it is covered in the introduction of almost every ML course: the difference between Supervised and Unsupervised Learning. In this post, I explain what each one means with an example, discuss the difference between them, and tackle some of the misconceptions around them.

Where Do You Place Your Son?

Imagine that you have a young son. Your son has never seen any real animals. It is your responsibility to show him that there are different types of creatures out there. The experience needs to be as hands-on as possible, so that your son learns best. In order to perform this task, you will need to make a few decisions. Those decisions highly depend on one core question: where do you place your son?

Do you place him in a household, bring him a few pets, tell him what kind each one is (e.g. cat, dog, parrot), and explain to him anything he wants to know about them?

OR,

Do you take him to a jungle, where there are far too many animals for you to name, and let him figure out for himself that there are different kinds?

You don’t actually have to choose one of the above. Any environment you choose will likely resemble one or the other. For example, taking him to the zoo falls in the same category as the jungle. Even if you name every animal for him, he will be overwhelmed and won’t be able to memorize all of them as well as he would in a household with pets.

Put simply, the idea is to either expose your son to a few kinds that you know very well and pass that knowledge on to him, or expose him to many kinds and let him find patterns for himself.

Bringing Him Pets

If you go with this option, your son will learn very well what those kinds of animals are. He will be able to find patterns that you can’t understand or don’t have in mind, and that even he can’t express. You can therefore expect him to know what a cat is, how it acts, what it eats, and so on, for the rest of his life.

Your son will also be able to generalize his knowledge to the outdoors. If he sees a dog on the street, he will know it’s a dog. If he sees some animal that he didn’t have as a pet, he won’t be able to recognize the kind, but he will know that it’s nothing he was introduced to (i.e. not a dog or a cat).

Even though he knows some kinds very well, you can only teach him so much. You just can’t get him a lion or an elephant as a pet. You can’t bring him some weird insect that even you don’t know enough about to teach him.

What you just did here is treat your son as a Supervised Learning classifier. You gave him kinds that you know enough about. You told him their names. You explained to him the things that he cares about and can make use of (e.g. food, color, eyes, ear shape). You gave him enough time to understand how they act and to differentiate between them very well.

In Supervised Learning, you provide the model with labeled data so that the model can learn to generalize based on those labels. For example, you feed a model 1000 images of cats, each associated with the label “cat”, and 1000 images of dogs, each labeled “dog”.
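To make the idea of learning from labeled data concrete, here is a minimal sketch of a nearest-neighbor classifier in plain Python. It is only one of many possible supervised methods, chosen for brevity. The features (weight and ear length), the numbers, and the names `training_data` and `predict` are all made up for illustration.

```python
import math

# Hypothetical labeled training data: (weight in kg, ear length in cm) -> label.
# The numbers are invented for illustration only.
training_data = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
    ((25.0, 10.0), "dog"),
]


def predict(features, labeled_samples):
    """Return the label of the training sample closest to `features`."""
    # Pick the (features, label) pair with the smallest Euclidean distance.
    _, nearest_label = min(
        labeled_samples,
        key=lambda sample: math.dist(sample[0], features),
    )
    return nearest_label


# Two unseen animals: one small, one large.
small_animal = predict((4.2, 6.8), training_data)
large_animal = predict((27.0, 11.5), training_data)
print(small_animal, large_animal)
```

Notice that the classifier can only ever answer with a label it was given during training, which mirrors the limitation of the household with pets.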

The way Supervised Learning works requires you to have enough knowledge about the classes (i.e. animals, in our analogy) so that you can pass that knowledge (i.e. labeled data) on to the classifier, your son. The limitation here comes from the need to have that labeled data, which is usually difficult to obtain.

Throwing Him in a Jungle

If you go with this option, your son will be able to learn and recognize different patterns among animals, and put them in small groups based on what he observes. For instance, he will likely put most birds together in the same group in his head, as they share one distinctive characteristic: the ability to fly.

You also need to bear in mind that your son won’t gain deep knowledge of any one kind of animal, given that there are just so many to keep in mind. However, the patterns he will find for himself will likely dazzle you. He will be able to make links that you would never have thought of.

Even though your son won’t be able to name the animal kinds, he will keep finding patterns and grouping animals together. He will even be able to put a newly encountered kind of animal straight into its group without knowing its name.

What you just did here is treat your son as an Unsupervised Learning model. You introduced him to a lot of kinds that you don’t necessarily know much about. He was able to learn the differences for himself nonetheless. He will also likely end up with some kinds that just don’t fit any group (i.e. outliers). Those kinds will fall somewhere between other groups of animals.

In Unsupervised Learning, you provide the model with unlabeled data samples and let it find patterns and group the samples together based on the patterns it arrives at.

Technicalities

A Machine Learning model typically learns under either Supervised or Unsupervised Learning (or Reinforcement Learning in other contexts). These can be thought of as “learning paradigms” followed in practice when building a Machine Learning model. Which paradigm to follow depends heavily on the application at hand and the type of data available. Labeled data is always desirable, as it can be used in both Supervised and Unsupervised applications (ignoring the labels in the latter). However, labeled data is usually expensive to obtain, and the labeled data that can be found rarely fits the purpose of the application exactly.

Supervised Learning

As we saw in the analogy above, in Supervised Learning you know the labels, and you feed them alongside the data samples into the Machine Learning model for training. Examples of Supervised Learning algorithms include:

Linear Regression: a Machine Learning algorithm that maps numeric inputs to numeric outputs by fitting a line to the data points.
Logistic Regression: a classification algorithm that is widely used when the dependent variable is binary (0 or 1).
Neural Networks: a family of Machine Learning models that stack layers of linear transformations with non-linear activation functions in between, which lets them learn non-linear relationships.
Support Vector Machines: a Machine Learning algorithm that uses margin maximization to determine the optimal separating boundary between classes, and can use the kernel trick to handle classes that are not linearly separable.
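As an illustration of the first item in the list, here is a minimal sketch of Linear Regression with a single input, using the closed-form least-squares solution in plain Python. The function name `fit_line` and the data are my own assumptions for the example. Real applications would normally rely on a library and handle multiple input features.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: the "labels" here are the numeric outputs ys,
# generated from y = 2x + 1 so the fit is easy to check by hand.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict the output for an unseen input
print(slope, intercept, prediction)
```

The known outputs `ys` play the same role here as the "cat" and "dog" labels do in classification: they are what makes the learning supervised.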

Applications of such algorithms usually include image classification, speech recognition, regression-based numeric prediction, etc.

Unsupervised Learning

It is worth emphasizing that the major difference between Supervised and Unsupervised Learning algorithms is the absence of data labels in the latter. Instead, only the data features are fed into the learning algorithm, which determines how to group them (usually identifying the groups with numbers such as 0, 1, 2) and based on what. This “based on what” part dictates which Unsupervised Learning algorithm to follow.

It is important to mention that many Unsupervised Learning applications rely on clustering. Clustering is the process of grouping data samples together into clusters based on how similar their features are, which is one of the core purposes of Unsupervised Learning in the first place.

Examples of Unsupervised Learning algorithms include:

k-Means Clustering: a clustering algorithm that separates data points into k clusters based on k centroids, where each centroid is the mean of the samples currently assigned to its cluster.
Autoencoders: a form of Neural Network trained to reconstruct its own input, so the output belongs to the same feature space as the input and the input itself serves as the training target, with no labels needed.
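To show how an algorithm can come up with its own numeric group labels, here is a minimal sketch of k-Means Clustering in plain Python. The data points and the function name `k_means` are assumptions for illustration. For reproducibility, this sketch simply uses the first k points as the initial centroids, whereas practical implementations usually initialize them randomly or with smarter schemes.

```python
import math


def k_means(points, k, iterations=10):
    """Cluster `points` into k groups; return (assignments, centroids)."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


# Hypothetical unlabeled data: two visibly separate groups of 2D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.5, 8.0)]

assignments, centroids = k_means(points, k=2)
print(assignments)
print(centroids)
```

The numbers in `assignments` are just cluster indices. The algorithm never learns what the groups mean, only that the points in each group resemble one another.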

Applications of such algorithms may include: content recommendation, product promotion, fraud detection, etc.

Conclusion

Machine Learning can be separated into two main paradigms based on the learning approach followed. Supervised Learning algorithms learn from both the data features and the labels associated with them. Unsupervised Learning algorithms take only the features of data points, without the need for labels, and produce their own groupings, often identified by enumerated labels. The choice of which paradigm to follow depends on the application at hand and the type of data available, making each one superior within its own realm.