Supervised vs. Unsupervised Learning – Use & Myths!

Published on: May 20th, 2020 | Categories: artificial intelligence

Have you ever wondered whether there is a connection between the weather and your sales figures, or how your sales will develop over the next few months? If so, you have probably often wished for a “black box” for these problems: you feed in your problem as input, and the solution magically appears on the other side.

But isn’t this exactly what algorithms in machine learning already do? Let’s find out.

First of all, it is important to understand what an algorithm actually is. Simply put, an algorithm is a way of solving a problem. For example, when developing chatbots, we are faced with the problem that we need to know which questions our users will ask.

For this purpose, we use an algorithm for topic recognition. With its help, we try to understand which topics are important to the users and which questions are asked most frequently.


What is the difference between Supervised and Unsupervised Learning?

However, these algorithms have to be constantly trained and improved. No matter how well you know the users of your chatbot and their problems, new topics or questions can always arise that you have never even thought of.

You can distinguish between two types of training for an algorithm: Unsupervised Learning and Supervised Learning. They differ mainly in the type of data they require and the use cases they suit.

Supervised Learning

Data set: The algorithm is given an example data set whose records are already divided into specific categories (labeled data).

Use cases: Well suited for classification, i.e. assigning data to the given categories. Also well suited for regression analysis, for example to predict how frequently certain questions will be asked.

Unsupervised Learning

Data set: The algorithm is given data that has no previous classification (unlabeled data).

Use cases: Well suited for clustering, which identifies patterns in the data. Also suitable for association analysis, which recognizes relationships among the data.
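
To make the difference concrete, here is a minimal sketch in plain Python. The toy features (message length in words, number of question marks), the label names and the helper functions are all assumptions made for illustration; a real project would more likely use a machine learning library. The supervised part learns one centroid per given label, while the unsupervised part (a basic k-means) has to discover the groups without any labels.

```python
from statistics import mean

# Hypothetical toy data: (message length in words, number of question marks).
labeled = [
    ((3.0, 1.0), "short question"),
    ((4.0, 1.0), "short question"),
    ((20.0, 0.0), "long statement"),
    ((22.0, 0.0), "long statement"),
]

def centroid(points):
    """Average position of a list of points."""
    return tuple(mean(axis) for axis in zip(*points))

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_supervised(examples):
    """Supervised: labels are given, so compute one centroid per label."""
    by_label = {}
    for point, label in examples:
        by_label.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, point):
    """Assign a new point to the label with the nearest centroid."""
    return min(model, key=lambda label: dist2(model[label], point))

def kmeans(points, k, iterations=10):
    """Unsupervised: no labels, so the groups are discovered by k-means."""
    centers = points[:k]  # naive initialisation: the first k points
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(centers[i], p))
            groups[nearest].append(p)
        # Keep the old center if a group ended up empty.
        centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
    return groups

# Supervised learning: train on labeled data, then classify a new message.
model = fit_supervised(labeled)
prediction = predict(model, (5.0, 1.0))

# Unsupervised learning: the same points, but with the labels thrown away.
unlabeled = [point for point, _ in labeled]
clusters = kmeans(unlabeled, k=2)

print(prediction)
print(clusters)
```

Note that k-means only returns anonymous groups. Deciding what each group means is still up to a person.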



Unsupervised learning is not perfect

At first glance, the Unsupervised Learning approach seems to be the desired black box through which patterns or correlations in the data can be automatically recognized without knowing about them in advance. For example, as in our initial question, this could be a connection between weather and sales.

In theory, an Unsupervised Learning algorithm can find such connections. In practice, the approach works well for some use cases but often produces outliers and errors. For example, some of the clusters it finds may not make sense at all, or additional information may be required to interpret them.
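
How fragile an automatically discovered connection can be is easy to show with a small sketch. It uses plain Pearson correlation, which is a simple statistic and not an unsupervised learning algorithm, and the temperature and sales figures are made up for illustration. A single faulty record is enough to hide an otherwise perfect relationship.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: daily temperature (degrees Celsius) and units sold.
temperature = [10, 15, 20, 25, 30]
sales = [20, 30, 40, 50, 60]

# Clean data: sales rise in step with the temperature.
r_clean = pearson(temperature, sales)

# Same data plus one outlier, e.g. a hot day on which the shop was closed.
r_with_outlier = pearson(temperature + [35], sales + [5])

print(f"without outlier: {r_clean:.2f}")
print(f"with outlier:    {r_with_outlier:.2f}")
```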

The Twitter account “City Describer” shows numerous examples of such incorrect classifications. The account posts photos of cities together with descriptions generated by Microsoft’s Computer Vision AI, and the results are not always successful. For example, a picture of Central Park is described as a traffic light on a tree, and the Vatican is mistaken for a table with food.



The right mix matters

Since the unsupervised learning approach does not always work reliably, a combination of different supervised and unsupervised learning approaches is often a good choice. At Onlim, we also use a combination of both approaches to discover additional search terms.

Let us assume that you have developed a bot that answers users’ questions about events. The bot is able to answer questions when the user asks for events at a certain place or on a certain date. You would like to optimize this bot further. To do this, you can use an Unsupervised Learning approach, which allows you to identify additional topics. In doing so, you may find that users are not only interested in events at a certain place or on a certain date, but also in different kinds of events, such as sports or music.

You can use this finding to train a supervised learning algorithm that sorts the user messages into event categories.
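
The two steps can be sketched roughly as follows. This is a simplified illustration and not the method used at Onlim: the messages, the stopword list, the threshold, the cluster names and all function names are assumptions. The unsupervised step is a greedy single-pass clustering based on word overlap (Jaccard similarity), and the supervised step is a simple word-count classifier trained on the clusters after a person has named them.

```python
from collections import Counter

STOPWORDS = {"a", "any", "are", "is", "on", "the", "there", "this", "which"}

def tokens(message):
    """Lower-case a message and return its non-stopword words as a set."""
    words = message.lower().replace("?", "").split()
    return {w for w in words if w not in STOPWORDS}

def jaccard(a, b):
    """Share of words that two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def discover_clusters(messages, threshold=0.3):
    """Unsupervised step: put each message into the first cluster whose
    vocabulary overlaps enough, otherwise start a new cluster."""
    clusters = []
    for message in messages:
        toks = tokens(message)
        for cluster in clusters:
            if jaccard(toks, cluster["vocab"]) >= threshold:
                cluster["members"].append(message)
                cluster["vocab"] |= toks
                break
        else:
            clusters.append({"vocab": set(toks), "members": [message]})
    return clusters

def train(labeled_messages):
    """Supervised step: count how often each word occurs per label."""
    model = {}
    for message, label in labeled_messages:
        model.setdefault(label, Counter()).update(tokens(message))
    return model

def classify(model, message):
    """Pick the label whose training words match the message best."""
    toks = tokens(message)
    return max(model, key=lambda label: sum(model[label][t] for t in toks))

# Hypothetical user messages sent to the event bot.
messages = [
    "Any football matches this weekend?",
    "Which football matches are on this weekend?",
    "Any jazz concerts this weekend?",
    "Are there rock concerts this weekend?",
]

clusters = discover_clusters(messages)

# Manual review: a person inspects each cluster and gives it a name.
cluster_names = ["sports", "music"]

labeled = [
    (message, name)
    for cluster, name in zip(clusters, cluster_names)
    for message in cluster["members"]
]
model = train(labeled)

print(classify(model, "Is there a football game tonight?"))
```

The manual naming step in the middle is the important part: the clustering proposes groups, but a person decides whether they are meaningful before they become training labels.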

Additionally, we are working on identifying new topics in user messages using a combined approach. Imagine the following scenario: You have built a bot for a ski resort that can answer questions about a stay at the resort. Even if your bot is well designed, problems and questions can arise for which the bot has not yet been trained. For example, weather conditions in the ski resort may suddenly change for the worse, which leads to more questions about the use of snow chains, winter tyres or the current snow depth. An Unsupervised Learning approach can help to bring these new questions to light. We can then define new clusters, refine them using a supervised learning approach and use them for further training of the bot.
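
A very rough stand-in for this idea is sketched below. The known topics, their keyword sets, the stopword list and the incoming messages are all hypothetical, and simple word counting replaces a real clustering algorithm. Messages that match none of the known topics are collected, and their most frequent words point a reviewer towards a possible new topic.

```python
from collections import Counter

STOPWORDS = {"a", "are", "do", "how", "i", "is", "much", "on", "the", "what", "when"}

# Hypothetical topics the ski resort bot has already been trained on.
known_topics = {
    "lift tickets": {"lift", "ticket", "tickets", "price", "pass"},
    "opening hours": {"open", "opening", "hours", "close", "lifts"},
}

def tokens(message):
    """Lower-case a message and return its non-stopword words as a set."""
    words = message.lower().replace("?", "").split()
    return {w for w in words if w not in STOPWORDS}

def matches_known_topic(message):
    """True if the message shares at least one word with a known topic."""
    toks = tokens(message)
    return any(toks & vocab for vocab in known_topics.values())

# Hypothetical messages received after the weather changed.
incoming = [
    "How much is a lift ticket?",
    "Do I need snow chains today?",
    "Are snow chains required on the access road?",
    "What is the current snow depth?",
    "When do the lifts open?",
]

# Collect everything the bot cannot assign to a known topic.
unmatched = [m for m in incoming if not matches_known_topic(m)]

# Count words across the unmatched messages to surface candidate topics.
word_counts = Counter()
for message in unmatched:
    word_counts.update(tokens(message))

candidates = word_counts.most_common(2)
print(unmatched)
print(candidates)
```

The output is only a hint. Someone still has to decide whether the frequent words describe a real new topic and how the bot should answer it.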

The examples show that the term “unsupervised” is rather misleading and that it is always necessary to check and adjust the results. The paper “Is ‘Unsupervised Learning’ a Misconceived Term?” also offers an interesting perspective on this topic. It proposes renaming the terms to “Internally Supervised” and “Externally Supervised” instead.





The future of Unsupervised Learning

Just because there is no “real” Unsupervised Learning yet does not mean that no research is being done on this topic. In his book “The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World”, Pedro Domingos describes how research institutions are working hard to develop the “ultimate algorithm”.

It is hard to say how long it will take until a true black box exists. Until then, combining different training approaches can already achieve impressive results.

If you are interested in how we use Unsupervised Learning at Onlim and what advantages it has for your chatbot, please arrange a free consultation with one of our chatbot experts. We are happy to tell you more about it.
