[Question] What Is the Idea Behind (Un-)Supervised Learning and Reinforcement Learning?

Through the authority of my ML class and Wikipedia[1], I “learned” that ML algorithms can be roughly grouped into three clusters: unsupervised, supervised, and reinforcement learning.

While learning about machine learning, one term I stumbled over is (un)supervised learning. At this point I might be able to guess the teacher’s password if someone asked me what (un)supervised learning means[1], but I don’t understand why one would carve up reality this way. Most resources I found through googling didn’t really help; they only suggested that, for some reason, reinforcement learning is treated as yet another separate category.

So instead, I felt forced to think about this for myself. Here is the rough, abstract picture in my head after thinking for a few minutes:

How do they relate to carving up thing space/​feature space?

  • Unsupervised learning: Computing “summary statistics” for a set of data points.

  • Supervised learning: Extrapolating missing features for a set of data points.

  • Reinforcement learning: A special case of supervised learning, namely learning some function (state, action) → “value” (something humans care about).
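To make these three framings concrete, here is a minimal sketch in plain Python. It only illustrates the picture above and is not a standard definition of any of the three categories. The toy data and the helper names (`summarize`, `fit_line`, `update_value`) are made up for this example.

```python
from statistics import mean

# Toy data set (invented for illustration): each point is (height_cm, weight_kg).
points = [(150.0, 50.0), (160.0, 58.0), (170.0, 66.0), (180.0, 74.0)]


def summarize(data):
    """Unsupervised view: compress the data into a summary statistic.

    Here the summary is just the per-feature mean; no feature is singled out.
    """
    return tuple(mean(column) for column in zip(*data))


def fit_line(data):
    """Supervised view: treat the second feature as 'missing' and learn to
    extrapolate it from the first, via least squares for y = slope * x + intercept.
    """
    xs, ys = zip(*data)
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in data)
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar


def update_value(q, state, action, reward, lr=0.5):
    """Reinforcement view: learn a table (state, action) -> value from rewards.

    This is a one-step update with no discounting, the simplest possible case.
    """
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + lr * (reward - old)
    return q


summary = summarize(points)            # per-feature means
slope, intercept = fit_line(points)    # model that fills in the missing weight
predicted_weight = slope * 175.0 + intercept
q_table = update_value({}, "hungry", "eat", reward=1.0)
```

In this sketch the only difference between `summarize` and `fit_line` is that the latter singles out one feature as the thing to predict, and the value table has the same shape as a supervised target, except that the “label” is a reward.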

Why are these categories useful?

Machine learning is about getting things done without needing humans to do them. Here is how the different kinds of algorithms contribute:

  • Unsupervised learning: Computing summary statistics that inform human decisions.

  • Supervised learning: Making predictions for humans, and thereby implicitly making decisions for them.

  • Reinforcement learning: Explicitly making decisions on behalf of humans.

How do they relate to each other?

It seems like any supervised learning algorithm would need to contain some unsupervised algorithm doing the compression. Conversely, an unsupervised algorithm could be turned into a “supervised” one by using its “summary statistics” to predict features we don’t know about a data point from its other features. In the end, the whole reason we do any of this is that we care about some outcome: reinforcement learning is a special kind of supervised algorithm, designed so that we can exceed humans by explicitly encoding their judgement in order to “act” on surfaced insights.
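As a rough illustration of turning an unsupervised algorithm into a “supervised” one, the sketch below clusters complete data points with a tiny k-means and then uses the resulting centroids (the “summary statistics”) to fill in a feature that is missing from a new point. The data, the fixed initial centroids, and the function names (`kmeans`, `predict_missing`) are assumptions made up for this example.

```python
def dist2(a, b):
    """Squared Euclidean distance over the coordinates both tuples share."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(data, centroids, steps=10):
    """Very small k-means: returns centroids that summarize the data."""
    for _ in range(steps):
        groups = [[] for _ in centroids]
        for point in data:
            nearest = min(
                range(len(centroids)),
                key=lambda i: dist2(point, centroids[i]),
            )
            groups[nearest].append(point)
        # Recompute each centroid as the mean of its group; keep it if empty.
        centroids = [
            tuple(sum(col) / len(group) for col in zip(*group)) if group else old
            for group, old in zip(groups, centroids)
        ]
    return centroids


def predict_missing(known, centroids):
    """Match on the known leading features, then read off the remaining
    features from the closest centroid."""
    n = len(known)
    best = min(centroids, key=lambda c: dist2(known, c[:n]))
    return best[n:]


# Toy data (invented): three features per point, two obvious clusters.
data = [
    (1.0, 1.0, 10.0),
    (1.2, 0.8, 11.0),
    (5.0, 5.0, 50.0),
    (5.2, 4.8, 51.0),
]

# Unsupervised step: no feature is treated as a label.
centroids = kmeans(data, centroids=[data[0], data[2]])

# "Supervised" use: the third feature of a new point is unknown.
guess = predict_missing((1.1, 1.0), centroids)
```

Nothing in the clustering step knows which feature will later be missing, which is what makes the summary reusable for predicting any of them.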

The unsupervised/​supervised distinction still seems a bit “unnatural”/​useless to me. Feel free to leave a comment if you have more insight into why one might or might not carve up reality this way.



  1. Definition from Wikipedia: “Unsupervised learning is a type of algorithm that learns patterns from untagged data. The hope is that through mimicry, which is an important mode of learning in people, the machine is forced to build a compact internal representation of its world and then generate imaginative content from it. In contrast to supervised learning where data is tagged by an expert, e.g. as a ‘ball’ or ‘fish’, unsupervised methods exhibit self-organization that captures patterns as probability densities or a combination of neural feature preferences. The other levels in the supervision spectrum are reinforcement learning where the machine is given only a numerical performance score as guidance, and semi-supervised learning where a smaller portion of the data is tagged.”
