What is unsupervised learning?


Unsupervised learning covers a variety of different tasks. Depending on the task at hand, different techniques can be used. However, there are a few common paradigms that are used to extract information from unlabelled data.

A simple yet effective method is to create labels from the input data itself. This is also known as self-supervised learning. Typically, this is done by using some modified (e.g. corrupted or truncated) version of the data as the input and the original data as the label. This requires the model to learn a reconstruction function, which forces it to capture a lot of useful information about the data at hand. For example, autoregressive models like GPT are trained to predict one or more of the next elements given a sub-sequence.
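To make the idea of "creating labels from the inputs" a bit more concrete, here is a minimal sketch in Python. The helper names `next_element_pairs` and `masked_pairs` are made up for illustration, and zeroing out random entries is just one assumed way of corrupting the data. No model is trained here; the code only shows how the (input, label) pairs come out of unlabelled data.

```python
import numpy as np

def next_element_pairs(sequence, context_size):
    """Turn an unlabelled sequence into (context, next element) pairs."""
    inputs, targets = [], []
    for i in range(len(sequence) - context_size):
        inputs.append(sequence[i:i + context_size])  # sub-sequence the model sees
        targets.append(sequence[i + context_size])   # next element acts as the label
    return inputs, targets

def masked_pairs(data, mask_fraction, rng):
    """Corrupt the data by zeroing random entries; the original is the label."""
    mask = rng.random(data.shape) < mask_fraction
    corrupted = np.where(mask, 0.0, data)
    return corrupted, data

# Autoregressive-style pairs from a toy token sequence.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
contexts, labels = next_element_pairs(tokens, context_size=2)

# Reconstruction-style pairs from toy numeric data.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
x_corrupted, x_target = masked_pairs(x, mask_fraction=0.3, rng=rng)
```

In both cases a standard supervised loss can then be applied to the generated pairs, even though nobody had to annotate anything.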

It is also possible to learn something about the complex distribution of the data by mapping (unlabelled) samples to some simple distribution and/or vice versa. For example, kernel density estimators represent the data distribution as a simple mixture model with one component per sample. Many modern generative models (for images and audio) learn to map samples from a normal distribution to samples from the data distribution (as captured by the model).
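As a rough illustration of the kernel density estimator mentioned above, the following sketch builds a 1-D estimate with Gaussian kernels. The function names, the bandwidth of 0.3 and the toy data are all assumptions chosen for the example. The second function also hints at the "simple distribution to data distribution" direction: new points are obtained by picking a stored sample and adding scaled standard-normal noise.

```python
import numpy as np

def gaussian_kde_density(query, samples, bandwidth):
    """Evaluate a 1-D Gaussian KDE: an equally weighted mixture, one Gaussian per sample."""
    query = np.asarray(query, dtype=float)[:, None]
    samples = np.asarray(samples, dtype=float)[None, :]
    z = (query - samples) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)  # average over the mixture components

def sample_from_kde(samples, bandwidth, n, rng):
    """Draw new points: pick a stored sample, then add scaled standard-normal noise."""
    centres = rng.choice(np.asarray(samples, dtype=float), size=n)
    return centres + bandwidth * rng.standard_normal(n)

# Toy unlabelled data drawn from two bumps.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(3.0, 1.0, 200)])

grid = np.linspace(-10.0, 12.0, 2201)
density = gaussian_kde_density(grid, data, bandwidth=0.3)
new_points = sample_from_kde(data, bandwidth=0.3, n=5, rng=rng)
```

Modern generative models replace the "pick a sample and add noise" step with a learned mapping, but the underlying idea of relating a simple distribution to the data distribution is similar.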

Finally, it is often possible to directly formulate an optimisation objective for the kind of information you wish to extract. The key challenge here is to find an objective that can be optimised efficiently. For example, clustering methods typically try to find an assignment that minimises the distances between samples in the same cluster. Another example is principal component analysis: the principal components are the orthogonal unit vectors that maximise the variance of the data projected onto them.
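Both examples can be sketched in a few lines of NumPy. The code below is only one possible way to approach these objectives, not the definitive one: the clustering part uses Lloyd's algorithm (k-means), which minimises squared distances to cluster centroids, and the principal components are computed via a singular value decomposition of the centred data. The function names and the toy data are made up for illustration.

```python
import numpy as np

def kmeans(points, k, n_iters, rng):
    """Lloyd's algorithm: alternate between assigning points and updating centroids."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Squared distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # Final assignment so that labels match the returned centroids.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centroids

def principal_components(points, n_components):
    """Orthogonal unit vectors of maximal variance, via SVD of the centred data."""
    centred = points - points.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    variances = singular_values**2 / (len(points) - 1)
    return vt[:n_components], variances[:n_components]

# Toy unlabelled data: two well-separated blobs in 2-D.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[10.0, 10.0], scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

cluster_labels, centroids = kmeans(points, k=2, n_iters=10, rng=rng)
components, variances = principal_components(points, n_components=2)
```

Note that Lloyd's algorithm only finds a local optimum that depends on the initial centroids, which is exactly the "(approximate) solutions" caveat: the objective is easy to write down, but solving it exactly is hard in general.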

I tried to break things down a bit for the sake of clarity, but in the end, learning typically comes down to optimisation. Therefore, if you can set up an objective and have a method to find (approximate) solutions for your objective, you will be able to learn something, even from unlabelled data.

posted over 1 year ago

CC BY-SA 4.0

mr Tsjolder‭
