What types of learning should I know about?


This question sounds basic, so I'll try to answer with a layperson in mind.

"Learning" is a bit misleading, because programs do not currently learn the way humans do. They do not have a mind or ability to reason, they don't "think". In essence, all machine learning today, including advanced AI like ChatGPT, is dumb habit, more akin to a mussel learning to close in response to some noxious stimuli than a college student learning a complex theory in class.

The learning happens because the model embeds some formula for crunching the input (which is currently always converted to numeric form, i.e. to a list of numbers) and spitting out some result. This result can be a simple 1/0 (binary classification), a new list of numbers which can be converted to an image (DALL-E), or even the URL or ID pointing to some resource online or in a database (retrieval, content recommendation), but currently it is always numeric (a list of numbers).
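To make the "input becomes numbers, formula spits out a result" idea concrete, here is a minimal sketch of a binary classifier. The vocabulary, the weights and the spam/not-spam framing are all made up for illustration; a real model would learn the weights rather than have them typed in by hand.

```python
# Hypothetical vocabulary and hand-picked weights (assumptions for illustration).
VOCAB = ["free", "winner", "meeting", "invoice"]
WEIGHTS = [1.5, 2.0, -1.0, -0.5]
BIAS = -1.0


def to_numbers(text):
    """Convert text into a list of numbers: one word count per vocabulary word."""
    words = text.lower().split()
    return [words.count(word) for word in VOCAB]


def classify(text):
    """Crunch the numbers with a fixed formula and return 1 (spam) or 0 (not spam)."""
    features = to_numbers(text)
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1 if score > 0 else 0


print(to_numbers("Free free meeting"))  # [2, 0, 1, 0]
print(classify("free winner"))          # 1
print(classify("meeting invoice"))      # 0
```

The output here is a single 1/0, but the same pattern (numbers in, numbers out) applies when the output is a longer list that later gets turned into an image or an ID.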

"Learning" is the technical term for tweaking the contents of the formula. For example, with linear regression, there is a set equation but the constants are tweaked. With a tree model, there is a decision tree that gets rearranged. With a neural network, there are many simple formulas (embedded in virtual "neurons"), and how exactly they are combined (their "weights") is what gets tweaked.

You can learn blindly, by just randomly tweaking the formula until it works on your test set. But usually it is much more efficient to have some heuristic that lets the model tweak itself in a more goal-oriented manner. Professionals usually don't call it a "formula" but rather "weights". Note that "weights" can also refer to the structure of the formula, not merely the values.
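Here is a rough sketch of both approaches on a linear regression, where the equation `y = w * x + b` is fixed and only the constants `w` and `b` get tweaked. The data, the step sizes and the function names are assumptions for illustration. The "heuristic" shown is gradient descent, which is one common choice, not the only one.

```python
import random

# Toy data generated from y = 2x + 1 (an assumption for illustration).
DATA = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def loss(w, b):
    """Mean squared error of the formula y = w * x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in DATA) / len(DATA)


def random_search(steps, seed=0):
    """Blind learning: nudge the constants at random, keep a nudge only if it helps."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    best = loss(w, b)
    for _ in range(steps):
        cand_w = w + rng.uniform(-0.1, 0.1)
        cand_b = b + rng.uniform(-0.1, 0.1)
        cand_loss = loss(cand_w, cand_b)
        if cand_loss < best:
            w, b, best = cand_w, cand_b, cand_loss
    return w, b, best


def gradient_descent(steps, learning_rate=0.05):
    """Goal-oriented learning: move the constants in the direction that lowers the loss."""
    w, b = 0.0, 0.0
    n = len(DATA)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in DATA) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in DATA) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss(w, b)


print("random search:   ", random_search(500))
print("gradient descent:", gradient_descent(2000))
```

Both loops only ever change the two constants; the shape of the formula stays the same throughout.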

In supervised learning, you simply show the model a bunch of paired inputs and correct outputs, and make it figure out the weights that would have resulted in those outputs. There are many ways to actually update the weights. Usually, you don't want the weights to perfectly match your sample data ("overfitting"), but rather you want to allow some freedom so the model can better deal with new data later that was not present in your sample. Incidentally, if the model is too simple for the sample data and it's impossible to find a decent set of weights, that's underfitting. Note that for this to work, you need a potentially very large set of pre-calculated input-output pairs. Often you have to pay people to create this by hand, or pay data vendors to sell the dataset to you.
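A toy sketch of why a perfect match on the sample data is not the goal. Both "models" below are hypothetical and deliberately tiny: one memorizes the training pairs exactly, the other learns a single cut-off value (and assumes the 1-labeled inputs are the larger ones). The numbers are invented for illustration.

```python
# Labeled input-output pairs (made up): small numbers are 0, large numbers are 1.
TRAIN = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]
HELD_OUT = [(0, 0), (4, 0), (5, 1), (9, 1)]  # data the models never saw


class Memorizer:
    """Overfits: stores every training pair and guesses 0 for anything unseen."""

    def fit(self, pairs):
        self.table = dict(pairs)

    def predict(self, x):
        return self.table.get(x, 0)


class Threshold:
    """Learns one number: the midpoint between the two class averages."""

    def fit(self, pairs):
        zeros = [x for x, label in pairs if label == 0]
        ones = [x for x, label in pairs if label == 1]
        self.cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

    def predict(self, x):
        return 1 if x > self.cut else 0


def accuracy(model, pairs):
    """Fraction of pairs where the model's output matches the correct output."""
    return sum(model.predict(x) == y for x, y in pairs) / len(pairs)


for model in (Memorizer(), Threshold()):
    model.fit(TRAIN)
    print(type(model).__name__, accuracy(model, TRAIN), accuracy(model, HELD_OUT))
# Memorizer 1.0 0.5
# Threshold 1.0 1.0
```

Both models are perfect on the sample data, but only the one with "some freedom" copes with inputs it has not seen before.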

In unsupervised learning, there are no precalculated outputs. You can see how it's a bit harder than supervised learning - how will you figure out the weights if you have no sample output? People have come up with some clever tricks though. The classic is clustering, which groups similar data together. Technically, there is an aspect of supervision to clustering - how many groups, how similar is similar enough, etc. are basically weights. But these are usually not "learned" with machine learning, but rather selected in ad hoc ways by the programmer. You don't cluster by showing the model a bunch of correct clusters; instead you implement the clustering logic directly (even if you have to tweak the parameters a bit) - that's what makes it unsupervised. Not all unsupervised learning is clustering.
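As an illustration, here is a minimal one-dimensional k-means sketch, which is one well-known clustering algorithm among many. Note that the number of groups and the starting centers are picked by the programmer, not learned, and no "correct clusters" are ever shown to the code. The data points and function name are assumptions.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group numbers around the given starting centers using plain k-means."""
    centers = list(centers)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Move each center to the average of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    labels = [min(range(len(centers)), key=lambda i: abs(p - centers[i])) for p in points]
    return centers, labels


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
# The programmer decides there are two groups and where the search starts.
centers, labels = kmeans_1d(points, centers=[0.0, 6.0])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(centers)
```

The only inputs are the raw points plus a couple of hand-chosen parameters, which is the sense in which this is unsupervised.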

Reinforcement learning is probably the most similar to biological learning. You show the model some inputs, and based on its response, you "reward" or "punish" it. This doesn't mean you kick or pat the computer; you simply give it a score, which the model uses to update its weights according to whatever algorithm the inventor of that type of model chose. Notably, the reward or punishment is not necessarily based on whether it gave the "correct output" - it can rest on a vaguer basis, such as whether you like the output or not. I'm not an expert on reinforcement learning, so I'll stop here.
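A very small sketch of the reward idea, using a two-choice "bandit" setup with an epsilon-greedy rule. This is only one simple flavor of reinforcement learning, and everything here (the two actions, the `reward` function, the step size) is a made-up assumption. The point is that the model is never told the right answer, it only receives a score.

```python
import random


def reward(action):
    """Hypothetical scoring rule: we happen to like action "b" and dislike "a"."""
    return 1.0 if action == "b" else 0.0


def train(steps=200, epsilon=0.2, learning_rate=0.1, seed=0):
    """Learn how much each action is worth purely from the scores it receives."""
    rng = random.Random(seed)
    values = {"a": 0.0, "b": 0.0}  # the model's current estimate per action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(["a", "b"])       # occasionally try something at random
        else:
            action = max(values, key=values.get)  # otherwise pick the best-looking action
        score = reward(action)
        # Nudge the estimate for the chosen action toward the score it just got.
        values[action] += learning_rate * (score - values[action])
    return values


values = train()
print(values)
print("preferred action:", max(values, key=values.get))
```

Swapping out `reward` for a different scoring rule changes what the model ends up preferring, without touching the learning loop.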

Not mentioned is transfer learning. Many supervised models are programmed so that they're trained on a large batch of data. They don't consume inputs one at a time. Let's say you have a model that predicts if an email would be considered good news or bad news by the reader. Every week, you get feedback from users about whether the model was right or wrong, and you want to use this to refine the training. But it's expensive to train the model on a giant (and ever-growing) batch of emails every week, and a lot of the emails it sees are the same every week, so it's wasted effort to "relearn" those. Isn't there a better way? With some ML algorithms, there is. If the model is programmed the right way, instead of retraining on the whole batch each time, you feed it only the new data, and it updates the weights while respecting what was learned from the previous training runs. This is called incremental training. Be aware that when you train first on data X, then incrementally on data Y, the model may not be the same or as good as if you had trained on data X+Y.
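To sketch what "programmed the right way" can look like, here is a tiny perceptron-style email classifier whose `update` method can be called again with only the new week's data. The class name, the example emails and the labels (1 = good news, 0 = bad news) are all invented for illustration, and a perceptron is just one of several algorithms that can be trained this way.

```python
class OnlinePerceptron:
    """Word-weight classifier that keeps its weights between calls to update()."""

    def __init__(self):
        self.weights = {}  # one weight per word seen so far
        self.bias = 0.0

    def predict(self, text):
        words = text.lower().split()
        score = self.bias + sum(self.weights.get(w, 0.0) for w in words)
        return 1 if score > 0 else 0

    def update(self, batch, epochs=5):
        """Adjust the existing weights using only the examples in this batch."""
        for _ in range(epochs):
            for text, label in batch:
                error = label - self.predict(text)  # -1, 0 or +1
                if error:
                    for w in text.lower().split():
                        self.weights[w] = self.weights.get(w, 0.0) + error
                    self.bias += error


week_1 = [("promotion approved", 1), ("project cancelled", 0)]
week_2 = [("bonus awarded", 1), ("flight delayed", 0)]

model = OnlinePerceptron()
model.update(week_1)  # initial training
model.update(week_2)  # incremental training: week 1 is not revisited

print([model.predict(text) for text, _ in week_1 + week_2])  # [1, 0, 1, 0]
```

In this toy case the week 1 knowledge survives the second update, but as noted above that is not guaranteed in general.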

Another important use of transfer learning: Say Google publishes a model that can recognize human faces in photos. It took them millions of dollars and billions of pre-labeled photos to do it. That's great! But now you want to detect cats in photos. Do you also have to go to all that trouble? Not necessarily. A lot of the work of recognizing objects in pictures is not particular to the type of object: Things like finding outlines, segmenting, and adjusting for light are universal (though not if you want to use a face recognition model to find molecules in electron microscope captures!). If the model Google used supports transfer learning, it can be possible to simply do a much smaller, additional training with a few cat photos only, to teach it to "do the same thing, except with cats instead of faces".
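One common way to do this is to freeze the general-purpose part of the model and train only a small new piece on top. The sketch below imitates that with toy stand-ins: `pretrained_features` plays the role of the expensive, already-trained part (here it is just two hand-written measurements), the "images" are short lists of numbers, and the labels are invented. None of this reflects how any real face or cat model works internally.

```python
def pretrained_features(image):
    """Frozen stand-in for the reusable part of a big model (never retrained here)."""
    brightness = sum(image) / len(image)
    edges = sum(abs(a - b) for a, b in zip(image, image[1:])) / (len(image) - 1)
    return [brightness, edges]


def train_head(examples, epochs=20):
    """Train only a small new 'head' on top of the frozen features (perceptron rule)."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for image, label in examples:
            feats = pretrained_features(image)
            score = bias + sum(w * f for w, f in zip(weights, feats))
            error = label - (1 if score > 0 else 0)
            if error:
                weights = [w + error * f for w, f in zip(weights, feats)]
                bias += error
    return weights, bias


def is_cat(image, weights, bias):
    feats = pretrained_features(image)
    return 1 if bias + sum(w * f for w, f in zip(weights, feats)) > 0 else 0


# A handful of made-up "cat" (1) and "not cat" (0) examples.
few_examples = [
    ([0, 1, 0, 1], 1),
    ([0, 0, 0, 0], 0),
    ([1, 0, 1, 0], 1),
    ([1, 1, 1, 1], 0),
]
weights, bias = train_head(few_examples)
print(is_cat([0, 1, 1, 0], weights, bias))          # 1
print(is_cat([0.5, 0.5, 0.5, 0.5], weights, bias))  # 0
```

Only two weights and a bias are learned from the four new examples; the feature code is reused untouched, which is why so little new data is needed.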

With LLMs, there is also fine-tuning. The bulk of the model is about learning to parse sentences and make new sentences. However, there is a tiny extra bit where you also try to teach the model to not curse, to answer in the correct language, and to not make things up too much. That extra bit is called fine-tuning. I don't know much about fine-tuning, but I suspect it can be done via reinforcement learning, transfer learning, incremental training or some combination of these.

posted over 1 year ago

CC BY-SA 4.0

matthewsnyder‭
