Machine Learning Beginner

self-training

A variant of self-supervised learning that is particularly useful for classification problems in which the ratio of unlabeled examples to labeled examples in the dataset is high.

Plain English Explanation

A variant of self-supervised learning that is particularly useful when all of the following conditions are true:

- The ratio of unlabeled examples to labeled examples in the dataset is high.
- This is a classification problem.

Self-training works by iterating over the following two steps until the model stops improving:

1. Use supervised machine learning to train a model on the labeled examples.
2. Use the model created in Step 1 to generate predictions (labels) on the unlabeled examples, moving those in which there is high confidence into the labeled examples with the predicted label.

Notice that each iteration of Step 2 adds more labeled examples for Step 1 to train on.
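The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the nearest-centroid "model", the distance-based confidence score, the 0.7 threshold, and the function names (train_centroids, predict_with_confidence, self_train) are all assumptions chosen to keep the example self-contained. It also simplifies the stopping rule: instead of measuring whether the model has stopped improving, it stops when no unlabeled example clears the confidence threshold.

```python
from statistics import mean


def train_centroids(labeled):
    """Step 1: fit a tiny supervised model (one centroid per class)."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence) for a 1-D point x.

    Confidence is a toy score (assumption): 1 minus the predicted class's
    share of the total distance to all centroids. For two classes it
    ranges from 0.5 (equidistant) to 1.0 (on the centroid).
    """
    dists = {y: abs(x - c) for y, c in centroids.items()}
    label = min(dists, key=dists.get)
    total = sum(dists.values())
    confidence = 1.0 if total == 0 else 1.0 - dists[label] / total
    return label, confidence


def self_train(labeled, unlabeled, threshold=0.7, max_rounds=10):
    """Alternate Step 1 and Step 2 until no confident predictions remain."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train_centroids(labeled)  # Step 1
        confident, remaining = [], []
        for x in unlabeled:  # Step 2
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                confident.append((x, label))  # pseudo-labeled example
            else:
                remaining.append(x)
        if not confident:
            break  # simplified stopping rule (see lead-in)
        labeled.extend(confident)
        unlabeled = remaining
    return train_centroids(labeled), labeled, unlabeled


# Two labeled examples, six unlabeled ones (made-up toy data).
model, labeled, leftover = self_train(
    labeled=[(0.0, "a"), (10.0, "b")],
    unlabeled=[1.0, 2.0, 3.2, 5.0, 9.0, 8.0],
)
```

With this toy data, the point 3.2 is not confident enough in the first round but is picked up in the second, after the centroids have shifted toward the newly pseudo-labeled points. The point 5.0 stays unlabeled because it never clears the threshold.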

How is it used?

Practitioners use self-training to grow a small labeled dataset for a classification problem when unlabeled examples are plentiful. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.