What is distillation?
The process of compressing one model (known as the teacher) into a smaller model (known as the student) that emulates the original model's predictions as faithfully as possible.
Distillation explained in plain English
Distillation is useful because the smaller model (the student) has two key benefits over the larger model (the teacher):
- Faster inference time
- Reduced memory and energy usage
However, the student's predictions are typically not as good as the teacher's predictions.
Distillation trains the student model to minimize a loss function based on the difference between the outputs of the student and teacher models.
Compare and contrast distillation with the following terms:
- fine-tuning
- prompt-based learning
See LLMs: Fine-tuning, distillation, and prompt engineering in Machine Learning Crash Course for more information.
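To make the idea of a loss based on the difference between student and teacher outputs concrete, here is a minimal sketch. The text above does not name a specific loss, so this sketch assumes one common choice: the KL divergence between temperature-softened output distributions. The function names, the `temperature` parameter, and the example logits are all illustrative assumptions, not part of any particular library.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature softens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs.

    Assumption: KL divergence with a temperature is one common choice of
    distillation loss; the source only says the loss is based on the
    difference between student and teacher outputs.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )

# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different class

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

The student that roughly agrees with the teacher gets the smaller loss, and a student whose outputs match the teacher's exactly gets a loss of zero. In training, this value is what the student's parameters would be adjusted to reduce.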
Example
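For example, a team might distill a large model into a smaller student that can run on a phone, accepting somewhat less accurate predictions in exchange for faster inference and lower memory and energy usage. The term appears in research papers, product documentation, and technical discussions about building, training, and evaluating machine learning systems.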
People also read
- Inference
The phase when a trained model is actually used — taking new input and producing a prediction or response.
- prompt tuning
A parameter-efficient tuning mechanism that learns a "prefix" that the system prepends to the actual prompt.
- average precision at k
A metric for summarizing a model's performance on a single prompt that generates ranked results, such as a numbered list of book recommendations.
- bag of words
A representation of the words in a phrase or passage, irrespective of order.
- bidirectional language model
A language model that determines the probability that a given token is present at a given location in an excerpt of text based on the preceding and following text.
- black box model
A model whose "reasoning" is impossible or difficult for humans to understand.
- Chain-of-Thought Prompting
Asking an AI to show its reasoning step by step before giving a final answer, which often improves accuracy on complex tasks.
- conversational coding
An iterative dialog between you and a generative AI model for the purpose of creating software.
- cross-entropy
A generalization of Log Loss to multi-class classification problems.
- deterministic
A system that always returns the same output for a given input.