AIExplainer
Large Language Models Intermediate 1 min read

What is an autorater evaluation?

Autorater evaluation is a hybrid mechanism for judging the quality of a generative AI model's output that combines human evaluation with automatic evaluation. An autorater is an ML model trained on data created by human evaluation. Ideally, an autorater learns to mimic a human evaluator. Prebuilt autoraters are available, but the best autoraters are fine-tuned specifically for the task you are evaluating.
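One way to check how closely an autorater mimics a human evaluator is to compare its verdicts with human verdicts on the same held-out outputs. The sketch below is a minimal illustration under stated assumptions, not a standard procedure: the labels are made up, and the function names `agreement_rate` and `cohens_kappa` are hypothetical helpers written for this example. It reports raw agreement and Cohen's kappa, which corrects agreement for chance.

```python
from collections import Counter


def agreement_rate(human, auto):
    """Fraction of items where the autorater's label matches the human label."""
    if len(human) != len(auto) or not human:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(h == a for h, a in zip(human, auto)) / len(human)


def cohens_kappa(human, auto):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(human)
    p_o = agreement_rate(human, auto)  # observed agreement
    human_counts = Counter(human)
    auto_counts = Counter(auto)
    # Expected agreement if both raters labeled independently
    # with their own label frequencies.
    p_e = sum(human_counts[label] * auto_counts[label]
              for label in human_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Made-up verdicts on eight held-out model outputs (illustrative only).
human_labels = ["good", "good", "bad", "good", "bad", "bad", "good", "bad"]
auto_labels = ["good", "good", "bad", "bad", "bad", "bad", "good", "good"]

print(agreement_rate(human_labels, auto_labels))  # 0.75
print(cohens_kappa(human_labels, auto_labels))    # 0.5
```

A low score on held-out human ratings for your own task is one signal that a prebuilt autorater may need task-specific fine-tuning before its judgments can stand in for human ones.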

The term appears in research papers, product documentation, and technical discussions of AI capabilities and limitations, typically in the context of building, training, or evaluating machine learning systems.