
What is a side-by-side evaluation?

Comparing the quality of two models by judging their responses to the same prompt.

Side-by-side evaluation explained in plain English

Suppose the following prompt is given to two different models: "Create an image of a cute dog juggling three balls." In a side-by-side evaluation, a rater compares the two resulting images and picks the one that is "better". What "better" means (more accurate? more beautiful? cuter?) depends on the criterion the rater applies.
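
Once raters have judged many prompts this way, their verdicts can be tallied into a single comparison. The sketch below is one plausible way to do that, not a standard implementation: the function name `side_by_side_summary`, the labels "A" and "B" for the two models, and the "tie" option are all assumptions made for illustration.

```python
from collections import Counter

def side_by_side_summary(judgments):
    """Tally rater verdicts from a side-by-side evaluation.

    Each verdict is "A", "B", or "tie" (hypothetical labels): which
    model's response the rater preferred for one prompt.
    """
    counts = Counter(judgments)
    unknown = set(counts) - {"A", "B", "tie"}
    if unknown:
        raise ValueError(f"unexpected verdicts: {sorted(unknown)}")

    # Ties are excluded from the win rate so it reflects only the
    # prompts where the rater expressed a preference.
    decided = counts["A"] + counts["B"]
    win_rate_a = counts["A"] / decided if decided else None

    return {
        "A": counts["A"],
        "B": counts["B"],
        "tie": counts["tie"],
        "win_rate_a": win_rate_a,
    }

# Five prompts, each judged once (made-up verdicts for illustration).
judgments = ["A", "B", "A", "tie", "A"]
summary = side_by_side_summary(judgments)
print(summary)
```

Leaving ties out of the win rate is a design choice; counting each tie as half a win for both models is an equally reasonable alternative.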

Example

Practitioners use side-by-side evaluation when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
