What is a golden response?
A response known to be good.
Golden response explained in plain English
A response known to be good. For example, given the following prompt:
2 + 2
The golden response is hopefully:
4
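When a prompt has a single right answer, checking a model against the golden response can be as simple as a normalized string comparison. The following is a minimal sketch of that idea; the names `matches_golden`, `golden_responses`, and `is_correct` are hypothetical and chosen only for illustration, and real evaluations often need more careful normalization.

```python
def matches_golden(response: str, golden: str) -> bool:
    """Compare a model response to the golden response.

    Surrounding whitespace and letter case are ignored (an assumption made
    for this sketch; stricter or looser matching may suit other tasks).
    """
    return response.strip().casefold() == golden.strip().casefold()


# Hypothetical lookup table mapping each prompt to its golden response.
golden_responses = {"2 + 2": "4"}


def is_correct(prompt: str, response: str) -> bool:
    """Return True when the response matches the golden response for the prompt."""
    return matches_golden(response, golden_responses[prompt])


print(is_correct("2 + 2", " 4 "))  # True
print(is_correct("2 + 2", "5"))    # False
```

Exact matching only makes sense for prompts like this one, where any correct response is expected to look the same.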
Some evaluation metrics, such as ROUGE, compare reference text to a model's generated text. When there is a single right answer to a prompt, the golden response typically serves as the reference text.
Some prompts have no one right answer. For example, the prompt "Summarize this document" would likely have many right answers. For such prompts, reference text is often impractical because a model can generate a very wide range of possible summaries. However, a golden response might be helpful in this situation. For example, a golden response containing a good document summary can help train an autorater to discover patterns of good document summaries.
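To make the comparison between reference text and generated text concrete, here is a rough sketch of a ROUGE-1 style score, with the golden response acting as the reference. It is a simplified illustration, not the official ROUGE implementation: it splits on whitespace, lowercases everything, and skips stemming. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, generated: str) -> float:
    """Unigram-overlap F1 between a reference text and a generated text.

    Simplified sketch: whitespace tokenization and lowercasing only.
    """
    ref_counts = Counter(reference.lower().split())
    gen_counts = Counter(generated.lower().split())

    # Counter intersection keeps the minimum count of each shared word,
    # so a repeated word is only credited as often as it appears in both texts.
    overlap = sum((ref_counts & gen_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(gen_counts.values())  # share of generated words found in the reference
    recall = overlap / sum(ref_counts.values())     # share of reference words found in the generated text
    return 2 * precision * recall / (precision + recall)


golden = "the cat sat on the mat"
print(rouge1_f1(golden, "the cat sat on the mat"))
print(rouge1_f1(golden, "the cat"))
```

A score like this rewards word overlap with one specific reference, which is why it fits prompts with a single right answer better than open-ended prompts such as summarization.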
Example
Practitioners use the term golden response when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- reference text
An expert's response to a prompt.
- automatic evaluation
Using software to judge the quality of a model's output.
- average precision at k
A metric for summarizing a model's performance on a single prompt that generates ranked results, such as a numbered list of book recommendations.
- bag of words
A representation of the words in a phrase or passage, irrespective of order.
- BERT
A model architecture for text representation.
- bigram
An N-gram in which N=2.
- black box model
A model whose "reasoning" is impossible or difficult for humans to understand.
- BLEU
A metric between 0.
- BLEURT
A metric for evaluating machine translations from one language to another, particularly to and from English.
- Chain-of-Thought Prompting
Asking an AI to show its reasoning step by step before giving a final answer, which often improves accuracy on complex tasks.