
What is Reinforcement Learning from Human Feedback?

Using feedback from human raters to improve the quality of a model's responses.

Reinforcement Learning from Human Feedback explained in plain English

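Reinforcement learning from human feedback (RLHF) uses feedback from human raters to improve the quality of a model's responses. For example, an RLHF mechanism can ask users to rate a model's response with a 👍 or 👎 emoji. The system then treats those ratings as a signal: responses like the ones rated 👍 become more likely in the future, and responses like the ones rated 👎 become less likely.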
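The thumbs-up / thumbs-down loop described above can be illustrated with a toy sketch. It assumes a "model" that only chooses between two fixed candidate responses, and it nudges a score for each candidate after every rating. The function names (`update_scores`, `feedback_to_reward`), the learning rate, and the feedback log are all hypothetical, made up for illustration. Production RLHF pipelines are considerably more involved and typically train a separate reward model on human ratings before fine-tuning the language model.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def feedback_to_reward(emoji):
    """Map a rating to a numeric reward (assumed mapping: +1 / -1)."""
    return 1.0 if emoji == "👍" else -1.0

def update_scores(scores, chosen, reward, lr=0.5):
    """Policy-gradient-style update for a softmax choice over candidates.

    The score of the chosen response moves up when the reward is
    positive and down when it is negative; the other scores move
    the opposite way.
    """
    probs = softmax(scores)
    new_scores = []
    for i, (s, p) in enumerate(zip(scores, probs)):
        grad = (1.0 if i == chosen else 0.0) - p
        new_scores.append(s + lr * reward * grad)
    return new_scores

# Two candidate responses, initially equally likely.
scores = [0.0, 0.0]

# Hypothetical feedback log: (index of the response shown, user rating).
feedback_log = [(0, "👍"), (1, "👎"), (0, "👍")]

for chosen, emoji in feedback_log:
    scores = update_scores(scores, chosen, feedback_to_reward(emoji))

probs = softmax(scores)
print(probs)  # response 0 is now more likely than response 1
```

After these three ratings, the response that received 👍 is chosen with higher probability than the one that received 👎, which is the "adjust future responses based on feedback" behavior in miniature.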

Example

Practitioners refer to reinforcement learning from human feedback when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions of AI capabilities and limitations.
