What is a trajectory?
In reinforcement learning, a trajectory is a sequence of tuples that records successive state transitions of the agent. Each tuple holds the state, action, reward, and next state for one transition.
Example
Practitioners refer to trajectories when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
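To make the definition concrete, here is a minimal sketch in Python of a trajectory stored as a list of (state, action, reward, next state) tuples. The corridor environment, the `Transition` type, and the `rollout` function are hypothetical names invented for illustration; they do not come from any particular library.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    """One state transition: the four fields named in the definition."""
    state: int
    action: str
    reward: float
    next_state: int


def rollout(start: int, actions: List[str], goal: int = 3) -> List[Transition]:
    """Walk a hypothetical 1-D corridor (cells 0..goal) and record each step.

    Assumption for this sketch: reaching the goal cell pays a reward of 1.0,
    every other step pays 0.0, and the walk stops at the goal.
    """
    trajectory: List[Transition] = []
    state = start
    for action in actions:
        step = 1 if action == "right" else -1
        next_state = max(0, min(goal, state + step))  # stay inside the corridor
        reward = 1.0 if next_state == goal else 0.0
        trajectory.append(Transition(state, action, reward, next_state))
        state = next_state
        if state == goal:
            break
    return trajectory


trajectory = rollout(0, ["right", "right", "left", "right", "right"])
for transition in trajectory:
    print(transition)
```

Each tuple's `next_state` equals the following tuple's `state`, which is what makes the list a single connected trajectory rather than an unordered collection of transitions.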
People also read
- action
In reinforcement learning, the mechanism by which the agent transitions between states of the environment.
- environment
In reinforcement learning, the world that contains the agent and allows the agent to observe that world's state.
- episode
In reinforcement learning, each of the repeated attempts by the agent to learn an environment.
- epsilon greedy policy
In reinforcement learning, a policy that either follows a random policy with epsilon probability or a greedy policy otherwise.
- experience replay
In reinforcement learning, a DQN technique used to reduce temporal correlations in training data.
- policy
In reinforcement learning, an agent's probabilistic mapping from states to actions.
- Q-learning
In reinforcement learning, an algorithm that allows an agent to learn the optimal Q-function of a Markov decision process by applying the Bellman equation.
- replay buffer
In DQN-like algorithms, the memory used by the agent to store state transitions for use in experience replay.
- return
In reinforcement learning, given a certain policy and a certain state, the return is the sum of all rewards that the agent expects to receive when following the policy from the state to the end of the episode.
- state
In reinforcement learning, the parameter values that describe the current configuration of the environment, which the agent uses to choose an action.
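
Several of the related terms above operate directly on the transitions that make up a trajectory. The sketch below shows one possible reading of three of them in Python. All names (`epsilon_greedy`, `observed_return`, `ReplayBuffer`) are hypothetical. Note that `observed_return` sums the rewards of one recorded trajectory without discounting, so it gives a single sample of the return rather than the expected value the definition describes.

```python
import random
from collections import deque


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)


def observed_return(trajectory):
    """Sum the rewards (third tuple field) collected along one trajectory."""
    return sum(reward for _, _, reward, _ in trajectory)


class ReplayBuffer:
    """Fixed-size memory of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def add(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size, rng):
        # Drawing at random breaks up the time ordering of stored transitions.
        return rng.sample(list(self.memory), batch_size)


rng = random.Random(0)
trajectory = [(0, "right", 0.0, 1), (1, "right", 0.0, 2), (2, "right", 1.0, 3)]

buffer = ReplayBuffer(capacity=2)
for transition in trajectory:
    buffer.add(transition)

greedy_action = epsilon_greedy({"left": 0.1, "right": 0.9}, epsilon=0.0, rng=rng)
batch = buffer.sample(2, rng)
total_reward = observed_return(trajectory)
```

With `epsilon=0.0` the policy is purely greedy. Raising epsilon toward 1.0 makes it increasingly random.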