MO-Mountaincar
Action Space | Discrete(3) |
Observation Shape | (2,) |
Observation High | [0.6 0.07] |
Observation Low | [-1.2 -0.07] |
Reward Shape | (3,) |
Reward High | [-1. 0. 0.] |
Reward Low | [-1. -1. -1.] |
Import | |
A multi-objective version of the MountainCar environment, where the goal is to reach the top of the mountain.
See Gymnasium’s MountainCar environment for more information.
Reward space:
By default, the reward is a 3-dimensional vector containing a time penalty, a penalty for reversing, and a penalty for going forward:
time penalty: -1.0 for each time step
reverse penalty: -1.0 for each time step the action is 0 (reverse)
forward penalty: -1.0 for each time step the action is 2 (forward)
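The three components above can be sketched as a small standalone function. This is only an illustration of the description, not the environment's actual code: the helper name `default_reward` and the action constants are hypothetical, and treating action 1 as "no push" is an assumption based on the Discrete(3) action space.

```python
from typing import List

# Hypothetical constants; actions 0 (reverse) and 2 (forward) come from the text,
# action 1 (no push) is assumed.
REVERSE, NEUTRAL, FORWARD = 0, 1, 2


def default_reward(action: int) -> List[float]:
    """Return [time penalty, reverse penalty, forward penalty] for one step."""
    if action not in (REVERSE, NEUTRAL, FORWARD):
        raise ValueError(f"invalid action: {action}")
    time_penalty = -1.0  # applied at every time step
    reverse_penalty = -1.0 if action == REVERSE else 0.0
    forward_penalty = -1.0 if action == FORWARD else 0.0
    return [time_penalty, reverse_penalty, forward_penalty]
```

Under this sketch, every component stays within the Reward Low and Reward High bounds listed in the table, and the time component is always -1.0.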
Alternatively, the reward can be changed with the following options:
add_speed_objective: Add an extra objective corresponding to the speed of the car.
remove_move_penalty: Remove the reverse and forward objectives.
merge_move_penalty: Merge reverse and forward penalties into a single penalty.
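One way to reason about these options is by the length of the resulting reward vector. The sketch below is a hypothetical helper (`reward_dim` is not part of the library) that counts objectives from the descriptions above. It assumes that `remove_move_penalty` takes precedence when both it and `merge_move_penalty` are set; the source does not state how the two interact.

```python
def reward_dim(
    add_speed_objective: bool = False,
    remove_move_penalty: bool = False,
    merge_move_penalty: bool = False,
) -> int:
    """Number of reward components under the given options (illustrative only)."""
    dim = 3  # default: time, reverse, forward
    if add_speed_objective:
        dim += 1  # one extra objective for the car's speed
    if remove_move_penalty:
        dim -= 2  # drop both the reverse and forward objectives
    elif merge_move_penalty:
        dim -= 1  # reverse and forward collapse into a single penalty
    return dim
```

For example, with only `merge_move_penalty` set this gives a 2-component reward, and with only `remove_move_penalty` set the reward reduces to the time penalty alone. In practice these options would presumably be passed as keyword arguments when the environment is constructed, though the exact call is not shown in this page.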