MO-Mountaincarcontinuous
Action Space | Box(-1.0, 1.0, (1,), float32)
Observation Shape | (2,)
Observation High | [0.6 0.07]
Observation Low | [-1.2 -0.07]
Reward Shape | (2,)
Reward High | [0. 0.]
Reward Low | [-1. -1.]
Import |
A continuous version of the MountainCar environment, where the goal is to reach the top of the mountain.
See source for more information.
Reward space:
The reward space is a 2D vector containing the time penalty and the fuel reward.
time penalty: -1.0 for each time step
fuel reward: -||action||^2, i.e. the negative of the squared norm of the action vector
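The two reward components described above can be sketched in a few lines of NumPy. This is an illustration only, not the environment's actual implementation: `reward_vector` is a hypothetical helper name, and it models a single ordinary time step exactly as described in this section (any special handling at the goal is not covered).

```python
import numpy as np


def reward_vector(action):
    """Hypothetical helper: 2D reward for one time step, per the description above."""
    action = np.asarray(action, dtype=np.float32)
    time_penalty = -1.0                        # -1.0 for each time step
    fuel_reward = -float(np.sum(action ** 2))  # -||action||^2
    return np.array([time_penalty, fuel_reward], dtype=np.float32)


# Example: half throttle costs a quarter unit of fuel reward.
r = reward_vector([0.5])
print("time penalty:", r[0], "fuel reward:", r[1])
```

Because the action lies in [-1.0, 1.0], the fuel component stays within [-1, 0], which agrees with the Reward Low and Reward High bounds listed in the table.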