MO-Supermario
Action Space | Discrete(256) |
Observation Shape | (240, 256, 3) |
Observation High | 255 |
Observation Low | 0 |
Reward Shape | (5,) |
Reward High | [ inf 0. 0. 100. inf] |
Reward Low | [-inf -inf -25. 0. 0.] |
Import | |
Description
Multi-objective version of the Super Mario Bros. environment.
See gym-super-mario-bros for more information.
Reward Space
The reward is a 5-dimensional vector:
0: Distance Mario moved along the x-axis
1: Time penalty for the in-game time that elapsed between two consecutive time steps
2: -25 if Mario died, 0 otherwise
3: +100 if Mario collected coins, 0 otherwise
4: Points awarded for killing an enemy
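Single-objective algorithms need a scalar reward, so the five components above are often combined with a weighted sum. The following is a minimal sketch of such a linear scalarization. The `scalarize` helper, the index names, and `EXAMPLE_WEIGHTS` are hypothetical and chosen only for illustration; they are not part of the environment.

```python
# Index names follow the reward description above; the names are illustrative.
X_POS, TIME, DEATH, COIN, ENEMY = range(5)

# Hypothetical weights, chosen only for illustration.
EXAMPLE_WEIGHTS = [1.0, 0.1, 1.0, 0.5, 0.5]


def scalarize(reward, weights=EXAMPLE_WEIGHTS):
    """Return the weighted sum of a 5-dimensional reward vector."""
    if len(reward) != 5 or len(weights) != 5:
        raise ValueError("expected 5 reward components and 5 weights")
    return sum(w * r for w, r in zip(weights, reward))


# Made-up step: Mario moved 3 units right, one time unit passed,
# and coins were collected.
step_reward = [3.0, -1.0, 0.0, 100.0, 0.0]
total = scalarize(step_reward)
```

Different weight vectors express different trade-offs, for example a larger `DEATH` weight for a more cautious agent.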
Episode Termination
The episode terminates when Mario dies or reaches the flag.
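A rollout loop that sums the reward vector per objective until the episode ends might look like the sketch below. It assumes the environment follows the Gymnasium convention in which `step` returns `(obs, reward, terminated, truncated, info)`. `StubEnv` is a hypothetical stand-in so the example runs without the emulator, and its rewards are made up.

```python
import random


class StubEnv:
    """Hypothetical stand-in mimicking a Gymnasium-style API with 5-dim rewards."""

    def __init__(self, episode_length=4):
        self.episode_length = episode_length
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return None, {}  # (observation, info); the observation is omitted here

    def step(self, action):
        self.t += 1
        # Stands in for "Mario died or reached the flag".
        terminated = self.t >= self.episode_length
        reward = [1.0, -1.0, -25.0 if terminated else 0.0, 0.0, 0.0]
        return None, reward, terminated, False, {}


def rollout(env, policy, seed=None):
    """Run one episode and return the per-objective sum of rewards."""
    env.reset(seed=seed)
    vector_return = [0.0] * 5
    done = False
    while not done:
        _, reward, terminated, truncated, _ = env.step(policy())
        vector_return = [g + r for g, r in zip(vector_return, reward)]
        done = terminated or truncated
    return vector_return


# Random policy over the Discrete(256) action space.
rng = random.Random(0)
episode_return = rollout(StubEnv(), policy=lambda: rng.randrange(256))
```

Keeping the return as a vector, rather than summing it into one number, preserves the per-objective information for later analysis.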