Deep-Sea-Treasure-Concave
Action Space | Discrete(4) |
Observation Shape | (2,) |
Observation High | [11 11] |
Observation Low | [0 0] |
Reward Shape | (2,) |
Reward High | [124. -1.] |
Reward Low | [ 0. -1.] |
Import | |
Description
The Deep Sea Treasure environment is a classic multi-objective reinforcement learning (MORL) problem in which the agent controls a submarine in a 2D grid world.
Observation Space
The observation space is a 2D discrete box with values in [0, 10] for the x and y coordinates of the submarine.
Action Space
The action space is discrete, with the following actions:
0: up
1: down
2: left
3: right
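The action indices above can be illustrated with a minimal sketch. It is not the library's implementation: the (row, col) coordinate order, the 11x11 grid bounds, and the rule that an out-of-bounds move leaves the submarine in place are assumptions made for illustration, and the names `ACTION_OFFSETS` and `apply_action` are hypothetical. Obstacles on the map are ignored here.

```python
# Hypothetical sketch: action indices follow the list above; the (row, col)
# offsets and the 11x11 bounds are assumptions, not the library's code.
GRID_SIZE = 11

ACTION_OFFSETS = {
    0: (-1, 0),  # up
    1: (1, 0),   # down
    2: (0, -1),  # left
    3: (0, 1),   # right
}


def apply_action(position, action):
    """Return the new (row, col); stay in place if the move would leave the grid."""
    row, col = position
    d_row, d_col = ACTION_OFFSETS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < GRID_SIZE and 0 <= new_col < GRID_SIZE:
        return (new_row, new_col)
    return position


print(apply_action((0, 0), 1))  # (1, 0): one cell down from the start
print(apply_action((0, 0), 0))  # (0, 0): moving up from the top row is blocked
```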
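Reward Space
The reward is 2-dimensional. Following the order of the Reward High and Reward Low entries above, the components are:
treasure value: the value of the treasure at the current position (0 if there is none)
time penalty: -1 at each time step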
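Because the reward is a vector, a common way to work with it is to sum it per component over an episode and, if a single number is needed, combine the components with a weight vector (linear scalarization). The sketch below assumes the component order [treasure value, time penalty] suggested by the Reward High and Reward Low entries; the per-step rewards and the weights are made-up values for illustration, and `scalarize` is a hypothetical helper, not part of any library.

```python
def scalarize(reward, weights):
    """Linear scalarization: weighted sum of the reward components."""
    return sum(w * r for w, r in zip(weights, reward))


# Illustrative episode (made-up values): two empty steps, then a treasure.
# Assumed component order: (treasure value, time penalty).
episode_rewards = [(0.0, -1.0), (0.0, -1.0), (8.0, -1.0)]

# Sum each component separately to get the vector return of the episode.
vector_return = tuple(sum(component) for component in zip(*episode_rewards))
print(vector_return)  # (8.0, -3.0)

# Equal weights on both objectives.
print(scalarize(vector_return, (0.5, 0.5)))  # 2.5
```

Different weight vectors express different trade-offs between collecting a larger treasure and finishing quickly.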
Starting State
The starting state is always the same: (0, 0)
Episode Termination
The episode terminates when the agent reaches a treasure.
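Arguments
dst_map: the map of the deep sea treasure. The default is the convex map from Yang et al. (2019). To change it, pass either CONCAVE_MAP or MIRRORED_MAP, for example
mo_gymnasium.make("DeepSeaTreasure-v0", dst_map=CONCAVE_MAP) or mo_gymnasium.make("DeepSeaTreasure-v0", dst_map=MIRRORED_MAP).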
float_state: if True, the state is a 2D continuous box with values in [0.0, 1.0] for the x and y coordinates of the submarine.
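To make the effect of float_state concrete, here is a rough sketch of how integer grid coordinates in [0, 10] could be mapped to the [0.0, 1.0] range. Dividing by the largest coordinate is an assumption chosen to match the ranges stated above, not a confirmed description of the library's code, and `to_float_state` is a hypothetical helper.

```python
def to_float_state(position, max_coord=10):
    """Scale integer grid coordinates to floats in [0.0, 1.0] (assumed scheme)."""
    return tuple(coord / max_coord for coord in position)


print(to_float_state((0, 0)))   # (0.0, 0.0)
print(to_float_state((10, 5)))  # (1.0, 0.5)
```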
Credits
The code was adapted from Yang’s source. The background art is from https://ansimuz.itch.io/underwater-fantasy-pixel-art-environment. The submarine art was created with the assistance of DALL·E 2.