Deep-Sea-Treasure-Concave#


Action Space	Discrete(4)
Observation Shape	(2,)
Observation High	[11 11]
Observation Low	[0 0]
Reward Shape	(2,)
Reward High	[124. -1.]
Reward Low	[ 0. -1.]
Import	`mo_gymnasium.make("deep-sea-treasure-concave-v0")`

Description#

The Deep Sea Treasure environment is a classic MORL problem in which the agent controls a submarine in a 2D grid world.

Observation Space#

The observation space is a 2D discrete box holding the x and y coordinates of the submarine, which take values in [0, 10]. The box itself is declared with bounds [0, 11], as listed in the table above.

Action Space#

The action space is discrete, where:

0: up
1: down
2: left
3: right
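
The mapping above can be sketched as a small transition function. This is an illustration only, not the environment's own code: it assumes the position is stored as (row, col) with row 0 at the surface, that a move leaving the 11x11 grid keeps the submarine in place, and it ignores any impassable cells the real map may contain. The names `MOVES` and `apply_action` are hypothetical.

```python
# Hypothetical sketch of how the four actions could move the submarine.
# Assumption: position is (row, col), row 0 is the surface, grid is 11x11.
MOVES = {
    0: (-1, 0),  # up
    1: (1, 0),   # down
    2: (0, -1),  # left
    3: (0, 1),   # right
}


def apply_action(pos, action, size=11):
    """Return the next position; moves that would leave the grid are ignored."""
    d_row, d_col = MOVES[action]
    row, col = pos[0] + d_row, pos[1] + d_col
    if 0 <= row < size and 0 <= col < size:
        return (row, col)
    return pos


print(apply_action((0, 0), 3))  # right from the start state
print(apply_action((0, 0), 0))  # up from the surface: position unchanged
```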

Reward Space#

The reward is 2-dimensional, with components in the order used by the reward vector:

treasure value: the value of the treasure at the current position
time penalty: -1 at each time step
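
A vector reward is usually summed component-wise over an episode and, if a single number is needed, combined with a weight vector. The sketch below assumes component 0 is the treasure value and component 1 the time penalty, matching the Reward High and Reward Low rows in the table above. The episode data is made up, and `vector_return` and `scalarize` are hypothetical helper names, not part of the library.

```python
# Sketch: accumulate 2-dimensional rewards and scalarize them.
# Assumption: component 0 is the treasure value and component 1 the time
# penalty, matching the Reward High / Reward Low rows of the table.

def vector_return(rewards):
    """Sum a list of (treasure, time) reward pairs component-wise."""
    treasure = sum(r[0] for r in rewards)
    time = sum(r[1] for r in rewards)
    return (treasure, time)


def scalarize(vec, weights):
    """Linear scalarization: weighted sum of the reward components."""
    return sum(w * v for w, v in zip(weights, vec))


# Made-up episode: three empty steps, then a treasure worth 5.0 on the fourth.
episode = [(0.0, -1.0), (0.0, -1.0), (0.0, -1.0), (5.0, -1.0)]
total = vector_return(episode)
print(total)                         # (5.0, -4.0)
print(scalarize(total, (0.5, 0.5)))  # 0.5
```

Linear weights are only one way to trade off the two objectives. If the Pareto front is concave, as this variant's name suggests, a weighted sum can only recover its extreme points, so other scalarizations may be worth considering.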

Starting State#

The starting state is always the same: (0, 0)

Episode Termination#

The episode terminates when the agent reaches a treasure.

Arguments#

dst_map: the map of the deep sea treasure. Default is the convex map from Yang et al. (2019). To change it, use mo_gymnasium.make("deep-sea-treasure-v0", dst_map=CONCAVE_MAP) or pass dst_map=MIRRORED_MAP instead.
float_state: if True, the state is a 2D continuous box with values in [0.0, 1.0] for the x and y coordinates of the submarine.
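
To make the effect of float_state concrete, the sketch below rescales integer grid coordinates in [0, 10] to floats in [0.0, 1.0]. It assumes plain division by the largest coordinate; the environment's own scaling may differ in detail. `MAX_COORD` and `to_float_state` are hypothetical names.

```python
# Hypothetical sketch of the float_state idea: rescale integer grid
# coordinates in [0, 10] to floats in [0.0, 1.0].
# Assumption: plain division by the largest coordinate; the environment's
# own scaling may differ.
MAX_COORD = 10


def to_float_state(pos, max_coord=MAX_COORD):
    """Map each integer coordinate to a float between 0.0 and 1.0."""
    return tuple(c / max_coord for c in pos)


print(to_float_state((0, 0)))   # (0.0, 0.0)
print(to_float_state((10, 5)))  # (1.0, 0.5)
```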

Credits#

The code was adapted from: Yang’s source. The background art is from https://ansimuz.itch.io/underwater-fantasy-pixel-art-environment. The submarine art was created with the assistance of DALL·E 2.