Deep-Sea-Treasure


Action Space: Discrete(4)
Observation Shape: (2,)
Observation High: [11 11]
Observation Low: [0 0]
Reward Shape: (2,)
Reward High: [23.7 -1. ]
Reward Low: [ 0. -1.]
Import: mo_gymnasium.make("deep-sea-treasure-v0")
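As a usage illustration, the sketch below rolls out one episode with random actions and sums the vector reward component by component. It assumes the standard Gymnasium `reset`/`step` signatures; `run_random_episode` and `main` are hypothetical helper names, not part of the library, and the import is kept inside `main` so the helper itself has no dependencies.

```python
def run_random_episode(env, max_steps=100, seed=0):
    """Roll out one episode with random actions and return the summed vector reward."""
    obs, info = env.reset(seed=seed)
    total = [0.0, 0.0]  # same component order as the environment's reward vector
    for _ in range(max_steps):
        action = env.action_space.sample()
        obs, vec_reward, terminated, truncated, info = env.step(action)
        # Accumulate each reward component separately instead of collapsing to a scalar.
        total = [t + float(r) for t, r in zip(total, vec_reward)]
        if terminated or truncated:
            break
    return total


def main():
    # Requires the mo-gymnasium package to be installed.
    import mo_gymnasium as mo_gym

    env = mo_gym.make("deep-sea-treasure-v0")
    print(run_random_episode(env))
    env.close()
```

Calling `main()` prints a two-element list. A random policy often fails to reach any treasure within the step limit, in which case the first component stays at 0.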

Description

The Deep Sea Treasure environment is a classic multi-objective reinforcement learning (MORL) problem in which the agent controls a submarine in a 2D grid world.

Observation Space

The observation space is a 2D discrete box with values in [0, 10] for the x and y coordinates of the submarine.

Action Space

The action space is discrete, where:

  • 0: up

  • 1: down

  • 2: left

  • 3: right

Reward Space

The reward is a 2-dimensional vector, with components in this order:

  • treasure value: the value of the treasure at the current position (0 if there is none)

  • time penalty: -1 at each time step

Starting State

The starting state is always the same: (0, 0)

Episode Termination

The episode terminates when the agent reaches a treasure.
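The rules described in the sections above (four actions, a two-component reward, a fixed start, termination on treasure) can be sketched as a toy transition function. This is not the library's implementation: `TOY_MAP`, `MOVES`, `START` and `step` are hypothetical names, the 3x3 map is invented for illustration, and both the (row, column) state layout and the "stay in place when leaving the grid" rule are assumptions.

```python
# Hypothetical 3x3 map (not the library's map): 0.0 = open water, > 0 = treasure value.
TOY_MAP = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
]

# Action index -> (row delta, column delta): 0 up, 1 down, 2 left, 3 right.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

START = (0, 0)  # the starting state is always the same


def step(state, action):
    """Apply one action and return (next_state, vector_reward, terminated)."""
    row, col = state
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    # Assumption: a move that would leave the grid keeps the submarine in place.
    if not (0 <= new_row < len(TOY_MAP) and 0 <= new_col < len(TOY_MAP[0])):
        new_row, new_col = row, col
    treasure = TOY_MAP[new_row][new_col]
    reward = [treasure, -1.0]  # [treasure value, time penalty]
    terminated = treasure > 0  # the episode ends once a treasure is reached
    return (new_row, new_col), reward, terminated
```

For example, `step(START, 1)` moves down onto the treasure worth 1.0 and ends the episode, while reaching the treasure worth 2.0 takes three steps and so accumulates a larger time penalty. That trade-off between treasure value and time is what makes the problem multi-objective.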

Arguments

  • dst_map: the map of the deep sea treasure. Default is the convex map from Yang et al. (2019). To change it, pass either CONCAVE_MAP or MIRRORED_MAP, e.g. mo_gymnasium.make("deep-sea-treasure-v0", dst_map=CONCAVE_MAP).

  • float_state: if True, the state is a 2D continuous box with values in [0.0, 1.0] for the x and y coordinates of the submarine.
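A minimal sketch of passing these arguments follows. `make_dst` and its `variant` parameter are hypothetical names introduced for illustration, and the module path used to import the map constants is an assumption that may differ between library versions.

```python
def make_dst(variant="convex", float_state=False):
    """Create Deep Sea Treasure with the default map or an alternative one."""
    if variant not in ("convex", "concave", "mirrored"):
        raise ValueError(f"unknown variant: {variant!r}")

    # Requires the mo-gymnasium package to be installed.
    import mo_gymnasium as mo_gym
    # Assumption: the map constants are defined in this module.
    from mo_gymnasium.envs.deep_sea_treasure import deep_sea_treasure as dst

    kwargs = {"float_state": float_state}
    if variant == "concave":
        kwargs["dst_map"] = dst.CONCAVE_MAP
    elif variant == "mirrored":
        kwargs["dst_map"] = dst.MIRRORED_MAP
    # "convex" passes no dst_map, so the environment uses its default map.
    return mo_gym.make("deep-sea-treasure-v0", **kwargs)
```

For instance, `make_dst("concave", float_state=True)` would request the concave map with observations in [0.0, 1.0].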

Credits

The code was adapted from: Yang’s source. The background art is from https://ansimuz.itch.io/underwater-fantasy-pixel-art-environment. The submarine art was created with the assistance of DALL·E 2.