Four-Room

../../_images/four-room.gif

Action Space: Discrete(4)
Observation Shape: (14,)
Observation High: [13 13 13 13 13 13 13 13 13 13 13 13 13 13]
Observation Low: [0 0 0 0 0 0 0 0 0 0 0 0 0 0]
Reward Shape: (3,)
Reward High: [1. 1. 1.]
Reward Low: [0. 0. 0.]
Import: mo_gymnasium.make("four-room-v0")

Description

A discretized version of the gridworld environment introduced in [1]. The agent learns to collect shapes with positive reward while avoiding those with negative reward, and then to travel to a fixed goal. The gridworld is split into four rooms separated by walls with passageways.

References

[1] Barreto, André, et al. “Successor Features for Transfer in Reinforcement Learning.” NIPS. 2017.

Observation Space

The observation contains the 2D position of the agent in the gridworld, plus a binary vector indicating which items were collected.
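As an illustrative sketch of this layout (the exact ordering of the 14 entries is an assumption), the observation can be viewed as a 2-entry grid position concatenated with a 12-entry binary collection vector:

```python
import numpy as np

# Hypothetical construction of a (14,)-dimensional observation:
# 2 position coordinates in [0, 13], then 12 binary "item collected" flags.
position = np.array([3, 7])           # (row, col) of the agent in the grid
collected = np.zeros(12, dtype=int)   # one flag per collectible item
collected[4] = 1                      # e.g. the fifth item has been picked up

obs = np.concatenate([position, collected])
print(obs.shape)  # (14,)
```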

Action Space

The action space is discrete with 4 actions: left, up, right, down.
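A sketch of how the four discrete actions might map to grid moves. The index-to-direction assignment follows the order listed above but is otherwise an assumption, and walls are ignored here; moves are simply clamped to the grid bounds:

```python
# Hypothetical action-to-displacement mapping: 0=left, 1=up, 2=right, 3=down.
MOVES = {0: (0, -1), 1: (-1, 0), 2: (0, 1), 3: (1, 0)}  # (d_row, d_col)

def step_position(row, col, action, size=14):
    """Apply a move, clamping to the grid (walls are not modeled in this sketch)."""
    d_row, d_col = MOVES[action]
    new_row = min(max(row + d_row, 0), size - 1)
    new_col = min(max(col + d_col, 0), size - 1)
    return new_row, new_col

print(step_position(0, 0, 0))  # moving left at the border stays put: (0, 0)
print(step_position(5, 5, 2))  # moving right: (5, 6)
```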

Reward Space

The reward is a 3-dimensional vector with one component per item type:

  • +1 if a blue square was collected, else 0

  • +1 if a green triangle was collected, else 0

  • +1 if a red circle was collected, else 0
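The components above amount to a one-hot encoding over the three item types. A sketch of that encoding (the function name and the use of string labels are assumptions; the component order follows the list above):

```python
import numpy as np

ITEM_TYPES = ["blue square", "green triangle", "red circle"]

def vector_reward(collected_item=None):
    """Return the (3,) reward vector: +1 in the component of the collected item type."""
    reward = np.zeros(3)
    if collected_item in ITEM_TYPES:
        reward[ITEM_TYPES.index(collected_item)] = 1.0
    return reward

print(vector_reward("green triangle"))  # [0. 1. 0.]
print(vector_reward())                  # [0. 0. 0.] when nothing is collected
```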

Starting State

The agent starts in the lower left of the map.

Episode Termination

The episode terminates when the agent reaches the goal state, G.

Arguments

  • maze: Array containing the gridworld map. See MAZE for an example.

Credits

Code adapted from Mike Gimelfarb’s source code.