Four-Room#
|                   |                                             |
|-------------------|---------------------------------------------|
| Action Space      | Discrete(4)                                 |
| Observation Shape | (14,)                                       |
| Observation High  | [13 13 13 13 13 13 13 13 13 13 13 13 13 13] |
| Observation Low   | [0 0 0 0 0 0 0 0 0 0 0 0 0 0]               |
| Reward Shape      | (3,)                                        |
| Reward High       | [1. 1. 1.]                                  |
| Reward Low        | [0. 0. 0.]                                  |
| Import            |                                             |
Description#
A discretized version of the gridworld environment introduced in [1]. An agent learns to collect shapes with positive reward while avoiding those with negative reward, and then to travel to a fixed goal. The gridworld is split into four rooms separated by walls with passage-ways.
References#
[1] Barreto, André, et al. “Successor Features for Transfer in Reinforcement Learning.” NIPS. 2017.
Observation Space#
The observation contains the 2D position of the agent in the gridworld, plus a binary vector indicating which items were collected.
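As a sketch of how such an observation can be assembled (the exact layout, position first followed by the collected-item flags, is an assumption, not taken from the environment source):

```python
import numpy as np

def encode_observation(position, collected):
    """Concatenate the agent's (row, col) grid position with the binary
    collected-item flags into one flat observation vector.

    Illustrative sketch only: the ordering of the components is assumed.
    """
    return np.concatenate([np.asarray(position), np.asarray(collected)]).astype(np.int64)

# 2 position entries + 12 item flags gives the documented (14,) shape.
obs = encode_observation((3, 0), [0] * 12)
assert obs.shape == (14,)
```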
Action Space#
The action space is discrete with 4 actions: left, up, right, down.
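A minimal sketch of the action semantics, following the left/up/right/down order stated above (the (row, col) displacement convention is an assumption):

```python
# Hypothetical mapping from discrete action index to a (row, col) grid
# displacement; the index order follows the documented left/up/right/down.
DIRECTIONS = {
    0: (0, -1),  # left
    1: (-1, 0),  # up
    2: (0, 1),   # right
    3: (1, 0),   # down
}

def step_position(position, action):
    """Apply one discrete action to a (row, col) position, ignoring walls."""
    dr, dc = DIRECTIONS[action]
    return (position[0] + dr, position[1] + dc)

assert step_position((5, 5), 2) == (5, 6)  # moving right increments the column
```

Wall and boundary collisions, handled by the real environment, are omitted here for brevity.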
Reward Space#
The reward is a 3-dimensional vector with the following components:
- +1 if a blue square was collected, else 0
- +1 if a green triangle was collected, else 0
- +1 if a red circle was collected, else 0
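The vector reward above can be sketched as follows (the component ordering matches the list; the function and type names are hypothetical, not from the environment source):

```python
import numpy as np

# Assumed ordering of the reward components, matching the list above.
ITEM_TYPES = ("blue_square", "green_triangle", "red_circle")

def item_reward(collected_type):
    """Return the 3-dimensional reward for one step.

    `collected_type` is the type of item picked up this step, or None if
    nothing was collected. Sketch only; names are illustrative.
    """
    reward = np.zeros(3, dtype=np.float32)
    if collected_type is not None:
        reward[ITEM_TYPES.index(collected_type)] = 1.0
    return reward

assert item_reward("green_triangle").tolist() == [0.0, 1.0, 0.0]
```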
Starting State#
The agent starts in the lower left of the map.
Episode Termination#
The episode terminates when the agent reaches the goal state, G.
Arguments#
maze: Array containing the gridworld map. See MAZE for an example.
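As a rough illustration of the kind of array a `maze` argument could take (this toy layout and its symbols are hypothetical; the real `MAZE` constant is larger and defines item locations as well):

```python
# Hypothetical miniature maze: '1' marks walls, ' ' free cells,
# 'G' the goal cell. Not the environment's actual MAZE constant.
MAZE = [
    "1111111",
    "1   1G1",
    "1     1",
    "1   1 1",
    "1111111",
]

# Wall cells can be recovered by scanning the array:
walls = {(r, c) for r, row in enumerate(MAZE) for c, ch in enumerate(row) if ch == "1"}
assert (0, 0) in walls and (2, 3) not in walls
```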
Credits#
Code adapted from: Mike Gimelfarb’s source.