Action Space: Discrete(6)

Observation Shape: (7,)

Observation High: [1. 1. 1. 1. 1. 1. 1.]

Observation Low: [-1. -1. -1. -1. -1. -1. -1.]

Reward Shape: (3,)

Reward High: [1.5 1.5 0. ]

Reward Low: [ 0. 0. -1.]




The agent must collect two types of ore while minimizing fuel consumption. From Abels et al. 2019.

Observation Space#

The observation is a 7-dimensional vector containing the following information:

  • 2D position of the cart

  • Speed of the cart

  • sin and cos of the cart’s orientation

  • Percentage of the cart’s capacity that is filled

If image_observation is True, the observation is instead an RGB image of the environment, stored as a 3D array.
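As an illustration of the vector layout, the sketch below unpacks a 7-element observation into named fields. It assumes the components appear in the order listed above. The listed items account for six entries, so this sketch treats everything after index 4 as fill information (possibly one value per ore type). The names `CartObservation` and `decode_observation` are hypothetical and not part of the environment.

```python
import math
from typing import NamedTuple, Sequence, Tuple


class CartObservation(NamedTuple):
    """Hypothetical named view of the 7-dimensional observation."""
    position: Tuple[float, float]  # 2D position of the cart
    speed: float                   # speed of the cart
    heading_rad: float             # orientation recovered from (sin, cos)
    fill: Tuple[float, ...]        # remaining entries: capacity fill information


def decode_observation(obs: Sequence[float]) -> CartObservation:
    """Split a flat observation into named parts (assumed ordering)."""
    if len(obs) != 7:
        raise ValueError(f"expected 7 values, got {len(obs)}")
    values = [float(v) for v in obs]
    x, y, speed, sin_a, cos_a = values[:5]
    # atan2 recovers the angle from its sine and cosine.
    heading = math.atan2(sin_a, cos_a)
    return CartObservation((x, y), speed, heading, tuple(values[5:]))


if __name__ == "__main__":
    print(decode_observation([0.1, 0.2, 0.5, 1.0, 0.0, 0.25, 0.5]))
```

Encoding the orientation as a sin/cos pair avoids the discontinuity of a raw angle at the wrap-around point, which is why the sketch converts back with `atan2` only for display.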

Action Space#

The action space is a discrete space with 6 actions:

  • 0: Mine

  • 1: Left

  • 2: Right

  • 3: Accelerate

  • 4: Brake

  • 5: None
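For readability in agent code, the six action indices above can be given names. The enum below is a small sketch. The class name `MinecartAction` is hypothetical, and the environment itself only expects the integer values.

```python
from enum import IntEnum


class MinecartAction(IntEnum):
    """Hypothetical named constants for the six discrete actions."""
    MINE = 0
    LEFT = 1
    RIGHT = 2
    ACCELERATE = 3
    BRAKE = 4
    NONE = 5  # no-op


if __name__ == "__main__":
    # IntEnum members behave like plain integers, so they can be passed
    # anywhere an action index is expected.
    for action in MinecartAction:
        print(int(action), action.name)
```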

Reward Space#

The reward is a 3-dimensional vector:

  • 0: Quantity of the first ore delivered to the base (sparse)

  • 1: Quantity of the second ore delivered to the base (sparse)

  • 2: Fuel consumed (dense)
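One common way to use a vector reward with a single-objective learner is linear scalarization, i.e. a weighted sum of the components. The sketch below shows that idea together with a bounds check against the reward limits listed at the top of this page. The weights and the helper names `scalarize` and `within_bounds` are illustrative assumptions, not something the environment prescribes.

```python
from typing import Sequence

# Reward bounds as listed on this page: [ore 1, ore 2, fuel].
REWARD_LOW = (0.0, 0.0, -1.0)
REWARD_HIGH = (1.5, 1.5, 0.0)


def within_bounds(reward: Sequence[float]) -> bool:
    """Return True if each component lies inside the documented bounds."""
    return len(reward) == 3 and all(
        lo <= r <= hi for r, lo, hi in zip(reward, REWARD_LOW, REWARD_HIGH)
    )


def scalarize(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the reward components (linear scalarization)."""
    if len(reward) != len(weights):
        raise ValueError("reward and weights must have the same length")
    return sum(w * r for w, r in zip(weights, reward))


if __name__ == "__main__":
    reward = [1.0, 0.5, -0.2]    # example values, not taken from a real run
    weights = [0.4, 0.4, 0.2]    # hypothetical preference over objectives
    print(within_bounds(reward), scalarize(reward, weights))
```

Because the fuel component is non-positive, a positive weight on it penalizes fuel use. Different weight vectors express different trade-offs between the two ores and fuel.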

Starting State#

The cart starts at the base, in the upper-left corner of the map.

Episode Termination#

The episode ends when the cart returns to the base.
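The termination rule can be seen in a plain episode loop. The sketch below assumes the environment follows the usual Gymnasium interface, where `reset()` returns `(observation, info)` and `step()` returns `(observation, reward, terminated, truncated, info)`. To stay self-contained it runs against `StubMinecart`, a hypothetical stand-in that "returns to the base" after a fixed number of steps. It does not simulate the real dynamics.

```python
from typing import Callable, List, Sequence, Tuple


class StubMinecart:
    """Hypothetical stand-in that terminates after a fixed number of steps."""

    def __init__(self, steps_until_base: int = 3) -> None:
        self.steps_until_base = steps_until_base
        self._t = 0

    def reset(self, seed=None):
        self._t = 0
        return [0.0] * 7, {}

    def step(self, action: int):
        self._t += 1
        terminated = self._t >= self.steps_until_base
        # Ore is credited only on the final step (sparse); fuel costs every step (dense).
        reward = [1.0 if terminated else 0.0, 0.5 if terminated else 0.0, -0.1]
        return [0.0] * 7, reward, terminated, False, {}


def rollout(env, policy: Callable[[Sequence[float]], int],
            max_steps: int = 1000) -> Tuple[List[float], int]:
    """Run one episode; return the per-objective return and the step count."""
    obs, _info = env.reset()
    totals = [0.0, 0.0, 0.0]
    steps = 0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        totals = [t + float(r) for t, r in zip(totals, reward)]
        steps += 1
        if terminated or truncated:
            break
    return totals, steps


if __name__ == "__main__":
    print(rollout(StubMinecart(steps_until_base=3), policy=lambda obs: 5))
```

The `max_steps` cap is a safety net added for this sketch, since a policy that never drives back to the base would otherwise loop indefinitely.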


Arguments#

  • render_mode: The render mode to use. Can be “rgb_array” or “human”.

  • image_observation: If True, the observation is an RGB image of the environment.

  • frame_skip: How many times each action is repeated. Default: 4

  • incremental_frame_skip: Whether actions are repeated incrementally. Default: True

  • config: Path to the .json configuration file. See the default configuration file for more information: https://github.com/Farama-Foundation/MO-Gymnasium/blob/main/mo_gymnasium/envs/minecart/mine_config.json
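A sketch of how these arguments might be passed when creating the environment. Two assumptions are made here that the text above does not state: that the environment is registered under the ID "minecart-v0", and that it is created through `mo_gymnasium.make`. The helpers `check_minecart_kwargs` and `make_minecart` are hypothetical. The package import is deferred so the validation part runs without it installed.

```python
# Argument names and render modes as listed above.
KNOWN_ARGS = {
    "render_mode",
    "image_observation",
    "frame_skip",
    "incremental_frame_skip",
    "config",
}
RENDER_MODES = {"rgb_array", "human"}


def check_minecart_kwargs(**kwargs) -> dict:
    """Reject unknown argument names and unsupported render modes."""
    unknown = set(kwargs) - KNOWN_ARGS
    if unknown:
        raise TypeError(f"unknown argument(s): {sorted(unknown)}")
    mode = kwargs.get("render_mode")
    # None (no rendering) is allowed here, following the Gymnasium convention.
    if mode is not None and mode not in RENDER_MODES:
        raise ValueError(f"unsupported render_mode: {mode!r}")
    return dict(kwargs)


def make_minecart(**kwargs):
    """Create the environment (assumes the ID 'minecart-v0')."""
    import mo_gymnasium as mo_gym  # imported lazily; requires mo-gymnasium

    return mo_gym.make("minecart-v0", **check_minecart_kwargs(**kwargs))


if __name__ == "__main__":
    print(check_minecart_kwargs(render_mode="rgb_array", frame_skip=4))
    # With mo-gymnasium installed, something like this should work:
    # env = make_minecart(render_mode="rgb_array", frame_skip=4)
```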


The code was refactored from Axel Abels’ source.