Available environments

MO-Gymnasium includes environments taken from the MORL literature, as well as multi-objective versions of classical environments, such as MuJoCo.

| Env | Obs/Action spaces | Objectives | Description |
| --- | --- | --- | --- |
| `deep-sea-treasure-v0` | Discrete / Discrete | `[treasure, time_penalty]` | Agent is a submarine that must collect a treasure while taking into account a time penalty. Treasure values taken from Yang et al. 2019. |
| `deep-sea-treasure-concave-v0` | Discrete / Discrete | `[treasure, time_penalty]` | Agent is a submarine that must collect a treasure while taking into account a time penalty. Treasure values taken from Vamplew et al. 2010. |
| `resource-gathering-v0` | Discrete / Discrete | `[enemy, gold, gem]` | Agent must collect gold or gems. Enemies have a 10% chance of killing the agent. From Barrett & Narayanan 2008. |
| `fishwood-v0` | Discrete / Discrete | `[fish_amount, wood_amount]` | ESR (expected scalarised returns) environment: the agent must collect fish and wood to light a fire and eat. From Roijers et al. 2018. |
| `breakable-bottles-v0` | Discrete (Dictionary) / Discrete | `[time_penalty, bottles_delivered, potential]` | Gridworld with 5 cells. The agent must collect bottles from the source location and deliver them to the destination. From Vamplew et al. 2021. |
| `fruit-tree-v0` | Discrete / Discrete | `[nutri1, ..., nutri6]` | Full binary tree of depth d = 5, 6, or 7. Every leaf contains a fruit with a value for each of the nutrients Protein, Carbs, Fats, Vitamins, Minerals, and Water. From Yang et al. 2019. |
| `water-reservoir-v0` | Continuous / Continuous | `[cost_flooding, deficit_water]` | A water reservoir environment. The agent executes a continuous action corresponding to the amount of water released by the dam. From Pianosi et al. 2013. |
| `four-room-v0` | Discrete / Discrete | `[item1, item2, item3]` | Agent must collect three different types of items in the map and reach the goal. From Alegre et al. 2022. |
| `mo-mountaincar-v0` | Continuous / Discrete | `[time_penalty, reverse_penalty, forward_penalty]` | Classic Mountain Car env, but with extra penalties for the forward and reverse actions. From Vamplew et al. 2011. |
| `mo-mountaincarcontinuous-v0` | Continuous / Continuous | `[time_penalty, fuel_consumption_penalty]` | Continuous Mountain Car env, but with penalties for fuel consumption. |
| `mo-lunar-lander-v2` | Continuous / Discrete or Continuous | `[landed, shaped_reward, main_engine_fuel, side_engine_fuel]` | MO version of the LunarLander-v2 environment. Objectives are defined similarly to Hung et al. 2022. |
| `minecart-v0` | Continuous or Image / Discrete | `[ore1, ore2, fuel]` | Agent must collect two types of ores and minimize fuel consumption. From Abels et al. 2019. |
| `mo-highway-v0` and `mo-highway-fast-v0` | Continuous / Discrete | `[speed, right_lane, collision]` | The agent’s objective is to reach a high speed while avoiding collisions with neighbouring vehicles and staying in the rightmost lane. From highway-env. |
| `mo-supermario-v0` | Image / Discrete | `[x_pos, time, death, coin, enemy]` | [:warning: SuperMarioBrosEnv support is limited.] Multi-objective version of SuperMarioBrosEnv. Objectives are defined similarly to Yang et al. 2019. |
| `mo-reacher-v4` | Continuous / Discrete | `[target_1, target_2, target_3, target_4]` | MuJoCo version of mo-reacher-v0, based on the Reacher-v4 environment. |
| `mo-hopper-v4` | Continuous / Continuous | `[velocity, height, energy]` | Multi-objective version of the Hopper-v4 env. |
| `mo-halfcheetah-v4` | Continuous / Continuous | `[velocity, energy]` | Multi-objective version of the HalfCheetah-v4 env. Similar to Xu et al. 2020. |
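The Objectives entries above give the order of the components in each environment's vector reward. As a rough illustration, the sketch below rolls out a random policy and collapses the vector reward into a scalar with a weighted sum. It assumes the `mo_gymnasium` package is installed and that environments are created with `mo_gymnasium.make`. The helper names `scalarize` and `run_random_episode`, the weights, and the step limit are hypothetical choices made for this example and are not part of the library.

```python
from typing import Sequence


def scalarize(vector_reward: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarization: weighted sum of the objective values.

    Hypothetical helper for illustration; it is not provided by MO-Gymnasium.
    """
    if len(vector_reward) != len(weights):
        raise ValueError("expected exactly one weight per objective")
    return sum(float(r) * float(w) for r, w in zip(vector_reward, weights))


def run_random_episode(env_id: str, weights: Sequence[float], max_steps: int = 200) -> float:
    """Roll out a random policy and return the scalarized episode return.

    Hypothetical helper; max_steps is an arbitrary safety limit.
    """
    # Imported lazily so that scalarize() stays usable without the package.
    import mo_gymnasium as mo_gym

    env = mo_gym.make(env_id)
    obs, info = env.reset(seed=0)
    total = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        # Same signature as a Gymnasium step, except that the reward is a
        # vector with one entry per objective instead of a single float.
        obs, vector_reward, terminated, truncated, info = env.step(action)
        total += scalarize(vector_reward, weights)
        if terminated or truncated:
            break
    env.close()
    return total
```

For instance, `run_random_episode("deep-sea-treasure-v0", weights=(1.0, 0.5))` would weight the two listed objectives, `treasure` and `time_penalty`, in that order. A weighted sum is only one way to trade objectives off against each other; the weights here are arbitrary.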