API

As for Gymnasium, the MO-Gymnasium API models environments as simple Python env classes. Creating environment instances and interacting with them is very simple - here’s an example using the “minecart-v0” environment:

import gymnasium as gym
import mo_gymnasium as mo_gym
import numpy as np

# It follows the original Gymnasium API ...
env = mo_gym.make('minecart-v0')

obs, info = env.reset()
# but vector_reward is a numpy array!
next_obs, vector_reward, terminated, truncated, info = env.step(your_agent.act(obs))

# Optionally, you can scalarize the reward function with the LinearReward wrapper
env = mo_gym.wrappers.LinearReward(env, weight=np.array([0.8, 0.2, 0.2]))

For details on multi-objective MDP’s (MOMDP’s) and other MORL definitions, see A practical guide to multi-objective reinforcement learning and planning.

You can also check more examples in this colab notebook! MO-Gym Demo in Colab