BreakableBottles#
Action Space 
Discrete(3) 
Observation Space 
Dict(‘bottles_carrying’: Discrete(3), ‘bottles_delivered’: Discrete(2), ‘bottles_dropped’: MultiBinary(3), ‘location’: Discrete(5)) 
Reward Shape 
(3,) 
Reward High 
[ 0. 50. 0.] 
Reward Low 
[inf 0. 1.] 
Import 

Description#
This environment implements the problems UnbreakableBottles and BreakableBottles defined in Section 4.1.2 of the paper Potentialbased multiobjective reinforcement learning approaches to lowimpact agents for AI safety.
Action Space#
The action space is a discrete space with 3 actions:
0: move left
1: move right
2: pick up a bottle
Observation Space#
The observation space is a dictionary with 4 keys:
location: the current location of the agent
bottles_carrying: the number of bottles the agent is currently carrying (0, 1 or 2)
bottles_delivered: the number of bottles the agent has delivered (0 or 1)
bottles_dropped: for each location, a boolean flag indicating if that location currently contains a bottle
Reward Space#
The reward space has 3 dimensions:
time penalty: 1 for each time step
bottle reward: bottle_reward for each bottle delivered
potential: While carrying multiple bottles there is a small probability of dropping them. A potentialbased penalty is applied for bottles left on the ground.
Starting State#
The agent starts at location 0, carrying no bottles, having delivered no bottles and having dropped no bottles.
Episode Termination#
The episode terminates when the agent has delivered 2 bottles.
Arguments#
size: the number of locations in the environment
prob_drop: the probability of dropping a bottle while carrying 2 bottles
time_penalty: the time penalty for each time step
bottle_reward: the reward for delivering a bottle
unbreakable_bottles: if True, a bottle which is dropped in a location can be picked up again (so the outcome of dropping a bottle is reversible), otherwise a dropped bottle cannot be picked up.
Credits#
This environment was originally a contribution of Robert Klassert The home asset is from https://limezu.itch.io/serenevillagerevamped The gold, enemy and gem assets are from https://ninjikin.itch.io/treasure The bottles pixel art was created with the assistance of DALL·E 2.