Water-Reservoir
| Action Space | Box(0.0, inf, (1,), float32) |
| --- | --- |
| Observation Shape | (1,) |
| Observation High | [inf] |
| Observation Low | [0.] |
| Reward Shape | (2,) |
| Reward High | [0. 0.] |
| Reward Low | [-inf -inf] |
| Import | `mo_gymnasium.make("water-reservoir-v0")` |
Description
A water reservoir environment. The agent executes a continuous action corresponding to the amount of water released by the dam.
A. Castelletti, F. Pianosi and M. Restelli, “Tree-based Fitted Q-iteration for Multi-Objective Markov Decision problems,” The 2012 International Joint Conference on Neural Networks (IJCNN), Brisbane, QLD, Australia, 2012, pp. 1-8, doi: 10.1109/IJCNN.2012.6252759.
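Below is a minimal interaction sketch. It assumes MO-Gymnasium is installed and that the environment is registered under the id shown in the Import row; the release value of 30.0 is purely illustrative.

```python
import numpy as np
import mo_gymnasium as mo_gym

env = mo_gym.make("water-reservoir-v0")
obs, info = env.reset(seed=42)  # obs is the current reservoir level, shape (1,)

action = np.array([30.0], dtype=np.float32)  # amount of water to release (illustrative)
obs, vector_reward, terminated, truncated, info = env.step(action)
print(obs, vector_reward)  # vector_reward holds one value per objective
env.close()
```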
Observation Space
The observation is a float corresponding to the current level of the reservoir.
Action Space
The action is a float corresponding to the amount of water released by the dam. If `normalized_action` is True, the action is a float between 0 and 1 corresponding to the fraction of water released by the dam.
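As a sketch, the two variants can be compared by inspecting the action space after construction; the exact bounds in the comments are assumptions based on the table and description on this page.

```python
import mo_gymnasium as mo_gym

raw = mo_gym.make("water-reservoir-v0", normalized_action=False)
norm = mo_gym.make("water-reservoir-v0", normalized_action=True)

print(raw.action_space)   # expected: Box(0.0, inf, (1,), float32), absolute release
print(norm.action_space)  # expected: Box(0.0, 1.0, (1,), float32), fraction of the release
```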
Reward Space
There are up to 4 rewards:
- cost due to the excess level with respect to a flooding threshold (upstream)
- deficit in the water supply with respect to the water demand
- deficit in the hydroelectric supply with respect to the hydroelectric demand
- cost due to the excess level with respect to a flooding threshold (downstream)

By default, only the first two are used.
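A sketch of how the number of objectives shows up in the reward returned by step(); passing nO this way is an assumption based on the Arguments section below.

```python
import numpy as np
import mo_gymnasium as mo_gym

env = mo_gym.make("water-reservoir-v0", nO=4)
env.reset(seed=0)
_, vector_reward, _, _, _ = env.step(np.array([10.0], dtype=np.float32))
print(vector_reward.shape)  # expected: (4,), one cost per objective; (2,) with the default nO=2
env.close()
```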
Starting State
The reservoir is initialized with a random level between 0 and 160.
Arguments
- `render_mode`: The render mode to use. Can be 'human', 'rgb_array' or 'ansi'.
- `time_limit`: The maximum number of steps until the episode is truncated.
- `nO`: The number of objectives to use. Can be 2, 3 or 4.
- `penalize`: Whether to penalize the agent for selecting an action out of bounds.
- `normalized_action`: Whether to normalize the action space to a [0, 1] fraction.
- `initial_state`: The initial state of the reservoir. If None, a random state is used.
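A construction sketch passing these arguments through make(); the values below are illustrative assumptions, not recommended settings.

```python
import numpy as np
import mo_gymnasium as mo_gym

env = mo_gym.make(
    "water-reservoir-v0",
    render_mode="ansi",
    time_limit=200,
    nO=2,
    penalize=False,
    normalized_action=True,
    initial_state=np.array([50.0], dtype=np.float32),
)
obs, info = env.reset(seed=1)
print(obs)  # with a fixed initial_state, the episode should start from that level
env.close()
```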
Credits
Code from: Mathieu Reymond. Ported from: Simone Parisi.
Sky background image from: Paulina Riva (https://opengameart.org/content/sky-background)