CleanMARL (Clean Implementation of MARL Algorithms)

Based on the philosophy of the CleanRL project:

CleanMARL is a Deep Multi-Agent Reinforcement Learning library that provides high-quality single-file implementations with research-friendly features.

Algorithms Implemented

Algorithm | Variants Implemented
--- | ---
Multi-Agent Proximal Policy Optimization (MAPPO) | mappo_mpe.py

Reward on the MPE simple_spread_v3 environment:

Implementation features
  1. Environment state built by concatenating the agents' local observations and used as the critic input (see the sketch after this list)
  2. Huber loss for the critic (value) network
  3. Value normalization
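
A minimal sketch of the first two features, assuming local observations arrive batched as a (batch, n_agents, obs_dim) tensor; the function names and the Huber delta are illustrative and may differ from what mappo_mpe.py actually uses.

```python
import torch
import torch.nn.functional as F

# The "environment state" fed to the centralized critic is simply the
# concatenation of all agents' local observations (illustrative helper).
def build_critic_input(obs: torch.Tensor) -> torch.Tensor:
    batch, n_agents, obs_dim = obs.shape
    return obs.reshape(batch, n_agents * obs_dim)

# Huber (smooth L1) loss for the value network instead of plain MSE,
# making the critic update less sensitive to outlier return targets.
# The delta value here is an assumption, not the repository's setting.
def value_loss(values: torch.Tensor, returns: torch.Tensor, delta: float = 10.0) -> torch.Tensor:
    return F.huber_loss(values, returns, delta=delta)
```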

The MAPPO authors don't elaborate on the math behind value normalization, but it is actually done as follows (clipping at a minimum value to avoid division by zero is omitted). The tracked statistics are

$$\text{mean} = \mathbb{E}[R], \qquad \text{meansq} = \mathbb{E}[R^2],$$

with $\beta$ a debiasing term and $w$ a decay weight. After each minibatch:

$$\text{mean}_t = w\,\text{mean}_{t-1} + (1-w)\,\overline{R}$$

$$\text{meansq}_t = w\,\text{meansq}_{t-1} + (1-w)\,\overline{R^2}$$

$$\beta_t = w\,\beta_{t-1} + (1-w)$$

where $\overline{R}$ and $\overline{R^2}$ are the minibatch means of the returns and of their squares. Value targets are then normalized with the debiased statistics:

$$v_\text{normalized} = \frac{v - \text{mean}_t/\beta_t}{\sqrt{\text{meansq}_t/\beta_t - (\text{mean}_t/\beta_t)^2}}$$
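
A minimal PyTorch sketch of such a running value normalizer under the formulas above; the class and method names and the default decay weight are assumptions for illustration, not the exact ones in mappo_mpe.py.

```python
import torch


class ValueNormalizer:
    """Running, debiased mean/std normalizer for value targets (sketch)."""

    def __init__(self, weight: float = 0.999, eps: float = 1e-5):
        self.w = weight                  # decay weight w for the running statistics
        self.eps = eps                   # minimum clip value to avoid division by zero
        self.mean = torch.zeros(1)       # running estimate of E[R]
        self.meansq = torch.zeros(1)     # running estimate of E[R^2]
        self.beta = torch.zeros(1)       # debiasing term

    @torch.no_grad()
    def update(self, returns: torch.Tensor) -> None:
        # Exponential moving averages of E[R], E[R^2], and the debiasing term.
        self.mean = self.w * self.mean + (1 - self.w) * returns.mean()
        self.meansq = self.w * self.meansq + (1 - self.w) * returns.pow(2).mean()
        self.beta = self.w * self.beta + (1 - self.w)

    def _debiased(self):
        mean = self.mean / self.beta.clamp(min=self.eps)
        var = self.meansq / self.beta.clamp(min=self.eps) - mean.pow(2)
        return mean, var.clamp(min=self.eps)

    def normalize(self, v: torch.Tensor) -> torch.Tensor:
        mean, var = self._debiased()
        return (v - mean) / var.sqrt()

    def denormalize(self, v: torch.Tensor) -> torch.Tensor:
        mean, var = self._debiased()
        return v * var.sqrt() + mean
```

In training, the normalizer would be updated on the return targets of each minibatch, the critic would be regressed toward the normalized targets, and predictions would be denormalized before computing advantages.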
