Based on the philosophy of this project:
CleanMARL is a deep multi-agent reinforcement learning library that provides high-quality single-file implementations with research-friendly features.
Algorithm | Variants Implemented |
---|---|
✅ Multi-Agent Proximal Policy Optimization (MAPPO) | mappo_mpe.py |
- Environment state built by concatenating local observations, used as input to the critic
- Huber loss for critic (value) network
- Value normalization
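
The first two items above can be illustrated with a minimal NumPy sketch. It is not taken from `mappo_mpe.py`: the function names `build_critic_state` and `huber_loss`, the `delta` default, and the toy observation shapes are assumptions made for illustration only.

```python
import numpy as np


def build_critic_state(local_obs):
    """Concatenate per-agent observations into one flat critic input.

    `local_obs` is a sequence with one observation array per agent
    (hypothetical layout, chosen for this sketch).
    """
    return np.concatenate(
        [np.asarray(o, dtype=np.float32).ravel() for o in local_obs]
    )


def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it.

    `delta` is an assumed default, not a value confirmed by the source.
    """
    error = np.asarray(target, dtype=np.float32) - np.asarray(pred, dtype=np.float32)
    abs_error = np.abs(error)
    quadratic = 0.5 * error ** 2
    linear = delta * (abs_error - 0.5 * delta)
    return float(np.mean(np.where(abs_error <= delta, quadratic, linear)))


# Three agents, each with a 2-dimensional local observation.
obs = [np.array([0.1, 0.2]), np.array([0.3, 0.4]), np.array([0.5, 0.6])]
state = build_critic_state(obs)  # flat vector of length 6

# One small error (quadratic branch) and one large error (linear branch).
loss = huber_loss(pred=[0.0, 0.0], target=[0.5, 3.0], delta=1.0)
```

Compared with a plain squared error, the linear branch limits the influence of large value-prediction errors on the critic update, which is the usual motivation for choosing a Huber loss here.
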
The authors don't elaborate on the math behind value normalization, but in practice it was done as follows (the clip to a minimum value, used to exclude zeros, is omitted):