Umberto, Mert, Antti K., Fabio, Lukas, Sami
- longer update from Antti K.
Antti K.: currently working on 3 projects:
- assisting agents by observing only
- improving ad-hoc teammate performance by modeling other agents
- improving multi-agent cooperation by optimizing beliefs over memory
The longer update focuses on (2), the ad-hoc teammate project. There are 2 agents and a set of items (several of medium relevance, 1 of very high relevance) to be collected from a gridworld. The 2 agents are fully cooperative. The current idea is to employ meta-learning and CTDE (Centralized Training, Decentralized Execution). So far it looks like ad-hoc cooperation gives better results than the standard approach.
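For reference, a minimal sketch of what such a setup could look like: two agents, a shared team reward, several medium-value items and one high-value item. This is not Antti's code. The class name `CoopGridworld`, the grid size, the number of items and the reward values are all assumptions made up for illustration, and no learning (meta-learning or CTDE) is included.

```python
import random
from dataclasses import dataclass, field

# Hypothetical action set; the notes do not specify the agents' actions.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0), "stay": (0, 0)}


@dataclass
class CoopGridworld:
    # All defaults below are illustrative assumptions, not values from the meeting.
    size: int = 5
    n_medium: int = 3
    medium_reward: float = 1.0
    high_reward: float = 10.0
    seed: int = 0
    agents: list = field(default_factory=list)   # [(x, y), (x, y)]
    items: dict = field(default_factory=dict)    # {(x, y): reward}

    def reset(self):
        """Place 2 agents and the items on distinct random cells."""
        rng = random.Random(self.seed)
        cells = [(x, y) for x in range(self.size) for y in range(self.size)]
        picked = rng.sample(cells, 2 + self.n_medium + 1)
        self.agents = picked[:2]
        rewards = [self.medium_reward] * self.n_medium + [self.high_reward]
        self.items = dict(zip(picked[2:], rewards))
        return list(self.agents), dict(self.items)

    def step(self, actions):
        """Apply one action per agent; return positions, shared reward, done."""
        team_reward = 0.0
        new_positions = []
        for (x, y), action in zip(self.agents, actions):
            dx, dy = MOVES[action]
            # Clamp moves to the grid so agents cannot leave it.
            nx = min(max(x + dx, 0), self.size - 1)
            ny = min(max(y + dy, 0), self.size - 1)
            new_positions.append((nx, ny))
            # Fully cooperative: every collected item adds to one team reward.
            team_reward += self.items.pop((nx, ny), 0.0)
        self.agents = new_positions
        done = not self.items  # episode ends when all items are collected
        return list(self.agents), team_reward, done


if __name__ == "__main__":
    env = CoopGridworld()
    env.reset()
    positions, reward, done = env.step(["right", "down"])
    print(positions, reward, done)
```

Because the reward is shared, neither agent gains by competing for the high-value item; the open question is only how the two coordinate who goes where.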
- Mert to provide a longer update next week.