Umberto, Mert, Antti K., Fabio, Lukas, Sami
|5min||Agenda item||Antti K.|
Antti K.: currently working on 3 projects:
The longer update focuses on (2). There are 2 agents and a set of items (several of medium relevance, 1 of huge relevance) to be collected from a gridworld. The 2 agents are fully cooperative. The current idea is to employ Meta Learning and CTDE (Centralized Training, Decentralized Execution). It looks like ad-hoc cooperation yields better results than the standard approach.
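
For readers who missed the meeting, the setup in (2) can be pictured with a minimal sketch like the one below. It only illustrates the environment, not the Meta Learning or CTDE training itself. Everything specific here is an assumption made for illustration and not taken from Antti's project: the class name `CoopGridworld`, the 5x5 grid, 4 medium items, the reward values 1.0 and 10.0, the move set, and the choice to model "fully cooperative" as a single shared team reward.

```python
import random
from dataclasses import dataclass, field

# Hypothetical reward values: the notes only say "medium" and "huge" relevance.
MEDIUM_REWARD = 1.0
HUGE_REWARD = 10.0

# Hypothetical action set: four moves plus staying in place.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0), "stay": (0, 0)}


@dataclass
class CoopGridworld:
    """Two agents collect items on a square grid and share one team reward."""
    size: int = 5        # assumed grid side length
    n_medium: int = 4    # assumed number of medium-relevance items
    seed: int = 0
    agents: list = field(default_factory=list)
    items: dict = field(default_factory=dict)

    def reset(self):
        rng = random.Random(self.seed)
        cells = [(x, y) for x in range(self.size) for y in range(self.size)]
        rng.shuffle(cells)
        # Agents and items are placed on distinct cells.
        self.agents = [cells[0], cells[1]]
        self.items = {c: MEDIUM_REWARD for c in cells[2:2 + self.n_medium]}
        self.items[cells[2 + self.n_medium]] = HUGE_REWARD
        return self.observe()

    def observe(self):
        # One observation per agent: its own position plus the remaining items.
        return [(pos, dict(self.items)) for pos in self.agents]

    def step(self, actions):
        """Apply one action per agent; return (observations, team_reward, done)."""
        team_reward = 0.0
        for i, action in enumerate(actions):
            dx, dy = MOVES[action]
            x, y = self.agents[i]
            # Clamp to the grid so agents cannot leave it.
            nx = min(max(x + dx, 0), self.size - 1)
            ny = min(max(y + dy, 0), self.size - 1)
            self.agents[i] = (nx, ny)
            # Collecting an item removes it; both agents feed the same reward.
            team_reward += self.items.pop((nx, ny), 0.0)
        done = not self.items
        return self.observe(), team_reward, done
```

With a shared reward like this, an agent gains nothing by racing its teammate to the huge item, which is the property that makes the task fully cooperative in this sketch.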