Karch, Tristan and Colas, Cédric and Teodorescu, Laetitia and Moulin-Frier, Clément and Oudeyer, Pierre-Yves
This paper investigates the idea of encoding object-centered representations in the design of the reward function and policy architectures of a language-guided reinforcement learning agent. This is done using a combination of object-wise permutation-invariant networks inspired by Deep Sets and gated-attention mechanisms. In a 2D procedurally-generated world where agents pursuing goals expressed in natural language navigate and interact with objects, we show that these architectures demonstrate strong generalization capabilities to out-of-distribution goals. We study generalization to varying numbers of objects at test time and further extend the object-centered architectures to goals involving relational reasoning.
@inproceedings{karch2020DeepSetsGeneralization,
title = {Deep {{Sets}} for {{Generalization}} in {{RL}}},
booktitle = {Beyond ``{{Tabula Rasa}}'' in {{Reinforcement Learning}} ({{BeTR}}-{{RL}}) {{Workshop}}},
author = {Karch, Tristan and Colas, C{\'e}dric and Teodorescu, Laetitia and {Moulin-Frier}, Cl{\'e}ment and Oudeyer, Pierre-Yves},
year = {2020}
}
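Below is a minimal PyTorch sketch of the kind of architecture the abstract describes: object features are modulated by a language-goal embedding through a gated-attention mechanism, encoded object-wise by a shared network, and aggregated with a permutation-invariant sum as in Deep Sets. The class, dimensions, and layer choices are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class GatedDeepSetPolicy(nn.Module):
    """Illustrative sketch: gated attention over per-object features + Deep Sets pooling."""

    def __init__(self, obj_dim=8, goal_dim=16, hidden=64, n_actions=5):
        super().__init__()
        # Gated attention: the goal embedding produces a sigmoid gate that
        # modulates each object's features element-wise.
        self.gate = nn.Sequential(nn.Linear(goal_dim, obj_dim), nn.Sigmoid())
        # phi: shared encoder applied identically to every object.
        self.phi = nn.Sequential(nn.Linear(obj_dim, hidden), nn.ReLU())
        # rho: maps the aggregated, permutation-invariant code to action logits.
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_actions))

    def forward(self, objects, goal):
        # objects: (batch, n_objects, obj_dim); goal: (batch, goal_dim)
        gate = self.gate(goal).unsqueeze(1)   # (batch, 1, obj_dim)
        gated = objects * gate                # broadcast gate over objects
        codes = self.phi(gated)               # (batch, n_objects, hidden)
        pooled = codes.sum(dim=1)             # invariant to object order and count
        return self.rho(pooled)               # action logits


# Usage: the same policy handles any number of objects at test time.
policy = GatedDeepSetPolicy()
logits = policy(torch.randn(2, 3, 8), torch.randn(2, 16))
```

Because the per-object encoder is shared and the aggregation is a sum, the module is invariant to object permutations and can be applied to scenes with more objects than were seen during training, which is the generalization setting studied in the paper.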