Agent Spaces

Agent Spaces describes a novel topology on the set of agents in a decision process. This space is called the agent space of the process, for example: "the agent space of Tetris". Provided that agents select their actions stochastically, and under mild assumptions about the process itself, both the distribution of paths (i.e., trajectories) and the expected reward are continuous on the agent space. In future work, we use this space as the basis for a modification of the Novelty Search method pioneered by Kenneth O. Stanley and Joel Lehman. Because of the generality of the agent space, our version of Novelty Search is applicable in any decision process that satisfies the assumptions. We describe it as an exploratory method and employ it alongside an exploitative algorithm to great effect in the upcoming paper, which we hope to release by the end of 2022.
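
For intuition, here is a minimal sketch of the standard Novelty Search score: the mean distance from a candidate agent to its k nearest neighbors in an archive of previously encountered agents. This is the generic Novelty Search recipe, not the authors' implementation; the `distance` argument stands in for whatever agent-space distance is used, and the default `k=15` is an illustrative choice.

```python
import heapq

def novelty(candidate, archive, distance, k=15):
    """Standard Novelty Search score: mean distance from `candidate`
    to its k nearest neighbors in `archive`, computed with an
    arbitrary agent-space distance function."""
    if not archive:
        return float("inf")  # with an empty archive, any agent is maximally novel
    nearest = heapq.nsmallest(k, (distance(candidate, other) for other in archive))
    return sum(nearest) / len(nearest)
```

In a typical exploratory loop, candidates with high novelty scores are added to the archive and retained for further modification, driving the search toward agents unlike those already seen.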

Abstract: Exploration is one of the most important tasks in Reinforcement Learning, but it is not well-defined beyond finite problems in the Dynamic Programming paradigm (see Subsection 2.4). We provide a reinterpretation of exploration which can be applied to any online learning method. We come to this definition by approaching exploration from a new direction. After finding that concepts of exploration created to solve simple Markov decision processes with Dynamic Programming are no longer broadly applicable, we reexamine exploration. Instead of extending the ends of dynamic exploration procedures, we extend their means. That is, rather than repeatedly sampling every state-action pair possible in a process, we define the act of modifying an agent to itself be explorative. The resulting definition of exploration can be applied in infinite problems and non-dynamic learning methods, which the dynamic notion of exploration cannot tolerate. To understand the way that modifications of an agent affect learning, we describe a novel structure on the set of agents: a collection of distances (see footnote 7) d_a, a ∈ A, which represent the perspectives of each agent possible in the process. Using these distances, we define a topology and show that many important structures in Reinforcement Learning are well-behaved under the topology induced by convergence in the agent space.
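
For concreteness, the following is a minimal sketch of one way a perspective distance d_a might be instantiated in a finite Markov decision process: weight the per-state total-variation distance between two agents' action distributions by how often agent a visits each state. The weighting scheme and the choice of total variation here are illustrative assumptions, not the paper's exact construction; a family of such distances, one per agent a ∈ A, is the kind of structure the abstract's topology is built from.

```python
import numpy as np

def agent_distance(a_visitation, policy_b, policy_c):
    """One plausible perspective distance d_a between agents b and c.

    a_visitation : shape (S,) -- agent a's state-visitation distribution
        (assumed given; in practice it would be estimated from rollouts).
    policy_b, policy_c : shape (S, A) -- each row is an action distribution.

    Weights the per-state total-variation distance between the two
    policies by how often agent a visits each state.
    """
    tv_per_state = 0.5 * np.abs(policy_b - policy_c).sum(axis=1)
    return float(a_visitation @ tv_per_state)
```

Under such a family of distances, two agents can be close from one agent's perspective yet far from another's, which is why the set of distances, rather than any single one, is what induces the topology.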

arXiv
Twitter