Universal Agent for Disentangling Environments and Tasks

Jiayuan Mao, Honghua Dong, Joseph J. Lim

In International Conference on Learning Representations (2018)

maojiayuan [at] gmail.com, dhh19951 [at] gmail.com, limjj [at] usc.edu

Recent state-of-the-art reinforcement learning algorithms are trained with the goal of excelling at one specific task. As a result, environment-specific and task-specific knowledge are entangled in a single framework. However, there are often scenarios where the environment (e.g., the physical world) is fixed while only the target task changes. Borrowing an idea from hierarchical reinforcement learning, we propose a framework that disentangles task-specific and environment-specific knowledge by separating them into two units. The environment-specific unit handles how to move from one state to a target state, and the task-specific unit plans the next target state given a specific task. Extensive results in simulators indicate that our method can efficiently separate and learn two independent units, and can also adapt to a new task more efficiently than state-of-the-art methods.

TL;DR: We propose a DRL framework that disentangles task and environment specific knowledge.

Figure 1: Proposed Universal Agent, which consists of three parts: a perception function (phi) mapping raw observations to a feature space, a path function acting as an environment actor, and a goal function (tau) for future state planning.

The perception module (phi): Given the raw observation of a state, the perception function phi encodes the observation into a feature space. It can be optimized jointly with the path function or obtained separately (e.g., with an auto-encoder).
The environment-specific module (path): Given a pair (current state s, goal state s'), the path function outputs a probability distribution over the action space for the first action to take at state s in order to reach state s'.
The task-specific module (tau): Given the current state, the goal function tau determines what the goal state should be for a specific task. The path function is then invoked to get the next primitive action.
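
The way the three modules compose at decision time can be illustrated with a minimal Python sketch. This is not the authors' implementation: the 1-D toy environment, the two-action space, the hand-coded rules standing in for learned networks, and the helper name act are all assumptions made for illustration. Only the roles of phi, path, and tau follow the description above.

```python
import math

# Hypothetical toy setting: the agent lives on a 1-D line and can step left or right.
ACTIONS = ("left", "right")


def phi(observation):
    """Perception: encode the raw observation into a feature.

    Here the raw observation is already a position, so this is just a cast.
    In the described framework this would be a learned encoder.
    """
    return float(observation)


def path(s, goal):
    """Environment-specific module: distribution over the first action
    to take at state s in order to reach the goal state.

    A hand-coded softmax stands in for a learned policy (assumption).
    """
    logits = [s - goal, goal - s]  # favor "left" if goal < s, "right" if goal > s
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tau(s, task_target):
    """Task-specific module: propose the next goal state for the task.

    This toy rule picks a state at most one unit closer to the task target.
    """
    step = max(-1.0, min(1.0, task_target - s))
    return s + step


def act(observation, task_target):
    """Compose the modules: perceive, plan a goal state, then pick an action."""
    s = phi(observation)
    goal = tau(s, task_target)
    probs = path(s, goal)
    return ACTIONS[probs.index(max(probs))], probs


if __name__ == "__main__":
    action, probs = act(observation=0.0, task_target=3.0)
    print(action, probs)
```

In this sketch, switching to a new task only requires a different tau (here, a different task_target), while phi and path are reused unchanged, which mirrors the separation of task-specific and environment-specific knowledge described above.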

If you find this project useful, please consider citing:

@inproceedings{
    mao2018universal,
    title={Universal Agent for Disentangling Environments and Tasks},
    author={Jiayuan Mao and Honghua Dong and Joseph J. Lim},
    booktitle={International Conference on Learning Representations},
    year={2018},
    url={https://openreview.net/forum?id=B1mvVm-C-},
}
