Published January 1, 2013 | Version v1
Conference paper · Open Access

Generating Memoryless Policies Faster using Automatic Temporal Abstractions for Reinforcement Learning with Hidden State

  • 1. Middle East Technical University, Department of Computer Engineering, Ankara, Turkey

Description

Reinforcement learning with eligibility traces has proven effective for solving problems with hidden state: under certain conditions, it builds a memoryless optimal policy over observations. Automatic generation of temporal abstractions, on the other hand, provides ways to extract and reuse useful sub-policies during reinforcement learning in fully observable settings, so that the agent need not repeatedly learn the same skill. One recent automatic abstraction technique is the extended sequence tree method. We propose a novel way to bring together the extended sequence tree method and reinforcement learning for problems with hidden state. We extend the extended sequence tree method with a mechanism that shields the abstraction procedure from the adverse effects of perceptual aliasing, letting the agent make use of the remaining useful abstractions. The effectiveness of the method is demonstrated empirically on several benchmark problems.
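The memoryless setting described above can be illustrated with a minimal sketch: tabular Sarsa(λ) with replacing eligibility traces, where the value table is indexed by *observations* rather than underlying states. The toy corridor environment, its aliasing map, and all names below are hypothetical illustrations, not taken from the paper, and this sketch does not include the extended sequence tree abstraction itself.

```python
import random

# Hypothetical corridor with perceptual aliasing: underlying states 0..3,
# where states 1 and 2 emit the SAME observation. A memoryless agent must
# therefore learn a policy over observations, not states.
OBS = {0: 0, 1: 1, 2: 1, 3: 2}   # state -> observation; observation 1 is aliased
N_OBS, N_ACT, GOAL = 3, 2, 3     # actions: 0 = left, 1 = right

def step(s, a):
    """One environment transition: small step cost, +1 on reaching the goal."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    done = s2 == GOAL
    return s2, (1.0 if done else -0.1), done

def sarsa_lambda(episodes=300, alpha=0.1, gamma=0.95, lam=0.9,
                 epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) with replacing traces over observations."""
    rng = random.Random(seed)
    Q = [[0.0] * N_ACT for _ in range(N_OBS)]

    def act(o):
        if rng.random() < epsilon:
            return rng.randrange(N_ACT)
        return max(range(N_ACT), key=lambda a: Q[o][a])

    for _ in range(episodes):
        e = [[0.0] * N_ACT for _ in range(N_OBS)]  # traces reset each episode
        s, done, t = 0, False, 0
        o, a = OBS[s], act(OBS[s])
        while not done and t < 100:
            s2, r, done = step(s, a)
            o2 = OBS[s2]
            a2 = act(o2)
            delta = r + (0.0 if done else gamma * Q[o2][a2]) - Q[o][a]
            e[o][a] = 1.0                          # replacing trace
            for oo in range(N_OBS):
                for aa in range(N_ACT):
                    Q[oo][aa] += alpha * delta * e[oo][aa]
                    e[oo][aa] *= gamma * lam       # trace decay
            s, o, a, t = s2, o2, a2, t + 1
    return Q

Q = sarsa_lambda()
greedy = [max(range(N_ACT), key=lambda a: Q[o][a]) for o in range(N_OBS)]
# In this toy problem a single memoryless policy ("always go right") is
# optimal despite the aliasing, so the greedy policy favors action 1
# for observations 0 and 1.
```

In this corridor the aliased states happen to agree on the best action, so a memoryless policy suffices; the paper's setting is precisely the harder case where aliasing can mislead learned abstractions.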

Files

bib-3a7b42d8-2250-461f-8257-dd4be9cab043.txt (238 Bytes, md5:a3d279397508c8fbb8d194771ffb334f)