Please use this identifier to cite or link to this item: http://hdl.handle.net/10261/166730



Online reinforcement learning using a probability density estimation

Authors: Agostini, Alejandro; Celaya, Enric
Issue Date: 2017
Publisher: Massachusetts Institute of Technology
Citation: Neural Computation 29(1): 220-246 (2017)
Abstract: Function approximation in online, incremental, reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are obtained from specific trajectories dictated by the dynamics of the environment and are usually concentrated in particular convergence regions, which in the long term tend to dominate the approximation in the less sampled regions. The nonstationarity comes from the recursive nature of the estimations typical of temporal difference methods. This nonstationarity has a local profile, varying not only along the learning process but also along different regions of the state space. We propose to deal with these problems using an estimation of the probability density of samples represented with a Gaussian mixture model. To deal with the nonstationarity problem, we use the common approach of introducing a forgetting factor in the updating formula. However, instead of using the same forgetting factor for the whole domain, we make it dependent on the local density of samples, which we use to estimate the nonstationarity of the function at any given input point. To address the biased sampling problem, the forgetting factor applied to each mixture component is modulated according to the new information provided in the updating, rather than forgetting depending only on time, thus avoiding undesired distortions of the approximation in less sampled regions.
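The abstract gives no update formulas, so the following is only a rough illustrative sketch of one idea it describes: forgetting that is modulated by how much new information a sample brings to each mixture component, rather than by elapsed time alone. It is not the authors' algorithm. The one-dimensional setting, the function names `gaussian_pdf` and `update_mixture`, the `base_forget` parameter, the sufficient-statistics representation, and the use of the component responsibility as the measure of "new information" are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def update_mixture(components, x, base_forget=0.05):
    """Update a 1-D Gaussian mixture in place with one sample x.

    Each component is a dict of accumulated statistics (hypothetical layout):
      'w'   accumulated weight, 'sx' weighted sum of x, 'sxx' weighted sum of x^2.
    Returns the responsibilities of the components for x.
    """
    # Weighted density of x under each component.
    dens = []
    for c in components:
        mean = c["sx"] / c["w"]
        var = max(c["sxx"] / c["w"] - mean ** 2, 1e-6)  # guard against collapse
        dens.append(c["w"] * gaussian_pdf(x, mean, var))

    total = sum(dens)
    if total > 0.0:
        resp = [d / total for d in dens]
    else:
        # x is far from every component: fall back to uniform responsibilities.
        resp = [1.0 / len(components)] * len(components)

    for c, r in zip(components, resp):
        # Assumption: a component forgets only in proportion to the new
        # information it receives (its responsibility r), not merely because
        # a time step has passed.
        keep = 1.0 - base_forget * r
        c["w"] = keep * c["w"] + r
        c["sx"] = keep * c["sx"] + r * x
        c["sxx"] = keep * c["sxx"] + r * x * x
    return resp
```

Under these assumptions, a component located far from the incoming sample receives a responsibility close to zero, so its statistics are left almost untouched. This mirrors, in a simplified way, the abstract's point about avoiding distortions in less sampled regions. A purely time-based forgetting factor would instead decay every component at every step.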
Description: Letter: Communicated by Masa-aki Sato.
Publisher version (URL): https://doi.org/10.1162/NECO_a_00906
Identifiers: doi: 10.1162/NECO_a_00906
e-issn: 1530-888X
issn: 0899-7667
Appears in Collections: (IRII) Articles
Files in This Item:
File: lettermasaki.pdf
Size: 1.03 MB
Format: Adobe PDF

WARNING: Items in Digital.CSIC are protected by copyright, with all rights reserved, unless otherwise indicated.