Please use this identifier to cite or link to this item: http://hdl.handle.net/10261/166730

Online reinforcement learning using a probability density estimation

Authors: Agostini, Alejandro; Celaya, Enric
Publication date: 2017
Publisher: Massachusetts Institute of Technology
Citation: Neural Computation 29(1): 220-246 (2017)
Abstract: Function approximation in online, incremental reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are obtained from specific trajectories dictated by the dynamics of the environment and are usually concentrated in particular convergence regions, which in the long term tend to dominate the approximation in the less sampled regions. The nonstationarity comes from the recursive nature of the estimations typical of temporal difference methods. This nonstationarity has a local profile, varying not only along the learning process but also along different regions of the state space. We propose to deal with these problems using an estimation of the probability density of samples represented with a Gaussian mixture model. To deal with the nonstationarity problem, we use the common approach of introducing a forgetting factor in the updating formula. However, instead of using the same forgetting factor for the whole domain, we make it dependent on the local density of samples, which we use to estimate the nonstationarity of the function at any given input point. To address the biased sampling problem, the forgetting factor applied to each mixture component is modulated according to the new information provided in the updating, rather than forgetting depending only on time, thus avoiding undesired distortions of the approximation in less sampled regions.
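The idea of modulating forgetting by new information, rather than by time alone, can be illustrated with a small sketch. This is not the update rule from the paper: the one-dimensional setting, the function names `gaussian_pdf` and `update_mixture`, the dictionary layout, and the choice of raising a base forgetting factor to the power of each component's responsibility are all assumptions made here for illustration only.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def update_mixture(components, x, base_forgetting=0.95, min_var=1e-6):
    """Update a 1-D Gaussian mixture in place with one sample x.

    components: list of dicts with keys 'w' (accumulated weight),
    'mean' and 'var'. Hypothetical layout, chosen for this sketch.
    Returns the responsibilities of the components for x.
    """
    total_w = sum(c["w"] for c in components)
    likes = [c["w"] / total_w * gaussian_pdf(x, c["mean"], c["var"])
             for c in components]
    density = sum(likes)
    if density > 0.0:
        resp = [l / density for l in likes]
    else:
        # Sample is far from every component: fall back to equal shares.
        resp = [1.0 / len(components)] * len(components)

    for c, r in zip(components, resp):
        # Assumed modulation: a component that receives no information
        # (r close to 0) gets a factor close to 1, so it is not forgotten;
        # a component that fully explains x (r close to 1) is discounted
        # by the base forgetting factor.
        lam = base_forgetting ** r
        new_w = lam * c["w"] + r
        step = r / new_w
        delta = x - c["mean"]
        # Weighted incremental mean and variance update.
        c["mean"] += step * delta
        c["var"] = max((1.0 - step) * (c["var"] + step * delta ** 2), min_var)
        c["w"] = new_w
    return resp


mixture = [
    {"w": 10.0, "mean": 0.0, "var": 1.0},
    {"w": 10.0, "mean": 10.0, "var": 1.0},
]
update_mixture(mixture, 1.0)
```

With a purely time-based rule, every component would be discounted at every step, so a component in a rarely visited region would gradually lose its accumulated statistics. In this sketch the component centered at 10 is left practically untouched by a sample at 1, while the component centered at 0 is both discounted and updated.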
Description: Letter. Communicated by Masa-aki Sato.
Publisher version: https://doi.org/10.1162/NECO_a_00906
Identifiers: doi: 10.1162/NECO_a_00906
e-issn: 1530-888X
issn: 0899-7667
Appears in collections: (IRII) Artículos
Files in this item:
File: lettermasaki.pdf
Size: 1.03 MB
Format: Adobe PDF

NOTE: Items in Digital.CSIC are protected by copyright, with all rights reserved, unless otherwise indicated.