Please use this identifier to cite or link to this item:
http://hdl.handle.net/10261/30078
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Agostini, Alejandro | - |
dc.contributor.author | Celaya, Enric | - |
dc.date.accessioned | 2010-12-15T13:40:37Z | - |
dc.date.available | 2010-12-15T13:40:37Z | - |
dc.date.issued | 2006 | - |
dc.identifier.citation | IRI-TR-06-01 (2006) | - |
dc.identifier.uri | http://hdl.handle.net/10261/30078 | - |
dc.description.abstract | A Reinforcement Learning problem is formulated as trying to find the action policy that maximizes the accumulated reward received by the agent through time. One of the most popular algorithms used in RL is Q-Learning, which uses an action-value function q(s,a) to estimate the maximum expected future cumulative reward obtained from executing action a in situation s. Q-Learning, like conventional RL techniques in general, is defined for discrete environments with a finite set of states and actions, and the action-value function is represented explicitly by storing a value for each state-action pair (s,a). To reach a good approximation of the value function, every (s,a) pair must be experienced many times, but in practical applications the amount of experience required for learning to take place is infeasible. Therefore, the value function must be generalized so that values can be inferred for situations never experienced before. The generalization problem has been widely treated in the field of machine learning. Supervised learning addresses this issue directly, and many generalization techniques have been developed in that field. Any of the representations used in supervised learning could, in principle, be applied to RL, but some important issues make good generalization in RL very hard to achieve. One of the most remarkable is that the value function is learned while it is being represented. In this work we propose an RL approach that uses a new representation of the Q function, capturing function regularities in decision rules to allow good generalization. The representation is a kind of Decision List in which each rule delimits a subspace of the state-action space and provides an approximation of the Q function in the region it covers. For action evaluation, the selected rule is the one with both good accuracy in the estimation and high confidence in the related statistics. | - |
dc.language.iso | eng | - |
dc.publisher | CSIC-UPC - Instituto de Robótica e Informática Industrial (IRII) | - |
dc.relation.isversionof | Publisher's version | - |
dc.rights | openAccess | - |
dc.subject | Reinforcement learning | - |
dc.subject | Generalization | - |
dc.subject | Categorization | - |
dc.subject | Decision list | - |
dc.subject | Automatic theorem proving | - |
dc.subject | Intelligent robots and autonomous agents | - |
dc.subject | Machine learning | - |
dc.title | Generalization in reinforcement learning with a task-related world description using rules | - |
dc.type | Technical report | - |
dc.relation.publisherversion | http://www.iri.upc.edu/publications/show/811 | - |
dc.relation.csic | Yes | - |
dc.type.coar | http://purl.org/coar/resource_type/c_18gh | es_ES |
item.cerifentitytype | Publications | - |
item.grantfulltext | open | - |
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | - |
item.fulltext | With Fulltext | - |
item.languageiso639-1 | en | - |
item.openairetype | Technical report | - |
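The abstract above describes tabular Q-Learning, where q(s,a) is stored explicitly for every state-action pair. A minimal sketch of that update rule follows; it is illustrative only, not code from the report, and all names (`Q`, `q_learning_step`, `alpha`, `gamma`) are assumptions:

```python
# Minimal sketch, not the report's code: the tabular Q-Learning update
# described in the abstract, with q(s, a) stored explicitly per pair.

def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Move Q[(s, a)] toward r + gamma * max_b Q[(s_next, b)]."""
    # Best estimated future value from the successor state s_next.
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    td_target = r + gamma * best_next
    # Standard temporal-difference update of the stored table entry.
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (td_target - Q.get((s, a), 0.0))
    return Q[(s, a)]
```

Because every (s,a) entry of this table must be visited many times for the values to converge, such an explicit representation is exactly what the decision-list representation proposed in the report generalizes over.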
Appears in collections: | (IRII) Reports and working papers |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
Generalization in reinforcement.pdf | | 628.19 kB | Adobe PDF | View/Open |
Page view(s): 364 (checked on 24-Apr-2024)
Download(s): 142 (checked on 24-Apr-2024)
NOTE: Items in Digital.CSIC are protected by copyright, with all rights reserved, unless otherwise indicated.