Mathematics of Operations Research
MATHEMATICS OF OPERATIONS RESEARCH
Vol. 33, No. 4, November 2008, pp. 880-898
DOI: 10.1287/moor.1080.0324

A Learning Algorithm for Risk-Sensitive Cost

Arnab Basu, Tirthankar Bhattacharyya, Vivek S. Borkar

Quantitative Methods and Information Systems Area, Indian Institute of Management Bangalore, Bangalore 560076, India
Department of Mathematics, Indian Institute of Science, Bangalore 560012, India
School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai 400005, India

arnab.basu{at}iimb.ernet.in
tirtha{at}math.iisc.ernet.in, http://math.iisc.ernet.in/~tirtha
borkar{at}tifr.res.in, http://www.tcs.tifr.res.in/~borkar

A linear function approximation-based reinforcement learning algorithm is proposed for Markov decision processes with infinite horizon risk-sensitive cost. Its convergence is proved using the "o.d.e. method" for stochastic approximation. The scheme is also extended to continuous state space processes.

Key Words: learning algorithm; risk-sensitive cost; function approximation; stochastic approximation
History: Received: December 29, 2006; revision received: November 11, 2007.





Copyright © 2008 by INFORMS.