Mathematics of Operations Research
MATHEMATICS OF OPERATIONS RESEARCH
Vol. 32, No. 1, February 2007, pp. 51-72
DOI: 10.1287/moor.1060.0224

Solution and Forecast Horizons for Infinite-Horizon Nonhomogeneous Markov Decision Processes

Torpong Cheevaprawatdomrong, Irwin E. Schochetman, Robert L. Smith, Alfredo Garcia

Jong Stit Co., Ltd., Bangkok, Thailand
Mathematics and Statistics, Oakland University, Rochester, Michigan 48309
Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109
Systems and Information Engineering, University of Virginia, Charlottesville, Virginia 22901

tonychee{at}yahoo.com
schochet{at}oakland.edu
rlsmith{at}umich.edu, http://www-personal.umich.edu/~rlsmith/
agarcia{at}virginia.edu, http://www.sys.virginia.edu/people/ag.asp

We consider a nonhomogeneous infinite-horizon Markov Decision Process (MDP) problem with multiple optimal first-period policies. We seek an algorithm that, given finite data, delivers an optimal first-period policy. Such an algorithm can thus recursively generate, within a rolling-horizon procedure, an infinite-horizon optimal solution to the original problem. However, it can happen that no such algorithm exists, i.e., the MDP is not well posed. Equivalently, it is impossible to solve the problem with a finite amount of data. Assuming increasing marginal returns in actions (with respect to states) and stochastically increasing state transitions (with respect to actions), we provide an algorithm that is guaranteed to solve the given MDP whenever it is well posed. This algorithm determines, in finite time, a forecast horizon for which an optimal solution delivers an optimal first-period policy. As an application, we solve all well-posed instances of the time-varying version of the classic asset-selling problem.
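The forecast-horizon idea can be illustrated with a minimal sketch. This is not the algorithm of the paper: it is plain backward induction on finite-horizon truncations, applied to a hypothetical toy asset-selling instance, and it only shows how the first-period policy can change and then settle as the horizon T grows. The function name `first_period_policy`, the `reward` and `trans` interfaces, the offer values, the zero terminal value, and the discount factor 0.9 are all assumptions made for illustration. The toy data are stationary so the result can be checked by hand, although the interfaces accept the period index `t`.

```python
from typing import Callable, List

# Hypothetical interfaces (not taken from the paper):
#   reward(t, s, a) -> one-period reward in period t, state s, action a
#   trans(t, s, a)  -> list of next-state probabilities
Reward = Callable[[int, int, int], float]
Trans = Callable[[int, int, int], List[float]]


def first_period_policy(T: int, n_states: int, n_actions: int,
                        reward: Reward, trans: Trans,
                        discount: float) -> List[int]:
    """Solve the T-period truncation by backward induction and
    return one optimal action per state for period 0.
    Ties are broken toward the smallest action index."""
    value = [0.0] * n_states   # assumed terminal value of zero
    policy = [0] * n_states
    for t in range(T - 1, -1, -1):
        new_value = [0.0] * n_states
        for s in range(n_states):
            best_q, best_a = float("-inf"), 0
            for a in range(n_actions):
                p = trans(t, s, a)
                q = reward(t, s, a) + discount * sum(
                    p[j] * value[j] for j in range(n_states))
                if q > best_q:
                    best_q, best_a = q, a
            new_value[s] = best_q
            policy[s] = best_a   # after the loop this holds the period-0 policy
        value = new_value
    return policy


# Toy asset-selling instance (illustrative data only).
# States 0..2 index the current offer; state 3 means the asset is sold.
# Action 0 = wait for a new offer, action 1 = sell at the current offer.
OFFERS = [1.0, 2.0, 3.0]
SOLD = 3


def reward(t: int, s: int, a: int) -> float:
    return OFFERS[s] if (a == 1 and s != SOLD) else 0.0


def trans(t: int, s: int, a: int) -> List[float]:
    if s == SOLD or a == 1:
        return [0.0, 0.0, 0.0, 1.0]       # sold is absorbing
    return [1 / 3, 1 / 3, 1 / 3, 0.0]     # next offer drawn uniformly


for T in range(1, 7):
    print(T, first_period_policy(T, 4, 2, reward, trans, 0.9))
```

Observing that the printed first-period policy stops changing over a few horizons does not by itself certify a forecast horizon; a certificate requires a stopping rule such as the one the paper derives under its monotonicity assumptions, which this sketch does not reproduce.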

Key Words: planning horizon; monotone policy; well-posed problem
History: Received February 4, 2005; revision received November 24, 2005.





Copyright © 2007 by INFORMS.