On Average Reward Semi-Markov Decision Processes with a General Multichain Structure
L. Jianyong, Z. Xiaobo
Institute of Applied Mathematics, Academia Sinica, Beijing 100080, China
Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
liujy{at}mail.amss.ac.cn
xbzhao{at}mail.tsinghua.edu.cn
In this paper, we investigate average reward semi-Markov decision processes with a general multichain structure using a data-transformation method. By solving the transformed discrete-time average reward Markov decision processes, we can obtain significant and interesting information on the original average reward semi-Markov decision processes. If the original semi-Markov decision processes satisfy some appropriate conditions, then stationary optimal policies in the transformed discrete-time models are also optimal in the original semi-Markov decision processes.
Key Words: semi-Markov decision processes; average reward criterion; multichain structure; data-transformation method; optimal policy
History: Received: April 29, 2002; revision received: June 2, 2003; revision received: June 7, 2003.
Copyright © 2004 by INFORMS.