
Author Löbus, J.U.
Issue date 1986
dc.description A class of iteration methods is introduced to find the optimal value function υ* ∈ R^m in a Markovian decision problem with a known optimal stationary policy, represented by an (m, m) transition matrix P^δ and a reward vector γ^δ ∈ R^m. Depending on an (m, m) parameter matrix Q with (I - Q) nonsingular, the Q-iteration (for υ*, when P^δ and γ^δ are given) is explained. Well-known methods are obtained for special forms of Q. An estimate characterizing the speed of convergence of the Q-iteration is given. [An illustrative sketch of such a Q-iteration follows the record below.]
Format application/pdf
Publisher Akademie-Verlag
Copyright Taylor and Francis Group, LLC
Subject Markovian decision problem
Subject optimal value function
Subject iteration methods
Subject Primary: 90 C 40
Subject Secondary: 49 C 20
Title A class of procedures to compute the optimal value function in a Markovian decision problem
Type research-article
DOI 10.1080/02331938608843148
Electronic ISSN 1029-4945
Print ISSN 0233-1934
Journal Optimization
Volume 17
First page 399
Last page 409
Affiliation Löbus, J.U.; Sektion Mathematik, Friedrich-Schiller-Universität Jena
Issue 3
Reference Bartmann, D. Acceleration of the Method of Successive Approximations in Dynamic Programming, Technical University Munich. Preprint
Reference Hinderer, K. 1970. Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter, Berlin: Springer.
Reference Van Nunen, J.A.E.E. 1976. Contracting Markov Decision Processes, Amsterdam: Mathematisch Centrum.
Reference Varga, R.S. 1962. Matrix Iterative Analysis, Prentice-Hall.
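
Illustrative sketch. The abstract describes the Q-iteration only in outline, so the following is a minimal sketch, not the paper's actual scheme: it assumes the standard discounted policy-evaluation equation υ* = γ^δ + β P^δ υ*, i.e. (I - β P^δ) υ* = γ^δ, and a classical matrix-splitting recursion v_{k+1} = (I - Q)^{-1}((β P^δ - Q) v_k + γ^δ), in the spirit of the cited Varga (1962). The names q_iteration, P_delta, gamma_delta, beta and the sample data are hypothetical and not taken from the record.

import numpy as np

def q_iteration(P_delta, gamma_delta, beta, Q, v0=None, tol=1e-10, max_iter=10000):
    # Splitting-type iteration (assumed form, not necessarily the paper's):
    # v_{k+1} = (I - Q)^{-1} ((beta * P_delta - Q) v_k + gamma_delta)
    m = P_delta.shape[0]
    M = np.linalg.inv(np.eye(m) - Q)      # (I - Q) is assumed nonsingular
    v = np.zeros(m) if v0 is None else np.asarray(v0, dtype=float).copy()
    for _ in range(max_iter):
        v_next = M @ ((beta * P_delta - Q) @ v + gamma_delta)
        if np.max(np.abs(v_next - v)) < tol:
            return v_next
        v = v_next
    return v

if __name__ == "__main__":
    P = np.array([[0.6, 0.4], [0.3, 0.7]])   # hypothetical 2-state transition matrix P^delta
    g = np.array([1.0, 2.0])                 # hypothetical reward vector gamma^delta
    beta = 0.9                               # hypothetical discount factor
    Q_gs = np.tril(beta * P, k=-1)           # strictly lower part: a Gauss-Seidel-type splitting
    v = q_iteration(P, g, beta, Q_gs)
    print(v)
    print(np.linalg.solve(np.eye(2) - beta * P, g))  # direct solve for comparison

With Q = 0 the recursion reduces to plain successive approximation for the fixed policy, while triangular choices of Q give Gauss-Seidel-type sweeps; this is presumably the sense in which the abstract says well-known methods arise for special forms of Q.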
