A class of procedures to compute the optimal value function in a Markovian decision problem

Löbus, J.U.

Publisher: Taylor & Francis

Affiliation: Sektion Mathematik, Friedrich-Schiller-Universität Jena

Journal: Optimization

Date: 1986

Abstract:

A class of iteration methods is introduced to find the optimal value function v* ∈ R^m in a Markovian decision problem with a known optimal stationary policy, represented by an (m, m) transition matrix P_δ and a reward vector γ_δ ∈ R^m. The Q-iteration for v*, given P_δ and γ_δ, is defined in terms of an (m, m) parameter matrix Q for which (I - Q) is nonsingular. Well-known methods are obtained for special choices of Q. An estimate characterizing the speed of convergence of the Q-iteration is given.
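The abstract does not state the iteration formula, so the following is only an illustrative sketch under assumptions, not the author's definition. It assumes that v* satisfies v = γ_δ + P_δ v and that the Q-iteration takes the splitting form (I - Q) v_next = γ_δ + (P_δ - Q) v, whose fixed point is that same equation. It further assumes that a discount factor (here 0.9) is folded into P_δ so that the iteration converges. The names `solve`, `q_iteration`, and the example data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def q_iteration(P, r, Q, tol=1e-12, max_iter=10_000):
    """Assumed form: repeat (I - Q) v_next = r + (P - Q) v until iterates agree."""
    m = len(r)
    I_minus_Q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(m)]
                 for i in range(m)]
    P_minus_Q = [[P[i][j] - Q[i][j] for j in range(m)] for i in range(m)]
    v = [0.0] * m
    for k in range(1, max_iter + 1):
        rhs = [r[i] + sum(P_minus_Q[i][j] * v[j] for j in range(m))
               for i in range(m)]
        v_next = solve(I_minus_Q, rhs)
        if max(abs(a - b) for a, b in zip(v_next, v)) < tol:
            return v_next, k
        v = v_next
    raise RuntimeError("Q-iteration did not converge")


# Hypothetical two-state example: stochastic matrix scaled by a discount of 0.9.
beta = 0.9
T = [[0.5, 0.5], [0.2, 0.8]]
P = [[beta * t for t in row] for row in T]
r = [1.0, 2.0]
m = len(r)

Q_zero = [[0.0] * m for _ in range(m)]                      # successive approximation
Q_lower = [[P[i][j] if j < i else 0.0 for j in range(m)]    # Gauss-Seidel style
           for i in range(m)]
Q_full = [row[:] for row in P]                              # direct linear solve

v_sa, k_sa = q_iteration(P, r, Q_zero)
v_gs, k_gs = q_iteration(P, r, Q_lower)
v_direct, k_direct = q_iteration(P, r, Q_full)
```

Under these assumptions, Q = 0 reduces to plain successive approximation, Q equal to the strictly lower triangular part of P_δ gives a Gauss-Seidel style sweep, and Q = P_δ amounts to solving the linear system directly. This is one way to read the remark that well-known methods arise from special choices of Q.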
