A discounted uniform one-armed bandit problem
Hamada, Toshio; Department of Management and Information Sciences, Himeji College of Hyogo
Journal:
Sequential Analysis
Date:
1992
Abstract:
This paper considers when to stop observing a sequence of independent random variables drawn from the uniform distribution on the interval (-1, u). The true value of u is unknown, and the prior distribution of u is Pareto with parameters w and a. The objective is to maximize the expected total discounted sum of the observations. The problem is formulated as a dynamic program, and the optimal strategy is described by a critical value function whose value is easily calculated for any parameter vector.
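The setting in the abstract can be illustrated with a small simulation. The sketch below is not the paper's optimal strategy: it stops after a fixed number of observations and estimates the expected discounted sum by Monte Carlo. Several details are assumptions that the abstract does not state: w is treated as the Pareto scale (lower bound of u) and a as the shape, the discount factor is a hypothetical parameter named beta, and the function names are illustrative only.

```python
import random


def sample_pareto(w, a, rng):
    """Draw u from a Pareto distribution (assumed: scale w, shape a, support u >= w)."""
    # Inverse-CDF sampling: P(U > u) = (w / u) ** a for u >= w.
    # rng.random() lies in [0, 1), so the base below lies in (0, 1].
    return w / (1.0 - rng.random()) ** (1.0 / a)


def discounted_sum(observations, beta):
    """Sum of beta**t * x_t over the observations, with t starting at 0."""
    return sum(beta ** t * x for t, x in enumerate(observations))


def simulate_fixed_horizon(w, a, beta, n_obs, n_runs=10000, seed=0):
    """Monte Carlo estimate of the expected discounted sum when exactly
    n_obs observations are taken (a fixed-horizon rule, used here only
    as a baseline; it is not the critical-value rule of the paper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        u = sample_pareto(w, a, rng)          # unknown upper endpoint
        xs = [rng.uniform(-1.0, u) for _ in range(n_obs)]
        total += discounted_sum(xs, beta)
    return total / n_runs


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, simulate_fixed_horizon(w=1.0, a=3.0, beta=0.9, n_obs=n))
```

A fixed horizon ignores what the observations reveal about u. An adaptive rule, such as the one the paper derives, would instead update its belief about u after each observation and decide whether to continue.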