Adding Temporary Memory to ZCS
Cliff, Dave; Ross, Susi
Journal:
Adaptive Behavior
Date:
1994-09-01
Abstract:
In a recent article, Wilson (1994) described a "zeroth-level" classifier system
(ZCS). ZCS employs a reinforcement learning technique comparable to
Q-learning (Watkins, 1989). This article presents results from the first
reconstruction of ZCS. Having replicated Wilson's results, we extend ZCS in a
manner suggested by Wilson: The original formulation of ZCS has no memory
mechanisms, but Wilson (1994b) suggested how internal "temporary memory"
registers could be added. We show results from adding one-bit and two-bit
memory registers to ZCS. Our results demonstrate that ZCS can exploit
memory facilities efficiently in non-Markov environments. We also show that
the memoryless ZCS can converge on near-optimal stochastic solutions in
non-Markov environments.

We then present results from trials using ZCS in Markov environments that
require increasingly long chains of actions before reward is received. Our
results indicate that inaccurate overgeneral classifiers can interact with the
classifier-generation mechanisms to cause catastrophic breakdowns in overall
system performance. Basing classifier fitness on accuracy may alleviate this
problem. We conclude that the memory mechanism in its current form is
unlikely to scale well for situations requiring large amounts of temporary
memory. Nevertheless, the ability to find stochastic solutions when there is
insufficient memory might offset this problem somewhat.