Description
This paper compares the use of temporal difference learning (TDL) versus co-evolutionary
learning (CEL) for acquiring position evaluation functions for the game of Othello. The paper
provides important insights into the strengths and weaknesses of each approach. The main
findings are that for Othello, TDL learns much faster than CEL, but that properly tuned CEL
can learn better playing strategies. For CEL, it is essential to use parent-child weighted
averaging in order to achieve good performance. Using this method, a high-quality weighted
piece counter was evolved and shown to significantly outperform a set of standard
heuristic weights.
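
A minimal sketch of the two ideas the description mentions: a weighted piece counter that scores an Othello position as a per-square weighted sum, and parent-child weighted averaging, in which an offspring's weights are blended back toward its parent's. The function names, the 64-entry flat board encoding, and the blend rate `alpha` are illustrative assumptions, not details taken from the paper.

```python
import random

def evaluate(board, weights):
    # Weighted piece counter: score = sum over squares of weight * piece,
    # where pieces are +1 (player), -1 (opponent), 0 (empty).
    # Both board and weights are flat 64-element lists (8x8 Othello board).
    return sum(w * p for w, p in zip(weights, board))

def parent_child_average(parent, child, alpha=0.5):
    # Parent-child weighted averaging: the surviving individual's weights
    # are a blend of the parent's and the mutated child's weights.
    # alpha is a hypothetical blend rate, not the paper's setting.
    return [(1 - alpha) * p + alpha * c for p, c in zip(parent, child)]

# Illustrative use: mutate a parent's weights, then average back toward it.
random.seed(0)
parent = [0.0] * 64
child = [w + random.gauss(0, 0.1) for w in parent]
new_weights = parent_child_average(parent, child, alpha=0.5)
```

The averaging step damps the noise that mutation injects, which is one plausible reason it stabilises co-evolutionary search.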