Fictitious play is a process by which players of a game adapt their strategies in response to the actions selected by their opponents. It is known to converge to Nash equilibrium in several classes of games. Numerous variants of fictitious play have been proposed; this work introduces generalised weakened fictitious play (GWFP), a class of processes which includes many of these variants. The GWFP process is proved to converge in the same classes of games as fictitious play, and therefore provides a unified convergence analysis for fictitious-play-like processes. Two new learning processes are then introduced and shown to be GWFP processes. The first, an actor-critic learning process, does not require players to observe opponent actions, or even to take account of the fact that they are playing a game. The second is a random belief learning process, in which players construct a belief distribution over opponent strategies instead of a point estimate. Since both are GWFP processes, they converge to Nash equilibrium in the same classes of games in which fictitious play is known to converge.
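For readers unfamiliar with the basic process, the following is a minimal sketch of classical two-player fictitious play (not GWFP, and not either of the two new processes in the talk). Each player best-responds to the empirical frequency of the opponent's past actions. The function name `fictitious_play`, the payoff-matrix convention, the single initial fictitious observation, and the tie-breaking rule (lowest-index action) are all assumptions made for illustration.

```python
import numpy as np

def fictitious_play(A, B, steps=20000):
    """Classical two-player fictitious play (illustrative sketch).

    A[i, j] is the row player's payoff and B[i, j] the column player's
    payoff when row plays i and column plays j.
    Returns the empirical action frequencies of both players.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n_rows, n_cols = A.shape

    # Assumption: start each count vector with one fictitious
    # observation of action 0 so that beliefs are well defined.
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    row_counts[0] = 1.0
    col_counts[0] = 1.0

    for _ in range(steps):
        # Beliefs are the empirical frequencies of the opponent's actions.
        belief_about_col = col_counts / col_counts.sum()
        belief_about_row = row_counts / row_counts.sum()

        # Each player picks a best response to its current belief.
        # np.argmax breaks ties in favour of the lowest index.
        row_action = int(np.argmax(A @ belief_about_col))
        col_action = int(np.argmax(belief_about_row @ B))

        row_counts[row_action] += 1.0
        col_counts[col_action] += 1.0

    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Matching pennies: a two-player zero-sum game whose unique Nash
# equilibrium has both players mixing uniformly.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
row_freq, col_freq = fictitious_play(A, -A)
print(row_freq, col_freq)
```

Matching pennies is zero-sum, one of the classes of games in which fictitious play is known to converge, so the empirical frequencies should approach the uniform mixed strategy, although convergence is slow.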
Dr David Leslie
David is a Lecturer in Statistics at the University of Bristol. He received his PhD from Bristol and, in the interim, was a lecturer in the statistics department at Oxford and a post-doctoral researcher at the University of New South Wales. His research interests are in game theory, reinforcement learning, stochastic optimisation, and Bayesian statistics. He is currently working to develop and apply these techniques in the ALADDIN project.