Seminar

Bandit games

Antoine Salomon (ENPC)

November 4, 2010, 12:30–14:00

Toulouse

Room MC 205

Decision Mathematics Seminar

Abstract

We study strategic interaction between several agents who face an exploration vs. exploitation dilemma. In game theory, this situation is well described by models of bandit games. Each player faces a two-armed bandit machine with one safe arm and one risky arm. At each stage of the game, each player has to decide which arm to use. If he chooses the risky arm (exploration), he gets a random payoff, which gives him partial information on the profitability of his machine. If he chooses the safe arm, he gets a known payoff, but possibly a lower one than he could have obtained from exploration. The profitability of the machine depends on an unknown state of nature, which can be learned through exploration. Learning is a strategic issue: for instance, a player could benefit from the information produced by others without taking risks himself. We study Nash equilibria of such games. We mainly ask whether equilibria are efficient: does a player gain significantly more from strategic interaction than he would alone? Is there some kind of cooperation that helps players obtain more information?
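
As a purely illustrative sketch of the learning step described in the abstract, the Python snippet below updates a player's belief about the risky arm after one observed pull. It is not taken from the talk. It assumes a deliberately simple setting: the state of nature is either "good" or "bad", and the risky arm pays 1 with probability p_good or p_bad accordingly. The function names, the success probabilities 0.7 and 0.3, and the safe payoff 0.5 are all hypothetical.

```python
def update_belief(prior_good, success, p_good=0.7, p_bad=0.3):
    """Bayes update of the probability that the risky arm is good.

    Assumption (not from the abstract): the risky arm pays 1 with
    probability p_good in the good state and p_bad in the bad state.
    """
    like_good = p_good if success else 1.0 - p_good
    like_bad = p_bad if success else 1.0 - p_bad
    evidence = prior_good * like_good + (1.0 - prior_good) * like_bad
    return prior_good * like_good / evidence


def myopic_choice(belief_good, safe_payoff=0.5, p_good=0.7, p_bad=0.3):
    """Pick the arm with the higher expected payoff in the current stage.

    This is a myopic rule used only for illustration; it ignores the
    value of information and is not an equilibrium strategy.
    """
    expected_risky = belief_good * p_good + (1.0 - belief_good) * p_bad
    return "risky" if expected_risky > safe_payoff else "safe"


# A player who only watches another player's pulls still learns:
# the belief moves although he never leaves the safe arm himself.
belief = 0.5
for observed_success in (True, True, False):
    belief = update_belief(belief, observed_success)
```

The final loop mirrors the free-riding remark in the abstract: under the assumption that outcomes are publicly observed and the state is common to all players, the same update applies whether the pull was the player's own or someone else's.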