Safronov, M.

Experimentation and Learning-by-Doing

CWPE1667

Abstract: I consider a multi-armed bandit problem in which, by experimenting with an arm, an agent not only learns its payoffs but also, through learning-by-doing, becomes experienced at that arm. Experience provides additional payoffs to the agent. I study the interaction between the processes of experimentation and learning-by-doing. The presence of learning-by-doing always reduces the agent's willingness to experiment, regardless of whether the agent is actually experienced at the arm she is currently pulling. Moreover, this effect is non-monotone in the arrival rate of experience and reaches its maximum at intermediate arrival rates. Arms with extreme arrival rates of experience yield the highest payoff to the agent, so she pulls those arms first. This non-monotonicity result extends to the case of collective experimentation with two agents, where the agents' equilibrium payoffs reach their maximum at extreme arrival rates of experience. If the agent obtains experience by learning certain 'skills' at the arm, then the presence of experimentation affects which skills the agent learns first. If the process of learning-by-doing is deterministic, the agent learns the easier skills first; if the process is stochastic and memoryless, the agent learns the harder skills first.

PDF: https://www.econ.cam.ac.uk/research-files/repec/cam/pdf/cwpe1667.pdf