/papers/

0408007v1
Abernethy - An efficient algorithm for bandit linear optimization, 2008.pdf
Abernethy, Hazan, Rakhlin - An efficient algorithm for bandit linear optimization.pdf
An optimal high probability algorithm for the contextual bandit problem, 2010.pdf
Audibert, Bubeck - Minmax policies for bandit games.pdf
Audibert, Bubeck - Regret bounds and minimax policies under partial monitoring, 2010.pdf
Aueur, Cesa, Fischer - Finite-time analysis of the multiarmed bandit problem, 2002.pdf
Aueur, Cesa, Freund, Schapire - The nonstochastic multiarmed bandit problem, 2002.pdf
Cesa, Lugosi - Prediction, learning and games, 2006.pdf
Cesa, Lugosi, Stoltz - Minimizing regret with label efficient prediction, 2005.pdf
Freedman - On tail probabilities for martingales, 1972.pdf
Massart - Concentration Inequalities and Model Selection, 2003.pdf
Stoltz - Incomplete information and internal regret in prediction of individual sequences, 2005.pdf
TCS08.pdf
Zinkevich - Online convex programming and generalized infinitesimal gradient ascent (technical), 2003.pdf
Zinkevich - Online convex programming and generalized infinitesimal gradient ascent, 2003.pdf