/papers/
- 0408007v1
- Abernethy - An efficient algorithm for bandit linear optimization, 2008.pdf
- Abernethy, Hazan, Rakhlin - An efficient algorithm for bandit linear optimization.pdf
- An optimal high probability algorithm for the contextual bandit problem, 2010.pdf
- Audibert, Bubeck - Minmax policies for bandit games.pdf
- Audibert, Bubeck - Regret bounds and minimax policies under partial monitoring, 2010.pdf
- Aueur, Cesa, Fischer - Finite-time analysis of the multiarmed bandit problem, 2002.pdf
- Aueur, Cesa, Freund, Schapire - The nonstochastic multiarmed bandit problem, 2002.pdf
- Cesa, Lugosi - Prediction, learning and games, 2006.pdf
- Cesa, Lugosi, Stoltz - Minimizing regret with label efficient prediction, 2005.pdf
- Freedman - On tail probabilities for martingales, 1972.pdf
- Massart - Concentration Inequalities and Model Selection, 2003.pdf
- Stoltz - Incomplete information and internal regret in prediction of individual sequences, 2005.pdf
- TCS08.pdf
- Zinkevich - Online convex programming and generalized infinitesimal gradient ascent (technical), 2003.pdf
- Zinkevich - Online convex programming and generalized infinitesimal gradient ascent, 2003.pdf