We consider the classic stochastic multiarmed bandit problem with a constraint that limits the total cost incurred by switching between actions to be no larger than a given switching budget. For th…
Problem definition: Motivated by several practical selling scenarios that require previous purchases to unlock future options, we consider a multistage assortment optimization problem, where the se…
This paper investigates the impact of pre-existing offline data on online learning in the context of dynamic pricing. We study a single-product dynamic pricing problem over a selling horizon of T p…
We consider the general (stochastic) contextual bandit problem under the realizability assumption, that is, the expected reward, as a function of contexts and actions, belongs to a general function…