Multi-armed bandit (MAB) algorithms are a class of decision ... A recent paper presented a provably efficient reinforcement learning (RL) algorithm that operates in a linear setting, achieving optimal performance without ...