Beyond worst-case: instance-dependent optimality in reinforcement learning
Martin Wainwright
University of California, Berkeley