In this talk I will describe recent work to increase the data efficiency of reinforcement learning algorithms via data re-weighting. I will introduce a novel re-weighting technique that allows RL agents to make more efficient use of a finite set of samples. The key idea behind this technique is to use importance sampling to re-weight the observed samples so that their empirical distribution matches the expected distribution, thus reducing sampling error in the observed data. In the first part of the talk I will describe how this technique leads to more efficient batch policy evaluation and mini-batch policy gradient reinforcement learning. In the second part of the talk I will describe the extension of this work to batch value function learning and introduce a more data-efficient version of the fundamental temporal difference learning algorithm.
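The core re-weighting idea can be illustrated outside of RL with a simple Monte Carlo estimate. The sketch below is a hypothetical illustration (the distributions, function values, and sample size are invented for this example, not taken from the talk): each observed sample is weighted by the ratio of its expected probability to its empirical frequency, so the weighted sample distribution matches the known expected distribution exactly.

```python
import numpy as np

# Hypothetical illustration of sampling-error correction via
# importance-sampling re-weighting. The true distribution p over a
# discrete outcome space is known; the empirical frequencies p_hat of a
# finite sample differ from p by chance (sampling error).

rng = np.random.default_rng(0)
outcomes = np.array([0, 1, 2])
p = np.array([0.5, 0.3, 0.2])      # known (expected) distribution
f = np.array([1.0, 4.0, 10.0])     # function whose expectation we want

true_value = np.dot(p, f)          # exact E[f(X)]

samples = rng.choice(outcomes, size=50, p=p)
counts = np.bincount(samples, minlength=3)
p_hat = counts / counts.sum()      # empirical distribution of the sample

naive = f[samples].mean()          # ordinary Monte Carlo estimate

# Re-weight each sample by p(x) / p_hat(x): the weighted empirical
# distribution then equals p, removing the sampling error.
weights = p[samples] / p_hat[samples]
corrected = np.mean(weights * f[samples])
```

Whenever every outcome appears at least once in the sample, the corrected estimate recovers the true expectation exactly, regardless of how skewed the draw happened to be; the ordinary Monte Carlo estimate does not.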