We propose an algorithm for minimizing expected risk F that shares some properties with randomized incremental aggregated gradient methods as well as dynamic sampling methods. Unlike aggregated gradient methods, which are designed to revisit the training set many times, the algorithm proposed here is well suited for problems involving a very large training set, where one (or a few) passes over the data suffice to produce an acceptable solution. At every iteration, the algorithm updates a collection of gradients from certain past iterations, and as in dynamic sample methods additional gradients are evaluated at the current point. By allowing the amount of information to increase at every iteration the algorithm is able to achieve linear convergence in expected risk F (not just in training error). Numerical results on machine learning test problems illustrate the performance of the method. 
October 16, 2015
12:30 pm (1h)
Discovery Building, Orchard View Room
Stefan Solntsev