Advances in Gradient Descent Methods for Non-Convex Optimization

Thanks to a flurry of recent research motivated by applications to machine learning, the convergence of gradient descent methods for smooth, unconstrained non-convex optimization is now well understood in the centralized setting. In this talk I will discuss our progress toward understanding how the convergence of gradient descent methods (including SGD and accelerated variants) is affected by:

  1. The presence of linear inequality constraints,
  2. Compression of gradients to reduce communication during distributed computation.
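To make the second setting concrete, here is a minimal sketch of one common form of gradient compression (sign compression, in the style of signSGD-type methods); this scheme is an illustrative assumption on my part, not necessarily the one analyzed in the talk:

```python
import numpy as np

def compress_sign(grad):
    """Compress a gradient to one bit per coordinate plus a single scale.

    A worker transmits only the signs and one float, instead of the
    full-precision gradient, reducing communication cost.
    """
    scale = np.mean(np.abs(grad))          # one float to transmit
    signs = np.sign(grad).astype(np.int8)  # conceptually one bit per coordinate
    return scale, signs

def decompress_sign(scale, signs):
    """Reconstruct an approximate gradient from its compressed form."""
    return scale * signs.astype(np.float64)

# Usage: a worker compresses its local gradient before communicating it.
rng = np.random.default_rng(0)
g = rng.normal(size=5)
scale, signs = compress_sign(g)
g_hat = decompress_sign(scale, signs)
```

The reconstruction preserves the direction of each coordinate but collapses all magnitudes to a common scale, which is exactly the kind of distortion whose effect on convergence guarantees the talk examines.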

In both cases, convergence to a second-order stationary point currently requires a linear dependence on the dimension, in methods that are otherwise (almost) dimension-free. Whether such dependence is necessary remains an open question.


Joint work with Dmitrii Avdiukhin (Indiana University, Bloomington) and Chi Jin (Princeton).

May 13 @ 12:30 pm (1h)

Zoom

Grigory Yaroslavtsev