SILO: Understanding and Leveraging Adaptive Algorithms’ Sensitivity to Change-of-Basis
Abstract: Adaptive gradient methods, such as Adagrad, Adam, and their variants, have found widespread use in machine learning, signal processing, and many other settings. However, many algorithms in this family are not rotationally equivariant: in this talk we examine how a simple change of basis in either parameter space or data space can drastically …
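The lack of rotational equivariance mentioned in the abstract can be seen in a few lines of NumPy. The sketch below (a hand-rolled textbook Adam and gradient descent, not code from the talk) minimizes the same ill-conditioned quadratic twice: once in the original coordinates and once after an orthogonal change of basis, starting from the same point. Gradient descent produces the same trajectory in both bases (up to rounding), while Adam's coordinate-wise normalization makes its trajectory depend on the basis. All names and parameter choices here are illustrative assumptions.

```python
import numpy as np

def adam_traj(grad, x0, steps, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Textbook Adam with bias correction; returns the iterate trajectory."""
    x = x0.astype(float).copy()
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    traj = [x.copy()]
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        mhat = m / (1 - b1 ** t)
        vhat = v / (1 - b2 ** t)
        x = x - lr * mhat / (np.sqrt(vhat) + eps)  # coordinate-wise step
        traj.append(x.copy())
    return np.array(traj)

def gd_traj(grad, x0, steps, lr=0.01):
    """Plain gradient descent; returns the iterate trajectory."""
    x = x0.astype(float).copy()
    traj = [x.copy()]
    for _ in range(steps):
        x = x - lr * grad(x)
        traj.append(x.copy())
    return np.array(traj)

# Ill-conditioned quadratic f(x) = 0.5 * x^T A x.
A = np.diag([100.0, 1.0])
grad = lambda x: A @ x

# Orthogonal change of basis: rotate coordinates by 45 degrees.
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
R = np.array([[c, -s], [s, c]])
grad_rot = lambda y: R.T @ A @ R @ y  # gradient of f(R y)

x0 = np.array([1.0, 1.0])
steps = 20

# If an optimizer is rotationally equivariant, running it in the rotated
# basis and mapping back with R reproduces the original trajectory.
gd_gap = np.max(np.linalg.norm(
    gd_traj(grad, x0, steps) - gd_traj(grad_rot, R.T @ x0, steps) @ R.T,
    axis=1))
adam_gap = np.max(np.linalg.norm(
    adam_traj(grad, x0, steps) - adam_traj(grad_rot, R.T @ x0, steps) @ R.T,
    axis=1))

print(f"gradient descent trajectory gap under rotation: {gd_gap:.2e}")
print(f"Adam trajectory gap under rotation:             {adam_gap:.2e}")
```

Gradient descent's gap is at the level of floating-point rounding, while Adam's is macroscopic: the per-coordinate second-moment scaling is tied to the axes of whatever basis the problem is expressed in.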