In this talk, we present an analysis of deep neural networks through convex optimization and geometric (Clifford) algebra. We begin with exact convex optimization formulations for ReLU neural networks, which show that deep networks can be trained to global optimality via convex programs. Our results further establish an equivalent characterization of neural networks as high-dimensional convex Lasso models that employ a discrete set of wedge product features and apply sparsity-inducing convex regularization to fit the data. This framework yields an intuitive geometric interpretation in which the optimal neurons represent signed volumes of parallelotopes formed by data vectors. Specifically, we show that the Lasso dictionary is constructed from a discrete set of wedge products of input samples, with deeper network architectures leading to geometric reflections of these features. The analysis also reveals that standard convolutional neural networks can be globally optimized in fully polynomial time. Numerical simulations validate these claims, showing that the proposed convex approach is faster and more reliable than standard local search heuristics such as stochastic gradient descent and its variants. We also present a layerwise convex optimization scheme whose performance is comparable to non-convex end-to-end optimization, and discuss extensions to batch normalization, generative adversarial networks, transformers, and diffusion models.
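The flavor of the Lasso equivalence above can be loosely illustrated with a toy one-dimensional sketch. This is illustrative only and is not the talk's formulation: the dictionary here uses simple ReLU "kink" features anchored at the data points rather than the wedge-product features described in the abstract, and the Lasso is solved with a generic proximal gradient (ISTA) loop.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
x = np.sort(rng.uniform(-1.0, 1.0, n))
y = np.sin(3.0 * x) + 0.05 * rng.standard_normal(n)

# Toy dictionary: one ReLU feature per data point (NOT the talk's
# wedge-product dictionary; chosen only to make the sketch concrete).
Phi = np.maximum(x[:, None] - x[None, :], 0.0)   # shape (n, n)

# Solve the convex Lasso  min_w 0.5*||Phi w - y||^2 + lam*||w||_1
# with ISTA: gradient step followed by soft-thresholding.
lam = 0.05
L = np.linalg.norm(Phi, 2) ** 2                   # Lipschitz constant
w = np.zeros(n)
for _ in range(5000):
    grad = Phi.T @ (Phi @ w - y)
    z = w - grad / L
    w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)

# Sparsity-inducing regularization selects only a few active features,
# mirroring the claim that optimal networks use a sparse set of neurons.
print(int(np.sum(np.abs(w) > 1e-4)))
```

The sparsity of the recovered weight vector is the point of the sketch: the l1 penalty selects a small subset of dictionary atoms, analogous to the sparse set of optimal neurons in the convex reformulation.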
Bio:
Mert Pilanci is an assistant professor of Electrical Engineering at Stanford University. He received his Ph.D. in Electrical Engineering and Computer Science from UC Berkeley in 2016. Prior to joining Stanford, he was an assistant professor of Electrical Engineering and Computer Science at the University of Michigan. In 2017, he was a Math+X postdoctoral fellow working with Emmanuel Candès at Stanford University. Mert’s research interests are in neural networks, machine learning, optimization, and signal processing. His group develops theory and algorithms for solving large-scale optimization problems in machine learning. His research also seeks to develop safe and interpretable artificial intelligence and information-theoretic foundations of distributed computing.
Discovery Building, Orchard View Room
Mert Pilanci, Stanford University