Multi-view representation learning for speech, language, and beyond


Many types of multi-dimensional data have a natural division into two “views”, such as audio and video or images and text. Multi-view learning refers to techniques that use multiple views of data to learn improved models for each of the views. Theoretical and empirical results indicate that multi-view techniques can improve over single-view ones in certain settings. In many cases, multiple views help by reducing noise in some sense. In this talk, I will focus on multi-view learning of representations (features) using canonical correlation analysis (CCA) and related techniques. I will present nonlinear extensions, including deep CCA, in which the learned representations are the outputs of deep neural networks, as well as other variants. Finally, I will give recent empirical results.

October 26 @ 12:30 pm (1h)

Discovery Building, Orchard View Room

Karen Livescu