Title: Beyond the standard benchmarks: on the importance of robust models and where to find them
Abstract: A common recipe for writing a machine learning paper is to pick a dataset, train a model on its “train” split, and report metrics, such as accuracy, evaluated on the “test” split. While this framework works great for competing on leaderboards and reaching for the SOTA, the resulting models often tend to be brittle and fail to generalize even to slightly different datasets. In this talk, I will share three case studies on CREPE, CLIP, and Whisper, and discuss how data collection plays a crucial role in real-world generalization. Based on these studies, this talk aims to emphasize the importance of considering robustness when developing any machine learning model.
Orchard View Room, Virtual
Jong Wook Kim