Systems | Information | Learning | Optimization
SILO: Relying on the Metrics of Evaluated Agents

Abstract:

Developers and regulators of online platforms and AI systems face the ongoing problem of designing effective evaluation metrics. While tools for collecting and processing data continue to improve, they do not address the problem of "unknown unknowns": fundamental informational limitations on the part of the evaluator. To guide the choice of metrics in the face of this informational problem, we turn to the evaluated agents themselves, who may have more information about how to measure their own outcomes. This talk will cover a theoretical model of this interaction as a principal-agent game, in which we ask: "When does an agent have an incentive to reveal the observability of a metric to their evaluator?"

Bio: 
Serena is a postdoc at Harvard and will join the University of British Columbia as an Assistant Professor of Computer Science in 2026. She has also worked with Google Research at 20% time. Her research focuses on understanding and improving the long-term societal impacts of AI by rethinking algorithms and their surrounding incentives and practices. Her recent work concerns evaluation processes for AI systems and beyond, including how to discover new metrics, how to mitigate gaming, and how to think about representation and fairness.
October 8, 2025
12:30 pm (1h)

Orchard View Room

Serena Wang, Harvard University
