SILO: Towards Plurality: Foundations for Learning from Diverse Human Preferences
Abstract: Large models pre-trained on internet-scale data are often not ready for safe deployment out of the box. They are heavily fine-tuned and aligned using large quantities of human preference data. When we want to align an AI/ML model to human preferences or values, it is worthwhile to ask …