Systems | Information | Learning | Optimization
 

SILO: Variational inference – reconciling statistical and convergence guarantees

Abstract: As a computational alternative to Markov chain Monte Carlo approaches, variational inference (VI) is becoming increasingly popular for approximating intractable posterior distributions in large-scale Bayesian models due to its comparable efficacy and superior efficiency. Several recent works provide theoretical justifications of VI by proving its statistical optimality for parameter …

SILO: Towards Secure Large Language Models: From Model to System

Abstract: We are witnessing a paradigm shift in AI, transitioning from deep learning models to the era of Large Language Models (LLMs). This shift signifies a transformative advancement in AI, enabling it to be applied to diverse real-world safety-critical applications. Despite these impressive achievements, a fundamental question remains: are …

SILO: Self-Improving Transformers: Overcoming Length Generalization Challenges

Abstract: Large language models can perform algorithmic tasks through test-time computation but struggle to generalize far beyond the task difficulty of the training distribution. These limitations manifest across even simple tasks like arithmetic, string manipulation, and maze solving, where transformers learn shortcuts rather than the underlying algorithms. While prior solutions …