Systems | Information | Learning | Optimization
 

SILO: Self-Improving Transformers: Overcoming Length Generalization Challenges

Abstract:

Large language models can perform algorithmic tasks through test-time computation but struggle to generalize far beyond the task difficulty of the training distribution. These limitations manifest even on simple tasks such as arithmetic, string manipulation, and maze solving, where transformers learn shortcuts rather than the underlying algorithms. While prior solutions modify transformer architectures with task-specific engineering, we overcome these limitations with a general-purpose self-improvement approach using standard transformers. Our method starts with models trained on simple problems and then iteratively uses them to generate training data for progressively harder tasks. Scaling this weak-to-strong training approach yields (seemingly) unbounded improvements in both length and hardness generalization, allowing models to solve problem instances far exceeding the difficulty of those in the training distribution. We find that “controlled sampling” of problem difficulty and the ability to filter out “negative” self-labeled examples are key; without them, generalization performance plateaus. Our results show that careful self-supervision allows small transformers to overcome superficial pattern-matching failures and learn multi-step algorithms.
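
For intuition, below is a minimal, runnable sketch of the weak-to-strong self-improvement loop described above, using toy addition as the task. The names (sample_problems, majority_vote_filter, train) and the majority-vote filter are illustrative assumptions, not the speaker's actual implementation or filtering criterion; a real setup would sample from and fine-tune a transformer rather than use the stub functions here. The point of the sketch is the structure of the loop: controlled growth of problem difficulty plus filtering of self-labeled examples before they enter the training set.

import random
from collections import Counter

# Hypothetical helpers: names and signatures are illustrative, not the authors' code.

def sample_problems(difficulty, n=256):
    """Draw problem instances at a controlled difficulty (here: number of digits)."""
    return [(random.randrange(10 ** difficulty), random.randrange(10 ** difficulty))
            for _ in range(n)]

def model_answer(model, problem):
    """Stand-in for decoding an answer from the current checkpoint.
    Here the 'model' is a plain deterministic function; a real setup would
    sample from a transformer."""
    return model(problem)

def majority_vote_filter(model, problems, k=5):
    """One plausible filter for 'negative' self-labeled examples:
    keep a (problem, answer) pair only if k sampled answers all agree."""
    kept = []
    for p in problems:
        votes = Counter(model_answer(model, p) for _ in range(k))
        answer, count = votes.most_common(1)[0]
        if count == k:
            kept.append((p, answer))
    return kept

def train(train_set):
    """Stand-in for fine-tuning on the accumulated data.
    This toy 'model' answers addition exactly; a real loop would update
    transformer weights on train_set."""
    return lambda problem: problem[0] + problem[1]

# Weak-to-strong self-improvement: start on easy problems, grow difficulty each round.
train_set = [((a, b), a + b) for a, b in sample_problems(difficulty=1)]
model = train(train_set)
for difficulty in range(2, 6):
    problems = sample_problems(difficulty)              # controlled difficulty sampling
    self_labeled = majority_vote_filter(model, problems)
    train_set.extend(self_labeled)                      # add filtered self-labels
    model = train(train_set)                            # retrain for the next round
print(f"final training set size: {len(train_set)}")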

January 29, 2025
12:30 pm (1h)

Orchard View Room

Dimitris Papailiopoulos, UW-Madison

video