SILO: Toward improving LLM English perplexity
Abstract
Over the past decade, there has been a significant effort to improve the performance of Large Language Models (LLMs). The core of LLM training is next-word prediction over a large corpus, with quality measured by the resulting perplexity. Models with lower perplexity consistently perform better across a variety of downstream tasks …