OLMOTRACE: A New Tool to Understand How Language Models Learn from Trillions of Tokens

Researchers have unveiled OLMOTRACE, a novel system that allows users to trace the outputs of language models (LMs) back to the vast datasets they were trained on. This tool provides unprecedented insight into how LMs learn and generate text, offering a way to fact-check claims, trace the origins of seemingly creative expressions, and probe potential biases or hallucinations.

Modern language models are trained on massive text corpora consisting of trillions of tokens, which makes it a major challenge to understand where a model "learned" any specific piece of information. OLMOTRACE addresses this problem by finding and highlighting verbatim matches between a language model's output and documents within its training dataset.

Here’s how it works: A user enters a prompt, and the language model generates a response. OLMOTRACE then analyzes this response, breaking it down into phrases and searching for those exact phrases within the model’s training data. When a match is found, OLMOTRACE displays the document containing the matched text and highlights the specific phrases that were found.
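To make the mechanism concrete, here is a minimal sketch of verbatim span matching over a toy in-memory corpus. The function name, whitespace tokenization, and Python substring search are simplifications of mine, not the paper's implementation; the real system matches token sequences against an indexed corpus of trillions of tokens.

```python
# Toy illustration of verbatim span tracing (not the real OLMOTRACE code).
# Given a model response and a tiny in-memory "training corpus", find the
# longest token spans of the response that appear verbatim in some document.

def find_verbatim_spans(response, corpus, min_tokens=4):
    """Return (span_text, doc_id) pairs for maximal verbatim matches."""
    tokens = response.split()  # crude whitespace tokenization for the demo
    matches = []
    i = 0
    while i < len(tokens):
        best = None
        # Grow the span starting at position i as long as it still matches.
        for j in range(i + min_tokens, len(tokens) + 1):
            span = " ".join(tokens[i:j])
            hits = [doc_id for doc_id, text in corpus.items() if span in text]
            if hits:
                best = (span, hits[0], j)
            else:
                break  # longer spans can't match if this one doesn't
        if best:
            span, doc_id, j = best
            matches.append((span, doc_id))
            i = j  # resume scanning after the matched span
        else:
            i += 1
    return matches

corpus = {
    "doc_042": "Celine Dion was born on March 30, 1968, in Charlemagne, Quebec.",
}
response = ("Celine Dion was born on March 30, 1968, in Charlemagne, Quebec, "
            "and rose to international fame in the 1990s.")
for span, doc_id in find_verbatim_spans(response, corpus):
    print(f"matched {span!r} -> {doc_id}")
```

Scanning every document for every candidate span, as this sketch does, obviously cannot scale to trillions of tokens; the index described next is what makes the lookups fast in practice.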

For example, imagine you ask a language model, “Who is Celine Dion?” The model might respond with a paragraph describing her life and career. OLMOTRACE can then highlight sections of that paragraph that directly match text found in the model’s training data, such as a specific sentence describing her birth date and place. Users can then click through to view the original document where that sentence appeared, which provides context for where the language model learned that information.

OLMOTRACE utilizes a technology called infini-gram to efficiently index the massive datasets used to train these models. Infini-gram builds a suffix array over the tokenized corpus, so exact-match lookups of arbitrary-length phrases become binary searches rather than scans of the data. This allows OLMOTRACE to return tracing results within seconds.
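The snippet below sketches that suffix-array idea: once all suffix start positions are sorted, counting exact occurrences of any phrase takes just two binary searches. It indexes characters of a tiny string for readability; infini-gram itself indexes token IDs across trillions of tokens, and all names here are my own, not infini-gram's API.

```python
import bisect

# Minimal suffix-array sketch of the exact-match lookup behind infini-gram.
# Requires Python 3.10+ for bisect's `key` argument.

def build_suffix_array(text):
    """Sort all suffix start positions lexicographically (O(n^2 log n) toy)."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def count_occurrences(text, sa, query):
    """Count exact occurrences of `query` via binary search on the suffix array."""
    prefix = lambda i: text[i:i + len(query)]
    lo = bisect.bisect_left(sa, query, key=prefix)
    hi = bisect.bisect_right(sa, query, key=prefix)
    return hi - lo  # number of suffixes beginning with `query`

corpus = ("the space needle was built for the 1962 world's fair. "
          "the space race began years earlier.")
sa = build_suffix_array(corpus)
print(count_occurrences(corpus, sa, "the space"))  # -> 2
```

Because the suffix array is built once offline, each query costs only logarithmic time in the corpus size, which is why tracing an entire model response can finish in seconds.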

The researchers demonstrated three potential use cases for OLMOTRACE:

  • Fact-Checking: Users can verify the accuracy of factual claims made by the model by tracing them back to their source documents. If a model claims “The Space Needle was built for the 1962 World’s Fair,” OLMOTRACE can surface the training data that taught it this fact.
  • Tracing “Creative” Expressions: Even seemingly original phrases or ideas generated by a language model may have roots in its training data. OLMOTRACE can reveal the source of inspiration for these creative expressions. For instance, the phrase “I’m going on an adventure,” generated in a Tolkien-style story, was traced to a Hobbit fan-fiction document in the training data.
  • Understanding Math Capabilities: OLMOTRACE can reveal how models learned to perform arithmetic operations and solve math problems by tracing correct solutions directly back to relevant training examples.

OLMOTRACE is available within the AI2 Playground for three flagship OLMo models, including OLMo-2-32B-Instruct. Tracing covers each model's complete training data, spanning the pre-training, mid-training, and post-training stages. The tool is also open-source.