AI Papers Reader

Democratizing Access to Foundation Model Internals: NNsight and NDIF

Large language models (LLMs) are rapidly advancing artificial intelligence, but research on these models is constrained by their massive size: running them demands significant computing resources, and access to their inner workings is often restricted. Researchers typically fall back on black-box APIs or smaller, less capable models, hindering deeper exploration of the mechanisms behind these models' capabilities.

This new paper, “NNsight and NDIF: Democratizing Access to Foundation Model Internals,” addresses these challenges by introducing two new open-source tools: NNsight and NDIF.

NNsight is a Python library that provides a flexible, transparent way to interact with PyTorch-based models. It lets researchers define interventions on model internals by building computation graphs that are executed later, so users can probe activations, manipulate gradients, and even implement custom attention mechanisms.
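The core idea can be sketched as a stdlib-only toy: interventions and probes are recorded first and only applied when the model actually runs. This is a hypothetical illustration of the deferred-execution concept, not NNsight's real API (the `Trace` class, `edit`, and `save` names here are invented for the example):

```python
# Toy illustration of deferred interventions on a model's internals.
# Hypothetical sketch of the idea behind NNsight (record interventions
# first, execute later) -- NOT NNsight's actual interface.

class Trace:
    """Records interventions and activation probes, then runs the model."""

    def __init__(self, layers):
        self.layers = layers          # the "model": a list of layer functions
        self.edits = {}               # layer index -> function rewriting that output
        self.saved = {}               # layer index -> captured activation

    def edit(self, idx, fn):
        self.edits[idx] = fn          # deferred: nothing executes yet

    def save(self, idx):
        self.saved[idx] = None        # mark this activation for capture

    def run(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i in self.edits:
                x = self.edits[i](x)  # apply the deferred intervention
            if i in self.saved:
                self.saved[i] = x     # probe: capture the (edited) activation
        return x


# A toy 3-"layer" model operating on plain numbers.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]

trace = Trace(layers)
trace.save(1)                        # probe layer 1's output
trace.edit(1, lambda x: x * 10)      # intervene: scale layer 1's output
out = trace.run(5)

print(trace.saved[1])  # -> 120  (the captured, already-edited activation)
print(out)             # -> 117  (the edit propagates to the final output)
```

Because the intervention graph is a plain description built before execution, the same experiment can be run locally or shipped to a remote host, which is what makes the NDIF pairing natural.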

NDIF (National Deep Inference Fabric) is a collaborative research platform that provides researchers with access to foundation-scale LLMs remotely. This allows scientists to experiment on models that would otherwise be too large to run on their local hardware, freeing them from the constraints of limited resources.

The combination of NNsight and NDIF provides researchers with a powerful toolkit for studying model internals in a way that was previously unattainable.

In practice, a researcher might use NNsight to capture a model's hidden states mid-forward pass, then dispatch that same experiment to NDIF-hosted hardware when the model is too large to run locally.

Benefits of NNsight and NDIF

NNsight and NDIF represent a significant step towards democratizing access to the internals of foundation models, providing researchers with the tools they need to unlock the full potential of these transformative technologies. This will empower scientists to make breakthroughs in our understanding of how LLMs learn, leading to the development of more powerful, reliable, and interpretable AI systems in the future.