Llama 3: Meta's new foundation model herd
Meta has released Llama 3, a new set of foundation models for language that natively supports multilinguality, coding, reasoning, and tool usage.
“Llama 3 is a herd of language models,” explained Meta. “Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens.”
These foundation models are designed to support a broad range of AI tasks. They are trained at massive scale, first on a simple objective such as next-word prediction, and then fine-tuned to follow instructions and align with human preferences. The result is a model that can answer questions, write different kinds of creative content, and translate between languages, among many other tasks.
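To make the pre-training objective concrete, the sketch below shows next-token prediction as a cross-entropy loss over shifted sequences. This is an illustrative toy using PyTorch, not Meta's training code; the vocabulary size, model dimensions, and data are all made up.

```python
# Minimal sketch of the next-token-prediction objective used in pre-training.
# Illustrative only: vocabulary size, model dimensions, and data are invented.
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 1000, 64, 32, 4

# A tiny stand-in "language model": embedding -> one Transformer layer -> logits.
embed = nn.Embedding(vocab_size, d_model)
block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (batch, seq_len))  # fake token ids

# Causal mask so each position only attends to earlier positions.
causal_mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

hidden = block(embed(tokens), src_mask=causal_mask)
logits = lm_head(hidden)  # (batch, seq_len, vocab_size)

# Next-word prediction: logits at position t are scored against the token at t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()  # in real pre-training, an optimizer step would follow
```

Instruction tuning and preference alignment then build on this pre-trained model, which is why the same base weights can be adapted to so many downstream tasks.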
Llama 3 is the latest iteration in Meta’s Llama family of models. The team has made a number of key improvements over previous versions of Llama, including:
- Data: The team pre-trained Llama 3 on a corpus of about 15T multilingual tokens, compared to 1.8T tokens for Llama 2, and improved data quality through more careful pre-processing and curation pipelines.
- Scale: Meta pre-trained Llama 3 using 3.8 × 10²⁵ FLOPs, almost 50× more than the largest version of Llama 2 (a rough sanity check of this figure appears after this list).
- Managing Complexity: The team made design choices aimed at keeping the model development process scalable; for example, they opted for a standard dense Transformer architecture rather than a mixture-of-experts model.
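The reported compute is roughly consistent with the common C ≈ 6·N·D rule of thumb for dense Transformer training (N parameters, D training tokens). The back-of-the-envelope check below is my own arithmetic based on the figures quoted above; the formula is a widely used heuristic, not something taken from the paper.

```python
# Back-of-the-envelope check of the reported training compute, using the
# common approximation C ≈ 6 * N * D for dense Transformers (a heuristic,
# not a figure from the Llama 3 paper).
N = 405e9   # parameters in the largest Llama 3 model
D = 15e12   # approximate pre-training tokens ("about 15T")

C = 6 * N * D
print(f"~{C:.1e} FLOPs")  # ~3.6e+25, in the same ballpark as the reported 3.8e25
```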
The researchers evaluated Llama 3 on a variety of benchmark datasets, including MMLU, GSM8K, HumanEval, and IFEval. They found that Llama 3’s largest model performs on par with leading language models such as GPT-4 (OpenAI, 2023a) across a variety of tasks.
Meta is making Llama 3 publicly available under an updated version of the Llama 3 Community License. They hope that the open release of this flagship model will spur a wave of innovation in the research community.
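Because the weights are openly released, the models can be loaded with standard tooling. The sketch below assumes the Hugging Face `transformers` library and uses an assumed instruct-tuned checkpoint name; the exact model IDs and the license-acceptance step are documented on Meta's and Hugging Face's release pages, not in the paper.

```python
# Sketch of loading an openly released Llama 3 checkpoint with Hugging Face
# transformers. The model ID below is an assumption; check the official
# release page for exact names. Access requires accepting the community
# license on Hugging Face first, and the 405B model needs multi-GPU serving.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed ID for the smaller model
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Summarize what a foundation model is."
print(generator(prompt, max_new_tokens=128)[0]["generated_text"])
```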
“We believe there are three key levers in the development of high-quality foundation models: data, scale, and managing complexity,” Meta explained.
The team has also conducted a variety of human evaluations to assess Llama 3’s performance on tasks such as coding, reasoning, and tool use. They found that Llama 3 performs competitively with other leading models.
Meta is using Llama 3 as a foundation model to power its other AI systems. The team continues to work on improving Llama 3’s capabilities, including its safety and reliability, and hopes that the model will play a key role in the development of future AI systems.
This paper is not yet published in a peer-reviewed journal, so these results should be considered preliminary. However, Meta’s work on Llama 3 is a significant step forward in the development of powerful and versatile AI systems.