Llama 3: Meta's new foundation model herd
Meta has released Llama 3, a new set of foundation models for language that natively supports multilinguality, coding, reasoning, and tool usage.
“Llama 3 is a herd of language models,” explained Meta. “Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens.”
These foundation models are designed to support a broad range of AI tasks. They are trained at massive scale, first on a simple objective such as next-word prediction, and then fine-tuned to follow instructions and align with human preferences. The result is a model that can answer questions, write different kinds of creative content, and translate between languages, among many other tasks.
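To make the pre-training objective concrete, the sketch below shows next-token prediction as a cross-entropy loss over shifted sequences. This is an illustrative toy using PyTorch, not Meta's training code; the vocabulary size, model dimensions, and data are all made up.

```python
# Minimal sketch of the next-token-prediction objective used in pre-training.
# Illustrative only: vocabulary size, model dimensions, and data are invented.
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 1000, 64, 32, 4

# A tiny stand-in "language model": embedding -> one Transformer layer -> logits.
embed = nn.Embedding(vocab_size, d_model)
block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (batch, seq_len))  # fake token ids

# Causal mask so each position only attends to earlier positions.
causal_mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

hidden = block(embed(tokens), src_mask=causal_mask)
logits = lm_head(hidden)  # (batch, seq_len, vocab_size)

# Next-word prediction: logits at position t are scored against the token at t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()  # in real pre-training, an optimizer step would follow
```

Instruction tuning and preference alignment then build on this pre-trained model, which is why the same base weights can be adapted to so many downstream tasks.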
Llama 3 is the latest iteration in Meta’s Llama family of models. The team has made a number of key improvements over previous versions of Llama, including:
- Data: The team pre-trained Llama 3 on a corpus of about 15T multilingual tokens, compared to 1.8T tokens for Llama 2, and improved data quality through more careful pre-processing and curation pipelines.
- Scale: Meta pre-trained Llama 3 using 3.8 × 10²⁵ FLOPs, almost 50× more than the largest version of Llama 2 (a rough sanity check of this figure appears after this list).
- Managing Complexity: The team made design choices aimed at keeping the model development process scalable; for example, they opted for a standard dense Transformer architecture rather than a mixture-of-experts model.
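The reported compute is roughly consistent with the common C ≈ 6·N·D rule of thumb for dense Transformer training (N parameters, D training tokens). The back-of-the-envelope check below is my own arithmetic based on the figures quoted above; the formula is a widely used heuristic, not something taken from the paper.

```python
# Back-of-the-envelope check of the reported training compute, using the
# common approximation C ≈ 6 * N * D for dense Transformers (a heuristic,
# not a figure from the Llama 3 paper).
N = 405e9   # parameters in the largest Llama 3 model
D = 15e12   # approximate pre-training tokens ("about 15T")

C = 6 * N * D
print(f"~{C:.1e} FLOPs")  # ~3.6e+25, in the same ballpark as the reported 3.8e25
```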
The researchers evaluated Llama 3 on a variety of benchmark datasets, including MMLU, GSM8K, HumanEval, and IFEval. They found that Llama 3’s largest model performs on par with leading language models such as GPT-4 (OpenAI, 2023a) across a variety of tasks.
Meta is making Llama 3 publicly available under an updated version of the Llama 3 Community License. They hope that the open release of this flagship model will spur a wave of innovation in the research community.
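Because the weights are openly released, the models can be loaded with standard tooling. The sketch below assumes the Hugging Face `transformers` library and uses an assumed instruct-tuned checkpoint name; the exact model IDs and the license-acceptance step are documented on Meta's and Hugging Face's release pages, not in the paper.

```python
# Sketch of loading an openly released Llama 3 checkpoint with Hugging Face
# transformers. The model ID below is an assumption; check the official
# release page for exact names. Access requires accepting the community
# license on Hugging Face first, and the 405B model needs multi-GPU serving.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed ID for the smaller model
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Summarize what a foundation model is."
print(generator(prompt, max_new_tokens=128)[0]["generated_text"])
```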
“We believe there are three key levers in the development of high-quality foundation models: data, scale, and managing complexity,” Meta explained.
The team has also conducted a variety of human evaluations to assess Llama 3’s performance on tasks such as coding, reasoning, and tool use. They found that Llama 3 performs competitively with other leading models.
Meta is using Llama 3 as a foundation model to power its other AI systems. The team continues to work on improving Llama 3’s capabilities, including its safety and reliability, and hopes that the model will play a key role in the development of future AI systems.
This paper is not yet published in a peer-reviewed journal, so these results should be considered preliminary. However, Meta’s work on Llama 3 is a significant step forward in the development of powerful and versatile AI systems.