Groq Supercharges Fast AI Inference For Meta Llama 3.1
New Largest and Most Capable Openly Available Foundation Model 405B Running on GroqCloud™

Mountain View, Calif. – July 23, 2024 – Groq, a leader in fast AI inference, has launched the Llama 3.1 models powered by its LPU™ AI inference technology. Groq is proud to partner with Meta on this key industry launch and to run the latest Llama 3.1 models, including 405B Instruct, 70B Instruct, and 8B Instruct, at Groq speed. All three models are available on the GroqCloud Dev Console, home to a community of over 300K developers already building on Groq® systems, and on GroqChat for the general public.
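For developers who want to try the models programmatically, the sketch below shows a minimal chat completion request against GroqCloud. It assumes the Groq Python SDK's OpenAI-compatible chat interface and uses an illustrative model ID; check the Dev Console for the exact identifiers exposed for the Llama 3.1 models.

```python
# Minimal sketch: calling a Llama 3.1 model on GroqCloud.
# Assumes the Groq Python SDK (pip install groq) and a GROQ_API_KEY
# environment variable. The model ID below is illustrative -- consult
# the GroqCloud Dev Console for the exact Llama 3.1 identifiers.
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

completion = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # assumed ID; 70B and 405B variants are also listed in the console
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what is new in Llama 3.1."},
    ],
    temperature=0.5,
    max_tokens=256,
)

print(completion.choices[0].message.content)
```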

Mark Zuckerberg, Founder & CEO, Meta, shared:
“I’m really excited to see Groq’s ultra-low-latency inference for cloud deployments of the Llama 3.1 models. This is an awesome example of how our commitment to open source is driving innovation and progress in AI. By making our models and tools available to the community, companies like Groq can build on our work and help push the whole ecosystem forward.”
"Meta is creating the equivalent of Linux, an open operating system, for AI – not only for the Groq LPU which provides fast AI inference, but for the entire ecosystem. In technology, open always wins, and with this release of Llama 3.1, Meta has caught up to the best proprietary models. At this rate, it's only a matter of time before they'll pull ahead of the closed models. With every new release from Meta, we see a significant surge of developers joining our platform. In the last five months we've grown from just a handful of developers to over 300,000, attracted by the quality and openness of Llama, as well as its incredible speed on the Groq LPU.”
Jonathan Ross, CEO & Founder, Groq
The Llama 3.1 models are a significant step forward in terms of capabilities and functionality. As the largest and most capable openly available Large Language Model to date, Llama 3.1 405B rivals industry-leading closed-source models. For the first time, enterprises, startups, researchers, and developers can access a model of this scale and capability without proprietary restrictions, enabling unprecedented collaboration and innovation. With Groq, AI innovators can now tap into the immense potential of Llama 3.1 405B running at unprecedented speeds on GroqCloud to build more sophisticated and powerful applications.
With Llama 3.1, including the 405B, 70B, and 8B Instruct models, the AI community gains access to an increased context length of up to 128K tokens and support across eight languages. Llama 3.1 405B is in a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation. Llama 3.1 405B will unlock new capabilities, such as synthetic data generation and model distillation, and deliver new security and safety tools to further the shared Meta and Groq commitment to building an open and responsible AI ecosystem.
With unprecedented inference speeds for large openly available models like Llama 3.1 405B, developers can unlock new use cases that rely on agentic workflows to deliver seamless yet personalized, human-like responses, such as: patient coordination and care; dynamic pricing that analyzes market demand and adjusts prices in real time; predictive maintenance using real-time sensor data; and customer service that answers inquiries and resolves issues in seconds.
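As an illustration of the kind of single-tool agentic turn these latencies make practical, the sketch below wires a hypothetical order-status lookup (a function invented here for illustration) into a chat completion, assuming the OpenAI-compatible tool-calling format of the Groq Python SDK and an illustrative model ID.

```python
# Sketch of one agentic customer-service turn with a single tool.
# The tool, its schema, and the model ID are illustrative assumptions;
# the request/response shapes follow the OpenAI-compatible format
# exposed by the Groq Python SDK.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])
MODEL = "llama-3.1-70b-versatile"  # assumed ID; see the Dev Console

def lookup_order_status(order_id: str) -> str:
    # Hypothetical backend call; replace with a real order-system lookup.
    return json.dumps({"order_id": order_id, "status": "shipped"})

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_order_status",
        "description": "Return the shipping status for a customer order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

messages = [{"role": "user", "content": "Where is my order 81724?"}]
response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
reply = response.choices[0].message

if reply.tool_calls:
    messages.append(reply)  # keep the assistant's tool request in the history
    for call in reply.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": lookup_order_status(**args),
        })
    # Second round-trip lets the model compose the customer-facing answer.
    final = client.chat.completions.create(model=MODEL, messages=messages)
    print(final.choices[0].message.content)
else:
    print(reply.content)
```

Because each round-trip completes in a fraction of a second on the LPU, a two-call loop like this can still feel conversational to the end user.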
GroqCloud has grown to over 300,000 developers in five months, underscoring the importance of speed when it comes to building the next generation of AI-powered applications at a fraction of the cost of GPUs.
To experience Llama 3.1 models running at Groq speed, visit groq.com, and learn more about this launch from Groq and Meta.
About Groq
Groq builds fast AI inference technology. Groq® LPU™ AI inference technology is a hardware and software platform that delivers exceptional AI compute speed, quality, and energy efficiency. Groq, headquartered in Silicon Valley, provides cloud and on-prem solutions at scale for AI applications. The LPU and related systems are designed, fabricated, and assembled in North America. Try Groq speed at www.groq.com.
Join our AI & tools
news weekly newsletter!