As the AI infrastructure race intensifies, NVIDIA has struck a notable licensing agreement with AI inference startup Groq, one that reshapes both companies’ leadership and technology roadmaps without triggering a full acquisition.
Groq announced that it has entered into a non-exclusive inference technology licensing agreement with NVIDIA, aimed at expanding access to high-performance, low-cost AI inference at a global scale. The deal also brings a significant leadership shift: Jonathan Ross, founder of Groq, and Sunny Madra, president of Groq, along with other members of the Groq team, will join NVIDIA to help advance and scale the licensed technology.
Despite the personnel move, Groq emphasised that it will continue to operate as an independent company, with Simon Edwards stepping in as Chief Executive Officer.
Why AI Inference Has Become the New Battleground
AI inference, where trained models respond to real-time user requests, has emerged as one of the most critical layers of the AI stack. While training large models demands massive compute, inference determines how efficiently and economically AI can be deployed at scale across enterprises.
Groq has positioned itself squarely in this space, building low-latency processors designed to serve real-time AI workloads more efficiently. NVIDIA, long dominant in AI training through its GPUs, has increasingly focused on strengthening its inference capabilities as enterprise demand shifts from experimentation to production.
According to media reports, NVIDIA CEO Jensen Huang told employees that the company plans to integrate Groq’s low-latency processors into NVIDIA’s AI factory architecture, expanding the platform’s ability to serve a broader range of inference and real-time workloads.
Leadership Transition Without an Acquisition
While the licensing deal has fuelled speculation, Groq has been explicit about its independence. “Groq will continue to operate as an independent company with Simon Edwards stepping into the role of Chief Executive Officer,” the company said in its official statement.
Groq also reassured customers that GroqCloud, its inference service platform, will continue to operate without interruption during the transition.
Financial terms of the licensing agreement have not been officially disclosed. However, media reports citing Alex Davis, CEO of Disruptive, which led Groq’s latest financing round, suggest the transaction could be valued at approximately $20 billion in cash. If confirmed, this would surpass NVIDIA’s 2019 acquisition of Mellanox for around $7 billion, though NVIDIA has not characterised the deal as an acquisition.
Groq’s Journey and Market Position
Groq was founded in 2016 by a group of engineers including Jonathan Ross, one of the creators of Google’s Tensor Processing Unit (TPU). The company has focused exclusively on inference, a segment where performance, latency, and cost efficiency increasingly matter to enterprises deploying AI at scale.
In its most recent funding round in September, Groq was valued at $6.9 billion, having raised $750 million from investors including BlackRock, Neuberger Berman, Samsung, Cisco, Altimeter, and 1789 Capital.
The inference market has become increasingly competitive, with cloud providers, hyperscalers, and chip designers all seeking to optimise how AI models are served in production environments. For NVIDIA, licensing Groq’s technology, rather than absorbing the company outright, offers a way to accelerate innovation while retaining architectural flexibility.
This agreement underscores a broader shift in the AI ecosystem. As AI adoption matures, infrastructure decisions are moving beyond raw compute toward latency, efficiency, and deployment economics. Inference, once treated as a downstream concern, is now central to how AI delivers value in real-world applications.
For Groq, the deal provides validation of its inference-first approach while allowing it to remain operationally independent. For NVIDIA, it adds another layer to its evolving AI factory strategy, one that increasingly spans training, inference, and real-time workloads.
As enterprises push AI from pilot projects into core operations, partnerships like this highlight how the AI stack is being reshaped not just by scale, but by speed.