Nvidia Unveils Rubin CPX: A Next-Gen AI Chip
At the AI Infrastructure Summit, Nvidia officially announced the Rubin CPX, a new AI chip designed for massive-context computing.
CEO Jensen Huang called it the “first CUDA GPU built specifically for massive-context AI,” highlighting how it can process millions of tokens at once.
This launch marks a major step in AI infrastructure, giving businesses a way to run more advanced models, process huge datasets, and push the boundaries of generative AI.
What Makes Rubin CPX Unique?
The Rubin CPX isn’t just another GPU. It’s designed for long-context AI, where systems need to handle millions of words, lines of code, or frames of video.
Instead of trying to do every AI task in one chip, Nvidia split the process into two phases:
- Context Phase (Rubin CPX): Handles the heavy math work of understanding large inputs, like entire codebases or long videos.
- Generation Phase (Rubin GPUs): Produces output one token at a time, optimized for memory use.
This “disaggregated inference” architecture boosts throughput by nearly 50% per GPU compared to traditional designs.
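The split described above can be sketched in a few lines of Python. This is a schematic illustration only, not an Nvidia API: the class names, token handling, and KV-cache representation are all hypothetical, chosen to show why the context phase is compute-bound (one dense pass over the whole input) while the generation phase is memory-bound (each step re-reads the cache to emit one token).

```python
# Illustrative sketch of "disaggregated inference": the context (prefill) and
# generation (decode) phases run on separate worker pools. All names here are
# hypothetical; this is not Nvidia's software stack.
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt_tokens: list                       # large input: codebase, video, etc.
    kv_cache: dict = field(default_factory=dict)
    output_tokens: list = field(default_factory=list)

class ContextWorker:
    """Stands in for a Rubin CPX: compute-heavy pass over the full input."""
    def prefill(self, req: Request) -> Request:
        # One dense pass builds a KV-cache entry for every input token.
        req.kv_cache = {i: f"kv({tok})" for i, tok in enumerate(req.prompt_tokens)}
        return req

class GenerationWorker:
    """Stands in for a Rubin GPU: memory-bound, one token per step."""
    def decode(self, req: Request, max_new_tokens: int) -> Request:
        for _ in range(max_new_tokens):
            # Each step reads the whole cache but emits a single new token.
            req.output_tokens.append(f"tok{len(req.output_tokens)}")
            req.kv_cache[len(req.kv_cache)] = "kv(new)"
        return req

def serve(req: Request, max_new_tokens: int = 4) -> list:
    req = ContextWorker().prefill(req)                    # context phase (CPX pool)
    req = GenerationWorker().decode(req, max_new_tokens)  # generation phase
    return req.output_tokens
```

Because the two phases stress different resources, routing them to chips specialized for each is what lets Nvidia claim higher throughput per GPU than a design that runs both on the same silicon.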
Key Features
Nvidia packed powerful specs into the Rubin CPX:
- 30 petaflops of compute power.
- 3x faster attention processing than its GB300 systems.
- 128GB of GDDR7 memory with a 512-bit bus.
- Designed for 1M+ token contexts.
- 4 NVENC and 4 NVDEC video processors built in.
To keep costs down, Rubin CPX uses a monolithic die design, making it cheaper to produce than multi-die packages while still delivering extreme performance.
Where It Can Be Used
Rubin CPX is built for industries that need to handle massive datasets:
- Software Development: AI can scan entire repositories with millions of lines of code, helping developers with large-scale projects.
- Video AI: Processes hour-long videos or generates ultra-high-quality content, including 8K editing and 3D rendering.
- Enterprise AI: Powers reasoning-heavy models, retrieval-based AI systems, and multi-step agentic AI applications.
Companies like Cursor, Runway, and Magic are already early adopters, testing Rubin CPX for coding and creative AI tools.
The Vera Rubin NVL144 CPX Platform
Rubin CPX won’t work alone—it’s part of Nvidia’s new Vera Rubin NVL144 CPX platform, which combines:
- 144 Rubin CPX GPUs (context phase).
- 144 Rubin GPUs (generation phase).
- 36 Vera CPUs for orchestration.
Together, they deliver:
- 8 exaflops of compute power.
- 7.5x more performance than Nvidia’s previous GB300 system.
- 100TB memory and 1.7 petabytes/sec bandwidth.
This makes it one of the most powerful AI racks ever built.
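A quick back-of-envelope check ties these rack figures to the per-chip spec quoted earlier. The per-Rubin-GPU number below is inferred from the totals, not an Nvidia spec, and real figures depend on the numeric precision being measured:

```python
# Sanity-check the NVL144 CPX rack totals against the per-chip CPX spec.
cpx_count, cpx_pflops = 144, 30     # 144 Rubin CPX chips at 30 petaflops each
total_eflops = 8.0                  # quoted rack total

cpx_total_ef = cpx_count * cpx_pflops / 1000   # 4.32 exaflops from the CPX side
remainder_ef = total_eflops - cpx_total_ef     # ~3.68 EF left for the Rubin GPUs
per_rubin_pf = remainder_ef * 1000 / 144       # ~25.6 PF each (inferred, not official)

print(cpx_total_ef, round(per_rubin_pf, 1))
```

So the 144 CPX chips alone account for a little over half of the rack's 8 exaflops, with the generation-phase Rubin GPUs supplying the rest.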
Competition and Strategy
Rubin CPX puts Nvidia ahead of competitors such as AMD, Intel, and Tesla, which are also moving into AI chips.
Its use of GDDR7 instead of expensive HBM memory makes it more cost-effective while still offering record-breaking performance.
The chip’s monolithic design also lowers production costs compared to dual-die chips, making it attractive for large-scale deployment.
Availability
The Rubin CPX will be available at the end of 2026 in several formats:
- Server cards for existing data centers.
- Complete rack systems with NVL144 CPX platforms.
- Compute trays for custom upgrades.
This gives companies flexibility in how they adopt the new technology.
News Gist
Nvidia has launched the Rubin CPX AI chip, the first GPU built for massive-context computing.
With 30 petaflops, 128GB of GDDR7 memory, and 3x faster attention processing than the GB300 generation, it powers million-token AI tasks, software development, video processing, and enterprise reasoning models.
FAQs
Q1. What is Nvidia Rubin CPX?
It’s a new AI chip designed for massive-context computing, capable of handling over 1 million tokens.
Q2. When was Rubin CPX announced?
It was announced on September 9, 2025, at the AI Infrastructure Summit in Santa Clara, California.
Q3. What makes Rubin CPX different from other GPUs?
It splits AI processing into two phases—context (Rubin CPX) and generation (Rubin GPUs)—for 50% faster throughput.
Q4. What industries will benefit most?
Software development, video AI, enterprise reasoning models, and generative AI systems.
Q5. When will it be available?
The Rubin CPX is scheduled for release at the end of 2026.
Q6. How powerful is it?
Each Rubin CPX delivers 30 petaflops of compute power and supports 128GB of GDDR7 memory, making it one of the most powerful AI processors ever built.