Nvidia Pitches DGX SuperPOD Subscription, DPU Servers To Enterprises

For enterprises that don’t want to spend $7 million to $60 million for Nvidia’s DGX SuperPOD AI supercomputer, the chipmaker is now giving customers the option of paying $90,000 a month for remote access that includes new cloud-based software for accelerating commercial AI deployments. The GPU juggernaut also disclosed that new Nvidia-certified servers are coming from OEMs with the chipmaker’s data processing units and Arm-based CPUs.

Nvidia is continuing its major sales push into the enterprise with a new subscription-based solution to help accelerate commercial AI deployments as well as a new batch of certified GPU servers that will include the company’s BlueField DPUs and Arm-based CPUs.

The Santa Clara, Calif.-based company announced at Computex 2021 the Nvidia Base Command Platform, a new cloud-hosted development hub offered jointly with storage vendor NetApp. The offering combines access to Nvidia’s DGX SuperPOD AI supercomputers with NetApp’s data management solution, hosted at an Equinix data center, for a monthly subscription that starts at $90,000.

[Related: Nvidia’s Manuvir Das: We ‘Mimicked’ VMware For Enterprise AI]

Manuvir Das, Nvidia’s head of enterprise computing, said Base Command turns Nvidia’s DGX AI systems into an “internal shareable environment” that allows multiple researchers and data scientists to utilize the same GPU resources to work on AI projects at the same time. This will reduce the complexity of managing AI workloads while making high-performance GPU compute more accessible for customers, he added.

“What we’re doing here is we’re really lowering the barrier to entry to experience this best-of-breed system and equipment,” he said in a pre-briefing with journalists and analysts.

The Base Command software will provide access to a wide range of AI and data science tools, including the Nvidia NGC software catalog, and it will serve as a single pane of glass that makes it easy to share resources through a graphical user interface and command line APIs. The software will also include monitoring and reporting dashboards.

Das said Base Command is meant for customers who don’t have their own DGX SuperPOD, which can cost anywhere from $7 million to $60 million depending on the size of deployment, and he expects those customers will eventually either buy their own SuperPOD system or move to the cloud.

An important feature of Base Command is the ability to work in a hybrid cloud environment. As such, Nvidia said Google Cloud plans to add support for Base Command in its marketplace later this year, and Base Command customers will also be able to deploy their workloads to Amazon Web Services’ SageMaker service.

“This hybrid AI offering will allow enterprises to write once and run anywhere with flexible access to multiple Nvidia A100 Tensor Core GPUs, speeding AI development for enterprises that leverage on-demand accelerated computing,” said Manish Sainani, director of product management for ML infrastructure at Google Cloud, in a statement.
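For readers who want a concrete picture of what that hybrid path can look like, the sketch below shows a generic GPU training job submitted through the sagemaker Python SDK to an A100-backed cloud instance. This is not Base Command’s own interface, and the script name, S3 bucket and IAM role are placeholder assumptions.

```python
# Minimal sketch: launching a GPU training job on AWS SageMaker with the
# sagemaker Python SDK. The entry point, S3 path and IAM role below are
# placeholder assumptions; ml.p4d.24xlarge is AWS's A100-backed instance type.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",            # hypothetical training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_count=1,
    instance_type="ml.p4d.24xlarge",   # 8x Nvidia A100 Tensor Core GPUs
    framework_version="1.8.1",
    py_version="py36",
)

# Kick off training against a (hypothetical) dataset already staged in S3.
estimator.fit({"training": "s3://example-bucket/datasets/my-dataset/"})
```

The same training script could, in principle, run unchanged against on-premises DGX hardware or another cloud target, which is the “write once and run anywhere” point being made above.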

Das said adding a subscription pricing model for Nvidia’s expensive DGX SuperPOD clusters is about giving enterprise customers consumption options that they are comfortable and familiar with.

“All we’re saying is, we now think of ourselves as a mainstream provider of hardware and software to enterprise customers, so in that vein, we’re happy to jump on board with any of these models,” he said.

Das added that Base Command will be sold through Nvidia’s channel partners.

“We believe very strongly in the channel, and we’ve worked really well with the channel now for the offerings we already have,” he said. “You should fully expect the same philosophy to apply to Base Command as well. It’s just another offering from our portfolio.”

Base Command is in early access now, and the subscription program will launch in the summer, according to the company. Customers will be able to start with DGX SuperPOD deployments of three DGX systems and scale up to 20 nodes total.

New Nvidia-Certified Systems Coming With DPUs, Arm CPUs

As part of Nvidia’s enterprise push, the company said it is expanding its Nvidia-Certified Systems program to include servers from OEMs that will incorporate its BlueField data processing units and eventually Arm-based CPUs.

New Nvidia-certified servers with BlueField-2 DPUs are expected to arrive later this year from ASUS, Dell Technologies, Gigabyte, QCT and Supermicro. They will join the more than 50 OEM GPU servers already certified to run the Nvidia AI Enterprise software suite for AI and data analytics workloads as well as Nvidia Omniverse Enterprise for design collaboration and simulation.

Nvidia introduced the BlueField-2 DPU last year as a component that can replace a standard network interface card and offload critical networking, storage and security workloads from the CPU while enabling new security and hypervisor capabilities in data centers.

The company said a single BlueField-2 DPU “can provide the same data center services that could require up to 125 CPU cores, freeing up server CPU cycles to run a broad range of business-critical applications.” The DPUs are supported by Red Hat, VMware and other software infrastructure vendors.

In a recent interview with CRN, Das said he expects BlueField-2’s security features, which include real-time network visibility, detection and response capabilities, will compel enterprises to refresh their servers with DPUs installed.

“It’s actually the NIC on the server, where all the packets are flowing through anyway, so it’s very natural and efficient to inspect the packet right there while you’re already processing it,” he said.

But Das said he also expects the total cost of ownership benefits of the DPU’s CPU-offload capabilities to compel customers to adopt the component when they are getting ready for a server refresh.

“I think in that case, the entire [CPU] offloading capability will be what drives the choice of DPU in the refresh, because my servers can do 30 percent more work if I put a pretty cost-effective DPU in there, so, of course, that would be attractive,” he said.
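The arithmetic behind that pitch can be sketched roughly as follows. Only the 30 percent figure comes from Das’ remark; the fleet size and prices below are invented purely to make the comparison concrete.

```python
# Rough, illustrative TCO arithmetic for DPU offload. Only the 30 percent
# per-server capacity gain comes from Das' quote; every other number is an
# assumption made up for the sake of the example.

servers = 100                 # assumed fleet size
extra_work_per_server = 0.30  # "30 percent more work" per server with a DPU

# Effective capacity gained by adding a DPU to every server ...
effective_extra_servers = servers * extra_work_per_server

# ... versus buying that capacity as additional servers outright.
assumed_server_cost = 15_000  # hypothetical cost of one additional server
assumed_dpu_cost = 2_000      # hypothetical cost of one BlueField-2 DPU

cost_of_new_servers = effective_extra_servers * assumed_server_cost
cost_of_dpus = servers * assumed_dpu_cost

print(f"Capacity equivalent to ~{effective_extra_servers:.0f} extra servers")
print(f"New servers: ${cost_of_new_servers:,.0f} vs. DPUs: ${cost_of_dpus:,.0f}")
```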

Next year, the Nvidia-Certified Systems program will introduce another new component type to GPU servers — Arm-based CPUs — which will mark an important milestone for the company as it plans to embrace the alternative chip architecture and acquire Arm for $40 billion.

The first batch of Nvidia-certified servers with Arm CPUs will come from Gigabyte and Wiwynn, and they will use CPUs based on Arm’s Neoverse CPU designs.

To help developers take advantage of Arm CPUs, the company has collaborated with Gigabyte to introduce an Arm HPC Developer Kit, which includes hardware and software for high-performance computing, AI and scientific computing application development. The server kit will use Arm-based Altra CPUs from Ampere Computing, an up-and-coming server chipmaker, and it will feature two Nvidia A100 GPUs, two BlueField-2 DPUs and the Nvidia HPC software development kit.
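As a loose illustration of developing on Arm the same way as on x86, the snippet below is the kind of sanity check a developer might run on such a kit to confirm that GPU-accelerated code behaves identically on an Arm host. CuPy is used here only as a stand-in for any CUDA-enabled Python library; it is an assumption, not a documented part of the kit.

```python
# Minimal sketch: verifying that CUDA workloads run unchanged on an Arm-based
# host with an Nvidia GPU. CuPy is assumed to be installed and stands in for
# any CUDA-accelerated library; it is not part of the dev kit itself.
import platform
import cupy as cp

print("Host architecture:", platform.machine())   # e.g. 'aarch64' on Arm

# Query the first GPU (an A100 on the developer kit described above).
props = cp.cuda.runtime.getDeviceProperties(0)
print("GPU:", props["name"].decode())

# Run a small matrix multiply on the GPU; the code is identical on x86 and Arm.
a = cp.random.rand(4096, 4096, dtype=cp.float32)
b = cp.random.rand(4096, 4096, dtype=cp.float32)
c = a @ b
cp.cuda.Stream.null.synchronize()
print("Result checksum:", float(c.sum()))
```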

Das said Nvidia is working with multiple Arm CPU vendors for future Nvidia-certified systems, and he expects OEM systems will eventually include the company’s own Arm-based data center CPU, Grace, after it has been introduced in Nvidia’s own DGX systems in 2023.