NVIDIA B300 HGX GPU Rental
Now available!
Built for training and inference of models with hundreds of billions to trillions of parameters
Output performance increased by 30x
This curve shows the key parameters that determine token revenue output for an AI factory. The vertical axis shows GPU throughput in tokens per second (TPS) for a one-megawatt (MW) AI factory, and the horizontal axis shows per-user interactivity (response speed), also in TPS. At the operating point that best balances throughput and response speed, HGX B300 can increase total AI factory output by 30x compared with the NVIDIA Hopper architecture, maximizing token revenue.
Scalable training for large AI models
The HGX B300 platform can deliver up to 2.6x higher training performance for large language models such as DeepSeek-R1. With more than 2 TB of high-speed memory and 14.4 TB/s of NVLink switch bandwidth, it supports large-scale model training and high-throughput GPU-to-GPU communication.
The Blackwell Ultra platform supports real-time video generation from world foundation models such as NVIDIA Cosmos™, delivering 30x higher performance than Hopper. This enables customized, photorealistic, spatially and temporally consistent video for physical AI applications.
Technology breakthroughs
The latest NVIDIA technologies deliver unprecedented performance and efficiency for AI workloads
AI inference
Compared with NVIDIA Blackwell GPUs, NVIDIA Blackwell Ultra significantly improves Tensor Core performance, delivers 2x attention-layer acceleration, and increases AI floating-point compute performance (FLOPS) by 1.5x.
High-capacity HBM3E architecture
NVIDIA Blackwell Ultra GPUs provide 1.5x the HBM3E memory capacity of the previous generation and combine it with greater AI compute capability, significantly increasing AI inference throughput, especially at the longest context lengths.
Fifth-generation NVIDIA NVLink
Unlocking the full potential of accelerated computing requires seamless communication between every GPU. Fifth-generation NVIDIA NVLink™ is a scalable interconnect technology that can significantly improve the performance of AI inference models.
B300 GPU Server Specifications
| Feature | Specification |
|---|---|
| GPU Type | NVIDIA B300 Tensor Core GPU |
| GPU Memory | 288 GB |
| CPU Model | 2x Intel Xeon Emerald Rapids 6767P (SSTPP 2.7-2.9GHz) |
| Droplet vCPU | 224 |
| Droplet Memory | 3,600 GiB |
| Boot Disk (NVMe) | 2,046 GiB |
| Scratch Disk | 40,096 GiB |
| Transfer Allowance | 60 TB |
| GPU Fabric (GPU-to-GPU internal) | NVLink, 800 Gbps (6.4 Tbps per node) |
| East/West Fabric (GPU-to-GPU multinode) | 8x 800 Gbps (one per GPU), 6.4 Tbps total |
| Network Protocol | RoCEv2 |
| North/South Network (Public Egress/Ingress) | 10 Gbps public |
| VPC Network | 25 Gbps private |
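The fabric figures in the table are related by simple arithmetic, and the same arithmetic gives a rough lower bound on how long it takes to move a dataset or checkpoint over each network. The following is a minimal sketch, not a benchmark: the function names and the 1 TB example payload are illustrative assumptions, and real transfers will be slower because of protocol overhead.

```python
# Back-of-the-envelope network arithmetic for the table above.
# Link speeds come from the table; function names and the example
# payload size are illustrative assumptions.

GPUS_PER_NODE = 8
EAST_WEST_GBPS_PER_GPU = 800   # East/West fabric, per GPU
VPC_GBPS = 25                  # private VPC network
PUBLIC_GBPS = 10               # public ingress/egress


def node_fabric_tbps(gpus: int = GPUS_PER_NODE,
                     gbps_per_gpu: float = EAST_WEST_GBPS_PER_GPU) -> float:
    """Aggregate East/West bandwidth per node, in Tbps."""
    return gpus * gbps_per_gpu / 1000


def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Ideal time to move size_gb gigabytes over a link_gbps link.

    Ignores protocol overhead, so treat the result as a lower bound.
    """
    size_gigabits = size_gb * 8  # bytes -> bits
    return size_gigabits / link_gbps


if __name__ == "__main__":
    print(f"East/West per node: {node_fabric_tbps():.1f} Tbps")
    for name, gbps in [("VPC", VPC_GBPS), ("public", PUBLIC_GBPS)]:
        minutes = transfer_seconds(1000, gbps) / 60
        print(f"1 TB over {name} ({gbps} Gbps): at least {minutes:.1f} min")
```

Estimates like these are mainly useful for deciding whether datasets and checkpoints should be staged on the scratch disk ahead of a job rather than streamed over the public or VPC network.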
B300 GPU servers use the latest NVIDIA Blackwell Ultra architecture, and availability is limited. We recommend reserving in advance to secure the compute capacity you need.
Our technical team will help assess your requirements and tailor the configuration that best fits your business.
NVIDIA HGX B300 Specifications
| HGX B300 | Specification |
|---|---|
| Form Factor | 8x NVIDIA Blackwell Ultra SXM |
| FP4 Tensor Core¹ | 144 PFLOPS / 108 PFLOPS |
| FP8/FP6 Tensor Core² | 72 PFLOPS |
| INT8 Tensor Core² | 3 POPS |
| FP16/BF16 Tensor Core² | 36 PFLOPS |
| TF32 Tensor Core² | 18 PFLOPS |
| FP32 | 600 TFLOPS |
| FP64/FP64 Tensor Core | 10 TFLOPS |
| Total Memory | 2.1 TB |
| NVIDIA NVLink | Fifth generation |
| NVIDIA NVLink Switch™ | NVLink 5 Switch |
| NVLink GPU-to-GPU Bandwidth | 1.8 TB/s |
| Total NVLink Bandwidth | 14.4 TB/s |
| Networking Bandwidth | 1.6 TB/s |
| Attention Performance³ | 2x |
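The table lists totals for the eight-GPU baseboard. As a rough illustration, dividing a total by the GPU count gives an approximate per-GPU figure, and multiplying the per-GPU NVLink bandwidth by eight reproduces the 14.4 TB/s total. The dictionary layout and helper names below are assumptions made for this sketch; the values are copied from the table as listed and inherit whatever conditions the table's footnotes attach to them.

```python
# Derive approximate per-GPU figures from the HGX B300 totals above.
# Values are copied from the table; all names are illustrative.

GPU_COUNT = 8  # form factor: 8x NVIDIA Blackwell Ultra SXM

HGX_TOTALS = {
    "FP8/FP6 Tensor Core (PFLOPS)": 72,
    "FP16/BF16 Tensor Core (PFLOPS)": 36,
    "TF32 Tensor Core (PFLOPS)": 18,
    "FP32 (TFLOPS)": 600,
}

NVLINK_PER_GPU_TBS = 1.8  # NVLink GPU-to-GPU bandwidth, TB/s


def per_gpu(totals: dict, gpus: int = GPU_COUNT) -> dict:
    """Split each platform total evenly across the GPUs."""
    return {name: value / gpus for name, value in totals.items()}


def total_nvlink_tbs(per_gpu_tbs: float = NVLINK_PER_GPU_TBS,
                     gpus: int = GPU_COUNT) -> float:
    """Aggregate NVLink bandwidth across all GPUs, in TB/s."""
    return per_gpu_tbs * gpus


if __name__ == "__main__":
    for name, value in per_gpu(HGX_TOTALS).items():
        print(f"{name}: {value:g} per GPU")
    print(f"Total NVLink bandwidth: {total_nvlink_tbs():.1f} TB/s")
```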
B300 GPU Server Procurement Guide
B300 Pricing, Configurations, and Rental Scenarios
This guide brings together B300 pricing considerations, configuration differences, delivery options, and related articles to help enterprises complete an initial evaluation before requesting a quote.
Pricing and configuration considerations
B300 is a new generation of high-end Blackwell compute. Pricing is typically affected by GPU count, bare-metal or cluster delivery, network interconnect, storage capacity, rental term, and availability window. Define the training, inference, or DeepSeek deployment requirements first, then request monthly, cluster, or dedicated-pool pricing.
Use cases
- Enterprise AI platforms that need higher throughput and a longer lifecycle than H100 or H200.
- Training, fine-tuning, and inference clusters for large models such as DeepSeek, Qwen, and Llama.
- Teams with steady GPU usage that want dedicated capacity to reduce queuing and supply risk.
- Procurement teams evaluating the cost and performance of B300, H200, and MI300X.
GPU and cloud server comparison
| GPU | Positioning | Notes |
|---|---|---|
| B300 | Flagship Blackwell compute | For teams with defined budgets, high concurrency, and large models; focus on availability windows and cluster delivery. |
| H200 | High-memory inference and training | Balances memory capacity with ecosystem maturity for most large-model inference and fine-tuning workloads. |
| H100 | Mature, stable mainstream GPU | A mature ecosystem and extensive documentation make it a stable baseline for training and inference. |
Frequently asked questions
Why is there no fixed B300 price?
B300 GPU server pricing varies with the delivery model, rental term, node count, network, and storage plan. A fixed price can be misleading; request a quote using the target model, concurrency, and budget range.
Is B300 a direct replacement for H100?
B300 is the stronger upgrade when the goal is higher throughput, a newer architecture, and a longer lifecycle. H100 and H200 remain worth comparing when mature frameworks and cost control are the priority.
Is B300 better for training or inference?
It suits both. For training, focus on multi-GPU interconnect and stable supply; for inference, focus on memory, throughput, batching, and peak serving demand.
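As a rough illustration of the memory side of inference, the sketch below estimates how many 288 GB GPUs a deployment needs from model size, concurrency, and context length. It is a simplified estimate under stated assumptions, not a vendor sizing tool: it assumes a standard transformer with grouped-query attention, counts only weights and KV cache, and reserves a fraction of memory for activations and runtime overhead. The model shape in the example (70B parameters, 80 layers, 8 KV heads, head dimension 128) and the 0.9 usable-memory fraction are illustrative, and architectures with compressed KV caches, such as DeepSeek's multi-head latent attention, need a different cache formula.

```python
import math
from dataclasses import dataclass

GPU_MEMORY_GB = 288  # per-GPU memory from the server specification table


@dataclass
class ModelShape:
    """Hypothetical model description used only for this estimate."""
    params_billion: float  # total parameters, in billions
    layers: int            # number of transformer layers
    kv_heads: int          # key/value heads (grouped-query attention)
    head_dim: int          # dimension of each attention head
    weight_bytes: float    # bytes per weight (2 for BF16, 1 for FP8)
    kv_bytes: float        # bytes per KV-cache element


def weights_gb(m: ModelShape) -> float:
    """Memory for model weights in GB (1 GB = 1e9 bytes)."""
    return m.params_billion * m.weight_bytes


def kv_cache_gb(m: ModelShape, context_tokens: int, concurrency: int) -> float:
    """KV cache for `concurrency` sequences of `context_tokens` tokens each."""
    # The factor 2 covers keys and values.
    bytes_per_token = 2 * m.layers * m.kv_heads * m.head_dim * m.kv_bytes
    return bytes_per_token * context_tokens * concurrency / 1e9


def gpus_needed(m: ModelShape, context_tokens: int, concurrency: int,
                usable_fraction: float = 0.9) -> int:
    """Minimum GPU count by memory capacity alone (assumed headroom)."""
    total = weights_gb(m) + kv_cache_gb(m, context_tokens, concurrency)
    return math.ceil(total / (GPU_MEMORY_GB * usable_fraction))


if __name__ == "__main__":
    model = ModelShape(params_billion=70, layers=80, kv_heads=8,
                       head_dim=128, weight_bytes=2, kv_bytes=2)
    print(f"Weights:  {weights_gb(model):.1f} GB")
    print(f"KV cache: {kv_cache_gb(model, 32_768, 64):.1f} GB")
    print(f"GPUs (memory only): {gpus_needed(model, 32_768, 64)}")
```

This bounds capacity only. Throughput, batching strategy, and peak serving demand still have to be measured on the target workload before settling on a node count.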
Related reading and procurement guides
How to Choose B300 or H200 for DeepSeek
Compare B300 and H200 performance and cost trade-offs for DeepSeek deployments.
NVIDIA GPU Model and Selection Comparison
Understand the positioning of B300, H200, H100, L40S, and other GPUs.
DeepSeek GPU Sizing
Estimate GPU resources from model size, concurrency, and context length.
Reserve Now · Secure Next-generation Compute
Reserve B300 GPU servers early to support your AI strategy with powerful compute

Get a tailored plan
- One-to-one technical consultation to assess your AI compute requirements
- Tailored configurations to optimize cost and performance
- Priority delivery to secure 2025 compute capacity
- End-to-end technical support for successful AI delivery
