
Vlad Shulman
Featured in:
medium.com
Articles
- Mar 28, 2024 | baseten.co | Matt Howard | Vlad Shulman | Pankaj Gupta | Philip Kiely
NVIDIA H100 GPUs support Multi-Instance GPU (MIG), which lets us serve models on fractional GPUs. Each H100 can be split into two MIG model-serving instances, each with about half the power of a full GPU. Splitting H100 GPUs in two allows more flexibility in hardware choice for model inference.
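The "two instances per GPU" figure can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes an 80 GB H100 exposing 7 compute slices and a `3g.40gb` MIG profile (3 compute slices, 40 GB of memory), which is one way to get a roughly half-GPU instance. The summary above does not name the profile, and the class and function names are hypothetical. Real MIG placement rules are stricter than this simple division.

```python
from dataclasses import dataclass

# Assumed totals for an 80 GB H100 (not stated in the summary above).
TOTAL_COMPUTE_SLICES = 7
TOTAL_MEMORY_GB = 80


@dataclass(frozen=True)
class MigProfile:
    """Hypothetical description of a MIG profile, for illustration."""
    name: str
    compute_slices: int
    memory_gb: int


def max_instances(profile: MigProfile) -> int:
    """Upper bound on instances per GPU, limited by compute and memory."""
    by_compute = TOTAL_COMPUTE_SLICES // profile.compute_slices
    by_memory = TOTAL_MEMORY_GB // profile.memory_gb
    return min(by_compute, by_memory)


def fleet_capacity(num_gpus: int, profile: MigProfile) -> int:
    """Number of model-serving instances a fleet of identical GPUs can host."""
    return num_gpus * max_instances(profile)


# Assumed "half GPU" profile: 3 of 7 compute slices and 40 of 80 GB.
HALF_H100 = MigProfile(name="3g.40gb", compute_slices=3, memory_gb=40)

if __name__ == "__main__":
    per_gpu = max_instances(HALF_H100)
    share = HALF_H100.compute_slices / TOTAL_COMPUTE_SLICES
    print(f"{HALF_H100.name}: {per_gpu} instances per GPU, "
          f"{share:.0%} of compute each")
    print(f"4 GPUs host {fleet_capacity(4, HALF_H100)} instances")
```

Under these assumptions each instance gets 3/7 of the compute and half of the memory, which matches the "about half" wording rather than an exact 50% split.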