Pankaj Gupta, Co-Founder
Model performance: FP8: Efficient model inference with 8-bit floating point numbers (Pankaj Gupta and 1 other)
Model performance: 40% faster Stable Diffusion XL inference with NVIDIA TensorRT (Pankaj Gupta and 2 others)
Model performance: Unlocking the full power of NVIDIA H100 GPUs for ML inference with TensorRT (Pankaj Gupta and 1 other)
Model performance: Faster Mixtral inference with TensorRT-LLM and quantization (Pankaj Gupta and 2 others)
Infrastructure: Technical deep dive: Truss live reload (Pankaj Gupta)