Customer stories
We're creating a platform for forward-thinking AI companies to build their products on the fastest, most performant infrastructure available.
What our customers are saying
Sahaj Garg,
Co-Founder and CTO
Inference for custom-built LLMs could be a major headache. Thanks to Baseten, we’re getting cost-effective high-performance model serving without any extra burden on our internal engineering teams. Instead, we get to focus our expertise on creating the best possible domain-specific LLMs for our customers.
You guys have literally enabled us to hit insane revenue numbers without ever thinking about GPUs and scaling. We would be stuck in GPU AWS land without y'all. Truss files are amazing, y'all are on top of it always, and the product is well thought out. I know I ask for a lot so I just wanted to let you guys know that I am so blown away by everything Baseten.
Sahaj Garg,
Co-Founder and CTO
Customer Stories

Wispr Flow creates effortless voice dictation with Llama on Baseten
Wispr Flow runs fine-tuned Llama models with Baseten and AWS to provide seamless dictation across every application.
Read case study

Rime serves speech synthesis API with stellar uptime using Baseten
Rime AI chose Baseten to serve its custom speech synthesis generative AI model and achieved state-of-the-art p99 latencies with 100% uptime in 2024.
Read case study

Bland AI breaks latency barriers with record-setting speed using Baseten
Bland AI leveraged Baseten’s state-of-the-art ML infrastructure to achieve real-time, seamless voice interactions at scale.
Read case study

Custom medical and financial LLMs from Writer see 60% higher tokens per second with Baseten
Writer, the leading full-stack generative AI platform, launched new industry-specific LLMs for medicine and finance. Using TensorRT-LLM on Baseten, they increased their tokens per second by 60%.
Read case study

Patreon saves nearly $600k/year in ML resources with Baseten
With Baseten, Patreon deployed and scaled the open-source foundation model Whisper at record speed without hiring an in-house ML infra team.
Read case study