Predibase Launches New Offering to Fine-tune and Serve 100x More LLMs at No Additional Cost – Try Now with LLaMA-2 for Free

Dramatically increases training speed while reducing deployment costs and complexity; Predibase also introduces Predibase AI Cloud with A100s for efficiently training even the largest open-source LLMs

SAN FRANCISCO–(BUSINESS WIRE)–Predibase, the developer platform for open-source AI, today announced the availability of its software development kit (SDK) for efficient fine-tuning and serving. The new offering enables developers to train smaller, task-specific LLMs using even the cheapest and most readily available GPU hardware in their own cloud. Fine-tuned models can then be served using Predibase’s lightweight, modular LLM serving architecture, which dynamically loads and unloads models on demand in seconds, allowing many models to be served at no additional cost. This approach is so efficient that Predibase can now offer unlimited fine-tuning and serving of LLaMA-2-13B for free in a two-week trial.

“More than 75% of organizations won’t use commercial LLMs in production due to concerns over ownership, privacy, cost, and security, but productionizing open-source LLMs comes with its own set of infrastructure challenges,” said Dev Rishi, co-founder and CEO of Predibase. “Even with access to high-performance GPUs in the cloud, training costs can reach thousands of dollars per job due to a lack of automated, reliable, cost-effective fine-tuning infrastructure. Debugging and setting up environments require countless engineering hours. As a result, businesses can spend a fortune even before getting to the cost of serving in production.”

By simplifying deployments and providing optimizations that make training jobs run efficiently and reliably on all hardware, Predibase’s new fine-tuning offering provides a solution for the high-end GPU shortage and, in effect, democratizes LLM adoption.

“Enterprise practitioners seeking to put LLMs to work on corporate data are quickly learning that bigger is not always better when it comes to model parameters and GPU cluster size,” said Bradley Shimmin, Chief Analyst at Omdia. “Through recent innovations such as parameter efficient fine-tuning (PEFT) and model quantization (e.g., QLoRA), companies are achieving great results on more commodity hardware by fine-tuning smaller, often open-source LLMs using a limited amount of highly curated data. The challenge that remains, however, is how to operationalize these methods during development and then bring the final results forward into production in a cost-effective yet performant manner.”

Overall, Predibase enables organizations to realize up to a 50x improvement in training speed for task-specific models and a 15x reduction in deployment costs. These gains are made possible through several innovations, including:

  • Automatic Memory-Efficient Fine-Tuning: Predibase compresses any open-source LLM to make it trainable on commodity GPUs (such as the Nvidia T4). Built on top of the open-source Ludwig framework for declarative model building, users need only specify the base model, dataset, and a prompt template. Predibase’s training system then automatically applies 4-bit quantization, low-rank adaptation, memory paging/offloading, and other optimizations to ensure training succeeds on whatever hardware is available – at the fastest speeds possible – in only a few lines of code.
  • Serverless Right-Sized Training Infrastructure: Predibase’s built-in orchestration logic will find the most cost-effective hardware in your cloud to run each training job, with built-in fault tolerance, metric and artifact tracking, and one-click deployment capabilities.
  • Cost-Effective Serving for Fine-Tuned Models: Businesses can configure each LLM deployment to scale up and down with traffic using either stand-alone or dynamic hosting. Dynamically served LLMs can be packed with hundreds of other specialized fine-tuned LLMs, resulting in over 100x cost reduction compared with dedicated deployments. Each fine-tuned LLM can be loaded and queried in seconds following fine-tuning—no need to deploy each model on a separate GPU.
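The declarative workflow described above can be illustrated with a Ludwig-style configuration, since Predibase is built on the open-source Ludwig framework. This is a minimal sketch: the field layout follows Ludwig's declarative LLM schema, but the model ID, dataset columns, and prompt template are illustrative assumptions, not Predibase defaults.

```python
# Illustrative Ludwig-style fine-tuning config (a sketch; values are assumptions).
# The idea: the user declares the base model, a prompt template, and the data
# columns; the platform then applies 4-bit quantization and low-rank
# adaptation (LoRA) automatically so training fits on commodity GPUs.
config = {
    "model_type": "llm",
    "base_model": "meta-llama/Llama-2-13b-hf",  # hypothetical Hugging Face model ID
    "quantization": {"bits": 4},                # 4-bit (QLoRA-style) weight loading
    "adapter": {"type": "lora"},                # parameter-efficient fine-tuning
    "prompt": {
        # {question} is a placeholder filled from the dataset column below
        "template": "Answer the question: {question}",
    },
    "input_features": [{"name": "question", "type": "text"}],
    "output_features": [{"name": "answer", "type": "text"}],
}

# With Ludwig installed, training would then be roughly:
#   from ludwig.api import LudwigModel
#   LudwigModel(config=config).train(dataset="qa_pairs.csv")
print(sorted(config))
```

Everything beyond these few declarations (memory paging, hardware selection, fault tolerance) is what the platform claims to handle automatically.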

Alongside the new fine-tuning and serving capabilities, Predibase has also introduced the Predibase AI Cloud, a service for selecting the most cost-effective compute resources optimized for your workload with support for multiple environments and regions. Available upon request, the AI Cloud now provides access to managed A100 GPUs optimized for distributed training and serving. With these capabilities, Predibase offers a complete range of solutions for efficiently fine-tuning the largest and most compute-intensive LLM workloads.

“We adopted Predibase to save our team months of effort developing infrastructure for training and serving complex open-source LLMs. With Predibase, we can experiment and iterate faster with less custom work and have the option to deploy models in our own cloud,” said Damian Cristian, Co-Founder and CEO of Koble, an investment platform that uses AI to identify early-stage companies that outperform the market. “Now we don’t need to worry about scaling our infrastructure as we grow because Predibase supports efficient fine-tuning and serving of even the largest models like LLaMA-2-70B in production on A100 GPUs.”

More information can be found at

About Predibase

Predibase is the fastest and most efficient way for developers to customize and deploy LLMs in the cloud. As the developer platform for open-source AI, Predibase makes it easy for engineering teams to fine-tune and serve any open-source AI model on state-of-the-art serverless infrastructure, and it is currently in use by organizations ranging from Fortune 500 enterprises to innovative startups, including Paradigm, Sekure Payment Experts, and World Wildlife Fund. Built by the team that created the internal AI platforms at Apple and Uber, Predibase is fast, efficient, and scalable for jobs of any size. Most importantly, Predibase is built on open-source foundations and can be deployed in your cloud, so all of your data and models stay in your control.

For more information or to get started with a free trial, visit or follow @predibase.


PR Contact
Raymond Fenton
Voxus PR