MosaicML Launches Inference API and Foundation Series for Generative AI; Leading Open Source GPT Models, Enterprise-Grade Privacy and 15x Cost Savings

SAN FRANCISCO – (BUSINESS WIRE) – May 9, 2023 – MosaicML, the leading Generative AI infrastructure provider, announced MosaicML Inference and its foundation series of models for enterprises to build on. This new offering allows developers to quickly, easily, and affordably deploy Generative AI models at 15x lower cost than comparable services. With the addition of inference capabilities, MosaicML now offers a complete, end-to-end solution for Generative AI training and deployment at the most efficient cost available today.

Generative AI models have quickly become a catalyst for innovation across industries, from healthcare to financial services to e-commerce. However, off-the-shelf models have well-documented issues around data security, model transparency, and availability. Until now, access to the alternative, custom Generative AI models, has been limited.

“We believe that MosaicML Inference is a game-changer for Generative AI. It radically reduces the cost of serving large models and enables enterprises to do so in their own secure environments. Together with the MosaicML Foundation Series, enterprises now have more capabilities than ever before to achieve their own state-of-the-art AI without concerns about cost, scale, and security.” Naveen Rao, CEO

Organizations Are Building Custom LLMs on MosaicML

Today, organizations including Replit, Stanford, and Twelve Labs are building their own custom VLMs and LLMs on MosaicML because of the control, privacy, and cost efficiencies it affords. MosaicML customers have found that smaller models trained on their own domain-specific data perform better than large generic models like GPT-3.5, the original model behind ChatGPT.

“Using the MosaicML platform, we were able to train and deploy our Ghostwriter 2.7B LLM for code generation with our own data within a week and achieve leading results.” Amjad Masad, CEO, Replit

MosaicML Inference Curates the Best Open Source Models

MosaicML Inference delivers maximum flexibility and choice for developers who want to add Generative AI to their applications. Developers can deploy their own custom LLMs or choose from a curated selection of the best open source LLMs available today, including the MosaicML Foundation Series of models, Instructor-XL, Dolly, and GPT-NeoX. The cost and time advantages of MosaicML Inference are attributable to efficient ML systems engineering and optimizations that enable you to serve smaller, lightweight, domain-specific models.

MosaicML Inference offers two tiers for Generative AI developers to get started easily with their model deployments:

  1. Starter Tier: Open source models curated and hosted by MosaicML, offered as API endpoints so developers can quickly add Generative AI to their applications.
  2. Enterprise Tier: Custom models developed by enterprises to address specific business use cases. Model and data are fully secured in the customer’s enterprise environment.

MosaicML Foundation Series

The MosaicML Foundation Series is a set of pre-trained GPT-style models for customers to fine-tune and deploy. The LLMs in this series are in many cases higher performing than comparable open source models, with unique capabilities that go beyond GPT-4. The first set of models in the series will be open-sourced to the community starting this week.

MosaicML Inference Delivers Privacy & Control

According to a recent KPMG study of 225 US executives, two-thirds of executives believe that Generative AI will have a major impact on their business, yet nearly the same percentage say they are still one or two years away from deploying it extensively in their operations. Two of the main reasons? Concerns about cybersecurity (81%) and data privacy (78%).

In addition to unprecedented cost efficiencies, MosaicML Inference lets organizations develop and deploy their own Generative AI models with complete data privacy and control. Developers can deploy on a secure cluster hosted by MosaicML or on their infrastructure of choice, such as AWS, Oracle Cloud Infrastructure, and GCP. Developers can turn a saved model checkpoint into a secure, inexpensive API hosted within their own virtual private cloud (VPC) environment in under a minute. Inference data never leaves the secured environment of the user’s infrastructure. MosaicML Inference also offers continuous monitoring of cluster and model metrics for enterprise-grade DevOps, ensuring complete transparency for model behavior.

To learn more about MosaicML’s revolutionary tools for building and training advanced AI models, visit

About MosaicML

MosaicML is an AI software infrastructure company that enables organizations to train, fine-tune, and deploy Generative AI with full data privacy and model ownership. The MosaicML platform and the MosaicML Foundation Series of models provide enterprises with state-of-the-art capabilities that can significantly accelerate their Generative AI transformation.

Press Contact