A Beginners Guide to Developing Gen AI Applications on Databricks with Mosaic AI

Generative AI applications
A Beginners Guide to Developing Gen AI Applications on Databricks with Mosaic AI

Dr. Hassan Sherwani

Data Analytics & AI/ML Practice Head

December 3, 2024

Implement Mosaic AI to Build Your Own Gen AI Apps
Generative AI’s role in the workplace cannot be overstated; it is the engine that holds promise to power the future of work itself. The market for generative AI platforms has grown exponentially since last year. It is projected to grow further, driven by the increasing adoption of AI across industries. IDC research reveals that the AI platform software market is set to grow by $153 billion in 2028.
Databricks with Mosaic AI aims to provide AI developers and engineers with a platform to build, deploy, and manage high-grade generative AI (GenAI) applications efficiently. Our blog covers all you need to know about Mosaic AI’s capabilities for model serving, agent frameworks, and Retrieval Augmented Generation (RAG) applications.
Developing Gen AI Applications on Databricks

Introducing Mosaic AI

Mosaic AI is one of Databricks’ latest solutions. It is part of MosaicML, which was acquired by Databricks in 2023 to provide a solution for enterprises that want to create and customize their generative AI models.  Mosaic AI offers a platform where AI developers can serve and query Gen AI models using open-source LLMs or third-party large language models (LLMs) such as OpenAI or Anthropic. These models can handle diverse tasks like language generation, code execution, data retrieval, etc. With Databricks, developers can also access powerful tools for building and managing RAG-based AI agents, which combine external knowledge retrieval with language model generation.

How to Deploy Gen AI Models on Mosaic AI for Simple Use Cases

The Mosaic AI model supports serving and querying two types of generative AI models:
  • Foundation Model APIs that are optimized for inference tasks, such as DBRX Instruct, Llama-2-70B-chat, BGE-Large, and Mistral-7B, are available with pay-per-token pricing for immediate use.
  • Databricks support external models such as GPT-4 and Claude, and developers can establish rate limits and access control.
AI Models on Mosaic AI

For deployment of the generative AI models, developers can create the model serving endpoints, which can then be queried through API calls or integrated into other apps. Creating the AI model’s serving endpoints simply requires three steps: selecting the type of model (foundation vs. external), configuring the endpoints, and using provisioned compute resources to deploy the models.

Creating Retrieval Augmented Generation (RAG) Applications with Databricks Mosaic AI

Retrieval augmented generation, aka RAG AI applications, is the next step for enterprises that wish to change how they work. A typical RAG architecture consists of the following steps:
  • Retrieval: To fetch supporting information, a user’s request is used to query an external data source, such as a vector database or an SQL query.
  • Augmentation: The retrieved data (structured or unstructured) is combined with the user’s query and passed as input to the LLM.
  • Generation: The LLM uses this input to generate a response enhanced by the external data source.

By incorporating external data sources during inference, the RAG AI design pattern enhances LLMs due to constant access to updated, contextually relevant information. This improves responses, making it ideal for applications requiring proprietary or real-time data, such as retrieving customer information from a database or summarizing recent documents/articles for knowledge management. Furthermore, with Databricks tools like Delta Lake, Unity Catalog, and Vector Search, users can leverage efficient, scalable retrieval mechanisms.

How to Build a RAG-based Chatbot

  • Ingest Data: Use Databricks Jobs to ingest data from proprietary sources and store it in Delta Lake or Unity Catalog volumes.
  • Process Documents: Parse, extract, and chunk documents to prepare them for retrieval. Use Model Serving to generate embeddings for these chunks.
  • Vectorize Queries: Use Model Serving to embed user queries into vectors that can be compared with document embeddings stored in the vector database.
  • Retrieve Relevant Data: Perform vector similarity searches using Databricks Vector Search to retrieve relevant data chunks.
  • Augment and Generate: Pass the retrieved data to the LLM for response generation, augmented by the context provided by the data retrieval process.

Developing RAG Applications on Databricks with Mosaic Agent AI Framework

Databricks simplifies the creation of RAG applications by having products for data management, vector search, and model serving. Databricks Mosaic Agent AI Framework has tools for building, deploying, and evaluating AI agents for RAG applications. Critical features of the Mosaic AI Agent framework include:

  • The ability to create tools that agents use to perform actions beyond language generation, such as retrieving data, executing code, or interfacing with external services like sending emails or Slack messages. These tools are defined in Python or Unity Catalog functions, enabling developers to balance performance and governance needs.
  • The Mosaic AI Agent framework allows integration with third-party libraries and frameworks, including LangChain and LangGraph, to create custom agent behaviors.
  • With native MLflow integration, developers can create, log, and parameterize agents, making it easier to iterate and experiment with different agent configurations and tool combinations.
  • Ability to leverage Unity Catalog Functions for structured, governed, and discoverable tools for AI agents. For instance, developers can create SQL-based functions that allow agents to retrieve specific data or execute predefined tasks.

Evaluating the Generative AI Models on Databricks

Once deployed, continuous evaluation and monitoring of the generative AI application is crucial. Databricks provides tools for evaluating model performance, monitoring latency, and logging requests and responses. Features like token streaming, request/response logging, and feedback capture through inference tables allow developers to gain deep insights into how the application performs in real-world scenarios.

Agent Tracing and Debugging

Mosaic AI’s agent tracing features, integrated with MLflow, allow developers to log, analyze, and compare traces across agent executions. This is especially helpful for debugging multi-step AI workflows like RAG, where understanding each step’s contribution to the final result is vital to optimizing performance.

How Royal Cyber Can Help Build Generative AI Apps with Databricks Mosaic AI

Developers can efficiently create and manage production-quality AI systems with Databrick’s powerful tools for model serving, AI agent frameworks, and Retrieval Augmented Generation (RAG) workflows. The ability to integrate open-source and third-party models and robust data retrieval and augmentation capabilities make Databricks a go-to platform for building cutting-edge GenAI solutions.

Royal Cyber, a trusted Databricks partner, brings extensive experience in AI, data engineering, and cloud solutions. Our data and AI experts are well-versed in leveraging Databricks’ full potential to create custom generative AI models tailored to specific business use cases. Collaborating with Royal Cyber ensures that your organization benefits from industry-leading AI expertise, helping you stay ahead of the competition in the rapidly evolving landscape of AI-driven transformation. For more information, contact us at [email protected].

Author

Priya George

 

Recent Blogs
  • How to Write Test Cases: Introduction and Best Practices
    Learn to write effective test cases. Master best practices, templates, and tips to enhance software …
    Read More »
  • MuleSoft Admin Co-Pilot: Revolutionize Integration Management
    In today’s fast-paced digital landscape, seamless data integration is crucial for business
    Read More »
  • Revolutionizing Customer Support with Salesforce Einstein GPT for Service Cloud
    Harness the power of AI with Salesforce Einstein GPT for Service Cloud. Unlock innovative ways …
    Read More »