Dr. Hassan Sherwani
Data Analytics & AI/ML Practice Head
December 3, 2024
Home » Blogs » Data Bricks » A Beginners Guide to Developing Gen AI Applications on Databricks with Mosaic AI
Introducing Mosaic AI
Mosaic AI is one of Databricks’ latest solutions. It is part of MosaicML, which was acquired by Databricks in 2023 to provide a solution for enterprises that want to create and customize their generative AI models. Mosaic AI offers a platform where AI developers can serve and query Gen AI models using open-source LLMs or third-party large language models (LLMs) such as OpenAI or Anthropic. These models can handle diverse tasks like language generation, code execution, data retrieval, etc. With Databricks, developers can also access powerful tools for building and managing RAG-based AI agents, which combine external knowledge retrieval with language model generation.
How to Deploy Gen AI Models on Mosaic AI for Simple Use Cases
- Foundation Model APIs that are optimized for inference tasks, such as DBRX Instruct, Llama-2-70B-chat, BGE-Large, and Mistral-7B, are available with pay-per-token pricing for immediate use.
- Databricks support external models such as GPT-4 and Claude, and developers can establish rate limits and access control.
For deployment of the generative AI models, developers can create the model serving endpoints, which can then be queried through API calls or integrated into other apps. Creating the AI model’s serving endpoints simply requires three steps: selecting the type of model (foundation vs. external), configuring the endpoints, and using provisioned compute resources to deploy the models.
Creating Retrieval Augmented Generation (RAG) Applications with Databricks Mosaic AI
- Retrieval: To fetch supporting information, a user’s request is used to query an external data source, such as a vector database or an SQL query.
- Augmentation: The retrieved data (structured or unstructured) is combined with the user’s query and passed as input to the LLM.
- Generation: The LLM uses this input to generate a response enhanced by the external data source.
By incorporating external data sources during inference, the RAG AI design pattern enhances LLMs due to constant access to updated, contextually relevant information. This improves responses, making it ideal for applications requiring proprietary or real-time data, such as retrieving customer information from a database or summarizing recent documents/articles for knowledge management. Furthermore, with Databricks tools like Delta Lake, Unity Catalog, and Vector Search, users can leverage efficient, scalable retrieval mechanisms.
How to Build a RAG-based Chatbot
- Ingest Data: Use Databricks Jobs to ingest data from proprietary sources and store it in Delta Lake or Unity Catalog volumes.
- Process Documents: Parse, extract, and chunk documents to prepare them for retrieval. Use Model Serving to generate embeddings for these chunks.
- Vectorize Queries: Use Model Serving to embed user queries into vectors that can be compared with document embeddings stored in the vector database.
- Retrieve Relevant Data: Perform vector similarity searches using Databricks Vector Search to retrieve relevant data chunks.
- Augment and Generate: Pass the retrieved data to the LLM for response generation, augmented by the context provided by the data retrieval process.
Developing RAG Applications on Databricks with Mosaic Agent AI Framework
Databricks simplifies the creation of RAG applications by having products for data management, vector search, and model serving. Databricks Mosaic Agent AI Framework has tools for building, deploying, and evaluating AI agents for RAG applications. Critical features of the Mosaic AI Agent framework include:
- The ability to create tools that agents use to perform actions beyond language generation, such as retrieving data, executing code, or interfacing with external services like sending emails or Slack messages. These tools are defined in Python or Unity Catalog functions, enabling developers to balance performance and governance needs.
- The Mosaic AI Agent framework allows integration with third-party libraries and frameworks, including LangChain and LangGraph, to create custom agent behaviors.
- With native MLflow integration, developers can create, log, and parameterize agents, making it easier to iterate and experiment with different agent configurations and tool combinations.
- Ability to leverage Unity Catalog Functions for structured, governed, and discoverable tools for AI agents. For instance, developers can create SQL-based functions that allow agents to retrieve specific data or execute predefined tasks.
Evaluating the Generative AI Models on Databricks
Once deployed, continuous evaluation and monitoring of the generative AI application is crucial. Databricks provides tools for evaluating model performance, monitoring latency, and logging requests and responses. Features like token streaming, request/response logging, and feedback capture through inference tables allow developers to gain deep insights into how the application performs in real-world scenarios.
Agent Tracing and Debugging
Mosaic AI’s agent tracing features, integrated with MLflow, allow developers to log, analyze, and compare traces across agent executions. This is especially helpful for debugging multi-step AI workflows like RAG, where understanding each step’s contribution to the final result is vital to optimizing performance.
How Royal Cyber Can Help Build Generative AI Apps with Databricks Mosaic AI
Developers can efficiently create and manage production-quality AI systems with Databrick’s powerful tools for model serving, AI agent frameworks, and Retrieval Augmented Generation (RAG) workflows. The ability to integrate open-source and third-party models and robust data retrieval and augmentation capabilities make Databricks a go-to platform for building cutting-edge GenAI solutions.
Author
Priya George
Recent Posts
- A Beginners Guide to Developing Gen AI Applications on Databricks with Mosaic AI December 3, 2024
- Databricks DBRX: All You Need to Know to Implement the Future of AI December 3, 2024
- Smart Apparel Analyzer: AI-Powered Clothing Description Generator | Demo December 2, 2024
- Mastering the My List Feature in SAP Commerce: A Comprehensive Guide December 2, 2024
- Learn to write effective test cases. Master best practices, templates, and tips to enhance software …Read More »
- In today’s fast-paced digital landscape, seamless data integration is crucial for businessRead More »
- Harness the power of AI with Salesforce Einstein GPT for Service Cloud. Unlock innovative ways …Read More »