Deploying and Monitoring LLM Inference Endpoints
In this session we will dive into deploying LLMs to production inference endpoints, then put in place automated monitoring metrics and alerts to help track model performance and surface potential output issues such as toxicity. We will also cover optimizing LLMs with retrieval-augmented generation (RAG) for relevant, accurate, and useful outputs. You will leave this session with a comprehensive understanding of deploying LLMs to production and monitoring the models for issues such as toxicity, relevance, and accuracy.

Try this Computer Vision model and other common AI use cases with the Wallaroo.AI Azure Inference Server Freemium Offer on Azure Marketplace (https://aka.ms/Wallaroo-Inference), and also try the free Wallaroo.AI Community Edition (https://aka.ms/Wallaroo.AI-Free). [eventID:23187]
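To make the monitoring workflow described above more concrete, here is a minimal sketch of wrapping an inference endpoint call with a toxicity check and an alert threshold. This is not the Wallaroo SDK or API; the endpoint URL, payload shape, `score_toxicity` helper, and threshold are all hypothetical placeholders for illustration only.

```python
# Hypothetical sketch: post-inference toxicity monitoring around an endpoint call.
# The endpoint URL, payload shape, and toxicity scorer below are illustrative
# placeholders, not the Wallaroo API.
import requests

ENDPOINT_URL = "https://example.com/v1/infer"   # placeholder inference endpoint
TOXICITY_ALERT_THRESHOLD = 0.8                  # hypothetical alert threshold

def score_toxicity(text: str) -> float:
    """Toy toxicity scorer; a real deployment would use a trained classifier."""
    flagged_terms = {"hate", "stupid", "idiot"}
    words = text.lower().split()
    hits = sum(1 for w in words if w in flagged_terms)
    return min(1.0, hits / max(len(words), 1) * 10)

def monitored_inference(prompt: str) -> dict:
    # Call the (placeholder) inference endpoint with the prompt.
    response = requests.post(ENDPOINT_URL, json={"prompt": prompt}, timeout=30)
    response.raise_for_status()
    output_text = response.json().get("generated_text", "")

    # Score the generated text and raise an alert if it crosses the threshold.
    toxicity = score_toxicity(output_text)
    if toxicity >= TOXICITY_ALERT_THRESHOLD:
        print(f"ALERT: toxicity {toxicity:.2f} exceeds threshold "
              f"{TOXICITY_ALERT_THRESHOLD} for prompt: {prompt!r}")

    return {"output": output_text, "toxicity": toxicity}
```

In practice the same pattern extends to other session topics such as relevance and accuracy checks: score each response, log the metric, and alert when it drifts past a threshold.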


Microsoft Reactor
