Build a multi-LLM chat application with Azure Container Apps
In this demo, explore how to use GPU workload profiles in Azure Container Apps (ACA) to run your own model backend, switch between and compare models, and speed up inference. You will also see how to use LlamaIndex to ingest data on demand and host models with Ollama. Finally, the application is decomposed into a set of microservices written in Python and deployed on ACA. #microsoftreactor #multillm #llms #azurecontainerapps #azure #chatapp [eventID:22137]
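The "switch and compare model backends" idea the demo mentions can be sketched as a small router in Python. Everything here is hypothetical (the `ModelRouter` class, the backend names, the stand-in callables are not from the demo); in a real deployment each backend would call an Ollama-hosted model over HTTP rather than a local function.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ModelRouter:
    """Routes a prompt to one named backend, or fans it out to all of them."""
    backends: Dict[str, Callable[[str], str]]

    def ask(self, model: str, prompt: str) -> str:
        # Send the prompt to a single named backend.
        if model not in self.backends:
            raise KeyError(f"unknown model backend: {model}")
        return self.backends[model](prompt)

    def compare(self, prompt: str) -> Dict[str, str]:
        # Send the same prompt to every registered backend for comparison.
        return {name: fn(prompt) for name, fn in self.backends.items()}

# Stand-ins for Ollama-hosted models; a real backend would POST the prompt
# to the model server and return its completion.
router = ModelRouter(backends={
    "llama3": lambda p: f"[llama3] {p}",
    "mistral": lambda p: f"[mistral] {p}",
})

print(router.ask("llama3", "hello"))        # [llama3] hello
print(sorted(router.compare("hi")))         # ['llama3', 'mistral']
```

Keeping the routing logic separate from the backend calls is what makes it easy to add, swap, or benchmark models without touching the chat frontend.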


Microsoft Reactor
