Exploring Kaito to streamline AI inference model deployment in Azure Kubernetes
*About this session:* Roy Kim will present Kaito, an operator that streamlines AI/ML inference model deployment in Kubernetes. Discover how Kaito simplifies the deployment of large open-source inference models such as Falcon and LLAMA2. Learn its key features: managing large model files with container images, preset GPU configurations, auto-provisioning of GPU nodes, and hosting on Microsoft Container Registry (MCR). See how Kaito simplifies the workflow of onboarding large AI inference models in Kubernetes. *Learn more and develop your skills in Azure Kubernetes Service with this Microsoft Learn training module:* https://aka.ms/IntroToAKSLearn3 [eventID:22967]
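To give a flavor of the workflow the session covers, here is a minimal sketch of a Kaito `Workspace` manifest that requests a GPU node and serves a preset Falcon model. The API version, instance type, and preset name are illustrative assumptions based on Kaito's published examples and may differ in the release demonstrated:

```yaml
# Sketch of a Kaito Workspace: the operator reads this custom resource,
# auto-provisions a GPU node of the requested SKU, and deploys the
# preset inference model from its container image.
# Field names/values are assumptions; check the Kaito docs for your version.
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: "Standard_NC12s_v3"   # assumed GPU VM SKU
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"                 # assumed preset model name
```

Applying a manifest like this with `kubectl apply -f` is, per the session abstract, the core of how Kaito collapses GPU provisioning and model deployment into a single declarative step.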


Microsoft Reactor
