0:00 Intro
1:00 Triton Inference Server: Open Source Software for Fast, Scalable, Simplified Inference Serving
1:38 Overview of Triton Inference Server
2:28 Triton Supports Wide Range of Backends
3:43 Microsoft Teams Live Captioning Service
4:40 CPU-based architecture
5:19 Exploring GPUs to reduce service cost
8:21 Challenges & Solutions
17:56 Background: Project Z-Code
18:45 Sparse Model for Efficient Scaling
19:57 Background: Machine Translation Paradigm Shift
20:51 Sparse Model architecture: Mixture of Experts
22:19 Background: Limitations of the current paradigm
23:29 Scaling Challenge
25:44 Multi-task Multilingual Mixture of Experts: Training improvements
26:59 MT Quality Improvement with Z-code MoE
27:46 Document Translation Service and Triton integration
35:53 Resources

Azure Cognitive Service deployment: AI inference with NVIDIA Triton Server | BRKFP04
May 27, 2022
Join us to see how Azure Cognitive Services use NVIDIA Triton Inference Server for inference at scale. We highlight two use cases: deploying the first-ever Mixture of Experts model for document translation, and an acoustic model for Microsoft Teams Live Captioning. Tune in to learn about serving models with NVIDIA Triton, ONNX Runtime, and custom backends.

Additional resources:
Live Captions in Teams meeting -- https://support.microsoft.com/en-us/o...
Southpaw Token-based load balancer @Scale announcement -- https://atscaleconference.com/videos/...
Triton GitHub -- https://github.com/triton-inference-s...
Custom C/C++ backends -- https://github.com/triton-inference-s...
Writing a custom Python backend -- https://github.com/triton-inference-s...

Recommended next step: Please visit the NVIDIA showcase -- https://mybuild.microsoft.com/en-US/p...

Microsoft Build 2022

Microsoft Developer
