Deploy OpenAI Services at Scale Using Provision Throughput Units

Microsoft Developer

588K subscribers

35 Likes

1,542 Views

Nov 5, 2024

In this episode of the Azure Essentials Show, Thomas and David discuss how businesses can implement and scale generative AI using Azure OpenAI Service. They explore different deployment options, focusing on standard and provisioned deployments, and provide demos on optimizing these deployments with Azure best practices. David explains the concept of Provisioned Throughput Units (PTUs) and offers practical tips for estimating PTU needs, checking quota, and purchasing reservations to ensure reliable performance and cost efficiency.

Resources
• Understanding Azure OpenAI Service deployment types https://learn.microsoft.com/azure/ai-...
• Azure OpenAI Service Provisioned Throughput Units (PTU) onboarding https://learn.microsoft.com/azure/ai-...
• Optimize spend and performance with Azure OpenAI Service provisioned reservations https://aka.ms/azure-pricing-AOAI-sta...
• Save costs with Microsoft Azure OpenAI Service Provisioned Reservations https://learn.microsoft.com/azure/cos...
• Save with Azure reservations https://learn.microsoft.com/azure/cos...
• Explore essential resources! https://www.azure.com/solutions/azure...

Related episodes
• Watch additional pricing videos https://aka.ms/AzurePricingVideos
• Watch the Azure Essentials Show https://aka.ms/AzureEssentialsShow

Connect
• Thomas Maurer / thomasmaurer2
• David Huntley / davidhuntley

Chapters
0:00 Introduction
1:10 Pay-as-you-go
1:25 Provisioned deployments
1:45 PTUs explained
2:19 Demo: capacity calculator
3:35 Demo: Checking quotas
4:21 Demo: Create provision deployment
5:47 Hourly vs. reservations
6:30 Capacities are not guaranteed
7:17 Demo: Purchasing reservations
9:55 Monitoring usage
10:27 Tip: Create deployments then reservations
10:59 Resources
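
The hourly versus reservation comparison mentioned in the description can be illustrated with a small break-even calculation. This is only a sketch under stated assumptions: it assumes a flat hourly price per PTU and a flat monthly reservation price per PTU, and every rate and function name below is a hypothetical placeholder, not an actual Azure price or API. Check the pricing resources listed above for real figures.

```python
# Sketch: compare paying hourly for PTUs against a monthly reservation.
# All rates are hypothetical placeholders, NOT real Azure prices.


def hourly_cost(ptus: int, hours: float, hourly_rate_per_ptu: float) -> float:
    """Cost of running `ptus` PTUs for `hours` hours at an hourly rate."""
    return ptus * hours * hourly_rate_per_ptu


def reservation_cost(ptus: int, monthly_rate_per_ptu: float) -> float:
    """Cost of reserving `ptus` PTUs for one month (assumed flat per-PTU price)."""
    return ptus * monthly_rate_per_ptu


def break_even_hours(hourly_rate_per_ptu: float, monthly_rate_per_ptu: float) -> float:
    """Hours of use per month above which the reservation becomes cheaper."""
    if hourly_rate_per_ptu <= 0:
        raise ValueError("hourly_rate_per_ptu must be positive")
    return monthly_rate_per_ptu / hourly_rate_per_ptu


if __name__ == "__main__":
    # Hypothetical inputs chosen only to make the arithmetic easy to follow.
    ptus = 100
    hourly_rate = 2.0      # placeholder currency units per PTU per hour
    monthly_rate = 500.0   # placeholder currency units per PTU per month

    threshold = break_even_hours(hourly_rate, monthly_rate)
    print(f"Reservation is cheaper above {threshold} hours per month")
    print(f"Hourly cost for 10 hours: {hourly_cost(ptus, 10, hourly_rate)}")
    print(f"Reservation cost for one month: {reservation_cost(ptus, monthly_rate)}")
```

The number of PTUs cancels out of the break-even formula, so under these assumptions the decision depends only on how many hours per month the deployment runs.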
