Background:
There would be scenarios where Customers would require to find the Storage Account capacity with segregation of soft delete, Version, snapshot and Active data.
Calculate the size of a Blob storage Blob Inventory:
The Azure Storage blob inventory feature provides an overview of your containers, blobs, snapshots, and blob versions within a storage account. Use the inventory report to understand various attributes of blobs and containers such as your total data size, age, encryption status, immutability policy, and legal hold and so on. The report provides an overview of your data for business and compliance requirements.
For more details, please visit here : Azure Storage blob inventory | Microsoft Learn
We can make use of the Blob Inventory report and Azure Synapse Workspace to calculate Capacity of storage account to understand how much soft deleted capacity , Active data capacity and version capacity . This approach would be advisable if there is Soft Delete, versioning or snapshots enabled on the Storage Account. For this approach, we need to first enable the Blob Inventory report on the storage account and the next step would be to calculate the capacity using Azure Synapse Workspace.
Step 1: - Enable Inventory Report
The Add a rule page appears.
If you select Blob, then under Blob subtype, choose the types of blobs that you want to include in your report, and whether to include blob versions, snapshots, deleted blobs in your inventory report.
Inventory output :
Each inventory rule generates a set of files in the specified inventory destination container for that rule. The inventory output is generated under the following path: https://<accountName>.blob.core.windows.net/<inventory-destination-container>/YYYY/MM/DD/HH-MM-SS/<rulename. where:
Step 2:- Azure Synapse Workspace
Create an Azure Synapse workspace where you will execute a SQL query to report the inventory results.
After you create your Azure Synapse workspace, do the following steps.
Sample query to find the Soft deleted Data.
If you have single csv file give the complete path of the same and in case multiple csv files give the path till where CSV file is present and add *.csv instead of specific file name
select SUM("Content-Length") as ctl
from openrowset(
bulk 'https://Storageaccountname.blob.core.windows.net/Conatinername/*.csv',
format = 'csv',
parser_version = '2.0',
HEADER_ROW =TRUE
) with ("Name" Varchar(200) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Type" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Encoding" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Language" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-CRC64" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-MD5" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Cache-Control" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Content-Disposition" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, BlobType VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,AccessTier varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Snapshot" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, VersionId VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,IsCurrentVersion varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, AccessTierChangeTime varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Deleted" Varchar(100) COLLATE Latin1_General_100_BIN2_UTF8 ,"Content-Length" BigInt) as rows
where Deleted='true'
Sample query to find the active data
select SUM("Content-Length")
from openrowset(
bulk 'https://Storageaccountname.blob.core.windows.net/conatinername/*.csv',
format = 'csv',
parser_version = '2.0',
HEADER_ROW =TRUE
) with ("Name" Varchar(200) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Type" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Encoding" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Language" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-CRC64" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-MD5" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Cache-Control" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Content-Disposition" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, BlobType VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,AccessTier varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Snapshot" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, VersionId VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,IsCurrentVersion varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, AccessTierChangeTime varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Deleted" Varchar(100) COLLATE Latin1_General_100_BIN2_UTF8 ,"Content-Length" BigInt) as rows
where "Deleted" is null
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.