Azure Managed Lustre with Automatic Synchronisation to Azure BLOB Storage
Published Nov 30 2023 08:29 AM
Microsoft

Introduction

This blog post walks through how to set up an Azure Managed Lustre Filesystem (AMLFS) that will automatically synchronise to an Azure BLOB Storage container. The synchronisation is achieved using the Lustre HSM (Hierarchical Storage Management) interface combined with the Robinhood policy engine and a tool that reads the Lustre changelog and synchronises metadata with the archived storage. The lfsazsync repository on GitHub contains a Bicep template to deploy and set up a virtual machine for this purpose.

 

Disclaimer: The lfsazsync deployment is not a supported Microsoft product; you are responsible for the deployment and operation of the solution. AMLFS needs updates that must be requested by raising a Support Request through the Azure Portal. These updates could affect the stability of AMLFS, and customers requiring the same level of SLA should speak to their Microsoft representative.

Initial Deployment

The following is required before running the lfsazsync Bicep template:

  • Virtual Network
  • Azure BLOB Storage Account and container (HNS is not supported)
  • AMLFS deployed without HSM enabled

The lfsazsync repository contains a test/infra.bicep example to create the required resources:

 

[Image: lfsazsync-prerequisite.jpg]

 

To deploy, first create a resource group, e.g.

# TODO: set the variables below
resource_group=
location=
az group create --name "$resource_group" --location "$location"
 

Then deploy into this resource group:

az deployment group create --resource-group "$resource_group" --template-file test/infra.bicep
 

Note: The Bicep file has parameters for names, IP ranges, etc. that should be set if you do not want the default values.

 

Updating the AMLFS settings

Once deployment is complete, navigate to the Azure Portal, locate the AMLFS resource and click on "New Support Request". The following shows the suggested request to get AMLFS updated:

 

[Image: amlfs-support-request.jpg]

 

The lctl commands needed are listed here.

 

Deploying Azure BLOB Storage Synchronisation

The lfsazsync deployment sets up a single virtual machine for all tasks. The HSM copytools could be run on multiple virtual machines to increase transfer performance. The bandwidth for archiving and retrieval is constrained to approximately half the network bandwidth available to the virtual machine, since the same network is used both for accessing the Lustre filesystem and for accessing Azure Storage. Take this into account when choosing the virtual machine size. The virtual machine sizes and their expected network performance are available here.
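
As a rough sizing aid, the rule of thumb above can be turned into arithmetic. The sketch below assumes the copytool gets about half of the virtual machine's nominal network bandwidth and that nothing else limits the transfer. The function names and the example figures are illustrative only and are not part of lfsazsync.

```shell
# Rough sizing sketch: archive/restore bandwidth is assumed to be about
# half of the VM's nominal network bandwidth (the rule of thumb above).

# Usable HSM bandwidth in Mbps for a given VM network bandwidth in Mbps.
hsm_bandwidth_mbps() {
    echo $(( $1 / 2 ))
}

# Approximate seconds needed to move a number of (decimal) GB at that rate.
# 1 GB is taken as 8000 megabits; the result is rounded up.
hsm_transfer_seconds() {
    vm_mbps=$1
    size_gb=$2
    usable_mbps=$(hsm_bandwidth_mbps "$vm_mbps")
    echo $(( (size_gb * 8000 + usable_mbps - 1) / usable_mbps ))
}

# Example (hypothetical VM with a nominal 16000 Mbps, moving 1000 GB):
#   hsm_bandwidth_mbps 16000
#   hsm_transfer_seconds 16000 1000
```

Real throughput will also depend on file sizes, the storage account limits and the number of copytool instances, so treat the result as a lower bound on transfer time.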

 

The Bicep template has the following parameters:

 

Parameter               Description
subnet_id               The ID of the subnet to deploy the virtual machine to
vm_sku                  The SKU of the virtual machine to deploy
admin_user              The username of the administrator account
ssh_key                 The public key for the administrator account
lustre_mgs              The IP address/hostname of the Lustre MGS
storage_account_name    The name of the Azure storage account
storage_container_name  The container to use for synchronising the data
storage_account_key     A SAS key for the storage account
ssh_port                The port used by sshd on the virtual machine
github_release          Release tag from which Robinhood and Lemur will be downloaded
os                      The OS to use for the VM (options: ubuntu2004 or almalinux87)

 

The SAS key can be generated using the following Azure CLI command:

# TODO: set the account name and container name below
account_name=
container_name=

# The token is valid from now until one month from now (GNU date syntax)
start_date=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
expiry_date=$(date -u +"%Y-%m-%dT%H:%M:%SZ" --date "next month")

# Generate a container-level SAS with read, write, list and delete permissions
az storage container generate-sas \
   --account-name "$account_name" \
   --name "$container_name" \
   --permissions rwld \
   --start "$start_date" \
   --expiry "$expiry_date" \
   -o tsv
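
Because the SAS key above expires after one month, it can be useful to check when a given token runs out. The following is a minimal sketch that assumes the token is a standard SAS query string in which the se field holds the expiry time; the function name sas_expiry is hypothetical and not part of lfsazsync.

```shell
# Print the expiry time (the "se" field) of a SAS token, decoding the
# URL-encoded colons (%3A). Prints nothing if the token has no "se" field.
sas_expiry() {
    printf '%s\n' "$1" \
        | sed -e 's/^?//' \
        | tr '&' '\n' \
        | sed -n -e 's/^se=//p' \
        | sed -e 's/%3[Aa]/:/g'
}

# Example:
#   sas_expiry "$storage_sas_key"
```

Synchronisation to the container should be expected to stop working once this time has passed, so the key needs renewing before then.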
 

The following Azure CLI command can be used to get the subnet ID:

# TODO: set the variables below
resource_group=
vnet_name=
subnet_name=

az network vnet subnet show --resource-group "$resource_group" --vnet-name "$vnet_name" --name "$subnet_name" --query id --output tsv
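
A quick sanity check on the value returned can catch copy and paste mistakes before the deployment is started. This sketch only checks the general shape of an Azure subnet resource ID; the function name is_subnet_id is hypothetical.

```shell
# Succeed if the argument has the general shape of a subnet resource ID:
# /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/
#   virtualNetworks/<vnet>/subnets/<subnet>
is_subnet_id() {
    case "$1" in
        /subscriptions/*/resourceGroups/*/providers/Microsoft.Network/virtualNetworks/*/subnets/*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example:
#   is_subnet_id "$subnet_id" || echo "subnet_id does not look like a subnet resource ID" >&2
```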
 

The following Azure CLI command can be used to deploy the Bicep template (as an alternative to setting shell variables, the parameters could be set in a parameters.json file):

# TODO: set the variables below
resource_group=
subnet_id=
vmsku="Standard_D32ds_v4"
admin_user=
ssh_key=
lustre_mgs=
storage_account_name=
storage_container_name=
storage_sas_key=
ssh_port=
github_release="v1.0.1"
os="almalinux87"

az deployment group create \
    --resource-group $resource_group \
    --template-file lfsazsync.bicep \
    --parameters \
        subnet_id="$subnet_id" \
        vmsku=$vmsku \
        admin_user="$admin_user" \
        ssh_key="$ssh_key" \
        lustre_mgs=$lustre_mgs \
        storage_account_name=$storage_account_name \
        storage_container_name=$storage_container_name \
        storage_sas_key="$storage_sas_key" \
        ssh_port=$ssh_port \
        github_release=$github_release \
        os=$os
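
For the parameters.json alternative mentioned above, the sketch below writes a deployment parameters file from name=value pairs. It is an illustration under stated assumptions: the helper write_parameters_json is hypothetical, values made up only of digits are written as JSON numbers and everything else as strings, and quotes or backslashes inside values are not escaped.

```shell
# Write an ARM/Bicep deployment parameters file from name=value pairs.
# Usage: write_parameters_json OUT_FILE name=value [name=value ...]
write_parameters_json() {
    out_file=$1
    shift
    {
        printf '{\n'
        printf '  "$schema": "%s",\n' \
            'https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#'
        printf '  "contentVersion": "1.0.0.0",\n'
        printf '  "parameters": {\n'
        sep=''
        for pair in "$@"; do
            name=${pair%%=*}    # text before the first "="
            value=${pair#*=}    # text after the first "="
            case $value in
                ''|*[!0-9]*) json_value="\"$value\"" ;;  # write as a string
                *)           json_value=$value ;;         # digits only: number
            esac
            printf '%s    "%s": { "value": %s }' "$sep" "$name" "$json_value"
            sep=',
'
        done
        printf '\n  }\n}\n'
    } > "$out_file"
}

# Example, using the variables set above:
#   write_parameters_json parameters.json \
#       subnet_id="$subnet_id" vmsku="$vmsku" admin_user="$admin_user" \
#       ssh_key="$ssh_key" lustre_mgs="$lustre_mgs" \
#       storage_account_name="$storage_account_name" \
#       storage_container_name="$storage_container_name" \
#       storage_sas_key="$storage_sas_key" ssh_port="$ssh_port" \
#       github_release="$github_release" os="$os"
#   az deployment group create --resource-group "$resource_group" \
#       --template-file lfsazsync.bicep --parameters @parameters.json
```

The file contains the SAS key in clear text, so it should be kept out of source control and removed once the deployment is finished.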
 

After this call completes, the virtual machine will be deployed, although it will take more time to install the software and import the metadata from Azure BLOB storage into the Lustre filesystem. Progress can be monitored in the /var/log/cloud-init-output.log file on the virtual machine.
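
Rather than watching the log by hand, the check can be scripted on the virtual machine. This sketch assumes cloud-init writes its default final message ("Cloud-init v. <version> finished at ...") to the log; if that message has been customised, the pattern needs adjusting. The function names are hypothetical. Where available, cloud-init status --wait is an alternative.

```shell
# Succeed once the cloud-init output log contains the final "finished at" line.
cloud_init_finished() {
    grep -q 'Cloud-init v\..* finished at' \
        "${1:-/var/log/cloud-init-output.log}" 2>/dev/null
}

# Poll the log until the install has completed.
# Arguments: log file (optional), polling interval in seconds (optional).
wait_for_cloud_init() {
    log_file=${1:-/var/log/cloud-init-output.log}
    interval=${2:-30}
    until cloud_init_finished "$log_file"; do
        sleep "$interval"
    done
}

# Example:
#   wait_for_cloud_init && echo "install finished"
```

A finished cloud-init run does not by itself prove that the install succeeded, so the end of the log is still worth reading for errors.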

 

Monitoring

The install will set up three systemd services for lhsmd, robinhood and lustremetasync. The log files are located here:

  • 'lhsmd': /var/log/lhsmd.log
  • 'robinhood': /var/log/robinhood*.log
  • 'lustremetasync': /var/log/lustremetasync.log
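
A quick way to confirm that all three services are running is sketched below. It assumes the systemd unit names match the service names listed above (lhsmd, robinhood and lustremetasync); the function name check_sync_services is hypothetical.

```shell
# Report whether each of the three services is active.
# Returns non-zero if any of them is not active.
check_sync_services() {
    failed=0
    for svc in lhsmd robinhood lustremetasync; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            failed=1
        fi
    done
    return "$failed"
}

# Example:
#   check_sync_services || tail -n 50 /var/log/lhsmd.log
```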

 

Default archive settings

The synchronisation parameters can be controlled through the Robinhood config file, /opt/robinhood/etc/robinhood.d/lustre.conf. Below are some of the default settings and their locations in the config file:

 

Name                  Default                          Location
Archive interval      5 minutes                        lhsm_archive_parameters.lhsm_archive_trigger
Rate limit            1000 files                       lhsm_archive_parameters.rate_limit.max_count
Rate limit interval   10 seconds                       lhsm_archive_parameters.rate_limit.period_ms
Archive threshold     last modified time > 30 minutes  lhsm_archive_parameters.lhsm_archive_rules
Release trigger       85% of OST usage                 lhsm_archive_parameters.lhsm_release_trigger
Small file release    last access > 1 year             lhsm_archive_parameters.lhsm_release_rules
Default file release  last access > 1 day              lhsm_archive_parameters.lhsm_release_rules
File remove           removal time > 5 minutes         lhsmd.lhsmd_remove_rules
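
To get a feel for what the default rate limit means in practice, the sketch below works out a lower bound on the time needed to archive a backlog of files. It assumes the rate limit allows at most max_count archive actions per period_ms and that nothing else (bandwidth, copytool capacity, the 5 minute archive interval) slows things down. The function name is hypothetical.

```shell
# Minimum seconds to archive a number of files under a rate limit of
# max_count actions per period_ms (defaults: 1000 files per 10000 ms).
# Arguments: file count, max_count (optional), period_ms (optional).
min_archive_seconds() {
    files=$1
    max_count=${2:-1000}
    period_ms=${3:-10000}
    # Number of rate-limit periods needed, rounded up.
    periods=$(( (files + max_count - 1) / max_count ))
    echo $(( periods * period_ms / 1000 ))
}

# Example: one million files with the default settings
#   min_archive_seconds 1000000
```

With the defaults this is 100 files per second, so a backlog of one million files needs at least 10000 seconds, a little under three hours.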

 

To change these settings, edit the config file and then restart the robinhood service with systemctl restart robinhood.
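
Keeping a copy of the working config before editing makes it easy to roll back a bad change. This is a minimal sketch; the helper backup_config and the naming of the backup file are assumptions, not part of lfsazsync.

```shell
# Copy the Robinhood config to a timestamped backup next to the original
# and print the path of the backup.
backup_config() {
    conf=${1:-/opt/robinhood/etc/robinhood.d/lustre.conf}
    backup="$conf.$(date -u +%Y%m%dT%H%M%SZ).bak"
    cp -p "$conf" "$backup" && echo "$backup"
}

# Example (run as root on the virtual machine):
#   backup_config
#   vi /opt/robinhood/etc/robinhood.d/lustre.conf
#   systemctl restart robinhood
```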

The lustremetasync service processes the Lustre ChangeLog continuously, so actions are applied almost immediately; after a large burst of IO it may take a few minutes to catch up. The following operations are handled:

 

  • Create/delete directories

    Directories are created in BLOB storage as an empty object with the name of the directory. There is metadata on this object to indicate that it is a directory. The same object is deleted when the directory is removed on the filesystem.

  • Create/delete symbolic links

    Symbolic links are created in BLOB storage as an empty object with the name of the symbolic link. There is metadata on this object to indicate that it is a symbolic link, and this contains the path that it links to. The same object is deleted when the link is removed on the filesystem.

  • Moving files or directories

    Moving files or directories requires everything being moved to be restored to the Lustre filesystem. The files are then marked as dirty in their new location and the existing files are deleted from BLOB storage. Robinhood will handle archiving the files again in their new location.

  • Updating metadata (e.g. ownership and permissions)

    The metadata will only be updated directly for archived files that have not been modified. Modified files will have their metadata set when Robinhood updates the archived file.
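
To see what has been recorded for a directory, symbolic link or file, the metadata on the corresponding object can be inspected with the Azure CLI. This is a sketch only: the wrapper show_blob_metadata is hypothetical, and the metadata key names written by lfsazsync are not listed here, so the command simply prints whatever is stored.

```shell
# Print the metadata stored on the object for a directory, link or file.
# Arguments: storage account, container, blob path, SAS token.
show_blob_metadata() {
    az storage blob metadata show \
        --account-name "$1" \
        --container-name "$2" \
        --name "$3" \
        --sas-token "$4" \
        --output json
}

# Example:
#   show_blob_metadata "$storage_account_name" "$storage_container_name" \
#       "path/to/directory" "$storage_sas_key"
```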

 
