Deploy an Altair PBS Professional cluster in Azure CycleCloud
Published Jan 31 2023 03:15 PM

Across the Azure HPC landscape, Azure CycleCloud is the tool of choice when HPC workloads that require a scheduler such as OpenPBS, Slurm or IBM LSF are migrated to Azure, in either a full-cloud or a hybrid scenario.

 

The role of Azure CycleCloud is to act as a deployment tool and an orchestrator for HPC clusters with a specific scheduler. As of today, among the multiple schedulers for which there are active templates and projects, Altair® PBS Professional® still lacks an integration on the Microsoft Azure GitHub.

 

This article presents an integration for Altair® PBS Professional® built on top of the Azure CycleCloud PBS Pro project, and guides the user through the configuration of a cluster inside an Azure CycleCloud instance.

 

Creation of an Azure CycleCloud instance 

A prerequisite of this tutorial is the creation of an Azure CycleCloud instance. This can be achieved by following the tutorials in the official documentation, or the TechCommunity guide to deploying Azure CycleCloud using Infrastructure as Code and Bicep templates.

 

This tutorial has been tested on Azure CycleCloud 8.3 (8.3.0-3062). It is expected to work on versions close to this one as well.

 

Configuration of Azure CycleCloud CLI

After the creation of the Azure CycleCloud instance, the next step is the configuration of the Azure CycleCloud CLI.

The Azure CycleCloud CLI can be configured on any Windows or Linux system that can reach the Azure CycleCloud instance over HTTPS (port 443).
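
Before configuring the CLI on an external machine, it can help to confirm that port 443 on the instance is reachable. The following is a minimal sketch, not part of the CycleCloud tooling: the can_reach helper and the CYCLECLOUD_HOST variable are hypothetical names, and it assumes that bash and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the CycleCloud CLI): returns 0 when a TCP
# connection to host:port can be opened, non-zero otherwise.
can_reach() {
    local host="$1" port="${2:-443}"
    # /dev/tcp is a bash feature; timeout avoids hanging on filtered ports.
    timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# CYCLECLOUD_HOST is an assumed variable name; set it to your instance address.
CYCLECLOUD_HOST="${CYCLECLOUD_HOST:-localhost}"
if can_reach "$CYCLECLOUD_HOST" 443; then
    echo "Port 443 on ${CYCLECLOUD_HOST} is reachable"
else
    echo "Port 443 on ${CYCLECLOUD_HOST} is not reachable"
fi
```

This only verifies that the port accepts connections; it does not validate the SSL certificate or the CycleCloud credentials.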

 

The Azure CycleCloud CLI can be used to perform a number of operations on the Azure CycleCloud instance programmatically, without interacting with the user interface.

Moreover, it enables operations, such as importing cluster templates, that cannot be executed from the UI.

 

The quickest way to access the Azure CycleCloud CLI is to connect over SSH to the Azure CycleCloud instance where the Altair PBS Professional cluster will be deployed. The Azure CycleCloud VM image comes with the Azure CycleCloud CLI preinstalled.

 

To initialize the Azure CycleCloud CLI configuration, connect to the Azure CycleCloud VM over SSH using the parameters configured at VM creation time, then execute the following command:

 

 

cyclecloud initialize

 

 

The first prompt will be:

 

 

CycleServer URL: [http://localhost]

 

 

Since we are configuring the Azure CycleCloud CLI on the same VM where Azure CycleCloud is running, let's just confirm with “Enter”. If this operation is performed on an external machine, the correct URL should be provided at the prompt.

 

If you have not configured an SSL certificate signed by a trusted Certification Authority, you will be asked to trust the untrusted SSL certificate. This is acceptable for a testing environment, but configuring a trusted SSL certificate is recommended for a production Azure CycleCloud instance.

 

 

Detected untrusted certificate.  Allow?: [yes]

 

 

After confirming with Enter, you will be asked for the Azure CycleCloud username and password (the same ones used to log in to the Azure CycleCloud web UI). Type the username and press Enter:

 

 

Using https://localhost
CycleServer username: [azureuser] wolfgang.desalvador

 

 

And then password:

 

 

CycleServer password:

Generating CycleServer key...

Initial account already exists, skipping initial account creation.

CycleCloud configuration stored in /home/azureuser/.cycle/config.ini

 

 

If you get the output above, the Azure CycleCloud CLI has been properly configured.

You can try a simple command like:

 

 

cyclecloud show_cluster

 

 

This should list all the clusters (or none if you have not created one yet).
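
Whether the CLI has already been initialized can also be checked from a script. This is a minimal sketch, assuming the configuration path shown in the initialization output above (.cycle/config.ini under the user's home directory); the cli_is_configured function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the CLI configuration file exists under
# the given home directory (defaults to the current user's home).
cli_is_configured() {
    local home_dir="${1:-$HOME}"
    [ -f "${home_dir}/.cycle/config.ini" ]
}

if cli_is_configured; then
    echo "CycleCloud CLI configuration found"
else
    echo "No configuration found; run 'cyclecloud initialize' first"
fi
```

The check only looks for the file; it does not confirm that the stored credentials are still valid.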

 

Import of the updated Azure CycleCloud OpenPBS project code

To support the extended logic required in Azure CycleCloud for node provisioning through Jetpack, it is necessary to import the updated project code into the Azure CycleCloud server instance from the GitHub repository created as a fork to implement Altair PBS Professional support.

Note that this operation updates the project code in the Azure CycleCloud instance for all OpenPBS clusters created after this point in time.

To revert to the previous code base, repeat the same process using the URL of the official OpenPBS CycleCloud repository, with the git tag matching the version that originally shipped with the installed Azure CycleCloud version.

 

To import the updated code, connect to the Azure CycleCloud server over SSH and switch to the root user.

 

 

sudo su

 

 

Then, let's create a file named /opt/cycle_server/config/data/pbspro.txt with the following content:

 

 

AdType = "Cloud.Project"
Version = "2.0.17"
ProjectType = "Scheduler"
Url = "https://github.com/wolfgang-desalvador/cyclecloud-pbspro/releases/2.0.17"
AutoUpgrade = false
Name = "pbspro"
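
The file can also be written in one step from the shell. The sketch below is one possible way to do it, assuming a root shell on the CycleCloud server; the write_project_record function name is hypothetical, while the record content and the target directory are the ones given above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes the project record shown above into the
# given directory as pbspro.txt.
write_project_record() {
    local target_dir="$1"
    cat > "${target_dir}/pbspro.txt" <<'EOF'
AdType = "Cloud.Project"
Version = "2.0.17"
ProjectType = "Scheduler"
Url = "https://github.com/wolfgang-desalvador/cyclecloud-pbspro/releases/2.0.17"
AutoUpgrade = false
Name = "pbspro"
EOF
}

DATA_DIR="/opt/cycle_server/config/data"
if [ -d "$DATA_DIR" ] && [ -w "$DATA_DIR" ]; then
    write_project_record "$DATA_DIR"
else
    echo "Cannot write to ${DATA_DIR}; run this as root on the CycleCloud server"
fi
```

The quoted heredoc delimiter keeps the record verbatim, so the double quotes in the record are written exactly as shown.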

 

 

If the import has been successful, after a few seconds the file name should change, with the “imported” extension appended:

 

 

ls /opt/cycle_server/config/data/

marketplace_site_id.txt.imported  pbspro.txt.imported  settings.txt.imported  theme.txt.imported
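
Waiting for the import can be automated by polling for the renamed file. This is a minimal sketch; the wait_for_import function name and the 60-second default timeout are assumptions made for illustration, not values taken from CycleCloud.

```shell
#!/usr/bin/env bash
# Hypothetical helper: polls once per second until <dir>/<name>.imported
# exists, giving up after timeout_s seconds (default 60, an arbitrary choice).
wait_for_import() {
    local dir="$1" name="$2" timeout_s="${3:-60}" elapsed=0
    while [ "$elapsed" -lt "$timeout_s" ]; do
        [ -f "${dir}/${name}.imported" ] && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # One last check after the final sleep.
    [ -f "${dir}/${name}.imported" ]
}

DATA_DIR="/opt/cycle_server/config/data"
if [ -d "$DATA_DIR" ]; then
    if wait_for_import "$DATA_DIR" "pbspro.txt"; then
        echo "pbspro.txt has been imported"
    else
        echo "pbspro.txt was not imported within the timeout"
    fi
else
    echo "${DATA_DIR} not found; run this on the CycleCloud server"
fi
```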

 

 

 

Import of Altair PBS Professional cluster template

As a next step, it is necessary to import the Altair PBS Professional cluster template into the Azure CycleCloud instance.

This operation again requires the Azure CycleCloud CLI. We will run it from the Azure CycleCloud VM shell, after connecting over SSH.

 

As a first step, let’s download the project files from the GitHub repository created as a fork to implement Altair PBS Professional support.

 

 

cyclecloud project fetch https://github.com/wolfgang-desalvador/cyclecloud-pbspro/releases/2.0.17 altair-pbs-pro

 

 

After the download is complete, let's move to the folder containing the Altair PBS Professional cluster template to check that the correct version is set:

 

 

cd altair-pbs-pro/templates

 

 

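In the pbspro.txt file, let's locate the PBSVersion parameter definition and set the desired version in its entries. The version must match the one in the installer file names; for the 2022.1.1 installers used later in this article, for example, Value and DefaultValue would both be 2022.1.1.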

 

 

        [[[parameter PBSVersion]]]
        Label = Altair PBS Version
        Config.Plugin = pico.form.Dropdown
        Config.Entries := {[Label="Altair PBS Pro 2021.1"; Value="2021.1.4"]}
        DefaultValue = 2021.1.4

 

 

Label is purely cosmetic and only affects the UI; the important elements are Value and DefaultValue.
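
Since the version in the template must match the installer file names, a quick consistency check can catch a mismatch before the import. The following is a sketch under assumptions: the function names are hypothetical, the file-name pattern is inferred from the installer names listed later in this article, and the awk filter assumes that DefaultValue appears on its own line inside the PBSVersion parameter block, as in the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the DefaultValue of the PBSVersion parameter.
template_default_version() {
    awk '
        /\[\[\[parameter PBSVersion\]\]\]/ { in_block = 1; next }
        in_block && /\[\[\[/               { in_block = 0 }
        in_block && $1 == "DefaultValue"   { print $3; exit }
    ' "$1"
}

# Hypothetical helper: extracts the version from an installer file name such
# as pbspro-server-2022.1.1.el7.x86_64.rpm.
rpm_version() {
    basename "$1" | sed -E 's/^pbspro-[a-z]+-([0-9.]+)\.el[0-9]+\.x86_64\.rpm$/\1/'
}

# Example comparison, run from a folder holding the template and an installer.
RPM="pbspro-server-2022.1.1.el7.x86_64.rpm"
if [ -f pbspro.txt ] && [ -f "$RPM" ]; then
    if [ "$(template_default_version pbspro.txt)" = "$(rpm_version "$RPM")" ]; then
        echo "Template version matches the installer"
    else
        echo "Template version does not match the installer"
    fi
fi
```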

 

Let's proceed by importing the template with the Azure CycleCloud CLI:

 

 

cyclecloud import_template -f pbspro.txt

 

 

The result should look like the following:

 

 

Importing default template in pbspro.txt....
------------------------------------
Altair PBS Professional : *template*
------------------------------------
Resource group:
Cluster nodes:
    server: Off -- -- 
Total nodes: 1

 

 

Import of Altair PBS Professional binaries

As an additional step, to allow Jetpack to provision and configure nodes with the commercial Altair PBS Professional version, the installers should be made available in the CycleCloud project user blobs.

This step is necessary because the installers cannot be redistributed in any form with Azure CycleCloud, and must instead be uploaded by the licensee to the Blob Storage account.

 

You need to use your own Altair PBS Professional licenses and installers according to your Altair PBS Professional license agreement.

This example will use the 2022.1.1 version, which has been tested with the template.

 

The necessary Altair PBS Professional binaries are the following (the same applies to RHEL 8.x, using the .el8.x86_64.rpm packages):

 

 

pbspro-client-2022.1.1.el7.x86_64.rpm
pbspro-execution-2022.1.1.el7.x86_64.rpm
pbspro-server-2022.1.1.el7.x86_64.rpm

 

 

To copy the installers with azcopy to the Azure CycleCloud storage account, use the following commands (the same applies to RHEL 8.x, using the .el8.x86_64.rpm packages):

 

 

azcopy cp pbspro-client-2022.1.1.el7.x86_64.rpm https://<storage-account-name>.blob.core.windows.net/cyclecloud/blobs/pbspro
azcopy cp pbspro-execution-2022.1.1.el7.x86_64.rpm https://<storage-account-name>.blob.core.windows.net/cyclecloud/blobs/pbspro
azcopy cp pbspro-server-2022.1.1.el7.x86_64.rpm https://<storage-account-name>.blob.core.windows.net/cyclecloud/blobs/pbspro

 

 

A guide to configuring AzCopy can be found on Microsoft Learn, for both Windows and Linux. The commands must also handle authentication, which can be done through Azure AD or with SAS tokens.
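
The three uploads can also be expressed as a loop. The sketch below only prints the azcopy commands so that they can be reviewed before being run; the blob_url helper and the STORAGE_ACCOUNT and SAS_TOKEN variable names are hypothetical, and the destination includes the file name explicitly so that the target blob name is unambiguous (a choice made for this sketch).

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the destination blob URL for one installer.
blob_url() {
    local account="$1" file="$2"
    echo "https://${account}.blob.core.windows.net/cyclecloud/blobs/pbspro/${file}"
}

# Assumed variable names; replace the placeholder with your storage account.
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-<storage-account-name>}"
# Optional SAS token (without the leading '?'); leave empty when
# authenticating through Azure AD instead.
SAS_TOKEN="${SAS_TOKEN:-}"

for role in client execution server; do
    rpm="pbspro-${role}-2022.1.1.el7.x86_64.rpm"
    dest="$(blob_url "$STORAGE_ACCOUNT" "$rpm")"
    [ -n "$SAS_TOKEN" ] && dest="${dest}?${SAS_TOKEN}"
    # Dry run: print the command instead of executing it.
    echo azcopy cp "$rpm" "\"$dest\""
done
```

Quoting the destination matters when a SAS token is appended, because the token contains characters such as & that the shell would otherwise interpret.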

 

Alternatively, it is possible to upload the binaries using the Azure Portal.

Creation of an Altair PBS Professional cluster with the new template

Once all the steps above have been completed, Altair PBS Professional clusters can be created in the standard way from the Azure CycleCloud UI.

As a first step, let’s click on the “Create New Cluster” button:

 

1_a.png

 

Let’s select the new Template for Altair PBS Professional.

 

2_a.png

Cluster creation then follows the standard process, as for an OpenPBS cluster:

  • Selection of Cluster Name

3.png

  • Selection of VM types for server and execution nodes, autoscaling limits, spot configuration and the subnet where resources should be deployed

4.png

  • Definition of the NFS mounts (/shared and /sched folders) and additional NFS storage

5.png

In the Advanced settings tab, in addition to the usual settings of an OpenPBS cluster, there is one more field to be specified: the license server of Altair PBS Professional. The license server must be reachable on the designated port by all the nodes in the cluster (both server and execution nodes).

The Altair PBS Professional versions available in the drop-down menu are the ones configured at the time of template import.

 

6_a.png

Remember that the template can be updated with the following CycleCloud CLI command, but the creation wizard in the UI must then be restarted (reloading the web page is also suggested):

 

 

cyclecloud import_template -f pbspro.txt --force

 

 

After the cluster configuration is completed, we can now start the cluster:

 

8.png

 

Once the cluster is ready (the server node is running and its configuration is complete), a simple command can be run on the head node to verify the autoscaling capabilities:

 

 

qsub -I

 

 

This command requests an interactive PBS session, which triggers the provisioning of one node.

 

12.png

 

After a few minutes, the node is up and running and we get a shell on the execution node.
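
The same autoscaling check can be done without an interactive session by submitting a small batch job. This is a sketch under assumptions: the job name, the resource request and the make_test_job helper are illustrative choices, and the job is only submitted when qsub is found on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a minimal PBS job script that prints the name
# of the execution node it ran on.
make_test_job() {
    cat > "$1" <<'EOF'
#!/bin/bash
#PBS -N autoscale-test
#PBS -l select=1:ncpus=1
hostname
EOF
}

JOB_SCRIPT="$(mktemp)"
make_test_job "$JOB_SCRIPT"

if command -v qsub >/dev/null 2>&1; then
    # qsub prints the job identifier; qstat lists the job while it waits
    # for a node to be provisioned.
    qsub "$JOB_SCRIPT" && qstat
else
    echo "qsub not found; run this on the head node"
fi
```

When the job finishes, its output file should contain the host name of the execution node that was provisioned for it.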

 

15.png

 

Additional verifications

It is suggested to check the logs of PBS MoM (on the execution nodes) and of the PBS Server (on the head node) to make sure that the RPMs have been installed correctly and that all the OS dependencies have been found.

 

In our case, for example, on CentOS 7 the following warning is present in the PBS MoM service log. In such cases, check whether the missing packages can be installed on the target operating system, or whether a binary compiled against the appropriate dependencies for that operating system is available.

 

 

 

Jan 22 20:04:02 ip-0A02000A pbs_init.d[11470]: ###########################################################
Jan 22 20:04:02 ip-0A02000A pbs_init.d[11470]: WARNING: Unable to find OpenSSL version 1.1.1, hence python's ssl module will not work
Jan 22 20:04:02 ip-0A02000A pbs_init.d[11470]: ###########################################################
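
Warnings like this one can be filtered out of a larger log. This is a minimal sketch; the pbs_warnings function name is hypothetical, and the sample input is the excerpt above. On a real node the input would come from wherever the PBS service logs are collected on your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads log text on stdin and prints only the warning
# lines emitted by the PBS init script.
pbs_warnings() {
    grep 'pbs_init.d' | grep 'WARNING:'
}

# Demonstration on the log excerpt from this article.
pbs_warnings <<'EOF'
Jan 22 20:04:02 ip-0A02000A pbs_init.d[11470]: ###########################################################
Jan 22 20:04:02 ip-0A02000A pbs_init.d[11470]: WARNING: Unable to find OpenSSL version 1.1.1, hence python's ssl module will not work
Jan 22 20:04:02 ip-0A02000A pbs_init.d[11470]: ###########################################################
EOF
```

Run on the excerpt, the filter prints only the line containing the OpenSSL warning.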

 

 

Project status

The current GitHub repository is a fork of the main cyclecloud-pbspro project, and a Pull Request has been proposed.
If the changes are accepted, this template will be available out of the box in future versions of Azure CycleCloud.
The authors will try, on a best-effort basis and under the MIT license, to keep the fork synchronized with the main branch.

 

#AzureHPCAI #AzureHPC
