Choosing the Right Azure Spot VM for Rendering Workloads Using MoonRay
Published Feb 14 2024 08:00 AM
Microsoft

Authored by: Aimee Garcia, PM AI Benchmarking, Rick Shahid, Azure HPC + AI Media and Entertainment Solution Architect, Isayah Reed, Senior Software Engineer, Jon Weisner, Director, Global Black Belt – HPC + AI Media and Entertainment, Joe Greenseid, Principal Product Manager

 

MoonRay is an open-source Monte Carlo Ray Tracing (MCRT) renderer developed by DreamWorks Animation. It has been used in DreamWorks Animation’s feature films such as “How to Train Your Dragon: The Hidden World” and “Puss In Boots: The Last Wish” as well as future unreleased titles. MoonRay delivers a broad range of images while maintaining a focus on computational efficiency, with features such as distributed rendering, a variety of filters and layer-able materials, and vectorization capabilities to optimize performance and the quality of rendered frames. It also includes a robust library of shaders and other features critical to production film development (more information about MoonRay is available here).

 

While rendering engines may run on CPUs, GPUs, or both, major studios in the filmmaking industry most commonly employ CPU-based infrastructure for a variety of reasons, including cost-effectiveness, return on investment (ROI), and technical requirements. In turn, in-studio software developers and ISVs focus on CPU performance and support. For more on the differences between CPU-based and GPU-based rendering, there are many articles available.

 

This blog aims to quantify the performance and cost of running MoonRay with a variety of scenes on multiple generations of Azure’s H-series family of HPC-focused virtual machines.

 

Scenes

 

We benchmarked three distinct scenes: Kitchen, Bathroom, and ALab, each of which is freely available from the OpenMoonRay GitHub repository. The Kitchen and Bathroom scenes were created by artists Jay-Artist and Mareck, respectively, and were subsequently curated by Benedikt Bitterli. From a computational standpoint, the Kitchen and Bathroom scenes are similar. Both feature ample indirect light through large windows and primarily consist of elements like countertops and wood.

 

ALab is a comprehensive production scene crafted by Animal Logic and designed to be shared with the broader community for various purposes including demonstrations, training resources, and assessing Universal Scene Description (USD) compatibility across different software and pipelines. ALab offers over 300 assets, featuring high-quality textures and two characters with looping animations seamlessly integrated into shot contexts.

 

MoonRay uses its own scene format called RDL2, available in text and binary versions. The choice between the two has minimal impact on computation time, with no significant differences in rendering efficiency. Notably, ALab is several hundred times larger than Kitchen and Bathroom.

 

MoonRay Table1.png

 

Optimizing Workloads with Azure HPC VMs

 

Azure HPC H-series VMs are designed to meet the unique demands of HPC applications, providing large CPU core counts, the highest available levels of memory bandwidth, and InfiniBand networking for scaling MPI workloads to extremely high levels of performance while maintaining cost efficiency. Below are summary specifications of three generations of H-series VMs, each utilized in this analysis.

 

MoonRay Table2.png

 

Performance Analysis of Azure HPC VMs

 

We created a custom script to establish a consistent environment, including the download of the Kitchen, Bathroom and ALab scenes, in preparation for rendering. All hardware threads from each VM were utilized as there is no cost penalty to doing so (MoonRay is a freely available renderer with no per-core licensing that might limit a customer’s ability to pragmatically utilize all available hardware threads). We rendered the three scenes ten times each and averaged the results for comparative purposes. The repetition of the rendering process also helps to identify any inconsistency of performance results so we can be highly confident the results shown here are what Azure customers would experience if running these tests themselves. All benchmarks on HBv2, HBv3 and HBv4 utilized CentOS Linux.
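The repeat-and-average procedure described above can be sketched as follows. This is a minimal illustration, not the script used for this benchmark: the function name `summarize_runs` and the sample timings are hypothetical, and the coefficient of variation is only one possible way to flag inconsistent runs.

```python
from statistics import mean, stdev

def summarize_runs(times_s):
    """Return mean, sample standard deviation, and coefficient of variation
    for a list of wall-clock render times in seconds."""
    if len(times_s) < 2:
        raise ValueError("need at least two runs to estimate variability")
    avg = mean(times_s)
    sd = stdev(times_s)
    return {"mean_s": avg, "stdev_s": sd, "cv": sd / avg}

# Hypothetical timings for ten renders of one scene on one VM (not measured data).
example_times = [101.2, 99.8, 100.5, 100.1, 99.9, 100.7, 100.3, 99.6, 100.9, 100.0]
summary = summarize_runs(example_times)
print(f"mean={summary['mean_s']:.1f}s  cv={summary['cv']:.2%}")
```

A small coefficient of variation across the ten runs suggests the average is representative; a large one would be a reason to investigate before comparing VMs.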

 

MoonRay Figure1.png

MoonRay Table3.png

 

Across all three scenes, HBv4 consistently rendered faster than the older VMs: approximately 35% faster than HBv2 and 24% faster than HBv3 in the Kitchen scene, around 34% and 27% in the Bathroom scene, and roughly 53% and 41% in ALab.
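For readers reproducing these comparisons, note that a percentage improvement can be computed in two common ways. The sketch below shows both; it assumes nothing about which definition was used for the figures above, and the timings are hypothetical placeholders rather than the measured results.

```python
def time_reduction_pct(t_old_s, t_new_s):
    """Percentage reduction in render time when moving from the old VM to the new one."""
    return (t_old_s - t_new_s) / t_old_s * 100.0

def throughput_gain_pct(t_old_s, t_new_s):
    """Percentage increase in renders completed per unit time (the alternative reading)."""
    return (t_old_s / t_new_s - 1.0) * 100.0

# Hypothetical average render times in seconds (not the measured results).
t_old, t_new = 200.0, 130.0
print(f"time reduction:  {time_reduction_pct(t_old, t_new):.0f}%")
print(f"throughput gain: {throughput_gain_pct(t_old, t_new):.0f}%")
```

The same pair of timings yields 35% under the first definition and about 54% under the second, so it is worth stating which one is meant when quoting a speedup.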

 

HBv4 consistently outperforming both HBv2 and HBv3 is not surprising, given that HBv4 has the newest processor with a higher level of instructions per clock, higher frequencies across all cores, and more hardware threads per CPU. We also found HBv3 performed approximately 5-10% better than HBv2 due to the higher IPC of Zen3 cores over Zen2.

 

For a customer looking to optimize for performance alone, HBv4 is the clear choice. Most customers, however, do not consider performance alone. Cost/performance is another major driver.

 

Cost/performance Analysis of Azure HPC VMs as of Early 2024

 

For our cost/performance analysis, we based our calculations on the hourly Spot pricing of each VM. Opting for Spot pricing addresses a common cost concern among Azure rendering customers, providing discounted pricing without sacrificing performance. Spot pricing is dynamic and can change over time based on the availability of capacity. With HBv4 being our newest HPC VM, there is often less free capacity, so its Spot discount will be smaller than that of more established VMs with bigger global deployments, such as HBv2 or HBv3. That means that Spot pricing analyses are a “moment in time” result that can and will change. Azure’s rendering customers are eager to see whether the newly launched HBv4 instances are worth migrating to from HBv2 or HBv3 VMs. This analysis aims to answer that question as of late 2023/early 2024, with the understanding that HBv4 capacity is currently more constrained than that of HBv2 or HBv3; we expect this to change over time in response to usage demand and the evolution of Azure’s global deployment of HBv4.

 

As a workload, rendering is “embarrassingly parallel,” thus making the possibility of eviction of one or more nodes from a multi-node rendering job much less impactful than a tightly coupled simulation such as CFD. In a tightly coupled simulation, the untimely eviction of just one node will cause the entire simulation to abort, wasting money spent on computation on all nodes in the job to that point (at the very least since the last checkpoint). For an embarrassingly parallel application, however, the eviction of one node doesn’t affect the overall job, thus minimizing the impact of the untimely eviction.
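The reason evictions are cheap for frame-parallel rendering can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the function name, the fixed eviction probability, and the idea that an eviction loses exactly one in-progress frame are all hypothetical, and a real render farm would rely on a scheduler or queue service rather than an in-memory queue.

```python
from collections import deque
import random

def render_with_evictions(frames, eviction_prob, seed=0):
    """Simulate a frame queue in which each render attempt may be lost to an eviction.

    A lost frame is put back on the queue; no other frame is affected.
    Returns (completed_frames, total_attempts).
    """
    if not 0.0 <= eviction_prob < 1.0:
        raise ValueError("eviction_prob must be in [0, 1)")
    rng = random.Random(seed)
    queue = deque(frames)
    completed, attempts = [], 0
    while queue:
        frame = queue.popleft()
        attempts += 1
        if rng.random() < eviction_prob:
            queue.append(frame)  # node evicted mid-render: retry this frame later
        else:
            completed.append(frame)
    return completed, attempts

done, attempts = render_with_evictions(range(1, 101), eviction_prob=0.1)
print(f"{len(done)} frames completed in {attempts} attempts")
```

Every frame eventually completes, and the only waste is the retried attempts. In a tightly coupled job, by contrast, a single eviction would invalidate the work of every node since the last checkpoint.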

 

Rendering customers commonly perform this tradeoff analysis and often find that the lower pricing of Spot machines and the overall TCO upside outweigh the impact of periodically restarting an evicted node’s render task. To conduct this analysis, we converted the render time on each VM to hours, multiplied it by the hourly Spot price, and compared the results for HBv2, HBv3, and HBv4. Because “cost/performance” measures the cost of a normalized amount of work, a lower value means a more cost-performant platform; in other words, less money was spent rendering the scene.
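As a rough sketch of this calculation, the cost of one render is the render time in hours multiplied by the hourly VM price. The VM labels, render times, and prices below are hypothetical placeholders, not the data behind this analysis.

```python
def cost_per_render(render_time_s, spot_price_per_hour):
    """Cost of one render: wall-clock time converted to hours, times the hourly VM price."""
    return render_time_s / 3600.0 * spot_price_per_hour

# Hypothetical render times (seconds) and Spot prices (USD per hour).
vms = {
    "VM A": {"render_time_s": 200.0, "spot_price_per_hour": 0.40},
    "VM B": {"render_time_s": 130.0, "spot_price_per_hour": 2.00},
}
costs = {name: cost_per_render(**spec) for name, spec in vms.items()}
for name, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${cost:.4f} per render")
```

In this made-up example the faster VM finishes sooner but still costs more per render, because its hourly price is proportionally much higher than its speed advantage.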

 

MoonRay Figure2.png

 

In our cost/performance analysis, HBv4 was consistently more expensive than both HBv2 and HBv3 for completing the rendering tests, by a factor of approximately 4-6, while HBv3 was approximately 5-8% more cost effective than HBv2 across the three rendered scenes. Again, bear in mind that this comparison is largely a function of Spot pricing as of October 2023 and of the ability of this particular class of customer and workload to be resilient to the tradeoffs involved with Spot evictions.
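Because Spot prices change, it can help to know the price premium at which a faster VM stops being worthwhile. Taking the cost of a render as render time multiplied by hourly price, the faster VM costs no more per render as long as its price ratio to the older VM does not exceed the ratio of render times. The sketch below encodes that relationship; the function names and numbers are illustrative assumptions, not figures from this analysis.

```python
def breakeven_price_ratio(t_old_s, t_new_s):
    """Largest new/old hourly price ratio at which the faster VM costs no more per render.

    Derived from t_new * p_new <= t_old * p_old, i.e. p_new / p_old <= t_old / t_new.
    """
    return t_old_s / t_new_s

def relative_cost(t_old_s, t_new_s, price_old, price_new):
    """Cost per render on the new VM divided by cost per render on the old VM."""
    return (t_new_s * price_new) / (t_old_s * price_old)

# Hypothetical render times (seconds) and Spot prices (USD per hour).
ratio = breakeven_price_ratio(200.0, 130.0)
rel = relative_cost(200.0, 130.0, price_old=0.40, price_new=2.00)
print(f"break-even price ratio: {ratio:.2f}x")
print(f"relative cost per render: {rel:.2f}x")
```

With these placeholder numbers, the newer VM would need to be priced at no more than about 1.54 times the older one to break even, so a 5x price gap leaves it 3.25 times more expensive per render.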

 

For many rendering customers, however, the cost/performance data is clear. If they can utilize Spot virtual machines, the abundantly available HBv2 and HBv3 VMs currently offer an economic upside relative to the brand new and heavily demanded HBv4 VMs.

 

Conclusion

 

During the benchmarking process, HBv4 consistently outperformed HBv2 and HBv3 across the three different scenes. For rendering workloads run on Spot virtual machines, however, the greater Q4 2023 Spot discounts on HBv2 and HBv3 give them better cost/performance. This shows the importance of recognizing the full context in which a customer operates, including the tradeoffs that can be made when a workload can tolerate them.

 

