Why Conductor Chose HashiCorp Nomad over Kubernetes for VFX
Running special effects rendering jobs in the cloud is no easy task, but it's one that Conductor has figured out how to do, without relying on Kubernetes.
Visual effects (VFX) rendering is among the most intensive compute tasks, as it often requires large numbers of clustered systems with high-powered GPUs.
Until fairly recently, the power to render visual effects was largely relegated to costly on-premises deployments, but that's no longer the case. Founded in 2017, Oakland, California-based Conductor Technologies has built out a cloud platform that enables movie makers to render visual effects as a service in a usage-based model.
In a session at the HashiConf Europe virtual conference that took place earlier this month, Conductor's Lead Software Engineer Jonathan Cross and Senior DevOps Engineer Carlos Robles outlined how their company built out its cloud VFX rendering platform.
Conductor uses different public cloud platforms, including Google and Amazon Web Services, and started out using each platform's managed services to help enable workload orchestration. On Google, that meant using Kubernetes with the Google Kubernetes Engine (GKE). What Conductor discovered over time and through experience was that running its own instances of the open-source HashiCorp Nomad, which is a rival to Kubernetes, was a better, more performant approach.
How Conductor Orchestrates Cloud VFX Workloads
Conductor's platform has been used on big-name Hollywood movies and productions, including Blade Runner 2049, Deadpool, Game of Thrones, Hellboy and Stranger Things, among others, Robles said.
Much like how modern DevOps teams iterate releases by pushing new code to a branch and then merging that code, Robles said VFX designers push new bits of a special effects sequence incrementally.
"Conductor can be thought of as a continuous deployment platform for VFX workloads," he said. "We automate the process of converting assets and dependencies into a finalized render output in much the same way that a CI/CD [continuous integration/continuous deployment] processor is going to convert a code base and its dependencies into a built-in packaged app that can be deployed."
Conductor provides movie makers with scalable and secure VFX platforms where cost is based on the amount of compute time and services used, according to Robles. With a time-based system, having faster rendering is critical.
"Because we charge on demand and by the minute, we don't want to spend any time or money on compute unless it's actively rendering," he said.
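To make that point concrete, here is a minimal sketch of how node startup time eats into paid compute time. The function name and all of the numbers are hypothetical and chosen only for illustration; they are not Conductor's figures.

```python
def overhead_fraction(startup_seconds: float, render_seconds: float) -> float:
    """Share of a node's paid compute time that is NOT spent rendering."""
    total = startup_seconds + render_seconds
    if total <= 0:
        raise ValueError("total time must be positive")
    return startup_seconds / total


# Hypothetical example: the same 10-minute render on a node that takes
# 2 minutes to start versus a node that takes 45 seconds to start.
slow_start = overhead_fraction(startup_seconds=120, render_seconds=600)
fast_start = overhead_fraction(startup_seconds=45, render_seconds=600)

print(f"slow start: {slow_start:.1%} of paid time is overhead")
print(f"fast start: {fast_start:.1%} of paid time is overhead")
```

The shorter the render job, the larger the share of the bill that startup time represents, which is why startup speed matters more for short batch jobs than for long-running services.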
Making Movie Magic with HashiCorp Nomad
Conductor doesn't have a large team of engineers, and as such Robles said there was a requirement to have a back-end system that can be highly automated in a repeatable manner.
The original idea was to have a managed service where the orchestration is handled by the cloud provider as a way to make things easier for Robles and his team. That didn't happen because the Conductor team discovered that GKE was better optimized for long-running, predictable workloads than for the unpredictable batch workloads that Conductor was submitting. Conductor also didn't have the visibility it needed to auto-scale as demand grows, he added.
That led Conductor to build out its own Nomad-based system to orchestrate rendering workloads. The configuration of the system is driven by an infrastructure-as-code policy from the HashiCorp Terraform system. With Terraform, Conductor is able to define how it wants its Nomad nodes deployed and configured.
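For readers unfamiliar with Nomad, the sketch below shows roughly what a batch-type job looks like in the JSON form that Nomad's HTTP API accepts. It is an illustration only and not Conductor's configuration: the helper name, job ID, container image, command-line arguments, datacenter name, and resource figures are all assumptions, as is the use of the Docker task driver. The script only builds and prints the job document; it does not contact a cluster.

```python
import json


def render_batch_job(job_id: str, image: str, start_frame: int, end_frame: int) -> dict:
    """Build a Nomad job document of type "batch" for one frame range.

    Every value passed in or hard-coded here is a hypothetical placeholder.
    """
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            # "batch" jobs run to completion, unlike long-running "service" jobs.
            "Type": "batch",
            "Datacenters": ["dc1"],  # assumed datacenter name
            "TaskGroups": [
                {
                    "Name": "render",
                    "Count": 1,
                    "Tasks": [
                        {
                            "Name": "render-frames",
                            "Driver": "docker",  # assumption: Docker task driver
                            "Config": {
                                "image": image,
                                "args": [
                                    "--start", str(start_frame),
                                    "--end", str(end_frame),
                                ],
                            },
                            # CPU is in MHz, MemoryMB in megabytes.
                            "Resources": {"CPU": 4000, "MemoryMB": 8192},
                        }
                    ],
                }
            ],
        }
    }


job = render_batch_job("render-shot-010", "example/renderer:latest", 1, 24)
payload = json.dumps(job, indent=2)
print(payload)
```

A document like this could be submitted to a Nomad server's job registration endpoint. Conductor did not describe how it submits work to its cluster in the session as reported here.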
To get a quantitative idea of how much better it is to run rendering jobs through HashiCorp Nomad as opposed to Kubernetes, Conductor's Cross conducted a basic benchmarking evaluation.
On startup time, Cross said Nomad had a 63% advantage over GKE. He attributed the Nomad speed win to the fact that Conductor was able to highly customize the image, removing any unnecessary steps to optimize startup time.
"We went from zero to production in a month and migrated most of our customers to Nomad," Cross said. "So, for those who think that having to own or operate a cluster will be a time sink, we found Nomad as easy to get as going with managed services, but with much more flexibility."