Rackspace Calls on Software Devs to Break CPU Performance Barriers
The days of getting software improvements by the brute force performance improvement of a generic CPU are slowing down drastically.
April 15, 2016
Various laws of physics have evidently conspired to prevent processor builders from carrying out their performance-doubling strategies indefinitely. Two months ago, Intel began a campaign of letting down the server manufacturing world gently. At the annual ISSCC conference in San Francisco, Intel Executive Vice President Bill Holt admitted to attendees that, beyond the 7 nm lithography process, engineers would need to resort to out-and-out quantum physics extrapolations if they expect to maintain the same performance improvements as when Moore’s Law was moving swimmingly along.
The OpenPOWER Foundation brings together industry engineers from Google, Micron Technology, NVIDIA, Samsung, and of course, IBM — whose 24-core, 14 nm Power9 processors are on track for H2 2017. Even put together, these organizations may be as short on quantum physicists as Intel. So in a presentation for the annual OpenPOWER summit last week, Rackspace Distinguished Engineer Aaron Sullivan made a suggestion that would have sounded outlandish, if the alternative on the table didn’t involve striking some kind of a bargain with quarks.
Stated simply, Sullivan suggested that the consortium seek out Python developers to work together to resolve the performance barricade issues in their spare time.
A Whole New Drawing Board
“Today, developers are really not where we need them. We’ve done this to ourselves in the industry with things like Java and Python,” said Sullivan. Showing a diagram depicting the relative disparity between digital logic (the raw programming that determines the schematics of integrated circuits) and scripting (the types of jobs that produce Web pages), he continued, “today, most of our developers live in the world of scripting and very abstract languages.”
It was evidence for a much broader case. Many are coming to believe that, as the performance requirements of data centers continue to scale linearly as though physics were not standing in the way, engineers’ attention should shift from “cramming” more components onto integrated circuits (to borrow Gordon Moore’s phrase) to improving the performance of the workloads those ICs process.
It’s a case that’s been made in the pages of Datacenter Knowledge before. Wrote AppliedMicro Vice President John Williams last November, “The future of the data center is a broad set of solutions using cost-effective, energy-efficient processors.” One of those solutions, Williams wrote, could be a workload accelerator — a class of hardware designed specifically to improve the performance of software.
That’s exactly what Rackspace’s Sullivan is alluding to. IBM is currently producing what it calls a Coherent Accelerator Processor Interface for its Power8 processors. Much as GPUs are leveraged for highly parallel math operations, CAPI enables highly parallel operations to be offloaded from the CPU to an FPGA, which is programmable in a way that’s a bit more similar to programming software than architecting integrated circuits.
Experience Bottlenecks
A bit more similar, but not much, said Kurt Marko, veteran consultant and analyst with MarkoInsights, in an interview with Datacenter Knowledge. While some software devs may understand the basics of how instructions are translated into Boolean logic and down into microscopic gates, their experience would not translate to an ability to design FPGAs.
“He (Sullivan) was talking about how CPU architectures enable hardware accelerators for specific applications — GPUs being one of them, but FPGAs being the notable, other category,” explained Marko, who was in attendance Wednesday. “Where he was going with that Python developer comment was: When you develop FPGAs, they’ve got their own design language. . . Until you can make that hardware customization easy enough for Python developers, it’s going to have limited appeal.”
OpenPOWER competitor Intel has been thinking along much the same lines as CAPI, as was made evident by its acquisition last year of FPGA producer Altera. What CPU makers are realizing is that the overall performance of servers can still be perceived to increase by the people who run workloads on those servers. The problem at hand is how to shift the burden, whether slightly or significantly, from IC engineers to IT professionals. (The path Marko foresees to making this shift happen is the subject of his blog post for Diginomica published Thursday.)
As Sullivan explained to the OpenPOWER Summit, the education and experience necessary for a person to become a traditional hardware engineer simply won’t scale to the same breadth as that of a traditional software developer. And only a small subset of software devs are working with the lower elements in the hardware stack, at the operating system and driver level.
“We have very clever developers, and we love what they do because they make our lives more convenient and more interesting,” said Sullivan. “And those entrepreneurs are Python developers. So what if your world mostly consists of this sort (Python devs) and you want to retool?”
Road Closed Ahead
As Marko confirmed for us, Sullivan wasn’t trying to draw any correlations between FPGAs and Python. Rather, his point was to illustrate that scripters constituted the bulk of the mindset for today’s devs, even though he made that point in what Marko called a “mirthful” way. It’s those devs to whom FPGAs will somehow need to appeal, in order to acquire the depth of contribution necessary to make workload performance improve the way it had been at the dawn of the multicore era.
“The industry needs to have better design tools for building customizable hardware circuits that implement specific software features in hardware,” said Marko. “The thing that I’m seeing — from Google, IBM, OpenPOWER, and Intel too — is that Moore’s Law is wheezing. It’s not keeping up. The days of just getting software improvements just by brute force performance improvement of a generic CPU, are slowing down drastically.
“The way to keep up with that historical performance curve,” he continued, “isn’t to try and just wring out faster, general-purpose CPUs. It’s to apply the hardware to more specific problem domains, so that you’re actually accelerating specific functions that are time-consuming in software.”
In closing his Wednesday speech, Aaron Sullivan recalled that every successful effort at bringing new developers to the table for Linux improvements has come through the distribution of more productive tools for open source efforts. But even these were served up with an extra helping of evangelism.
“You want to get to the moon? You’ve got to bring a lot of people from different disciplines together; you give them a big, crazy challenge; you tell them how awesome it would be if we got there. And it’s not just the shuttle you build; you’ve got to build the platform underneath it as well.”