Making Software 10x Faster with Low-Level CPU Optimizations

ITPro Today

September 14, 2015

Speaker: Sasha Goldshtein

Modern processors are extremely complex. Writing fast code means not only avoiding slow APIs but also taking advantage of every last bit of performance the processor has to offer. In this session we'll review some key performance wins you can get from modern processors by properly using instruction-level parallelism, vectorizing loops, avoiding store-to-load forwarding stalls, making better use of the CPU cache, and employing other low-level optimizations that a regular profiler won't tell you about. To improve performance methodically, without having to guess where the bottleneck lies, we'll use Intel VTune Amplifier, a low-level performance profiler that offers detailed insight into how code behaves on the CPU.
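As an illustration of what "properly using instruction-level parallelism" can mean, here is a minimal C sketch. It is not taken from the session; the function names `sum_single` and `sum_four` are hypothetical. It sums an array once with a single accumulator and once with four independent accumulators, so that the additions in one iteration do not have to wait on each other.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: each addition depends on the result of the previous one,
   so the loop forms one long dependency chain. */
double sum_single(const double *a, size_t n) {
    assert(a != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Four independent accumulators: the four additions in each iteration
   have no dependencies on one another, which gives an out-of-order CPU
   the opportunity to overlap them. */
double sum_four(const double *a, size_t n) {
    assert(a != NULL || n == 0);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Handle the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        s0 += a[i];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Because floating-point addition is not associative, the two functions can differ in the last bits of the result on general input, which is also why compilers typically do not apply this transformation on their own without relaxed floating-point flags. Whether the four-accumulator version is actually faster depends on the processor, the compiler, and the data size, so it should be measured rather than assumed.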
