Nvidia updates programming tools to speed up GPU performance
Nvidia has updated its software development tools to make it easier to write programs that run faster across CPUs and graphics processors.
The company on Thursday announced CUDA 6, which will make programming easier for supercomputers, servers, PCs and, to a lesser extent, smartphones. The goal of CUDA is to provide underlying tools so programmers can off-load processing from CPUs to GPUs, which is faster for technical and graphics applications.
CUDA 6 offers unified memory, an advanced management feature that makes GPU memory as readily accessible as CPU memory. Previously, data had to be copied from CPU memory to GPU memory for execution and then copied back, creating two data pools. With unified memory, a developer doesn’t have to track where data resides to take advantage of the GPU.
The memory management feature will figure out whether to dispatch data to CPU or GPU memory, and that will reduce the need for programmers to add lines of code to define where the data should be sent, said Sumit Gupta, general manager of Tesla Accelerated Computing products at Nvidia.
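The difference Gupta describes can be sketched in a few lines of CUDA C++. The example below is a minimal illustration, not Nvidia's own sample code: the kernel `scale` and the two helper functions are hypothetical names chosen for this sketch, while `cudaMalloc`, `cudaMemcpy` and `cudaMallocManaged` are the CUDA runtime calls involved. The first helper uses the older two-pool approach with explicit copies; the second allocates one managed buffer that both the CPU and the GPU touch through the same pointer.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Multiply every element of data by factor, one thread per element.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Older style (hypothetical helper): separate CPU and GPU buffers,
// with explicit copies in both directions.
bool scale_with_explicit_copies(int n, float factor) {
    size_t bytes = n * sizeof(float);
    float *host = (float *)malloc(bytes);
    if (!host) return false;
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    float *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) { free(host); return false; }
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);   // CPU -> GPU
    scale<<<(n + 255) / 256, 256>>>(dev, n, factor);
    // This copy also waits for the kernel to finish.
    bool ok = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    for (int i = 0; ok && i < n; ++i) ok = (host[i] == factor);
    cudaFree(dev);
    free(host);
    return ok;
}

// Unified memory style (hypothetical helper): one managed allocation,
// no cudaMemcpy calls in the program.
bool scale_with_unified_memory(int n, float factor) {
    float *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) return false;
    for (int i = 0; i < n; ++i) data[i] = 1.0f;             // written by the CPU

    scale<<<(n + 255) / 256, 256>>>(data, n, factor);       // updated by the GPU
    // The CPU must wait for the kernel before reading the managed buffer.
    bool ok = cudaDeviceSynchronize() == cudaSuccess;

    for (int i = 0; ok && i < n; ++i) ok = (data[i] == factor);
    cudaFree(data);
    return ok;
}

int main() {
    const int n = 1 << 20;
    printf("explicit copies: %s\n", scale_with_explicit_copies(n, 2.0f) ? "ok" : "failed");
    printf("unified memory:  %s\n", scale_with_unified_memory(n, 2.0f) ? "ok" : "failed");
    return 0;
}
```

Building this with `nvcc` requires CUDA 6 or later and a GPU that supports managed memory. The managed version still needs a synchronization call before the CPU reads the results, so the feature removes the copy bookkeeping rather than every coordination step.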
The developer doesn’t have to worry about GPU and CPU memory because data placement will be handled by directives in CUDA rather than by the programmer’s own code, said Dan Olds, principal analyst at Gabriel Consulting Group.
“This addition to CUDA is evolutionary, not revolutionary,” Olds said.
For example, GPUs are often used in servers for desktop virtualization, and automating memory placement could speed up the deployment of OS instances via virtual machines, Olds said.
CUDA’s unified memory feature is similar to the HSA (Heterogeneous System Architecture) Foundation’s hUMA (heterogeneous Uniform Memory Access) specification, in which different memory types in a system are shared between processors. The specification allows programmers to write applications without worrying about which memory resource the code is transferred to. Nvidia is not a member of the HSA Foundation, which was founded last year by companies including Advanced Micro Devices, ARM and Qualcomm.
The feature also falls in line with Nvidia’s goal to make CPU and GPU memory a shared resource at the hardware level. Nvidia’s upcoming Tegra 6 mobile processor, code-named Parker, will pool CPU and GPU memory in devices, servers and PCs to expand the addressable memory available to programs. Currently, GPU and CPU memory are divided, though the unified memory feature in CUDA 6 should ease that problem at the software layer until the hardware improvements arrive.
Some of the world’s fastest supercomputers use GPUs as co-processors to speed up computing. The last CUDA update, version 5.5, added support for ARM CPUs. CUDA 6 also has improved libraries that could speed up calculations on graphics processors.