The International Conference for High Performance Computing, Networking, Storage and Analysis
local_malloc(): malloc() for OpenCL __local memory.
Student: John Kloosterman (Calvin College)
Supervisor: Joel Adams (Calvin College)
Abstract: One of the complexities of writing OpenCL kernels is managing the scarce per-workgroup __local memory on a device. For instance, algorithms such as non-destructive parallel reduction need temporary blocks of __local memory. However, all __local memory must be allocated at the beginning of a kernel, and programmers are responsible for tracking which buffers can be reused within it. We propose and implement an extension to OpenCL that provides a malloc()-like interface for allocating workgroup memory. The extension is implemented as an extension to the Clang compiler that performs a source-to-source transformation on OpenCL C programs.
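
To make the bookkeeping problem concrete, the following host-side C sketch models a workgroup's __local memory as one fixed-size pool that is carved up on demand. It is an illustration only, not the transformation described in this abstract: the names local_pool, pool_alloc, pool_free, and POOL_BYTES are hypothetical, the 1024-byte pool size is arbitrary, and the stack-style (last-in, first-out) release policy is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one workgroup's __local memory. */
#define POOL_BYTES 1024

typedef struct {
    union {
        unsigned char bytes[POOL_BYTES];
        double align_hint;   /* forces 8-byte alignment of the pool */
    } mem;
    size_t top;              /* offset of the first free byte */
} local_pool;

/* Reserve `size` bytes from the pool; returns NULL if it does not fit. */
void *pool_alloc(local_pool *pool, size_t size)
{
    size_t rounded;
    void *block;

    if (size > POOL_BYTES)
        return NULL;
    rounded = (size + 7u) & ~(size_t)7u;   /* keep every block 8-byte aligned */
    if (rounded > POOL_BYTES - pool->top)
        return NULL;
    block = pool->mem.bytes + pool->top;
    pool->top += rounded;
    return block;
}

/* Release `block` and everything allocated after it (assumed LIFO discipline). */
void pool_free(local_pool *pool, void *block)
{
    size_t offset;

    if (block == NULL)
        return;
    offset = (size_t)((unsigned char *)block - pool->mem.bytes);
    assert(offset <= pool->top);   /* block must be a live allocation */
    pool->top = offset;
}
```

In this sketch, a temporary buffer released with pool_free is handed out again by the next pool_alloc call, which is the reuse that programmers otherwise have to track by hand.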