Tuesday, January 27, 2015

Memory Buffers

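clCreateBuffer() with no host-pointer flag: the buffer lives in device memory and is directly visible to the GPU only. The host has to go through clEnqueueReadBuffer()/clEnqueueWriteBuffer() or map the buffer to touch the data.

clCreateBuffer() with CL_MEM_USE_HOST_PTR: the buffer wraps memory the application has already allocated, so it is visible to both GPU and CPU. The implementation may still copy the data from host to device implicitly, which reduces performance.

clCreateBuffer() with CL_MEM_ALLOC_HOST_PTR: the OpenCL runtime allocates the memory itself, in a place that both CPU and GPU can use. On devices that share physical memory with the CPU, mapping such a buffer gives the host a pointer without a copy.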


For more info, see ARM's Mali memory buffers tutorial: http://malideveloper.arm.com/downloads/deved/tutorial/SDK/opencl/memory_buffers_tutorial.html

For zero-copy examples, see this Intel article:
https://software.intel.com/en-us/articles/getting-the-most-from-opencl-12-how-to-increase-performance-by-minimizing-buffer-copies-on-intel-processor-graphics



Address Space Qualifier

OpenCL implements the following disjoint address spaces:
 __global, __local, __constant and __private.

__global: global memory, visible to all work-items

__local: local memory, shared by the work-items of one work-group

__constant: constant memory, read-only

__private: the variable belongs to a single work-item (thread)

The default address space for function arguments and for local variables of a function is __private. All function arguments must be in the __private address space; a pointer argument is itself private, even when it points into __global, __local or __constant memory.
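
A small OpenCL C kernel sketch that uses all four qualifiers. This is device code, so it has to be compiled by the OpenCL runtime rather than a plain C compiler. The kernel name, argument names and the scale constant are made up for illustration.

```c
/* Program-scope variables must be __constant; they are read-only. */
__constant float scale = 2.0f;

__kernel void scale_and_sum(__global const float *in,  /* points into global memory */
                            __global float *partial,   /* one result per work-group */
                            __local float *scratch)    /* shared inside a work-group */
{
    /* Plain locals are __private by default; x spells the qualifier out. */
    size_t lid = get_local_id(0);
    __private float x = in[get_global_id(0)] * scale;

    scratch[lid] = x;
    barrier(CLK_LOCAL_MEM_FENCE);  /* wait until every work-item has written */

    /* The first work-item of each group adds up the group's values. */
    if (lid == 0) {
        float sum = 0.0f;
        for (size_t i = 0; i < get_local_size(0); ++i)
            sum += scratch[i];
        partial[get_group_id(0)] = sum;
    }
}
```

The pointer arguments themselves are __private; only the memory they point to sits in the __global or __local address space.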

Monday, January 26, 2015

Typo on AMD OpenCL 2.0 Blog

I noticed some typos in the AMD blog post on OpenCL 2.0 shared virtual memory: http://developer.amd.com/community/blog/2014/10/24/opencl-2-shared-virtual-memory/

"clSVMEnqueueMap and clSVMEnqueueUnmap" should be "clEnqueueSVMMap and clEnqueueSVMUnmap".