Thursday, October 1, 2015

install catalyst driver on ubuntu 14.04

Download the driver from http://support.amd.com/en-us/download and select the package for your device. I am using an HD 7970M.

If you choose the Ubuntu version, the download contains four deb packages.

Here are the names of my downloaded files.
[1] fglrx_15.201-0ubuntu1_amd64_UB_14.01.deb
[2] fglrx-amdcccle_15.201-0ubuntu1_amd64_UB_14.01.deb
[3] fglrx-core_15.201-0ubuntu1_amd64_UB_14.01.deb
[4] fglrx-dev_15.201-0ubuntu1_amd64_UB_14.01.deb

Before installing the deb packages, run the following commands to satisfy their dependencies.
$ sudo dpkg --add-architecture i386
$ sudo apt-get install lib32gcc1 libc6-i386 dkms
(Optional packages: $ sudo apt-get install debhelper dh-modaliases execstack dpkg-dev)

If you run into any dependency issues, try $ sudo apt-get install -f
You may also need to run $ sudo apt-get update at some point.

Then install the packages: $ sudo dpkg -i *.deb

I am using a laptop with hybrid graphics, so I configure all adapters:
$ sudo amdconfig --initial --adapter=all

Then reboot, log in, and run $ clinfo or $ fglrxinfo to check whether the driver is installed.

If the program hangs or the correct driver is not found, don't panic. Try the following steps.
(https://help.ubuntu.com/community/BinaryDriverHowto/AMD
http://askubuntu.com/questions/465778/gnome-ubuntu-14-04-aticonfig-no-supported-adapters-detected-ati-radeon-6770m)

sudo cp /etc/X11/xorg.conf /etc/X11/xorg.conf.BAK
sudo apt-get purge fglrx*
reboot
sudo apt-get install fglrx
sudo apt-get install fglrx xvba-va-driver libva-glx1 libva-egl1 vainfo
sudo amdconfig --adapter=all --initial
reboot
If you are greeted with a black screen, don't panic.
Press Ctrl+Alt+F1 to switch to a text console, log in, and run:
sudo amdconfig --initial --adapter=all
reboot
Then check with fglrxinfo
or
sudo apt-get install mesa-utils
glxheads




Sunday, March 29, 2015

about OpenCL 2.1

[1] C++11 vs. C++14
C++14 is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements.

* Function return type deduction

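* Alternate type deduction on declaration
decltype(auto) deduces a type using the decltype rules, so it keeps references and cv-qualifiers that plain auto would drop. This lets a declaration yield either a reference or a non-reference type, depending on the initializer.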

* Relaxed constexpr restrictions
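C++11 constexpr functions could only contain a single expression that is returned (as well as static_asserts and a small number of other declarations).
C++14 relaxes these restrictions: constexpr functions may also contain local variable declarations, if and switch statements, and loops.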


* Variable templates
In prior versions of C++, only functions, classes or type aliases could be templated. C++14 now allows the creation of variables that are templated.

* Aggregate member initialization

* Binary literals
Numeric literals in C++14 can be specified in binary form, using the 0b or 0B prefix.

* Digit separators

* Generic lambdas
In C++11, lambda function parameters need to be declared with concrete types. C++14 relaxes this requirement, allowing lambda function parameters to be declared with the auto type specifier.

* Lambda capture expressions

* The attribute [[deprecated]]
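
The features listed above can be sketched in one small file. This is only a minimal illustration, assuming a compiler invoked with -std=c++14 or later; every name in it (add, counter_ref, factorial, pi, Point, mask, big, old_add, lambda_demo) is made up for this example and does not come from any library.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Function return type deduction: the return type comes from the return statement.
auto add(int a, double b) { return a + b; }  // deduced as double

// Alternate type deduction: decltype(auto) keeps the reference that auto drops.
int counter = 0;
int& counter_ref() { return counter; }
auto by_value() { return counter_ref(); }                // returns int
decltype(auto) by_reference() { return counter_ref(); }  // returns int&

// Relaxed constexpr: local variables and loops are allowed in C++14.
constexpr int factorial(int n) {
    int result = 1;
    for (int i = 2; i <= n; ++i) result *= i;
    return result;
}
static_assert(factorial(5) == 120, "evaluated at compile time");

// Variable template: one templated constant, instantiated per type.
template <typename T>
constexpr T pi = T(3.1415926535897932385L);

// Aggregate member initialization: a struct with default member
// initializers can still be brace-initialized as an aggregate.
struct Point {
    int x = 0;
    int y = 7;
};

// Binary literal combined with digit separators.
constexpr int mask = 0b1010'1010;
constexpr long big = 1'000'000;

// [[deprecated]]: calling old_add() makes the compiler emit a warning.
[[deprecated("use add() instead")]]
inline int old_add(int a, int b) { return a + b; }

// Generic lambda and lambda capture expression (init-capture).
inline std::string lambda_demo() {
    auto twice = [](auto x) { return x + x; };           // works for int and std::string
    auto owned = std::make_unique<std::string>("ab");
    auto take = [p = std::move(owned)] { return *p; };   // moves the unique_ptr into the lambda
    return twice(take()) + std::to_string(twice(21));
}
```

Here Point p{3} leaves p.y at its default of 7, and lambda_demo() concatenates the doubled string with the doubled integer. old_add() is deliberately never called, since any call would trigger the deprecation warning.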


Tuesday, January 27, 2015

memory buffers

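clCreateBuffer() with default flags: the buffer lives in device memory. The GPU accesses it directly; the host reaches it only through read, write, or map commands.

clCreateBuffer() with CL_MEM_USE_HOST_PTR: the buffer uses memory the application already allocated, so it is visible to both GPU and CPU. The implementation may still copy the data between host and device behind the scenes, which reduces performance.

clCreateBuffer() with CL_MEM_ALLOC_HOST_PTR: the runtime allocates host-accessible memory. On devices that share memory with the CPU, the host can map the buffer with clEnqueueMapBuffer() and both sides use it without a copy.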


For more info, see this tutorial: http://malideveloper.arm.com/downloads/deved/tutorial/SDK/opencl/memory_buffers_tutorial.html

For zero-copy examples, see:
https://software.intel.com/en-us/articles/getting-the-most-from-opencl-12-how-to-increase-performance-by-minimizing-buffer-copies-on-intel-processor-graphics



Address Space Qualifier

OpenCL implements the following disjoint address spaces:
 __global, __local, __constant and __private.

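__global: the variable lives in global memory, visible to all work-items

__local: the variable lives in local memory, shared by the work-items of one work-group

__constant: the variable lives in constant memory, which is read-only for kernels

__private: the variable belongs to a single work-item (thread)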

The generic address space name for arguments to a function in a program, or local variables of a function is __private. All function arguments shall be in the __private address space.

Monday, January 26, 2015

Typo on AMD OpenCL 2.0 Blog

I noticed some typos in the AMD blog post on OpenCL 2.0 shared virtual memory: http://developer.amd.com/community/blog/2014/10/24/opencl-2-shared-virtual-memory/

"clSVMEnqueueMap and clSVMEnqueueUnmap" should be "clEnqueueSVMMap and clEnqueueSVMUnmap".