
starpu-devel - Re: [Starpu-devel] StarPU SVN and OpenCL on CPU

Subject: Developers list for StarPU

  • From: George Russell <george@codeplay.com>
  • To: Sylvain HENRY <sylvain.henry@inria.fr>
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] StarPU SVN and OpenCL on CPU
  • Date: Thu, 24 Feb 2011 17:35:46 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hi,

I applied the patch and rebuilt; at least two of the OpenCL examples now run:

examples/basic_examples/variable_kernels_opencl_kernel.cl
examples/filters/fblock

In the others, the behaviour seems the same as before (deadlock or endless looping):

examples/basic_examples/block_opencl_kernel.cl
examples/basic_examples/vector_scal_opencl_kernel.cl
examples/matvecmult
examples/spmv/dw_spmv

I have attached a patch to fix compilation of some of the OpenCL kernels on ATI (just adding f suffixes, as required in OpenCL).
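
A minimal sketch of why the suffix matters, written in plain C rather than OpenCL C so that it builds anywhere. In C an unsuffixed constant such as 1.0 has type double, so input[i] + 1.0 is evaluated as a double before being stored back into a float. OpenCL C inherits the same literal rules, and the assumption here is that the ATI compiler rejects the implicit double arithmetic. The function names are hypothetical.

```c
#include <stddef.h>

/* Size of `x + 1.0` for a float x: the unsuffixed literal is a double,
   so the whole sum has type double. sizeof does not evaluate its operand. */
size_t size_of_sum_with_double_literal(void)
{
    float x = 0.0f;
    return sizeof(x + 1.0);
}

/* Size of `x + 1.0f` for a float x: both operands are float,
   so the sum stays a float. */
size_t size_of_sum_with_float_literal(void)
{
    float x = 0.0f;
    return sizeof(x + 1.0f);
}
```

Comparing the two return values against sizeof(double) and sizeof(float) shows the promotion that the f suffix avoids.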

Cheers,
George

On 24/02/2011 11:33, Sylvain HENRY wrote:
Hi,

See the attached patch. However, I'm using Git and I don't know if this patch is compatible with other patch systems.
If you want to apply it with Git:

git svn clone --stdlayout svn://scm.gforge.inria.fr/svn/starpu
cd starpu
git apply blabla.patch

Cheers
Sylvain

On 23/02/2011 21:40, George Russell wrote:
Hi,

Great that it works ;-)

Could you post a patch against the StarPU SVN trunk for this? I am not terribly concerned about the "hackiness" of it at the moment. I simply want to see how I can get OpenCL code running in StarPU on the CPU, as at present I don't have access to an OpenCL GPU in an environment where StarPU will build!

Cheers,
George

On 23/02/2011 20:32, Sylvain HENRY wrote:
Hi George,

I succeeded in using StarPU with the AMD Stream SDK on CPU but it requires a hack.

The problem is that StarPU uses asynchronous data transfers and polls (clGetEventInfo(CL_EVENT_COMMAND_EXECUTION_STATUS...)) to find out when a transfer has completed. But it seems that AMD Stream never performs the asynchronous data transfer (i.e. the memcpy). A blocking call is required for the transfer to be performed...

A first simple solution would be to detect when the worker is an OpenCL CPU device and to somehow force the call to be blocking.
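
The behaviour described above, and this first workaround, can be sketched with a toy model. It does not call OpenCL: every toy_* name is hypothetical, and the runtime is assumed to defer the copy until a blocking wait, which is only what the observed behaviour of AMD Stream suggests.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for one pending asynchronous transfer. */
typedef struct {
    const void *src;
    void *dst;
    size_t size;
    bool done;
} toy_transfer;

/* Enqueueing only records the request; no bytes move yet. */
void toy_enqueue_copy(toy_transfer *t, void *dst, const void *src, size_t size)
{
    t->src = src;
    t->dst = dst;
    t->size = size;
    t->done = false;
}

/* Polling reports the status but does not make the runtime progress. */
bool toy_poll(const toy_transfer *t)
{
    return t->done;
}

/* In this model, only a blocking wait actually performs the copy. */
void toy_wait(toy_transfer *t)
{
    if (!t->done) {
        memcpy(t->dst, t->src, t->size);
        t->done = true;
    }
}

/* Transfer data; force a blocking wait when the worker is a CPU device.
   Returns true if the transfer completed, false if polling gave up. */
bool toy_transfer_data(void *dst, const void *src, size_t size,
                       bool is_cpu_device, int max_polls)
{
    toy_transfer t;
    toy_enqueue_copy(&t, dst, src, size);

    if (is_cpu_device) {
        toy_wait(&t);            /* workaround: blocking call */
        return true;
    }
    for (int i = 0; i < max_polls; i++) {
        if (toy_poll(&t))
            return true;         /* never reached in this model */
    }
    return false;                /* polling alone never completes */
}
```

With is_cpu_device set to false the polling loop runs out and the destination is left untouched, which mirrors the deadlock; with it set to true the copy happens at once. In real code the device check would presumably rely on clGetDeviceInfo with CL_DEVICE_TYPE, and the blocking path on the blocking flag of clEnqueueWriteBuffer or on clFinish.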

The solution I used is different. Because I wanted to avoid the superfluous memcpy, I used an OpenCL buffer created with the flag CL_MEM_USE_HOST_PTR. Each time StarPU wants to transfer data from host memory to the OpenCL CPU device, it calls a function with the source address and the target buffer. This is where I free the buffer and replace it with a new one "mapping" the data.

This hack worked for me because I only had OpenCL devices (no CPU worker from a StarPU point of view, only OpenCL ones). With CPU workers, data may get corrupted as we are faking a copy.
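
The corruption risk can be illustrated without OpenCL. The sketch below contrasts a real copy with a "faked" one that simply aliases host memory, which is roughly what a CL_MEM_USE_HOST_PTR buffer amounts to on a CPU device. The function names are hypothetical, and the host write stands in for a CPU worker modifying what the runtime believes is a separate replica.

```c
#include <string.h>

/* Real copy: the "device" gets its own storage. */
void copy_to_device(float *device, const float *host, size_t n)
{
    memcpy(device, host, n * sizeof *host);
}

/* Faked copy: the "device" buffer is just the host pointer. */
float *map_to_device(float *host)
{
    return host;
}

/* Returns element 0 as seen from the device side after a CPU worker
   has written to the host array. */
float device_view_after_host_write(int fake_copy)
{
    float host[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float storage[4];
    float *device;

    if (fake_copy) {
        device = map_to_device(host);
    } else {
        copy_to_device(storage, host, 4);
        device = storage;
    }

    host[0] = 42.0f;   /* a CPU worker updates the host replica */
    return device[0];  /* 1.0f with a real copy, 42.0f with the alias */
}
```

With a real copy the device-side value is isolated from the later host write; with the alias it changes underneath the device, even though a coherency protocol that tracks two replicas would consider them independent.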

The long term solution would be for StarPU to consider OpenCL CPU devices as CPU workers, that is, workers performing their computations in host memory (and using CPU cores... but this is another story).

Cheers
Sylvain



Index: examples/basic_examples/variable_kernels_opencl_kernel.cl
===================================================================
--- examples/basic_examples/variable_kernels_opencl_kernel.cl (revision 3238)
+++ examples/basic_examples/variable_kernels_opencl_kernel.cl (working copy)
@@ -18,6 +18,6 @@
{
const int i = get_global_id(0);
if (i == 0)
- input[i] = input[i] + 1.0;
+ input[i] = input[i] + 1.0f;
}

Index: examples/mandelbrot/mandelbrot.c
===================================================================
--- examples/mandelbrot/mandelbrot.c (revision 3238)
+++ examples/mandelbrot/mandelbrot.c (working copy)
@@ -21,7 +21,7 @@
#endif
#include <sys/time.h>
#include <math.h>
-
+#include <limits.h>
#ifdef STARPU_HAVE_X11
#include <X11/Xlib.h>
#include <X11/Xutil.h>
Index: examples/incrementer/incrementer_kernels_opencl_kernel.cl
===================================================================
--- examples/incrementer/incrementer_kernels_opencl_kernel.cl (revision 3238)
+++ examples/incrementer/incrementer_kernels_opencl_kernel.cl (working copy)
@@ -18,6 +18,6 @@
{
const int i = get_global_id(0);
if (i == 0 || i == 3)
- input[i] = input[i] + 1.0;
+ input[i] = input[i] + 1.0f;
}



