starpu-devel - [Starpu-devel] [BUG] tests/datawizard/reclaim.c: hangs forever in task_wait_for_all when using OpenCL.

Subject: Developers list for StarPU

  • From: Cyril Roelandt <cyril.roelandt@inria.fr>
  • To: starpu-devel@lists.gforge.inria.fr, ludovic.stordeur@inria.fr
  • Subject: [Starpu-devel] [BUG] tests/datawizard/reclaim.c: hangs forever in task_wait_for_all when using OpenCL.
  • Date: Fri, 09 Mar 2012 03:29:45 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hello,

I am currently trying to write OpenCL versions of our CUDA codelets for all the programs found in examples/ and tests/.

It is going well so far, but I have just stumbled upon a very strange bug in tests/datawizard/reclaim.c. I applied the following patch:

--- tests/datawizard/reclaim.c (revision 6089)
+++ tests/datawizard/reclaim.c (working copy)
@@ -34,7 +34,7 @@
# define BLOCK_SIZE (64*1024*1024)
#endif

-static unsigned ntasks = 1000;
+static unsigned ntasks = 1;

#ifdef STARPU_HAVE_HWLOC
static uint64_t get_total_memory_size(void)
@@ -54,9 +54,9 @@

static struct starpu_codelet dummy_cl =
{
- .where = STARPU_CPU|STARPU_CUDA,
.cpu_funcs = {dummy_func, NULL},
.cuda_funcs = {dummy_func, NULL},
+ .opencl_funcs = {dummy_func, NULL},
.nbuffers = 3,
.modes = {STARPU_RW, STARPU_R, STARPU_R}
};

This patch is quite simple: it drops the explicit ".where" field and adds an OpenCL implementation to the codelet. It is worth noting that "dummy_func" is an empty function: it could not be any dummier. I have lowered ntasks from 1000 to 1 because otherwise the test takes quite a lot of time to run.

When running this test on a CPU, everything works as expected:

$ STARPU_NCPUS=1 STARPU_NCUDA=0 STARPU_NOPENCL=0 time -p ./tests/datawizard/reclaim
...
real 2.46
user 0.02
sys 1.98

With CUDA, the test starts to take an awful lot of time:

$ STARPU_NCPUS=0 STARPU_NCUDA=1 STARPU_NOPENCL=0 time -p ./tests/datawizard/reclaim
...
real 46.55
user 33.84
sys 15.13

With OpenCL, the program just hangs forever: it is stuck in starpu_task_wait_for_all().
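
Since the OpenCL run never returns, anyone reproducing this may want to put a time limit on the test. Here is a minimal sketch of a watchdog in plain POSIX C. It is not part of StarPU; the name run_with_timeout and its return-code convention are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not a StarPU function): runs the command in argv
 * with a time limit. Returns the child's exit status, -1 on error or
 * abnormal termination, and -2 if the child had to be killed because it
 * was still running after `seconds` seconds. */
static int run_with_timeout(char *const argv[], unsigned seconds)
{
	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
	{
		execvp(argv[0], argv);
		_exit(127); /* only reached if exec failed */
	}

	/* Poll the child every 100 ms until it exits or the deadline passes. */
	const struct timespec tick = { 0, 100 * 1000 * 1000 };
	unsigned long remaining = seconds * 10UL;
	int status;
	for (;;)
	{
		pid_t r = waitpid(pid, &status, WNOHANG);
		if (r == pid)
			break;
		if (r < 0 && errno != EINTR)
			return -1;
		if (remaining-- == 0)
		{
			/* Deadline reached: kill the child and reap it. */
			kill(pid, SIGKILL);
			waitpid(pid, &status, 0);
			return -2;
		}
		nanosleep(&tick, NULL);
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with argv = {"./tests/datawizard/reclaim", NULL} and a limit of a few minutes, with the STARPU_N* variables set in the environment as in the commands above, would return -2 for a run that hangs instead of blocking the shell.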

I've run all these commands on Hannibal.

This reminds me of the bug Ludovic is currently struggling with: starpu_task_wait_for_all() never returns, and he uses OpenCL kernels. Ludovic, iirc, you started to debug this issue, so could you tell us more about it?

Cyril.




